Can I loop through URL and http_referrer to find the original request?

bababou
Explorer

Hi everyone,

I'd like to trace the flow from a given final URL back to the original URL the user typed.

In my web proxy logs, I see the following fields:
_time, src_ip, http_referrer, http_method, URL

For example:
003, 1.1.1.1, htp://www.bbb.com/ads.html, GET, htp://www.ccc.com/ccc.html
002, 1.1.1.1, htp://www.aaa.com/, GET, htp://www.bbb.com/ads.html
001, 1.1.1.1, -, GET, htp://www.aaa.com/

Given the final URL (ccc.com/ccc.html), I want to walk back in time through the (http_referrer, URL) pairs and find every URL up to the original one (aaa.com), which has http_referrer="-".

Sometimes this flow is spread across 10 different requests mixed in with other web traffic, so it is hard to trace by hand.

Programmatically I would do this with one loop, but I cannot find any looping construct in Splunk.

Can you help me? Thanks.


bababou
Explorer

I solved my problem with an external script:


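import splunk.Intersplunk

# Read the events piped into this command (normally newest first in Splunk).
# Field names ('url', 'http_referer') must match the fields in your events.
results, dummyresults, settings = splunk.Intersplunk.getOrganizedResults()

# Read the url="..." argument: the final URL to start walking back from.
keywords, options = splunk.Intersplunk.getKeywordsAndOptions()
httpref = options.get('url', '-')

newresults = []

# Single pass: this relies on the events arriving newest first, so that each
# referer is looked up among older requests only.
for result in results:
    # A referer of '-' means the original request has been reached.
    if httpref == '-':
        break
    if result.get('url') == httpref:
        newresults.append(result)
        # Next, look for the request that led to this one.
        httpref = result.get('http_referer')

splunk.Intersplunk.outputResults(newresults)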

And I call it this way:

... | referer url="htp://www.ccc.com/ccc.html" | table _time, http_referer, url
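
For anyone who wants to check the walk-back logic outside Splunk, here is a minimal sketch of the same loop as a plain Python function. It is not the script above: the name `trace_referers`, the integer `_time` values, and the `zzz.com` noise event are assumptions made up for illustration, and the field names follow the script (`url`, `http_referer`). It sorts newest first before the single pass, because the loop only works if each referer is searched among older requests.

```python
def trace_referers(events, final_url):
    """Walk back from final_url to the request whose referer is '-'.

    `events` is a list of dicts with '_time', 'url' and 'http_referer' keys
    (hypothetical stand-ins for Splunk results).
    """
    # Newest first, so each referer is looked up among older requests only.
    ordered = sorted(events, key=lambda e: e["_time"], reverse=True)
    chain = []
    wanted = final_url
    for event in ordered:
        # '-' (or a missing referer) means the original request was reached.
        if wanted in ("-", None):
            break
        if event["url"] == wanted:
            chain.append(event)
            wanted = event.get("http_referer")
    return chain


# Sample rows from the question, plus one unrelated request as noise.
events = [
    {"_time": 1, "http_referer": "-", "url": "htp://www.aaa.com/"},
    {"_time": 2, "http_referer": "htp://www.aaa.com/", "url": "htp://www.bbb.com/ads.html"},
    {"_time": 2, "http_referer": "-", "url": "htp://www.zzz.com/"},
    {"_time": 3, "http_referer": "htp://www.bbb.com/ads.html", "url": "htp://www.ccc.com/ccc.html"},
]

chain = trace_referers(events, "htp://www.ccc.com/ccc.html")
for event in chain:
    print(event["_time"], event["http_referer"], event["url"])
```

Like the original script, this follows only the most recent request for each URL and does not separate traffic by src_ip, so filtering on one client before the walk is probably a good idea.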

somesoni2
Revered Legend

See Splunk's map command, which is a looping operator.

neerajs_81
Builder

Can someone please explain how to use the map command, or how to search for the original request URL without the external script that was marked as the solution?

scelikok
SplunkTrust

Hi @neerajs_81,

Please try the sample below with the map command:

index="web_proxy" sourcetype="proxy"
| map search="search index=\"web_proxy\" sourcetype=\"proxy\" http_referrer=\"$URL$\" OR http_referrer=\"-\" | eval finalURL=\"$URL$\""
| map search="search index=\"web_proxy\" sourcetype=\"proxy\" http_referrer=\"$http_referrer$\" | eval finalURL=\"$finalURL$\""
| search http_referrer="-"
| dedup _raw
| rename URL as originalURL
| table finalURL originalURL

If this reply helps you, an upvote and "Accept as Solution" are appreciated.

technoe
Explorer

How is the data indexed? Maybe you could use the first or last stats functions instead of looping through each event...

bababou
Explorer

Some kind of "transaction" could also be fine, ideally a table with _time and url.

jsie_splunk
Splunk Employee

When you say "interested", how do you want the data expressed? As a single field containing the full path?

bababou
Explorer

What really interests me is the whole path.
In this example: aaa.com -> bbb.com/ads.html -> ccc.com/ccc.html
And not only the first and last requests.
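
If the chain of events is already available newest first (the order in which a walk back from the final URL collects them), turning it into a single path is a matter of reversing and joining. A small sketch, assuming each event is a dict with a `url` key; `format_path` is a hypothetical helper name, and the full URLs are kept rather than shortened to host and path.

```python
def format_path(chain, separator=" -> "):
    """Join a newest-first chain of events into an oldest-first path string."""
    return separator.join(event["url"] for event in reversed(chain))


# Chain for the example in this thread, newest request first.
chain = [
    {"_time": 3, "url": "htp://www.ccc.com/ccc.html"},
    {"_time": 2, "url": "htp://www.bbb.com/ads.html"},
    {"_time": 1, "url": "htp://www.aaa.com/"},
]

path = format_path(chain)
print(path)
```

Keeping the events as a list until the last step means the same chain can also be shown as a table with _time and url.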
