Splunk Search

How to extract a field from a GET request?

dmenon84
Path Finder

Hi All - I am having trouble extracting the following field from a GET request.

GET **/TSGene/**images/literature.jpg

I tried the following, but it did not seem to work: \bGET\s+\K\S+(\/[\/[:word:]\-\.\=\&\?]+)\s

I just want to extract the part highlighted above. Thanks in advance!

Thanks,
Deepthi

FeatureCreeep
Path Finder

This should get you what you want:

| rex "\"GET (?P<url>\/.*?[\/ ])" | eval url=trim(url)

This matches both when the path contains a second / and when it doesn't. If there is no second /, the match ends at the space, which leaves a trailing space in url, so I added a trim to remove it. A fancier regex could probably remove the need for the trim, but this works.
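
One way the trim might be avoided, as a sketch only: capture the leading slash, then any run of characters that are neither a slash nor whitespace, then an optional trailing slash. The makeresults and eval lines are an assumption added purely so the search runs standalone; they fabricate one event in the format of the sample requests posted in this thread.

```spl
| makeresults
| eval _raw="\"HTTPS\",\"gene=5781\",\"GET /TSGene/gene_general.cgi?gene=5781 HTTP/1.1"
| rex "\"GET (?P<url>/[^/\s]*/?)"
| table url
```

For this event url should come out as /TSGene/, and for a request like GET /favicon.ico HTTP/1.1 the same pattern stops at the space and yields /favicon.ico with nothing to trim.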

I'm a little confused about what you want to do with POSTs. In your example above, you still parsed POSTs, but maybe that was just an oversight. I would suggest filtering them out so that you only process events containing "GET (with the leading double quote). If you don't filter them out, the url field will be NULL for those events, since the regex will not match.
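
A rough sketch of that filtering step, under stated assumptions: the first three lines fabricate two events in the thread's sample format (one GET, one POST) so the search can be run standalone, and the helper field n is hypothetical. The where clause keeps only events containing "GET before the rex from this answer is applied. In a real search the filter would more likely be a literal term in the base search.

```spl
| makeresults count=2
| streamstats count AS n
| eval _raw=if(n=1, "\"HTTPS\",\"\",\"GET /favicon.ico HTTP/1.1", "\"HTTPS\",\"\",\"POST /TSGene/search_result.cgi HTTP/1.1")
| where like(_raw, "%\"GET %")
| rex "\"GET (?P<url>\/.*?[\/ ])"
| eval url=trim(url)
| table url
```

Only the GET event should remain, with url set to /favicon.ico.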

skoelpin
SplunkTrust

Try this. Your field name will be GET:

| rex "(?<GET>GET\s\S+\.jpg)"

dmenon84
Path Finder

Sorry if I wasn't clear. I only want the following parts extracted: the data between the first two slashes after GET, including the slashes themselves.

Extracted data -

/TSGene/
/TSGene/
/favicon.ico
/TSGene/
/static/
/static/
/orl/

Actual requests -

"HTTPS","","POST /TSGene/search_result.cgi HTTP/1.1\r\n
"HTTPS","gene=5781","GET /TSGene/gene_general.cgi?gene=5781 HTTP/1.1\r\n
"HTTPS","","GET /favicon.ico HTTP/1.1\r\n
"HTTPS","","POST /TSGene/search_result.cgi HTTP/1.1\r\n
"HTTPS","ver=20142803","GET /static/wp-content/plugins/fruitful-shortcodes/includes/shortcodes/js/tabs/easyResponsiveTabs.js?ver=20142803 HTTP/1.1\r\n
"HTTPS","ver=1.11.4","GET /static/wp-includes/js/jquery/ui/slider.min.js?ver=1.11.4 HTTP/1.1\r\n
"HTTPS","","GET /orl/wp-content/themes/utms-orl/images/common/prefooter-bg.jpg HTTP/1.1\r\n

tiagofbmm
Influencer

Try this one:

| rex field=_raw "(?<=POST|GET)\s(?<yourfield>\/[^\/]*)"

dmenon84
Path Finder

Thanks, that works better, but in some cases it picks up the HTTP that follows the request.

Can this be modified to extract like this?

"HTTPS","","GET /favicon.ico HTTP/1.1\r\n -> only /favicon.ico should be extracted.

At this time, it extracts the following: /favicon.ico HTTP

Thanks in advance!

tiagofbmm
Influencer

Yes, just exclude whitespace in the rex too:

 | rex field=_raw "(?<=POST|GET)\s(?<yourfield>\/[^\/\s]*)"
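
The expected output listed earlier in the thread keeps the closing slash (/TSGene/ rather than /TSGene). A variant that might cover that, offered as a sketch rather than a tested answer: allow an optional trailing slash after the first segment, and use a non-capturing group for the method instead of the lookbehind. The request field and the makeresults, split and mvexpand scaffolding are assumptions added only to replay three of the sample requests without an index.

```spl
| makeresults
| eval request=split("POST /TSGene/search_result.cgi HTTP/1.1|GET /favicon.ico HTTP/1.1|GET /orl/wp-content/themes/utms-orl/images/common/prefooter-bg.jpg HTTP/1.1", "|")
| mvexpand request
| rex field=request "(?:POST|GET)\s(?<yourfield>/[^/\s]*/?)"
| table request yourfield
```

The three rows should show /TSGene/, /favicon.ico and /orl/ in yourfield. Against real events, field=request would be replaced with field=_raw.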

tiagofbmm
Influencer

Can you please paste a full example of the GET request?

dmenon84
Path Finder

Sure - some more samples of GET and POST:

"HTTPS","","POST /TSGene/search_result.cgi HTTP/1.1\r\n
"HTTPS","gene=5781","GET /TSGene/gene_general.cgi?gene=5781 HTTP/1.1\r\n
"HTTPS","","GET /favicon.ico HTTP/1.1\r\n
"HTTPS","","POST /TSGene/search_result.cgi HTTP/1.1\r\n
"HTTPS","ver=20142803","GET /static/wp-content/plugins/fruitful-shortcodes/includes/shortcodes/js/tabs/easyResponsiveTabs.js?ver=20142803 HTTP/1.1\r\n
"HTTPS","ver=1.11.4","GET /static/wp-includes/js/jquery/ui/slider.min.js?ver=1.11.4 HTTP/1.1\r\n
"HTTPS","","GET /orl/wp-content/themes/utms-orl/images/common/prefooter-bg.jpg HTTP/1.1\r\n

Some logs have a version number in between.
