Regex for uri path

loadtest · ‎03-11-2014

Hi,

I'm having trouble extracting the uri_path from my log files.

Here's an example line from my log file:

115.252.41.38 "65.165.121.16" - www.site.com [27/Feb/2014:23:29:59 -0500] "GET /images/focus/gallery/?zipCode=70006&distance=50 HTTP/1.1" 200 67362 1 esds036b.md5.site.com:9789 "-" "Mozilla/5.0 (iPad; CPU OS 7_0_4 like Mac OS X) AppleWebKit/537.51.1 (KHTML, like Gecko) Version/7.0 Mobile/11B554a Safari/9537.53"

I'm trying to extract "/images/focus/gallery/" with a regex, but I'm having difficulty doing so. Any help is appreciated.

ianathompson · ‎08-28-2014

For the web logs, I used the inline field extractor, with some small changes, to extract uriPath from the uri field imported from the AWS ELB logs I received. The main issue is that some URIs have a query string (?key=value) and some do not. This has worked for me so far.

rex field=uri "(?i)^(?:[^/]*/){3}(?P<uriPath>[^?\s]+)"
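
To see how a pattern of this shape treats URIs with and without a query string, here is a rough Python sketch. It writes the character class as `[^?\s]`; the URIs are made-up examples built on the host name from the question, and `uri_path` is a hypothetical helper name.

```python
import re

# Skip the first three "/"-terminated pieces ("http:/", "/", "host:port/"),
# then capture everything up to a "?" or whitespace.
URI_PATH = re.compile(r"(?i)^(?:[^/]*/){3}(?P<uriPath>[^?\s]+)")

def uri_path(uri):
    """Return the captured path, or None when nothing follows the host."""
    match = URI_PATH.search(uri)
    return match.group("uriPath") if match else None

# Hypothetical URIs: one with a query string, one without.
print(uri_path("http://www.site.com:80/images/focus/gallery/?zipCode=70006"))
print(uri_path("http://www.site.com:80/images/focus/gallery/"))
```

Both calls should give the same path. Note that the captured value has no leading slash, because the third "/" is consumed by the skipped prefix.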

If I make any more changes, I will update the rex.

Also, you can check out the URL Parser app in the Splunk App Store. Just note that its code has errors that have to be corrected. They are noted here.

lukejadamec · ‎03-11-2014

Try this:

search | regex "\s/[^ ]+/[^ ]+/[^ ]+/"
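
Keep in mind that the regex command only keeps or drops events; it does not create a field. A rough Python analogue of that filtering step, as a sketch only: the first line is the request portion of the sample event, and the second is a made-up line for contrast.

```python
import re

# Same pattern as above: whitespace, then a path with three or more
# "/"-terminated segments.
FILTER = re.compile(r"\s/[^ ]+/[^ ]+/[^ ]+/")

lines = [
    '"GET /images/focus/gallery/?zipCode=70006&distance=50 HTTP/1.1" 200 67362',
    '"GET /index.html HTTP/1.1" 200 512',  # hypothetical line, not from the original log
]

# Keep only the lines the pattern matches, as the regex command would.
kept = [line for line in lines if FILTER.search(line)]
print(kept)
```

Only the first line should be kept, since the second path has a single segment.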

lukejadamec · ‎03-11-2014

You can try rex:

search | rex "^.*\s(?P<uri_path>/[^ ]+/[^ ]+/[^ ]+/)\S"

That should pull out a uri_path field that can be used for statistics or charting.
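
As a way to check the pattern outside Splunk, here is a minimal Python sketch that applies the same expression and then counts the two most common paths, roughly what "top limit=2 uri_path" would report. The first event is the request portion of the sample line; the others are hypothetical, and `extract_uri_path` is an illustrative helper name.

```python
import re
from collections import Counter

# Same expression as the rex above.
REX = re.compile(r"^.*\s(?P<uri_path>/[^ ]+/[^ ]+/[^ ]+/)\S")

def extract_uri_path(event):
    """Return the uri_path capture, or None when the event does not match."""
    match = REX.search(event)
    return match.group("uri_path") if match else None

events = [
    '"GET /images/focus/gallery/?zipCode=70006&distance=50 HTTP/1.1" 200 67362',
    # The remaining events are hypothetical.
    '"GET /images/focus/gallery/?zipCode=10001 HTTP/1.1" 200 100',
    '"GET /images/focus/thumbs/?id=7 HTTP/1.1" 200 100',
    '"GET /index.html HTTP/1.1" 200 512',
]

paths = [p for p in map(extract_uri_path, events) if p is not None]
top_two = Counter(paths).most_common(2)
print(top_two)
```

The pattern expects exactly this shape: three path segments, a trailing slash, and at least one non-space character after it. Shorter paths such as /index.html, or a path followed directly by a space, are not extracted, so it may need loosening for other URLs.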

loadtest · ‎03-11-2014

How would I extract the path into a field so I can chart it? For example, to show the most-used paths with "top limit=2 uri_path".
