Help with Regex

newbie2tech
Communicator

Hi All,

I need help with a regex to extract the desired output from the patterns below. I have an e-commerce site where we want to gather navigation stats. I have a field called uri with values like those listed below.

I want a regex that lets me report stats rolled up to a higher level instead of by individual views.

Can this be achieved with a single regex? Right now I have 4 different regexes, one for each pattern, and I collate their numbers to generate the summary.

uri field sample values
/Home/
/Home/account-settings/
/Home/men/clothing-360/XXL
/Home/men-kids/summerwear/1059770200
/Home/women-athelets/shoes/1793254100?tab=kobe-xyz
/Home/article/reviews/?id=reviewsMetadata-abc033
/Home/account-summary?actions=[setAccountType:personal;selectTab:rewards]
/Home/search?q=2016%20%20form%201099

desired output

/Home/
/Home/account-settings/
/Home/men/clothing-360/
/Home/men-kids/summerwear/
/Home/women-athelets/shoes
/Home/article/reviews/
/Home/account-summary or /Home/account-summary?
/Home/search or /Home/search?

Please let me know if any further information is needed. I am on Splunk 6.5.2.

somesoni2
Revered Legend

Try this (everything except the last line generates sample data; replace it with your own search):

| gentimes start=-1 | eval raw="/Home/ /Home/account-settings/ /Home/men/clothing-360/XXL /Home/men-kids/summerwear/1059770200 /Home/women-athelets/shoes/1793254100?tab=kobe-xyz /Home/article/reviews/?id=reviewsMetadata-abc033 /Home/account-summary?actions=[setAccountType:personal;selectTab:rewards] /Home/search?q=2016%20%20form%201099" | table raw | makemv raw | mvexpand raw | rename raw as uri 
| rex field=uri "^(?<URI>.+)\/.*" | stats count by URI

Updated regex

..your base search..| rex field=uri "^(?<URI>(\/[^\/\?]+){1,3})"
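
As a quick sanity check outside Splunk, the updated regex can be run against the sample uri values from the question. This is only a sketch, not part of the original answer: it assumes Python's `re` module, which spells named groups `(?P<name>...)` where Splunk's rex uses `(?<name>...)`, and the helper name `rollup` is made up for illustration.

```python
import re

# Same pattern body as the updated regex above; only the named-group
# syntax differs, because Python requires (?P<URI>...).
ROLLUP = re.compile(r"^(?P<URI>(\/[^\/\?]+){1,3})")

def rollup(uri):
    """Return up to the first three path segments, or None if nothing matches."""
    match = ROLLUP.match(uri)
    return match.group("URI") if match else None

# Sample values copied from the question.
samples = [
    "/Home/",
    "/Home/account-settings/",
    "/Home/men/clothing-360/XXL",
    "/Home/men-kids/summerwear/1059770200",
    "/Home/women-athelets/shoes/1793254100?tab=kobe-xyz",
    "/Home/article/reviews/?id=reviewsMetadata-abc033",
    "/Home/account-summary?actions=[setAccountType:personal;selectTab:rewards]",
    "/Home/search?q=2016%20%20form%201099",
]

for uri in samples:
    print(uri, "->", rollup(uri))
```

Note that the pattern does not look for digits. It keeps at most three path segments and stops at the first `?`, so the results carry no trailing slash (for example `/Home/` becomes `/Home`). In Splunk, appending `| stats count by URI` after the rex, as in the first search, would then produce the rolled-up counts.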

woodcock
Esteemed Legend

Check out these great apps:

URL Parser: https://splunkbase.splunk.com/app/1545/
URL Toolbox: https://splunkbase.splunk.com/app/2734/
URL Expander (what is that tinyurl?): https://splunkbase.splunk.com/app/3460/

newbie2tech
Communicator

Thank you, woodcock, for pointing me to these apps. I will check them out, but I would not be able to install and use them in production for the current problem. I will upvote the response for the guidance.

newbie2tech
Communicator

Thank you, somesoni2, for the response. It works in most cases except the scenarios below (my bad, as these were not part of the original ask). Can the same regex be tweaked to accommodate these as well? I am accepting the answer because you handled this complex scenario with a simple regex. I hope the scenarios below can be accommodated too.

A forward slash at the end of the desired output is not mandatory; there is no harm in having one if it makes things easier.
/home/clothing/dry-fit/000000274/ --> should be /home/clothing/dry-fit
/home/women-athelete/kobe/22540S836/ --> should be /home/women-athelete/kobe
/home/search/products? --> should be /home/search/products and NOT /home/search
/home/search/products/?id=home --> should be /home/search/products and NOT /home/search
/home/account-summary/purchases --> should be /home/account-summary/purchases and NOT /home/account-summary
/purchases --> should be captured, current regex ignores this
/home/products/all --> should be captured, current regex makes it /home/products/

somesoni2
Revered Legend

Try the updated regex in my answer.
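
For anyone wanting to confirm this, the follow-up scenarios can be checked against the updated regex with a short Python sketch. It is an illustration under assumptions, not something from the thread: Python needs the `(?P<URI>...)` spelling for the named group, and the names `FOLLOW_UP_CASES` and `rollup` are hypothetical.

```python
import re

# Updated regex from the answer, with Python's (?P<name>...) group syntax.
ROLLUP = re.compile(r"^(?P<URI>(\/[^\/\?]+){1,3})")

def rollup(uri):
    """Return the rolled-up path prefix, or None when the uri does not match."""
    match = ROLLUP.match(uri)
    return match.group("URI") if match else None

# (input uri, output requested in the follow-up comment)
FOLLOW_UP_CASES = [
    ("/home/clothing/dry-fit/000000274/", "/home/clothing/dry-fit"),
    ("/home/women-athelete/kobe/22540S836/", "/home/women-athelete/kobe"),
    ("/home/search/products?", "/home/search/products"),
    ("/home/search/products/?id=home", "/home/search/products"),
    ("/home/account-summary/purchases", "/home/account-summary/purchases"),
    ("/purchases", "/purchases"),
    ("/home/products/all", "/home/products/all"),
]

for uri, wanted in FOLLOW_UP_CASES:
    got = rollup(uri)
    status = "ok" if got == wanted else "MISMATCH"
    print(f"{status}: {uri} -> {got}")
```

Each case comes out as requested because the pattern keeps one to three segments made of characters other than `/` and `?`. A single-segment uri such as /purchases still matches, and a trailing slash or query string is simply left out of the capture.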

newbie2tech
Communicator

Thank you, somesoni2. The updated regex worked perfectly for all my scenarios. Thanks a ton.

masonmorales
Influencer

Can you help us understand what the "higher level" you want to roll up to is? Also, what would your desired output look like?

newbie2tech
Communicator

Hi masonmorales, the desired output is what I meant by rolling up to a higher level: I want to display counts for these navigations. Essentially, all the URI patterns that contain digits or a ? are specific to individuals, so we only want to consider the path up to the point where a digit or ? is encountered.

For example, our logs will have 3 values for the uri field, as below:

/Home/men-kids/summerwear/1059770200
/Home/men-kids/summerwear/1059770201
/Home/men-kids/summerwear/1059770202

We just want to count them as:
uri count
/Home/men-kids/summerwear 3

desired output

URI Count
/Home/ 50
/Home/account-settings/ 100
/Home/men/clothing-360/ 20
/Home/men-kids/summerwear/ 123
/Home/women-athelets/shoes 100
/Home/article/reviews/ 200
/Home/account-summary or /Home/account-summary? 220
/Home/search or /Home/search? 112
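
The roll-up-then-count idea described here can be sketched in a few lines of Python, using the three summerwear values from the example. This assumes the updated regex posted in this thread (rewritten with Python's `(?P<URI>...)` group syntax) and stands in for what `stats count by URI` does in Splunk; the function name `count_by_rollup` is hypothetical.

```python
import re
from collections import Counter

# Updated regex from this thread, with Python's named-group spelling.
ROLLUP = re.compile(r"^(?P<URI>(\/[^\/\?]+){1,3})")

def count_by_rollup(uris):
    """Count uri values by their rolled-up prefix, skipping non-matching values."""
    counts = Counter()
    for uri in uris:
        match = ROLLUP.match(uri)
        if match:
            counts[match.group("URI")] += 1
    return counts

# The three log values from the example above.
logged = [
    "/Home/men-kids/summerwear/1059770200",
    "/Home/men-kids/summerwear/1059770201",
    "/Home/men-kids/summerwear/1059770202",
]

for prefix, count in count_by_rollup(logged).items():
    print(prefix, count)
```

All three values collapse to /Home/men-kids/summerwear with a count of 3, which matches the small table in the example.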
