Help with Regex

newbie2tech
Communicator

Hi All,

I need help with a regex to extract the desired output from the patterns below. I have an ecommerce site where we want to gather navigation stats. I have a field called uri with values like those listed below.

I want a regex that lets me report stats rolled up to a higher level instead of by individual page view.

Can this be achieved with a single regex? Right now I have 4 different regexes, one for each pattern, and I collate their numbers to generate the summary.

uri field sample values
/Home/
/Home/account-settings/
/Home/men/clothing-360/XXL
/Home/men-kids/summerwear/1059770200
/Home/women-athelets/shoes/1793254100?tab=kobe-xyz
/Home/article/reviews/?id=reviewsMetadata-abc033
/Home/account-summary?actions=[setAccountType:personal;selectTab:rewards]
/Home/search?q=2016%20%20form%201099

desired output

/Home/
/Home/account-settings/
/Home/men/clothing-360/
/Home/men-kids/summerwear/
/Home/women-athelets/shoes
/Home/article/reviews/
/Home/account-summary or /Home/account-summary?
/Home/search or /Home/search?

Please let me know if any further information is needed. I am on Splunk 6.5.2.


somesoni2
Revered Legend

Try this (everything except the last line only generates sample data; replace it with your own search):

| gentimes start=-1 | eval raw="/Home/ /Home/account-settings/ /Home/men/clothing-360/XXL /Home/men-kids/summerwear/1059770200 /Home/women-athelets/shoes/1793254100?tab=kobe-xyz /Home/article/reviews/?id=reviewsMetadata-abc033 /Home/account-summary?actions=[setAccountType:personal;selectTab:rewards] /Home/search?q=2016%20%20form%201099" | table raw | makemv raw | mvexpand raw | rename raw as uri 
| rex field=uri "^(?<URI>.+)\/.*" | stats count by URI

Updated regex (keeps at most the first three path segments and drops any query string):

..your base search..| rex field=uri "^(?<URI>(\/[^\/\?]+){1,3})"
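
For anyone who wants to sanity-check the updated regex outside Splunk, here is a minimal Python sketch. It assumes Python's `re` module treats this pattern the same way Splunk's rex does, and it spells the named group as `(?P<URI>...)` because Python requires that form. The `roll_up` helper and the `samples` list are names made up for this illustration; the values come from the question.

```python
import re

# Python spelling of the updated rex pattern: one to three path segments,
# where a segment is "/" followed by characters other than "/" and "?".
ROLLUP = re.compile(r"^(?P<URI>(/[^/?]+){1,3})")

def roll_up(uri):
    """Return the rolled-up path, or None when the pattern does not match."""
    m = ROLLUP.match(uri)
    return m.group("URI") if m else None

# Sample uri values taken from the question.
samples = [
    "/Home/",
    "/Home/account-settings/",
    "/Home/men/clothing-360/XXL",
    "/Home/men-kids/summerwear/1059770200",
    "/Home/women-athelets/shoes/1793254100?tab=kobe-xyz",
    "/Home/article/reviews/?id=reviewsMetadata-abc033",
    "/Home/account-summary?actions=[setAccountType:personal;selectTab:rewards]",
    "/Home/search?q=2016%20%20form%201099",
]

for uri in samples:
    print(f"{uri} -> {roll_up(uri)}")
```

Each sample should come back without its trailing ID or query string; for example, the search URI is reduced to /Home/search.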

woodcock
Esteemed Legend

Check out these great apps:

URL Parser: https://splunkbase.splunk.com/app/1545/
URL Toolbox: https://splunkbase.splunk.com/app/2734/
URL Expander (what is that tinyurl?): https://splunkbase.splunk.com/app/3460/

newbie2tech
Communicator

Thank you, woodcock, for pointing me to these apps. I will check them out; however, I would not be able to install and use them in production for the current problem. I will upvote the response for the guidance.

newbie2tech
Communicator

Thank you, somesoni2, for the response. It works in most cases except the scenarios below (my bad, as these were not part of the original ask). Can the same regex be tweaked to accommodate them as well? I am accepting the answer because you handled this complex scenario with a simple regex.

A forward slash at the end of the desired output is not mandatory, but there is no harm in having one if it makes things easier.
/home/clothing/dry-fit/000000274/ --> should be /home/clothing/dry-fit
/home/women-athelete/kobe/22540S836/ --> should be /home/women-athelete/kobe
/home/search/products? --> should be /home/search/products and NOT /home/search
/home/search/products/?id=home --> should be /home/search/products and NOT /home/search
/home/account-summary/purchases --> should be /home/account-summary/purchases and NOT /home/account-summary
/purchases --> should be captured; the current regex ignores this
/home/products/all --> should be captured; the current regex makes it /home/products/
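
The follow-up cases above can be checked offline with a short Python sketch that runs both patterns from the answer side by side. This is only an illustration, assuming Python's `re` matches these patterns the same way Splunk's rex does; `greedy`, `segments`, and `cases` are hypothetical names, and Python needs `(?P<URI>...)` instead of `(?<URI>...)`.

```python
import re

# First pattern from the answer: greedy match up to the last slash.
GREEDY = re.compile(r"^(?P<URI>.+)/.*")
# Updated pattern: one to three segments, none containing "/" or "?".
SEGMENTS = re.compile(r"^(?P<URI>(/[^/?]+){1,3})")

def extract(pattern, uri):
    """Return the URI capture group, or None when there is no match."""
    m = pattern.match(uri)
    return m.group("URI") if m else None

def greedy(uri):
    return extract(GREEDY, uri)

def segments(uri):
    return extract(SEGMENTS, uri)

# Follow-up inputs mapped to the output requested in the post.
cases = {
    "/home/clothing/dry-fit/000000274/": "/home/clothing/dry-fit",
    "/home/women-athelete/kobe/22540S836/": "/home/women-athelete/kobe",
    "/home/search/products?": "/home/search/products",
    "/home/search/products/?id=home": "/home/search/products",
    "/home/account-summary/purchases": "/home/account-summary/purchases",
    "/purchases": "/purchases",
    "/home/products/all": "/home/products/all",
}

for uri, wanted in cases.items():
    print(f"{uri}: wanted={wanted} greedy={greedy(uri)} segments={segments(uri)}")
```

The greedy pattern needs at least one character before a later slash, so it cannot match /purchases at all, and it always cuts at the last slash, which is why /home/products/all loses its final segment.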

somesoni2
Revered Legend

Try the updated regex in my answer above.

newbie2tech
Communicator

Thank you, somesoni2. The updated regex worked perfectly for all my scenarios. Thanks a ton.

masonmorales
Influencer

Can you help us understand what the "higher level" you want to roll up to is? Also, what would your desired output look like?

newbie2tech
Communicator

Hi masonmorales, the desired output is what I meant by rolling up to a higher level: I want to display a count for each of these navigation paths. Essentially, all the URI patterns that contain digits or a ? are specific to individuals, so we only want to consider the URI up to the point where a digit or ? is encountered.

For example, our logs will contain 3 values for the uri field like these:

/Home/men-kids/summerwear/1059770200
/Home/men-kids/summerwear/1059770201
/Home/men-kids/summerwear/1059770202

We just want to count them as:
uri                          count
/Home/men-kids/summerwear    3

desired output

URI                                                Count
/Home/                                             50
/Home/account-settings/                            100
/Home/men/clothing-360/                            20
/Home/men-kids/summerwear/                         123
/Home/women-athelets/shoes                         100
/Home/article/reviews/                             200
/Home/account-summary or /Home/account-summary?    220
/Home/search or /Home/search?                      112
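
The roll-up count itself, which `stats count by URI` does in Splunk, can be sketched in Python with `collections.Counter`. This assumes the three-segment pattern from the accepted answer (written with Python's `(?P<URI>...)` syntax); `count_by_rollup` is a hypothetical helper name, and the input is the three-URI example above, not real log data.

```python
import re
from collections import Counter

# Three-segment pattern from the accepted answer, in Python syntax.
ROLLUP = re.compile(r"^(?P<URI>(/[^/?]+){1,3})")

def count_by_rollup(uris):
    """Count URIs grouped by their rolled-up path."""
    counts = Counter()
    for uri in uris:
        m = ROLLUP.match(uri)
        if m:  # URIs that do not match the pattern are skipped
            counts[m.group("URI")] += 1
    return counts

# The three example values from the post.
logged = [
    "/Home/men-kids/summerwear/1059770200",
    "/Home/men-kids/summerwear/1059770201",
    "/Home/men-kids/summerwear/1059770202",
]

for path, count in count_by_rollup(logged).items():
    print(path, count)
```

All three product URIs fall under the same rolled-up path, so the sketch reports a single row with a count of 3.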
