Splunk Search

How to find the five most commonly used categories specific to my data

splunkman341
Communicator

Hi guys,

I am trying to pull the five most commonly used categories and the five most commonly used subcategories from my logs.

Below is a sample log of the information I am trying to pull out.

Category and subcategory have a parent-child relationship, with category being the parent. Can someone lend a hand?

Thank you

1 Solution

woodcock
Esteemed Legend

You have some options (try both for the last pair):

For Category:

    ... | rex  ".*Category:(?<Category>\d+), subCategory:(?<subCategory>\d+)" | stats count by Category | sort - count | head 5

Or

    ... | rex  ".*Category:(?<Category>\d+), subCategory:(?<subCategory>\d+)" | top limit=5 Category

For subCategory:

    ... | rex  ".*Category:(?<Category>\d+), subCategory:(?<subCategory>\d+)" | stats count by subCategory | sort - count | head 5

Or

    ... | rex  ".*Category:(?<Category>\d+), subCategory:(?<subCategory>\d+)" | top limit=5 subCategory

But perhaps for subCategory you need:

    ... | rex  ".*Category:(?<Category>\d+), subCategory:(?<subCategory>\d+)" | stats count by Category,subCategory | sort - count | head 5

Or

    ... | rex  ".*Category:(?<Category>\d+), subCategory:(?<subCategory>\d+)" | top limit=5 subCategory by Category
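
The last pair of searches is not interchangeable, which is presumably why both are worth trying. As a rough illustration outside Splunk, the Python sketch below mimics the two behaviours on hypothetical (Category, subCategory) pairs; the data and function names are made up for this example, and the percent column that `top` also reports is left out.

```python
from collections import Counter, defaultdict

# Hypothetical (Category, subCategory) pairs standing in for extracted fields.
pairs = [
    ("1", "10"), ("1", "10"), ("1", "10"),
    ("1", "11"), ("1", "11"),
    ("2", "20"),
]


def top_pairs_overall(pairs, limit=5):
    """Like `stats count by Category,subCategory | sort - count | head <limit>`:
    the limit applies to the whole result set."""
    return Counter(pairs).most_common(limit)


def top_subcategories_per_category(pairs, limit=5):
    """Like `top limit=<limit> subCategory by Category`:
    the limit applies separately within each Category."""
    per_category = defaultdict(Counter)
    for category, subcategory in pairs:
        per_category[category][subcategory] += 1
    return {cat: counts.most_common(limit) for cat, counts in per_category.items()}


overall = top_pairs_overall(pairs, limit=2)
per_category = top_subcategories_per_category(pairs, limit=2)
print(overall)
print(per_category)
```

With a limit of 2, the overall version returns only pairs from Category 1, while the per-category version still includes Category 2, because each Category gets its own top list.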

woodcock
Esteemed Legend

OK, it took a WebEx to figure out, but the problem was that the events did not look like the sample event given, so the RegEx was wrong. Here is what worked 100%:

    index=doccloud_main sourcetype=doccloud_catalina "Document workspace" | rex "Category:\s*(?<Category>[^,]*),\s*subCategory:\s*(?<subCategory>.*)" | top 5 subCategory by Category
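
To see why the looser pattern can match where the digits-only one does not, here is a small Python sketch. The event text is hypothetical (the real events are not shown in this thread), and Python's `re` spells named groups `(?P<name>...)` where `rex` uses `(?<name>...)`.

```python
import re

# Hypothetical event text; the actual log lines are not shown in the thread.
EVENTS = [
    "Document workspace created. Category:12, subCategory:34",
    "Document workspace created. Category: Finance, subCategory: Invoices",
]

# Digits-only pattern from the first answer: no whitespace allowed after the colons.
STRICT = re.compile(r".*Category:(?P<Category>\d+), subCategory:(?P<subCategory>\d+)")

# Pattern from the working search: optional whitespace, any text as the value.
LOOSE = re.compile(r"Category:\s*(?P<Category>[^,]*),\s*subCategory:\s*(?P<subCategory>.*)")


def extract(pattern, event):
    """Return the named groups as a dict, or None when the pattern does not match."""
    match = pattern.search(event)
    return match.groupdict() if match else None


strict_results = [extract(STRICT, event) for event in EVENTS]
loose_results = [extract(LOOSE, event) for event in EVENTS]
print(strict_results)
print(loose_results)
```

In this sketch the strict pattern extracts fields only from the first, purely numeric event, while the loose pattern extracts them from both.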

splunkman341
Communicator

Hey woodcock, thanks for your answer.

I created a query based on some of the info you gave me and came up with:

    index=doccloud_main sourcetype=doccloud_catalina "Document workspace" | stats count by Category,subCategory | sort - count | head 5

But when I click on the Statistics tab, it displays "No results found".

Can you please help?

woodcock
Esteemed Legend

I thought you had already created the field extractions, but I guess not; you need to add this (examples above are corrected):

    ... | rex ".*Category:(?<Category>\d+), subCategory:(?<subCategory>\d+)"

splunkman341
Communicator

I added more to the query and tried to execute:

    index=doccloud_main sourcetype=doccloud_catalina "Document workspace" | rex ".*Category:(?<Category>\d+), subCategory:(?<subCategory>\d+)" | top limit=5 subCategory by Category

and still got the same result as before.

woodcock
Esteemed Legend

Do you get any results from any of the searches? What you are describing makes no sense at all (this is a very basic thing)! Try stripping off everything after each pipe ("|") starting from the right side and see if what you see each time makes sense and find out where things break down.
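
The right-to-left stripping described above can be written out mechanically. The Python sketch below is only an illustration: `pipeline_prefixes` is a made-up helper, and it splits naively on every "|", so it would mis-split a search that contains a pipe inside quotes or inside a regex.

```python
def pipeline_prefixes(search):
    """Return the full search, then the search with one trailing stage removed at a time.

    Naive assumption: every "|" separates two stages, so pipes inside quoted
    strings or regular expressions are not handled.
    """
    stages = [stage.strip() for stage in search.split("|")]
    return [" | ".join(stages[:n]) for n in range(len(stages), 0, -1)]


search = (
    r'index=doccloud_main sourcetype=doccloud_catalina "Document workspace" '
    r'| rex ".*Category:(?<Category>\d+), subCategory:(?<subCategory>\d+)" '
    r'| top limit=5 subCategory by Category'
)

prefixes = pipeline_prefixes(search)
for prefix in prefixes:
    print(prefix)
```

Running each printed search in turn, and checking both the Events and the Statistics tabs, shows the first stage after which the results stop making sense.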

splunkman341
Communicator

    index=doccloud_main sourcetype=doccloud_catalina "Document workspace" | rex ".*Category:(?<Category>\d+), subCategory:(?<subCategory>\d+)"

I tried all of the above after the regex, and nothing works past the regex; I'm not sure why. The search above is pretty much the only thing that executes correctly.

woodcock
Esteemed Legend

Are you sure there are events in your time range and base search? Try it for "All time" and also try it for index=doccloud_main (without the sourcetype=doccloud_catalina "Document workspace"). There is something fundamentally wrong about the basics of what you are saying.

splunkman341
Communicator

I apologize for my wording. All of the above queries execute and display events; it's just that when you click on the Statistics or Visualization tabs, nothing is displayed. I also tried removing the sourcetype and the search string that follows it, and I get the same results: events are displayed but not statistics.
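
These symptoms are consistent with a `rex` pattern that never matches: the events still pass through and are displayed, but the Category and subCategory fields are never created, so a count grouped by those fields has nothing to group. The Python sketch below models that behaviour under stated assumptions; the events are hypothetical, and `rex` and `stats_count_by` are simplified stand-ins for the Splunk commands, not their actual implementations.

```python
import re
from collections import Counter

# Hypothetical events; the thread does not show the real log lines.
events = [
    {"_raw": "Document workspace created. Category: Finance, subCategory: Invoices"},
    {"_raw": "Document workspace created. Category: Legal, subCategory: Contracts"},
]

# The digits-only pattern from the thread, in Python's (?P<name>...) spelling.
pattern = re.compile(r".*Category:(?P<Category>\d+), subCategory:(?P<subCategory>\d+)")


def rex(events, pattern):
    """Add named groups as fields on a match; otherwise pass the event through unchanged."""
    out = []
    for event in events:
        match = pattern.search(event["_raw"])
        out.append({**event, **match.groupdict()} if match else dict(event))
    return out


def stats_count_by(events, *fields):
    """Count events per field combination, skipping events that lack any of the fields."""
    counts = Counter()
    for event in events:
        if all(field in event for field in fields):
            counts[tuple(event[field] for field in fields)] += 1
    return counts


after_rex = rex(events, pattern)
rows = stats_count_by(after_rex, "Category", "subCategory")
print(len(after_rex), len(rows))
```

Here both events survive the `rex` stage, yet the grouped count has no rows, which matches events appearing while the Statistics tab stays empty.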
