Splunk Search

How to find the most prevalent values for a field

bigtyma
Communicator

Title Position Security_Template
NonEE - Volunteer F01-7121-600002 IP VIEW ONLY CLINICAL
NonEE - Volunteer F01-7121-600002 FRONT DESK AND IP ORDERS CLERK
NonEE - Volunteer F01-7121-600002 IP VIEW ONLY CLINICAL
NonEE - Volunteer F01-7121-600002 IP VIEW ONLY CLINICAL
NonEE - Volunteer F01-7121-600002 ELINK VIEW-ONLY CLINICAL

Using the example above, I would like to have a report listing the most frequent or most common value of 'Security_Template' for every Position. In the example above the most prevalent would be 'IP VIEW-ONLY'

Bonus point is if we can list the percentage of records that have each value.

Thank you!

Tags (3)
0 Karma

tgow
Splunk Employee
Splunk Employee

If the field is already extracted then you can run the "top" command. Here is a quick example:

source=... | top Security_Template, PositionID

0 Karma

bigtyma
Communicator

woohoo! this works. I had to to change it slightly to ' | top Security_Template by PositionID

0 Karma

bigtyma
Communicator

Thank you for replying, unfortunately I did not get the results I was hoping for. Security_Template is already extracted.

In this scenario I should see

PositionID SecurityTemplate Percentage
F01-7121-600002 IP VIEW ONLY CLINICAL 50%

0 Karma

bigtyma
Communicator

"wdd-d{4}-d{6}" format?

Yes.

0 Karma

tgow
Splunk Employee
Splunk Employee

You will need to extract the field first. Here is a quick example that might work:

source = .... | rex field=_raw "-\d+\s+(?[^\n]+) | stats count by Security_Template | sort -count

If you want it by percentage then use the "top" search command.

source = .... | rex field=_raw "-\d+\s+(?[^\n]+) | top Security_Template

If this is a CSV file with a header then you might be able to set the CHECK_FOR_HEADER in the props.conf file. Here is a link to more information:

http://docs.splunk.com/Documentation/Splunk/5.0.1/admin/Propsconf

0 Karma

bigtyma
Communicator

Thank you for replying, unfortunately I did not get the results I was hoping for. Security_Template is already extracted.

In this scenario I should see

PositionID SecurityTemplate Percentage
F01-7121-600002 IP VIEW ONLY CLINICAL 50%

0 Karma

alacercogitatus
SplunkTrust
SplunkTrust

The data is not easy to split into fields because of the number of spaces. Will the position column always be in the "\w\d\d-\d{4}-\d{6}" format?

0 Karma
Get Updates on the Splunk Community!

Routing logs with Splunk OTel Collector for Kubernetes

The Splunk Distribution of the OpenTelemetry (OTel) Collector is a product that provides a way to ingest ...

Welcome to the Splunk Community!

(view in My Videos) We're so glad you're here! The Splunk Community is place to connect, learn, give back, and ...

Tech Talk | Elevating Digital Service Excellence: The Synergy of Splunk RUM & APM

Elevating Digital Service Excellence: The Synergy of Real User Monitoring and Application Performance ...