Splunk Search

How to merge events and average based upon specific fields?

sityuages
New Member

First, the background - I have a number of events that are parsed and indexed. The format of the log file is:

[timestamp] [Sub1] Name="Bob" Count=2 [Sub2] Name="Sarah" Count=5 [Sub1] Name="Sarah" Count=3 [Sub2] Name="Bob" Count=0

[timestamp] [Sub1] Name="Bob" Count=5 [Sub2] Name="Sarah" Count=3 [Sub1] Name="Sarah" Count=0 [Sub2] Name="Bob" Count=3

[timestamp] [Sub1] Name="Bob" Count=1 [Sub2] Name="Sarah" Count=2 [Sub1] Name="Sarah" Count=0 [Sub2] Name="Bob" Count=2

The [timestamp] is the start of a new event. Splunk splits these lines into their own events properly. The [Sub1] and [Sub2] tags denote different items in my system, so for now I am only wanting to use the fields that come after Sub1.

I cannot determine what Search query to use that will search the indexed data, take the average of the Counts based upon [Sub1] and each "Name", and spit the results out.

Does anyone have an idea on this?

Tags (2)
0 Karma
1 Solution

somesoni2
Revered Legend

Try something like this

your base search | table _time _raw | rex max_match=0 "\[Sub1\]\s*(?<Temp>[^\[]+)" | fields - _raw | mvexpand Temp| rex field=Temp "s*Name=\"(?<Name>[^\"]+)\s*Count=(?<Count>\d+)" | timechart avg(Count) by Name 

First rex command will extract everything between [Sub1] till first occurrence of "[", since there can be multiple such sections, max_match=0 for multivalued field, mvexpand to generate separate row for each Name, another rex for extracting Name and Count and finally the timechart with Avg Count by Name.

View solution in original post

somesoni2
Revered Legend

Try something like this

your base search | table _time _raw | rex max_match=0 "\[Sub1\]\s*(?<Temp>[^\[]+)" | fields - _raw | mvexpand Temp| rex field=Temp "s*Name=\"(?<Name>[^\"]+)\s*Count=(?<Count>\d+)" | timechart avg(Count) by Name 

First rex command will extract everything between [Sub1] till first occurrence of "[", since there can be multiple such sections, max_match=0 for multivalued field, mvexpand to generate separate row for each Name, another rex for extracting Name and Count and finally the timechart with Avg Count by Name.

sityuages
New Member

This worked very well. The only change I made was to break out the rex for extracting Name and Count into 2 separate ones. Thank you somesoni2!

0 Karma

aljohnson_splun
Splunk Employee
Splunk Employee

I think a little extra field extraction might go a long way, in terms of making sure that you are only dealing with the names that directly follow the [Sub1].

First, extract it with rex:

| rex "\[Sub1\] Name="(?<sub1_name>[^"])"

Then, just do a nice lil timechart.

| timechart count by sub1_name

So the final would be

search here
| rex "\[Sub1\] Name="(?<sub1_name>[^"])"
| timechart count by sub1_name
0 Karma
Get Updates on the Splunk Community!

Stay Connected: Your Guide to May Tech Talks, Office Hours, and Webinars!

Take a look below to explore our upcoming Community Office Hours, Tech Talks, and Webinars this month. This ...

They're back! Join the SplunkTrust and MVP at .conf24

With our highly anticipated annual conference, .conf, comes the fez-wearers you can trust! The SplunkTrust, as ...

Enterprise Security Content Update (ESCU) | New Releases

Last month, the Splunk Threat Research Team had two releases of new security content via the Enterprise ...