Splunk Search

How to merge events and average based upon specific fields?

sityuages
New Member

First, the background - I have a number of events that are parsed and indexed. The format of the log file is:

[timestamp] [Sub1] Name="Bob" Count=2 [Sub2] Name="Sarah" Count=5 [Sub1] Name="Sarah" Count=3 [Sub2] Name="Bob" Count=0

[timestamp] [Sub1] Name="Bob" Count=5 [Sub2] Name="Sarah" Count=3 [Sub1] Name="Sarah" Count=0 [Sub2] Name="Bob" Count=3

[timestamp] [Sub1] Name="Bob" Count=1 [Sub2] Name="Sarah" Count=2 [Sub1] Name="Sarah" Count=0 [Sub2] Name="Bob" Count=2

The [timestamp] is the start of a new event. Splunk splits these lines into their own events properly. The [Sub1] and [Sub2] tags denote different items in my system, so for now I am only wanting to use the fields that come after Sub1.

I cannot determine what Search query to use that will search the indexed data, take the average of the Counts based upon [Sub1] and each "Name", and spit the results out.

Does anyone have an idea on this?

Tags (2)
0 Karma
1 Solution

somesoni2
Revered Legend

Try something like this

your base search | table _time _raw | rex max_match=0 "\[Sub1\]\s*(?<Temp>[^\[]+)" | fields - _raw | mvexpand Temp| rex field=Temp "s*Name=\"(?<Name>[^\"]+)\s*Count=(?<Count>\d+)" | timechart avg(Count) by Name 

First rex command will extract everything between [Sub1] till first occurrence of "[", since there can be multiple such sections, max_match=0 for multivalued field, mvexpand to generate separate row for each Name, another rex for extracting Name and Count and finally the timechart with Avg Count by Name.

View solution in original post

somesoni2
Revered Legend

Try something like this

your base search | table _time _raw | rex max_match=0 "\[Sub1\]\s*(?<Temp>[^\[]+)" | fields - _raw | mvexpand Temp| rex field=Temp "s*Name=\"(?<Name>[^\"]+)\s*Count=(?<Count>\d+)" | timechart avg(Count) by Name 

First rex command will extract everything between [Sub1] till first occurrence of "[", since there can be multiple such sections, max_match=0 for multivalued field, mvexpand to generate separate row for each Name, another rex for extracting Name and Count and finally the timechart with Avg Count by Name.

sityuages
New Member

This worked very well. The only change I made was to break out the rex for extracting Name and Count into 2 separate ones. Thank you somesoni2!

0 Karma

aljohnson_splun
Splunk Employee
Splunk Employee

I think a little extra field extraction might go a long way, in terms of making sure that you are only dealing with the names that directly follow the [Sub1].

First, extract it with rex:

| rex "\[Sub1\] Name="(?<sub1_name>[^"])"

Then, just do a nice lil timechart.

| timechart count by sub1_name

So the final would be

search here
| rex "\[Sub1\] Name="(?<sub1_name>[^"])"
| timechart count by sub1_name
0 Karma
Get Updates on the Splunk Community!

Announcing Scheduled Export GA for Dashboard Studio

We're excited to announce the general availability of Scheduled Export for Dashboard Studio. Starting in ...

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics GA in US-AWS!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...