Solved: Re: eval mean(something) when data is split by ano...

dang · ‎11-11-2011

I'm looking to build some reports around error counts in our system. I've got a splunk search which returns an error count by server (ComputerName) using the following query

index=monitoring "Group=errors" | stats sum(linecount) as "Error Count" by ComputerName

What I'd like to do is add another column to this chart whch shows the average/mean/whateveryoucallit of all of the lines (the linecount field), so I would get output something like the following:

ComputerName     ErrorCount     Average
-------------    -----------    --------
Computer1         2              5
Computer2         10             5
Computer3         3              5

I've tried append, and appendcols - neither has worked as I would like. I think what I really want is an eval statement to define the Average, but I can't seem to get results for both the above and eval mean(linecount) because the first search is by ComputerName and the second search is for all items. Can anyone please point me in the correct direction?

Ayn · ‎11-11-2011

Is there a reason why you don't want to just add the mean as a second statistical operator in the stats command?

index=monitoring "Group=errors" | stats sum(linecount) as "Error Count", mean(linecount) as "Average" by ComputerName

EDIT: So, I didn't catch originally that you meant an average for all events that should always be included in each stats line. This could be done by making sure that average is always available in a field for each event (I'm calling it "lineavg" in this example) and then pull that absolute value into the stats command using something like first() or max(). To get the average, use a subsearch where the output field is called "query". This will make the subsearch return output that's suitable for eval.

index=monitoring "Group=errors" | eval lineavg=[search index=monitoring "Group=errors" | stats mean(linecount) as query | fields query] | stats sum(linecount) as "Error Count", first(lineavg) as "Average" by ComputerName

View solution in original post

gkanapathy · ‎11-12-2011

You want to use eventstats:

index=monitoring "Group=errors" | eventstats mean(linecount) as avglinecount | stats sum(linecount) as "Error Count", first(avglinecount) as "Average" by ComputerName

This works somewhat better than the version with a subsearch, at it only needs a single pass over the data, rather than two passes. Another way that works is:

   index=monitoring "Group=errors" | sistats mean(linecount),sum(linecount) by ComputerName | eventstats mean(linecount) as avglinecount | stats sum(linecount) as "Error Count" by ComputerName, avglinecount

This is just a performance tweak on the previous, but it's only useful if you have a very large number of errors per ComputerName, i.e., if the ratio between the size of your original base query (index=monitoring "Group=errors") and the final number of results you get at then end of the full query is very large.

chustar · ‎04-24-2015

This is a really good solution. I didn't know about eventstats

Ayn · ‎11-11-2011

Is there a reason why you don't want to just add the mean as a second statistical operator in the stats command?

index=monitoring "Group=errors" | stats sum(linecount) as "Error Count", mean(linecount) as "Average" by ComputerName

EDIT: So, I didn't catch originally that you meant an average for all events that should always be included in each stats line. This could be done by making sure that average is always available in a field for each event (I'm calling it "lineavg" in this example) and then pull that absolute value into the stats command using something like first() or max(). To get the average, use a subsearch where the output field is called "query". This will make the subsearch return output that's suitable for eval.

index=monitoring "Group=errors" | eval lineavg=[search index=monitoring "Group=errors" | stats mean(linecount) as query | fields query] | stats sum(linecount) as "Error Count", first(lineavg) as "Average" by ComputerName

dang · ‎11-11-2011

Thanks, Ayn, that seems to do what I want it to do. Now I just need to wrap my brain around how it all works.

For some reason, the math is working out strangely, but that is probably more related to my data than your query, as it all works out logically in small volumes. Thanks for the help.

Ayn · ‎11-11-2011

Ah, I see. Sorry, didn't catch that. Updating my answer with a suggestion on how to solve your problem.

dang · ‎11-11-2011

The reason I'm not using that method is that it returns per-computer averages, and that's not what I'm looking for.

eval mean(something) when data is split by another field

Stay Connected: Your Guide to May Tech Talks, Office Hours, and Webinars!

They're back! Join the SplunkTrust and MVP at .conf24

Enterprise Security Content Update (ESCU) | New Releases