I'm looking to build some reports around error counts in our system. I've got a splunk search which returns an error count by server (ComputerName) using the following query
index=monitoring "Group=errors" | stats sum(linecount) as "Error Count" by ComputerName
What I'd like to do is add another column to this chart whch shows the average/mean/whateveryoucallit of all of the lines (the linecount field), so I would get output something like the following:
ComputerName ErrorCount Average
------------- ----------- --------
Computer1 2 5
Computer2 10 5
Computer3 3 5
I've tried append, and appendcols - neither has worked as I would like. I think what I really want is an eval statement to define the Average, but I can't seem to get results for both the above and eval mean(linecount)
because the first search is by ComputerName and the second search is for all items. Can anyone please point me in the correct direction?
Is there a reason why you don't want to just add the mean as a second statistical operator in the stats
command?
index=monitoring "Group=errors" | stats sum(linecount) as "Error Count", mean(linecount) as "Average" by ComputerName
EDIT: So, I didn't catch originally that you meant an average for all events that should always be included in each stats line. This could be done by making sure that average is always available in a field for each event (I'm calling it "lineavg" in this example) and then pull that absolute value into the stats
command using something like first()
or max()
. To get the average, use a subsearch where the output field is called "query". This will make the subsearch return output that's suitable for eval
.
index=monitoring "Group=errors" | eval lineavg=[search index=monitoring "Group=errors" | stats mean(linecount) as query | fields query] | stats sum(linecount) as "Error Count", first(lineavg) as "Average" by ComputerName
You want to use eventstats
:
index=monitoring "Group=errors" | eventstats mean(linecount) as avglinecount | stats sum(linecount) as "Error Count", first(avglinecount) as "Average" by ComputerName
This works somewhat better than the version with a subsearch, at it only needs a single pass over the data, rather than two passes. Another way that works is:
index=monitoring "Group=errors" | sistats mean(linecount),sum(linecount) by ComputerName | eventstats mean(linecount) as avglinecount | stats sum(linecount) as "Error Count" by ComputerName, avglinecount
This is just a performance tweak on the previous, but it's only useful if you have a very large number of errors per ComputerName, i.e., if the ratio between the size of your original base query (index=monitoring "Group=errors"
) and the final number of results you get at then end of the full query is very large.
This is a really good solution. I didn't know about eventstats
Is there a reason why you don't want to just add the mean as a second statistical operator in the stats
command?
index=monitoring "Group=errors" | stats sum(linecount) as "Error Count", mean(linecount) as "Average" by ComputerName
EDIT: So, I didn't catch originally that you meant an average for all events that should always be included in each stats line. This could be done by making sure that average is always available in a field for each event (I'm calling it "lineavg" in this example) and then pull that absolute value into the stats
command using something like first()
or max()
. To get the average, use a subsearch where the output field is called "query". This will make the subsearch return output that's suitable for eval
.
index=monitoring "Group=errors" | eval lineavg=[search index=monitoring "Group=errors" | stats mean(linecount) as query | fields query] | stats sum(linecount) as "Error Count", first(lineavg) as "Average" by ComputerName
Thanks, Ayn, that seems to do what I want it to do. Now I just need to wrap my brain around how it all works.
For some reason, the math is working out strangely, but that is probably more related to my data than your query, as it all works out logically in small volumes. Thanks for the help.
Ah, I see. Sorry, didn't catch that. Updating my answer with a suggestion on how to solve your problem.
The reason I'm not using that method is that it returns per-computer averages, and that's not what I'm looking for.