I have a summary-indexing (SI) search writing to index=sec with marker=01:
sourcetype=cisco_firewall | bin _time span=5m | sistats count by log_level, hostname
When I query the results with
index=sec marker=01 | stats count by log_level, hostname
I get results... but:
index=sec marker=01 | stats count(eval(log_level="7")) AS Debugging, count(eval(log_level="6")) AS Information, count(eval(log_level="5")) AS Notification, count(eval(log_level="4")) AS Warning, count(eval(log_level="3")) AS Error, count(eval(log_level="2")) AS Critical, count(eval(log_level="1")) AS Alert, count(eval(log_level="0")) AS Emergency by hostname
doesn't give results. Am I missing something? If this is not possible, how can I "SI" this search?
An SQL-like table pivot should serve as a reasonable workaround for this. It's not pretty, but it does work.
index=sec marker=01 | stats count by log_level, hostname | eval Debug_Count=if(log_level==7,count,0) | eval Info_Count=if(log_level==6,count,0) | eval Notify_Count=if(log_level==5,count,0) | eval Warning_Count=if(log_level==4,count,0) | eval Error_Count=if(log_level==3,count,0) | eval Critical_Count=if(log_level==2,count,0) | eval Alert_Count=if(log_level==1,count,0) | eval Emergency_Count=if(log_level==0,count,0) | stats max(Debug_Count) as Debugging, max(Info_Count) as Informational, max(Notify_Count) as Notification, max(Warning_Count) as Warning, max(Error_Count) as Error, max(Critical_Count) as Critical, max(Alert_Count) as Alert, max(Emergency_Count) as Emergency by hostname
The general idea is to take the "rows" containing data for each log_level and pivot them up into "columns". Coming out of the first stats is a series of counts, one per log_level/hostname combination. The series of eval operations creates new fields ("columns") that hold the actual count from a "row" if (and only if) that "row" contains data of interest for that "column"; otherwise, the value in that row is 0. Then, the second stats keeps only the non-zero value in each new field for each hostname.
This is a fairly common trick done in SQL with a combination of CASE and GROUP BY. I've just adapted it a bit to Splunk's syntax.
Oddly enough, if you change the 'sistats' to stats it'll work fine.
In general you want to be careful using the SI commands. The rule is that if you do sistats count by log_level, hostname in your summary search, then the first reporting clause run over the summary events should be an exact copy of that "si" search but without the "si" — i.e.
stats count by log_level, hostname.
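To illustrate the rule with the searches from the question: the scheduled populating search and the reporting search over the summary should mirror each other exactly, differing only in the "si" prefix --
sourcetype=cisco_firewall | bin _time span=5m | sistats count by log_level, hostname
index=sec marker=01 | stats count by log_level, hostname
Any additional reporting (like the count(eval(...)) breakdown) then has to happen after, or instead of, that matching stats clause.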
As to a full explanation of what can go off the rails when you cross the streams and why, it's a little beyond me. In this case it's so weird that it may just be a bug.
The good news is that there isn't really any reason to use sistats in such a simple use case. The fancy statistics that it preserves don't add anything here.
UPDATE: One thing you'll have to watch out for if you take this approach and use stats instead of sistats is that stats count on the search side will simply count the number of summary rows, paying no attention to the fact that those rows themselves carry 'count' fields. As such, you often have to do things like stats sum(count) as count by host when you use stats manually in summary indexing. In this particular case with the eval syntax, I'm not sure what the equivalent would be, but there is one. Watch out that you don't end up with the count of summary rows rather than the count of the original events.
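One likely equivalent — a sketch I haven't verified against a real summary index, using the same field names as the question — is to wrap the condition in sum(eval(if(...))) so each matching summary row contributes its stored count rather than 1:
index=sec marker=01 | stats sum(eval(if(log_level="7",count,0))) AS Debugging, sum(eval(if(log_level="0",count,0))) AS Emergency by hostname
The remaining severity levels would follow the same pattern.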
I know it's rather late in the piece, but I just hit the same problem. The problem seems to be that the name of the field saved in the summary index is different from the one produced by sistats, and consequently from the one expected by stats running against the summary index.
In my case the name generated by sistats was psrsvd_ct_eval(MedianDurationAboveObjective > 0), but the name in the index is psrsvd_ct_eval_MedianDurationAboveObjective___0_
So a simple workaround (for what must surely be a bug) for me was:
index=summary | rename "psrsvd_ct_eval_MedianDurationAboveObjective___0_" as "psrsvd_ct_eval(MedianDurationAboveObjective > 0)" | stats ...