Need some help adding a 0 count at search time.
I have a log that contains the execution duration of a code function.
Using stats, I can count the number of times the function took each duration:
| stats count by functionDuration
functionDuration|count
120|1
122|2
123|1
124|5
130|10
132|8
From this you can see that a functionDuration of 120 occurred 1 time, 122 occurred 2 times, etc...
I would like to show a count of 0 for the missing functionDuration values:
functionDuration|count
120|1
121|0
122|2
123|1
124|5
125|0
126|0
127|0
128|0
129|0
130|10
131|0
132|8
I searched around and couldn't find an answer.
Thanks for any help.
It would be nice if there were a command like generateRows rows=N
that would generate a result set with N rows.
However, we can hack one together ourselves using a search for the 1000 most recent events from the last 30 days:
<search terms>
| stats count by functionDuration
| append
    [| search index=* OR index=_internal earliest=-30d
    | head 1000
    | fields - *
    | fields - _*
    | streamstats count
    | rename count as functionDuration
    | eval count=0]
| stats sum(count) as count by functionDuration
| sort functionDuration
The basic idea is: use our weird subsearch hack to append 1000 rows, one for every functionDuration value from 1 to 1000, each with count=0. Then, when we wash the original "real" rows together with the fake rows through `| stats sum(count) as count by functionDuration`, every value from 1 to 1000 is represented, and the durations with no real events come out with a count of 0.
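To sanity-check the fake rows before appending them, you can run the subsearch body on its own; it should return 1000 rows, with functionDuration running from 1 to 1000 and count=0 on each:

index=* OR index=_internal earliest=-30d
| head 1000
| fields - *
| fields - _*
| streamstats count
| rename count as functionDuration
| eval count=0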
Alternates:
1) Instead of the weird index=* OR index=_internal search, you could also just use a csv in /var/run/dispatch that has 1000 rows in it (see the first sketch after this list).
2) Or you could do the trick from that other answer, where you do | eval foo="1,2,3,4,5,6,7,8,..." | eval foo=split(foo,",") | mvexpand foo (see the second sketch after this list), but I'm not sure those commands will be overjoyed at expanding out to thousands of rows... You might hit a limit around 100 or a couple hundred.
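Here's a rough sketch of alternate 1, assuming a hypothetical numbers.csv with a single functionDuration column holding the values 1 through 1000. Note that a plain | inputcsv reads files from $SPLUNK_HOME/var/run/splunk/csv; if memory serves, there's also a dispatch=true option for reading a file out of a dispatch directory instead.

<search terms>
| stats count by functionDuration
| append
    [| inputcsv numbers.csv
    | eval count=0]
| stats sum(count) as count by functionDuration
| sort functionDuration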
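And a sketch of alternate 2, shortened to ten values just for illustration. The leading | makeresults (which generates a single empty result row) is my own assumption for seeding the pipeline; it only exists on newer Splunk versions, so on older ones substitute whatever single-row trick works for you.

| makeresults
| eval foo="1,2,3,4,5,6,7,8,9,10"
| eval foo=split(foo,",")
| mvexpand foo
| rename foo as functionDuration
| eval count=0

Drop that inside the same | append [...] brackets as the main search and the zero-fill works the same way, subject to whatever mvexpand limit you run into.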
Thanks - this worked great to fill in all the 0s for my missing values.
Maybe this one could help with some inspiration? http://splunk-base.splunk.com/answers/23839/include-zero-count-in-stats-count