Splunk Search

In a visualization, how to include non-existent numeric values in "stats count by" search?

CaninChristellC
Explorer

I'm working on a report for network traffic touching my organization's firewalls, and the report looks like this right now:

index=bluecoat cs_host=[cs_host search pattern goes here] | eval grouping=floor(cs_bytes/1000) | eval minBytes=grouping*1000, maxBytes=((grouping+1)*1000-1) | eval "Byte Range"=(minBytes/1000)."KB - ".round(maxBytes/1000,0)."KB" | stats count by "Byte Range" | sort by minBytes

It works well for creating the visualization (line graph), except it doesn't reserve space in the visualization for byte ranges that don't exist. For example, if no records fit in the 1000 through 2000 byte range, the visualization doesn't have space reserved where the count is shown to be 0.

Is there a way I can enforce that the missing in-between values are represented with a count of 0?

0 Karma

DalJeanis
Legend

This will generate your bin names and min/max byte values for the bins. I've assumed a maximum of 1000K with the count=1000, but you can do whatever you like here.

 index=bluecoat cs_host=[cs_host search pattern goes here] 
  | stats max(cs_bytes) as cs_max 
  | eval CSK_max = floor(cs_max/1000)
  | map search="makeresults count=1000 | streamstats count as TheBin 
                | eval CSK_max = $CSK_max$ | eval keepme = if(TheBin<=CSK_max,1,0) 
                | search keepme=1" maxsearches=1
  | append [makeresults | eval TheBin = 0]
  | eval minBytes=1000*TheBin
  | eval maxBytes=1000*TheBin+999
  | eval NextBin = TheBin + 1
  | eval TheBinName=TheBin."KB - ".NextBin."KB"
  | eval EventCount=0
  | table TheBin TheBinName minBytes maxBytes EventCount

giving results something like this -

0    0KB - 1KB      0    999   0
1    1KB - 2KB   1000   1999   0
2    2KB - 3KB   2000   2999   0
3    3KB - 4KB   3000   3999   0

then append your base results like this (without calculating anything but TheBin, because you'll get those values from the above "fake events")

| append
    [search index=bluecoat cs_host=[cs_host search pattern goes here] 
    | eval TheBin=floor(cs_bytes/1000) 
    | eval EventCount = 1
    ]

and then sum them up like this

| stats sum(EventCount) as EventCount, first(TheBinName) as TheBinName, 
    first(minBytes) as minBytes, first(maxBytes) as maxBytes by TheBin
| sort 0 minBytes

edited to use sort 0 rather than sort in case there were more than 100 values to be sorted.

0 Karma

lguinn2
Legend

Try this - it doesn't give the answer in exactly the same format, but I think it will solve the missing values problem:

index=bluecoat cs_host=[cs_host search pattern goes here]
| bin span=1000 cs_bytes as byte_range
| stats count by byte_range
| rename byte_range as "Byte Range"
0 Karma
Get Updates on the Splunk Community!

Stay Connected: Your Guide to May Tech Talks, Office Hours, and Webinars!

Take a look below to explore our upcoming Community Office Hours, Tech Talks, and Webinars this month. This ...

They're back! Join the SplunkTrust and MVP at .conf24

With our highly anticipated annual conference, .conf, comes the fez-wearers you can trust! The SplunkTrust, as ...

Enterprise Security Content Update (ESCU) | New Releases

Last month, the Splunk Threat Research Team had two releases of new security content via the Enterprise ...