Splunk Search

How to find out the count of zero for the multiple values of a field?

bollam
Path Finder

Hello,

I have a following query which gives the count of "zero".

index=main item_type=television
| timechart count span=1m as item_count
| fillnull
| eval item_nonexistence = if(item_count <= 0, 1, null())
| stats sum(item_nonexistence) as item_count
| search item_count > 5

It gives me the count if the condition, else no results. I have bunch of items (10) and i would like to get if count of "zero" is greater than 5.
if count of zero is greater than 5 for item1, item2, item5, item8. It should produce me the results in the following way.

item_type count
item1 item1_missing
item2 item2_missing
item5 item5_missing
item8 item8_missing

Tags (2)
0 Karma
1 Solution

DalJeanis
Legend

To rewrite your problem statement: You want to know, for each item_type, whether for any five or more individual minutes across the search timeline, there were no events.

index=main item_type=*
| fields _time item_type
| bin _time span=1m 
| timechart span=1m count as item_count by item_type
| fillnull

Up to this point, you had it right. The fields command we added is just explicitly telling splunk that those are the only fields needed for the remainder of the calculation, and special field _time is there just for us to see, because the fields command doesn't remove hidden fields like _time unless we specifically tell it to.

| untable _time item_type Item_count
| where item_count=0
| stats count as minute_count by item_type
| where minute_count > 5

Untable is a special command that will take each record, and split it up into multiple records, each of which has the first named field (in this case _time), and the name and value of one other field. The name will go in the second parameter, and the value in the third parameter. I could have said untable _time foo bar and the field foo would contain the item_type and bar would contain the count for that time.

So, for example, if your records were these two records...

 _time     drone   radio  television  typewriter 
(time1)     0         3          3          0 
(time2)     0         0          1          5 

They would become these eight records...

 _time   item_type  Item_count

(time1)      drone       0
(time1)      radio       3
(time1)      television  3
(time1)      typewriter  0

(time2)      drone       0
(time2)      radio       0
(time2)      television  1
(time2)      typewriter  5

Passing through the where test, these ones would pass...

 _time   item_type  Item_count

(time1)      drone       0
(time1)      typewriter  0

(time2)      drone       0
(time2)      radio       0

Then the stats would put them together like this

item_type  minute_count
drone       2
radio       1
typewriter  1

And if any of them had more than five, they would pass the next test and give you your list. You can then add any wording you want with another eval.

View solution in original post

0 Karma

woodcock
Esteemed Legend

I believe that this is the Sentinel Search problem discussed (with solution) here:

https://conf.splunk.com/session/2015/conf2015-LookupTalk.pdf

0 Karma

DalJeanis
Legend

To rewrite your problem statement: You want to know, for each item_type, whether for any five or more individual minutes across the search timeline, there were no events.

index=main item_type=*
| fields _time item_type
| bin _time span=1m 
| timechart span=1m count as item_count by item_type
| fillnull

Up to this point, you had it right. The fields command we added is just explicitly telling splunk that those are the only fields needed for the remainder of the calculation, and special field _time is there just for us to see, because the fields command doesn't remove hidden fields like _time unless we specifically tell it to.

| untable _time item_type Item_count
| where item_count=0
| stats count as minute_count by item_type
| where minute_count > 5

Untable is a special command that will take each record, and split it up into multiple records, each of which has the first named field (in this case _time), and the name and value of one other field. The name will go in the second parameter, and the value in the third parameter. I could have said untable _time foo bar and the field foo would contain the item_type and bar would contain the count for that time.

So, for example, if your records were these two records...

 _time     drone   radio  television  typewriter 
(time1)     0         3          3          0 
(time2)     0         0          1          5 

They would become these eight records...

 _time   item_type  Item_count

(time1)      drone       0
(time1)      radio       3
(time1)      television  3
(time1)      typewriter  0

(time2)      drone       0
(time2)      radio       0
(time2)      television  1
(time2)      typewriter  5

Passing through the where test, these ones would pass...

 _time   item_type  Item_count

(time1)      drone       0
(time1)      typewriter  0

(time2)      drone       0
(time2)      radio       0

Then the stats would put them together like this

item_type  minute_count
drone       2
radio       1
typewriter  1

And if any of them had more than five, they would pass the next test and give you your list. You can then add any wording you want with another eval.

0 Karma

bollam
Path Finder

@DalJeanis, Awesome!! Thanks a ton! This is exactly I was looking at.

0 Karma

bollam
Path Finder

@DalJeanis, I have a problem here, When the Item_count is zero for the selected time-range. The item_type is not being listed in the results.
Though the count of zero's exists with item_type drone its not showing up.

_time item_type Item_count
(time1) drone 0
(time1) typewriter 0
(time1) radio 0
(time2) drone 0
(time2) typewriter 0
(time3) drone 0
(time4) drone 0

The output looks like below:

item_type minute_count
radio 1
typewriter 1

I would like to add the item_type with the Item_count as 1, when the Item_count are all zero's.

Expected output for the above scenario:

item_type minute_count
radio 1
typewriter 2
drone 1

Can you please help with this?

0 Karma

DalJeanis
Legend

Sure. You need to ensure there is a record for each category for the time period immediately below your time period in question, before the chart command is run, then you throw that record away after the chart command. Let's assume you have a csv file called mylist.csv with the list of item_type values you need. Then something like this should work....

index=main item_type=*
 | fields _time item_type
 | bin _time span=1m 
 | append [ | inputcsv mylist.csv 
    | addinfo 
    | eval _time = info_min_time - 60 
    | table  _time item_type
    ] 

 | timechart span=1m count as item_count by item_type
 | fillnull
 | eventstats min(_time) as mintime 
 | where _time > mintime 
 | fields - mintime  

 | untable _time item_type Item_count
 | where item_count=0
 | stats count as minute_count by item_type
 | where minute_count > 5

There might be a slightly less complicated way to use the csv, though. Hmm.

     index=main item_type=*
     | fields _time item_type
     | bin _time span=1m 
     | timechart span=1m count as item_count by item_type
     | fillnull

     | untable _time item_type Item_count
     | appendpipe [
         | stats count as present by item_type
         | inputcsv append=t mylist.csv
         | stats count as dupcount by item_type
         | where dupcount=1 
         | eval flag = "missing" 
         | table  _time item_type
        ] 

 | where item_count=0
 | stats count as minute_count max(flag) as flag by item_type
 | where minute_count > 5 or isnotnull(flag)
0 Karma

bollam
Path Finder

@DalJeanis, Thanks for the reply! The query seems to be working only if there are records for each item_type but I would like the item_type to be updated with any of the values( 0 or 1) when there are no records found for the specific item_type.
For example:
Let's assume the count of zero for drone is 3, count of zero for typewriter is 2 and count of zero for radio is 0, i.e., radio has events for each minute for the selected time-range(15 Min) and television has no records for the selected time-range(15 Min) so I'm giving a dummy value as 15.
I should get the following results when the condition is set as where count >= 2

item_type count
drone 3
typewriter 2
television 15

Below it's returning this which is inappropriate.
drone 3
typewriter 2
radio 15
television 15

item_type radio should not be in the results as it does not contain the value 0. Can you help on this?

Query:

index=main item_type=*
| fields _time item_type
| bin _time span=1m
| timechart span=1m count as item_count by item_type useother=f
| fillnull
| untable _time item_type item_count
| where item_count = 0
| stats count by item_type
| append
[| inputlookup Testing_Alerts
| fields + item_type
| stats count by item_type
| eval count = 0
| fields item_type count ]
| dedup item_type
| rename count as missing_item_count
| eval missing_item_count = if(missing_item_count = 0, 15, missing_item_count)
| where missing_item_count >= 2

0 Karma

adonio
Ultra Champion

hmmmm, looks like you are filtering the item_type in your search: .... item_type = television

0 Karma

bollam
Path Finder

@adonio, I have given it for an instance but actually the query does not contain any item_type. I want to sum the count of the events which are zero's and if the sum is greater than 5 then the results should look like below.

item_type       event_count
item1     missing events are more than five for the selected time-range
item2          1
item3          2
item4          missing events are more than five for the selected time-range

Can you help me in getting the desired output? Please let me know if you require any additional information

0 Karma

andreacorvini
Path Finder

Can you give an example of your input?
Because you would like to get the output divided by item_type but in the query you do not have any "by" condition.
I would like to better understand what you have in input, the output you want is clear.

0 Karma
Get Updates on the Splunk Community!

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...

New in Observability Cloud - Explicit Bucket Histograms

Splunk introduces native support for histograms as a metric data type within Observability Cloud with Explicit ...