Is it possible to run delta grouped by a field? I have an application that processes data from multiple queues. Each queue has its own independent, ever-increasing sequence number. I need a search that finds missing sequence numbers. The log format looks like:
2016-11-21 17:15:40,803 queueName=q1, seqid = 12
2016-11-21 17:26:40,803 queueName=q2, seqid = 32
2016-11-21 17:27:40,803 queueName=q3, seqid = 114
2016-11-21 17:44:41,803 queueName=q3, seqid = 113
2016-11-21 17:50:49,803 queueName=q2, seqid = 34
2016-11-21 17:51:40,803 queueName=q2, seqid = 33
2016-11-21 17:53:40,803 queueName=q1, seqid = 13
2016-11-21 17:58:22,803 queueName=q3, seqid = 116
I am using
sort queueName,seqid | delta seqid as seq_diff | search seq_diff > 1 | table queueName,seqid,seq_diff
But delta compares every event with the one before it, even when the two belong to different queues. How do I restrict delta to events with the same queueName?
How about using autoregress, which lets each event look at the values of the previous event? Something like this:
your base query to return all the events
| sort queueName, seqid
| autoregress queueName as oldQ p=1
| autoregress seqid as oldSeq p=1
| eval flag=if( ( queueName=oldQ ) AND ( seqid != (oldSeq + 1) ), 1, 0)
| table queueName, seqid, oldSeq, flag
| where flag=1
| fields - flag
Alternatively, you can change the condition ( seqid != (oldSeq + 1) ) to something like ( seqid - oldSeq > 1 ), or to whatever best represents your case.
If you think sorting on _time as well would put the sequences in a better order than | sort queueName, seqid alone, then try adding _time to make it | sort queueName, seqid, _time
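To try this without touching real data, here is a run-anywhere sketch that builds the eight sample events from the question with makeresults and applies the same autoregress logic. The raw field and the regex are only scaffolding I made up for the test, and I used sort 0 so that the default 10,000-result limit of sort does not apply. Treat it as a sketch under those assumptions, not a drop-in search.

```
| makeresults
| eval raw="q1,12 q2,32 q3,114 q3,113 q2,34 q2,33 q1,13 q3,116"
| makemv delim=" " raw
| mvexpand raw
| rex field=raw "(?<queueName>[^,]+),(?<seqid>\d+)"
| eval seqid=tonumber(seqid)
| sort 0 queueName, seqid
| autoregress queueName as oldQ p=1
| autoregress seqid as oldSeq p=1
| where queueName=oldQ AND seqid - oldSeq > 1
| table queueName, oldSeq, seqid
```

On the sample data this should leave a single row, q3 with oldSeq=114 and seqid=116, which means 115 is the missing sequence number.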
Try streamstats instead: http://blogs.splunk.com/2014/04/01/search-command-stats-eventstats-and-streamstats-2/
... | streamstats window=1 global=f current=f values(seqid) as next_seqid by queueName | eval seq_diff = next_seqid - seqid | where seq_diff > 1 | table queueName seqid seq_diff
Here global=f keeps a separate one-event window for each queueName; by default, window uses a single window across all events.
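One thing to watch: the sample log has out-of-order events (q3 logs 114 before 113), so comparing neighbours in time order can flag gaps that are not really there. A possible variant, sketched here on the sample data via makeresults, sorts by queueName and seqid first and then picks up the previous seqid per queue with last(). The raw field, the regex, and prev_seqid are names I made up for illustration, so adjust them to your own extraction.

```
| makeresults
| eval raw="q1,12 q2,32 q3,114 q3,113 q2,34 q2,33 q1,13 q3,116"
| makemv delim=" " raw
| mvexpand raw
| rex field=raw "(?<queueName>[^,]+),(?<seqid>\d+)"
| eval seqid=tonumber(seqid)
| sort 0 queueName, seqid
| streamstats current=f last(seqid) as prev_seqid by queueName
| eval seq_diff = seqid - prev_seqid
| where seq_diff > 1
| table queueName, prev_seqid, seqid, seq_diff
```

With the sample events I would expect only the q3 row with prev_seqid=114 and seqid=116 to remain. Because of the by clause, the first event of each queue has no prev_seqid, so it drops out at the where step instead of being compared with another queue.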