Splunk Search

Find the list of consecutive events by a field

langlv
Engager

Hello,

I am trying to find out the list of consecutive card transactions on same terminal in period of time, eg: more than 3 inquiry transactions in a ATM machine from the hold transaction history of the system (from multiple ATMs).

 terminal_id    tran_type
ATM1    INQ
ATM2    CWD
ATM1    INQ
ATM1    INQ
ATM2    CWD
ATM1    INQ
ATM2    INQ
ATM1    CWD
ATM3    INQ
ATM3    INQ
ATM1    CWD
ATM3    INQ
ATM3    CWD
ATM1    INQ

Given the parameter for listing consecutive INQ transactions on same terminal is 3, I want to list out the following:

terminal_id tran_type
ATM1    INQ
ATM1    INQ
ATM1    INQ
ATM1    INQ
ATM3    INQ
ATM3    INQ
ATM3    INQ

where both ATM1 & ATM3 need to be checked (ATM1 has 4 and ATM3 has 3 INQ consecutive transactions individually).

Thank you guys for your help.
Lang

Tags (2)
0 Karma
1 Solution

vasanthmss
Motivator

Updated,

your base search... |  eval raw=terminal_id+"-"+tran_type |sort 0 terminal_id
 | autoregress raw  | eval sameAsNext=if(raw=raw_p1,1,0) | streamstats current=t count(eval(sameAsNext=0)) AS sessionID | eventstats count AS inArowCount BY sessionID | where inArowCount>=3 | table _time, terminal_id, tran_type

Sample,

| makeresults | eval input="ATM1,INQ-ATM2,CWD-ATM1,INQ-ATM1,INQ-ATM2,CWD-ATM1,INQ-ATM2,INQ-ATM1,CWD-ATM3,INQ-ATM3,INQ-ATM1,CWD-ATM3,INQ" | makemv delim="-" input | mvexpand input | rex field=input "(?<terminal_id>[^,]+),(?<tran_type>.+)$" 
 |  table _time, terminal_id, tran_type  
|  eval raw=terminal_id+"-"+tran_type |sort 0 terminal_id
| autoregress raw  | eval sameAsNext=if(raw=raw_p1,1,0) | streamstats current=t count(eval(sameAsNext=0)) AS sessionID | eventstats count AS inArowCount BY sessionID | where inArowCount>=3 | table _time, terminal_id, tran_type

Explanation,

  1. create an raw event with terminal_id and tran_type
  2. use autoregress raw event you will get raw_p1. Prepares your events for calculating the autoregression, or the moving average, by copying one or more of the previous values for field into each event
  3. field sameAsNext will indicate (with 0, 1 )same as next or not.
  4. when the sameAsNext 0 comes create an session id and count the events in the session id you will get number of repeating items
  5. simple where count>=3 will help you.

Old one,

try something like this,

your base search .... |  sort 0 _time, terminal_id, tran_type |  streamstats count by terminal_id, tran_type  |  eventstats max(count)  as count by terminal_id, tran_type  |  where count>=3 | fields - count

Sample search ,

| makeresults | eval input="ATM1,INQ-ATM2,CWD-ATM1,INQ-ATM1,INQ-ATM2,CWD-ATM1,INQ-ATM2,INQ-ATM1,CWD-ATM3,INQ-ATM3,INQ-ATM1,CWD-ATM3,INQ" | makemv delim="-" input | mvexpand input | rex field=input "(?<terminal_id>[^,]+),(?<tran_type>.+)$" 
|  table _time, terminal_id, tran_type 
|  sort 0 _time, terminal_id, tran_type 
|  streamstats count by terminal_id, tran_type 
|  eventstats max(count) as count by terminal_id, tran_type 
|  where count>=3
| fields - count

Hope this helps you!!!

V

View solution in original post

DalJeanis
Legend

Try this -

| makeresults 
| eval input="ATM1,INQ-ATM2,CWD-ATM1,INQ-ATM1,INQ-ATM2,CWD-ATM1,INQ-ATM2,INQ-ATM1,CWD-ATM3,INQ-ATM3,INQ-ATM1,CWD-ATM3,INQ" 
| makemv delim="-" input 
| mvexpand input 
| rex field=input "(?<terminal_id>[^,]+),(?<tran_type>.+)$" 
| table time, terminal_id, tran_type 
| streamstats count as recno
| eval _time = _time + recno
| sort 0 _time, terminal_id, tran_type 
| fields - recno
| rename COMMENT as "Above just enters your test data" 

| rename COMMENT as "Mark each new batch whenever tran_type changes, then mark each transaction with their batch number"
| streamstats current=f last(tran_type) as last_type by terminal_id
| eval newbatch=if(coalesce(last_type,"")=tran_type,0,1)
| streamstats sum(newbatch) as batchno by terminal_id

| rename COMMENT as "Count how many in each batch, then let pass only those with 3 or more"
| eventstats count as batchcount by terminal_id, batchno 
| where batchcount>=3
| fields _time, terminal_id, tran_type 
0 Karma

vasanthmss
Motivator

Updated,

your base search... |  eval raw=terminal_id+"-"+tran_type |sort 0 terminal_id
 | autoregress raw  | eval sameAsNext=if(raw=raw_p1,1,0) | streamstats current=t count(eval(sameAsNext=0)) AS sessionID | eventstats count AS inArowCount BY sessionID | where inArowCount>=3 | table _time, terminal_id, tran_type

Sample,

| makeresults | eval input="ATM1,INQ-ATM2,CWD-ATM1,INQ-ATM1,INQ-ATM2,CWD-ATM1,INQ-ATM2,INQ-ATM1,CWD-ATM3,INQ-ATM3,INQ-ATM1,CWD-ATM3,INQ" | makemv delim="-" input | mvexpand input | rex field=input "(?<terminal_id>[^,]+),(?<tran_type>.+)$" 
 |  table _time, terminal_id, tran_type  
|  eval raw=terminal_id+"-"+tran_type |sort 0 terminal_id
| autoregress raw  | eval sameAsNext=if(raw=raw_p1,1,0) | streamstats current=t count(eval(sameAsNext=0)) AS sessionID | eventstats count AS inArowCount BY sessionID | where inArowCount>=3 | table _time, terminal_id, tran_type

Explanation,

  1. create an raw event with terminal_id and tran_type
  2. use autoregress raw event you will get raw_p1. Prepares your events for calculating the autoregression, or the moving average, by copying one or more of the previous values for field into each event
  3. field sameAsNext will indicate (with 0, 1 )same as next or not.
  4. when the sameAsNext 0 comes create an session id and count the events in the session id you will get number of repeating items
  5. simple where count>=3 will help you.

Old one,

try something like this,

your base search .... |  sort 0 _time, terminal_id, tran_type |  streamstats count by terminal_id, tran_type  |  eventstats max(count)  as count by terminal_id, tran_type  |  where count>=3 | fields - count

Sample search ,

| makeresults | eval input="ATM1,INQ-ATM2,CWD-ATM1,INQ-ATM1,INQ-ATM2,CWD-ATM1,INQ-ATM2,INQ-ATM1,CWD-ATM3,INQ-ATM3,INQ-ATM1,CWD-ATM3,INQ" | makemv delim="-" input | mvexpand input | rex field=input "(?<terminal_id>[^,]+),(?<tran_type>.+)$" 
|  table _time, terminal_id, tran_type 
|  sort 0 _time, terminal_id, tran_type 
|  streamstats count by terminal_id, tran_type 
|  eventstats max(count) as count by terminal_id, tran_type 
|  where count>=3
| fields - count

Hope this helps you!!!

V

langlv
Engager

Thanks vasanthmss for your help, but it still count all INQ by ATM1 while I need only consecutive INQ on ATM1. If another CWD on ATM1 occur the counting will be reset. Pls consider following sample events:

terminal_id tran_type
ATM1    INQ
ATM2    CWD
ATM1    INQ
ATM1    INQ
ATM2    CWD
ATM1    INQ
ATM2    INQ
ATM1    CWD
ATM3    INQ
ATM3    INQ
ATM1    CWD
ATM3    INQ
ATM3    CWD
ATM1    INQ

The last ATM1,INQ should be be counted because there is ATM1,CWD happen before.

Thanks,

0 Karma

vasanthmss
Motivator

try something like this,

 | makeresults | eval input="ATM1,INQ-ATM2,CWD-ATM1,INQ-ATM1,INQ-ATM2,CWD-ATM1,INQ-ATM2,INQ-ATM1,CWD-ATM3,INQ-ATM3,INQ-ATM1,CWD-ATM3,INQ" | makemv delim="-" input | mvexpand input | rex field=input "(?<terminal_id>[^,]+),(?<tran_type>.+)$" 
 |  table _time, terminal_id, tran_type  
|  eval raw=terminal_id+"-"+tran_type |sort 0 terminal_id
| autoregress raw  | eval sameAsNext=if(raw=raw_p1,1,0) | streamstats current=t count(eval(sameAsNext=0)) AS sessionID | eventstats count AS inArowCount BY sessionID | where inArowCount>=3 | table _time, terminal_id, tran_type

Actual search will be,

your base search ...  |  eval raw=terminal_id+"-"+tran_type |sort 0 terminal_id
| autoregress raw  | eval sameAsNext=if(raw=raw_p1,1,0) | streamstats current=t count(eval(sameAsNext=0)) AS sessionID | eventstats count AS inArowCount BY sessionID | where inArowCount>=3 | table _time, terminal_id, tran_type

Updating my answer based on this,

V
0 Karma

langlv
Engager

Thanks vasanthmss, combining autoregress with streamstats and eventstats is what i am looking for. It works.

0 Karma

knielsen
Contributor

Do you really need the events listed out, or do you want to know which combinations match your limit? The latter could easily be done with
| stats count by terminal_id, tran_type | where count>=3

which will give you a table with terminal_id, tran_type and the respective count.

0 Karma

langlv
Engager

Thanks for your comment, but | stats count... does not meet the business requirement as I need to count only consecutive INQ on terminal. If there is CWD on ATM1 then the INQ counting will be reset. I also really to list out the events for business cross check.

0 Karma

knielsen
Contributor

actually, the solution for your exact expected result would be achieved with eventstats instead of stats. I tried it out like that:

| makeresults | eval input="ATM1,INQ-ATM2,CWD-ATM1,INQ-ATM1,INQ-ATM2,CWD-ATM1,INQ-ATM2,INQ-ATM1,CWD-ATM3,INQ-ATM3,INQ-ATM1,CWD-ATM3,INQ" | makemv delim="-" input | mvexpand input | rex field=input "(?<terminal_id>[^,]+),(?<tran_type>.+)$" 
| eventstats count by terminal_id, tran_type | where count >= 3| table terminal_id, tran_type
0 Karma

langlv
Engager

Eventstats also does not help because it count all the INQ by ATM1 while I need only consecutive events. If I add 1 more ATM1,INQ to the event list like this:

| makeresults | eval input="ATM1,INQ-ATM2,CWD-ATM1,INQ-ATM1,INQ-ATM2,CWD-ATM1,INQ-ATM2,INQ-ATM1,CWD-ATM3,INQ-ATM3,INQ-ATM1,CWD-ATM3,INQ-ATM1,INQ" | makemv delim="-" input | mvexpand input | rex field=input "(?[^,]+),(?.+)$" | eventstats count by terminal_id, tran_type | where count >= 3| table terminal_id, tran_type

Then the last INQ event should not be counted because there is another CWD on ATM1 before INQ happen.

0 Karma
Get Updates on the Splunk Community!

Introducing Splunk Enterprise 9.2

WATCH HERE! Watch this Tech Talk to learn about the latest features and enhancements shipped in the new Splunk ...

Adoption of RUM and APM at Splunk

    Unleash the power of Splunk Observability   Watch Now In this can't miss Tech Talk! The Splunk Growth ...

Routing logs with Splunk OTel Collector for Kubernetes

The Splunk Distribution of the OpenTelemetry (OTel) Collector is a product that provides a way to ingest ...