Hello,
I am trying to find out the list of consecutive card transactions on same terminal in period of time, eg: more than 3 inquiry transactions in a ATM machine from the hold transaction history of the system (from multiple ATMs).
terminal_id tran_type
ATM1 INQ
ATM2 CWD
ATM1 INQ
ATM1 INQ
ATM2 CWD
ATM1 INQ
ATM2 INQ
ATM1 CWD
ATM3 INQ
ATM3 INQ
ATM1 CWD
ATM3 INQ
ATM3 CWD
ATM1 INQ
Given the parameter for listing consecutive INQ transactions on same terminal is 3, I want to list out the following:
terminal_id tran_type
ATM1 INQ
ATM1 INQ
ATM1 INQ
ATM1 INQ
ATM3 INQ
ATM3 INQ
ATM3 INQ
where both ATM1 & ATM3 need to be checked (ATM1 has 4 and ATM3 has 3 INQ consecutive transactions individually).
Thank you guys for your help.
Lang
Updated,
your base search... | eval raw=terminal_id+"-"+tran_type |sort 0 terminal_id
| autoregress raw | eval sameAsNext=if(raw=raw_p1,1,0) | streamstats current=t count(eval(sameAsNext=0)) AS sessionID | eventstats count AS inArowCount BY sessionID | where inArowCount>=3 | table _time, terminal_id, tran_type
Sample,
| makeresults | eval input="ATM1,INQ-ATM2,CWD-ATM1,INQ-ATM1,INQ-ATM2,CWD-ATM1,INQ-ATM2,INQ-ATM1,CWD-ATM3,INQ-ATM3,INQ-ATM1,CWD-ATM3,INQ" | makemv delim="-" input | mvexpand input | rex field=input "(?<terminal_id>[^,]+),(?<tran_type>.+)$"
| table _time, terminal_id, tran_type
| eval raw=terminal_id+"-"+tran_type |sort 0 terminal_id
| autoregress raw | eval sameAsNext=if(raw=raw_p1,1,0) | streamstats current=t count(eval(sameAsNext=0)) AS sessionID | eventstats count AS inArowCount BY sessionID | where inArowCount>=3 | table _time, terminal_id, tran_type
Explanation,
autoregress
raw event you will get raw_p1. Prepares your events for calculating the autoregression, or the moving average, by copying one or more of the previous values for field into each event
count>=3
will help you. Old one,
try something like this,
your base search .... | sort 0 _time, terminal_id, tran_type | streamstats count by terminal_id, tran_type | eventstats max(count) as count by terminal_id, tran_type | where count>=3 | fields - count
Sample search ,
| makeresults | eval input="ATM1,INQ-ATM2,CWD-ATM1,INQ-ATM1,INQ-ATM2,CWD-ATM1,INQ-ATM2,INQ-ATM1,CWD-ATM3,INQ-ATM3,INQ-ATM1,CWD-ATM3,INQ" | makemv delim="-" input | mvexpand input | rex field=input "(?<terminal_id>[^,]+),(?<tran_type>.+)$"
| table _time, terminal_id, tran_type
| sort 0 _time, terminal_id, tran_type
| streamstats count by terminal_id, tran_type
| eventstats max(count) as count by terminal_id, tran_type
| where count>=3
| fields - count
Hope this helps you!!!
Try this -
| makeresults
| eval input="ATM1,INQ-ATM2,CWD-ATM1,INQ-ATM1,INQ-ATM2,CWD-ATM1,INQ-ATM2,INQ-ATM1,CWD-ATM3,INQ-ATM3,INQ-ATM1,CWD-ATM3,INQ"
| makemv delim="-" input
| mvexpand input
| rex field=input "(?<terminal_id>[^,]+),(?<tran_type>.+)$"
| table time, terminal_id, tran_type
| streamstats count as recno
| eval _time = _time + recno
| sort 0 _time, terminal_id, tran_type
| fields - recno
| rename COMMENT as "Above just enters your test data"
| rename COMMENT as "Mark each new batch whenever tran_type changes, then mark each transaction with their batch number"
| streamstats current=f last(tran_type) as last_type by terminal_id
| eval newbatch=if(coalesce(last_type,"")=tran_type,0,1)
| streamstats sum(newbatch) as batchno by terminal_id
| rename COMMENT as "Count how many in each batch, then let pass only those with 3 or more"
| eventstats count as batchcount by terminal_id, batchno
| where batchcount>=3
| fields _time, terminal_id, tran_type
Updated,
your base search... | eval raw=terminal_id+"-"+tran_type |sort 0 terminal_id
| autoregress raw | eval sameAsNext=if(raw=raw_p1,1,0) | streamstats current=t count(eval(sameAsNext=0)) AS sessionID | eventstats count AS inArowCount BY sessionID | where inArowCount>=3 | table _time, terminal_id, tran_type
Sample,
| makeresults | eval input="ATM1,INQ-ATM2,CWD-ATM1,INQ-ATM1,INQ-ATM2,CWD-ATM1,INQ-ATM2,INQ-ATM1,CWD-ATM3,INQ-ATM3,INQ-ATM1,CWD-ATM3,INQ" | makemv delim="-" input | mvexpand input | rex field=input "(?<terminal_id>[^,]+),(?<tran_type>.+)$"
| table _time, terminal_id, tran_type
| eval raw=terminal_id+"-"+tran_type |sort 0 terminal_id
| autoregress raw | eval sameAsNext=if(raw=raw_p1,1,0) | streamstats current=t count(eval(sameAsNext=0)) AS sessionID | eventstats count AS inArowCount BY sessionID | where inArowCount>=3 | table _time, terminal_id, tran_type
Explanation,
autoregress
raw event you will get raw_p1. Prepares your events for calculating the autoregression, or the moving average, by copying one or more of the previous values for field into each event
count>=3
will help you. Old one,
try something like this,
your base search .... | sort 0 _time, terminal_id, tran_type | streamstats count by terminal_id, tran_type | eventstats max(count) as count by terminal_id, tran_type | where count>=3 | fields - count
Sample search ,
| makeresults | eval input="ATM1,INQ-ATM2,CWD-ATM1,INQ-ATM1,INQ-ATM2,CWD-ATM1,INQ-ATM2,INQ-ATM1,CWD-ATM3,INQ-ATM3,INQ-ATM1,CWD-ATM3,INQ" | makemv delim="-" input | mvexpand input | rex field=input "(?<terminal_id>[^,]+),(?<tran_type>.+)$"
| table _time, terminal_id, tran_type
| sort 0 _time, terminal_id, tran_type
| streamstats count by terminal_id, tran_type
| eventstats max(count) as count by terminal_id, tran_type
| where count>=3
| fields - count
Hope this helps you!!!
Thanks vasanthmss for your help, but it still count all INQ by ATM1 while I need only consecutive INQ on ATM1. If another CWD on ATM1 occur the counting will be reset. Pls consider following sample events:
terminal_id tran_type
ATM1 INQ
ATM2 CWD
ATM1 INQ
ATM1 INQ
ATM2 CWD
ATM1 INQ
ATM2 INQ
ATM1 CWD
ATM3 INQ
ATM3 INQ
ATM1 CWD
ATM3 INQ
ATM3 CWD
ATM1 INQ
The last ATM1,INQ should be be counted because there is ATM1,CWD happen before.
Thanks,
try something like this,
| makeresults | eval input="ATM1,INQ-ATM2,CWD-ATM1,INQ-ATM1,INQ-ATM2,CWD-ATM1,INQ-ATM2,INQ-ATM1,CWD-ATM3,INQ-ATM3,INQ-ATM1,CWD-ATM3,INQ" | makemv delim="-" input | mvexpand input | rex field=input "(?<terminal_id>[^,]+),(?<tran_type>.+)$"
| table _time, terminal_id, tran_type
| eval raw=terminal_id+"-"+tran_type |sort 0 terminal_id
| autoregress raw | eval sameAsNext=if(raw=raw_p1,1,0) | streamstats current=t count(eval(sameAsNext=0)) AS sessionID | eventstats count AS inArowCount BY sessionID | where inArowCount>=3 | table _time, terminal_id, tran_type
Actual search will be,
your base search ... | eval raw=terminal_id+"-"+tran_type |sort 0 terminal_id
| autoregress raw | eval sameAsNext=if(raw=raw_p1,1,0) | streamstats current=t count(eval(sameAsNext=0)) AS sessionID | eventstats count AS inArowCount BY sessionID | where inArowCount>=3 | table _time, terminal_id, tran_type
Updating my answer based on this,
Thanks vasanthmss, combining autoregress
with streamstats
and eventstats
is what i am looking for. It works.
Do you really need the events listed out, or do you want to know which combinations match your limit? The latter could easily be done with
| stats count by terminal_id, tran_type | where count>=3
which will give you a table with terminal_id, tran_type and the respective count.
Thanks for your comment, but | stats count... does not meet the business requirement as I need to count only consecutive INQ on terminal. If there is CWD on ATM1 then the INQ counting will be reset. I also really to list out the events for business cross check.
actually, the solution for your exact expected result would be achieved with eventstats instead of stats. I tried it out like that:
| makeresults | eval input="ATM1,INQ-ATM2,CWD-ATM1,INQ-ATM1,INQ-ATM2,CWD-ATM1,INQ-ATM2,INQ-ATM1,CWD-ATM3,INQ-ATM3,INQ-ATM1,CWD-ATM3,INQ" | makemv delim="-" input | mvexpand input | rex field=input "(?<terminal_id>[^,]+),(?<tran_type>.+)$"
| eventstats count by terminal_id, tran_type | where count >= 3| table terminal_id, tran_type
Eventstats also does not help because it count all the INQ by ATM1 while I need only consecutive events. If I add 1 more ATM1,INQ to the event list like this:
| makeresults | eval input="ATM1,INQ-ATM2,CWD-ATM1,INQ-ATM1,INQ-ATM2,CWD-ATM1,INQ-ATM2,INQ-ATM1,CWD-ATM3,INQ-ATM3,INQ-ATM1,CWD-ATM3,INQ-ATM1,INQ" | makemv delim="-" input | mvexpand input | rex field=input "(?[^,]+),(?.+)$" | eventstats count by terminal_id, tran_type | where count >= 3| table terminal_id, tran_type
Then the last INQ event should not be counted because there is another CWD on ATM1 before INQ happen.