Hunting for duplicate event data to find suspiciou...

uhaba · 09-04-2019

I am trying to work out the right SPL to search a financial data set for duplicate entries. The data is generally unique, but occasionally a vendor submits a duplicate request, which causes problems.

Test data:
id=11111,vendor=blah,name=tacoco,value=201,date="1/1/18"
id=11112,vendor=abc,name=jump,value=321,date="2/1/18"
id=11113,vendor=sneeze,name=china,value=421,date="3/1/18"
id=11114,vendor=alpha,name=pooch,value=521,date="4/1/18"
id=11115,vendor=splunk,name=tacos,value=221,date="5/1/18"
id=11116,vendor=internet,name=golf,value=621,date="6/1/18"
id=11117,vendor=office,name=mexico,value=721,date="7/1/18"
id=11118,vendor=splunk,name=tacos,value=221,date="5/1/18"
id=11119,vendor=random,name=burger,value=821,date="8/1/18"
id=11120,vendor=opera,name=browser,value=921,date="9/1/18"

I would like to create a search that identifies any events where vendor, name, value, and date all have the same values but id is different (for example, the two vendor=splunk rows above, ids 11115 and 11118). There are other fields in the event data, but these are the ones I care about specifically.

jacobpevans · 09-04-2019

Greetings @uhaba, try this run-anywhere search:

| makeresults
| eval id     = "11111" ,
       vendor = "blah"  ,
       name   = "tacoco",
       value  = "201"   ,
       date   = "1/1/18"
| append 
    [ | makeresults
      | eval id     = "11115" ,
             vendor = "splunk"  ,
             name   = "tacos",
             value  = "221"   ,
             date   = "5/1/18" ]
| append 
    [ | makeresults
      | eval id     = "11118" ,
             vendor = "splunk"  ,
             name   = "tacos",
             value  = "221"   ,
             date   = "5/1/18" ]
| stats count values(id) as ids by vendor name value date
| where count > 1

Output:

vendor  name   value  date    count  ids
splunk  tacos  221    5/1/18  2      11115
                                     11118
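
To run the same logic against indexed events instead of the makeresults sample, a minimal sketch might look like the one below. The index and sourcetype names (finance, vendor_requests) are placeholders that do not come from the original post, and the sketch assumes that id, vendor, name, value, and date are already extracted as fields at search time. It also swaps count for dc(id), so a group is only reported when it contains more than one distinct id, which is closer to the "same values but different id" requirement than a plain event count.

```spl
index=finance sourcetype=vendor_requests
| stats dc(id) as id_count values(id) as ids by vendor name value date
| where id_count > 1
```

With a plain count, a single event that happened to be indexed twice under the same id would also be flagged. Whether that should count as a duplicate depends on the data, so treat the dc(id) variant as one option rather than the definitive answer.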

Cheers,
Jacob

If you feel this response answered your question, please do not forget to mark it as such. If it did not, but you do have the answer, feel free to answer your own post and accept that as the answer.
