Splunk Search

Streamstats - is it actually calculating time difference between the intended events?

DEAD_BEEF
Builder

Hello everyone. I inherited a saved search and I'm trying to break it down to understand what it's doing. The intent of the search is to look at the web logs and calculate the time difference between every two consecutive events that have the same source IP and web destination. I think this is to look for beaconing-like activity. I'm going through it line by line, but I don't understand a few things and am not sure it's working as intended. I have it running over a small time window and the time fields don't make any sense to me.

saved search

index=web
| eval ctime = _time
| sort 0 + ctime
| streamstats global=f window=2 current=f last(c_time) AS prev_time by src_ip dest_host
| eval diff = c_time - prev_time
| eventstats count, stdev(diff) AS std by src_ip dest_host
| where std < 5 AND count > 30

What doesn't make sense to me is where it's getting prev_time. If streamstats isn't using the current event (current=f), then where is it pulling prev_time from to calculate the difference from c_time? I think the intent was to calculate the time difference when the same system goes to the same dest_host, but it looks like prev_time is just using the previous event's timestamp regardless of source IP. This is where I'm confused, and I believe it is not working correctly. Also, not every event is getting a prev_time field; only about 75% of the events have one.

0 Karma

macadminrohit
Contributor

First of all, you should table the fields you need right after the first pipe and then sort by IP, web destination, and time. As for your question about how streamstats gets the previous time when it is not using the current event: since you are calculating last(c_time) by src_ip and host, it looks at the last value of c_time among the preceding results. That is where sorting by host and src_ip comes in handy; otherwise last(c_time) will give you wrong results. To help you, here is the search I use to calculate the time difference between two events coming from the same sources:

index=advance_logs sourcetype=advanced_iis 
| eval ctime=_time 
| table _time c_ip s_ip host ctime 
| sort 0 c_ip s_ip host _time 
| streamstats count AS RecordNumber by c_ip s_ip reset_on_change=true 
| streamstats current=f last(_time) as LastTime last(RecordNumber) As previousRecord 
| eval eventchange = if(RecordNumber-previousRecord!=1,"Yes","No") 
| eval pause=if(eventchange="No",(_time-LastTime),"None") 
| fillnull value="None" pause
0 Karma

macadminrohit
Contributor

The logic here is to group similar events (same host and c_ip), ordered by _time, and then calculate the difference between each event's _time and the previous event's _time.
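To see how current=f and last() interact with the by clause, here is a minimal sketch on synthetic data. Everything in it is made up for illustration: the two IP addresses, the example.com destination, and the one-minute spacing between events. It builds six events that alternate between two source IPs and then applies the same streamstats and eval steps as the saved search in the question, with ctime spelled consistently.

```spl
| makeresults count=6
| streamstats count AS n
| eval _time = now() - (6 - n) * 60
| eval src_ip = if(n % 2 == 1, "10.0.0.1", "10.0.0.2")
| eval dest_host = "example.com"
| eval ctime = _time
| sort 0 + ctime
| streamstats global=f window=2 current=f last(ctime) AS prev_time BY src_ip dest_host
| eval diff = ctime - prev_time
| table n src_ip dest_host ctime prev_time diff
```

With current=f, the window for each event holds only earlier events from the same src_ip and dest_host group, so last(ctime) is the time of the most recent earlier event in that group. The first event of each group has nothing before it, so rows 1 and 2 come back with no prev_time and no diff, while rows 3 through 6 show a diff of 120 seconds because each IP appears every second minute. Groups that contribute only a first event would be one reason why not every event in real data has a prev_time.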

0 Karma

woodcock
Esteemed Legend

You are making it way too complicated; try this:

index=web
| streamstats global=f window=2 current=t range(_time) AS diff BY src_ip dest_host
| eventstats count, stdev(diff) AS std BY src_ip dest_host
| where std < 5 AND count > 30
0 Karma

Richfez
SplunkTrust

So, first things first. There are some typos in the above, I would assume.

Line 2 sets ctime equal to _time, fine. Line 3 sorts on it, also fine. Then lines 4 and onward use c_time, not ctime. Hmm.

In my own data (web logs as well), I had to change it all to

index="where_my_web_events_are" 
| eval ctime = _time 
| sort 0 + ctime 
| streamstats global=f window=2 current=f last(ctime) AS prev_time by clientip  
| eval diff = ctime - prev_time
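Carried back to the saved search from the question, the same correction might look like the sketch below. It assumes the c_time references were typos for ctime, and that src_ip and dest_host are the right field names in that index; neither has been checked against the original data.

```spl
index=web
| eval ctime = _time
| sort 0 + ctime
| streamstats global=f window=2 current=f last(ctime) AS prev_time BY src_ip dest_host
| eval diff = ctime - prev_time
| eventstats count, stdev(diff) AS std BY src_ip dest_host
| where std < 5 AND count > 30
```

If the web events really do carry a separate field named c_time, the original lines would be reading that field rather than the ctime created on line 2, which would be worth confirming before changing anything.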

Where I would start (unless of course someone with more time jumps in and fixes this all up for you) is to tear it down piece by piece. If you have your data down to a portion of time with 10 or 20 items, then it'll be easy to start seeing what each part does and if it's needed or not. For instance, I believe | streamstats global=f window=2 first(ctime) AS prev_time BY clientip, host_blah is equivalent to what it had been doing but simpler.

Find a small portion of data - like an hour, or a minute, or whatever - that you can start from index=myindex | eval ctime= _time | sort 0 + ctime | ... and make sure that piece makes sense. Add ctime to your displayed fields so you won't miss it later.

Then add the streamstats (and maybe the eval diff because why not) command. Read the docs on each thing it's doing (which sounds like what you are doing anyway), and then add prev_time and the diff to your displayed fields. Check that it's right - does it make sense?
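As a rough sketch of that checking step, something like the following could be run over a short time range. The index name myindex and the field clientip are placeholders, as above, and the 15-minute window is arbitrary.

```spl
index=myindex earliest=-15m
| eval ctime = _time
| sort 0 + ctime
| streamstats global=f window=2 current=f last(ctime) AS prev_time BY clientip
| eval diff = ctime - prev_time
| sort 0 clientip ctime
| table _time clientip ctime prev_time diff
```

The final sort groups each clientip's events together in time order, so for any row after the first in a group, prev_time should match the ctime of the row directly above it.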

Keep adding lines one at a time.

This is the way I learned SPL, interacting on Answers and in IRC (now it's Slack) and trying line by line to understand what each piece is doing.

Only when you think it is giving the right numbers

0 Karma