Hello everyone. I inherited a saved search that I'm trying to break down and understand what it's doing. The intent of the search is to look at the web logs and calculate the time difference between every 2 consecutive events that have the same source IP and web destination. I think this is to look for beaconing-like activity. I'm going through it line by line and I don't understand a few things and am not sure it's working as intended. I have it running in a small time window and the time fields don't make any sense to me.
saved search
index=web
| eval ctime = _time
| sort 0 + ctime
| streamstats global=f window=2 current=f last(c_time) AS prev_time by src_ip dest_host
| eval diff = c_time - prev_time
| eventstats count, stdev(diff) AS std by src_ip dest_host
| where std < 5 AND count > 30
What doesn't make sense to me is where it's getting prev_time. If streamstats isn't using the current event (current=f), then where is it pulling prev_time from to calculate the difference from c_time? I think the intent was to calculate the time difference when the same system goes to the same dest_host, but it looks like prev_time is just using the previous event's timestamp regardless of source IP. This is where I'm confused, and I believe it is not working correctly. Not every event is getting a prev_time field either; only about 75% of the events have one.
First of all, you should table your desired fields after the first pipe and then sort by IP, web destination, and time. To your question about how streamstats gets the previous time when it does not use the current event: since you are already calculating last(c_time) by src_ip host, it will look at the last value of c_time in the previous result. That's where sorting by host and src_ip comes in handy; otherwise last(c_time) will give you wrong results. To help you, here is the search I use to calculate the time difference between two events coming from the same source:
index=advance_logs sourcetype=advanced_iis
| eval ctime=_time
| table _time c_ip s_ip host ctime
| sort 0 c_ip s_ip host _time
| streamstats count AS RecordNumber by c_ip s_ip reset_on_change=true
| streamstats current=f last(_time) as LastTime last(RecordNumber) As previousRecord
| eval eventchange = if(RecordNumber-previousRecord!=1,"Yes","No")
| eval pause=if(eventchange="No",(_time-LastTime),"None")
| fillnull value="None" pause
The logic here is to group similar events (same host and c_ip), order them by _time, and then calculate the difference between _time and LastTime (the previous event's _time).
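To make that logic concrete outside Splunk, here is a minimal Python sketch of the same idea: sort by key and time, then subtract the previous timestamp within each key. The function name add_pause and the sample rows are made up for illustration; this models the intent of the search, not how streamstats is implemented.

```python
from itertools import groupby
from operator import itemgetter


def add_pause(events):
    """Return events sorted by (c_ip, host, time) with a 'pause' field.

    'pause' is the gap in seconds to the previous event with the same
    (c_ip, host), or None for the first event of each pair.
    """
    key = itemgetter("c_ip", "host")
    ordered = sorted(events, key=lambda e: (e["c_ip"], e["host"], e["time"]))
    out = []
    for _, group in groupby(ordered, key=key):
        prev = None  # no previous event yet for this (c_ip, host)
        for e in group:
            pause = None if prev is None else e["time"] - prev
            out.append({**e, "pause": pause})
            prev = e["time"]
    return out


# Hypothetical sample data: epoch-style times in seconds.
sample = [
    {"time": 100, "c_ip": "10.0.0.1", "host": "a.example"},
    {"time": 130, "c_ip": "10.0.0.2", "host": "a.example"},
    {"time": 160, "c_ip": "10.0.0.1", "host": "a.example"},
    {"time": 220, "c_ip": "10.0.0.1", "host": "a.example"},
]
for row in add_pause(sample):
    print(row["c_ip"], row["time"], row["pause"])
```

Note that the first event of every pair has no previous event, so its pause is empty. That is one reason not every event ends up with a previous-time field.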
You are making it way too complicated; try this:
index=web
| streamstats global=f window=2 current=t range(_time) AS diff BY src_ip dest_host
| eventstats count, stdev(diff) AS std BY src_ip dest_host
| where std < 5 AND count > 30
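If it helps to sanity-check the numbers by hand, the filter in the last two lines can be sketched in Python roughly like this. The function name find_beacons, the thresholds as keyword arguments, and the sample events are all assumptions for illustration; the sketch uses the sample standard deviation of the gaps between consecutive events of each pair.

```python
from collections import defaultdict
from statistics import stdev


def find_beacons(events, max_std=5.0, min_count=30):
    """Return {(src_ip, dest_host): (count, std)} for regular, frequent pairs."""
    times = defaultdict(list)
    for e in events:
        times[(e["src_ip"], e["dest_host"])].append(e["time"])

    hits = {}
    for pair, ts in times.items():
        ts.sort()
        # Gaps between consecutive events of the same pair.
        diffs = [b - a for a, b in zip(ts, ts[1:])]
        if len(diffs) < 2:
            continue  # stdev needs at least two gaps
        std = stdev(diffs)  # sample standard deviation
        if std < max_std and len(ts) > min_count:
            hits[pair] = (len(ts), std)
    return hits


# Hypothetical data: one host calling out every 60 s, one browsing irregularly.
regular = [
    {"time": 60 * i, "src_ip": "10.0.0.5", "dest_host": "c2.example"}
    for i in range(40)
]
irregular = [
    {"time": i * i, "src_ip": "10.0.0.9", "dest_host": "news.example"}
    for i in range(40)
]
print(find_beacons(regular + irregular))
```

Only the pair with evenly spaced requests survives the filter; the irregular one has plenty of events but a large spread in its gaps.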
So, first things first. There are some typos in the above, I would assume.
Line 2 sets ctime equal to _time, fine. Line 3 sorts on it, also fine. But from line 4 onward the search uses c_time, not ctime. Hmm.
In my own data (web logs as well), I had to change it all to
index="where_my_web_events_are"
| eval ctime = _time
| sort 0 + ctime
| streamstats global=f window=2 current=f last(ctime) AS prev_time by clientip
| eval diff = ctime - prev_time
Where I would start (unless of course someone with more time jumps in and fixes this all up for you) is to tear it down piece by piece. If you cut your data down to a slice of time with 10 or 20 events, it'll be easy to see what each part does and whether it's needed or not. For instance, I believe | streamstats global=f window=2 first(ctime) AS prev_time BY clientip, host_blah is equivalent to what it had been doing, but simpler.
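One thing worth checking on a small sample is how the two forms treat the first event of each group. The Python sketch below is only a model under stated assumptions: events are already sorted ascending for a single group, current=f last() yields the previous event's value, and a window of two that includes the current event (the streamstats default) yields the earlier of the two values. The function names are made up for illustration.

```python
def prev_with_current_false(times):
    """Model of current=f last(ctime): the previous event's time, or None."""
    return [None if i == 0 else times[i - 1] for i in range(len(times))]


def first_in_window_of_two(times):
    """Model of window=2 first(ctime) with the current event included."""
    return [times[max(i - 1, 0)] for i in range(len(times))]


# Hypothetical ctime values for one clientip, already sorted ascending.
ctimes = [100, 160, 220]
print(prev_with_current_false(ctimes))
print(first_in_window_of_two(ctimes))
```

Under that model the two agree everywhere except on the first event of a group, where the second form returns the event's own time, so diff would be 0 rather than empty. Whether that matters depends on how the later stdev and count are used, so compare both on your 10 or 20 events before trusting either.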
Find a small portion of data - like an hour, or a minute, or whatever - that you can start from with index=myindex | eval ctime= _time | sort 0 + ctime | ... and make sure that piece makes sense. Add ctime to your displayed fields so you won't miss it later.
Then add the streamstats command (and maybe the eval diff, because why not). Read the docs on each thing it's doing (which sounds like what you are doing anyway), and then add prev_time and diff to your displayed fields. Check that it's right - does it make sense?
Keep adding lines one at a time.
This is the way I learned SPL, interacting on Answers and in IRC (now it's Slack) and trying line by line to understand what each piece is doing.
Only when you think it is giving the right numbers