
Source IPs Communicating with Far More Hosts Than Normal (Assistant: Detect Spikes)

davidmonaghan
Explorer

Hello All

I was wondering if someone could break down what the following search does and what the final output fields mean?

This search was taken from the Splunk Security Essentials app...

(tag=network tag=communicate) OR (index=pan_logs sourcetype=pan*traffic) OR (index=* sourcetype=opsec) OR (index=* sourcetype=cisco:asa)
| bucket _time span=1d | stats dc(dest_ip) as count by src_ip, _time
| eventstats max(_time) as maxtime 
| stats count as num_data_samples max(eval(if(_time >= relative_time(maxtime, "-1d@d"), 'count',null))) as "count" avg(eval(if(_time<relative_time(maxtime,"-1d@d"),'count',null))) as avg stdev(eval(if(_time<relative_time(maxtime,"-1d@d"),'count',null))) as stdev by "src_ip"
| eval lowerBound=(avg-stdev*2), upperBound=(avg+stdev*2)
| eval isOutlier=if(('count' < lowerBound OR 'count' > upperBound) AND num_data_samples >=7, 1, 0)

gjanders
SplunkTrust

Some of these searches are quite complicated and could do with some comments inside them 🙂

(tag=network tag=communicate) OR (index=pan_logs sourcetype=pan*traffic) OR (index=* sourcetype=opsec) OR (index=* sourcetype=cisco:asa)

This part is simple enough: it uses tags, sourcetypes and indexes to find the relevant events to look at.

 | bucket _time span=1d | stats dc(dest_ip) as count by src_ip, _time
 | eventstats max(_time) as maxtime 

Group the time of each event into a 1 day block; from memory it snaps to midnight, so Monday midnight, Tuesday midnight et cetera.
Then provide a distinct count of destination IPs by source IP and time (where time is now per day).
Add an additional field holding the maximum (most recent) time across all events...
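
For anyone who finds this easier to read as code, the bucketing and distinct count described above can be sketched roughly in Python. This is not SPL and not part of the Security Essentials app; the function name `daily_distinct_dests`, the tuple layout of `events`, the sample IPs and the choice to cut days in UTC are all assumptions made for illustration.

```python
from collections import defaultdict

SECONDS_PER_DAY = 86400


def daily_distinct_dests(events):
    """Count distinct dest_ip values per (src_ip, day).

    `events` is a hypothetical iterable of (epoch_seconds, src_ip, dest_ip)
    tuples; it stands in for the firewall events the base search returns.
    """
    dests_seen = defaultdict(set)
    for epoch, src_ip, dest_ip in events:
        # Rough equivalent of `bucket _time span=1d`: snap to the start of the day.
        # Assumption: days are cut in UTC here to keep the arithmetic simple.
        day_start = epoch - (epoch % SECONDS_PER_DAY)
        dests_seen[(src_ip, day_start)].add(dest_ip)
    # Rough equivalent of `stats dc(dest_ip) as count by src_ip, _time`.
    return {key: len(dests) for key, dests in dests_seen.items()}


DAY1 = 1715644800  # 2024-05-14 00:00:00 UTC, an arbitrary example day
events = [
    (DAY1 + 3600, "10.0.0.5", "192.0.2.1"),
    (DAY1 + 7200, "10.0.0.5", "192.0.2.1"),  # same destination again, counted once
    (DAY1 + 9000, "10.0.0.5", "192.0.2.2"),
    (DAY1 + SECONDS_PER_DAY + 60, "10.0.0.5", "192.0.2.3"),  # lands in the next day
]
counts = daily_distinct_dests(events)
maxtime = max(day for _, day in counts)  # like `eventstats max(_time) as maxtime`
print(counts, maxtime)
```

Note that `maxtime` is taken after bucketing, so it is the start of the latest day seen rather than the timestamp of the latest raw event.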

 | stats count as num_data_samples max(eval(if(_time >= relative_time(maxtime, "-1d@d"), 'count',null))) as "count" avg(eval(if(_time<relative_time(maxtime,"-1d@d"),'count',null))) as avg stdev(eval(if(_time<relative_time(maxtime,"-1d@d"),'count',null))) as stdev by "src_ip"

This performs stats by source IP: a count of rows (one row per day, so the number of daily samples), and a max function that finds the maximum of the count field where the _time is at or after the maxtime minus 1 day (snapped to midnight).
It also calculates the average and the standard deviation of the count field where the _time is less than the maxtime minus 1 day, which gives the historical baseline.
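
The split between the recent value and the historical baseline can be sketched in Python as below. It is only my reading of the stats line, not the app's own code; `summarize_src`, the dict layout and the sample numbers are made up, and the cutoff assumes plain 24-hour days.

```python
from statistics import mean, stdev

SECONDS_PER_DAY = 86400


def summarize_src(daily_counts, maxtime):
    """Split one src_ip's daily counts into 'recent' and 'baseline'.

    `daily_counts` is a hypothetical {day_start_epoch: distinct_dest_count} dict
    for a single src_ip; `maxtime` is the latest day bucket across all events.
    """
    # relative_time(maxtime, "-1d@d"): maxtime already sits on midnight, so this
    # is simply one day earlier (assuming plain 24-hour days, no DST shift).
    cutoff = maxtime - SECONDS_PER_DAY
    recent = [c for day, c in daily_counts.items() if day >= cutoff]
    baseline = [c for day, c in daily_counts.items() if day < cutoff]
    return {
        "num_data_samples": len(daily_counts),     # stats count
        "count": max(recent) if recent else None,  # max over the recent buckets
        "avg": mean(baseline) if baseline else None,
        # Python's sample stdev needs at least two baseline values.
        "stdev": stdev(baseline) if len(baseline) >= 2 else None,
    }


DAY0 = 1715644800  # 2024-05-14 00:00:00 UTC, an arbitrary starting day
history = [4, 6, 5, 5, 4, 6, 5, 5, 40]  # nine made-up daily counts, oldest first
daily_counts = {DAY0 + i * SECONDS_PER_DAY: c for i, c in enumerate(history)}
summary = summarize_src(daily_counts, maxtime=max(daily_counts))
print(summary)
```

Because _time was bucketed before maxtime was computed, the `>=` comparison appears to pick up both the latest day bucket and the one before it, so "count" is the larger of those two days and the baseline covers everything older. `statistics.stdev` is the sample standard deviation, which I believe matches SPL's stdev (stdevp being the population version).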

 | eval lowerBound=(avg-stdev*2), upperBound=(avg+stdev*2)
 | eval isOutlier=if(('count' < lowerBound OR 'count' > upperBound) AND num_data_samples >=7, 1, 0)

This part is fairly straightforward: calculate the average minus 2*stdev (lowerBound) and the average plus 2*stdev (upperBound).
Then set the isOutlier flag to 1 if the count is below the lower bound or above the upper bound and there are at least 7 data samples; otherwise it is 0.
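
As a small worked example of the two eval lines, here is a Python sketch of the same arithmetic. The function name `flag_outlier` and the parameter names `multiplier` and `min_samples` are hypothetical (the search hard-codes 2 and 7), and the input numbers are invented.

```python
def flag_outlier(count, avg, stdev, num_data_samples, multiplier=2, min_samples=7):
    """Return (isOutlier, lowerBound, upperBound) for one src_ip row."""
    lower_bound = avg - stdev * multiplier
    upper_bound = avg + stdev * multiplier
    outside = count < lower_bound or count > upper_bound
    # The flag is only raised when there is enough history to trust the baseline.
    is_outlier = 1 if outside and num_data_samples >= min_samples else 0
    return is_outlier, lower_bound, upper_bound


# Made-up numbers: a source that usually talks to about 5 hosts a day.
spike = flag_outlier(count=40, avg=5, stdev=0.8, num_data_samples=9)
normal = flag_outlier(count=6, avg=5, stdev=0.8, num_data_samples=9)
too_new = flag_outlier(count=40, avg=5, stdev=0.8, num_data_samples=5)
print(spike, normal, too_new)
```

With these inputs the bounds come out at roughly 3.4 and 6.6, so 40 destinations is flagged, 6 is not, and the same spike from a source with only 5 days of history is not flagged either.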

I expected a where clause at the end of this (for example | where isOutlier=1) but I do not see one. Does that make sense, or are you more confused? 🙂
Effectively the query finds outliers based on the number of distinct destinations per source IP per day.

davidmonaghan
Explorer

Thanks

That was pretty much my reading once I broke it down.

David
