Source IPs Communicating with Far More Hosts Than Normal (Assistant: Detect Spikes)

davidmonaghan
Explorer

Hello All

I was wondering if someone could break down what the following search does and what the fields in its final output mean?

This search was taken from the Splunk Security Essentials app:

(tag=network tag=communicate) OR (index=pan_logs sourcetype=pan*traffic) OR (index=* sourcetype=opsec) OR (index=* sourcetype=cisco:asa)
| bucket _time span=1d | stats dc(dest_ip) as count by src_ip, _time
| eventstats max(_time) as maxtime 
| stats count as num_data_samples max(eval(if(_time >= relative_time(maxtime, "-1d@d"), 'count',null))) as "count" avg(eval(if(_time<relative_time(maxtime,"-1d@d"),'count',null))) as avg stdev(eval(if(_time<relative_time(maxtime,"-1d@d"),'count',null))) as stdev by "src_ip"
| eval lowerBound=(avg-stdev*2), upperBound=(avg+stdev*2)
| eval isOutlier=if(('count' < lowerBound OR 'count' > upperBound) AND num_data_samples >=7, 1, 0)

gjanders
SplunkTrust

Some of these searches are quite complicated and could do with some comments inside them 🙂

(tag=network tag=communicate) OR (index=pan_logs sourcetype=pan*traffic) OR (index=* sourcetype=opsec) OR (index=* sourcetype=cisco:asa)

This part is simple enough: it uses tags, sourcetypes and indexes to find the relevant events to look at.

 | bucket _time span=1d | stats dc(dest_ip) as count by src_ip, _time
 | eventstats max(_time) as maxtime 

Group the time of each event into 1-day blocks. From memory, each _time is snapped to the start of its day, so you get Monday midnight, Tuesday midnight et cetera.
Then take a distinct count of destination IPs per source IP and time (where time is now per day); the result is stored in a field named count.
Finally, eventstats adds a maxtime field holding the maximum (most recent) _time across all of those rows.
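
To make the bucketing step concrete, here is a rough Python re-creation of it (not SPL, just the same arithmetic). The sample events and the helper names ts and daily_distinct_dests are made up for illustration, and the sketch buckets on UTC midnight, whereas Splunk would use the search-time timezone.

```python
from collections import defaultdict
from datetime import datetime, timezone

DAY = 86400  # seconds in one day


def ts(year, month, day, hour):
    """Hypothetical helper: epoch seconds for a UTC timestamp."""
    return datetime(year, month, day, hour, tzinfo=timezone.utc).timestamp()


# Made-up sample events: (epoch seconds, src_ip, dest_ip)
events = [
    (ts(2024, 1, 1, 9), "10.0.0.5", "8.8.8.8"),
    (ts(2024, 1, 1, 17), "10.0.0.5", "8.8.8.8"),  # same destination, same day
    (ts(2024, 1, 1, 18), "10.0.0.5", "1.1.1.1"),
    (ts(2024, 1, 2, 3), "10.0.0.5", "8.8.4.4"),
]


def daily_distinct_dests(events):
    """Mimic: bucket _time span=1d | stats dc(dest_ip) as count by src_ip, _time"""
    seen = defaultdict(set)
    for t, src, dest in events:
        day_start = int(t) - int(t) % DAY  # snap to midnight (UTC in this sketch)
        seen[(src, day_start)].add(dest)
    return {key: len(dests) for key, dests in seen.items()}


rows = daily_distinct_dests(events)

# Mimic: eventstats max(_time) as maxtime
maxtime = max(day_start for _, day_start in rows)
```

Here 10.0.0.5 gets a count of 2 for 1 January (the repeated 8.8.8.8 is only counted once) and 1 for 2 January, and maxtime is midnight on 2 January.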

 | stats count as num_data_samples max(eval(if(_time >= relative_time(maxtime, "-1d@d"), 'count',null))) as "count" avg(eval(if(_time<relative_time(maxtime,"-1d@d"),'count',null))) as avg stdev(eval(if(_time<relative_time(maxtime,"-1d@d"),'count',null))) as stdev by "src_ip"

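The stats runs per source IP. num_data_samples is a plain count of the daily rows for that source. The max function finds the maximum of the count field over rows at or after maxtime minus 1 day (snapped to midnight); since maxtime is itself a midnight bucket, that covers the latest day and the one before it.
avg and stdev are the average and the standard deviation (the sample standard deviation, not an average of anything) of the count field over rows where _time is older than that cutoff. Each if(...) returns null outside its window, and the stats functions ignore nulls.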
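
If it helps, the split between the "recent" value and the historical baseline can be sketched in Python like this. It is only an illustration of the arithmetic for a single src_ip: the function name summarize and the daily counts are invented, and maxtime - DAY stands in for relative_time(maxtime, "-1d@d") on the assumption that maxtime is already a midnight bucket.

```python
from statistics import mean, stdev

DAY = 86400  # seconds in one day


def summarize(rows, maxtime):
    """rows: list of (day_start_epoch, count) for ONE src_ip."""
    # Stand-in for relative_time(maxtime, "-1d@d"); assumes maxtime is a day start.
    cutoff = maxtime - DAY
    recent = [c for t, c in rows if t >= cutoff]    # latest bucket and the one before
    baseline = [c for t, c in rows if t < cutoff]   # everything older
    return {
        "num_data_samples": len(rows),              # stats count
        "count": max(recent) if recent else None,   # max(eval(if(_time >= cutoff, ...)))
        "avg": mean(baseline) if baseline else None,
        # statistics.stdev is the sample standard deviation, like SPL's stdev();
        # it needs two or more values, so this sketch returns None otherwise.
        "stdev": stdev(baseline) if len(baseline) > 1 else None,
    }


# Made-up daily distinct-destination counts for one source, oldest first.
counts = [11, 12, 10, 11, 12, 10, 11, 40]
rows = [(i * DAY, c) for i, c in enumerate(counts)]
summary = summarize(rows, maxtime=7 * DAY)
```

With these numbers the baseline is the first six days (average 11, standard deviation about 0.89), and count is 40, the larger of the last two days.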

 | eval lowerBound=(avg-stdev*2), upperBound=(avg+stdev*2)
 | eval isOutlier=if(('count' < lowerBound OR 'count' > upperBound) AND num_data_samples >=7, 1, 0)

This part is fairly straightforward: lowerBound is the average minus 2*stdev, and upperBound is the average plus 2*stdev.
Then isOutlier is set to 1 if the count is below lowerBound or above upperBound and there are at least 7 data samples; otherwise it is 0.
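
As a worked example of the bounds check, here is the same logic in Python. The function name is_outlier and the input numbers are made up; the 2 and the 7 come straight from the search.

```python
def is_outlier(count, avg, stdev, num_data_samples):
    """Mirror the two eval lines: bounds at avg +/- 2*stdev, minimum 7 samples."""
    lower_bound = avg - stdev * 2
    upper_bound = avg + stdev * 2
    outside = count < lower_bound or count > upper_bound
    flag = 1 if (outside and num_data_samples >= 7) else 0
    return flag, lower_bound, upper_bound


# Hypothetical source: usually about 11 destinations a day, give or take 1.
spike = is_outlier(count=40, avg=11, stdev=1.0, num_data_samples=8)
normal = is_outlier(count=12, avg=11, stdev=1.0, num_data_samples=8)
too_new = is_outlier(count=40, avg=11, stdev=1.0, num_data_samples=5)
```

The bounds come out as 9.0 and 13.0, so spike is flagged (40 > 13.0), normal is not, and too_new is not flagged either because it has fewer than 7 days of data even though 40 is well outside the bounds.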

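I expected a where clause at the end of this but I do not see one, so the search returns every source IP with isOutlier set to 1 or 0; appending | where isOutlier=1 would keep only the outliers. Does that make sense, or are you more confused? 🙂
Effectively, the query finds source IPs whose daily number of distinct destinations is an outlier compared with their own history.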

davidmonaghan
Explorer

Thanks

That was pretty much my reading once I broke it down.

David
