Source IPs Communicating with Far More Hosts Than Normal (Assistant: Detect Spikes)

davidmonaghan
Explorer

Hello All

I was wondering if someone could break down what the following search does and what the final output fields mean.

This search was taken from the Splunk Security Essentials app...

(tag=network tag=communicate) OR (index=pan_logs sourcetype=pan*traffic) OR (index=* sourcetype=opsec) OR (index=* sourcetype=cisco:asa)
| bucket _time span=1d | stats dc(dest_ip) as count by src_ip, _time
| eventstats max(_time) as maxtime 
| stats count as num_data_samples max(eval(if(_time >= relative_time(maxtime, "-1d@d"), 'count',null))) as "count" avg(eval(if(_time<relative_time(maxtime,"-1d@d"),'count',null))) as avg stdev(eval(if(_time<relative_time(maxtime,"-1d@d"),'count',null))) as stdev by "src_ip"
| eval lowerBound=(avg-stdev*2), upperBound=(avg+stdev*2)
| eval isOutlier=if(('count' < lowerBound OR 'count' > upperBound) AND num_data_samples >=7, 1, 0)

gjanders
SplunkTrust

Some of these searches are quite complicated and could do with some comments inside them 🙂

(tag=network tag=communicate) OR (index=pan_logs sourcetype=pan*traffic) OR (index=* sourcetype=opsec) OR (index=* sourcetype=cisco:asa)

This part is simple enough: it uses tags, sourcetypes and indexes to find the relevant events to look at.

 | bucket _time span=1d | stats dc(dest_ip) as count by src_ip, _time
 | eventstats max(_time) as maxtime 

Group the time of each event into 1-day blocks; each event's _time is snapped to the start of its day (Monday midnight, Tuesday midnight, et cetera).
Then provide a distinct count of destination IPs by source IP and time (where time is now per day).
Add an additional field, maxtime, holding the maximum/most recent _time across all rows...
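
To make the bucketing step concrete, here is a minimal Python sketch of the same idea. It is not how Splunk implements it, and the function name and the sample events are made up for illustration. It also assumes UTC day boundaries, whereas Splunk snaps to the search-time timezone.

```python
from collections import defaultdict

SECONDS_PER_DAY = 86400

def daily_distinct_dests(events):
    """events: iterable of (epoch_seconds, src_ip, dest_ip) tuples (hypothetical shape)."""
    dests = defaultdict(set)
    for ts, src_ip, dest_ip in events:
        day = ts - (ts % SECONDS_PER_DAY)   # like: bucket _time span=1d (UTC assumed)
        dests[(src_ip, day)].add(dest_ip)   # collect distinct destinations per src_ip/day
    # like: stats dc(dest_ip) as count by src_ip, _time
    return {key: len(ips) for key, ips in dests.items()}

# Made-up sample events spread over two days.
events = [
    (100, "10.0.0.1", "8.8.8.8"),
    (200, "10.0.0.1", "1.1.1.1"),
    (300, "10.0.0.1", "8.8.8.8"),          # repeat destination, not counted twice
    (400, "10.0.0.2", "8.8.8.8"),
    (SECONDS_PER_DAY + 5, "10.0.0.1", "9.9.9.9"),
]

counts = daily_distinct_dests(events)
# like: eventstats max(_time) as maxtime
maxtime = max(day for _, day in counts)
```

Each key in counts is one row of the stats output, and maxtime is the same value on every row, which is what eventstats adds.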

 | stats count as num_data_samples max(eval(if(_time >= relative_time(maxtime, "-1d@d"), 'count',null))) as "count" avg(eval(if(_time<relative_time(maxtime,"-1d@d"),'count',null))) as avg stdev(eval(if(_time<relative_time(maxtime,"-1d@d"),'count',null))) as stdev by "src_ip"

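This performs stats by source IP: a count of rows (num_data_samples, the number of daily buckets for that source IP), and a max function that finds the maximum of the count field where _time is at or after maxtime minus 1 day (snapped to midnight). Because maxtime is itself a day bucket, that covers the two most recent days.
It also takes the average and the standard deviation of the count field where _time is earlier than maxtime minus 1 day, which gives the baseline to compare against.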
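
As a rough illustration of this conditional aggregation, the following Python sketch computes the same four values for a single source IP. The function name and the daily counts are invented, UTC day boundaries are assumed, and returning None when the baseline is too small is a choice made for this sketch rather than documented Splunk behaviour.

```python
from statistics import mean, stdev

SECONDS_PER_DAY = 86400

def summarize(daily_counts, maxtime):
    """daily_counts: list of (day_epoch, count) rows for one src_ip (hypothetical shape)."""
    # Rough equivalent of relative_time(maxtime, "-1d@d"), assuming UTC days.
    cutoff = maxtime - SECONDS_PER_DAY
    cutoff -= cutoff % SECONDS_PER_DAY
    recent = [c for day, c in daily_counts if day >= cutoff]    # _time >= cutoff
    baseline = [c for day, c in daily_counts if day < cutoff]   # _time <  cutoff
    return {
        "num_data_samples": len(daily_counts),
        "count": max(recent) if recent else None,
        "avg": mean(baseline) if baseline else None,
        # Sample standard deviation; needs at least two baseline values.
        "stdev": stdev(baseline) if len(baseline) >= 2 else None,
    }

# Eight made-up daily buckets (days 0..7); the last one is maxtime.
rows = [(d * SECONDS_PER_DAY, c) for d, c in enumerate([10, 12, 11, 9, 10, 12, 40, 13])]
summary = summarize(rows, maxtime=7 * SECONDS_PER_DAY)
```

Here the two most recent days (40 and 13) feed the count field, while days 0 to 5 form the baseline for avg and stdev.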

 | eval lowerBound=(avg-stdev*2), upperBound=(avg+stdev*2)
 | eval isOutlier=if(('count' < lowerBound OR 'count' > upperBound) AND num_data_samples >=7, 1, 0)

This part is fairly straightforward: compute the lower bound as avg minus 2*stdev and the upper bound as avg plus 2*stdev.
Then set isOutlier to 1 if the count is below the lower bound or above the upper bound and there are at least 7 data samples; otherwise set it to 0.
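
The two eval steps can be sketched in Python as below. The function name and the example numbers are hypothetical, and the parameters min_samples and width simply expose the 7 and the 2 that are hard-coded in the search.

```python
def flag_outlier(count, avg, stdev, num_data_samples, min_samples=7, width=2):
    # like: eval lowerBound=(avg-stdev*2), upperBound=(avg+stdev*2)
    lower_bound = avg - stdev * width
    upper_bound = avg + stdev * width
    # like: eval isOutlier=if((count < lowerBound OR count > upperBound)
    #                         AND num_data_samples >= 7, 1, 0)
    outside = count < lower_bound or count > upper_bound
    is_outlier = 1 if (outside and num_data_samples >= min_samples) else 0
    return lower_bound, upper_bound, is_outlier

# Made-up values: a baseline of about 10 destinations per day and a spike to 40.
lower, upper, is_outlier = flag_outlier(count=40, avg=10.0, stdev=1.5, num_data_samples=8)
```

With these numbers the bounds are 7.0 and 13.0, so a count of 40 is flagged; the same count with fewer than 7 data samples would not be.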

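I expected a where clause at the end of this (something like | where isOutlier=1) but I do not see one, so the search returns every source IP with isOutlier set to 1 or 0 rather than only the outliers. Does that make sense or are you more confused? 🙂
Effectively the query finds outliers based on the number of distinct destinations per source IP per day.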

davidmonaghan
Explorer

Thanks

That was pretty much my reading once I broke it down.

David
