Solved: Source IPs Communicating with Far More Hosts Than ...

davidmonaghan · ‎10-11-2017

Hello All

I was wondering if someone could break down what the following search does and what the fields in the final output mean?

This search was taken from the Splunk Security Essentials app...

(tag=network tag=communicate) OR (index=pan_logs sourcetype=pan*traffic) OR (index=* sourcetype=opsec) OR (index=* sourcetype=cisco:asa)
| bucket _time span=1d | stats dc(dest_ip) as count by src_ip, _time
| eventstats max(_time) as maxtime 
| stats count as num_data_samples max(eval(if(_time >= relative_time(maxtime, "-1d@d"), 'count',null))) as "count" avg(eval(if(_time<relative_time(maxtime,"-1d@d"),'count',null))) as avg stdev(eval(if(_time<relative_time(maxtime,"-1d@d"),'count',null))) as stdev by "src_ip"
| eval lowerBound=(avg-stdev*2), upperBound=(avg+stdev*2)
| eval isOutlier=if(('count' < lowerBound OR 'count' > upperBound) AND num_data_samples >=7, 1, 0)

gjanders · ‎10-12-2017

Some of these searches are quite complicated and could do with some comments inside them 🙂

(tag=network tag=communicate) OR (index=pan_logs sourcetype=pan*traffic) OR (index=* sourcetype=opsec) OR (index=* sourcetype=cisco:asa)

This part is simple enough: it uses tags, sourcetypes and indexes to find the relevant events to look at.

 | bucket _time span=1d | stats dc(dest_ip) as count by src_ip, _time
 | eventstats max(_time) as maxtime

bucket groups the time of each event into 1-day blocks; from memory these snap to midnight (Monday midnight, Tuesday midnight, et cetera).
stats then provides a distinct count of destination IPs per source IP per day, in a field named count.
eventstats adds a maxtime field to every row holding the maximum (most recent) _time across all rows. Because _time has already been bucketed, maxtime is the start of the latest daily block.
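If it helps to see those three steps outside of SPL, here is a rough Python sketch of the same idea. The function name, the tuple layout of the events and the sample data are all made up for illustration, and it snaps days to UTC midnight, whereas Splunk applies its own timezone handling.

```python
from collections import defaultdict

DAY = 86400  # seconds in one day


def daily_distinct_dests(events):
    """events: iterable of (epoch_seconds, src_ip, dest_ip) tuples (hypothetical layout).

    Returns {(src_ip, day_start_epoch): number of distinct dest_ip values}.
    """
    seen = defaultdict(set)
    for ts, src_ip, dest_ip in events:
        day_start = ts - ts % DAY  # like bucket _time span=1d, but UTC only
        seen[(src_ip, day_start)].add(dest_ip)
    # like stats dc(dest_ip) as count by src_ip, _time
    return {key: len(dests) for key, dests in seen.items()}


# Made-up sample events
events = [
    (0, "10.0.0.1", "192.0.2.1"),
    (100, "10.0.0.1", "192.0.2.1"),      # repeat destination, counted once
    (200, "10.0.0.1", "192.0.2.2"),
    (DAY + 5, "10.0.0.1", "192.0.2.3"),  # falls into the next day
    (10, "10.0.0.2", "192.0.2.1"),
]

counts = daily_distinct_dests(events)

# like eventstats max(_time) as maxtime: one value shared by every row
maxtime = max(day_start for _, day_start in counts)
```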

 | stats count as num_data_samples max(eval(if(_time >= relative_time(maxtime, "-1d@d"), 'count',null))) as "count" avg(eval(if(_time<relative_time(maxtime,"-1d@d"),'count',null))) as avg stdev(eval(if(_time<relative_time(maxtime,"-1d@d"),'count',null))) as stdev by "src_ip"

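This performs stats by source IP. num_data_samples is a count of the daily rows for that source IP. count is the maximum of the daily count field over the rows whose _time is at or after maxtime minus 1 day (snapped to midnight), which covers the latest daily block and the one before it.
avg and stdev are the average and the standard deviation of the daily count field over the older rows, where _time is less than maxtime minus 1 day, so they describe the historical baseline for that source IP.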
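To make the split between the recent rows and the baseline rows concrete, here is a small Python sketch for a single source IP. The summarize helper and the sample numbers are invented for illustration. It assumes UTC day boundaries and returns None for stdev when there are fewer than two baseline rows, which may not match what Splunk returns in that case.

```python
from statistics import mean, stdev

DAY = 86400  # seconds in one day


def summarize(daily, maxtime):
    """daily: list of (day_start_epoch, count) rows for one src_ip (hypothetical layout)."""
    # Rough equivalent of relative_time(maxtime, "-1d@d"), UTC only
    cutoff = (maxtime - DAY) - (maxtime - DAY) % DAY
    recent = [count for t, count in daily if t >= cutoff]
    baseline = [count for t, count in daily if t < cutoff]
    return {
        "num_data_samples": len(daily),
        "count": max(recent) if recent else None,
        "avg": mean(baseline) if baseline else None,
        # sample standard deviation; needs at least two baseline rows
        "stdev": stdev(baseline) if len(baseline) > 1 else None,
    }


# Nine made-up days of distinct-destination counts, the last one a spike
daily = [(day * DAY, count) for day, count in enumerate([10, 12, 11, 9, 10, 12, 11, 13, 40])]
summary = summarize(daily, maxtime=8 * DAY)
```

With this data the last two days (13 and 40) are the recent rows, so count is 40, and the first seven days form the baseline.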

 | eval lowerBound=(avg-stdev*2), upperBound=(avg+stdev*2)
 | eval isOutlier=if(('count' < lowerBound OR 'count' > upperBound) AND num_data_samples >=7, 1, 0)

This part is fairly straightforward: lowerBound is the average minus 2*stdev and upperBound is the average plus 2*stdev.
isOutlier is then set to 1 if the count is less than lowerBound or greater than upperBound and there are at least 7 data samples; otherwise it is 0.
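The same arithmetic written out as a Python sketch, in case the eval syntax is getting in the way. The function name and the min_samples parameter are mine, and the numbers in the calls are made up.

```python
def is_outlier(count, avg, stdev, num_data_samples, min_samples=7):
    """Return 1 if count is more than 2 standard deviations from avg
    and there are at least min_samples daily rows, otherwise 0."""
    lower_bound = avg - stdev * 2
    upper_bound = avg + stdev * 2
    outside = count < lower_bound or count > upper_bound
    return int(outside and num_data_samples >= min_samples)


# avg=10, stdev=1 gives bounds of 8 and 12
spike = is_outlier(count=40, avg=10.0, stdev=1.0, num_data_samples=9)          # 40 > 12, enough samples
normal = is_outlier(count=11, avg=10.0, stdev=1.0, num_data_samples=9)         # inside the bounds
too_few_days = is_outlier(count=40, avg=10.0, stdev=1.0, num_data_samples=3)   # outside, but under 7 samples
```

Note that a count exactly on a bound is not flagged, because the comparisons are strict.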

I expected a where clause at the end of this to keep only the rows flagged as outliers, but I do not see it. Does that make sense, or are you more confused? 🙂
Effectively the query finds outliers based on the number of distinct destinations per source IP per day.

-
Alerts for Splunk Admins, Version Control for Splunk, Decrypt2 VersionControl For SplunkCloud

davidmonaghan · ‎10-12-2017

Thanks

That was pretty much my reading once I broke it down.

David
