Getting Data In

How do you handle traffic spikes from servers?

twinspop
Influencer

Our developers tend to use syslog carelessly. For example, yesterday one server decided to send 1,000 identical messages per second to let us know its DB instance was down. By the time it was taken care of, our license was busted on that indexer, again. Too many violations this month, so we're down hard.

I'm thinking of crafting a scheduled search like so:

earliest=-60m | eval size=len(_raw)/(1024*1024)
  | stats sum(size) as MB by host | where MB>50

Based on that search, I'd like to set up an alert script that would grab the offending servers' IPs, run "iptables -I INPUT 1 -s $IP -j DROP", and send an email/SNMP trap noting that this has occurred.
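
For reference, here's a rough sketch of what that alert script could look like. It assumes Splunk's legacy alert-script interface (the eighth argument passed to the script is the path to a gzipped CSV of the triggering search results), that the host field already holds an IP address, and that the script runs with enough privilege to call iptables; the mail relay and addresses are placeholders.

#!/usr/bin/env python
# block_noisy_hosts.py -- sketch of an alert script for the search above.
# Assumes argv[8] is the path to a gzipped CSV of the search results and
# that the "host" column already contains an IP address.
import csv
import gzip
import smtplib
import subprocess
import sys
from email.mime.text import MIMEText

results_file = sys.argv[8]              # gzipped CSV of the triggering results
blocked = []

with gzip.open(results_file, "rt") as f:
    for row in csv.DictReader(f):
        ip = row["host"]
        # Insert a DROP rule at the top of the INPUT chain for this sender
        # (requires root or sudo privileges).
        subprocess.call(["iptables", "-I", "INPUT", "1", "-s", ip, "-j", "DROP"])
        blocked.append("%s (%s MB)" % (ip, row.get("MB", "?")))

if blocked:
    # Placeholder mail settings -- swap in your own relay and addresses.
    msg = MIMEText("Blocked noisy syslog senders:\n" + "\n".join(blocked))
    msg["Subject"] = "Splunk alert: noisy syslog senders blocked"
    msg["From"] = "splunk@example.com"
    msg["To"] = "oncall@example.com"
    smtp = smtplib.SMTP("localhost")
    smtp.sendmail(msg["From"], [msg["To"]], msg.as_string())
    smtp.quit()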

However, in a distributed environment this task grows a little in complexity. Do I schedule the search on every indexer? Or only on the search head, and then make the script capable of pushing the iptables commands out to the indexers? Neither approach seems ideal.

So how do you deal with the occasional big spike in traffic? I'm trying to avoid manual intervention because we often get these spikes in the dead of night and I like to sleep.


hazekamp
Builder

I would recommend using Splunk's internal metrics for this:

index=_internal source=*metrics* group=per_host_thruput | rename series as host | eval MB=kb/1024 | stats sum(MB) as MB by host

Then save and schedule this search to run over your desired time window, set the alert condition to fire when MB>50, and have it trigger a script. The script is responsible for taking the hosts identified by the search, running iptables, and sending an email/trap.

You could also have the Splunk alert handle the email part, depending on how you want to be notified.
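
To wire that up, a savedsearches.conf stanza along these lines is one way to do it (a rough sketch only; the stanza name, cron schedule, and script name are placeholders, and the script itself would live in $SPLUNK_HOME/bin/scripts):

[Noisy syslog senders]
search = index=_internal source=*metrics* group=per_host_thruput | rename series as host | eval MB=kb/1024 | stats sum(MB) as MB by host | where MB>50
enableSched = 1
cron_schedule = */15 * * * *
dispatch.earliest_time = -60m
dispatch.latest_time = now
# Fire when the search returns any rows, i.e. any host over 50 MB
counttype = number of events
relation = greater than
quantity = 0
# Hypothetical script name -- place it in $SPLUNK_HOME/bin/scripts
action.script = 1
action.script.filename = block_noisy_hosts.py
action.email = 1
action.email.to = oncall@example.com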


JSapienza
Contributor

Before 4.2 it was messy. But now that I have been using the Deployment Monitor app in 4.2, it's been a breeze. I specifically use the "Forwarders Sending More Data Than Expected" search with an alert set to fire when any forwarder hits 20% over its "Average Daily KBps". This search uses forwarder_metrics, which seems to be pretty reliable. I also have an alert set to fire if we hit 80% of our daily indexing license volume. This way I have the option to stop a forwarder or an indexer to prevent the license bust. At this time I am handling it manually.
However, I could use the "Run Script" action on the alerts to kick off a script that remotely stops the forwarder or indexer, or takes any other appropriate action.
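
For example, here's a minimal sketch of such a "Run Script" action, assuming key-based SSH from the Splunk server to the offending hosts, a default forwarder install path, and the same legacy results-file argument as above; the "host" column name will depend on the alerting search:

#!/usr/bin/env python
# stop_noisy_forwarders.py -- sketch of a "Run Script" alert action.
# Assumes key-based SSH to each offending host and a default install path.
import csv
import gzip
import subprocess
import sys

with gzip.open(sys.argv[8], "rt") as f:   # gzipped CSV of the alert results
    for row in csv.DictReader(f):
        # Stop the forwarder on the offending host.
        subprocess.call(["ssh", row["host"], "/opt/splunkforwarder/bin/splunk", "stop"])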

twinspop
Influencer

I need to look into this; it sounds interesting. Unfortunately, a lot of my data comes from syslog directly into Splunk rather than through forwarders.


netwrkr
Communicator

"I'm trying to avoid manual intervention because we often get these spikes in the dead of night and I like to sleep."

Amen to that. Even better is when a disk error spews 10 million lines of logs in ~5 minutes.
