I'm running a somewhat large Splunk installation that monitors syslog for >40k hosts. Every once in a while, a host goes crazy and starts logging as fast as its network will carry it (SCSI errors, OOMKiller, etc).
Does anyone know of a way to alert in Splunk on a host that exceeds a certain number of messages per minute? I'd like to kick off a script or email whenever one host goes over, say, 1000 mpm, but with so many different hosts I can't really create a search with the hostnames pre-defined.
Any thoughts?
Well the good news is that you don't have to predefine the hosts... that's what fields are for 🙂
Create an alert for your search like this:
sourcetype=syslog | stats count by host
Schedule it to be run every minute, with a relative time span of:
earliest: -1m@m
latest: @m
with a custom condition to email you when:
WHERE count > 1000
From there you might want to tweak your search to throttle subsequent notifications, but there's an example of how you'd do what you're after.
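A variation on the above, as a sketch rather than the only way to do it: the threshold can be folded into the search itself, so the result set contains only the noisy hosts and the alert can simply fire whenever the search returns any results. The inline earliest/latest modifiers mirror the time span above, and the 1000 threshold is just the figure from the question.

```
sourcetype=syslog earliest=-1m@m latest=@m
| stats count by host
| where count > 1000
```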
Hope this helps 🙂
If I wanted to do the same, only for < 100, how do I force the stats count to count 0?
Another way is to use time buckets. This is more flexible, because you can run the same search over other, longer periods:
mysearch | bucket _time span=1m | stats count by _time host | WHERE count > 10000
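For instance, assuming a syslog sourcetype and the 1000 messages per minute threshold from the question (both placeholders to adjust), here is a sketch that looks back over the last full hour and reports how many minutes each host spent over the limit:

```
sourcetype=syslog earliest=-1h@h latest=@h
| bucket _time span=1m
| stats count by _time host
| where count > 1000
| stats count as noisy_minutes, max(count) as peak by host
```

Here noisy_minutes and peak are just illustrative field names, not anything Splunk defines.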
Does this work on the free version?
Will I be able to migrate alerts when I update from 3.x to 4.x?
Yep, edited the answer accordingly (it was late when I did that one, sorry!)
Aha! It needs to be:
WHERE count > 1000
Thanks!
What strange admin would disable the "count" command?
Maybe a typo?
Definitely looks like the right direction, but I get the following error message when I try to specify the custom condition:
"Encountered the following error while trying to update: In handler 'savedsearch': Cannot parse alert condition. Search operation 'count' is unknown. You might not have permission to run this operation."
I'm setting up this alert as the admin user, so permissions shouldn't be an issue.