I need to monitor for unscheduled downtime while ignoring scheduled downtime whose window does not start or end on the hour.
Detecting downtime is easy (look for no heartbeat events in the last 60s).
Setting crontab to start running the searches on the hour (say, 01:00) is easy.
However, it seems to be impossible to set crontab to start running every minute from an off-the-hour time (say, 00:40).
Is there another way to implement such a search?
I'm not sure I understand your question, but a crontab entry of:
40 * * * *
will run every hour at 40 minutes past the hour. I don't know what you mean, though, when you say you want to run every minute but also only at 40 minutes past the hour.
I don't think you can do this using cron alone. I assume you are using this for an alert, so the way I would approach it is to run every minute between midnight and 3am.
Cron * 0-3 * * *
Then use the custom condition and where command to limit the time. Assuming you were looking for zero events...
where count=0 AND date_hour*100+date_minute>39 AND date_hour*100+date_minute<221
so my savedsearch looks like this
[downtime]
search = heartbeat | stats count by host
enableSched = 1
cron_schedule = * 0-3 * * *
dispatch.earliest_time = -1m@m
dispatch.latest_time = @m
alert.track = 1
alert_condition = where count=0 AND date_hour*100+date_minute>39 AND date_hour*100+date_minute<221
counttype = custom
action.email = 1
action.email.inline = 1
action.email.sendresults = 1
action.email.to = firstname.lastname@example.org
alert.severity = 4
alert.suppress = 1
alert.suppress.period = 5m
displayview = flashtimeline
request.ui_dispatch_view = flashtimeline
vsid = gq1ya7b9
Oops! Sorry, I wasn't thinking straight there. Adding the following to your search will do what you want:
| where date_hour*100+date_minute>39 AND date_hour*100+date_minute<221
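For anyone unsure why that arithmetic works: date_hour*100+date_minute turns a time of day into an HHMM-style integer (00:40 becomes 40, 02:20 becomes 220), so the two comparisons keep everything from 00:40 through 02:20 inclusive. Here is a rough Python sketch of the same check, outside Splunk. The function name in_window is just something I made up for illustration.

```python
from datetime import datetime


def in_window(ts):
    """Mirror of: date_hour*100+date_minute>39 AND date_hour*100+date_minute<221."""
    # Encode the time of day as an HHMM-style integer, e.g. 02:20 -> 220.
    hhmm = ts.hour * 100 + ts.minute
    return 39 < hhmm < 221


# A few boundary checks around the 00:40 to 02:20 window.
for hour, minute in [(0, 39), (0, 40), (1, 0), (2, 20), (2, 21)]:
    stamp = datetime(2024, 1, 1, hour, minute)
    print(stamp.strftime("%H:%M"), in_window(stamp))
```

The encoding skips values such as 60 to 99 (there is no minute 60), which is harmless here because only the ordering of the values matters.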
This is the very, very low tech way of handling this - and it does not scale well - but you could always have multiple copies of the same scheduled search/alert.
It looks like it would take three cron entries:
cron_schedule=40-59 00 * * *
cron_schedule=* 01 * * *
cron_schedule=00-20 02 * * *
Note I'm assuming that Splunk's crontab is as flexible as Vixie cron - which it should be.
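If you want to sanity-check a set of entries like this before creating the copies, a small Python sketch along these lines can expand the minute and hour fields and compare them with the intended window. It assumes the window is 00:40 through 02:20 inclusive and only understands '*', a single number, or a simple A-B range (no lists or steps). The helper names expand and fire_times are hypothetical, not anything from Splunk or cron.

```python
def expand(field, lo, hi):
    """Expand a cron field limited to '*', 'N', or 'A-B' into a set of ints."""
    if field == "*":
        return set(range(lo, hi + 1))
    if "-" in field:
        start, end = field.split("-")
        return set(range(int(start), int(end) + 1))
    return {int(field)}


def fire_times(entry):
    """Return the (hour, minute) pairs at which a cron entry fires each day."""
    minute, hour = entry.split()[:2]
    return {(h, m) for h in expand(hour, 0, 23) for m in expand(minute, 0, 59)}


# One entry per segment of the assumed 00:40 to 02:20 window.
entries = ["40-59 0 * * *", "* 1 * * *", "0-20 2 * * *"]
covered = set().union(*(fire_times(e) for e in entries))

# The same window expressed with the hour*100+minute comparison.
window = {(h, m) for h in range(24) for m in range(60) if 39 < h * 100 + m < 221}

print(covered == window)
```

Note that an hour field of 1-2 in the middle entry would keep firing through 02:59, so the middle entry is limited to hour 1 and the last entry handles 02:00 to 02:20.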