Alerting

Can you help me figure out why I'm seeing delays between the scheduling and dispatching of my alerts?

damucka
Builder

Hello,

I have a strange situation with delays in both the scheduling and the dispatching of my alerts.
They should run every minute, as per the cron schedule:

*/1 * * * *

but when I check the schedule and dispatch times, I can see that:

1/ The alerts get scheduled only every second minute.
2/ There is always a delay between schedule and dispatch, more or less always 2 minutes as well; please see the attached image.

[Attached image]

Could you please advise what's going wrong here?

How can I get my alerts executed every minute and get rid of the additional delay between schedule and dispatch?

I thought the schedule-to-dispatch delay could come from a resource bottleneck, but there is none.

Also, the fact that it is always 2 minutes does not fit the resource bottleneck theory.

Are there any parameters that could cause the above behavior?

Kind Regards,
Kamil


dkeck
Influencer

Hi,

Since this sounds like some config is actually telling Splunk to wait the 2 minutes you mentioned, I suggest you have a look at this:

https://answers.splunk.com/answers/550674/splunk-scheduler-how-can-i-reduce-latency-what-can.html

That user explains the schedule_window field for scheduled searches. It might be something you want to check.


dkeck
Influencer

Any update on this? Did you try that?

Please accept the answer if it helped 🙂


damucka
Builder

Hello,

Unfortunately it did not help. Following the description in the link, I explicitly granted the edit_search_schedule_window capability to my user in order to get schedule_window = 0 instead of the default.
All of my alerts, and some that are not mine, still have a lag of precisely 2 minutes. This is strange, because there are other alerts in the system that get dispatched immediately. When I compare the parameters of the two kinds in the system, they seem the same.

1/ my alert with the 2 min lag:

01-23-2019 13:29:15.239 +0100 INFO  SavedSplunker - savedsearch_id="nobody;mlbso;Anomaly Detection", search_type="scheduled", user="CDE", app="mlbso", savedsearch_name="Anomaly Detection", priority=default, status=success, digest_mode=1, scheduled_time=1548246420, window_time=0, dispatch_time=1548246548, run_time=5.707, result_count=0, alert_actions="", sid="scheduler__CDE__mlbso__RMD54eeec7fba2d5a846_at_1548246420_4375", suppressed=0, thread_id="AlertNotifierWorker-0"

2/ another alert, dispatched immediately (without lag):

01-23-2019 12:35:01.097 +0000 INFO  SavedSplunker - savedsearch_id="nobody;ids;sci_prod_us_east http 5xx", search_type="scheduled", user="ABC", app="ids", savedsearch_name="sci_prod_us_east http 5xx", priority=default, status=success, digest_mode=1, scheduled_time=1548246900, window_time=0, dispatch_time=1548246900, run_time=0.235, result_count=0, alert_actions="", sid="scheduler__ABC__ids__RMD5494dd652a11e08f4_at_1548246900_25299", suppressed=0, thread_id="AlertNotifierWorker-0"
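To put a number on the lag in the two entries above, here is a minimal Python sketch that subtracts scheduled_time from dispatch_time. The helper names (parse_fields, dispatch_lag_seconds) are made up for illustration, the sample lines are abbreviated copies of the entries above, and the key=value parsing is an assumption based only on how these two lines look, not on any documented log format.

```python
import re

# Matches key=value pairs where the value is either quoted or runs up to
# the next comma/whitespace (an assumption based on the two lines above).
FIELD_RE = re.compile(r'(\w+)=("[^"]*"|[^,\s]+)')


def parse_fields(line):
    """Return the key=value pairs of one scheduler log line as a dict."""
    return {key: value.strip('"') for key, value in FIELD_RE.findall(line)}


def dispatch_lag_seconds(line):
    """Seconds between the scheduled time and the actual dispatch time."""
    fields = parse_fields(line)
    return int(fields["dispatch_time"]) - int(fields["scheduled_time"])


# Abbreviated copies of the two entries above (fields omitted for brevity).
LAGGED = ('01-23-2019 13:29:15.239 +0100 INFO  SavedSplunker - '
          'savedsearch_name="Anomaly Detection", status=success, '
          'scheduled_time=1548246420, window_time=0, '
          'dispatch_time=1548246548, run_time=5.707')
IMMEDIATE = ('01-23-2019 12:35:01.097 +0000 INFO  SavedSplunker - '
             'savedsearch_name="sci_prod_us_east http 5xx", status=success, '
             'scheduled_time=1548246900, window_time=0, '
             'dispatch_time=1548246900, run_time=0.235')

print(dispatch_lag_seconds(LAGGED))     # 128
print(dispatch_lag_seconds(IMMEDIATE))  # 0
```

So the first entry was dispatched 128 seconds after its scheduled time, while the second one was dispatched in the same second it was scheduled for.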

Could you advise?
Is there any way to see the detailed scheduler log for this by issuing a search in Splunk?
What would be the reason for this kind of lag?

Kind Regards,
Kamil


dkeck
Influencer

Hi,

There is a scheduler.log in $SPLUNK_HOME/var/log/splunk; maybe this could help.


damucka
Builder

Hi,
In the scheduler log there is nothing more than entries like 1/ and 2/ above. Based on it, I am not able to figure out where the lag comes from.
Any further ideas?
