Alerting

Real-time alerts stopped working in Splunk

sathyasubburaj
Explorer

Hi,
The real-time alerts configured in Splunk suddenly stopped working. The scheduler.log file shows messages such as "reason=realtime rtsearches limit exceeded" or "reason=real time searches pending".

0 Karma

sathyasubburaj
Explorer

Hi Adonio and mwong,

Thanks for your comments and replies. The issue was fixed after changing the alerts to scheduled and setting the sharing option in the alerts tab to App.

0 Karma

mwong
Splunk Employee

Splunk limits the number of real-time searches that can run concurrently. See limits.conf.spec in your Splunk installation:

max_searches_per_cpu = <int>
* The maximum number of concurrent historical searches per CPU. The system-wide
  limit of historical searches is computed as:
  max_hist_searches =  max_searches_per_cpu x number_of_cpus + base_max_searches
* Note: the maximum number of real-time searches is computed as:
  max_rt_searches = max_rt_search_multiplier x max_hist_searches
* Defaults to 1

max_rt_search_multiplier = <decimal number>
* A number by which the maximum number of historical searches is multiplied to
  determine the maximum number of concurrent real-time searches
* Note: the maximum number of real-time searches is computed as:
  max_rt_searches = max_rt_search_multiplier x max_hist_searches
* Defaults to 1
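
As a rough worked example of the two formulas quoted above, the sketch below computes both limits in Python. The CPU count (8) and the base_max_searches value (6) are assumptions chosen for illustration, since the excerpt does not state them; check limits.conf on your own instance for the real values.

```python
def max_concurrent_searches(number_of_cpus, base_max_searches,
                            max_searches_per_cpu=1,
                            max_rt_search_multiplier=1):
    """Return (max_hist_searches, max_rt_searches) using the formulas
    quoted from limits.conf.spec above. The defaults of 1 come from
    that excerpt; base_max_searches must be supplied by the caller."""
    max_hist = max_searches_per_cpu * number_of_cpus + base_max_searches
    max_rt = max_rt_search_multiplier * max_hist
    return max_hist, max_rt


# Hypothetical host: 8 CPUs, base_max_searches assumed to be 6.
hist_limit, rt_limit = max_concurrent_searches(number_of_cpus=8,
                                               base_max_searches=6)
print(hist_limit, rt_limit)

# 37 real-time alerts (the number mentioned in this thread) would not
# fit under that limit at the same time.
print(37 > rt_limit)
```

Under these assumed numbers the real-time limit is far below 37, which would be consistent with the "rtsearches limit exceeded" messages in scheduler.log.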
0 Karma

sathyasubburaj
Explorer

Hi mwong,

Thanks for your comments and reply. The issue was fixed after changing the alerts to scheduled and setting the sharing option in the alerts tab to App.

0 Karma

adonio
Ultra Champion

Summary of the comments above:
37 real-time searches for alerts are too many for the system to handle.
The searches for the alerts are being skipped, and therefore the alerts are not triggered.
Use this search to find out which searches are being skipped and why:

 index=_internal sourcetype=scheduler status=skipped | table _time app user savedsearch_name reason

Use this alert best-practices doc to adjust the search time intervals and other scheduling parameters:
http://docs.splunk.com/Documentation/Splunk/6.5.3/Alert/AlertSchedulingBestPractices

0 Karma

niketn
Legend

@sathyasubburaj... Whether to use real-time searches/alerts should be decided based on your Splunk infrastructure. Avoid them unless absolutely necessary.

If your system can support them, the relevant settings are located under Splunk Settings > Access Control > Roles > (specific role, such as Admin):

User-level concurrent real-time search jobs limit and Role-level concurrent real-time search jobs limit

You might also need to adjust other settings such as Restrict time range, Restrict search terms and Limit total job disk quota accordingly.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

sathyasubburaj
Explorer

Hi Niketnilay,

Thanks for the response.
I created the alerts using the admin user/role.
Below are the settings in Splunk for the admin role:
User-level concurrent real-time search job limits: 100
Role-level concurrent search jobs limit: 200
Restrict time range: 0
Restrict search terms: *
Limit total job disk quota: 10000

Do I need to change the limits?

Below is the query I have configured as a real-time alert. It triggers when the number of results is greater than 1, and triggers once per hour.

index=windows sourcetype="WMI:Service" host= Name=HM* OR Name=SD* OR Name=H&M* OR Name=Board* OR Name=Salsa* status="Stopped" OR status="Stop"|dedup Name,host | rex "Description=(?P.+).*?" |table Name ,Description,status,_time,host |eval Name=upper(Name) |eval Env=case(host = "hostname", "DIT" ) |eval system=case(host = "hostname", "SDS") | convert timeformat="%H:%M:%S %Y-%m-%d" ctime(_time) |Rename Name as "SERVICE NAME" status as Status _time as Time host as "SERVER" Env as "Environment" system as "SYSTEM"

I have configured 37 similar alerts like the one above. Does this cause the issue?

0 Karma

sathyasubburaj
Explorer

Hi Niketnilay,

I am currently getting the log message reason="maxconcurrent limit reached". Any help would be greatly appreciated.

0 Karma

adonio
Ultra Champion

37 real-time alerts might overload your system, depending on the hardware specs.
Try this search to see whether the real-time alerts are being skipped:

index=_internal sourcetype=scheduler status=skipped | table _time app user savedsearch_name reason
0 Karma

sathyasubburaj
Explorer

Adonio,

Thank you. Yes, I can see that all of them are skipped :'( Is there any solution to this?

0 Karma

adonio
Ultra Champion

So the alerts are not firing because the searches behind them are not running (they are skipped). The most likely cause is that too many real-time searches are running at the same time and there are not enough cores to support them.
It is better to run scheduled searches for alerts at an interval and minimize the use of real-time searches.
For your alerts, consider configuring the searches to run every 5 or 15 minutes, for example, instead of in real time.
This doc article can help:
http://docs.splunk.com/Documentation/Splunk/6.5.3/Alert/AlertSchedulingBestPractices

0 Karma

sathyasubburaj
Explorer

Sure, I will read the document. One more question: if I reconfigure the 37 alerts as scheduled searches, will that also overload the system?

0 Karma

adonio
Ultra Champion

The doc above elaborates on best practices. I suggest prioritizing your alerts and adding that as a factor when you set them up, so the setup reflects which alerts have the highest priority.
Another important thing to watch is how long the search behind each alert takes to complete. You don't want to schedule a search to run every minute if it takes 3 minutes to complete, because it will never finish before the next run is due and will tie up a core.
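
The run-time point can be sketched as a quick sanity check in Python. This is only an illustration: the function name and the headroom factor are hypothetical, not Splunk settings, and the run times would come from your own measurements.

```python
def schedule_is_safe(run_time_s, interval_s, headroom=0.5):
    """Return True if a search that takes run_time_s seconds fits
    comfortably inside a schedule that fires every interval_s seconds.
    headroom is a hypothetical safety factor: 0.5 means the search
    should use at most half of the interval."""
    return run_time_s <= interval_s * headroom


# The case from the post: a 3-minute search scheduled every minute.
print(schedule_is_safe(run_time_s=180, interval_s=60))

# The same search on a 15-minute schedule.
print(schedule_is_safe(run_time_s=180, interval_s=15 * 60))
```

A search that fails this check is a candidate for a longer interval or a narrower time range.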

0 Karma

sathyasubburaj
Explorer

Thank you so much Adonio :-) :-) Let me look at the document and get back to you if I have questions!

0 Karma

sathyasubburaj
Explorer

Hi Adonio,

I have now scheduled the alerts through cron. I am getting the log output below, and the mail is still not triggered. Is there anything I am missing here?

04-28-2017 08:45:02.910 +0200 INFO SavedSplunker - savedsearch_id="nobody;BpServiceStatus;serv_sched_alert",
user="admin", app="BpServiceStatus", savedsearch_name="secc4069_sched_alert",
status=success, digest_mode=1, scheduled_time=1493361900, window_time=0,
dispatch_time=1493361901, run_time=0.940, result_count=0, alert_actions="",
sid="scheduler_adminBpServiceStatus_RMD5b4b50b150fb545cc_at_1493361900_82118", suppressed=0, thread_id="AlertNotifierWorker-0"

0 Karma

adonio
Ultra Champion

Looking at your data, it seems that either no alert actions are configured for this search or the results don't match the criteria specified to trigger the alert. Look at the alert_actions field:
if the search triggers an action, you should see a value there and not "".
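
To make that check concrete, here is a minimal Python sketch that splits the scheduler log line posted above into key=value fields and looks at status, result_count and alert_actions. The helper names and the wording of the diagnoses are my own assumptions; the interpretation of result_count=0 assumes the alert triggers on the number of results, as described earlier in the thread.

```python
import re

# The log line from the post above, joined into a single string.
LINE = ('04-28-2017 08:45:02.910 +0200 INFO SavedSplunker - '
        'savedsearch_id="nobody;BpServiceStatus;serv_sched_alert", '
        'user="admin", app="BpServiceStatus", '
        'savedsearch_name="secc4069_sched_alert", '
        'status=success, digest_mode=1, scheduled_time=1493361900, '
        'window_time=0, dispatch_time=1493361901, run_time=0.940, '
        'result_count=0, alert_actions="", '
        'sid="scheduler_adminBpServiceStatus_RMD5b4b50b150fb545cc_at_1493361900_82118", '
        'suppressed=0, thread_id="AlertNotifierWorker-0"')

# key="quoted value" or key=bare_value
PAIR = re.compile(r'(\w+)=("([^"]*)"|[^,\s]+)')


def parse_fields(line):
    """Return a dict of the key=value pairs found in a scheduler log line."""
    fields = {}
    for match in PAIR.finditer(line):
        quoted = match.group(3)
        fields[match.group(1)] = quoted if quoted is not None else match.group(2)
    return fields


def diagnose(fields):
    """Give a rough, hedged reading of why no alert mail was sent."""
    if fields.get("status") != "success":
        return "search did not run successfully"
    if int(fields.get("result_count", 0)) == 0:
        return "search ran but returned no results, so a result-count trigger would not fire"
    if not fields.get("alert_actions"):
        return "search returned results but no alert action fired"
    return "alert action fired: " + fields["alert_actions"]


fields = parse_fields(LINE)
print(fields["result_count"], repr(fields["alert_actions"]))
print(diagnose(fields))
```

For the line above, the search itself succeeded but found no results in that run, so an empty alert_actions is what you would expect if the trigger depends on the result count.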

0 Karma

sathyasubburaj
Explorer

Hi Adonio,

Thanks for your comments and reply. The issue was fixed after changing the alerts to scheduled and setting the sharing option in the alerts tab to App.

0 Karma