Alerting

Best Practices Guide for Alerting with Splunk on Splunk (S.O.S.)

jsmith10
Engager

We would like to know whether there is a best-practices guide for proactive and reactive monitoring of Splunk. In particular, which thresholds should we watch when using the S.o.S app, and what should we alert on to detect an issue with a search head, indexer, or heavy/universal forwarder?

Thanks.

1 Solution

hexx
Splunk Employee

This is something that we are likely to cover in the eventual S.o.S User Manual. Until then, I can offer the following recommendations:

  • Leverage the scripted inputs that ship with S.o.S to alert when the resource usage of Splunk processes is unreasonable. For example, the ps_sos.sh scripted input (and its Windows equivalent, ps_sos.ps1) tracks the CPU and memory usage of Splunk processes and categorizes them by process type (splunkd, Splunk Web, searches). It's fairly easy to build a search that sends an alert if any splunkd process exceeds 3GB in physical memory usage.

  • If needed, you can draw inspiration from the searches that power the S.o.S views. The search string behind each panel should be easily accessible, either by clicking the panel's "view results" link or by consulting the in-app help that expands when you click the "Learn More" button.
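
To make the 3GB example above concrete, here is a minimal sketch of the threshold logic in Python, outside of Splunk. It is only an illustration: the record layout and the field names (host, pid, command, rss_kb) are assumptions and not the actual output format of ps_sos.sh, and in practice the same condition would live in a saved search with an alert action.

```python
# Illustrative only: the record layout and field names (command, rss_kb) are
# assumptions, not the actual output format of ps_sos.sh.
from dataclasses import dataclass

THRESHOLD_KB = 3 * 1024 * 1024  # 3 GB expressed in KB


@dataclass
class ProcessSample:
    host: str
    pid: int
    command: str
    rss_kb: int  # resident (physical) memory in KB


def over_threshold(samples, threshold_kb=THRESHOLD_KB):
    """Return the splunkd samples whose physical memory exceeds the threshold."""
    return [
        s for s in samples
        if s.command == "splunkd" and s.rss_kb > threshold_kb
    ]


# Hypothetical samples, as if collected from two hosts.
samples = [
    ProcessSample("idx01", 1201, "splunkd", 3_500_000),
    ProcessSample("idx01", 1350, "python", 4_000_000),
    ProcessSample("sh01", 2210, "splunkd", 900_000),
]

for s in over_threshold(samples):
    print(f"ALERT host={s.host} pid={s.pid} rss_kb={s.rss_kb}")
```

Only the first sample is flagged: it is a splunkd process and its resident memory is above the threshold. The 3GB value is just the example figure from above, so tune it to what is normal for your own indexers and search heads.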

