Alerting

How to check whether load is equally distributed across hosts and create an alert?

Vicky84
Explorer

Hi,
We generally raise tickets in Prod through Splunk by saving a search query as a Report/Alert. We now have a requirement to alert if the load is not equally distributed between the hosts. With the top command I see the result as a percentage, but I wasn't able to use it in a where clause to calculate the deviation.

Say we have 4 hosts sharing an app. Ideally the load should be distributed almost equally, but if the load in Prod is noticeably lower or higher on one of the hosts, I should get an alert.

Log example: index=data loggerName="xyzzy" threadName="thread1" appName="dataSync"

1 Solution

somesoni2
Revered Legend

Give this a try. I've used 1 percent as the threshold difference between a host's percent and the average percent (100/total hosts).

index=data loggerName="xyzzy" threadName="thread1" appName="dataSync"
| top host showperc=t showcount=f
| eventstats count
| eval average=100/count
| where percent<average-1 OR percent>average+1
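
For reference, here is a variant of the query above that uses stats instead of top and shows the deviation explicitly. It is only a sketch: the field names events, total_events, host_count and deviation are made up for illustration, and the 1-point threshold is carried over from the answer. One reason to consider it is that top returns only the 10 most common values unless limit is set, which matters once more than 10 hosts are involved.

```
index=data loggerName="xyzzy" threadName="thread1" appName="dataSync"
| stats count AS events BY host
| eventstats sum(events) AS total_events, count AS host_count
| eval percent=round(100*events/total_events, 2), average=round(100/host_count, 2)
| eval deviation=abs(percent-average)
| where deviation>1
```

With 4 hosts the average is 25, so a host is returned when its share falls below 24% or rises above 26%. Saved as an alert, the search can trigger when the number of results is greater than 0. Note that a host sending no events at all does not appear in the results, so the average is then computed over the remaining hosts; if the expected number of hosts is fixed, hard-coding it (for example average=100/4) is one way around that.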

asplunk789
Loves-to-Learn Everything

Would the same approach work for comparing hundreds of servers of different types (app servers, DB servers, etc.)?
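
One possible way to extend the same idea to many servers of different types is to compare each host only against its own group. The sketch below assumes a field called role (hypothetical, for example populated from a lookup) that holds the server type, and it reuses the base search from the question purely as a placeholder. It also swaps the fixed 1-point threshold for a relative one, because with 100 or more hosts in a group the average share is 1% or less and percent<average-1 can never be true. The 20% tolerance is an arbitrary placeholder.

```
index=data loggerName="xyzzy" threadName="thread1" appName="dataSync"
| stats count AS events BY role, host
| eventstats sum(events) AS role_events, count AS role_hosts BY role
| eval percent=100*events/role_events, average=100/role_hosts
| where abs(percent-average)>0.2*average
```

Each remaining row is a host whose share of its group's events differs from the group's even share by more than 20% of that even share.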

mattymo
Splunk Employee

Hi Vicky84,

I would recommend collecting host metrics using something like collectd, the nix_ta, or nmon, rather than top, so you can get the CPU trend over time. Then you could compare the trends and calculate a deviation.
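
As a rough illustration of this suggestion, the sketch below compares each host's average CPU usage over the last 15 minutes with the average across all hosts. Everything specific in it is an assumption: it presumes CPU events like those collected by the Splunk Add-on for Unix and Linux (index=os, sourcetype=cpu, a CPU field with the value all, and a pctIdle field), and the 10-point threshold and the names cpu_used, avg_cpu, fleet_avg and deviation are placeholders. Adjust them to whatever your metrics collection actually produces.

```
index=os sourcetype=cpu CPU=all earliest=-15m
| eval cpu_used=100-pctIdle
| stats avg(cpu_used) AS avg_cpu BY host
| eventstats avg(avg_cpu) AS fleet_avg
| eval deviation=round(avg_cpu-fleet_avg, 1)
| where abs(deviation)>10
```

A plain difference from the fleet average is used here rather than a standard-deviation score because with only a handful of hosts a single outlier inflates the standard deviation, which makes such scores hard to threshold.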

- MattyMo

Vicky84
Explorer

In a larger context, what you are suggesting may make more sense for monitoring OS stats, but I am not well versed in that, and something like somesoni2's Splunk query would do the task for me.

Vicky84
Explorer

Exactly as I wanted !
