Alerting When Common Variable Passes Threshold

drizzo
Path Finder

I am monitoring the percent usage of my CPU and RAM by entering the following in the search:

(index=* host=* sourcetype="Perfmon:Memory" collection=Memory object=Memory counter="% Committed Bytes In Use") OR (index=* host=* sourcetype="Perfmon:CPU" counter="% Processor Time") | eval "Indexed Time"=strftime(_time, "%Y/%d/%m %H:%M") | eval "Computer Host"=host | eval "Event Type"=source | eval Object=object | eval "Percent Usage"=round(Value, 2)."%" | table "Indexed Time", "Computer Host", "Event Type", Object, "Percent Usage"

It all comes out well; however, I only want to show the events in which the 'Value' field (for both CPU and RAM) is above 50 (greater than 50%). I have tried adding the following, along with some other variations of it:

| where Value > 50

The search runs without errors, but it returns zero results, even though I have checked that there are values greater than 50.

My end goal is to make this an alert. Normally I would have these in two separate searches/alerts, but my boss wants it in one (hence the 'OR'). I have been looking through Splunk Docs and Splunk Answers, but I'm only getting information on using the "where" command. Any further help -- even if it's a helpful link -- would be greatly appreciated. Thank you.

1 Solution

niketn
Legend

@drizzo, have you tried the following? You need to provide a span based on how frequently your forwarder sends data, for example span=5m:

(index="xyz" host="abc" sourcetype="Perfmon:Memory" collection=Memory object=Memory counter="% Committed Bytes In Use") OR (index="xyz" host="abc" sourcetype="Perfmon:CPU" counter="% Processor Time")
| where Value>50
| timechart span=<YourForwarderSpan> list(counter) as Counter list(Value) as Value values(host) as "Computer Host" values(source) as "Event Type" values(object) as Object
| search Object="Memory" AND Memory="CPU"
| eval Value=round(Value,2)."%"
| rename Value as "Percent Usage"
| table _time "Computer Host" "Event Type" Object Counter "Percent Usage"

Also give the following a try:

(index="xyz" host="abc" sourcetype="Perfmon:Memory" collection=Memory object=Memory counter="% Committed Bytes In Use") OR (index="xyz" host="abc" sourcetype="Perfmon:CPU" counter="% Processor Time") (Value="5*" AND Value!="5.*") OR (Value="6*" AND Value!="6.*") OR (Value="7*" AND Value!="7.*") OR (Value="8*" AND Value!="8.*") OR (Value="9*" AND Value!="9.*") OR (Value="100")
| timechart span=<YourForwarderSpan> list(counter) as Counter list(Value) as Value values(host) as "Computer Host" values(source) as "Event Type" values(object) as Object
| search Object="Memory" AND Memory="CPU"
| eval Value=round(Value,2)."%"
| rename Value as "Percent Usage"
| table _time "Computer Host" "Event Type" Object Counter "Percent Usage"

If you are planning to set up an alert, you can try the following query, which fetches only the latest CPU and Memory performance counters from the host:

index="xyz" host="abc" sourcetype="Perfmon:Memory" collection=Memory object=Memory counter="% Committed Bytes In Use"
| head 1
| append [search index="xyz" host="abc" sourcetype="Perfmon:CPU" counter="% Processor Time" | head 1 ]
| timechart list(counter) as Counter list(Value) as Value values(host) as "Computer Host" values(source) as "Event Type" values(object) as Object
| search Object="Memory" AND Memory="CPU"
| eval Value=round(Value,2)."%"
| rename Value as "Percent Usage"
| table _time "Computer Host" "Event Type" Object Counter "Percent Usage"
____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
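
A possible alternative for the alert, sketched under a few assumptions: instead of head 1 plus append, a single stats with latest() returns the most recent Value per host and counter, and where then keeps only readings above the threshold of 50 from the question. index="xyz" and host="abc" are the same placeholders used above, and Value is assumed to be extracted as a number. If the CPU data contains several instances per host, an additional instance filter may be needed.

```spl
index="xyz" host="abc" ((sourcetype="Perfmon:Memory" object=Memory counter="% Committed Bytes In Use") OR (sourcetype="Perfmon:CPU" counter="% Processor Time"))
| rename COMMENT as "Keep the newest reading for each host and counter."
| stats latest(_time) as _time latest(Value) as Value by host sourcetype counter
| rename COMMENT as "Filter while Value is still numeric, then format for display."
| where Value > 50
| eval "Percent Usage"=round(Value, 2)."%"
| table _time host sourcetype counter "Percent Usage"
```

Saved as an alert that triggers when the number of results is greater than zero, this would fire whenever either counter is above 50 on its latest reading.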

DalJeanis
SplunkTrust

First, don't spend time "prettying up" the variable names before you've gotten the logic working. Until everything is coming safely out the end, you're just risking adding to the confusion.

Testing suggestions -

Set host= to a host you know has some issues, then try each of these chunks of code, one at a time, and see if they work.

index=* host="yourproblemchildhost"
(sourcetype="Perfmon:Memory" collection=Memory object=Memory counter="% Committed Bytes In Use") OR
(sourcetype="Perfmon:CPU" counter="% Processor Time")
| where Value>=50
| rename COMMENT as "The above should get you any events where CPU is above 50 or Memory is above 50 for a host."
| rename COMMENT as "Use the below to limit your results for testing - remove it when the search is working."
| head 20 

| rename COMMENT as "Now we rename them and check for any 5m _time period that a host has both."
| eval CPU=if(sourcetype="Perfmon:CPU",Value,null())
| eval Memory=if(sourcetype="Perfmon:Memory",Value,null())
| bin _time span=5m 
| stats max(CPU) as CPU max(Memory) as Memory by _time host
| where CPU>=50  AND Memory>=50


| rename COMMENT as "Now we can pretty them up."
| eval "CPU Percent Usage"=round(CPU, 2)."%" 
| eval "Memory Percent Usage"=round(Memory, 2)."%" 
| eval "Indexed Time"=strftime(_time, "%Y/%d/%m %H:%M") 
| eval "Computer Host"= host 
| table "Indexed Time", "Computer Host", "CPU Percent Usage", "Memory Percent Usage"
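
To illustrate the advice about getting the logic working before formatting, here is a minimal sketch that can be run in a Splunk search bar without any Perfmon data. It is an assumption, not something the thread confirms, that the original "| where Value > 50" was placed after the table command. Since table keeps only the fields it lists, a later where would have no Value field to compare, and a comparison against a missing field matches nothing. The field n and the test values 30, 60 and 90 are made up for the example.

```spl
| makeresults count=3
| streamstats count as n
| eval Value=n*30
| rename COMMENT as "Filter on the raw numeric field first, while it still exists."
| where Value > 50
| rename COMMENT as "Format and select the display fields last."
| eval "Percent Usage"=round(Value, 2)."%"
| table n "Percent Usage"
```

This keeps the rows where Value is 60 and 90. Moving the where line below the table line returns zero results, which matches the symptom described in the question.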

drizzo
Path Finder

This actually ended up working! I'm amazed with your response -- very detailed. But it is not letting me confirm yours as an answer.

niketn
Legend

@drizzo, glad it worked. I have converted my comment to answer. You can go ahead and accept to mark this as answered.

drizzo
Path Finder

Thank you!
