I want an alert if an application pool's logging volume drops by more than 99%. (We have an issue where, before a JVM crashes, its logging slows right down, and Splunk often gets the blame.)
So I thought: get a count for the last 15 minutes, then get a count for the previous 15 minutes, both by app_pool. However, the numbers I am getting don't match up to a timechart.
tag=java |
stats count as "Current" by app_pool |
appendcols [search tag=java earliest=-30m@m latest=-15m@m|
stats count as "Previous" by app_pool ] |
eval myratio=Current/Previous |
eval prcIncrease=myratio*100 |
table app_pool, Current, Previous, myratio, prcIncrease |
where prcIncrease < 1
My results:
app_pool Current Previous myratio prcIncrease
stc 5352 3874403 0.001381 0.1381
Try running this over "Last 30 minutes":
tag=java
| timechart span=15m count BY app_pool
| untable _time app_pool Current
| streamstats current=f last(Current) AS Previous BY app_pool
| eval myratio=Current/Previous
| eval prcIncrease=myratio*100
| where prcIncrease < 1
This will actually work for any time range, because each 15-minute bucket is compared with the bucket before it for the same app_pool.
Try changing the order of the searches:
tag=java earliest=-30m@m latest=-15m@m | stats count as "Previous" by app_pool | appendcols [search tag=java earliest=-15m | stats count as "Current" by app_pool ]
| eval myratio=Current/Previous | eval prcIncrease=myratio*100 | table app_pool, Current, Previous, myratio, prcIncrease | where prcIncrease < 1
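One thing to keep in mind with either appendcols version: appendcols pairs rows by position, not by app_pool, so if the two searches return a different set or order of app pools, a Current count can end up next to some other pool's Previous count. A minimal sketch of a single-search alternative follows, assuming the alert runs over the last 30 minutes; the `period` field is a name made up for this example, and the other field names are reused from the question. It labels each event as Current or Previous and lets chart line the counts up by app_pool, with fillnull turning a missing Current count into 0 so a pool that has gone completely silent still matches.

```
tag=java earliest=-30m@m latest=@m
| eval period=if(_time >= relative_time(now(), "-15m@m"), "Current", "Previous")
| chart count over app_pool by period
| fillnull value=0 Current Previous
| eval myratio=Current/Previous
| eval prcIncrease=myratio*100
| where prcIncrease < 1
```

A pool with no events in the previous window gets a null ratio and is dropped by the where clause, which is probably what you want for a "logging dropped" alert.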