Alerting

How to build an ongoing alert that catches a sudden rise (spike) in a certain error code?

gingersoftware
New Member

Hi Guys,

I could really use an ongoing alert that catches a sudden rise (spike) in a certain error code (such as 404 or 502, etc.).
I gave this some thought on how to achieve it, and... well... I could really use your help 🙂

From my understanding, the search query should "know" or "sense" the normal traffic (not sure over how long a window, maybe 1-2 hours) and alert when there is a spike in the error code compared to 1-2 hours ago.
I think the error-code spike threshold should be more than 5% of total traffic, sustained for longer than 90 seconds.
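
To make this concrete, here is the rough shape of what I'm imagining (an untested sketch; the index name, the status field, and the doubling factor are all placeholders):

index=web_logs
| timechart span=90s count as total, count(eval(status="404" OR status="502")) as errors
| eval error_pct=round(100*errors/total,2)
| streamstats window=40 current=f avg(error_pct) as baseline_pct
| where error_pct>5 AND error_pct>2*baseline_pct

The streamstats window of 40 buckets x 90s covers roughly the previous hour as the baseline, and the where clause applies the "more than 5% of total traffic" threshold on top of the spike condition.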

I appreciate your help.

Anam
Community Manager

Hi @gingersoftware

My name is Anam Siddique and I am the Community Content Specialist for Splunk Answers. Please accept the answer that worked for you so other members of the community can benefit from it. If none of the answers have worked for you so far, please post further comments so someone can help you.

Thanks

felipesewaybric
Contributor

Timewrap will do the trick.
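
For example, a rough, untested sketch (the wrapped field names here follow timewrap's default relative naming, so verify them against your actual output):

tag=NginxLogs host=www1 OR host=www2 status=404
| timechart span=90s count as errors
| timewrap 1h
| where errors_latest_hour > 2 * errors_1hour_before

Run it over the last two hours so there are two series to compare; each 90-second slot of the latest hour is matched against the same slot one hour earlier. Note this compares raw 404 counts; to apply the percent-of-traffic threshold you would add a total count as in the other answers.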

woodcock
Esteemed Legend

Check out this INCREDIBLE answer by @mmodestino here:

https://answers.splunk.com/answers/511894/how-to-use-the-timewrap-command-and-set-an-alert-f.html

I heard that he was going to create a blog post or app based on this. What became of that, @mmodestino?

HiroshiSatoh
Champion

I use predictions when I create alerts based on statistical analysis. I find it easier to adjust the prediction parameters to the current situation than to design the detection logic by hand.

index=(your index) ("404" OR "502" OR ...)
| timechart span=90s count
| predict count as predict algorithm=LL upper95=upper lower95=lower
| where count > 'upper(predict)'

Adjustment points: span=90s, upper95, the time range, and (optionally) the algorithm.

gingersoftware
New Member

Thanks,

Could you help me modify this search to fit your approach?

tag=NginxLogs host=www1 OR host=www2
| stats count by status
| eventstats sum(count) as total
| eval perc=round((count/total)*100,2)
| where status="404" AND perc>5

Thanks

Noah_Woodcock
Path Finder

Predictions are the way to go.

HiroshiSatoh
Champion

For example, something like this:

tag=NginxLogs host=www1 OR host=www2
| timechart span=1h count as total, count(eval(status="404")) as count
| eval perc=round((count/total)*100,2)
| fields - count, total
| predict perc as predict algorithm=LL upper95=upper lower95=lower
| where perc > 'upper(predict)'

Since this is only a sample, adjust the parameters for your actual environment and try it.
If you remove the where clause, you can check the result on a chart.
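
To run this as an ongoing alert rather than an ad-hoc search, one option (again only a sketch; the relative_time restriction is an assumption, not part of the answer above) is to match only the most recent bucket and trigger when any results are returned:

tag=NginxLogs host=www1 OR host=www2
| timechart span=1h count as total, count(eval(status="404")) as count
| eval perc=round((count/total)*100,2)
| fields - count, total
| predict perc as predict algorithm=LL upper95=upper lower95=lower
| where perc > 'upper(predict)' AND _time >= relative_time(now(), "-1h@h")

Schedule the search hourly over, say, the last 24 hours so predict has enough history, and set the alert to trigger when the number of results is greater than zero.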
