Alerting

I want to know if my alert condition is possible

magilbert1
Explorer

Hi

I'm trying to create an alert that will be triggered if I have errors every 5 minutes for 30 minutes.

I'm not sure if that's possible.

Thanks for your help.


woodcock
Esteemed Legend

Like this:

index=youShouldAlwaysSpecifyAnIndex AND sourcetype=AndSourcetypeToo earliest=-30m@m latest=now
| bin _time span=5m
| stats count BY _time application
| stats count BY application
| where count >= 6


magilbert1
Explorer

It seems to work.

But why are my count results all 7?
The count should not exceed 6, should it? (30 min / 5 min = 6)

woodcock
Esteemed Legend

Sometimes it will be 6 and sometimes 7 because the 5-minute periods might be like this:

now=5:58, -30m@m=5:28
bin1=5:25-5:30
bin2=5:30-5:35
bin3=5:35-5:40
bin4=5:40-5:45
bin5=5:45-5:50
bin6=5:50-5:55
bin7=5:55-6:00

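So you can add | head 6 or | tail 6 after the first stats to trim the partial bin from one side or the other. Note that head and tail trim result rows, not bins, so this only works as intended when a single application is present; with several applications, filter on _time instead.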
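
To see where the seventh bin comes from, the bucket arithmetic can be reproduced outside Splunk. The following is a minimal Python sketch, not Splunk code. The function name `five_minute_bins` and the example date are made up for illustration, and the sketch assumes that buckets are aligned to multiples of 5 minutes past the hour and that every bucket contains at least one error.

```python
from datetime import datetime, timedelta

def five_minute_bins(now, lookback_minutes=30, span_minutes=5):
    """Return the start time of every aligned bucket touched by the search window."""
    # earliest=-30m@m: go back 30 minutes, then snap down to the whole minute.
    earliest = (now - timedelta(minutes=lookback_minutes)).replace(second=0, microsecond=0)
    # Bucket starts are aligned to multiples of the span, not to `earliest`.
    first = earliest.replace(minute=earliest.minute - earliest.minute % span_minutes)
    bins = []
    start = first
    while start < now:  # latest=now
        bins.append(start)
        start += timedelta(minutes=span_minutes)
    return bins

# The example from the post: now=5:58 touches the 5:25 bucket through the 5:55 bucket.
print(len(five_minute_bins(datetime(2024, 1, 1, 5, 58))))  # 7
# A search that starts exactly on a 5-minute boundary touches only 6 buckets.
print(len(five_minute_bins(datetime(2024, 1, 1, 6, 0))))   # 6
```

Under these assumptions, the count is 6 only when the search happens to start on a 5-minute boundary; otherwise both ends of the window fall inside a partial bucket.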

magilbert1
Explorer

OK, thank you very much. That helps me a lot.

magilbert1
Explorer

I now have these two different searches for my problem,
but I can't figure out how to count the number of 5-minute windows that have 1 or more errors.

index="MyIndex" earliest=-30m@m latest=@m | bin _time span=5m | stats count by _time | where count >0

index="MyIndex" earliest=-30m@m latest=@m | streamstats time_window=5m count | where count > 0

acharlieh
Influencer

Assuming every event in MyIndex is an error (if not, you need to adjust the search before the first pipe):

index="MyIndex" earliest=-30m@m latest=@m | bin _time span=5m | stats count by _time | where count >0

gives you one result for each 5 minute window that has at least 1 error so:

index="MyIndex" earliest=-30m@m latest=@m | bin _time span=5m | stats count by _time | where count >0 | stats count

would then give you the number of 5 minute windows with at least 1 error.

magilbert1
Explorer

I also need to make sure that the errors come from the same application.
For example, two applications could each log errors in different windows, so that together they cover all six 5-minute windows. That case should not trigger an alert.

acharlieh
Influencer

That's exactly why I suggested carrying an additional dimension/field through both stats commands in my answer. If you have a field in your events that identifies the application, you need to split by that field too.
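
The effect of splitting by application can be sketched outside Splunk. This is a hedged Python illustration of the logic only, not a Splunk API: the names `bin_start` and `apps_with_errors_in_every_window` and the sample events are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

def bin_start(ts, span_minutes=5):
    """Snap a timestamp down to the start of its 5-minute bucket."""
    return ts.replace(minute=ts.minute - ts.minute % span_minutes, second=0, microsecond=0)

def apps_with_errors_in_every_window(events, required_windows=6):
    """events: iterable of (timestamp, application) pairs, one per error."""
    windows = defaultdict(set)
    for ts, app in events:
        # Equivalent in spirit to: stats count BY _time application
        windows[app].add(bin_start(ts))
    # Equivalent in spirit to: stats count BY application | where count >= 6
    return sorted(app for app, bins in windows.items() if len(bins) >= required_windows)

def at(minute):
    return datetime(2024, 1, 1, 5, minute)

events = (
    [(at(m), "a") for m in (31, 36, 41)]                  # first three windows only
    + [(at(m), "b") for m in (46, 51, 56)]                # last three windows only
    + [(at(m), "c") for m in (30, 35, 40, 45, 50, 55)]    # all six windows
)
print(apps_with_errors_in_every_window(events))  # ['c']
```

Applications a and b together cover all six windows, but neither does so alone, so only c is reported.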

woodcock
Esteemed Legend

I would use this, to avoid even buckets:

index=myIndex earliest=-30m@m latest=@m | streamstats time_window=5m count BY application | where count > 0

magilbert1
Explorer

If I understand the query correctly:

This will output results only if I have more than 0 events in a 5-minute window.
But how do I know that I have had errors non-stop for 30 minutes?
I need something that counts how many 5-minute windows have 1 or more errors.
I need to know that all 6 windows in the 30 minutes contain errors.

woodcock
Esteemed Legend

Good point. This will not work for that case; let me put in a new answer.

acharlieh
Influencer

Yes, it's possible. Your base search should look for errors, and your search time window needs to be 30 minutes wide (earliest=-30m@m latest=@m or something similar).

Then you'd use bin to bucket your _time values into 5-minute spans.

You can get the error count for every 5-minute bucket with: stats count by _time <other dimensions like host?>

Then keep only the buckets that have 5 or more errors with: where count >= 5

Repeating a similar process without _time, you can now get the number of time spans with 5 or more errors with: stats count by <other dimensions like host?>

Then use where count = 6 to narrow down to the values of those other dimensions that had 5 or more errors in every 5-minute bucket (because 6 * 5 min = 30 min; but check me on this, as off-by-one errors are one of the two hard problems in computer science, along with cache invalidation and naming things).

That's essentially the outline of the search to do this.
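
The two aggregation stages in this outline can be checked against sample data outside Splunk. The following Python sketch is an illustration under stated assumptions, not Splunk code: `hosts_failing_every_window`, `bin_start`, the host names, and the generated events are all hypothetical, and host stands in for whatever dimension is carried through both stats commands.

```python
from collections import Counter
from datetime import datetime, timedelta

def bin_start(ts, span_minutes=5):
    """Snap a timestamp down to the start of its 5-minute bucket."""
    return ts.replace(minute=ts.minute - ts.minute % span_minutes, second=0, microsecond=0)

def hosts_failing_every_window(events, min_errors=5, required_windows=6):
    """events: iterable of (timestamp, host) pairs, one per error."""
    # Stage 1: error count per (bucket, host).
    per_bucket = Counter((bin_start(ts), host) for ts, host in events)
    # Keep buckets with at least min_errors, then count such buckets per host.
    busy = Counter(host for (_, host), n in per_bucket.items() if n >= min_errors)
    # Stage 2: keep hosts where every one of the six buckets qualified.
    return sorted(host for host, n in busy.items() if n == required_windows)

start = datetime(2024, 1, 1, 5, 30)
events = []
for w in range(6):
    base = start + timedelta(minutes=5 * w)
    # web1 logs 5 errors in every bucket.
    events += [(base + timedelta(seconds=10 * i), "web1") for i in range(5)]
    # web2 logs 5 errors per bucket, except only 4 in the last one.
    events += [(base + timedelta(seconds=10 * i), "web2") for i in range(4 if w == 5 else 5)]

print(hosts_failing_every_window(events))  # ['web1']
```

The sample data starts on a 5-minute boundary, so exactly six buckets exist. When the search window is not aligned that way, a seventh partial bucket can appear and an equality test on 6 may need adjusting.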

magilbert1
Explorer

Okay, thanks. I'll try this.

magilbert1
Explorer

Will it need a subsearch (to have two "by" clauses)?
Can you give me an example of the structure?

For now I have: index=myIndex earliest=-30m@m latest=@m | bin _time span=5m | stats count by _time | where count > 0
