Dashboards & Visualizations

timechart: only show lines that have spikes

mauricio2354
Explorer

I have a query:

index=source STATUS=499 SERVICE!="Health#*"| timechart useother=f span=1m count by SERVICE

That will have large spikes sometimes. I want to only show the services that have those large spikes. How do I do that?
If there's no way to define a spike as something like greater than twice the average count, then greater than 20 would be fine.

Thanks so much!

elliotproebstel
Champion

How about this? It finds all services that have at least one time span where the count is greater than that service's average plus one standard deviation, OR less than its average minus one standard deviation:

index=source STATUS=499 SERVICE!="Health#*"
| bin _time span=1m
| stats count BY SERVICE _time
| eventstats avg(count) AS avg_count stdev(count) AS std_dev_count max(count) AS max_count min(count) AS min_count BY SERVICE
| where max_count>avg_count+std_dev_count OR min_count<avg_count-std_dev_count 
| timechart useother=f values(count) AS count by SERVICE
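
If you'd rather stick with the "greater than twice the average" idea from your question, the same eventstats pattern should adapt; here's an untested sketch, with the multiplier of 2 taken straight from your post:

index=source STATUS=499 SERVICE!="Health#*"
| bin _time span=1m
| stats count BY SERVICE _time
| eventstats avg(count) AS avg_count max(count) AS max_count BY SERVICE
| where max_count > 2 * avg_count
| timechart useother=f values(count) AS count by SERVICE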

mauricio2354
Explorer

I like this idea! I don't think the eventstats approach works too well for my query, though, so going with "had a spike greater than 20" would look like this, right?

index=source STATUS=499 SERVICE!="Health#*"
| bin _time span=1m
| stats count BY SERVICE _time
| where count > 20
| timechart useother=f values(count) AS count by SERVICE

The only problem here is that when I visualize it, it only shows the points where those services hit above 20, and not the whole line. Would you know how to include the whole line?

elliotproebstel
Champion

BTW, I apologize, but I had a typo in the eventstats line of my original post; it was a copy/paste error moving from my test environment to yours. I've fixed it now, so you might give it another try.

mauricio2354
Explorer

Hmm, the original post is still giving me some trouble with showing the full line, but the last query you showed me works perfectly! Thanks so much for all the help!

elliotproebstel
Champion

Glad we got it sorted!

elliotproebstel
Champion

That's the point of using eventstats. It allows you to retain all the original data while finding the min and max values. If the originally proposed approach doesn't work, here's a less calculation-intensive example of eventstats:

index=source STATUS=499 SERVICE!="Health#*"
| bin _time span=1m
| stats count BY SERVICE _time
| eventstats max(count) AS max_count BY SERVICE 
| where max_count>20
| timechart useother=f values(count) AS count by SERVICE
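
If it helps to see why the whole line survives: eventstats adds max_count to every row without dropping any, and the where then keeps every row belonging to a service whose peak is above 20. Here's a run-anywhere sketch you can paste into a search bar (the service names and counts are made up):

| makeresults count=6
| streamstats count AS n
| eval SERVICE=if(n<=3, "svcA", "svcB")
| eval count=case(n=1,5, n=2,30, n=3,8, n=4,4, n=5,6, n=6,7)
| eventstats max(count) AS max_count BY SERVICE
| where max_count>20

All three svcA rows come back because that service's peak of 30 clears the threshold, while svcB disappears entirely.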

elliotproebstel
Champion

If that still doesn't work, you could try this:

index=source STATUS=499 
 [ search index=source STATUS=499 SERVICE!="Health#*"
 | bin _time span=1m
 | stats count BY SERVICE _time
 | where count > 20
 | fields SERVICE ]
| timechart useother=f span=1m count by SERVICE

This will use a subsearch to isolate the services that had spikes and then search for all events from those services and create a timechart.
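
If the subsearch ends up returning a lot of duplicate SERVICE rows (one per spiking minute), adding a dedup before the fields keeps it to one row per service; an untested tweak of the same search:

index=source STATUS=499
 [ search index=source STATUS=499 SERVICE!="Health#*"
 | bin _time span=1m
 | stats count BY SERVICE _time
 | where count > 20
 | dedup SERVICE
 | fields SERVICE ]
| timechart useother=f span=1m count by SERVICE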

JDukeSplunk
Builder

Maybe?

  index=source STATUS=499 SERVICE!="Health#*"
|where SERVICE =>20
| timechart useother=f span=1m count by SERVICE

mauricio2354
Explorer

Unfortunately, that doesn't work. I think you were looking for ">=" instead of "=>", and each SERVICE is a string, not an integer. It still wouldn't work filtering on "count" either, since that field doesn't exist until timechart creates it.
