Solved: How to monitor the lasting of an event with a percentage condition with a CPU charge > 80%?

jip31 · ‎05-21-2019

Hello,

I use the search below to monitor processes with a CPU load > 80%.
However, what I actually need is to find events where the CPU load stays above 80% for at least one minute.
Is there a function for doing something like this?
It is all the more complex because I have the start time of each event (_time) but no end time.
Thanks for your help.

index="tutu" sourcetype="perfmonmk:process" 
| where process_cpu_used_percent>80 
| bucket _time span=1m 
| stats avg(process_cpu_used_percent) as process_cpu_used_percent by host, _time 
| eval process_cpu_used_percent =round(process_cpu_used_percent, 1)." %" 
| table _time host process_cpu_used_percent 
| sort -_time limit=10

koshyk · ‎05-21-2019

You should apply the where condition later, after the stats:

index="tutu" sourcetype="perfmonmk:process" 
 | bucket _time span=1m 
 | stats avg(process_cpu_used_percent) as process_cpu_used_percent by host, _time 
 | where process_cpu_used_percent>80 
  ...

jip31 · ‎05-21-2019

No.
When I do this, the search takes a very long time and returns no results.
Moreover, I would like the results in a table with 3 fields: _time, host, and lasting.
lasting would be how long the process CPU stayed above 80%.

koshyk · ‎05-21-2019

This search should be faster, as you are doing the stats before the where clause. The reason you are not getting results is probably that no one-minute average of CPU is actually above 80.

Try a lower CPU threshold (say >20) and see if it returns results.
Also, "lasting" is the same concept as averaging a value over a timespan. In your case, 1 minute is the timespan of the buckets, and the value is averaged over each bucket accordingly.

Please see a quick simulation using the internal logs:

index="_introspection" host=* 
| bucket _time span=1m
| stats avg(data.mem_used) as mem_used by host,_time
| where mem_used > 300

jip31 · ‎05-21-2019

Yes, it works with a lower CPU value.
I think you have forgotten something in your code? I can't see anything about lasting.

jip31 · ‎05-21-2019

If I take the _time field of an event where CPU > 80% and compare it with the _time field of the next event, and that next event is also one where CPU > 80%, then I can calculate how long the process CPU stayed above 80%, no?
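
The idea above (comparing each event with the one before it and measuring how long a run of high-CPU events lasts) can be sketched in SPL with streamstats. This is only a sketch under assumptions: the index, sourcetype, and process_cpu_used_percent come from the thread, while high, prev_high, new_run, run_id, and end_time are hypothetical field names introduced for illustration. It also assumes one series of samples per host; if the sourcetype reports several processes per host, the process-identifying field (not shown in the thread) would need to be added to each by clause.

```spl
index="tutu" sourcetype="perfmonmk:process"
| eval high=if(process_cpu_used_percent>80, 1, 0)
| sort 0 host _time
| streamstats current=f last(high) as prev_high by host
| fillnull value=0 prev_high
| eval new_run=if(high!=prev_high, 1, 0)
| streamstats sum(new_run) as run_id by host
| where high=1
| stats min(_time) as _time max(_time) as end_time by host, run_id
| eval lasting=end_time-_time
| where lasting>=60
| table _time host lasting
```

Here new_run flags every event whose above/below-80 state differs from the previous event on the same host, so the running sum run_id stays constant within one uninterrupted run. lasting is the number of seconds between the first and last sample of a run, so a run made of a single sample has a lasting of 0. Sorting every event with sort 0 can be slow on large time ranges.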

koshyk · ‎05-21-2019

I'm slightly confused about why you need lasting. When you use span=1m, it computes the average over each 1-minute interval. So the SPL only produces output if the 80% CPU "lasts" for that whole minute.
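
One caveat: a one-minute average above 80 does not strictly guarantee that every sample in that minute was above 80 (samples of 100 and 70 average to 85). A minimal sketch of a stricter variant, assuming the same index, sourcetype, and field as in the thread, filters on the per-minute minimum instead of the average. min_cpu and avg_cpu are hypothetical field names.

```spl
index="tutu" sourcetype="perfmonmk:process"
| bucket _time span=1m
| stats min(process_cpu_used_percent) as min_cpu avg(process_cpu_used_percent) as avg_cpu by host, _time
| where min_cpu > 80
```

With this variant a bucket is returned only if every sample in it exceeded 80. If the sourcetype reports several processes per host, the minimum would be taken across all of them, so the process-identifying field (not shown in the thread) would likely need to be added to the by clause.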

jip31 · ‎05-21-2019

I trust you.
What I need is to monitor an 80% CPU load lasting for a whole minute,
so I consider that the search is good 😉
Thanks

jip31 · ‎05-21-2019

Somebody told me about the transaction command.
