Hey guys - I have a simple query that returns hosts whose CPU or memory usage peaked above 75% over a given time range. This works great; however, I would like to change it so that a host only appears if it stayed above that 75% limit for several minutes. How would I do that?
Thanks
source="perfmon:CPU" OR source="perfmon:memory" counter="% Processor Time" OR counter="% Committed Bytes In Use" Value>75 | chart max(Value) by host counter | rename "% Committed Bytes In Use" as "Memory Usage", "% Processor Time" as "CPU Usage"
Streamstats (https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Streamstats) is your friend here.
source="perfmon:CPU" OR source="perfmon:memory" counter="% Processor Time" OR counter="% Committed Bytes In Use" | streamstats min(_time) AS startTime, max(_time) AS endTime reset_before="(Value<75)" by host counter | eval timeSpan = endTime-startTime | search timeSpan>300
What we're doing here is letting streamstats keep a running earliest and latest _time as events stream through, and using reset_before to clear those values whenever an event comes in under 75%. That way each run of consecutive readings over 75% (CPU or memory) gets measured on its own. The eval then computes timeSpan, i.e. how long the current run has lasted in seconds. Finally, the search applies the duration threshold, in this case 5 minutes (300 seconds).
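One thing to watch with any reset-based approach is that the result depends on the order the events stream through in, and on CPU and memory readings not being mixed into the same run. As a rough alternative sketch (not tested against your data), you can number each run of consecutive readings per host and counter and then measure each run separately. This assumes the same Value, host and counter fields as your original search; breach, prevBreach, newRun and runId are field names I made up for illustration.

```
source="perfmon:CPU" OR source="perfmon:memory" counter="% Processor Time" OR counter="% Committed Bytes In Use"
| sort 0 host counter _time
| eval breach=if(Value>75, 1, 0)
| streamstats current=f window=1 last(breach) AS prevBreach by host counter
| eval newRun=if(breach!=coalesce(prevBreach, -1), 1, 0)
| streamstats sum(newRun) AS runId by host counter
| stats min(_time) AS startTime, max(_time) AS endTime, max(breach) AS breach by host counter runId
| eval timeSpan=endTime-startTime
| where breach=1 AND timeSpan>=300
```

The sort puts each host and counter in time order, the first streamstats pulls in the previous reading's breach flag, and newRun marks every flip between "over 75%" and "not over 75%". Summing newRun gives each run its own runId, so the stats can work out a start, end and duration per run. The final where keeps only runs that were over 75% and lasted at least 300 seconds; change that number to suit.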
Maybe try bin span=10m _time.
You can set the span to your preference and then use the timechart command. Something like this:
index=perfmon source="perfmon:CPU" OR source="perfmon:memory" counter="% Processor Time" OR counter="% Committed Bytes In Use" Value>75 | bin span=5m _time | timechart span=5m max(Value) by host
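If what you want is "stayed above 75% for the whole window" rather than "peaked above 75% somewhere in the window", one variation on the bin idea is to look at the lowest reading in each bucket instead of the highest. This is only a sketch under the same assumptions as above (same index and fields); lowest and samples are names I made up. Note that the Value>75 filter has to come out of the base search, otherwise the minimum would always be over 75.

```
index=perfmon source="perfmon:CPU" OR source="perfmon:memory" counter="% Processor Time" OR counter="% Committed Bytes In Use"
| bin span=5m _time
| stats min(Value) AS lowest, count AS samples by _time host counter
| where lowest>75
```

A row survives only if every reading in that 5-minute bucket was over 75%. The buckets are aligned to clock boundaries, so a breach that straddles two buckets without filling either one will be missed, and a bucket with a single reading will pass on that one reading alone, which is why samples is kept in the output.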
Hope it helps.