search a bucket without aggregating results

josephinemho · ‎02-13-2018

I want to search for all results where Percent_CPU_Load is greater than 95. However, when I search on CPUbucket="Greater than 95%", I don't get what I want: the sparkline is not correct. I think the search might be aggregating those results and then returning them.

Here's an example of what the sparkline SHOULD look like:

But here's what I'm getting:

Note: I need to keep the search on CPUbucket in my query because ultimately I want it to return all servers that fall into that bucket category. I am only specifying one server in this example.

Code:

 index=os 
    | lookup sa_managed_servers.csv host 
    | search server_group=SA 
    | eval Percent_CPU_Load = 100 - pctIdle 
    | search sourcetype=cpu host="example.com"
    | eval CPUbucket=case(Percent_CPU_Load > 95, "Greater than 95%", Percent_CPU_Load <=95 AND Percent_CPU_Load > 50, "51%-95%", Percent_CPU_Load <=50 AND Percent_CPU_Load >=20, "20%-50%", Percent_CPU_Load <20, "Less than 20%") 
    | search CPUbucket="Greater than 95%" CPU=all
    | stats sparkline(avg(Percent_CPU_Load)) as Activity avg(Percent_CPU_Load) as Average, max(Percent_CPU_Load) as Peak, min(Percent_CPU_Load) as Low, avg(pctIowait) as WaitTime by host 
    | eval Average=round(Average, 0)."%" 
    | eval Peak=round(Peak, 0)."%" 
    | eval Low=round(Low, 0)."%" 
    | eval WaitTime=round(WaitTime, 0)."%" 
    | sort -Average 
    | rename host AS Server

esix_splunk · ‎02-13-2018

One thought on this: your first result, which you say is correct, covers a span of 6 hours with 434 events, whereas the second graph covers a similar timespan but with only 5 events.

As an educated guess, I would say the second sparkline is accurate given that only 5 events are being loaded and aggregated. Look at those raw events; most likely they are very similar in value, i.e. close to 100%, whereas your first sample is more varied because it has 434 results.

josephinemho · ‎02-15-2018

Thanks!

So the reason the first image returns 5 results is that there are actually 5 other rows of data (for 5 different hosts/servers) which I didn't capture in the screenshot. That number doesn't reflect the data that goes into creating the sparkline; it is only the result for all the hosts/servers from this part of my query:
stats sparkline(avg(Percent_CPU_Load)) as Activity avg(Percent_CPU_Load) as Average, max(Percent_CPU_Load) as Peak, min(Percent_CPU_Load) as Low, avg(pctIowait) as WaitTime by host

That is why I believe the problem is in this part of the query: | search CPUbucket="Greater than 95%" CPU=all

However, I need that part of the query because I want to search for ALL hosts/servers that have CPU greater than 95%. At the same time, I need the sparkline to show the individual trend of a specific host/server, and somehow it's not portraying that accurately.
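
One possible way to get both, sketched here rather than tested against this data: instead of filtering individual events by CPUbucket (which leaves only the events above 95% to feed the sparkline), flag each host by its peak load with eventstats and filter on that, so every event for a qualifying host still reaches stats. The field name PeakLoad is a hypothetical helper introduced for illustration; everything else is taken from the query above. Note that this selects hosts that exceeded 95% at least once in the time range, which may or may not be the intended meaning of the bucket.

```
index=os 
    | lookup sa_managed_servers.csv host 
    | search server_group=SA 
    | eval Percent_CPU_Load = 100 - pctIdle 
    | search sourcetype=cpu CPU=all
    | eventstats max(Percent_CPU_Load) as PeakLoad by host 
    | where PeakLoad > 95 
    | stats sparkline(avg(Percent_CPU_Load)) as Activity, avg(Percent_CPU_Load) as Average, max(Percent_CPU_Load) as Peak, min(Percent_CPU_Load) as Low, avg(pctIowait) as WaitTime by host 
```

The remaining eval, sort, and rename steps from the original query can follow unchanged. Because eventstats adds PeakLoad to every event without collapsing them, the sparkline is built from the host's full set of events rather than only the ones above the threshold.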
