Getting Data In

List hosts with highest value

echalex
Builder

Hi,

We're debugging an issue where disk latency shoots up at a specific time. I would like to create a search which shows the host with the highest latency at any specific minute.

So the base search is:

index=os sourcetype=iostat | multikv fields avgWaitMillis

...but then I'm not sure how to continue... I would like to find every host where avgWaitMillis is the highest for every minute.

0 Karma

MHibbin
Influencer

I think you may want to pipe to the timechart command, which will allow you to gather stats over time. You may be able to do something like:

..| timechart span=1m max(avgWaitMillis) as maxWait

I haven't used a split-by clause (I don't think you'll need one), but if you do, just add something like "by someField" (where someField is the field you want to split by).

Please see documentation:

http://docs.splunk.com/Documentation/Splunk/4.3.4/SearchReference/Timechart
http://docs.splunk.com/Documentation/Splunk/4.3.4/SearchReference/CommonStatsFunctions

0 Karma

echalex
Builder

To elaborate a bit on that sample table:

At time n, the avgWaitMillis of host001 equals max(avgWaitMillis) of all hosts (at that time).

Likewise, at time l, the avgWaitMillis of host219 == max(avgWaitMillis) of all hosts at that time.

0 Karma

echalex
Builder

Thanks, but that is still not what I'm after. useother only affects the grouping of the hosts in the chart.

timechart is really not the answer here, since I'm not concerned about the values themselves, but which hosts had the max value at a particular time.

Since I'm primarily interested in the hostnames, a chart is probably not the best visualization, but rather a table, with values about like this:

time  , host_with_highest_latency
time n, host001.domain.com
time m, host321.domain.com
time l, host219.domain.com
0 Karma

MHibbin
Influencer

Have you tried adding useother=f (mentioned in the docs), like so:

..| timechart span=1m max(avgWaitMillis) as maxWait by host useother=f

I can't remember exactly which boolean forms useother accepts, but you can also try useother=false, or the numeric equivalent (e.g. "1" or "0").

0 Karma

echalex
Builder

Thank you for your effort to help, nevertheless!

0 Karma

echalex
Builder

I'm afraid this doesn't do what I want at all.

That will just show the max values of avgWaitMillis, without even mentioning the host.

I want to know which host had the highest latency, not what the highest latency was.

Doing the same by host doesn't help me either: out of the hundred or so hosts, the majority will be lumped into OTHER. So knowing that one host of 90 in OTHER had the highest latency at 21:15 and 23:30 reveals nothing.

0 Karma
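
One possible way to get the table described above (one row per minute, naming the host with the highest avgWaitMillis) is to bucket events into one-minute bins, take each host's maximum within each bin, sort so the highest value comes first within each minute, and keep only the first row per minute. This is an untested sketch, not a confirmed answer from the thread. It assumes the base search from the question, that multikv extracts avgWaitMillis as a numeric field, and it reuses the maxWait name from the earlier reply:

```
index=os sourcetype=iostat
| multikv fields avgWaitMillis
| bucket _time span=1m
| stats max(avgWaitMillis) as maxWait by _time, host
| sort 0 _time, -maxWait
| dedup _time
| table _time, host, maxWait
```

The "sort 0" removes the default result limit so that no hosts are dropped before dedup runs. Because dedup keeps only the first row for each _time value, a tie between two hosts in the same minute would show just one of them. If ties matter, replacing the sort and dedup steps with "eventstats max(maxWait) as minuteMax by _time | where maxWait=minuteMax" should keep every host that reached the maximum (minuteMax is an illustrative field name).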