Splunk Search

How do I subtotal processor utilization?

NickJLange
Explorer

Disclaimer: I'm not saying this particular example is useful analysis - I'm just not sure how to think about solving a problem like this in Splunk properly.

I have thousands of Zabbix events in which socket-wide data points are normalized into individual events, e.g. system.cpu.util[socket,core,type], across heterogeneous hardware configurations (the number of sockets or cores differs from host to host).
I want to understand the distribution of load across a socket by machine model type to ensure it matches up with temperature readings, and then flag outliers (either on temperature or idle cores).

I've seen tricks for extracting the itemKey into named variables, which I think work because the timestamp is exactly the same... but how do you run stats on variables that might not exist (e.g. socket 4 or core 20)?

Does any of this make sense?

0 Karma

jkat54
SplunkTrust
  ... host=hostname | eval socket=if(isnull(socket),"null",socket) | timechart avg(value) max(value) by socket

AND

  ... host=hostname | eval core=if(isnull(core),"null",core) | timechart avg(value) max(value) by core

These should be fine on a host-by-host basis. Both would work well on a dashboard with a drop-down list to select the hostname.

  ... | eval socket=if(isnull(socket),"null",socket) | eval core=if(isnull(core),"null",core) | stats avg(value) max(value) by host core socket

The above should be fine for an analyst to select specific time ranges with the time picker and see whether activity spikes occurred.

0 Karma

NickJLange
Explorer

Thank you for the helpful suggestion. I'm looking for more aggregate trends across a class of hosts with different underlying hardware models - which sort of precludes individual host analysis with eyeballs...
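
One possible way to get from per-host numbers to aggregate trends per hardware model is sketched below. It is only a sketch under several assumptions: the itemKey layout is taken from the sample in this thread (type, then socket, then core), `host_models` is a hypothetical lookup that maps host to a `model` field (nothing in the thread says such a lookup exists), and the z-score cutoff of 2 is an arbitrary illustration. The field names type, socket, core, core_avg, socket_avg and so on are made up for the example.

```spl
... 
| rex field=itemKey "system\.cpu\.util\[(?<type>[^,]+),(?<socket>[^,]+),(?<core>[^,\]]+)"
| stats avg(value) AS core_avg BY host socket core
| stats avg(core_avg) AS socket_avg stdev(core_avg) AS core_spread dc(core) AS cores BY host socket
| lookup host_models host OUTPUT model
| eventstats avg(socket_avg) AS model_avg stdev(socket_avg) AS model_stdev BY model
| eval zscore=if(model_stdev>0, (socket_avg-model_avg)/model_stdev, 0)
| where abs(zscore) > 2
```

The first `stats` averages each core over the selected time range, the second subtotals cores into sockets, and `eventstats` attaches the per-model mean and standard deviation to every host/socket row so each row can be compared against its own hardware class. Because socket and core are BY fields rather than columns, hosts with fewer sockets or cores simply contribute fewer rows.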

0 Karma

martin_mueller
SplunkTrust

Do provide some sample data.

0 Karma

NickJLange
Explorer

It's not very exciting (one row per pseudo-event):

_time,host,itemKey="system.cpu.util[user_utilization,#socket,#core,]",value=int
....
_time,hostN,itemKey="system.cpu.util[user_utilization,#socket,#core,]",value=int

0 Karma

NickJLange
Explorer

Currently, the query uses rex to extract #socket and #core into new variables...
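
A minimal sketch of what such an extraction and a per-socket subtotal could look like, assuming the itemKey layout from the sample above (type first, then socket, then core) and assuming the illustrative field names type, socket and core; the actual query used here is not shown in the thread, so this is not a reproduction of it:

```spl
... 
| rex field=itemKey "system\.cpu\.util\[(?<type>[^,]+),(?<socket>[^,]+),(?<core>[^,\]]+)"
| stats avg(value) AS avg_util max(value) AS max_util dc(core) AS cores BY host socket
```

Grouping with BY, rather than turning each socket or core into its own column, sidesteps the "might not exist" problem: `stats` only emits rows for host/socket combinations that actually appear in the events, so a two-socket host never produces a row for socket 4.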

0 Karma

somesoni2
Revered Legend

What will the value field contain?

0 Karma

NickJLange
Explorer

An integer value from 1-100 representing utilization... the equivalent of /proc/stat.

0 Karma

somesoni2
Revered Legend

Is the list of possible sockets/cores fixed?

0 Karma

NickJLange
Explorer

It is hard to predict the socket/core count... but it is a finite set.

0 Karma