Splunk Search

How to plot a line graph against an average over a different time period?

richard_bragg
New Member

We have a set of logs from different hosts that specify a metric. I want to display a line graph over a user-selectable time period that plots the metric as a percentage difference from the 30 day average for each host.

That is: get the 30-day average for each host, then plot ((metric/average)*100)-100. This means 80% of average would plot as -20, 100% as 0, 120% as 20, and so on. Additionally, we would like to highlight where the value falls outside the -50 to 50 range. The next stage would be to alert if the value has been outside -50 to 50 for more than, say, the last 20 minutes.

What we are looking for is to spot where the metric is larger than average for an extended period (a flood condition) or lower (a drought). The range could change.
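In SPL terms, I imagine the calculation itself looks something like this (metric, avg_30d, and out_of_range are made-up field names, not from our actual logs):

| eval perc=((metric / avg_30d) * 100) - 100
| eval out_of_range=if(perc < -50 OR perc > 50, 1, 0)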

I'm new to Splunk so I don't really know exactly where to start.

I can get the graph to work when the averaging window is the same as the plotting window; the difficulty is taking the average over a different time period. Hosts are selected by a pattern, so the set is dynamic (hosts matching pattern CVM_AGG). We may extend this to either a second query for BUS_AGG or merge the two _AGG patterns. Since we are plotting percentage against average, they will scale the same.


richard_bragg
New Member

Adding as a comment to attach examples.

[Screenshot: Pivot editor]

[Screenshot: Run report]


to4kawa
Ultra Champion

sample:

| tstats count where index=_internal source="*metrics.log" by _time span=1h PREFIX("name=")
| rename "name=" as name
| rename COMMENT as "from here, the logic"
| bin _time span=1d
| stats sum(count) as count by _time name
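| rename COMMENT as "streamstats below keeps a rolling 30-day average of the daily counts, per name"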
| streamstats global=f window=30 avg(count) as 30days_avg by name

visualization > Line Chart with trellis by name

Why don't you try the following:

  1. Make a dashboard with a time picker.
  2. Try the logic above.
  3. Make the percentage calculation (see the sketch after this list).
  4. Delete the extra fields.
  5. Choose a visualization.
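For step 3, a minimal sketch of the percentage calculation, continuing from the sample above (perc is an illustrative field name; the single quotes around 30days_avg are needed in eval because the name starts with a digit):

| eval perc=round(((count / '30days_avg') * 100) - 100, 2)
| fields _time name perc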

richard_bragg
New Member

Well, this chart will (if usable) fit into an existing dashboard, but that bit may be irrelevant.

This is my data model:

index="tivoli_omnibus_prod" sourcetype=objsvr_stats_log source=CVM_AGG earliest=-30d@d
| eventstats avg(Status_Inserts) as Inserts by Omnibus
| eval percstat=(100*(Status_Inserts/Inserts))-100

I know the "earliest" statement is fixing the date range.
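I did wonder whether something like the following would work: search the full 30 days so eventstats can compute the baseline, then filter the results down to the display window afterwards. Sketch only, untested; the 4-hour window is hard-coded here, where a dashboard would presumably use time picker tokens instead:

index="tivoli_omnibus_prod" sourcetype=objsvr_stats_log source=CVM_AGG earliest=-30d@d
| eventstats avg(Status_Inserts) as Inserts by Omnibus
| eval percstat=(100*(Status_Inserts/Inserts))-100
| where _time >= relative_time(now(), "-4h@h")
| timechart span=5m avg(percstat) by Omnibus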

Then it feeds into this report:
| pivot Netcool_Insert_Model RootObject avg(percstat) AS "Compare to 30 day average" SPLITROW _time AS _time PERIOD auto SPLITCOL Omnibus SORT 0 _time ROWSUMMARY 0 COLSUMMARY 0 NUMCOLS 100 SHOWOTHER 0

Is this the right way to go about things? Where does your logic fit?


to4kawa
Ultra Champion

I don't know; I can't understand the pivot query.


richard_bragg
New Member

This is what I get when saving the pivot as a report using the interface, so it's generated by Splunk.

If I view it in the Pivot editor, I can see two lines and it all looks fine; I can change the time period and the graph follows that selection.

If I run it as a report, it's all fixed at 30 days regardless of the time picker.


to4kawa
Ultra Champion
| tstats avg(percstat) as count from datamodel=Netcool_Insert_Model by _time Omnibus
| xyseries _time Omnibus count

https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Tstats

https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Pivot


richard_bragg
New Member

I tried this in the Search app and get:
Error in 'DataModelCache': Invalid or unaccelerable root object for datamodel
The search job has failed due to an error. You may be able to view the job in the Job Inspector.


richard_bragg
New Member

Editing the data model to use acceleration gives:

You can only accelerate data models that include at least one event-based dataset or one search-based dataset that does not include streaming commands.


to4kawa
Ultra Champion
| pivot Netcool_Insert_Model RootObject avg(percstat) AS "Compare to 30 day average" earliest=-4h SPLITROW _time AS _time PERIOD auto SPLITCOL Omnibus SORT 0 _time ROWSUMMARY 0 COLSUMMARY 0 NUMCOLS 100 SHOWOTHER 0

richard_bragg
New Member

Thanks, but this leads to:

Error in 'PivotProcessor': Error in 'PivotCell': The dataset 'RootObject' has no field 'earliest=-4h'.

I'm going to see what I can play with; I'm just still getting used to the terms, etc.


to4kawa
Ultra Champion

I don't even know the data, so it's a dead end.


richard_bragg
New Member

Well, the data is pretty simple: one file per Netcool instance, and the lines in the log have a timestamp and a number of numeric values, one of which is a count of inserts.

But isn't "earliest" the part of Splunk that limits how far back the search can go?

This all works nicely in the Pivot editor but not when run as a report, so there must be something in the editor that obeys the time picker that's ignored in the report.

What we are trying to highlight is where the inserts fall outside a range around the 30-day average, i.e. outside 50-150% of the average. If they're too low for an extended period, there's possibly a data feed issue; if too high, we could be seeing a data flood that, if sustained, could bring the system down.
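For the eventual alert, I'm imagining something along these lines on top of the percstat field from above (sketch only; the 20-minute window is a placeholder, and this assumes results arrive in time order for streamstats):

| eval breach=if(percstat < -50 OR percstat > 50, 1, 0)
| streamstats time_window=20m sum(breach) as breaches count as samples by Omnibus
| where samples > 0 AND breaches = samples

That is, fire only when every sample in the trailing 20 minutes is out of range.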
