Knowledge Management

Data Field Entries Across Different Time Spans per Entry

mmedal
Explorer

I have a bunch of SAN usage data that I am inputting into Splunk that looks as follows, with each line representing an entry in Splunk:

Group: diskdg1 Disks: 21 Disk in use: data04 Capacity: 1%  
Group: diskdg2 Disks: 21 Disk in use: data05 Capacity: 1%  
Group: diskdg3 Disks: 5 Disk in use: data01 Capacity: 33%  
Group: diskdg4 Disks: 34 Disk in use: data08 Capacity: 1%  
Group: diskdg5 Disks: 30 Disk in use: data07 Capacity: 1%  
Group: diskdg6 Disks: 38 Disk in use: data09 Capacity: 25%

What I would like to do is display a table with these fields, plus a new field displaying a "change in capacity" since 7 days ago. In other words, I would like to evaluate the difference between the capacity field now and the capacity field for that entry 7 days ago.

Can anyone assist me with a search?

Thanks so much, Matt

1 Solution

dwaddle
SplunkTrust

At first glance, the difference should be pretty easy - you can use the delta search command. However, delta lacks a by clause, so you could only do one Group at a time - a bit of a limitation. I think you can use streamstats instead to roughly create a per-Group delta.

Assuming that your data above has field extractions for Group and Capacity, a search like this should get you close:

sourcetype=my_san_data 
| streamstats window=7 global=f last(Capacity) as high first(Capacity) as low by Group 
| eval delta=high-low
| table _time, Group, Capacity, delta
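
If Group and Capacity are not already extracted, you could do the extraction inline. Here is a minimal sketch, assuming the raw events look exactly like the sample above; the regex and the extra Disks and DiskInUse fields are my own additions, and capturing Capacity without the trailing % keeps it numeric so the subtraction works:

sourcetype=my_san_data 
| rex "Group:\s+(?<Group>\S+)\s+Disks:\s+(?<Disks>\d+)\s+Disk in use:\s+(?<DiskInUse>\S+)\s+Capacity:\s+(?<Capacity>\d+)%"
| streamstats window=7 global=f last(Capacity) as high first(Capacity) as low by Group 
| eval delta=high-low
| table _time, Group, Disks, DiskInUse, Capacity, delta

The rex anchors on the exact labels in your sample, so it would need adjusting if the real events differ.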

You may need to swap high and low around to get the sign of the difference right. There is an assumption here that you are collecting this data once per day. The way this "should" work is that streamstats keeps a sliding window of 7 events per Group and uses the first and last values of Capacity within each window to calculate a delta.

Obviously, a sliding window of 7 events is not necessarily strictly 7 days - it depends on collecting exactly once per day, every day, without missing one. If you are collecting once per hour, adjust window to 168 instead.

There are more complicated ways of dealing with this, such as maintaining state in lookups or using time-oriented subsearches, if you need higher precision than a sliding window. But unless your accuracy requirements are very high, this should be "close enough".
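
For what it's worth, here is one rough sketch of the time-oriented idea (not from the original answer): pull the last 8 days, tag each event as "current" or "week_ago", keep the latest Capacity per Group for each period, and subtract. It assumes Capacity is extracted as a plain number (as in the rex sketch above), and the earliest/latest boundaries are assumptions you would tune:

sourcetype=my_san_data earliest=-8d@d latest=now
| eval period=case(_time >= relative_time(now(), "-1d"), "current", _time < relative_time(now(), "-7d"), "week_ago")
| where isnotnull(period)
| chart latest(Capacity) over Group by period
| eval change=current - week_ago

This trades the simplicity of streamstats for an explicit "now versus 7 days ago" comparison, at the cost of scanning 8 days of data.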


mmedal
Explorer

Thanks for the feedback. Great answer to my question - it certainly is "close enough", haha.
