Splunk Search

Compute row differences

guilhem
Contributor

Hello everyone!

I have a very simple result table that looks like this:

_time s duration
2/10/13 12:20:22.000 PM 21279054471791556300 0
2/10/13 12:20:43.000 PM 21279054471791556300 21
2/10/13 12:21:07.000 PM 21279054471791556300 45
2/10/13 12:21:07.000 PM 21279054471791556300 45
2/10/13 12:21:52.000 PM 21279054471791556300 90
2/10/13 12:22:26.000 PM 21279054471791556300 124
2/10/13 12:22:59.000 PM 21279054471791556300 157
2/10/13 12:23:07.000 PM 21279054471791556300 165
2/11/13 5:09:16.000 AM 21242230731515268458 131
2/11/13 5:09:38.000 AM 21242230731515268458 153
2/11/13 5:09:46.000 AM 21242230731515268458 161
2/11/13 5:09:53.000 AM 21242230731515268458 168
2/11/13 5:09:53.000 AM 21242230731515268458 168
2/11/13 5:10:19.000 AM 21242230731515268458 194
2/11/13 5:10:24.000 AM 21242230731515268458 199
2/11/13 5:10:37.000 AM 21242230731515268458 212
2/11/13 5:10:45.000 AM 21242230731515268458 220
2/11/13 5:10:50.000 AM 21242230731515268458 225

And I would like to compute a new field, let say diff, that will contains the difference between each duration, row by row: like this:

_time s duration diff
2/10/13 12:20:22.000 PM 21279054471791556300 0 / Nothing
2/10/13 12:20:43.000 PM 21279054471791556300 21 21
2/10/13 12:21:07.000 PM 21279054471791556300 45 23
2/10/13 12:21:07.000 PM 21279054471791556300 45 0
2/10/13 12:21:52.000 PM 21279054471791556300 90 45
2/10/13 12:22:26.000 PM 21279054471791556300 124 69
2/10/13 12:22:59.000 PM 21279054471791556300 157 33
2/10/13 12:23:07.000 PM 21279054471791556300 165 8
2/11/13 5:09:16.000 AM 21242230731515268458 131 / Nothing
2/11/13 5:09:38.000 AM 21242230731515268458 153 22
2/11/13 5:09:46.000 AM 21242230731515268458 161 18
2/11/13 5:09:53.000 AM 21242230731515268458 168 7
2/11/13 5:09:53.000 AM 21242230731515268458 168 0
2/11/13 5:10:19.000 AM 21242230731515268458 194 26
2/11/13 5:10:24.000 AM 21242230731515268458 199 5
2/11/13 5:10:37.000 AM 21242230731515268458 212 13
2/11/13 5:10:45.000 AM 21242230731515268458 220 8
2/11/13 5:10:50.000 AM 21242230731515268458 225 5

I have tried to use the delta command, but it doesn't work because events from several s can occur at the same time.

I want to avoid using double makemv / mvexpand and compute the difference between all and filter after, as it will cost a lot of time (n square) and I really only need the duration(n) - duration(n-1) value

Thanks a lot!

Guilhem

1 Solution

guilhem
Contributor

OK, it was quite "simple", thanks for pointing out the streamstat command, I'm not familiar with it. Anyway here the solution:

| streamstats list(duration) as duration2 c(duration) as count by s
| eval diff = mvindex(duration2, count-2)
| eval diff = duration-diff

Note that I have to count the number of values of the duration2, so I can get the before last mvindex


EDIT

The updated (much simplier) version is:

| streamstats current=f last(duration) as last_duration by s
| eval diff = duration - last_duration

Thanks to martin_mueller

View solution in original post

guilhem
Contributor

OK, it was quite "simple", thanks for pointing out the streamstat command, I'm not familiar with it. Anyway here the solution:

| streamstats list(duration) as duration2 c(duration) as count by s
| eval diff = mvindex(duration2, count-2)
| eval diff = duration-diff

Note that I have to count the number of values of the duration2, so I can get the before last mvindex


EDIT

The updated (much simplier) version is:

| streamstats current=f last(duration) as last_duration by s
| eval diff = duration - last_duration

Thanks to martin_mueller

martin_mueller
SplunkTrust
SplunkTrust

list(duration) sounds quite cumbersome when you're basically just looking for last(duration)...

guilhem
Contributor

Be carefull thow, as I have just discovered, the lis() function only returns the first 100 results, so it may be convenient to use something else (penultimate duration value, instead of list).

I will update the answer when I found out how to find the penultimate value of a field using stats.

0 Karma

martin_mueller
SplunkTrust
SplunkTrust

guilhem
Contributor

Perfect, using the current=f to get the last-but-one value is clever!

I update the answer.

0 Karma

martin_mueller
SplunkTrust
SplunkTrust

Consider this to get around the list/mvindex thingy:

... | streamstats current=f last(duration) as last_duration by s | eval diff = duration - last_duration

AshimaE
Explorer

Could you explain what current=f is used for here. Also last(x) takes us to the oldest entry for that s isn't it. so how is it being used here exactly. sorry for the naive doubts. im still a newbie.

0 Karma
Get Updates on the Splunk Community!

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...

New in Observability Cloud - Explicit Bucket Histograms

Splunk introduces native support for histograms as a metric data type within Observability Cloud with Explicit ...