Splunk Search

Compute row differences

guilhem
Contributor

Hello everyone!

I have a very simple result table that looks like this:

_time s duration
2/10/13 12:20:22.000 PM 21279054471791556300 0
2/10/13 12:20:43.000 PM 21279054471791556300 21
2/10/13 12:21:07.000 PM 21279054471791556300 45
2/10/13 12:21:07.000 PM 21279054471791556300 45
2/10/13 12:21:52.000 PM 21279054471791556300 90
2/10/13 12:22:26.000 PM 21279054471791556300 124
2/10/13 12:22:59.000 PM 21279054471791556300 157
2/10/13 12:23:07.000 PM 21279054471791556300 165
2/11/13 5:09:16.000 AM 21242230731515268458 131
2/11/13 5:09:38.000 AM 21242230731515268458 153
2/11/13 5:09:46.000 AM 21242230731515268458 161
2/11/13 5:09:53.000 AM 21242230731515268458 168
2/11/13 5:09:53.000 AM 21242230731515268458 168
2/11/13 5:10:19.000 AM 21242230731515268458 194
2/11/13 5:10:24.000 AM 21242230731515268458 199
2/11/13 5:10:37.000 AM 21242230731515268458 212
2/11/13 5:10:45.000 AM 21242230731515268458 220
2/11/13 5:10:50.000 AM 21242230731515268458 225

And I would like to compute a new field, let say diff, that will contains the difference between each duration, row by row: like this:

_time s duration diff
2/10/13 12:20:22.000 PM 21279054471791556300 0 / Nothing
2/10/13 12:20:43.000 PM 21279054471791556300 21 21
2/10/13 12:21:07.000 PM 21279054471791556300 45 23
2/10/13 12:21:07.000 PM 21279054471791556300 45 0
2/10/13 12:21:52.000 PM 21279054471791556300 90 45
2/10/13 12:22:26.000 PM 21279054471791556300 124 69
2/10/13 12:22:59.000 PM 21279054471791556300 157 33
2/10/13 12:23:07.000 PM 21279054471791556300 165 8
2/11/13 5:09:16.000 AM 21242230731515268458 131 / Nothing
2/11/13 5:09:38.000 AM 21242230731515268458 153 22
2/11/13 5:09:46.000 AM 21242230731515268458 161 18
2/11/13 5:09:53.000 AM 21242230731515268458 168 7
2/11/13 5:09:53.000 AM 21242230731515268458 168 0
2/11/13 5:10:19.000 AM 21242230731515268458 194 26
2/11/13 5:10:24.000 AM 21242230731515268458 199 5
2/11/13 5:10:37.000 AM 21242230731515268458 212 13
2/11/13 5:10:45.000 AM 21242230731515268458 220 8
2/11/13 5:10:50.000 AM 21242230731515268458 225 5

I have tried to use the delta command, but it doesn't work because events from several s can occur at the same time.

I want to avoid using double makemv / mvexpand and compute the difference between all and filter after, as it will cost a lot of time (n square) and I really only need the duration(n) - duration(n-1) value

Thanks a lot!

Guilhem

1 Solution

guilhem
Contributor

OK, it was quite "simple", thanks for pointing out the streamstat command, I'm not familiar with it. Anyway here the solution:

| streamstats list(duration) as duration2 c(duration) as count by s
| eval diff = mvindex(duration2, count-2)
| eval diff = duration-diff

Note that I have to count the number of values of the duration2, so I can get the before last mvindex


EDIT

The updated (much simplier) version is:

| streamstats current=f last(duration) as last_duration by s
| eval diff = duration - last_duration

Thanks to martin_mueller

View solution in original post

guilhem
Contributor

OK, it was quite "simple", thanks for pointing out the streamstat command, I'm not familiar with it. Anyway here the solution:

| streamstats list(duration) as duration2 c(duration) as count by s
| eval diff = mvindex(duration2, count-2)
| eval diff = duration-diff

Note that I have to count the number of values of the duration2, so I can get the before last mvindex


EDIT

The updated (much simplier) version is:

| streamstats current=f last(duration) as last_duration by s
| eval diff = duration - last_duration

Thanks to martin_mueller

martin_mueller
SplunkTrust
SplunkTrust

list(duration) sounds quite cumbersome when you're basically just looking for last(duration)...

guilhem
Contributor

Be carefull thow, as I have just discovered, the lis() function only returns the first 100 results, so it may be convenient to use something else (penultimate duration value, instead of list).

I will update the answer when I found out how to find the penultimate value of a field using stats.

0 Karma

martin_mueller
SplunkTrust
SplunkTrust

guilhem
Contributor

Perfect, using the current=f to get the last-but-one value is clever!

I update the answer.

0 Karma

martin_mueller
SplunkTrust
SplunkTrust

Consider this to get around the list/mvindex thingy:

... | streamstats current=f last(duration) as last_duration by s | eval diff = duration - last_duration

AshimaE
Explorer

Could you explain what current=f is used for here. Also last(x) takes us to the oldest entry for that s isn't it. so how is it being used here exactly. sorry for the naive doubts. im still a newbie.

0 Karma
Get Updates on the Splunk Community!

Introducing the Splunk Community Dashboard Challenge!

Welcome to Splunk Community Dashboard Challenge! This is your chance to showcase your skills in creating ...

Wondering How to Build Resiliency in the Cloud?

IT leaders are choosing Splunk Cloud as an ideal cloud transformation platform to drive business resilience,  ...

Updated Data Management and AWS GDI Inventory in Splunk Observability

We’re making some changes to Data Management and Infrastructure Inventory for AWS. The Data Management page, ...