My data looks like this:
1. System CheckpointName ProcessTimestamp ConnectionId
2. SAP Checkpoint 1 2019-01-24T07:43:11.582Z Checkpoint 2
3. SAP Checkpoint 2 2019-01-24T07:44:00.582Z Checkpoint 3
4. SAP Checkpoint 3 2019-01-24T07:45:40.587Z Checkpoint 4
5. SAP Checkpoint 4 2019-01-24T07:45:55.587Z Checkpoint 1
6. SAP Checkpoint 1 2019-01-24T07:46:11.582Z Checkpoint 2
7. SAP Checkpoint 2 2019-01-24T07:46:47.582Z Checkpoint 3
8. SAP Checkpoint 3 2019-01-24T07:46:55.587Z Checkpoint 4
9. SAP Checkpoint 4 2019-01-24T07:47:31.587Z Checkpoint 1
I want to find out the average time between Checkpoint 1 and 2, 2 and 3, 3 and 4 and so on. That should be the total average duration between the respective Checkpoints (so in the example above, the duration between Checkpoint 1 and Checkpoint 2 would be (49+36)/2 = 42,5 seconds). But I really want to avoid writing several eval statements, because the query should be as generic as possible.
"CheckpointName" is the name of the checked step, and "ConnectionId" describes the next Checkpoint, that follows the current one.
Is there any way I can write a query, that outputs the desired result without writing an eval statement for each Checkpoint? Maybe with the foreach command or something else?
Thanks in advance!
@florianduhme
I'm not sure about events but tried with given data. Can you please try this?
YOUR_SEARCH | table System CheckpointName ProcessTimestamp ConnectionId
| eval _time = strptime(ProcessTimestamp, "%Y-%m-%dT%H:%M:%S.%3N") | sort - _time CheckpointName
| streamstats window=2 latest(_time) as New_time earliest(ConnectionId) as ConnectionId1 | eval Time = strftime(New_time, "%Y-%m-%dT%H:%M:%S.%3N") | eval diff=New_time - _time | stats avg(diff) as avg_diff by CheckpointName
My Sample Search:
| makeresults
| eval _raw="
No System CheckpointName ProcessTimestamp ConnectionId
2. SAP Checkpoint 1 2019-01-24T07:43:11.582Z Checkpoint 2
3. SAP Checkpoint 2 2019-01-24T07:44:00.582Z Checkpoint 3
4. SAP Checkpoint 3 2019-01-24T07:45:40.587Z Checkpoint 4
5. SAP Checkpoint 4 2019-01-24T07:45:55.587Z Checkpoint 1
6. SAP Checkpoint 1 2019-01-24T07:46:11.582Z Checkpoint 2
7. SAP Checkpoint 2 2019-01-24T07:46:47.582Z Checkpoint 3
8. SAP Checkpoint 3 2019-01-24T07:46:55.587Z Checkpoint 4
9. SAP Checkpoint 4 2019-01-24T07:47:31.587Z Checkpoint 1"
| multikv
| table System CheckpointName ProcessTimestamp ConnectionId
| eval _time = strptime(ProcessTimestamp, "%Y-%m-%dT%H:%M:%S.%3N") | sort - _time CheckpointName
| streamstats window=2 latest(_time) as New_time earliest(ConnectionId) as ConnectionId1 | eval Time = strftime(New_time, "%Y-%m-%dT%H:%M:%S.%3N") | eval diff=New_time - _time | stats avg(diff) as avg_diff by CheckpointName
Thanks
@florianduhme
I'm not sure about events but tried with given data. Can you please try this?
YOUR_SEARCH | table System CheckpointName ProcessTimestamp ConnectionId
| eval _time = strptime(ProcessTimestamp, "%Y-%m-%dT%H:%M:%S.%3N") | sort - _time CheckpointName
| streamstats window=2 latest(_time) as New_time earliest(ConnectionId) as ConnectionId1 | eval Time = strftime(New_time, "%Y-%m-%dT%H:%M:%S.%3N") | eval diff=New_time - _time | stats avg(diff) as avg_diff by CheckpointName
My Sample Search:
| makeresults
| eval _raw="
No System CheckpointName ProcessTimestamp ConnectionId
2. SAP Checkpoint 1 2019-01-24T07:43:11.582Z Checkpoint 2
3. SAP Checkpoint 2 2019-01-24T07:44:00.582Z Checkpoint 3
4. SAP Checkpoint 3 2019-01-24T07:45:40.587Z Checkpoint 4
5. SAP Checkpoint 4 2019-01-24T07:45:55.587Z Checkpoint 1
6. SAP Checkpoint 1 2019-01-24T07:46:11.582Z Checkpoint 2
7. SAP Checkpoint 2 2019-01-24T07:46:47.582Z Checkpoint 3
8. SAP Checkpoint 3 2019-01-24T07:46:55.587Z Checkpoint 4
9. SAP Checkpoint 4 2019-01-24T07:47:31.587Z Checkpoint 1"
| multikv
| table System CheckpointName ProcessTimestamp ConnectionId
| eval _time = strptime(ProcessTimestamp, "%Y-%m-%dT%H:%M:%S.%3N") | sort - _time CheckpointName
| streamstats window=2 latest(_time) as New_time earliest(ConnectionId) as ConnectionId1 | eval Time = strftime(New_time, "%Y-%m-%dT%H:%M:%S.%3N") | eval diff=New_time - _time | stats avg(diff) as avg_diff by CheckpointName
Thanks
This one actually works pretty well. But as mentioned above, I cannot confirm that the events will be ordered like shown in the example.
So there could be scenarios where Checkpoint 3 follows on Checkpoint 1, which will break the logic of the query you stated.
Thank you anyways, the query still helps me out a lot!
Assuming the data is indeed nicely ordered like in your example, the following should work.
...your search to get to this data...
| eval _time = strptime(ProcessTimestamp, "%Y-%m-%dT%H:%M:%S.%3N%Z")
| delta _time as timediff
| stats avg(timediff) by CheckpointName
More info on the delta
command: https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Delta
The data isn't going to be ordered like in the example above, therefore I think the ConnectionId has to be included in the search somehow.
Somethink like "Give me the average duration of Checkpoint 1 (from Checkpoint 1 to Checkpoint 2) where the ConnectionId is Checkpoint 2". I could do that with many eval-statements, but that wouldn't be generic at all.
Sounds like a matter of applying the | sort
command in the right way, to get it ordered like this. Without ordering things, it is going to be very difficult to do anything related to differences between 2 events (unless you want to venture into things like transactions, or self joins, but apart from the performance nightmare that is also going to be difficult given that your data is repetitive).
I don't think the delta command is the one I can go with. I just checked it with my data and the results are definitely not right.
I guess I will have to write a lot of eval statements then.
It works when you run it like this (thanks to @kamlesh_vaghela for the run anywhere search):
| makeresults
| eval _raw="
No System CheckpointName ProcessTimestamp ConnectionId
2. SAP Checkpoint 1 2019-01-24T07:43:11.582Z Checkpoint 2
3. SAP Checkpoint 2 2019-01-24T07:44:00.582Z Checkpoint 3
4. SAP Checkpoint 3 2019-01-24T07:45:40.587Z Checkpoint 4
5. SAP Checkpoint 4 2019-01-24T07:45:55.587Z Checkpoint 1
6. SAP Checkpoint 1 2019-01-24T07:46:11.582Z Checkpoint 2
7. SAP Checkpoint 2 2019-01-24T07:46:47.582Z Checkpoint 3
8. SAP Checkpoint 3 2019-01-24T07:46:55.587Z Checkpoint 4
9. SAP Checkpoint 4 2019-01-24T07:47:31.587Z Checkpoint 1"
| multikv
| table System CheckpointName ProcessTimestamp ConnectionId
| eval _time = strptime(ProcessTimestamp, "%Y-%m-%dT%H:%M:%S.%3N") | sort - _time CheckpointName
| delta _time as timediff
| stats avg(timediff) by CheckpointName
I think reversing the sorting does the trick, otherwise the delta's are shifted one row.
But yeah, if your actual data is much more complex than the example, it may not work as simple as that...