Hi Splunk users,
I have a request that looks simple, but I have been thinking about it all day without figuring out how to do it with Splunk.
My problem is the following: I have 4 types of events, A, B, C and D, that occur in a cycle (i.e. A then B then C then D), but several occurrences of this loop can be in progress at the same time, which means I have logs like this:
A B A C B D C D
I know that the first B belongs to the first A, etc. I thus have 2 cycles in the example above, and I would like to determine the duration of each sequence (2 sequences in this example). Each event is of course timestamped.
I tried the transaction command, but it's not working as I want.
tl;dr: what I want:
A B A C B D C D
| timing  |
    | timing  |
What I have with transaction:
A B A C B D C D
|timing|
Hi,
I started working on this request again and, keeping in mind @somesoni2's remark about identifiers, I was able to work around this.
So the request's principle is the following (dirty):
Make a subsearch to get all A events and their ID
Make a subsearch to get all D events and their ID
Sort the results by _time in descending order
Get transactions by ID starting with A and ending with D
????
Profit!
The request looks something like this:
| append [
search index etc.
(A event or ID event for A)
| rex get ID in a named field
| transaction startswith="A event" endswith="ID for A event" mvlist=t
| eval ID=mvindex(ID, 1), event="A"
]
| append [
search index etc.
(D event or ID event for D)
| rex get ID in a named field
| transaction startswith="D event" endswith="ID of D event" mvlist=t
| eval ID=mvindex(ID, 1), event="D"
]
| sort - _time
| transaction ID startswith=eval(event=="A") endswith=eval(event=="D")
Here's a run-anywhere version of a potential solution. The required assumption is that the event types can be extracted into a single field (here called mytype) prior to the streamstats command, and that exactly one of each must occur in your cycle.
| makeresults
| eval mytype="A B A C B D C D"
| makemv mytype
| mvexpand mytype
| streamstats count as recno
| eval _time=_time + recno
| fields - recno
| rename COMMENT as "The above just creates test data."
| rename COMMENT as "The solution is as below."
| streamstats count as cycleno by mytype
| transaction cycleno
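If only the duration of each cycle is needed, the same numbering can feed stats instead of transaction. The following is a minimal sketch on the same generated test data, not a search tested against real logs; the field names start_time, end_time, events and duration are made up for illustration. (If you keep transaction, it already adds a duration field to each grouped result.)

```spl
| makeresults
| eval mytype="A B A C B D C D"
| makemv mytype
| mvexpand mytype
| streamstats count as recno
| eval _time=_time + recno
| fields - recno
| rename COMMENT as "Number the Nth occurrence of each type as cycle N."
| streamstats count as cycleno by mytype
| rename COMMENT as "One row per cycle: first and last timestamp, and number of events."
| stats min(_time) as start_time, max(_time) as end_time, count as events by cycleno
| eval duration=end_time - start_time
```

On this test data both cycles should come out with events=4 and duration=5 (A at +1s to D at +6s, and A at +3s to D at +8s). The same assumption applies as above: exactly one event of each type per cycle, and a search window that starts on a cycle boundary.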
There is an additional consideration here that may be a problem: identifying the records that belong to the first cycle. If the extraction starts in the middle of a cycle, these numbers won't line up correctly. For instance, if the search time range begins at the asterisk in the sequence below, the first C and D will be wrongly associated with the A and B after the asterisk, rather than with the ones before it. I'm not sure if there is a principled solution to this, since the number of crossovers is potentially infinite.
A B * A B C D C D
If I recall correctly, I provided a solution for this a few months back. It requires running a streamstats or accum forward across the time frame, using reverse and running a streamstats or accum backward across the time frame, and then adjusting the results based on the highest positive or negative numbers in each direction.
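As a rough illustration of the backward half of that idea only (this is not the original solution referred to above, and the final adjustment step is not shown), the occurrences can be numbered from the end of the window. The sketch assumes the window ends on a complete cycle; the field names cycle_from_end, events and duration are hypothetical.

```spl
| makeresults
| eval mytype="A B C D C D"
| makemv mytype
| mvexpand mytype
| streamstats count as recno
| eval _time=_time + recno
| fields - recno
| rename COMMENT as "Test data: the window opens mid-cycle, so the first C and D are orphans."
| reverse
| streamstats count as cycle_from_end by mytype
| stats count as events, range(_time) as duration by cycle_from_end
| where events=4
```

Counting backward, the last C and D are paired with the only A and B in the window, and the orphaned first C and D end up alone in cycle_from_end=2, which the final where clause drops. If the window is also cut mid-cycle at the end, this one-directional pass is not enough, which is presumably why the approach described above combines both directions.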
I tried your solution. I did not quite understand every part of it, but it behaved strangely: it returned 4999 events over the last 7 days for 4498 real cycles. Also, each line returned contained the same event six times.
Iteratively, what I want to do is pretty simple; translating it into a Splunk request is harder than I thought:
for each event:
    if event is A then: // Nth event
        loop to first B
        loop to first C
        loop to first D
        determine duration by subtracting A's timestamp from D's
    // go to next event, (N+1)th event
Do you have some sort of transaction ID/primary key which can differentiate two cycles?
Hi, I have one in another event that occurs after A; it identifies the item that is going through the cycle.