Background:
Once an asynchronous request has been triggered, a client polls the system, waiting for an object to be created. The system processes the request and writes an event to the log file once it is done.
I am trying to create a transaction that ties together the very first poll request with the first occurrence of another event.
search terms | transaction UniqueIdentifier startswith="createObject" endswith="ObjectHasBeenCreated" maxspan=30s
This search unfortunately matches the latest "start event" instead of the first/earliest start event.
How can I get the transaction to use the earliest start event as the starting point?
The plan, once I match the correct events, is to use the transaction command's "duration" field to get the timing.
You may be better off using the streamstats command http://docs.splunk.com/Documentation/Splunk/6.0/SearchReference/streamstats
One simple approach: let's assume the "startswith" and "endswith" values are in a field called "action". You could do something like this:
.... | streamstats count by action | sort action | delta _time as duration
This is very sketchy SPL, just to point you in a direction.
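A slightly fuller sketch of the streamstats idea, assuming the events share the UniqueIdentifier field from your transaction search and that the event-type marker lives in a field called "action" (adjust both names to your data):

```spl
search terms
| sort 0 _time
| streamstats min(_time) as start by UniqueIdentifier
| where action="ObjectHasBeenCreated"
| eval duration = _time - start
```

Because streamstats runs over events in stream order, min(_time) per UniqueIdentifier is the earliest start seen so far, so the duration at the end event is measured from the first start rather than the latest one.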
I don't think the above answer will solve the problem, unless I failed to apply it properly.
Here comes more detailed info:
In my search clause I am able to limit the results to two types of events, the StartEvent which can occur many times, and the EndEvent that closes the "transaction". I need the duration between the earliest StartEvent and the EndEvent.
The events do not share a common field, but both contain the same email address, just under different field names. With the transaction approach, which failed because it used the latest start event instead of the first, the events were tied together by a regex field extraction.
If I search for a specific email address I can get the duration like this:
search terms finding exactly 1 transaction (by entering a unique email address) | stats earliest(_time) AS start latest(_time) as stop | eval durationSeconds = stop - start | stats max(durationSeconds) as max
But when I am not searching for a specific, unique email address, the duration is calculated across all the transactions at once.
How do I solve this if I want the duration for thousands of individual transactions, given that I cannot use the transaction command?
The end goal is to be able to graph the response times over time (per transaction), with for example "timechart avg(duration)".
I hope I was able to explain my need 🙂
I updated my field extraction so now the email address is extracted with one common field for both Events.
Too bad the transaction command does not have an option to choose whether the earliest or latest matching event is used for startswith/endswith.
Any ideas are appreciated!
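One idea, sketched under the assumption that the Email extraction now covers both event types, and using searchmatch() with your "createObject" / "ObjectHasBeenCreated" strings to tell the two event types apart (swap in whatever actually distinguishes your events): take, per email, the earliest StartEvent and the earliest EndEvent, instead of the latest event overall.

```spl
search terms
| eval Email=lower(Email)
| stats earliest(eval(if(searchmatch("createObject"), _time, null()))) as start
        earliest(eval(if(searchmatch("ObjectHasBeenCreated"), _time, null()))) as stop
        by Email
| where isnotnull(start) AND isnotnull(stop) AND stop > start
| eval duration = stop - start
```

Taking the earliest EndEvent explicitly should avoid matching a later EndEvent, so the "duration < 200" guard may no longer be needed; the isnotnull() checks drop emails that never got an EndEvent instead of reporting a 0 duration.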
This search kind of works (but is very shaky):
search terms | dedup Email sortby +_time | eval Email=lower(Email) | stats earliest(_time) as E,latest(_time) as L by Email | eval duration=L-E | where duration > 0 | where duration < 200
where Email is a field extraction that pulls the email from both types of events. The "where duration > 0" is needed because the result comes out as 0 even when no EndEvent is found. The "where duration < 200" is needed to exclude results that match a later EndEvent, or where the same email address is used again.
I hope you guys have some better ideas.....
Ok, I think it is working decently now.
search terms | dedup Email sortby +_time | eval Email=lower(Email) | stats earliest(_time) as E,latest(_time) as L by Email | eval duration=L-E | where duration > 0 | where duration < 200 | eval Time=strftime(E, "%m/%d %H:%M") | chart avg(duration) by Time
I am still interested in case you have ideas to improve this.
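One possible refinement of that last search (a sketch, keeping the same Email extraction and duration filters): instead of formatting the start time into a string and using chart, restore _time from the earliest event so timechart can do the time bucketing, which matches the original goal of "timechart avg(duration)":

```spl
search terms
| eval Email=lower(Email)
| stats earliest(_time) as E, latest(_time) as L by Email
| eval duration = L - E
| where duration > 0 AND duration < 200
| eval _time = E
| timechart span=1h avg(duration)
```

Two notes on the differences: timechart sorts and buckets real timestamps, whereas "chart ... by Time" on a formatted string sorts lexicographically; and since "stats ... by Email" already collapses each email to one row, the leading dedup may be unnecessary (if it is kept, lowercasing Email before the dedup would make the dedup case-insensitive too).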