Transaction Question

justintaylor9
Explorer

Trying to calculate the duration between two log messages. I've found many resources online, but nothing seems to work quite right...

This is what I am trying:

host=ns2 I move from
| transaction Restart_status startswith="I move from normal to communications-interrupted" endswith="I move from startup to normal"
| timechart avg(duration) by Restart_status

Basically I want to know how long it takes for the server to restart its DHCP service. I defined 'Restart_status' as a custom field.

Currently this search ends up with 0 results.

Thanks for your help!


elliotproebstel
Champion

The Splunk transaction command requires a field that ties the events together, like a session_id or device_id. It looks like your events don't have such a unifying field; instead, you want to assume that the time-based ordering of the events always defines a transaction. We can do that. Assuming you've extracted the messages like "I move from normal to communications-interrupted" and "I move from startup to normal" into a field called msg, this should work:

your base search collecting the events
| sort 0 _time 
| streamstats earliest(_time) AS start_time reset_after="(msg=\"I move from startup to normal\")"
| eval duration=if(msg="I move from startup to normal", _time-start_time, null())

The events that define the end of a transaction (events with msg="I move from startup to normal") will get a field called duration, holding the time (in seconds) between that particular event and the most recent preceding event with msg="I move from normal to communications-interrupted". All events will also have a field called start_time, which is the value of _time for the most recent event marking the start of the pseudo-transaction.
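
To get back to the timechart you originally wanted, you could extend the same pipeline; a sketch, still assuming the messages are extracted into a field called msg (the output field name is just illustrative):

your base search collecting the events
| sort 0 _time
| streamstats earliest(_time) AS start_time reset_after="(msg=\"I move from startup to normal\")"
| eval duration=if(msg="I move from startup to normal", _time-start_time, null())
| timechart avg(duration) AS avg_restart_seconds

timechart ignores the events where duration is null, so only completed restart cycles contribute to the average.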

justintaylor9
Explorer

Thank you for your response. I tried this...

I extracted the messages to the field "Restart_status"...

host=ns2 I move from
| transaction Restart_status startswith="I move from normal to communications-interrupted" endswith="I move from startup to normal"
| sort 0 _time
| streamstats earliest(_time) AS start_time reset_after="(Restart_status=\"I move from startup to normal\")"
| eval duration=if(Restart_status="I move from startup to normal", _time-start_time, null())

To no avail.


richgalloway
SplunkTrust
SplunkTrust

I think you misinterpreted @elliotproebstel's answer. This should clarify things. It uses streamstats instead of transaction because the latter does not perform well.

host=ns2 "move from" | sort 0 _time | streamstats earliest(_time) AS start_time reset_after="(Restart_status=\"I move from startup to normal\")" | eval duration=if(Restart_status="I move from startup to normal", _time-start_time, NULL)
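Incidentally, this is probably also why the original transaction search returned 0 results: Restart_status was passed to transaction as an ID field, so Splunk only grouped events sharing the same Restart_status value, and the start and end messages never share one. If you did want to stay with transaction, it can be run on startswith/endswith alone; a rough (and slower) equivalent, assuming start and end events strictly alternate:

host=ns2 "move from"
| transaction startswith="I move from normal to communications-interrupted" endswith="I move from startup to normal"
| timechart avg(duration)

Here duration is created automatically by the transaction command.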

justintaylor9
Explorer

Shouldn't it be like this:

host=ns2 "move from" | sort 0 _time | streamstats earliest(_time) AS start_time reset_after="(Restart_status=\"I move from startup to normal\")" | eval duration=if(Restart_status="I move from normal to communications-interrupted", _time-start_time, NULL)

This is still not working.


elliotproebstel
Champion

No, your changes should yield duration values of 0. The search suggested by @richgalloway is what I intended. Can you tell us in what way it is not working? Is it calculating the wrong value? Is it not giving any results at all? This will help us troubleshoot.

justintaylor9
Explorer

It just returns all of the events that match the base search "move from" ... without any calculations occurring.

Thanks!


elliotproebstel
Champion

Hmm, interesting. Can you verify that the _time fields are properly extracted? Maybe try this:

host=ns2 "move from"
| eval display_time=_time
| table _time display_time _raw

Does that show you three columns? It should display the _time field as a human-readable timestamp, the display_time field as an epoch time string (a long number), and the raw text of the event.
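
If the timestamps check out, it may also be worth confirming that Restart_status is extracted at search time; a variant of the same test (assuming Restart_status is your extracted field):

host=ns2 "move from"
| eval display_time=_time
| table _time display_time Restart_status _raw

If the Restart_status column comes back empty, the field extraction is the problem rather than the timestamps.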

justintaylor9
Explorer

The time fields are indeed properly extracted. This search shows precisely what you specified: the three columns with the human-readable timestamp, the epoch, and the full event.


elliotproebstel
Champion

The plot thickens! How about this:

host=ns2 "move from" 
| sort 0 _time 
| streamstats earliest(_time) AS start_time reset_after="(Restart_status=\"I move from startup to normal\")" 
| eval duration=if(Restart_status="I move from startup to normal", _time-start_time, null())
| table _time Restart_status start_time duration 

justintaylor9
Explorer

Now we are in business. Duration is showing correct values, with the exception of one giant outlier. This should give me exactly what I need. Thank you guys so much for your help!!!


elliotproebstel
Champion

Glad to help. Since we've solved it, I'd like to explain the difference between this final "test" and what we did before. All we added here was a table command, which displays the results of the calculations front and center. Without the table command, the fields were still being calculated, but you'd have to expand individual events to view the results. Give it a shot. Run this search query:

 host=ns2 "move from" 
 | sort 0 _time 
 | streamstats earliest(_time) AS start_time reset_after="(Restart_status=\"I move from startup to normal\")" 
 | eval duration=if(Restart_status="I move from startup to normal", _time-start_time, NULL)

And click to expand an event that contains "I move from startup to normal". You'll see the fields start_time and duration were hiding there under the covers. 🙂 You can change the final command (table, stats, etc.) to display exactly what you want to see.
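
For instance, to summarize the restart durations (and surface outliers like the one mentioned above), a stats variant along these lines should work; the output field names are just illustrative:

host=ns2 "move from"
| sort 0 _time
| streamstats earliest(_time) AS start_time reset_after="(Restart_status=\"I move from startup to normal\")"
| eval duration=if(Restart_status="I move from startup to normal", _time-start_time, null())
| stats count(duration) AS restarts avg(duration) AS avg_seconds max(duration) AS max_seconds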


niketn
Legend

@justintaylor9 what is the ID that ties together the events to be correlated? What is the Restart_status value in the sample events you have provided, since you used Restart_status as the ID field in your transaction?


justintaylor9
Explorer

Restart_status is a field I extracted... it contains either 'I move from normal to communications-interrupted' or 'I move from startup to normal', which indicate the start and end of the restart process.


niketn
Legend

The ID is supposed to match for the two events you are correlating. If Restart_status is not an ID, your transaction will not work. So, coming back to the question... what is the ID between each start and end? Or will every subsequent entry be a start and then an end?

Even before transaction or any other approach can be applied, could you please explain, based on the sample data provided, how you know that two events belong to the same transaction, i.e., which is the start and which is the stop?


justintaylor9
Explorer

Every subsequent entry will be start and then end.


kmaron
Motivator

Can you please provide examples of your events?


justintaylor9
Explorer
3/28/18 10:22:43.000 AM
Mar 28 10:22:43 ns2 dhcpd[5445]: failover peer ns1-ns2(1521650181p): I move from startup to normal

3/28/18 10:22:42.000 AM
Mar 28 10:22:42 ns2 dhcpd[2911]: failover peer ns1-ns2(1521650181p): I move from communications-interrupted to startup

3/28/18 10:19:03.000 AM
Mar 28 10:19:03 ns2 dhcpd[28860]: failover peer ns1-ns2(1521650181p): I move from normal to communications-interrupted

3/28/18 10:01:45.000 AM
Mar 28 10:01:45 ns2 dhcpd[28860]: failover peer ns1-ns2(1521650181p): I move from communications-interrupted to normal

All four events share host = ns2, source = /var/log/splunk-data/splunk.log, sourcetype = infoblox:dhcp
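
For reference, with events like these, a field such as Restart_status could also be extracted inline rather than through a saved field extraction; a minimal rex sketch (the regex is an assumption based on the samples above):

host=ns2 sourcetype="infoblox:dhcp" "I move from"
| rex "(?<Restart_status>I move from \S+ to \S+)"

rex pulls the "I move from ... to ..." phrase out of _raw into Restart_status, after which the streamstats search above applies unchanged.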