I have multiple log entries in a file and I want to find the time difference between them. For example:
1. 05/10/2018 - 14:04:49 --- Deployment Process completed
2. 05/10/2018 - 14:04:39 --- extra processes completed
3. 05/10/2018 - 14:04:36 --- extra processes started
4. 05/10/2018 - 14:04:34 --- ftp completed
5. 05/10/2018 - 14:04:30 --- About to ftp data
6. 05/10/2018 - 14:04:29 --- Deployment Process started
Desired output:
Description -> Time taken
Time taken for ftp (5. - 4.) -> 4s
Time taken for extra process (3. - 2.) -> 3s
Total time taken (6. - 1.) -> 20s
I'm using:
|transaction host startswith="Running the Deployment Process" endswith="Deployment Process Completed OK" | timechart avg(duration) as difference |
This helps in finding the time difference between two events at a time, but I can't work out how to do this with multiple events and display the results as shown above (one after the other, in different rows of a table).
Could someone please help me with this?
@poojadevadas
Can you please try this?
YOUR_SEARCH
| appendpipe [transaction host startswith="About to ftp data" endswith="ftp completed" | eval "Time taken for ftp"=duration ]
| appendpipe [transaction host startswith="extra processes started" endswith="extra processes completed" | eval "Time taken for extra process"=duration ]
| appendpipe [transaction host startswith="Deployment Process started" endswith="Deployment Process completed" | eval "Total time taken"=duration ]
| stats values(*) as * by host
| eval Description="Time taken"
| table Description "Time taken for ftp" "Time taken for extra process" "Total time taken"
| transpose header_field=Description column_name=Description
| eval "Time taken"='Time taken'."s"
Note: for designing this search I have used only the sample you provided.
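Outside Splunk, the transaction logic above (pair a start-marker event with an end-marker event and take the time difference) can be sketched in Python against the sample log lines from the question. This is only an illustration of the idea; the marker strings and timestamp format are taken from the sample, and the `duration` helper is hypothetical:

```python
from datetime import datetime

# Sample log lines from the question (newest first)
LOG = """05/10/2018 - 14:04:49 --- Deployment Process completed
05/10/2018 - 14:04:39 --- extra processes completed
05/10/2018 - 14:04:36 --- extra processes started
05/10/2018 - 14:04:34 --- ftp completed
05/10/2018 - 14:04:30 --- About to ftp data
05/10/2018 - 14:04:29 --- Deployment Process started"""

def parse(line):
    """Split a log line into (timestamp, message)."""
    ts, msg = line.split(" --- ", 1)
    return datetime.strptime(ts, "%d/%m/%Y - %H:%M:%S"), msg

def duration(lines, start_marker, end_marker):
    """Seconds between a start marker and an end marker,
    roughly what a Splunk transaction's duration field gives you."""
    found = {}
    for line in lines:
        ts, msg = parse(line)
        if msg == start_marker:
            found["start"] = ts
        elif msg == end_marker:
            found["end"] = ts
    return (found["end"] - found["start"]).total_seconds()

lines = LOG.splitlines()
print("Time taken for ftp ->", duration(lines, "About to ftp data", "ftp completed"), "s")
print("Time taken for extra process ->", duration(lines, "extra processes started", "extra processes completed"), "s")
print("Total time taken ->", duration(lines, "Deployment Process started", "Deployment Process completed"), "s")
```

Running this against the sample yields 4, 3 and 20 seconds, matching the desired output in the question.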
Based on the data provided, try this:
"your initial search " |replace "ftp completed" with "ftp process completed","About to ftp data" with "ftp process started"
|eval _time=strptime(_time,"%d/%m/%Y - %H:%M:%S")
|rex field=REPLACE_THIS_WITH_YOUR_FIELD_NAME "(?<process_name>\w+)\s+(?i)process"
|sort process_name,_time
|streamstats current=f window=1 last(_time) as start_time by process_name
|eval "Time taken"=_time-start_time|search "Time taken"=*
|eval Description="Time taken for ".process_name." process"
|table Description,"Time taken"| addcoltotals labelfield="Description" label="Total time taken"
Removed transactions since they are a bit expensive comparatively.
REPLACE_THIS_WITH_YOUR_FIELD_NAME should be replaced with the name of the field that contains strings such as "Deployment Process", "extra processes", etc.
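The pipeline above boils down to: normalize the messages so every process has a matching "<name> process started/completed" pair, extract the process name, sort by name and time, and diff the timestamps within each process. A minimal Python sketch of that same logic, using the sample data only (the regex mirrors the `rex` in the search; everything else is illustrative, not Splunk internals):

```python
import re
from collections import defaultdict
from datetime import datetime

LOG = [
    "05/10/2018 - 14:04:49 --- Deployment Process completed",
    "05/10/2018 - 14:04:39 --- extra processes completed",
    "05/10/2018 - 14:04:36 --- extra processes started",
    "05/10/2018 - 14:04:34 --- ftp completed",
    "05/10/2018 - 14:04:30 --- About to ftp data",
    "05/10/2018 - 14:04:29 --- Deployment Process started",
]

def normalize(msg):
    """Mirror the |replace command: give ftp a started/completed pair."""
    msg = msg.replace("ftp completed", "ftp process completed")
    msg = msg.replace("About to ftp data", "ftp process started")
    return msg

# Mirror the rex: the word before "process" is the process name
PROC_RE = re.compile(r"(\w+)\s+process", re.IGNORECASE)

# Group event timestamps by process name
events = defaultdict(list)
for line in LOG:
    ts_str, msg = line.split(" --- ", 1)
    ts = datetime.strptime(ts_str, "%d/%m/%Y - %H:%M:%S")
    m = PROC_RE.search(normalize(msg))
    if m:
        events[m.group(1)].append(ts)

# Mirror sort + streamstats: within each process, diff the timestamps
results = {}
for name, times in sorted(events.items()):
    times.sort()
    results[name] = (times[-1] - times[0]).total_seconds()
    print(f"Time taken for {name} process -> {results[name]:.0f}s")
```

Note that `addcoltotals` in the search would then append a row summing all the per-process durations; here the "Deployment" row already covers the whole run (20s), just as in the desired output.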
This took some time for me to understand (since I'm new to Splunk), but I was able to use it with some changes. Thanks a ton!
hi @poojadevadas ,
It looks like @renjith.nair helped solve your problem. Would you mind approving their answer and up-voting? Thanks for posting!
This is a very straightforward one. Except for the fact, mentioned by @renjith.nair, that transactions are expensive, this is good to go. I used it and got the output as expected. Thanks a ton!
@poojadevadas, the performance consideration with transaction comes in when you have a large number of events. Whichever answer suits your requirement better, "accept it as answer" so that the thread is closed.
Accepted this since I'm working with a small number of events, so it suits my requirement better.
@poojadevadas,
Are these from the same host/source/sourcetype, or is there any unique value we could use to differentiate and group these messages? Also, is there a definite set of processes, or are these process messages dynamic? Just trying to understand whether we can normalize this data to have uniformity in the log messages.
Hi @renjith.nair,
These are from the same host, source and sourcetype.
There's no unique value we can use to differentiate and group these messages. All I have is this log file, which looks exactly like what I have shown above (except for the numbers on the left).
And yes, we use the same processes in each log unless the deployment is unsuccessful (which I'm not looking at for the present). So the logs I'm going to consider are only for successful deployments, and hence these processes are a definite set of processes.