I have log file in which each log has a conversationId field and a transactionId field. The logs are of various machines communicating with each other and each communication between the machines is called a conversation. For every conversation there are multiple log entries all having the same conversationId but different transactionId. I want to first find the earliest transactionId for each conversation in my logs and then fetch all the events that have this conversation + transactionId pair and also the string "input record" in it.
How do I go about doing this?
I first tried to use the stats command to get the values of all conversationIds but I dont know how to find the earliest transactionId for each conversation and then iterate this over all the conversationIds returned by the stats command.
Okay, now that I understand the question better, try this to return the events, not the stats:
"input record" [ search "xyz" | stats earliest(transactionId) as transactionId by conversationId ]
The only problem here is that the subsearch (the part in brackets) is limited to 10,000 results. If that is a problem for you, then you could do this:
"input record"
| join transactionId conversationId [ search "xyz" | stats earliest(transactionId) as transactionId by conversationId ]
This has a limit of 50,000 events, but it will be slower. However, neither of these searches will be particularly fast.
not sure if this possible and can't test right now, but was thinking of something like this. Create a table of conversationId, min transactionId and a list of the _raw events from that conversation. Then search for only those conversations whose raw events match your search string.
[search for all conversations] |stats min(transactionId) values(_raw) as conversations by conversationId | where match(conversations,"xyz")
Okay, now that I understand the question better, try this to return the events, not the stats:
"input record" [ search "xyz" | stats earliest(transactionId) as transactionId by conversationId ]
The only problem here is that the subsearch (the part in brackets) is limited to 10,000 results. If that is a problem for you, then you could do this:
"input record"
| join transactionId conversationId [ search "xyz" | stats earliest(transactionId) as transactionId by conversationId ]
This has a limit of 50,000 events, but it will be slower. However, neither of these searches will be particularly fast.
Try this
yoursearchhere "input record"
| stats earliest(transactionId) by conversationId
If that isn't what you want, you could do this
yoursearchhere
| stats earliest(transactionId) as transactionId earliest(_raw) as raw by conversationId
| where raw like "%input record%"
If you want to only see the transaction id and conversation id, add this to the end of the search
| fields conversationId transactionId
Thanks but the stats command returns statistics and I want events. Basically, here are the steps I want to follow:
1) Find all events that have some string "xyz" in them. This will give a filtered list of conversations.
2) For each of the above conversations, find the earliest transaction id in its respective conversation.
3) Use this transaction id and conversation id pair to search through ALL events (not just the filtered ones in step 1 because the events filtered in step 1 may not contain the events that have the 2 ids and the string "input record" in them) and find the event that has the string "input record" in it along with the conversation id and transaction id pair and do this for all the id pairs found in step 2.
4) Display the events found in step3
So in the end, I want the events not the stats