Splunk Search

Calculating times between segments/steps in a conversation/transaction

esp
New Member

Is it possible to dynamically calculate the RHS of a search comparison?

I'm looking to use Splunk to do latency measurements across various segments of a processing pipeline, e.g.:

A -> B -> C

I have a log that looks like:

  <conversationId> <timestamp> <segment (e.g. A, B or C)>

Where conversationId is used to correlate messages related to a single 'conversation' as they flow through the pipeline.

I can calculate end-to-end latency like so:

sourcetype="source" segment="C" |
eval endTime=timestamp |
fields conversationId, endTime |
join type=outer conversationId [
  search sourcetype="source" segment="A" |
  eval startTime=timestamp |
  fields conversationId, startTime
] |
eval latency=(endTime-startTime) |
fields conversationId, latency

which works, but I need to explicitly identify the start and end segments. I'd like to be able to generalize this so that I can calc latency across each of the subsegments without having to name each of them (this becomes a pain as the number of segments increases or changes).

My idea was to include info about the previous segment in the log messages:

  <conversationId> <timestamp> <segment (e.g. A, B or C)> <previousSegment>

And then have a search like:

sourcetype="source" |
eval prev=previousSegment |
eval endTime=timestamp |
fields conversationId, previousSegment, endTime |
join type=outer conversationId [
  search sourcetype="source" segment=***prev*** |
  eval startTime=timestamp |
  fields conversationId, startTime
] |
eval latency=(endTime-startTime) |
fields conversationId, latency

I can't get this to work however. Is there some way to be able to use a calculated field in the RHS of a search comparison?

Thanks, Edwin

1 Solution

gkanapathy
Splunk Employee

I don't think you need to (nor should you) do what you seem to be trying.

Seems this search could much more easily and efficiently be done with:

sourcetype=source 
| stats 
    first(_time) as latest
    last(_time) as earliest
  by conversationId
| eval latency = latest-earliest

Alternatively, if for some reason _time isn't the same as timestamp:

sourcetype=source 
| stats 
    max(timestamp) as latest
    min(timestamp) as earliest
  by conversationId
| eval latency = latest-earliest


Update: Oh I see, you want the diffs between each stage. Then you'd need:

sourcetype=source
| streamstats global=f window=2 current=t
    max(_time) as currenttime
    min(_time) as prevtime
  by
    conversationId
| eval latency=currenttime-prevtime

I think. I may have it off-by-one, so the latency is for the next stage instead of the previous stage.
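
One way to sidestep the off-by-one question is to sort each conversation into ascending time order and carry the previous event's values forward explicitly. This is a minimal sketch, not a tested search: it assumes _time reflects the log timestamp and that conversationId and segment are extracted fields. prevTime and prevSegment are illustrative field names, not anything from the original log.

```
sourcetype=source
| sort 0 conversationId _time
| streamstats current=f last(_time) as prevTime last(segment) as prevSegment by conversationId
| where isnotnull(prevTime)
| eval latency = _time - prevTime
| table conversationId prevSegment segment latency
```

With current=f, each event sees only the events before it in its conversation, so each row should read as "time from prevSegment to segment". The first event of every conversation has no predecessor and is dropped by the where clause.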


splunksolutions
Splunk Employee

This is something we've built a search command to do in the Splunk App for Transaction Profiling. Look in the menu for Samples -> Steps. The current version on Splunkbase, Preview 2, still requires you to identify each segment; but we're looking at ways to more generally define when a new segment starts.

Esp, the product team would like to engage with you offline. Can you please email transactionprofiling@splunk.com?

Thanks!


gkanapathy
Splunk Employee

I believe that your approach is more complicated and less efficient than necessary. Rather than answering your specific question about variable substitution, I have suggested what I think is a better way to get the results you seem to be after.


hulahoop
Splunk Employee

Hello esp, have you considered using the transaction command to accomplish this? It will automatically group events across segments (A->B->C) whose conversationId field has the same value. As a bonus, you also get the latency between the earliest and latest events in each transaction, which is reported in the duration field.

Since the sample data didn't come through, I'll just sketch the search:

sourcetype=source segment=A OR segment=B OR segment=C | transaction conversationId

There are lots of options for defining transactions, including how far apart the events can be from each other, the maximum time range for a group of events, which event marks the start or end of the transaction, etc. Details on the transaction command are in the Command Reference.
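
As a rough illustration of those options, the sketch below bounds each transaction by its first and last segment and caps its total span. The segment values A and C come from the question; the 5 minute maxspan is an arbitrary assumption and should be tuned to the real pipeline.

```
sourcetype=source
| transaction conversationId startswith=eval(segment=="A") endswith=eval(segment=="C") maxspan=5m
| table conversationId duration eventcount
```

Note that duration here is only the end-to-end time for each conversation. The transaction command does not break the latency down per segment.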

ftk
Motivator

hey, your sample log didn't show up in the question...
