Splunk Search

How do I create a table or other output of total process durations with multiple subprocess durations

wjblazek
Explorer

I have log data like this:
2015-04-22 14:10:02,351 [ACTIVE] PerfLogger [CCID] - Message: subprocess.name.1; Duration: 0 ms;
2015-04-22 14:10:02,351 [ACTIVE] PerfLogger [CCID] - Message: subprocess.name.2; Duration: 5 ms;
2015-04-22 14:10:02,351 [ACTIVE] PerfLogger [CCID] - Message: subprocess.name.3; Duration: 10 ms;
2015-04-22 14:10:02,351 [ACTIVE] PerfLogger [CCID] - Message: subprocess.name.4; Duration: 20204 ms;
2015-04-22 14:10:02,351 [ACTIVE] PerfLogger [CCID] - Message: subprocess.name.5; Duration: 100 ms;
2015-04-22 14:10:02,351 [ACTIVE] PerfLogger [CCID] - Message: subprocess.name.6; Duration: 647 ms;
2015-04-22 14:10:02,351 [ACTIVE] PerfLogger [CCID] - Message: subprocess.name.7; Duration: 899 ms;
2015-04-22 14:10:02,351 [ACTIVE] PerfLogger [CCID] - Message: subprocess.name.8; Duration: 399 ms;
2015-04-22 14:10:02,351 [ACTIVE] PerfLogger [CCID] - Message: subprocess.name.9; Duration: 411 ms;
. . .
2015-04-22 14:10:07,590 [ACTIVE] AfNDC [CCID] - Duration: 5239 ms Context: REST

This last statement is the total process duration

I would like a table or other output that shows all the subprocess durations for each total process duration.
I'm not picky about the format.
Something like this:
Subprocess Name Duration(ms) Total Process Duration(ms)
subprocess.name.1 0 5239
subprocess.name.2 5 5239

subprocess.name.3 10 5239
subprocess.name.4 20204 5239
subprocess.name.5 100 5239
subprocess.name.6 647 5239
subprocess.name.7 899 5239
subprocess.name.8 399 5239
subprocess.name.9 411 5239
The spacing above may not show up correctly.

So far I have this much with no problem:
search |
use rex to define a common field, CCID |
use rex to define subproccess field names and durations |
use rex to define total process duration |
use rex to define a few other identifying fields |
transaction CCID |
table URI,Processor,SegDuration,TotalDuration,RespCode

This give me nice transactions of
all the subprocess events with names and durations and the total duration event.
But all the subprocess names are in alphabetical order, in one field, possibly a multivalue field, and
the all the duration values in some other random order, not chronological, increasing or decreasing,
also in one field, possibly a multivalue field,
so that the names do not match with the duration values

How do I get the subprocess names to correspond to the correct values?
I'm also trying mvzip and mvexpand but without any luck.

Thanks in advance.

0 Karma
1 Solution

wjblazek
Explorer
  • You could try with max_match to capture multiple values once the transaction has been created. Yes! Thanks!

max_match is the answer!

Per the rex documentation page, under Optional Arguments:

max_match
Syntax: max_match=< int >
Description: Controls the number of times the regex is matched. If greater than 1, the resulting fields will be multivalued fields.
Defaults to 1, use 0 to mean unlimited.

As you suggested, define the fields with multiple events AFTER the transaction AND use the max_match argument on those fields:
search |
use rex to define a common field, CCID |
transaction CCID |
use rex to define:

subproccess field names and durations with max_match=0,
total process duration and

a few other identifying fields, URI, RespCode |
table URI,Processor,SegDuration,TotalDuration,RespCode

gives me exactly what I want, like in your 1st response.

View solution in original post

0 Karma

wjblazek
Explorer
  • You could try with max_match to capture multiple values once the transaction has been created. Yes! Thanks!

max_match is the answer!

Per the rex documentation page, under Optional Arguments:

max_match
Syntax: max_match=< int >
Description: Controls the number of times the regex is matched. If greater than 1, the resulting fields will be multivalued fields.
Defaults to 1, use 0 to mean unlimited.

As you suggested, define the fields with multiple events AFTER the transaction AND use the max_match argument on those fields:
search |
use rex to define a common field, CCID |
transaction CCID |
use rex to define:

subproccess field names and durations with max_match=0,
total process duration and

a few other identifying fields, URI, RespCode |
table URI,Processor,SegDuration,TotalDuration,RespCode

gives me exactly what I want, like in your 1st response.

0 Karma

ramdaspr
Contributor

Going with the info you have provided define the transaction first and then do the Rex for the duration and the subtransaction names.

search |
use rex to define a common field, CCID |
transaction CCID |
use rex to define subproccess field names and durations |
use rex to define total process duration |
use rex to define a few other identifying fields

If this an input to splunk, you should already have them broken into separate fields with input.conf so the below solution might not apply exactly.

However. tested out with the sample data you provided...

index=_internal | stats count| eval text="2015-04-22 14:10:02,351 [ACTIVE] PerfLogger [CCID] - Message: subprocess.name.1; Duration: 0 ms;
2015-04-22 14:10:02,351 [ACTIVE] PerfLogger [CCID] - Message: subprocess.name.2; Duration: 5 ms;
2015-04-22 14:10:02,351 [ACTIVE] PerfLogger [CCID] - Message: subprocess.name.3; Duration: 10 ms;
2015-04-22 14:10:02,351 [ACTIVE] PerfLogger [CCID] - Message: subprocess.name.4; Duration: 20204 ms;
2015-04-22 14:10:02,351 [ACTIVE] PerfLogger [CCID] - Message: subprocess.name.5; Duration: 100 ms;
2015-04-22 14:10:02,351 [ACTIVE] PerfLogger [CCID] - Message: subprocess.name.6; Duration: 647 ms;
2015-04-22 14:10:02,351 [ACTIVE] PerfLogger [CCID] - Message: subprocess.name.7; Duration: 899 ms;
2015-04-22 14:10:02,351 [ACTIVE] PerfLogger [CCID] - Message: subprocess.name.8; Duration: 399 ms;
2015-04-22 14:10:02,351 [ACTIVE] PerfLogger [CCID] - Message: subprocess.name.9; Duration: 411 ms;
2015-04-22 14:10:07,590 [ACTIVE] AfNDC [CCID] - Duration: 5239 ms Context: REST;
2015-04-22 14:10:03,351 [ACTIVE] PerfLogger [CCID11] - Message: subprocess.name.1; Duration: 0 ms;
2015-04-22 14:10:03,351 [ACTIVE] PerfLogger [CCID11] - Message: subprocess.name.2; Duration: 6 ms;
2015-04-22 14:10:03,351 [ACTIVE] PerfLogger [CCID11] - Message: subprocess.name.3; Duration: 11 ms;
2015-04-22 14:10:03,351 [ACTIVE] PerfLogger [CCID11] - Message: subprocess.name.4; Duration: 20205 ms;
2015-04-22 14:10:03,351 [ACTIVE] PerfLogger [CCID11] - Message: subprocess.name.5; Duration: 101 ms;
2015-04-22 14:10:03,351 [ACTIVE] PerfLogger [CCID11] - Message: subprocess.name.6; Duration: 647 ms;
2015-04-22 14:10:03,351 [ACTIVE] PerfLogger [CCID11] - Message: subprocess.name.7; Duration: 899 ms;
2015-04-22 14:10:03,351 [ACTIVE] PerfLogger [CCID11] - Message: subprocess.name.8; Duration: 399 ms;
2015-04-22 14:10:03,351 [ACTIVE] PerfLogger [CCID11] - Message: subprocess.name.9; Duration: 411 ms;
2015-04-22 14:10:08,690 [ACTIVE] AfNDC [CCID11] - Duration: 5243 ms Context: REST;
"| makemv text delim=";
" | mvexpand text| rex field=text "^(?<time>\S+\s\S+)" | eval _time = strptime(time, "%Y-%m-%d %H:%M:%S,%3N") | rex field=text ".*\s\[(?<proc>\S+)\]\s\-.*" | transaction proc | rex field=text ".*Message:\s(?<sname>\S+)\;.*" | rex field=text ".*\;\sDuration\:\s(?<sdur>\d+)\s.*" | rex field=text "\]\s\-\sDuration\:\s(?<tdur>\d+)\s.*" | table proc,sname,sdur,duration,tdur

gives me

Output

0 Karma

wjblazek
Explorer

Thanks much!
More ??s.

1st, that is an elegant way to input sample data to a search:
just assign it to a variable and then operate on that variable:
eval text="..." ...
A great example in and of itself.

I can cut/paste your example and get your exact results but
when I try to apply it to my scenario, it doesn't.

define the transaction first and then do the Rex for the duration and the subprocess names.

That is actually what I initially tried:
search |
use rex to define a common field, CCID |
transaction CCID |
use rex to define subproccess field names and durations,
total process duration and a few other identifying fields, URI, RespCode |
table URI,Processor,SegDuration,TotalDuration,RespCode

But this is only listing the
1st subprocess name and duration for each total duration,
not all the subprocess names and durations for each total duration.

I'm still working on understanding how and why your example works and
why my real search doesn't.

What does this do in your example:
rex field=text "^(?\S+\s\S+)" | eval _time = strptime(time, "%Y-%m-%d %H:%M:%S,%3N")
It doesn't look like it does anything but it doesn't work without it.

It looks like your 1st part:
index=_internal | stats count| eval text="...
"| makemv text delim=";
" | mvexpand text|

should be equivalent to
search |
use rex to define a common field, CCID |
transaction CCID |

Then define the fields and table.
But I only get the 1st subprocess name/duration, not all.
Still trying!

0 Karma

ramdaspr
Contributor

Unfortunately, this is the disadvantage of not having access to the same data/sources which you have..

Reg your question, the reason I am doing the rex is to capture the _time field and converting it into a format which Splunk can understand i.e. timestamp value.
I am guessing my example works because it is trying to generate the fields after the trasnaction is created whereas your data has them created already which is why you are getting only the first transaction name.
You could try with max_match to capture multiple values once the transaction has been created.

| rex field=text ".*\s\[(?<proc>\S+)\]\s\-.*" | transaction proc | rex field=text ".*Message:\s(?<sname>\S+)\;.*"  max_match=10| rex field=text ".*\;\sDuration\:\s(?<sdur>\d+)\s.*" max_match=10 | rex field=text "\]\s\-\sDuration\:\s(?<tdur>\d+)\s.*" max_match=10 | table proc,sname,sdur,duration,tdur
Get Updates on the Splunk Community!

Routing logs with Splunk OTel Collector for Kubernetes

The Splunk Distribution of the OpenTelemetry (OTel) Collector is a product that provides a way to ingest ...

Welcome to the Splunk Community!

(view in My Videos) We're so glad you're here! The Splunk Community is place to connect, learn, give back, and ...

Tech Talk | Elevating Digital Service Excellence: The Synergy of Splunk RUM & APM

Elevating Digital Service Excellence: The Synergy of Real User Monitoring and Application Performance ...