Splunk Search

How to find individual averages of each stages{}.duration by stages.name{} in Splunk for Jenkins app?

joelwizard
Explorer

I have some SPL that generates a table that looks like this for several builds of a job:

Prepare 1.003
Execute Test 44.544
Collate Results 556.44
Post 23.33


It outputs this for each build that matches the SPL query. I want to calculate the average stages{}.duration for each stages{}.name across all of the builds returned. When I use avg(stages{}.duration), it seems to average over all of the results in a way that isn't coherent.

For instance, I want to display a bar chart of the following table:

Prepare <Average of Prepare stage>
Execute Test <Average of Execute Test stage>
Collate Results <Average of Collate Results stage>
Post <Average of Post stage>
1 Solution

yuanliu
SplunkTrust

You do not have to share proprietary events.  But corporate guidelines will not prevent you from anonymizing data.  For volunteers to help, you should illustrate key features/structure/format of your data, replacing every string if needed.

Based on what you have divulged so far, I deduce that you have something like

{"stage": [{"name":"Prepare", "duration": 1.003}, {"name":"Execute Test", "duration": 44.544}, {"name":"Collate Results", "duration": 556.44}, {"name":"Post", "duration": 23.33}]}

If you illustrate your data like this, I don't think your corporate tzar will mind.  It would have saved volunteers a lot of time spent reading minds.

I am not sure what you mean by "mvexpand seems to do nothing."  If your raw event is anything like the above, spath + mvexpand is exactly the pathway to solution.

 

 

| spath path=stage{}
| mvexpand stage{}
| spath input=stage{}
| stats avg(duration) as avg_duration by name

 

 

Using that single data point, the output is

name              avg_duration
Collate Results   556.44
Execute Test      44.544
Post              23.33
Prepare           1.003

Below is a data emulation you can play with and compare with real data.

 

 

| makeresults
| eval _raw = "{\"stage\": [{\"name\":\"Prepare\", \"duration\": 1.003},
{\"name\":\"Execute Test\", \"duration\": 44.544},
{\"name\":\"Collate Results\", \"duration\": 556.44},
{\"name\":\"Post\", \"duration\": 23.33}]}"
``` data emulation above ```
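A single build cannot show averaging at work, so the following is a minimal sketch that extends the emulation to two builds and appends the same spath + mvexpand pipeline. The `build` counter and the synthetic durations are assumptions made purely for illustration; they are not taken from the real Jenkins events.

```
| makeresults count=2
| streamstats count as build
| eval _raw = "{\"stage\": [{\"name\":\"Prepare\", \"duration\": " . build . "}, {\"name\":\"Post\", \"duration\": " . (build * 10) . "}]}"
| spath path=stage{}
| mvexpand stage{}
| spath input=stage{}
| stats avg(duration) as avg_duration by name
```

Here Prepare has durations 1 and 2 and Post has durations 10 and 20, so the per-stage means to compare against are 1.5 and 15. If the real data returns one identical number for every stage instead, the stage objects were probably not split into separate rows before the stats command ran.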

 

 

 


joelwizard
Explorer

The SPL returns multiple jobs, yet stats with avg(stages{}.duration) reports the same average for every stage.

[Screenshot: joelwizard_0-1683650955462.png]

 


yuanliu
SplunkTrust

Can you illustrate your data and results, and explain in what way the output isn't coherent?  Maybe you have a syntax difficulty; for example, should you be using avg('stages{}.duration') instead?


joelwizard
Explorer

I am using avg(stages{}.duration) and the averages aren't making sense. They are all in the same range when I know there are stages that take very little time on average.
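The symptom described here can be reproduced with a small emulation. The sketch below assumes the events carry a JSON array named stages whose elements have name and duration keys; the two-stage event is invented for illustration. After a plain spath, both stages{}.name and stages{}.duration are multivalue fields on the same event, and stats has no way to tell which duration belongs to which name.

```
| makeresults
| eval _raw = "{\"stages\": [{\"name\":\"Prepare\", \"duration\": 1.003}, {\"name\":\"Post\", \"duration\": 23.33}]}"
| spath
| stats avg(stages{}.duration) as avg_duration by stages{}.name
```

Because the event is counted once under each name and every duration value in the event feeds the average, both rows are expected to show the same avg_duration, the mean of 1.003 and 23.33, rather than one value per stage. That would explain averages that all land in the same range.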


ITWhisperer
SplunkTrust

Without knowing what your actual events look like, it is not easy to suggest a solution.

Having said that, I am going to guess that your events contain more than one stage, each with a name and a duration. What you should try to do is split each event into multiple events, each with just one stage. You may be able to do this with spath (to extract at the stages level) and mvexpand (to create an event for each stage), then use stats to calculate your averages.

If this isn't enough to help you solve your issue, please be more specific and share your events and the SPL you have already tried.


joelwizard
Explorer

My jobs have multiple nested stages{} each with a name and a duration and children. Doing this:

spath path=stages{} output=extracted_stages

Gets me a new field with a block of stages. mvexpand seems to do nothing. If I generate a table with just the extracted_stages, I see the same table. Unfortunately, I can't really post the full copy/pasta because of corporate guidelines.

In my current example, I just want the duration of the "top-level" stages{}, not their children. I think the avg(stages{}.duration) is nonsensical: my initial stage, "Prepare", never takes more than a couple of seconds, yet its displayed average falls in the same range as all the other stages. That makes me think the average is somehow being coalesced across stages in a way that doesn't make sense.
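A minimal sketch of one way to keep only the top-level stages, assuming each element of stages{} is a JSON object with its own name and duration keys and with the children nested inside it. The output names stage_name and stage_duration are hypothetical, chosen only for this example. Passing an explicit path to the second and third spath calls pulls just the name and duration at the top of each expanded stage object, so values belonging to nested children are not extracted.

```
| spath path=stages{} output=extracted_stages
| mvexpand extracted_stages
| spath input=extracted_stages path=name output=stage_name
| spath input=extracted_stages path=duration output=stage_duration
| stats avg(stage_duration) as avg_duration by stage_name
```

The stats output has one row per stage name, which can be rendered directly as a bar chart. To check whether mvexpand is taking effect, compare the row count before and after it: with four top-level stages per build there should be four rows per build afterwards.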


