I have some SPL that generates a table that looks like this for several builds of a job:
Prepare | 1.003 |
Execute Test | 44.544 |
Collate Results | 556.44 |
Post | 23.33 |
It outputs one such table for each build that matches the SPL query. I want to calculate the average stages{}.duration for each stages{}.name across all of the builds returned. When I use avg(stages{}.duration), it seems to average over all of the results in a way that isn't coherent.
For instance, I want to display a bar chart of the following table:
Prepare | <Average of Prepare stage> |
Execute Test | <Average of Execute Test stage> |
Collate Results | <Average of Collate Results stage> |
Post | <Average of Post stage> |
The SPL is returning multiple jobs, yet avg(stages{}.duration) comes out the same for every stage.
Can you illustrate your data and results, and explain in what way the averages aren't coherent? Maybe you have a syntax difficulty; for example, should you be using avg('stages{}.duration') instead?
I am using avg(stages{}.duration) and the averages aren't making sense. They are all in the same range when I know there are stages that take very little time on average.
Without knowing what your actual events look like, it is not easy to suggest a solution.
Having said that, I am going to guess that your events contain more than one stage, each with a name and a duration. What you should try to do is split each event into multiple events, each with just one stage. You may be able to do this with spath (to extract at the stages level) and mvexpand (to create an event for each stage), then use stats to calculate your averages.
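A minimal sketch of that approach follows. It assumes the array is called stages (going by your field names) and that each element has name and duration keys; the output field name stage is just a label I picked, and you would put your own base search in front.

```spl
| spath path=stages{} output=stage
| mvexpand stage
| spath input=stage
| stats avg(duration) as avg_duration by name
```

The reason for splitting first is that stages{}.name and stages{}.duration are both multivalue fields on the same event, so nothing ties a particular duration to a particular name. If you group by stages{}.name directly, every name in an event is paired with every duration in that event, which would explain averages that all land in the same range.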
If this isn't enough to help you solve your issue, please be more specific and share your events and the SPL you have already tried.
My jobs have multiple nested stages{} each with a name and a duration and children. Doing this:
spath path=stages{} output=extracted_stages
Gets me a new field with a block of stages. mvexpand seems to do nothing. If I generate a table with just the extracted_stages, I see the same table. Unfortunately, I can't really post the full copy/pasta because of corporate guidelines.
In my current example, I just want the duration of the "top-level" stages{}, not their children. I think the avg(stages{}.duration) is nonsensical, because I have an initial stage "Prepare" that never takes more than a couple of seconds, yet the average displayed for it is in the same range as all the other stages. That makes me think the durations are somehow being combined across stages before they are averaged.
You do not have to share proprietary events. But corporate guidelines will not prevent you from anonymizing data. For volunteers to help, you should illustrate key features/structure/format of your data, replacing every string if needed.
Based on what you have divulged so far, I deduce that you have something like
{"stage": [{"name":"Prepare", "duration": 1.003}, {"name":"Execute Test", "duration": 44.544}, {"name":"Collate Results", "duration": 556.44}, {"name":"Post", "duration": 23.33}]}
If you illustrate your data like this, I don't think your corporate tzar will mind. You would have saved volunteers tons of time reading minds.
I am not sure what you mean by "mvexpand seems to do nothing." If your raw event is anything like the above, spath + mvexpand is exactly the path to a solution.
| spath path=stage{}
| mvexpand stage{}
| spath input=stage{}
| stats avg(duration) as avg_duration by name
Using that single data point, the output is
name | avg_duration |
Collate Results | 556.44 |
Execute Test | 44.544 |
Post | 23.33 |
Prepare | 1.003 |
Below is a data emulation you can play with and compare with real data.
| makeresults
| eval _raw = "{\"stage\": [{\"name\":\"Prepare\", \"duration\": 1.003},
{\"name\":\"Execute Test\", \"duration\": 44.544},
{\"name\":\"Collate Results\", \"duration\": 556.44},
{\"name\":\"Post\", \"duration\": 23.33}]}"
``` data emulation above ```
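Since you mention several builds and nested children, here is a variation of the emulation with two builds, so the averaging actually has something to average. This is only a sketch: the key name children, the child stage Checkout, and all of the numbers are made up for illustration, and the array is still called stage as in my guess above.

```spl
| makeresults count=2
| streamstats count as build
| eval _raw = if(build=1,
    "{\"stage\": [{\"name\":\"Prepare\", \"duration\": 1.0, \"children\": [{\"name\":\"Checkout\", \"duration\": 0.4}]}, {\"name\":\"Post\", \"duration\": 20.0}]}",
    "{\"stage\": [{\"name\":\"Prepare\", \"duration\": 3.0, \"children\": [{\"name\":\"Checkout\", \"duration\": 1.2}]}, {\"name\":\"Post\", \"duration\": 30.0}]}")
| spath path=stage{}
| mvexpand stage{}
| spath input=stage{}
| stats avg(duration) as avg_duration by name
```

With these made-up values I would expect one row for Post averaging 20.0 and 30.0, and one row for Prepare averaging 1.0 and 3.0. The second spath should put the nested values into children{}.name and children{}.duration rather than name and duration, so the child stage does not show up when you group by name. If that holds for your real events, you get the top-level stages only without any extra filtering.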