Splunk Search

Aggregating fields in JSON array

mbosse
Explorer

I'm relatively new to Splunk queries. I have an event that contains JSON and within the JSON data is an array. There's some data about a web page request, then an array of resources that make up the page, with some timing data for each resource.

I'd like to get things like the resource with the longest duration for a given request, and the average duration across all resources for a given request.

Here's the data:

{
  "ts": "2016-01-21T12:15:16.0451054-05:00",
  "id": "a95dc052-de57-45a5-be8b-4eee1b7a39ec",
  "url": "http://www.mysite.com/default.htm",
  "res":
  [
    {
      "r_tid": "a95dc052-de57-45a5-be8b-4eee1b7a39ec",
      "r_nm": "http://www.mysite.com/file1.css",
      "duration": 8.2
    },
    {
      "r_tid": "a95dc052-de57-45a5-be8b-4eee1b7a39ec",
      "r_nm": "http://www.mysite.com/file2.css",
      "duration": 7.731
    },
    {
      "r_tid": "a95dc052-de57-45a5-be8b-4eee1b7a39ec",
      "r_nm": "http://www.mysite.com/file1.js",
      "duration": 16.909
    }
  ]
}

And for max duration, something like this as the output:
r_nm, duration
http://www.mysite.com/file1.js, 16.909

and for the average, something like this:
url, average_duration
http://www.mysite.com/default.htm, 10.947

I've tried spath a number of different ways based on other posts here, but can't seem to get the right syntax. Any help would be greatly appreciated. The r_tid of each resource will always equal the id of the request, in case a join is needed to do this.

0 Karma
1 Solution

martin_mueller
SplunkTrust
SplunkTrust

Here's an idea:

| stats count | eval _raw = "{\"ts\": \"2016-01-21T12:15:16.0451054-05:00\",\"id\": \"a95dc052-de57-45a5-be8b-4eee1b7a39ec\",\"url\": \"http://www.mysite.com/default.htm\",\"res\": [{\"r_tid\": \"a95dc052-de57-45a5-be8b-4eee1b7a39ec\",\"r_nm\": \"http://www.mysite.com/file1.css\",\"duration\": 8.2}, {\"r_tid\": \"a95dc052-de57-45a5-be8b-4eee1b7a39ec\",\"r_nm\": \"http://www.mysite.com/file2.css\",\"duration\": 7.731}, {\"r_tid\": \"a95dc052-de57-45a5-be8b-4eee1b7a39ec\",\"r_nm\": \"http://www.mysite.com/file1.js\",\"duration\": 16.909}]}" 
| spath url | spath res{} output=res | mvexpand res | spath input=res

The first line generates a dummy event from your example; the second extracts the url, splits the event into one row per resource, and extracts the per-resource fields. (Side note: if you already have field extractions configured and see a url field, you can drop the first spath.) After that, append one of these two to get your statistics:

| stats avg(duration) by url
| stats max(duration) by r_nm
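
If you want the single longest resource per request (the first output shape you described, with the resource name alongside its duration), one option is to sort descending by duration and keep the first row per url. This is a sketch, assuming url, r_nm, and duration are all extracted as above:

| spath url | spath res{} output=res | mvexpand res | spath input=res
| sort - duration
| dedup url
| table url r_nm duration

Since dedup keeps the first matching row it sees, sorting by duration first leaves only the highest-duration resource for each url.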

mbosse
Explorer

I did a little more digging and troubleshooting and found that during my import I was getting a warning on entries that exceeded 10,000 bytes. I tracked down the config setting that controls that and increased the TRUNCATE value in props.conf. Everything is working well now. Thanks again.
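
For anyone hitting the same wall: the change described above would look something like this in props.conf on the indexing tier, where the stanza name is a placeholder for your actual sourcetype and the default TRUNCATE is 10000 bytes:

[my_json_sourcetype]
# Raise the per-event truncation limit so large JSON arrays aren't cut off
# (0 disables truncation entirely, which is riskier)
TRUNCATE = 100000

Events already indexed before the change stay truncated; only newly indexed data benefits.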

martin_mueller
SplunkTrust
SplunkTrust

Well done 🙂

0 Karma

mbosse
Explorer

Oh - forgot to mention, if I use the dummy data approach and put exactly the same data with the 91 resources in the JSON array as was imported, it seems to work fine. So this tells me it's likely a limitation on the import, the indexing, or on the search (or maybe a combination). So again, thanks for your idea. I think what you proposed is a sound approach and now I have a different issue to work through.

0 Karma

mbosse
Explorer

Thanks martin_mueller. When I used this on some real data, I had to add "| fields url, r_dur, r_nm" before the stats in order to actually get a value. I still don't fully understand when to include the fields clause, but that got me what I needed. However, in the real data, on the one request event I'm validating with, there are 91 resources in the JSON array. For some reason, only the first 34 in the array are being used to calculate the average or the max. I'll likely post another question in here specific to that. Otherwise, I believe what you provided is working with the addition of the fields clause. Thank you.
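
For anyone following along, the adjusted search would look something like this (r_dur is a field name from the real data rather than the example above, so treat it as a placeholder for your own duration field):

| spath url | spath res{} output=res | mvexpand res | spath input=res
| fields url, r_dur, r_nm
| stats avg(r_dur) by url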

0 Karma

paulwrussell
Explorer

I still can't get real data to display anything other than a big count. Adding fields didn't make a difference. What is fields meant to be doing in this scenario? Why would the eval-generated dummy event behave differently from real data?

0 Karma