Dashboards & Visualizations

Hourly stats of all columns with multiple joins that include table and charting of data

rakeshreddy123
Engager

Hello, 

I have a query that gathers stats for all API calls: P90/P95/P99 response times, the response-time distribution in buckets (<1s, 1 to 3 seconds, up to >10s), and average and peak TPS. No matter what I try, I cannot get these to report hourly over the course of the last 24 hours. The query also uses multiple joins.

index=X
| eval eTime = responsetime
| stats count(responsetime) as TotalCalls, p90(responsetime) as P90Time, p95(responsetime) as P95Time, p99(responsetime) as P99Time by fi
| eval P90Time=round(P90Time,2) | eval P95Time=round(P95Time,2) | eval P99Time=round(P99Time,2)
| table TotalCalls, P90Time, P95Time, P99Time
| join type=left uri
    [ search index=X
      | eval pTime = responsetime
      | eval TimeFrames = case(pTime<=1, "0-1s%", pTime>1 AND pTime<=3, "1-3s%", pTime>3, ">3s%")
      | stats count as CallVolume by platform, TimeFrames
      | eventstats sum(CallVolume) as Total
      | eval Percentage=(CallVolume/Total)*100
      | eval Percentage=round(Percentage,2)
      | chart values(Percentage) over platform by TimeFrames
      | sort -TimeFrames ]
| join type=left uri
    [ search index=X
      | eval resptime = responsetime
      | bucket _time span=1s
      | stats count as TPS by _time, fi
      | stats max(TPS) as PeakTPS, avg(TPS) as AvgTPS by fi
      | eval AvgTPS=round(AvgTPS,2)
      | fields PeakTPS, AvgTPS ]

My stats currently look like this:

TotalCalls  P90Time  P95Time  P99Time  0-1s%  1-3s%  AvgTPS  Platform  PeakTPS
1565113     0.35     0.44     1.283    98.09  1.91   434.75  abc       937


I just need these stats every hour over the course of the last X days. I am only able to get certain columns of data; the chart in the first join and the fields in the second join are somehow messing it up.


Richfez
SplunkTrust

I've taken a few stares at that search, and I honestly think you are trying to do way too much in one search.

There are several giveaways:

a) There's different binning/bucketing for one of the pieces - though I think the entire thing could be binned to 1s intervals with no actual detriment.
b) The results are joined together using 'join', which is rarely the right command to use because Splunk Is Not A Relational DB.
c) And yet you sort of *have* to use join here, simply because of how convoluted it would be to pile all these results into one table.

But it's not necessarily the worst solution if (and only if!) there are only one or a few uri/fi values to actually do this for. E.g. if it's a 1-1 mapping for a few dozen items ... well, this is still inefficient, but it's not going to be the end of the world.

A slight improvement would be to use append to get all the outputs into one table, then a final "| stats ... by fi" (or uri?)  to get the three sets of results back into a single row per fi/uri.
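
Very roughly, and only as a sketch (assuming responsetime and fi are the real field names, and leaving the response-time-bucket percentages out to keep it short), that append shape might look something like:

index=X earliest=-24h
| bin _time span=1h
| stats count as TotalCalls, p90(responsetime) as P90Time, p95(responsetime) as P95Time, p99(responsetime) as P99Time by _time, fi
| append
    [ search index=X earliest=-24h
      | bucket _time span=1s
      | stats count as TPS by _time, fi
      | bin _time span=1h
      | stats max(TPS) as PeakTPS, avg(TPS) as AvgTPS by _time, fi ]
| stats values(*) as * by _time, fi

The final "stats values(*) as * by _time, fi" folds the appended rows back into one row per hour per fi. One caveat: append runs its subsearch under the normal subsearch limits, so it can silently truncate on large data sets - yet another reason the single-search and dashboard approaches below hold up better.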

Next best might be to rework the entire thing into one much bigger pile o' data, with a more complex stats that does all this work at once. You'd likely have to do some conditional counting inside the stats, and though it may look a little odd, it's still going to be about a hundred times more efficient than your existing search, especially as it scales up to lots of uri/fi's. https://docs.splunk.com/Documentation/Splunk/latest/Search/Usestatswithevalexpressionsandfunctions
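
To make that concrete - and strictly as a sketch, with the field names (responsetime, fi) and the 0-1s / 1-3s / >3s buckets lifted from your search - the conditional-counting version of the percentile and bucket part could look roughly like this:

index=X earliest=-24h
| bin _time span=1h
| stats count as TotalCalls,
        p90(responsetime) as P90Time, p95(responsetime) as P95Time, p99(responsetime) as P99Time,
        count(eval(responsetime<=1)) as Calls0to1s,
        count(eval(responsetime>1 AND responsetime<=3)) as Calls1to3s,
        count(eval(responsetime>3)) as CallsOver3s
        by _time, fi
| eval "0-1s%"=round(Calls0to1s/TotalCalls*100, 2),
       "1-3s%"=round(Calls1to3s/TotalCalls*100, 2),
       ">3s%"=round(CallsOver3s/TotalCalls*100, 2)
| eval P90Time=round(P90Time,2), P95Time=round(P95Time,2), P99Time=round(P99Time,2)

That's one pass over the data and one row per hour per fi. The TPS numbers are the one piece that doesn't fold in neatly, since they need the 1-second pre-binning first; that part would still be its own short two-stage stats (or its own dashboard panel, per the next suggestion).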

But I think the best option would be three simple searches with their results placed on a dashboard. You'll have simpler searches that are much faster, far more easily edited/adjusted as things change, and likely way more robust as well. You'll also have a LOT more ways to display the results.
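
For instance, the TPS panel of that dashboard could be as small as something like this (a sketch again - it assumes one event per API call and an fi field to split on, with the time range coming from the dashboard's time picker):

index=X
| bucket _time span=1s
| stats count as TPS by _time, fi
| bin _time span=1h
| stats max(TPS) as PeakTPS, avg(TPS) as AvgTPS by _time, fi
| eval AvgTPS=round(AvgTPS, 2)

That gives one row per hour per fi, which drops straight into a table or a line chart, and the percentile and bucket-percentage panels can each be equally small searches of their own.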

So, a few other suggestions: I'll bet you are a DBA, are you not? This looks like what someone familiar with any common RDBMS would do when dropped into Splunk-land. That's not a complaint or a slight; it's where lots of folks start. In that case, you might want to start here: https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/SQLtoSplunk

But even in those docs, Splunk still suggests using join to handle SQL's LEFT JOIN, which I think is a disservice. See Nick Mealy's talk: https://conf.splunk.com/watch/conf-online.html?search=mealy

(and if you can find the 2016-2018 version of that talk, I think it's even better and more detailed)
