Solved: caculating max count and time it occured

andersmholmgren · ‎12-11-2011

I have a query to calculate some hourly stats like

index=txndata | bucket _time span=1h | stats count as Volume, median(txnDuration) as MedianDuration, max(txnDuration) as MaximumDuration by _time, txnType

This gets put in another index called txnSummary

From this I need to know what the max Volume was per day (or other time period) and which hour it occured in. Similarly for max duration and max median duration.

Max volume I can calculate like

index=txnSummary earliest=-1d@d | bucket _time span=1h | stats count as Volume by _time, txnType | stats max(Volume) by txnType

That gets me the value but not the time it occured in.

To get the hour is the best thing to do a join like

index=txnSummary earliest=-1d@d | bucket _time span=1h | stats count as Volume by _time, txnType | join type=inner Volume,txnType [search index=txnSummary earliest=-1d@d | bucket _time span=1h | stats count as Volume by _time, txnType | stats max(Volume) as Volume by txnType]

and to pick just one then I can

| dedup txnType

This is similar to what you'd need to do in SQL. Is this the best approach in splunk?

I found another possibility by using eventstats but this may be a bit dodgy

index=txnSummary earliest=-1d@d | bucket _time span=1h | stats count as Volume by _time, txnType | eventstats max(Volume) as result  by txnType | where result = Volume | dedup txnType

I guess eventstats must be doing a similar subquery behind the scenes. The advantage to me is that I don't need to repeat the whole search in the subsearch, but it does seem like a hack.

Any other options?

kristian_kolb · ‎12-13-2011

Hi, maybe I misunderstood something in your question, but to me the answer is much simpler;

... | bucket _time span=1h | stats count as Volume by _time, txnType | sort - Volume | dedup txnType

This will give you a table of time, txnType and Volume, with only the top 1 count (Volume) per txnType.

time                    txnType    Volume
12/12/2011 05:00:00     B          102
12/12/2011 11:00:00     A          89
12/12/2011 08:00:00     C          73
12/12/2011 01:00:00     D          12

Hope this helps,

Kristian

View solution in original post

kristian_kolb · ‎12-13-2011

Hi, maybe I misunderstood something in your question, but to me the answer is much simpler;

... | bucket _time span=1h | stats count as Volume by _time, txnType | sort - Volume | dedup txnType

This will give you a table of time, txnType and Volume, with only the top 1 count (Volume) per txnType.

time                    txnType    Volume
12/12/2011 05:00:00     B          102
12/12/2011 11:00:00     A          89
12/12/2011 08:00:00     C          73
12/12/2011 01:00:00     D          12

Hope this helps,

Kristian

andersmholmgren · ‎12-13-2011

Thanks Kristian. That's much better. I hadn't thought of using dedup that way

caculating max count and time it occured

Announcing Scheduled Export GA for Dashboard Studio

Extending Observability Content to Splunk Cloud

More Control Over Your Monitoring Costs with Archived Metrics GA in US-AWS!