Splunk Search

Report hourly max count events per day over a month

egrignon
Explorer

Hello,

I m trying to get the hour per day which gets the most hits on my application over a month but having some issues to get the right data output.

I would like to get a table report which would have:

DAY1 HOURX MaxEventNumber

DAY2 HOURX MaxEventNumber

....

I tried the following queries but none of them work. I think I m missing something on splunk reporting concepts:

  • | timechart span=1h count as HourlyCount | timechart span=1d max(HourlyCount)
  • | stats count AS hit BY date_hour, date_mday | stats max(hit) BY date_hour, date_mday
  • | timechart span=1h count | convert timeformat="%D" ctime(_time) AS c_time | chart max(count) AS MaxPerDay by c_time | table MaxPerDay date_hour

Any help would be greatly appreciated,

Thanks,

EG

1 Solution

lguinn2
Legend

Try this:

| stats count as hit by date_hour, date_mday 
| eventstats max(hit) as maxhit by date_mday 
| where hit=maxhit
| fields - maxhit

I am not sure it will work. But it should figure out the max hits for each day, and only keep the events with that have have the maximum number.

New and better solution, which works for for all events - not just those with date_* fields:

yoursearchhere
| timechart count as hit span=1h
| eval date=strftime(_time, "%x")
| eventstats max(hit) as maxhit by date
| where hit=maxhit
| fields - maxhit

View solution in original post

dwaddle
SplunkTrust
SplunkTrust

mmm, old post!

I like to use a slightly different approach. It is very clear, and pretty cheap too. You can pipe one timechart into another, using different time resolutions. Like:

sourcetype=access_combined GET OR POST 
| timechart span=1h count as count 
| timechart span=1d max(count) as max

The first timechart gives you a count by hour, piped into the second one which pulls out only the biggest hour per day. The part you lose is what hour of the day was the busy hour. If which hour was the busiest hour is not important to you, then this is a quick and easy approach.

lguinn2
Legend

I like this. Also, I just realized that I would give a different answer at this point!

0 Karma

cygnetix
Path Finder

Simple, yet effective. Thanks.

0 Karma

egrignon
Explorer

Thank you very much both of you.

It seems both solutions work. Do you know which method is best practice or best from a performance perspective?

EG

0 Karma

lguinn2
Legend

With multiple indexers and/or lots of data, I think that eventstats may be slightly more efficient.

You can always run both and compare - use the Search Job Inspector (the 'i' button in 4.3, and under the Actions menu in earlier versions).

0 Karma

sideview
SplunkTrust
SplunkTrust

restating your problem to help clarify. I think you want to end up with a table where each day is only represented by one row, and that row represents the busiest hour during that day.

where the fields in the table are 'day of month', 'hour of day' and 'event count'

Here's the search you need:

<your search terms> | stats count by date_hour date_mday | sort - date_mday count | streamstats count as perMdayCount by date_mday | where perMdayCount=1

it's weird so lets walk it through step by step. You can run the query piece by piece and look at what the results look like at each pipe character (this is the best way to understand every complex search).

The stats clause gives you the count of events rolled up by unique combinations of day-of-month and hour-of-day.

Next the sort clause sorts it primarily by the day of month. But then within those the highest counts will be first in that subset.

Next we use streamstats count, with a very important by date_mday, to actually paint little numbers on each event. If streamstats with a by clause makes your head hurt that's totally normal. Just play with it and you'll understand what it does. In this case since within each subset of a given day-of-month we just sorted the date_hours such that the highest count hour ends up first in the set, these highest count hours all get painted with a 'perMdayCount' field whose value is "1".

Which ends up being marvelously convenient because now we can just use a where clause to match only rows where perMdayCount is "1", and this conveniently throws away all the hours that were not the highest count hour in that given day.

lguinn2
Legend

Try this:

| stats count as hit by date_hour, date_mday 
| eventstats max(hit) as maxhit by date_mday 
| where hit=maxhit
| fields - maxhit

I am not sure it will work. But it should figure out the max hits for each day, and only keep the events with that have have the maximum number.

New and better solution, which works for for all events - not just those with date_* fields:

yoursearchhere
| timechart count as hit span=1h
| eval date=strftime(_time, "%x")
| eventstats max(hit) as maxhit by date
| where hit=maxhit
| fields - maxhit

splunk_zen
Builder

Iguinn,
I've a similar requirement but having trouble joining the search and subsearch fields with the where clause.
Can you please take a look at http://splunk-base.splunk.com/answers/72928/subsearch-busy-hour-cpu-usage ?

0 Karma

ambujhbti
New Member

Thank you. This help me with my work ! Cheers 🙂

0 Karma
Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...