The setup is like this...
index=myindex myfield=*FOO* | timechart span=1h count by myfield
Where myfield's values have a few variations. I want to get counts of those values over time in a few different ways for a "summary dashboard":
Today's count by hour
This week's count by day
This month's count by week
This environment logs millions of events a day, and each event will have myfield with a value. The docs are lacking, and I'm not sure if there's a fancier way to do this besides the "brute force" approach of three different accelerated searches with different timechart span=XYZ values and earliest/latest set. I tried to build a "master summary" using eventstats so I could pipe it into timechart after the fact (I used eventstats because there were a few other fields besides myfield that I wanted to try in charts as well), but eventstats can't be accelerated.
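For context, the brute-force version I have in mind is three separate accelerated searches along these lines (the earliest/latest values are my guesses at the right snap-to syntax):
index=myindex myfield=*FOO* earliest=@d latest=now | timechart span=1h count by myfield
index=myindex myfield=*FOO* earliest=@w0 latest=now | timechart span=1d count by myfield
index=myindex myfield=*FOO* earliest=@mon latest=now | timechart span=1w count by myfield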
TL;DR: I'm lost on the best way to show counts of events over massive amounts of data. I need an "explain like I'm 5" to-do list.
Without spending a lot of time, you can use accelerated searches here, and aggregate them.
Base search
index=myindex myfield=*FOO* | timechart span=1h count by myfield
And your aggregating searches can look like
Daily
| savedsearch mysavedsearch1 | timechart span=1d count by myfield
Weekly
| savedsearch mysavedsearch1 | timechart span=1w count by myfield
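To make the base search available to | savedsearch, save and schedule it as a report. A minimal savedsearches.conf sketch, assuming the name mysavedsearch1 from the examples above (the cron schedule and time window are illustrative; auto_summarize turns on report acceleration):
[mysavedsearch1]
search = index=myindex myfield=*FOO* | timechart span=1h count by myfield
enableSched = 1
cron_schedule = 0 * * * *
dispatch.earliest_time = -1mon@mon
dispatch.latest_time = now
auto_summarize = 1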
Eventstats probably isn't what you want to use. Eventstats looks at every event and transforms it. If you don't need to modify individual events, you can pipe to stats instead of timechart.
Multi-field base search
index=myindex myfield=*FOO* | bin _time span=1m | stats count by _time myfield myfield2 myfield3 myfieldN
Note that with the stats command we have to use the bin command for our time slicing. These results can then be aggregated downstream to wider timespans; e.g., after the 1m bin we can roll it up to 1h, 1w, or 1mon.
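A sketch of that downstream roll-up, assuming the 1-minute-binned search above is saved as mysavedsearch2 (a placeholder name): re-bin the rows and sum the pre-computed counts instead of recounting raw events.
| savedsearch mysavedsearch2 | bin _time span=1d | stats sum(count) as count by _time myfield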
Data models are also an option, and maybe a better one: data model acceleration backfills its summaries automatically, whereas scheduled saved searches leave gaps when they don't run.
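If you go the data model route, the dashboard panels query the accelerated summaries with tstats. A rough sketch, assuming a data model named MyDM with myfield mapped into it (all names here are placeholders):
| tstats count from datamodel=MyDM where MyDM.myfield=*FOO* by _time span=1h, MyDM.myfield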
Thanks, this was very helpful. Second question: for the base search, what should I put for earliest and latest? My guess is that since I'm doing weekly/daily/hourly views, the base search should be -1mon@mon to now?
Additional question: the summary queries that use the base search show NULL where I would expect myfield values to be. Is that because the base search isn't done running yet, or is something wrong with the query?
The queries I'm using are identical except myfield is the actual field name.
If it's null, it's not returning a value. Use loadjob or the Job Inspector to load the search results and check that they look as expected. You might also adjust the base search to fill null values with something you'll recognize as not a valid value.
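Two quick checks along those lines: load the saved search's last results with loadjob (the user and app in the reference are placeholders), and fill nulls in the base search with an obviously bogus value.
| loadjob savedsearch="admin:search:mysavedsearch1"
index=myindex myfield=*FOO* | fillnull value="MISSING" myfield | timechart span=1h count by myfield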
For your search windows and schedules, match the span each search aggregates to: run the base search every 1h, the daily search every 24h, and so on.
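In savedsearches.conf terms that works out to something like this (the roll-up search names are placeholders; cron values are illustrative):
[mysavedsearch1]
cron_schedule = 0 * * * *
[my_daily_rollup]
cron_schedule = 5 0 * * *
[my_weekly_rollup]
cron_schedule = 10 0 * * 0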