The setup is like this...
index=myindex myfield=*FOO* | timechart span=1h count by myfield
Where myfield's values have a few variations. I want to get counts of those values over time in a few different ways for a "summary dashboard":
Today's count by hour
This week's count by day
This month's count by week
This environment logs millions of events a day, and each event will have myfield with a value. The docs are lacking, and I'm not sure if there's a fancier way to do this besides the "brute force" approach of three different accelerated searches with different timechart span=XYZ values and earliest/latest set. I tried to build a "master summary" using eventstats so I could pipe it into timechart after the fact (I used eventstats because there were a few other fields besides myfield that I wanted to try in charts as well), but eventstats can't be accelerated.
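For context, the brute-force version I have in mind is three separate accelerated searches along these lines (the earliest/latest values are my guesses at the right snap-to syntax):
index=myindex myfield=*FOO* earliest=@d latest=now | timechart span=1h count by myfield
index=myindex myfield=*FOO* earliest=@w0 latest=now | timechart span=1d count by myfield
index=myindex myfield=*FOO* earliest=@mon latest=now | timechart span=1w count by myfield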
TL;DR: I'm lost on the best way to show counts of events over massive amounts of data. I need an "explain like I'm 5" to-do list.
Without spending a lot of time, you can use accelerated searches here, and aggregate them.
Base search
index=myindex myfield=*FOO* | timechart span=1h count by myfield
And your aggregating searches can look like
Daily
| savedsearch mysavedsearch1 | timechart span=1d count by myfield
Weekly
| savedsearch mysavedsearch1 | timechart span=1w count by myfield
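To make the base search available to | savedsearch, save and schedule it as a report. A minimal savedsearches.conf sketch, assuming the name mysavedsearch1 from the examples above (the cron schedule and time window are illustrative; auto_summarize turns on report acceleration):
[mysavedsearch1]
search = index=myindex myfield=*FOO* | timechart span=1h count by myfield
enableSched = 1
cron_schedule = 0 * * * *
dispatch.earliest_time = -1mon@mon
dispatch.latest_time = now
auto_summarize = 1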
Eventstats probably isn't what you want to use. Eventstats looks at every event and transforms it. If you don't need to modify individual events, you can pipe to stats instead of timechart.
Multi-field base search
index=myindex myfield=*FOO* | bin _time span=1m | stats count by _time myfield myfield2 myfield3 myfieldN
Note that with the stats command we have to use the bin command for our time slicing. These results can then be aggregated downstream to wider timespans; e.g., after the 1m bin we can roll it up to 1h, 1w, or 1mon.
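A sketch of that downstream roll-up, assuming the 1-minute-binned search above is saved as mysavedsearch2 (a placeholder name): re-bin the rows and sum the pre-computed counts instead of recounting raw events.
| savedsearch mysavedsearch2 | bin _time span=1d | stats sum(count) as count by _time myfield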
Data models are also an option, and maybe a better one: data model acceleration backfills its summaries automatically, whereas scheduled saved searches leave gaps when they don't run.
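If you go the data model route, the dashboard panels query the accelerated summaries with tstats. A rough sketch, assuming a data model named MyDM with myfield mapped into it (all names here are placeholders):
| tstats count from datamodel=MyDM where MyDM.myfield=*FOO* by _time span=1h, MyDM.myfield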
Thanks, this was very helpful. Second question: for the base search, what should I put for earliest and latest? My guess is that since I'm doing weekly/daily/hourly views, the base search should be -1mon@mon to now?
Additional question: the summary queries that use the base search show NULL where I would expect myfield values to be. Is that because the base search isn't done running yet, or is something wrong with the query?
The queries I'm using are identical except myfield is the actual field name.
If it's null, it's not returning a value. Use loadjob or the Job Inspector to load the search results and check that they look as expected. You might also adjust the base search to fill null values with something you'll recognize as not a valid value.
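Two quick checks along those lines: load the saved search's last results with loadjob (the user and app in the reference are placeholders), and fill nulls in the base search with an obviously bogus value.
| loadjob savedsearch="admin:search:mysavedsearch1"
index=myindex myfield=*FOO* | fillnull value="MISSING" myfield | timechart span=1h count by myfield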
For your search windows and schedules, match the span each search aggregates to: run the base search every 1h, the daily search every 24h, and so on.
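In savedsearches.conf terms that works out to something like this (the roll-up search names are placeholders; cron values are illustrative):
[mysavedsearch1]
cron_schedule = 0 * * * *
[my_daily_rollup]
cron_schedule = 5 0 * * *
[my_weekly_rollup]
cron_schedule = 10 0 * * 0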