Hi
The concept of summary indexing is that some data loses its validity over time at the granularity it was originally collected at, and at a certain point you can summarize it and still have the answers you need. A great example of this would be CPU metrics. You collect them every 30 seconds (maybe) throughout the day. That equates to 2880 records per day for Splunk to retrieve. If you searched over 30 days, it would retrieve 86400 records.
That level of granularity is probably only relevant for a single day. After that, summarized statistics (maybe min, max & avg) per hour are good enough. Using summary indexing, you can run a search once a day that retrieves all the records (at 30-second intervals) and summarizes them on a per-hour basis, meaning that when you search over the summary index, you only need to retrieve 24 records per day. That is what makes Splunk more efficient, and performance improves as a result.
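To make that concrete, here is a minimal sketch of the once-a-day summarizing search, using the collect command to write into a summary index. The sourcetype, the pctCPU field and the cpu_summary index name are assumptions; substitute your own:

    index=wineventlog sourcetype=cpu earliest=-1d@d latest=@d
    | bin _time span=1h
    | stats min(pctCPU) AS min_cpu max(pctCPU) AS max_cpu avg(pctCPU) AS avg_cpu by host _time
    | collect index=cpu_summary

You would schedule this as a saved search running shortly after midnight; enabling summary indexing on the saved search (or using sistats instead of stats) achieves the same result. Reporting then runs against the small index, e.g.:

    index=cpu_summary | timechart span=1h avg(avg_cpu) by host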
You can play the same game over a longer time span too, creating per-day statistics for longer-term (months or years) trend analysis.
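Following the same pattern, a daily rollup could be built from the hourly summary rather than from the raw data (again, the field and index names are assumptions carried over from the sketch above):

    index=cpu_summary earliest=-1d@d latest=@d
    | bin _time span=1d
    | stats min(min_cpu) AS min_cpu max(max_cpu) AS max_cpu avg(avg_cpu) AS avg_cpu by host _time
    | collect index=cpu_summary_daily

One caveat: an average of hourly averages only matches the true daily average when every hour contains the same number of samples, which does hold for a fixed 30-second collection interval.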
The key point here is that you index all the data initially into your wineventlog index (as an example), then search that index hourly or daily and write a summary into the summary index. That also allows you to set a small retention period on the original index to minimize disk usage.
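If you do that, retention is controlled per index via the frozenTimePeriodInSecs setting in indexes.conf; a sketch with assumed index names and retention values:

    [wineventlog]
    # roll the raw 30-second data to frozen (deleted by default) after ~7 days
    frozenTimePeriodInSecs = 604800

    [cpu_summary]
    # keep the hourly summaries for ~1 year
    frozenTimePeriodInSecs = 31536000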
Hope this helps.