Just looking for the best-practice solution to the problem below. I'm pretty new to Splunk, so I suspect the answer might be quite simple.
The problem:
Currently, about a million logs arrive at a location daily. At the end of every month, these logs are indexed and a report is generated from the search results. Because all thirty million logs are processed in one block, indexing takes a long time - and searching takes even longer.
The fix:
An ongoing job runs over the course of the month, indexing new logs as they arrive, searching them, and appending the results to one large XML, CSV, or similar file.
Possible implementations:
• Set an alert that triggers when new files to be indexed are detected, i.e. if they are not already indexed, index them, immediately run the search over just those new files, and append the resulting search data to the results file.
• Run tscollect daily on data not already collected into a .tsidx file to gather the relevant subset of the raw data, then process it in one block at end-of-month with the faster tstats to create the report.
• Simply set up a scheduled search (over the last 24 hours) to run daily after the logs are indexed, appending its results to the file.
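For the scheduled-search option, here is a minimal sketch of what the daily search might look like in savedsearches.conf. The stanza name, index, sourcetype, fields, and lookup filename are all placeholders, not anything from my actual environment:

```
# savedsearches.conf - hypothetical daily rollup search
[daily_log_rollup]
enableSched = 1
cron_schedule = 30 1 * * *
# search only the last full day's worth of events
dispatch.earliest_time = -24h@h
dispatch.latest_time = @h
# append the day's results to a running lookup file for the month-end report
search = index=my_logs sourcetype=my_sourcetype \
| stats count by host, status \
| outputlookup append=true monthly_results.csv
```

Since outputlookup with append=true keeps accumulating rows, the month-end report could presumably just read the lookup back with `| inputlookup monthly_results.csv`.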
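For the tscollect route, a rough sketch of the two pieces as scheduled searches, assuming it's split into a daily collection job and a monthly report (namespace, index, and field names are made up for the example):

```
# savedsearches.conf - hypothetical tscollect/tstats pair
# 1) Daily: pull the relevant subset of raw events into a tsidx namespace
[daily_tscollect]
enableSched = 1
cron_schedule = 0 2 * * *
dispatch.earliest_time = -24h@h
dispatch.latest_time = @h
search = index=my_logs sourcetype=my_sourcetype \
| tscollect namespace=monthly_subset

# 2) Monthly: report over the collected namespace with the faster tstats
[monthly_report]
enableSched = 1
cron_schedule = 0 3 1 * *
dispatch.earliest_time = -1mon@mon
dispatch.latest_time = @mon
search = | tstats count from monthly_subset groupby host
```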
Thanks for the help!