I know this is dependent on the variance of the data being indexed, but since the indexing mechanism is proprietary, I’d like some real-world numbers.
50% (a 2:1 ratio) is a safe bet. That covers not only the indexes but also the compressed raw log data, so allocating 5GB per day is reasonable, though I'd add at least 20% on top of that for growth.

Also, when talking storage, you'll want to consider the typical search window. The majority of searches are over "last 24 hours" or "last 7 days"; searches rarely go beyond that 7-day period. So keep roughly 40GB of local storage, as fast as possible, since that disk will handle collection, compression, indexing, and search for that short window. Older data can then be pushed out to slower storage.
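To make the arithmetic concrete, here's a rough sizing sketch based on the rules of thumb above. The 10GB/day raw ingest rate and 90-day retention are assumptions for illustration; plug in your own numbers.

```python
# Back-of-envelope storage sizing. The ingest rate and retention period
# below are example assumptions, not recommendations.
raw_per_day_gb = 10       # assumed raw log volume per day
retention_days = 90       # assumed total retention period

# 2:1 ratio: indexes plus compressed raw data come to ~50% of raw volume.
indexed_per_day_gb = raw_per_day_gb * 0.50

# Add at least 20% headroom for growth.
budget_per_day_gb = indexed_per_day_gb * 1.20

# Fast local ("hot") storage for the window most searches hit: last 7 days.
hot_storage_gb = budget_per_day_gb * 7

# Everything beyond the hot window can live on slower, cheaper storage.
total_storage_gb = budget_per_day_gb * retention_days

print(f"Daily budget: {budget_per_day_gb:.0f} GB/day")
print(f"Hot storage (7 days): {hot_storage_gb:.0f} GB")
print(f"Total for {retention_days} days: {total_storage_gb:.0f} GB")
```

At 10GB/day this works out to about 42GB of fast local storage, which lines up with the ~40GB figure above.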