Hello,
I went through a few forum posts and the Splunk documentation on retention settings, but I'm still not 100% clear on which properties are needed and what their values should be. I would greatly appreciate everyone's help with this topic.
For example: let's say that, on average, Index X stores 1 GB of data per day, and we want to keep the data in Hot/Warm for 5 days and in Cold for 365 days. Will the properties below achieve that retention goal?
We have 3 nodes with RF and SF set to 3. The properties below were generated by the sizing app.
[X]
homePath = volume:primary/X/db
coldPath = volume:secondary/X/colddb
homePath.maxDataSizeMB = 2559
--> ~2.5 GB
--> Specifies the maximum size of the directory <>/X/db. If this size is exceeded, Splunk moves the buckets with the oldest value of latest time (for a given bucket) into the cold DB until homePath (<>/X/db) is back below the maximum size.
coldPath.maxDataSizeMB = 184319
--> ~180 GB (184319 MB / 1024)
--> Specifies the maximum size of the directory <>/X/colddb. If this size is exceeded, Splunk freezes the buckets with the oldest value of latest time (for a given bucket) until coldPath is back below the maximum size.
maxWarmDBCount = 100
--> The maximum number of warm buckets. The default is 300, but we want to limit it to 100.
frozenTimePeriodInSecs = 31536000
--> 365 days
--> Number of seconds after which indexed data rolls to frozen. If you do not specify a coldToFrozenScript, data is deleted when it rolls to frozen.
maxDataSize = auto
--> The maximum size in MB for a hot DB to reach before a roll to warm is triggered. With "auto", this is 750 MB per bucket.
maxHotSpanSecs=432000
--> 5 days
--> Upper bound of timespan of hot/warm buckets in seconds.
Thanks!
If you want to keep 1 GB/day of ingestion for 370 days (5 days in hot/warm plus 365 days in cold), then your settings need to be a little different.
# 5 GB for 5 days of hot storage
homePath.maxDataSizeMB = 5120
# 365 GB to store 365 days of 1 GB/day
coldPath.maxDataSizeMB = 373760
# 370 days = 5 days as hot + 365 days as cold
frozenTimePeriodInSecs = 31968000
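For anyone who wants to redo this arithmetic for a different ingestion rate, here is a rough Python sketch of how the three numbers above come out. The function name retention_settings is just something made up for illustration, and it assumes 1 GB = 1024 MB and no compression at all (raw size equals on-disk size).

```python
MB_PER_GB = 1024
SECONDS_PER_DAY = 86400

def retention_settings(gb_per_day, hot_warm_days, cold_days):
    """Hypothetical helper: size-based and time-based limits, ignoring compression."""
    return {
        # Space needed to hold the hot/warm days
        "homePath.maxDataSizeMB": gb_per_day * hot_warm_days * MB_PER_GB,
        # Space needed to hold the cold days
        "coldPath.maxDataSizeMB": gb_per_day * cold_days * MB_PER_GB,
        # Data freezes after hot/warm days plus cold days have passed
        "frozenTimePeriodInSecs": (hot_warm_days + cold_days) * SECONDS_PER_DAY,
    }

settings = retention_settings(gb_per_day=1, hot_warm_days=5, cold_days=365)
for key, value in settings.items():
    print(f"{key} = {value}")
```

With 1 GB/day, 5 hot/warm days, and 365 cold days, this gives the same 5120, 373760, and 31968000 as the settings above.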
Hi Rich, thank you for your prompt response.
I was using the sizing app, which takes the raw size and factors in compression and metadata overhead, so the numbers were off (e.g., 1 GB/day for 5 days resulted in 2559 MB instead of 5 GB). Based on your output, though, the calculation looks straightforward.
Thanks!
I did not account for compression in my storage numbers.
You still need the larger frozenTimePeriodInSecs value, though.
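If you do want to account for compression, a sketch like the one below might help. The 0.5 ratio is only an assumption, guessed from the sizing app's 2559 MB being about half of 5120 MB; the real on-disk ratio depends on your data. The function name on_disk_mb is made up for illustration. Note that frozenTimePeriodInSecs is time-based, so compression does not change it.

```python
import math

MB_PER_GB = 1024
SECONDS_PER_DAY = 86400

# ASSUMPTION: on-disk size is about half of raw size. Measure your own index.
ASSUMED_DISK_RATIO = 0.5

def on_disk_mb(raw_gb_per_day, days, disk_ratio):
    """Hypothetical helper: estimated on-disk MB for the given number of days."""
    return math.ceil(raw_gb_per_day * days * MB_PER_GB * disk_ratio)

home_mb = on_disk_mb(1, 5, ASSUMED_DISK_RATIO)     # hot/warm: 5 days
cold_mb = on_disk_mb(1, 365, ASSUMED_DISK_RATIO)   # cold: 365 days
frozen_secs = (5 + 365) * SECONDS_PER_DAY          # unaffected by compression

print(f"homePath.maxDataSizeMB = {home_mb}")
print(f"coldPath.maxDataSizeMB = {cold_mb}")
print(f"frozenTimePeriodInSecs = {frozen_secs}")
```

Under that assumed ratio, the home and cold sizes come out to 2560 MB and 186880 MB, while the frozen period stays at 31968000 seconds.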