Deployment Architecture

Bucket rotation and warm, cold...

rmorlen
Splunk Employee

I have a question about bucket rotation and the number of files in a bucket.

Here are our settings for index=main.

[main]
homePath = /splunkidx/defaultdb/db
coldPath = /splunkidx/defaultdb/colddb
thawedPath = /splunkidx/defaultdb/thaweddb
maxDataSize = auto_high_volume
maxTotalDataSizeMB = 400000
maxHotSpanSecs = 86400
frozenTimePeriodInSecs = 2592000
maxWarmDBCount = 30

The goal was to have 30 days' worth of data (give or take a day). With 86400 seconds = 1 day, that tells me a hot bucket should stay around for about 1 day, then roll to warm. With maxWarmDBCount = 30, that says stuff stays in warm for 30 days and then rolls to cold. frozenTimePeriodInSecs = 2592000 = 30 days, so data should get deleted "about" every 30 days. Worst case it sticks in cold for another 30 days, which means we really might have something like 60 days' worth of data. That still tells me we would have something close to 30-60 files in colddb.

We recently ran into a problem where there were 32,000 files in the colddb folder, which is a Linux filesystem limit, and that caused issues with buckets not rolling from warm to cold. How does this happen? Data coming in with bad date/time stamps (dates older than 30-60 days)?
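To spell out the retention arithmetic I'm assuming above:

maxHotSpanSecs         = 86400 seconds    = 1 day
frozenTimePeriodInSecs = 2592000 seconds  = 30 x 86400 = 30 days
maxWarmDBCount         = 30 warm buckets  = roughly 30 days, if each warm bucket covers about one day of data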

Looking at the docs for indexes.conf I see:

maxHotSpanSecs = positive integer

- Upper bound of timespan of hot/warm buckets in seconds.

- Defaults to 7776000 seconds (90 days).

- NOTE: If you set this too small, you can get an explosion of hot/warm
  buckets in the filesystem.

- If you set this parameter to less than 3600, it will be automatically reset to
  3600, which will then activate snapping behavior (see below).

- This is an advanced parameter that should be set
  with care and understanding of the characteristics of your data.

- If set to 3600 (1 hour), or 86400 (1 day), becomes also the lower bound
  of hot bucket timespans.  Further, snapping behavior (i.e. ohSnap)
  is activated, whereby hot bucket boundaries will be set at exactly the hour
  or day mark, relative to local midnight.

- Highest legal value is 4294967295

maxHotIdleSecs = nonnegative integer

- Maximum life, in seconds, of a hot bucket.

- If a hot bucket exceeds maxHotIdleSecs, Splunk rolls it to warm.

- This setting operates independently of maxHotBuckets, which can also cause hot buckets to roll.

- A value of 0 turns off the idle check (equivalent to infinite idle time).

- Defaults to 0.

- Highest legal value is 4294967295

Should we be looking at using maxHotIdleSecs?

My questions are:

  1. How are we getting anything near 32,000 files in colddb?
  2. What settings should we be using to keep "about" 30 days worth of data?

Thanks,
Randy


lguinn2
Legend

I agree with Kristian. I would remove maxHotSpanSecs = 86400; this is not something that you should normally set. Plus, this setting is not based on the time when the data is indexed; it is based on the actual timestamp of the events. So events with different dates would be in different buckets, even if they arrived at the indexer at the same time.

Since you have set frozenTimePeriodInSecs, you should not need to do anything else.

However, if you want to fine tune further, you could set maxDataSize to the size of a single day's data (but not less than 750 MB). This would not guarantee a bucket per day, because Splunk optimizes the placement of data in buckets to speed searching. Also, Splunk is usually working with multiple hot buckets simultaneously. But it might help. You can set maxHotIdleSecs, but I would not set it lower than 86400.

A hot bucket rolls to warm when (1) Splunk restarts, (2) it fills, (3) it receives no new data for maxHotIdleSecs, (4) Splunk needs to open a new bucket and it is the oldest open bucket, or (5) maybe other reasons that I don't know about... My guess in this case is that maxHotSpanSecs caused the problem. Data arriving with inconsistent timestamps could certainly be part of the problem, too.
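For reference, a stanza along these lines would reflect that advice (paths and sizes copied from your post; treat it as a sketch to adapt rather than a tested config):

[main]
homePath = /splunkidx/defaultdb/db
coldPath = /splunkidx/defaultdb/colddb
thawedPath = /splunkidx/defaultdb/thaweddb
maxDataSize = auto_high_volume
maxTotalDataSizeMB = 400000
frozenTimePeriodInSecs = 2592000
# maxHotSpanSecs and maxWarmDBCount removed so they fall back to their defaults

With frozenTimePeriodInSecs = 2592000 (30 days), a bucket is frozen (deleted, since no coldToFrozenDir is set) once its newest event is older than 30 days.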


rmorlen
Splunk Employee

Thanks. I will do that.

kristian_kolb
Ultra Champion

My guess:

2) Do not mess with anything other than frozenTimePeriodInSecs - just leave the rest at the defaults.
