Controlling the hot bucket size in Splunk - indexes.conf

splunker12er
Motivator

1. Controlling the size of a hot bucket:

 maxDataSize = auto | auto_high_volume

auto = 750 MB
auto_high_volume = 10 GB

maxHotSpanSecs = <positive integer>

default value = 7776000 seconds (90 days)

  • If I set both parameters, which one takes precedence?

  • Does raising the value of maxDataSize require tuning other parameters accordingly?

2. Performance of search queries

When I set maxDataSize = auto_high_volume (10 GB), a hot bucket grows to 10 GB before it rolls, so the data stays in the hot bucket until then. Will searches be faster because the data is still in a hot bucket?

1 Solution

lguinn2
Legend

maxDataSize controls the maximum size of a single bucket, and maxHotSpanSecs controls the maximum timespan of a bucket. Neither takes precedence: whenever either limit is reached, the hot bucket rolls to warm. Therefore, you will have some buckets that are smaller than maxDataSize because they hit the timespan limit first, and others that rolled because they were full (they hit maxDataSize) even though their timespan is less than maxHotSpanSecs.
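The "whichever limit is hit first" rule can be sketched in a few lines of Python. This is only a simplified model of the behavior described above, not Splunk's actual implementation; `HotBucket` and `roll_reason` are hypothetical names, and the constants are the values quoted in this thread. When both limits are exceeded at once, the order of the checks here is arbitrary.

```python
from dataclasses import dataclass
from typing import Optional

# Values quoted in this thread, used as illustrative constants.
AUTO_BYTES = 750 * 1024**2               # maxDataSize = auto
AUTO_HIGH_VOLUME_BYTES = 10 * 1024**3    # maxDataSize = auto_high_volume
DEFAULT_MAX_HOT_SPAN_SECS = 90 * 86400   # 90 days


@dataclass
class HotBucket:
    """Hypothetical stand-in for a hot bucket's current state."""
    size_bytes: int  # data written to the bucket so far
    span_secs: int   # latest event time minus earliest event time


def roll_reason(bucket: HotBucket, max_data_size: int,
                max_hot_span_secs: int) -> Optional[str]:
    """Return the limit that would make the bucket roll, or None if it stays hot."""
    if bucket.size_bytes >= max_data_size:
        return "maxDataSize"
    if bucket.span_secs >= max_hot_span_secs:
        return "maxHotSpanSecs"
    return None


# A small bucket that already spans more than 90 days rolls on the timespan limit.
old_but_small = HotBucket(size_bytes=50 * 1024**2, span_secs=91 * 86400)
print(roll_reason(old_but_small, AUTO_BYTES, DEFAULT_MAX_HOT_SPAN_SECS))
```

Other conditions can also roll a hot bucket in a real deployment; the sketch covers only the two settings discussed here.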

Generally, you should not need to tune other parameters. However, I would not set a bucket larger than 10 GB without careful thought. Searching many small buckets is slow, but a very large bucket that contains many days of data is also inefficient.

Most searches in Splunk are run on timespans of 24 hours or less. If that is your case, you may want to size the buckets so that they roll about once a day. Again, avoid buckets smaller than 750 MB or larger than 10 GB.
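To judge whether a given combination of settings rolls buckets roughly once a day, a rough estimate like the following may help. It is a back-of-the-envelope sketch that assumes a steady ingest rate into a single hot bucket on one indexer; `estimate_roll_interval_hours` and the sample volumes are made up for illustration.

```python
GIB = 1024**3


def estimate_roll_interval_hours(daily_index_bytes: float,
                                 max_data_size_bytes: float,
                                 max_hot_span_secs: float) -> float:
    """Estimate hours until a hot bucket rolls, assuming a steady ingest rate.

    The bucket rolls at whichever comes first: the time needed to fill
    max_data_size_bytes, or the max_hot_span_secs timespan limit.
    """
    if daily_index_bytes <= 0:
        raise ValueError("daily_index_bytes must be positive")
    hours_to_fill = max_data_size_bytes / daily_index_bytes * 24
    hours_to_span = max_hot_span_secs / 3600
    return min(hours_to_fill, hours_to_span)


# 5 GiB/day into a 10 GiB bucket with a one-day span: the span limit wins.
print(estimate_roll_interval_hours(5 * GIB, 10 * GIB, 86400))   # 24.0

# 20 GiB/day into the same bucket: it fills up after half a day.
print(estimate_roll_interval_hours(20 * GIB, 10 * GIB, 86400))  # 12.0
```

Real ingest is rarely steady, and an index can have more than one hot bucket open at a time, so treat the result as a starting point and compare it against the buckets you actually see.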

fredclown
Contributor

Would it be logical then to set the indexes up like this ...

maxDataSize = auto_high_volume
# one day
maxHotSpanSecs = 86400

... so that the bucket doesn't get too big but also rolls at least daily?
