Getting Data In

How does maxVolumeDataSizeMB apply if you have a mix of volume-based and plain index paths?

yannK
Splunk Employee

I want to use Volumes in indexes.conf to limit the space used by my indexes.

On each index, I see 4 paths: homePath / coldPath / thawedPath / tstatsHomePath.
The last one seems to be used for accelerated data models or report accelerations.

How does this work?

  • I noticed that several paths are possible, and some of them (the summaries) already use volumes that happen to point to the default $SPLUNK_DB path.
    • Does a volume count other folders in its path that are not managed by Splunk?
    • Does a volume count other index folders in the same location if they use plain paths (instead of volumes)?
1 Solution

yannK
Splunk Employee

After testing and research, here are my conclusions:

  • Volume definitions are logical. When measuring a volume's size, Splunk only counts the size of the index paths (coldPath, homePath, thawedPath or tstatsHomePath) that are defined using this volume.

example :

[volume:testvolumeA] 
path = /mount/disk 
maxVolumeDataSizeMB=500 

[index1] 
homePath = volume:testvolumeA/index1/db 
coldPath = volume:testvolumeA/index1/colddb 
thawedPath = volume:testvolumeA/index1/thaweddb 
tstatsHomePath = volume:_splunk_summaries/defaultdb/datamodel_summary 

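In this case index1's homePath, coldPath and thawedPath are all considered to be on the same logical volume. (Note that indexes.conf.spec documents thawedPath as not accepting a volume reference, so check the spec for your version before defining thawedPath this way.)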

To enforce the volume size limit, only the index locations defined on this volume are summed up, and when a bucket has to be frozen, it will be one of the buckets stored in those locations.

  • Now consider the situation where several volumes point to the same path:

example :

[volume:testvolumeA] 
path = /mount/disk 
maxVolumeDataSizeMB=500 
[volume:testvolumeB] 
path = /mount/disk 
maxVolumeDataSizeMB=100 

[index1] 
homePath = volume:testvolumeA/index1/db 
coldPath = volume:testvolumeA/index1/colddb 
thawedPath = volume:testvolumeA/index1/thaweddb 
tstatsHomePath = volume:_splunk_summaries/defaultdb/datamodel_summary 

[index2] 
homePath = volume:testvolumeB/index2/db 
coldPath = volume:testvolumeB/index2/colddb 
thawedPath = volume:testvolumeB/index2/thaweddb 
tstatsHomePath = volume:_splunk_summaries/defaultdb/datamodel_summary 

The 2 volumes testvolumeA and testvolumeB are monitored as 2 separate entities, and each of them only measures the subfolders defined using that volume.

That means that if you enforce volume size limits, each volume applies its limit separately, to its own index folders.
In my example:
testvolumeA will keep its monitored subfolders under 500MB
testvolumeB will keep its monitored subfolders under 100MB
This means that the actual physical path /mount/disk can grow up to 500MB + 100MB = 600MB.

I think this will be the situation if you define your own volume pointing to $SPLUNK_DB, as the default _splunk_summaries volume also points there:

[volume:_splunk_summaries] 
path = $SPLUNK_DB 
[volume:summary] 
path = $SPLUNK_DB 

So you can size your volume limits so that their sum cannot fill your physical disk.
Or you can redefine all your paths to use a single volume, and manage the size globally.
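
As a rough sketch of the single-volume option, the two example indexes could share one volume so that a single limit covers both. The volume name sharedvolume and the 600MB value are assumptions made for illustration. thawedPath is written as a plain path because indexes.conf.spec documents it as not accepting a volume reference, and tstatsHomePath is left at its default.

```ini
# Hypothetical single volume replacing testvolumeA and testvolumeB
[volume:sharedvolume]
path = /mount/disk
# One global limit for every path that references this volume (assumed value)
maxVolumeDataSizeMB = 600

[index1]
homePath = volume:sharedvolume/index1/db
coldPath = volume:sharedvolume/index1/colddb
# Plain path, so not counted against sharedvolume.
# Assumes $SPLUNK_DB is not located under /mount/disk.
thawedPath = $SPLUNK_DB/index1/thaweddb

[index2]
homePath = volume:sharedvolume/index2/db
coldPath = volume:sharedvolume/index2/colddb
thawedPath = $SPLUNK_DB/index2/thaweddb
```

With this layout the limit applies to the combined home and cold data of both indexes. As far as I understand it, the bucket chosen for freezing can then come from either index, so a busy index can push out data from a quiet one.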

  • Another situation is when you mix a volume with a path.

example :

[volume:testvolumeA] 
path = /mount/disk 
maxVolumeDataSizeMB=500 

[index1] 
homePath = volume:testvolumeA/index1/db 
coldPath = volume:testvolumeA/index1/colddb 
thawedPath = volume:testvolumeA/index1/thaweddb 
tstatsHomePath = volume:_splunk_summaries/defaultdb/datamodel_summary 

[_internal] 
homePath = $SPLUNK_DB/_internaldb/db 
coldPath = $SPLUNK_DB/_internaldb/colddb 
thawedPath = $SPLUNK_DB/_internaldb/thaweddb 

In this case you will see a warning in splunkd.log for the _internal index's homePath, coldPath and thawedPath, as they do not reference a volume but are located under the same path as a volume.

example :

06-07-2017 16:27:01.976 -0700 WARN ProcessTracker - (child_6__Fsck) IndexConfig - idx=summary Path homePath='/mount/disk/_internaldb/db' (realpath '/mount/disk/_internaldb/db') is inside volume=testvolumeA (path='/mount/disk', realpath='/mount/disk'), but does not reference that volume. Space used by homePath will not be volume-mananged. Please check indexes.conf for configuration errors.
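
One possible way to address this warning, sketched under the assumption that $SPLUNK_DB resolves to /mount/disk as in the log line above, is to make the _internal home and cold paths reference the volume so that their space is counted against it:

```ini
[_internal]
# Reference the volume instead of the raw path so this space is volume-managed
homePath = volume:testvolumeA/_internaldb/db
coldPath = volume:testvolumeA/_internaldb/colddb
# Left as a plain path: indexes.conf.spec documents thawedPath as not
# accepting a volume reference, so a warning for it may still appear.
thawedPath = $SPLUNK_DB/_internaldb/thaweddb
```

Keep in mind that _internal would then share the 500MB limit with index1. If that is not wanted, another option is to point the volume at a directory that contains only volume-referenced paths.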

  • Why do the tstatsHomePath volumes not throw warnings like the others out of the box?

It appears that the warning only exists for homePath, coldPath and thawedPath; it does not exist for tstatsHomePath.
This is why we do not get those warnings on a vanilla Splunk install, as by default tstatsHomePath uses the volume:

[volume:_splunk_summaries] 
path = $SPLUNK_DB 
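
If the summaries themselves need a limit, a size can be set on that volume in a local indexes.conf. This is only a sketch and the 1000MB value is an arbitrary assumption. Following the conclusions above, the limit would cover only the tstatsHomePath directories that reference this volume, not the index data stored under $SPLUNK_DB with plain paths.

```ini
[volume:_splunk_summaries]
path = $SPLUNK_DB
# Assumed value for illustration; size it for your own disk
maxVolumeDataSizeMB = 1000
```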
