How does the volume size limit maxVolumeDataSizeMB apply if you mix volumes and plain index paths?

yannK
Splunk Employee

I want to use volumes in indexes.conf to limit the space used by my indexes.

On each index, I see 4 paths: homePath, coldPath, thawedPath, and tstatsHomePath.
The last one seems to be used for accelerated data models or report accelerations.

How does this work?

  • I noticed that several paths are possible, and some of them (the summaries) already use volumes that happen to point to the default $SPLUNK_DB path.
    • Does a volume count other folders in the same location that are not managed by Splunk?
    • Does a volume count other index folders in the same location if they use plain paths (instead of volumes)?

yannK
Splunk Employee

After testing and researching, I can confirm the following conclusions:

  • Volume definitions are logical. When measuring the size of a volume, Splunk counts only the index paths (homePath, coldPath, thawedPath, or tstatsHomePath) that are defined using that volume.

example :

[volume:testvolumeA] 
path = /mount/disk 
maxVolumeDataSizeMB=500 

[index1] 
homePath = volume:testvolumeA/index1/db 
coldPath = volume:testvolumeA/index1/colddb 
thawedPath = volume:testvolumeA/index1/thaweddb 
tstatsHomePath = volume:_splunk_summaries/defaultdb/datamodel_summary 

In this case, the homePath, coldPath, and thawedPath of index1 are considered to be on the same logical volume. The tstatsHomePath references a different volume (_splunk_summaries), so it is counted there instead.

To enforce the volume size limit, if one is set, only those index locations are summed up, and when a bucket has to be frozen, it will be one of the buckets stored in those locations.

  • Now consider the situation where several volumes point to the same path:

example :

[volume:testvolumeA] 
path = /mount/disk 
maxVolumeDataSizeMB=500 
[volume:testvolumeB] 
path = /mount/disk 
maxVolumeDataSizeMB=100 

[index1] 
homePath = volume:testvolumeA/index1/db 
coldPath = volume:testvolumeA/index1/colddb 
thawedPath = volume:testvolumeA/index1/thaweddb 
tstatsHomePath = volume:_splunk_summaries/defaultdb/datamodel_summary 

[index2] 
homePath = volume:testvolumeB/index2/db 
coldPath = volume:testvolumeB/index2/colddb 
thawedPath = volume:testvolumeB/index2/thaweddb 
tstatsHomePath = volume:_splunk_summaries/defaultdb/datamodel_summary 

The 2 volumes testvolumeA and testvolumeB are monitored as 2 separate entities, and each of them measures only the subfolders defined using that volume.

That means that if you enforce a volume size limit, each volume applies its limit separately, to its own index folders.
In my example:
testvolumeA will keep its monitored subfolders under 500 MB
testvolumeB will keep its monitored subfolders under 100 MB
This means that the actual physical path /mount/disk can grow up to 500 + 100 = 600 MB.

I think this is also the situation if you define a volume pointing to $SPLUNK_DB, because _splunk_summaries already points there:

[volume:_splunk_summaries] 
path = $SPLUNK_DB 
[volume:summary] 
path = $SPLUNK_DB 

So you can size your volume limits to ensure that their sum will not fill your physical disk.
Or you can redefine all your paths to use a single volume, and manage the size globally.
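
As a rough sketch of the single-volume option, reusing the /mount/disk location from the examples above: the volume name primary, the 600 MB cap, and the /mount/thawed location are hypothetical choices, not tested values. thawedPath is written as a plain path here because indexes.conf.spec states that it cannot be defined with a volume reference.

```
# Hypothetical single volume that owns every managed path on /mount/disk
[volume:primary]
path = /mount/disk
# Assumed cap; choose a value below the real capacity of the disk
maxVolumeDataSizeMB = 600

[index1]
homePath = volume:primary/index1/db
coldPath = volume:primary/index1/colddb
# Plain path outside the volume (assumed location), not counted in the cap
thawedPath = /mount/thawed/index1/thaweddb
tstatsHomePath = volume:primary/index1/datamodel_summary

[index2]
homePath = volume:primary/index2/db
coldPath = volume:primary/index2/colddb
# Plain path outside the volume (assumed location), not counted in the cap
thawedPath = /mount/thawed/index2/thaweddb
tstatsHomePath = volume:primary/index2/datamodel_summary
```

With this layout the single limit applies to index1 and index2 together, so when the cap is reached the bucket that gets frozen may come from either index.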

  • Another situation is when you mix a volume with plain paths.

example :

[volume:testvolumeA] 
path = /mount/disk 
maxVolumeDataSizeMB=500 

[index1] 
homePath = volume:testvolumeA/index1/db 
coldPath = volume:testvolumeA/index1/colddb 
thawedPath = volume:testvolumeA/index1/thaweddb 
tstatsHomePath = volume:_splunk_summaries/defaultdb/datamodel_summary 

[_internal] 
homePath = $SPLUNK_DB/_internaldb/db 
coldPath = $SPLUNK_DB/_internaldb/colddb 
thawedPath = $SPLUNK_DB/_internaldb/thaweddb 

In this case (assuming, as in the log below, that $SPLUNK_DB resolves to /mount/disk) you will see a warning in splunkd.log for the homePath, coldPath, and thawedPath of the _internal index, because they do not reference a volume but are located on the same path as a volume.

example :

06-07-2017 16:27:01.976 -0700 WARN ProcessTracker - (child_6__Fsck) IndexConfig - idx=summary Path homePath='/mount/disk/_internaldb/db' (realpath '/mount/disk/_internaldb/db') is inside volume=testvolumeA (path='/mount/disk', realpath='/mount/disk'), but does not reference that volume. Space used by homePath will not be volume-mananged. Please check indexes.conf for configuration errors.
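
One possible way to address this warning for homePath and coldPath, sketched under the assumption that $SPLUNK_DB resolves to /mount/disk (as the realpath in the log suggests), is to reference the volume from those paths. Only homePath and coldPath are changed in this sketch; thawedPath is left as it was, because indexes.conf.spec does not allow a volume reference there.

```
[_internal]
# Same directories as before, now expressed through the volume
homePath = volume:testvolumeA/_internaldb/db
coldPath = volume:testvolumeA/_internaldb/colddb
# Unchanged from the stanza above
thawedPath = $SPLUNK_DB/_internaldb/thaweddb
```

Keep in mind that _internal would then share the 500 MB limit of testvolumeA with index1.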

  • Why does tstatsHomePath not trigger this warning out of the box?

It appears that the warning exists only for homePath, coldPath, and thawedPath; it does not exist for tstatsHomePath.
This is why we do not get those warnings on a vanilla Splunk install, where by default tstatsHomePath uses this volume:

[volume:_splunk_summaries] 
path = $SPLUNK_DB 
