We use a volume configuration for our storage, and the amount of disk being used is measured differently by Splunk and df:
05-12-2012 14:34:21.961 -0400 INFO VolumeManager - The size of volume 'local' exceeds the limit, will have to acquiesce it (size=901197562019, max_size=900228710400, path='/opt/splunk/var/lib/splunk')
05-12-2012 14:34:21.962 -0400 INFO VolumeManager - Getting a list of candidate buckets for moving (chilling or freezing)
05-12-2012 14:34:21.992 -0400 INFO VolumeManager - Will move bucket with latest=1205445442, path='/opt/splunk/var/lib/splunk/sharepoint/db/db_1205445442_1197673042_20'
05-12-2012 14:34:21.993 -0400 INFO VolumeManager - Bucket moved successfully (current size=901197556937, max=900228710400)
05-12-2012 14:34:21.993 -0400 INFO VolumeManager - Will move bucket with latest=1281595719, path='/opt/splunk/var/lib/splunk/sharepoint/db/db_1281595719_1281595719_21'
05-12-2012 14:34:21.993 -0400 INFO VolumeManager - Bucket moved successfully (current size=901197553132, max=900228710400)
05-12-2012 14:34:21.993 -0400 INFO VolumeManager - Will move bucket with latest=1294374586, path='/opt/splunk/var/lib/splunk/proxy/db/db_1294374586_1294203869_3'
05-12-2012 14:34:21.994 -0400 INFO VolumeManager - Bucket moved successfully (current size=894885008853, max=900228710400)
05-12-2012 14:34:21.994 -0400 INFO VolumeManager - Acquiescing volume 'local' completed.
At the same time, df --block-size=1 tells me:
Filesystem 1B-blocks Used Available Use% Mounted on
/dev/mapper/db_dg-opt
1005947170816 931457794048 23390171136 98% /opt/splunk/var/lib/splunk
or df -m
Filesystem 1M-blocks Used Available Use% Mounted on
/dev/mapper/db_dg-opt
959346 888655 21960 98% /opt/splunk/var/lib/splunk
My volume configuration in indexes.conf is
[volume:local]
path = /opt/splunk/var/lib/splunk
maxVolumeDataSizeMB = 858525
The "max_size" in the logs matches my maxVolumeDataSizeMB exactly (converting at 1,048,576 bytes per MB). But what Splunk measures as "size" (901197562019) does not match what df reports as used (931457794048): Splunk's figure is about 3% smaller.
Why the mismatch, and how do I measure and plan so as to maximize disk utilization without leaving too little free space? And how much free space is actually necessary, given that this volume is used only for hot/warm bucket storage and nothing else?
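The arithmetic behind these numbers can be checked directly (a quick sketch; the figures are the ones from the log and df output above):

```shell
# Splunk's max_size is maxVolumeDataSizeMB in binary megabytes
# (1 MB = 1024 * 1024 bytes):
echo $(( 858525 * 1024 * 1024 ))   # 900228710400, the max_size in the log

# Gap between df's "Used" and Splunk's measured volume size:
splunk_size=901197562019
df_used=931457794048
echo $(( (df_used - splunk_size) * 100 / df_used ))   # 3 (about 3% smaller)
```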
The short answer is that Splunk only accounts for volumes using the volume configuration.
To explain further, two kinds of things can be on your disk taking up space that Splunk is not accounting for:
1) Files that are not current indexes. This includes any non-Splunk file on the filesystem, as well as old indexes that were removed from the config but never cleaned up.
2) Indexes that use something other than volume:local to define where they live. For example, an index might use $SPLUNK_DB/mydb as its location. Even though that may point to the same filesystem, it is not counted against your maxVolumeDataSizeMB quota.
Make sure to check every indexes.conf on your system for indexes of the second kind. Some apps define their own indexes and, in my experience, most use $SPLUNK_DB. To fix one, copy the app's indexes.conf into its local directory and change the path definitions to use the volume.
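One way to spot indexes of the second kind is to dump the merged config and look for paths that don't go through a volume. The pipeline below runs against a fabricated sample of that output (the sample file contents and index paths are made up for illustration); in practice you would capture the real merged config with `splunk cmd btool indexes list --debug`:

```shell
# In practice: splunk cmd btool indexes list --debug > merged_indexes.txt
# Fabricated sample so the pipeline itself can be demonstrated:
cat > merged_indexes.txt <<'EOF'
/opt/splunk/etc/system/default/indexes.conf   homePath = $SPLUNK_DB/defaultdb/db
/opt/splunk/etc/apps/myapp/default/indexes.conf   homePath = $SPLUNK_DB/mydb/db
/opt/splunk/etc/apps/myapp/local/indexes.conf   homePath = volume:local/sharepoint/db
EOF

# Paths that bypass the volume (and so escape maxVolumeDataSizeMB):
grep 'homePath' merged_indexes.txt | grep -v 'volume:'
```

The same filter applied to coldPath lines will catch cold storage that bypasses the volume as well.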
Furthermore, keep in mind that accelerated data models store their summaries in the default location, as per:
Accelerate Data Models
This can be customized, but in version 8.0 the _splunk_summaries volume does not have a maximum set; per indexes.conf.spec, "maxVolumeDataSizeMB = , * Optional", and on my 8.0.0 test server it was not specified at all.
That usage will not count toward your custom-made volume unless you customise tstatsHomePath for each index that has an accelerated data model...
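Routing those summaries through the sized volume can be sketched in indexes.conf as follows (my_index is a hypothetical index name; the volume stanza is the one from the question):

```
[volume:local]
path = /opt/splunk/var/lib/splunk
maxVolumeDataSizeMB = 858525

# Hypothetical index: point its data model summaries at the sized
# volume instead of the default _splunk_summaries location.
[my_index]
homePath = volume:local/my_index/db
coldPath = volume:local/my_index/colddb
thawedPath = $SPLUNK_DB/my_index/thaweddb
tstatsHomePath = volume:local/my_index/datamodel_summary
```

Note that thawedPath cannot reference a volume, which is why it stays on $SPLUNK_DB.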
Hi
"This can be customized but by default they get 100GB per index and a limit of 1TB in total in the volume called _splunk_summaries"
Where is the above statement written in the docs, and how can it be customized?
FYI this post is from over 3 years ago!
Splunk indexes.conf.spec
* Default: volume:_splunk_summaries/$_index_name/datamodel_summary,
where "$_index_name" is runtime-expanded to the name of the index
The max volume size I checked in 8.0.0 appears to be unlimited, so I will remove the 100GB-per-index comment; I've updated my post above.
In my case, it was the Splunk built-in indexes that were causing the most pain (specifically audit and summarydb). Thank you for the tip to use 'splunk cmd btool indexes list' to see how the various config files get merged. That made it much easier to see which indexes were and weren't using my volume configuration.
I still don't get why the total volume size doesn't match what df tells me; in this case it's Linux on a dedicated volume.
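One way to see where the gap lives is to compare a per-file walk against the filesystem's own accounting (a sketch; VOL defaults to the current directory here as a stand-in, but in practice it would be /opt/splunk/var/lib/splunk):

```shell
VOL="."   # stand-in; in practice: /opt/splunk/var/lib/splunk

# Bytes in actual files under the path -- the closest analogue to what
# Splunk itself can count:
du -sb "$VOL"

# What the filesystem reports for the whole mount: this also includes
# filesystem metadata, reserved blocks, and any files on the mount that
# sit outside the volume definition:
df --block-size=1 "$VOL"
```

Whatever df counts that du does not is invisible to the VolumeManager, and has to be left as headroom when choosing maxVolumeDataSizeMB.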