Monitoring Splunk

Which file type consumes the most data?

itsmevic
Communicator

I'm curious which file type within an index bucket is the largest. I'm getting conflicting answers: some say the .tsidx files, others point to the bloom filter. Which is it? Thanks for your help.

1 Solution

sduff_splunk
Splunk Employee

It really depends on several factors. An individual tsidx file may be smaller than the bloom filter file, but as the number of buckets grows, so does the number of tsidx files, and together they may end up consuming more space than the bloom filter. It also depends on the number of unique words that the bloom filter needs to calculate and store, and on the number of fields that are indexed and stored in the tsidx files.

On my test system, my _internal index's bloom filter is 5906606 bytes in size, and I have 15 tsidx files that range from 34755 bytes to 2095069 bytes.

So, many, many factors!
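
If you want to check this on your own system rather than rely on my numbers, here is a rough Python sketch that walks an index directory and totals the bytes per file type. The function name `size_by_file_type` is made up for this example, and the grouping rule is an assumption: files are grouped by extension (so all `.tsidx` files add up together), and files without an extension are grouped by their full name, which is how I would expect the bloom filter file to show up.

```python
import os
from collections import defaultdict


def size_by_file_type(index_path):
    """Return {file type: total bytes} for every file under index_path.

    Files are keyed by extension (e.g. ".tsidx"); files with no
    extension are keyed by their full name (assumption: this is how
    the bloom filter file will appear).
    """
    totals = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(index_path):
        for name in filenames:
            ext = os.path.splitext(name)[1]
            key = ext if ext else name
            try:
                totals[key] += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                # File vanished or is unreadable (buckets roll while
                # Splunk is running); skip it instead of failing.
                continue
    return dict(totals)
```

Point it at the directory that holds the index's buckets (on a default install, for example, the `db` directory of `_internal` under `$SPLUNK_DB`; adjust for your own paths) and sort the result by value to see which type dominates. Expect the answer to differ between indexes, for the reasons above.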
