Splunk consuming more disk space - compress indexed data

npandith
Explorer

Hello,

A couple of months ago we deployed a new Splunk 4.2.3 server on RHEL 5. Our old Splunk server runs version 4.0.8, also on RHEL 5, and is still in service. The old server currently holds around 18 billion events, which consume around 700 GB of disk space in total, whereas the new 4.2.3 server holds 20 billion events but consumes around 2 TB. I have heard that data is compressed by default, so I am not sure why there is such a large difference in disk usage between the two versions. FYI, we have around 350 universal forwarders sending data to this one indexer. Do you know what can be checked? Would migrating from universal forwarders to heavy forwarders make a difference? And is there a way to compress the indexed data?

Thanks!!


gkanapathy
Splunk Employee

I don't know why you are seeing such a large difference, but I suspect one of the reported numbers is just wrong, as the space used in both versions should be comparable (e.g., maybe you didn't actually have 18 billion events before, or if you did, they were in a frozen state and unsearchable). Without knowing what your data looks like, I would say that 18 billion searchable events in 700 GB seems a bit light to me, while 20 billion in 2 TB seems more reasonable.
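To put the two reported figures side by side, here is a rough back-of-the-envelope calculation of average bytes per event. It only uses the numbers quoted in the question and assumes decimal units (1 GB = 10^9 bytes), which is my assumption, not something stated in the thread.

```python
# Figures quoted in the thread; decimal units are an assumption.
GB = 10**9
TB = 10**12

def bytes_per_event(disk_bytes, event_count):
    """Average on-disk footprint of one indexed event."""
    return disk_bytes / event_count

old = bytes_per_event(700 * GB, 18 * 10**9)  # 4.0.8 server: 700 GB, 18 billion events
new = bytes_per_event(2 * TB, 20 * 10**9)    # 4.2.3 server: 2 TB, 20 billion events

print(f"4.0.8: {old:.1f} bytes/event")   # 38.9
print(f"4.2.3: {new:.1f} bytes/event")   # 100.0
print(f"ratio: {new / old:.2f}x")        # 2.57
```

Roughly 39 bytes per event on the old server versus 100 on the new one is a gap of about 2.6x, which is why it is worth double-checking the event counts before looking for a compression setting.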

The data is already about as compressed as it can reasonably be if it is to remain searchable. When data is rolled to frozen storage (where it is unsearchable and unviewable from within Splunk until thawed back out), much of it can be deleted and considerable additional space reclaimed.
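If you want a feel for how much of a bucket could go away when it is frozen, a small script like the one below can compare the compressed raw data against everything else in a bucket directory. This is only a sketch: it assumes the usual bucket layout with a `rawdata` subdirectory next to the index files, and that, as far as I know, the raw data is the part that has to be kept to thaw a bucket later. The function name is hypothetical, not part of Splunk.

```python
import os

def bucket_usage(bucket_dir):
    """Return (rawdata_bytes, other_bytes) for one bucket directory.

    Assumes the bucket keeps its compressed raw data under a
    'rawdata' subdirectory; everything else is counted as 'other'.
    """
    raw = other = 0
    for root, _dirs, files in os.walk(bucket_dir):
        rel = os.path.relpath(root, bucket_dir)
        in_raw = rel == "rawdata" or rel.startswith("rawdata" + os.sep)
        for name in files:
            size = os.path.getsize(os.path.join(root, name))
            if in_raw:
                raw += size
            else:
                other += size
    return raw, other
```

Run it against a single warm or cold bucket directory on the indexer (the path depends on your installation) and compare the two numbers; running it on both servers would also show whether the per-bucket breakdown differs between 4.0.8 and 4.2.3.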
