Splunk Search

Compression rate of indexed data: 50 GB/day for 3 weeks uses 100 GB of HDD space

jan_wohlers
Path Finder

Hey,

we just set up an indexer 3 weeks ago. By now we are indexing about 50 GB per 24 hours. If I go to Manager -> Indexes, I can see that our main index is only about 100 GB in size. Mostly event logs are being indexed. Is the compression really so good that about 20 days of 50 GB/day fit into a 100 GB index?

Thanks for your answer in advance!

Jan

1 Solution

MuS
SplunkTrust

Hi jan.wohlers,

Basically, a compression ratio of 40-50% is normal. You can check it with this search:

| dbinspect index=_internal
| fields state,id,rawSize,sizeOnDiskMB 
| stats sum(rawSize) AS rawTotal, sum(sizeOnDiskMB) AS diskTotalinMB
| eval rawTotalinMB=(rawTotal / 1024 / 1024) | fields - rawTotal
| eval compression=tostring(round(diskTotalinMB / rawTotalinMB * 100, 2)) + "%"
| table rawTotalinMB, diskTotalinMB, compression
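
Note that hot buckets are still being written to, so their size on disk isn't final. A minimal variant of the same search that skips them, assuming the state field that dbinspect reports (hot/warm/cold):

| dbinspect index=_internal
| where state!="hot"
| fields state,id,rawSize,sizeOnDiskMB
| stats sum(rawSize) AS rawTotal, sum(sizeOnDiskMB) AS diskTotalinMB
| eval rawTotalinMB=(rawTotal / 1024 / 1024) | fields - rawTotal
| eval compression=tostring(round(diskTotalinMB / rawTotalinMB * 100, 2)) + "%"
| table rawTotalinMB, diskTotalinMB, compression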

cheers,

MuS


jan_wohlers
Path Finder

Okay, thanks for the answer. The compression rate is 21%, which seems pretty good.


ibondarets
Explorer

Thank you for this handy search example!
I have a couple of questions regarding it:
1) How can I build a search that gives me a table of all present indexes with their compression ratio? I tried this:

| dbinspect index=*
| fields state,id,rawSize,sizeOnDiskMB
| stats sum(rawSize) AS rawTotal, sum(sizeOnDiskMB) AS diskTotalinMB by index
| eval rawTotalinMB=(rawTotal / 1024 / 1024) | fields - rawTotal
| eval compression=tostring(round(diskTotalinMB / rawTotalinMB * 100, 2)) + "%"
| table rawTotalinMB, diskTotalinMB, compression

but it didn't work.
2) What does it mean when I get this:
[screenshot: the search returns a compression value above 100%]
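
Regarding 1): a likely culprit is the fields state,id,rawSize,sizeOnDiskMB line, which drops the index field before stats ... by index can group on it, so the search returns no rows. A sketch of the same search with index kept in the pipeline:

| dbinspect index=*
| fields index,state,id,rawSize,sizeOnDiskMB
| stats sum(rawSize) AS rawTotal, sum(sizeOnDiskMB) AS diskTotalinMB by index
| eval rawTotalinMB=(rawTotal / 1024 / 1024) | fields - rawTotal
| eval compression=tostring(round(diskTotalinMB / rawTotalinMB * 100, 2)) + "%"
| table index, rawTotalinMB, diskTotalinMB, compression

Regarding 2): a compression value above 100% means the total size on disk (compressed rawdata plus the tsidx/index files) is larger than the raw data itself; this can happen with data that contains many unique terms.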


ConnorG
Path Finder

I have a compression percentage similar to ibondarets'.

Guess that means our data is actually larger once indexed.
