Splunk Cluster Buckets

adrianathome · ‎06-18-2013

I understand that the hot/warm/cold buckets need the same storage characteristics for performance. However, is there an in-between bucket for data that has aged out but that I would like to retain on slower, cheaper disks?

mikelanghorst · ‎05-20-2014

In 6.0, replicated buckets are now kept in the same "level" on both sides. Hot buckets are replicated into the same location as the hot buckets on the peer.

stevenpoitras · ‎06-18-2013

From my perspective it will all depend on your configuration and requirements. However, I put the cold bucket on a lower-performance tier, knowing that search times "could" be impacted since its disk IO won't match that of the storage serving the hot/warm buckets.

I optimize the hot bucket for read and write IO, and the warm and cold buckets for read (array-side read caches can help dramatically here). In my opinion it isn't necessarily economical to keep cold buckets on high-performance storage, especially as the data grows into the TBs and beyond.
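As a rough sketch of that split in `indexes.conf`: hot and warm buckets live under `homePath`, and cold buckets live under `coldPath`, so the two can point at different storage. The volume names, mount points, and index name below are made up for illustration, so adjust them to your environment.

```
# indexes.conf (sketch; volume names, paths, and the index name are hypothetical)

# Faster tier for hot/warm buckets
[volume:fast]
path = /mnt/fast_disk/splunk

# Slower, cheaper tier for cold buckets
[volume:slow]
path = /mnt/slow_disk/splunk

[my_index]
# Hot and warm buckets are stored under homePath
homePath = volume:fast/my_index/db
# Cold buckets are stored under coldPath
coldPath = volume:slow/my_index/colddb
# thawedPath cannot reference a volume
thawedPath = $SPLUNK_DB/my_index/thaweddb
```

In a cluster, the same paths would normally be used on every peer so that replicated copies land on the matching tier.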

stevenpoitras · ‎06-18-2013

Correct, replication will go from the hot bucket on node 1 to the cold bucket on node 2.

If the cold bucket is "optimized for reads", there will be some penalty on write performance, which increases replication times. But for cold, spindle count plus read caching is optimal for the sequential replication traffic.

In reality, what you're focusing on is the rate at which data can be read from that replica if the originating node fails.

As always, it's about finding the correct balance to fit the IO requirements, but replication IO is essentially a secondary operation.

adrianathome · ‎06-18-2013

So if your lower-tier storage is optimized for reads, how does replication affect the performance of the cluster? From what I understand, replication is write-intensive and it happens on the cold bucket.
