Archiving Best Practices for a Clustered Environme...

marxsabandana · ‎01-19-2022

We currently have a C1 Architecture (3 clustered indexers/1 search head, replication factor of 3) and would like to ask if there are any best practices and guidelines on how to do it ourselves?

I've checked the docs and somehow it is indicated that having a replication factor of more than 3 can become more complicated to archive. Please see the excerpt below from https://docs.splunk.com/Documentation/Splunk/8.2.4/Indexer/Automatearchiving :

The problem of archiving multiple copies

Because indexer clusters contain multiple copies of each bucket. If you archive the data using the techniques described earlier in this topic, you archive multiple copies of the data.

For example, if you have a cluster with a replication factor of 3, the cluster stores three copies of all its data across its set of peer nodes. If you set up each peer node to archive its own data when it rolls to frozen, you end up with three archived copies of the data. You cannot solve this problem by archiving just the data on a single node, since there's no certainty that a single node contains all the data in the cluster.

The solution to this would be to archive just one copy of each bucket on the cluster and discard the rest. However, in practice, it is quite a complex matter to do that. If you want guidance in archiving single copies of clustered data, contact Splunk Professional Services. They can help design a solution customized to the needs of your environment.

Archiving Best Practices for a Clustered Environment

The problem of archiving multiple copies

index

indexer

Introducing the Splunk Community Dashboard Challenge!

Wondering How to Build Resiliency in the Cloud?

Updated Data Management and AWS GDI Inventory in Splunk Observability