Deployment Architecture

How do you restore buckets in an indexer cluster?

sjcoluccio67
Explorer

I have a bunch of buckets that I want to restore. According to the documentation, the first step is finding the buckets you want to restore and copying them to the $SPLUNK_HOME/var/lib/splunk/{INDEX}/thaweddb directory. Then you have to run the rebuild command. The documentation does not make clear, however, whether you have to thaw the buckets on the same indexer that they came from.

I am looking at http://docs.splunk.com/Documentation/Splunk/6.5.2/Indexer/Restorearchiveddata

I am running version 6.5.2 with an indexer cluster.

First, the documentation says that on versions 4.2 and higher, you can thaw data on any indexer instance, not just the one that it originated on.

"For the most part, you can restore an archive to any instance of the indexer, not just the one that originally indexed it. This, however, depends on a couple of factors: Splunk Enterprise version. You cannot restore a bucket created by Splunk Enterprise 4.2 or later to a pre-4.2 indexer. The bucket data format changed between 4.1 and 4.2, and pre-4.2 indexers do not understand the new format. This means: 4.2+ buckets: You can restore a 4.2+ bucket to any 4.2+ instance."

Then, at the bottom of the page, it talks about restoring data in a clustered environment and says that you should place the buckets in the thawed directories of the indexers they originated on:

"However, as described in "Archive indexed data", it is difficult to archive just a single copy of clustered data in the first place. If, instead, you archive data across all peer nodes in a cluster, you can later thaw the data, placing the data into the thawed directories of the peer nodes from which it was originally archived."

Do I have to thaw buckets only on the indexer that the data originated on?

0 Karma

gjanders
SplunkTrust

Do I have to thaw buckets only on the indexer that the data originated on?

No, you can restore the bucket on any indexer instance that is running version 4.2 or later. If you have the rawdata, you will need to run the bucket rebuild described in the "Thaw a 4.2+ archive" section of the documentation on restoring archived indexed data (6.5.2 specific link).

In a clustered environment you may have multiple copies of a bucket, which can make it trickier to know which one to restore, but that will not affect restoring/thawing a bucket. You can restore it on a new instance or on a current cluster member. Note that, from memory, in some 6.5.x versions the thawed directory does not work as expected in a cluster until 6.5.7 (the workaround is to restore to a non-clustered instance!).
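To make the copy-then-rebuild steps concrete, here is a minimal Python sketch. It assumes the thaweddb layout quoted in the question and the `splunk rebuild <bucket directory>` form of the rebuild command; the function name `thaw_bucket` and its parameters are hypothetical, and the command is only returned, not executed, so you can review it first.

```python
import shutil
from pathlib import Path


def thaw_bucket(archived_bucket, splunk_home, index):
    """Copy one archived bucket directory into the index's thaweddb
    directory and return the rebuild command for it (not executed here).

    All names are illustrative; the thaweddb path follows the layout
    $SPLUNK_HOME/var/lib/splunk/{INDEX}/thaweddb mentioned in the question.
    """
    archived_bucket = Path(archived_bucket)
    splunk_home = Path(splunk_home)

    thawed_dir = splunk_home / "var" / "lib" / "splunk" / index / "thaweddb"
    thawed_dir.mkdir(parents=True, exist_ok=True)

    # Keep the original bucket directory name when copying.
    target = thawed_dir / archived_bucket.name
    shutil.copytree(archived_bucket, target)

    # Assumed command form: splunk rebuild <bucket directory>
    return [str(splunk_home / "bin" / "splunk"), "rebuild", str(target)]
```

You could then pass the returned list to `subprocess.run(cmd, check=True)` on the indexer once you are happy with it. Check the "Thaw a 4.2+ archive" section of the docs for any follow-up step your version needs, such as restarting the indexer.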

0 Karma

prakash007
Builder

@sjcoluccio67: Yes, according to the documentation, if you are thawing buckets back into the cluster you have to thaw them on the indexers where the buckets originated.
I prefer to thaw the buckets (only db_*) on a stand-alone indexer and add it as a search peer to the search head. In fact, we did this in one of our use cases.

https://answers.splunk.com/answers/708814/when-backing-up-frozen-data-with-replication-facto.html#an...

0 Karma

sjcoluccio67
Explorer

So, as long as the indexer is not part of the cluster that the buckets came from, it should be able to rebuild them all, regardless of the GUID in the bucket name?

0 Karma

prakash007
Builder

Yes, as long as the buckets are on a stand-alone indexer, the GUIDs don't matter (you could also rename them, but I haven't tried that).
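If you archived from every peer, picking out only the db_* copies could be sketched roughly like this in Python. It assumes clustered bucket directories are named `db_<newest>_<oldest>_<localid>_<guid>` for originals and `rb_...` for replicated copies, with the GUID suffix absent on non-clustered buckets; treat that naming pattern as an assumption to verify against your own archive. The helper names are hypothetical.

```python
import re

# Assumed bucket directory naming:
#   db_<newest_time>_<oldest_time>_<local_id>[_<guid>]  (original copy)
#   rb_<newest_time>_<oldest_time>_<local_id>[_<guid>]  (replicated copy)
BUCKET_NAME = re.compile(
    r"^(?P<kind>db|rb)_(?P<newest>\d+)_(?P<oldest>\d+)_(?P<local_id>\d+)"
    r"(?:_(?P<guid>[0-9A-Fa-f-]+))?$"
)


def parse_bucket_name(name):
    """Return the parts of a bucket directory name, or None if it does not match."""
    match = BUCKET_NAME.match(name)
    return match.groupdict() if match else None


def originals_only(names):
    """Keep only db_* bucket names, dropping rb_* copies and unrelated entries."""
    kept = []
    for name in names:
        parts = parse_bucket_name(name)
        if parts is not None and parts["kind"] == "db":
            kept.append(name)
    return kept
```

The parsed `guid` field is there only so you can see which peer a bucket came from; nothing in this sketch renames buckets or depends on the GUID.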

0 Karma