Solved: What happens to my multisite indexer cluster when ...

davidpaper · ‎04-13-2016

Background:

There are two types of ACKs in play here.

First is an inter-indexer ACK for data replication in an indexing cluster. When an indexer replicates a slice of data to a replication peer (when the slice hits 128K, or when the slice is smaller than 128K and 60s elapse), it expects an ACK from that peer once the data has been received.
Second is a forwarder ACK (useACK). The indexer sends this to the forwarder once it has received inter-indexer ACKs indicating that RF-1 copies of that slice have been written. So, if RF=4, the indexer sends the ACK back to the forwarder once it has received inter-indexer ACKs for 2 replicas, for a total of 3 copies (2 + itself), which satisfies RF-1 (3).
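
To make the arithmetic above concrete, here is a minimal Python sketch of the two rules as I understand them. It is not Splunk code: the function names are made up for illustration, and treating 128K as 128 * 1024 bytes is an assumption.

```python
# Hypothetical model of the Background section; not Splunk's implementation.
SLICE_BYTES = 128 * 1024   # assumption: "128K" means 128 * 1024 bytes
SLICE_MAX_AGE_S = 60       # a smaller slice is sent once 60s have elapsed


def slice_ready(size_bytes, age_s):
    """A slice is replicated when it is full or when it is old enough."""
    return size_bytes >= SLICE_BYTES or age_s >= SLICE_MAX_AGE_S


def peer_acks_needed(rf):
    """Inter-indexer ACKs needed before the forwarder ACK, per the text above.

    RF-1 copies must exist, and the receiving indexer holds one of them,
    so it needs ACKs from (RF-1) - 1 peers.
    """
    copies_required = rf - 1
    return max(copies_required - 1, 0)


if __name__ == "__main__":
    print(peer_acks_needed(4))      # RF=4 -> 2 peer ACKs (3 copies in total)
    print(slice_ready(4096, 61))    # small slice, but 60s have elapsed
```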

Scenario:

Right now I have a bunch of indexers split between two sites, none of which are clustered together. I would like to set up a multisite indexing cluster, with SF=2, RF=4 (origin:2, site1: 2). I'll be turning on useACK on the forwarders so I don't lose any data. My team does Disaster Recovery testing, and I want to make sure that Splunk will still work (forwarders will get ACKs, indexers will index data, etc.) during the DR test. The DR test itself will last 48 hours and consist of severing the links connecting the 2 data centers (siteA, siteB).

Splunk in both sites must continue to function properly during the test & when the links are brought back online.

The concern is that the indexer receiving the chunk of data from the forwarder MUST successfully complete RF-1 copies of the raw data before the ACK is sent back to the forwarder. With the WAN links between the two sites disabled, the best an indexer will ever be able to muster is 1 replica, so it will never reach the 2 replicas required to return the ACK to the forwarder. We would then lose all visibility into what's going on in the environment, which would be a show stopper for rolling out multisite indexer clustering.
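
Here is the concern written out as a small Python sketch. It models only the assumption stated in this question (the forwarder ACK strictly requires the peer ACKs), not how Splunk actually behaves, and the function names are hypothetical.

```python
# Hypothetical model of the concern in the question; not Splunk's real behavior.

def reachable_replicas_during_partition(origin_copies):
    """Replicas the receiving indexer can still reach with the WAN link down.

    Only peers in the origin site are reachable, and one of the origin
    copies is the receiving indexer itself.
    """
    return max(origin_copies - 1, 0)


def ack_blocked(rf, origin_copies):
    """True if the feared condition holds: fewer reachable replicas than needed."""
    needed = max(rf - 2, 0)  # RF-1 copies in total, minus the indexer's own copy
    return reachable_replicas_during_partition(origin_copies) < needed


if __name__ == "__main__":
    # RF=4 with origin:2 -> 1 reachable replica, 2 needed
    print(ack_blocked(rf=4, origin_copies=2))
```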

What's going to happen?

davidpaper · ‎04-13-2016

As with most things in Splunk, there are timeouts governing how long the indexer will wait for an ACK from another peer. From server.conf:

rep_max_send_timeout = <seconds> : Maximum send timeout for sending replication slice data between cluster nodes.
rep_max_rcv_timeout = <seconds> : Maximum cumulative receive timeout for receiving acknowledgement data from peers.

The default for both of these is 600 seconds.

The receiving node sends the ACK to the forwarder once it has been notified by each of the target peers of either:

1) successful write
2) unsuccessful write

Or, to put it slightly differently, replication success or replication failure. Either way, the forwarder will receive an ACK and keep sending data to the indexers.
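
As a rough illustration of that rule, here is a minimal Python sketch. The names are hypothetical, and it assumes a target peer that stays silent past the maximum timeout is counted as a replication failure; it is a model of the description above, not Splunk's implementation.

```python
# Hypothetical model of the rule described above; not Splunk's implementation.
from enum import Enum


class Outcome(Enum):
    PENDING = "pending"
    SUCCESS = "success"
    FAILURE = "failure"


def resolve(outcome, waited_s, max_timeout_s=600):
    """Assumption: a peer still pending after the max timeout counts as failed."""
    if outcome is Outcome.PENDING and waited_s >= max_timeout_s:
        return Outcome.FAILURE
    return outcome


def can_ack_forwarder(targets, max_timeout_s=600):
    """targets is a list of (Outcome, seconds_waited) pairs, one per target peer.

    The forwarder ACK goes out once every target has reported success or
    failure; it does not require every replication to have succeeded.
    """
    return all(
        resolve(outcome, waited_s, max_timeout_s) is not Outcome.PENDING
        for outcome, waited_s in targets
    )


if __name__ == "__main__":
    # Local peer succeeded; remote peer unreachable and timed out.
    print(can_ack_forwarder([(Outcome.SUCCESS, 1), (Outcome.PENDING, 600)]))
```

In this model a severed WAN link delays the forwarder ACK by at most the timeout rather than blocking it indefinitely.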

When the link is terminated, hot buckets will time out after the period(s) defined above and roll to warm. The buckets are still searchable, and Splunk continues to hum along.

Additional information:

When the WAN links are re-established, newly created buckets will attempt to replicate to the indexers on the other site. The behavior doesn't change while the connection is down; the indexer just fails to make a new connection for streaming_replication.

When buckets roll to warm due to replication timeouts during their hot life, fixup activity will continue even when there is a partition between sites. When the connection is re-established, fixup activity will resume, as the CM can now see both sites again and work towards fully satisfying RF & SF.

If the CM is on one of the sites and unable to communicate with the other site during this partitioning, the CM will consider the other site "down" and will not attempt fixup to the other site. Indexing in the other site will continue as if the CM has disappeared. Replications for new hot buckets will use the last set of targets.

I will add to this answer if further information about this scenario is needed.


splk · ‎06-20-2016

Please be aware that some WAN optimizer products (like RiverBed) do some magic with ACKs. So you have to exclude Splunk from WAN optimization to prevent any weird behavior (also for Indexer <--> Forwarder)!

Just for information.

davidpaper · ‎04-13-2016