Solved: Do we lose data when an indexer crashes?

ddrillic · ‎11-20-2016

Our standard universal forwarders currently list all the indexers of our cluster in the [tcpout:indexers] stanza of outputs.conf, for example: server = host1:9997,host2:9997,....

We don't have indexer acknowledgment enabled. Does that mean we lose data when an indexer goes down for, say, a day?

martin_mueller · ‎11-20-2016

Yes and no.

Your in-flight data can be at risk, depending on how an indexer crashes. That'd be events sent by the forwarder, but not fully written to disk (or better: replicated to peers in an indexer cluster) when the crash happens.
With indexer acknowledgement, the forwarder resends this in-flight data to another indexer because it never receives an ack for it.
If the indexer stays down for a day you won't lose a day of data though. The forwarders will not send further events to the crashed indexer, and instead fail over to the other indexers.

Your on-disk data can be at risk if the indexer crashes in a way that damages data on disk, e.g. catastrophic hardware failure, and if you don't have replication in an indexer cluster... and assuming the catastrophic failure isn't large enough to take out other peers or sites too 😉
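For reference, a minimal sketch of what enabling indexer acknowledgement could look like on the forwarder. It reuses the [tcpout:indexers] stanza and the host1/host2 placeholders from the question; the defaultGroup line is an assumption about how the rest of the file is laid out.

```ini
# outputs.conf on the universal forwarder (sketch, adapt to your environment)
[tcpout]
# Assumption: "indexers" is the default output group
defaultGroup = indexers

[tcpout:indexers]
# host1/host2 are the placeholder names from the question
server = host1:9997,host2:9997
# Hold sent data on the forwarder until the indexer acknowledges it,
# and resend it if no acknowledgement arrives
useACK = true
```

One trade-off to be aware of: if an ack is lost although the data was actually indexed, the forwarder resends it, so occasional duplicate events are possible.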

ddrillic · ‎11-20-2016

Makes perfect sense Martin !!!

martin_mueller · ‎11-20-2016

You could instruct the forwarder to clone the data to two indexers, but that's probably not what you want. The two receiving indexers would not de-duplicate against each other later, so each event would be indexed, licensed, and searched twice.

If you want high availability without risk to in-flight data, you want indexer clustering with replication plus indexer acknowledgement. That's what they're there for.
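To illustrate what cloning means in configuration terms, here is a rough sketch. The forwarder clones when defaultGroup names more than one target group; the group names indexer_a and indexer_b are made up for this example, and host1/host2 are the placeholders from the question.

```ini
# outputs.conf on the forwarder: cloning sketch (usually NOT what you want)
[tcpout]
# Listing two target groups sends a full copy of the data to each group
defaultGroup = indexer_a,indexer_b

# Hypothetical group names, one indexer per group
[tcpout:indexer_a]
server = host1:9997

[tcpout:indexer_b]
server = host2:9997
```

Compare this with the setup in the question, where all indexers sit in a single group: there the forwarder load-balances across them and each event is sent once.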
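As a rough sketch of the clustering side, this is approximately what the replication settings look like in server.conf on the cluster master node. The factor values and the key placeholder are assumptions for illustration, not a recommendation, and the peer and search head configuration is not shown.

```ini
# server.conf on the cluster master (sketch, values are illustrative)
[clustering]
# Newer Splunk versions call this role "manager"
mode = master
# Number of copies of the data the cluster keeps across peers
replication_factor = 3
# Number of those copies that are kept searchable
search_factor = 2
# Placeholder: shared secret used by all cluster members
pass4SymmKey = <your_cluster_key>
```

Combined with useACK = true on the forwarders, this covers both the in-flight and the on-disk risk discussed above.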

martin_mueller · ‎11-20-2016

I'm not 100% sure if there are additional things on the application level (probably), but at least at the TCP level the forwarder will know something's wrong.

ddrillic · ‎11-20-2016

Great - and the forwarder sends the data to only one active indexer? If so, is it possible to configure the forwarder to send data to two active indexers?

ddrillic · ‎11-20-2016

Perfect. I get it about in-flight data.

You said -

-- The forwarders will not send further events to the crashed indexer, and instead fail over to the other indexers.

What's the mechanism here? How does the forwarder know not to send data to this indexer for the time being?

Ok, I see - it won't send data to a down server...

abhayneilam · ‎05-03-2018

Guys, there is something called load balancing. A UF or HF acts as a load balancer by default, so if one indexer is down, the data automatically goes to the other indexers.

The load-balancing concept works like this: the forwarder first sends a heartbeat to the server. If it does not get a response back within a specific time, it considers that server dead and tries to send the data to another server.
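The timing of this behaviour can be tuned in outputs.conf. The sketch below shows the settings involved, as far as I know them; the values are what I understand to be the defaults and are shown only for illustration, so check the outputs.conf spec for your version before relying on them.

```ini
# outputs.conf on the forwarder: load-balancing and failover timing (sketch)
[tcpout:indexers]
# host1/host2 are the placeholder names from the question
server = host1:9997,host2:9997
# Seconds before the forwarder switches to another indexer from the list
autoLBFrequency = 30
# Seconds to wait for a connection to be established before giving up on that indexer
connectionTimeout = 20
# Seconds between heartbeat packets sent to the receiving indexer
heartbeatFrequency = 30
```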

Thanks !!

Cheers,
