Why does clustering always appear as a repeat phenomenon without a reason?

xsstest · ‎07-16-2018

Hello, I have a strange question, and my description may be a bit rough.
I have a single-site cluster with 5 indexers, 4 search heads, a deployer, a cluster master, some deployment servers, some heavy forwarders (HFs), and some universal forwarders (UFs). The deployment server also acts as a heavy forwarder.
The indexer cluster has a search factor of 2 and a replication factor of 3. The universal forwarders monitor log files and forward them to the HFs, which then forward them to the indexer cluster.
Strange things keep happening for no apparent reason. After the cluster has been running for a while, events for some sourcetypes are duplicated; sometimes each event is repeated 5 times. If I restart the heavy forwarders, the duplication disappears and the whole cluster returns to normal, but sometimes I also need to restart the universal forwarders for it to work.
Later, events for some sourcetypes are duplicated again, and I need to restart the HF or UF again to return to a normal state.
I tried to find the cause in the indexers' splunkd.log, but I didn't find any clues.
I suspect a problem with index replication, but I couldn't find any error logs. Why does it return to normal when I restart the HF or UF?

richgalloway · ‎07-26-2018

In addition to @woodcock's great answer, you should avoid the intermediate HF if you don't need it for a specific purpose. UFs distribute events among indexers better than an HF. Also, the HF can actually make the indexers work harder to process events.

If you eliminate the HF, be sure to set useACK=true on the UFs.
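A minimal sketch of what that could look like in outputs.conf on each UF, assuming the UFs send straight to the indexers. The group name and indexer hostnames are hypothetical placeholders, and 9997 is only the conventional receiving port; substitute your own values.

```
# outputs.conf on each universal forwarder
# (group name and hostnames below are hypothetical examples)
[tcpout]
defaultGroup = primary_indexers

[tcpout:primary_indexers]
# List every indexer so the UF can load-balance across all of them
server = idx1.example.com:9997, idx2.example.com:9997, idx3.example.com:9997
# Wait for the indexer to acknowledge each block of data before discarding it
useACK = true
```

Note that the setting name is case-sensitive, so it is written `useACK` in the file.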

---
If this reply helps you, Karma would be appreciated.

mdsnmss · ‎07-25-2018

I've seen something similar before, and for us it seemed to be due to a misbehaving indexer. When you search the events and see duplicates, what does the splunk_server field show? The splunk_server field shows which indexer the search is pulling the event from. In our case, every set of duplicates had one server in common, while the remaining copies were distributed across the other indexers. We identified the problem indexer and took it out of the cluster, which resolved the issue. Since it is a single-site cluster, you shouldn't have an issue with search affinity.

felipesewaybric · ‎07-25-2018

Can you send the outputs.conf and the inputs.conf?

woodcock · ‎07-25-2018

For UF->HF, set useACK to false, but for HF/UF->IDX, set useACK to true. Also be sure to use EVENT_BREAKER everywhere.
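As a rough illustration of the EVENT_BREAKER part, this is what it might look like in props.conf on a UF. The sourcetype name is a hypothetical placeholder, and the regex shown is just the simple line-break pattern; a multi-line sourcetype would need a pattern that matches its own event boundaries.

```
# props.conf on the universal forwarder
# (my_sourcetype is a hypothetical example; use your real sourcetype names)
[my_sourcetype]
# Let the UF recognize event boundaries so it can switch indexers between events
EVENT_BREAKER_ENABLE = true
# The first capture group marks the boundary between events
EVENT_BREAKER = ([\r\n]+)
```

Each sourcetype that should be broken this way needs its own stanza.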

hurricanelabs · ‎07-25-2018

What version of Splunk are you running?

What search heads are listed in the Cluster Master? (It should just be the search heads and the cluster master, not any of the other stuff)

What does your outputs.conf look like on the HF?

vidhyaArumalla · ‎07-25-2018

I am facing a similar issue on our cloud architecture, but our on-prem architecture has not had this issue so far.
I wonder whether it has something to do with time zones.
