Why is a standalone search head getting duplicate events?

cpetterborg
SplunkTrust

We have a standalone search head that was returning duplicate events for every search. We have found the cause and the fix, but I wanted to post it here to help anyone else who runs into this kind of problem. Thanks to Tyler Germer and Duane (a.k.a. duckfez) for their help.

The symptoms are easy to describe. Every search run from the search head against the cluster returns exactly two copies of each event. Doing a dedup on _raw gives the right number of events with no duplicates. It didn't matter what index or source was searched.

The search head had been set up through the UI with the indexers added individually under Settings -> Distributed Search -> Search Peers, instead of being tied to the cluster master, which provides the peers automatically.

1 Solution

cpetterborg
SplunkTrust

So the answer to this problem is to remove all the search peers from the Settings -> Distributed Search -> Search Peers area of the UI, then connect the search head to the cluster master through the UI using Settings -> Indexer Clustering -> Enable Indexer Clustering.

Here is Tyler Germer's explanation of this problem:

This sounds like the standalone search head may not be configured correctly for Indexer Clustering. To elaborate on that: when a Search Head is configured to connect to an Indexer Cluster, it connects to the Cluster Master, which tells it which Indexers are in the Cluster and which of them have the data the Search Head may be searching for. The Cluster Master, in effect, coordinates which Indexers answer each search against the Indexer Cluster.

If you instead add the individual Indexers manually through Settings / Distributed Search, the Search Head will communicate directly with the Indexers. Because your data is replicated between Indexers (Search Factor / Replication Factor), you have multiple copies of your data on multiple Indexers. When you search for that data, more than one Indexer has what you need, so all of them respond with it, and you get duplicate events.
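To make the mechanism concrete, here is a toy model in Python. It is only a sketch, not how Splunk is implemented: the bucket names, indexer names, and the "primary" flag are made up for illustration, on the assumption that a cluster-aware search is answered by one copy of each bucket while manually added peers each answer from every copy they hold.

```python
from collections import Counter

# Hypothetical cluster: every bucket is replicated to two indexers,
# and exactly one of those copies is marked primary.
BUCKETS = {
    "bucket_a": {
        "events": ["event 1", "event 2"],
        "copies": ["idx1", "idx2"],
        "primary": "idx1",
    },
    "bucket_b": {
        "events": ["event 3"],
        "copies": ["idx2", "idx3"],
        "primary": "idx3",
    },
}


def search(primaries_only: bool) -> list[str]:
    """Collect the events returned by every indexer that answers the search."""
    results = []
    for bucket in BUCKETS.values():
        # Cluster-aware search: only the primary copy answers.
        # Manually added peers: every indexer holding a copy answers.
        responders = [bucket["primary"]] if primaries_only else bucket["copies"]
        for _indexer in responders:
            results.extend(bucket["events"])
    return results


via_cluster_master = search(primaries_only=True)
via_manual_peers = search(primaries_only=False)

print(len(via_cluster_master), "events via the cluster master")
print(len(via_manual_peers), "events via manually added peers")
print(Counter(via_manual_peers))  # how many times each event came back
```

With two copies of every bucket, the manual-peer search returns each event twice, which matches the symptom of exactly two events per event, and removing duplicates brings the count back to the cluster-aware result.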

The fix is to remove all Indexers from the list in Distributed Search, then go to Settings / Indexer Clustering, choose Enable Indexer Clustering, configure your Search Head as a Search Head Node, and enter the Cluster Master URI along with the Secret Key. The Search Peers will then be populated automatically by the Cluster Master, and if you add or remove Indexers in the future, the Cluster Master will update that list for you.
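For anyone who manages configuration files rather than clicking through the UI, the same settings can be sketched as a [clustering] stanza in server.conf. The small Python helper below just renders that stanza as text. The helper name, the example host, and the placeholder secret are hypothetical; the keys (mode, master_uri, pass4SymmKey) are assumed from the server.conf clustering settings for releases that use the "cluster master" terminology, so check them against the spec for your version before using them.

```python
import configparser
import io


def render_searchhead_stanza(master_uri: str, secret: str) -> str:
    """Render a server.conf [clustering] stanza for a search head node.

    Hypothetical helper: it only builds text, it does not touch Splunk.
    """
    # interpolation=None so a '%' in the secret is treated literally.
    parser = configparser.ConfigParser(interpolation=None)
    # Preserve key case (configparser lowercases keys by default).
    parser.optionxform = str
    parser["clustering"] = {
        "mode": "searchhead",        # join the cluster as a search head
        "master_uri": master_uri,    # management URI of the cluster master
        "pass4SymmKey": secret,      # the cluster's secret key
    }
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    # Example values only; substitute your own cluster master and key.
    print(render_searchhead_stanza("https://cm.example.com:8089", "<secret key>"))
```

A stanza like this would typically live in the search head's local server.conf and take effect after a restart. The manually added search peers still have to be removed, as described above.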

So, the problem was in the configuration of the standalone search head. When you have a standalone search head in a clustered environment, still connect it to the cluster master so that you don't get duplicate events because of replication.

aaraneta_splunk
Splunk Employee

@cpetterborg - Thanks so much for providing a solution to this issue. Do you think you can post your solution as an answer to be accepted below? That way your question does not look unresolved and it can be easily found by other users with the same issue. Thanks!

aaraneta_splunk
Splunk Employee

Ah-ha! Nevermind--I refreshed and your answer was there 🙂
