I'd like to know the history of this issue, but I cannot find any evidence of it in the Splunk logs. The issue appears in the UI banner or in email alerts, reporting a timeout to one or more peers.
How do I get this event logged?
Hi the_wolverine,
did you check splunkd.log for DispatchThread messages?
cheers, MuS
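For anyone who wants to script that check, here is a minimal Python sketch that pulls every line mentioning a component name out of a log file. The function name `find_component_lines` is made up for illustration, and the path in the usage comment assumes a default install where `SPLUNK_HOME` is `/opt/splunk`; adjust it for your environment. It does a plain substring match and does not assume anything about the wording of the messages.

```python
def find_component_lines(log_path, component="DispatchThread"):
    """Return the lines of log_path that mention the given component name."""
    matches = []
    # errors="replace" keeps the scan going if the log contains odd bytes.
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if component in line:
                matches.append(line.rstrip("\n"))
    return matches

# Example usage (the path is an assumption, see above):
# for line in find_component_lines("/opt/splunk/var/log/splunk/splunkd.log"):
#     print(line)
```

If this returns nothing, as reported below, the message is not being written to that file under that component name on that host.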
Yes, checked splunkd.log and there is no logged event.
This error occurs when your Search Head attempts to send a search job to a Search Peer (usually one of your Indexers) and the Indexer does not respond within the default timeout period. The search continues, but without using that Indexer, which probably means that some of your events are not returned and your search results are incomplete.
In my experience, the problem can often be cleared simply by restarting the Splunk instance on the Indexer in question, but sometimes you need to dig deeper. In any case, something is keeping your Indexer so busy that it cannot reliably respond to search requests even though the Splunk instance is running. I am sure this kind of thing can also commonly be caused by misconfigured or misbehaving load balancers or other identity/load-shifting equipment that sits between your Search Head and your Indexer peers.
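As a first step in ruling out the network path, a rough Python sketch like the one below can be run from the Search Head to time a plain TCP connection to each peer. This is only an assumption-laden probe, not how Splunk itself checks peers: `check_peer` is a made-up helper, the host name in the usage comment is a placeholder, 8089 is the default Splunk management port (yours may differ), and the 10-second timeout is an arbitrary choice. A successful connect only shows that something is listening and reachable; it does not prove the Indexer is answering search requests in time.

```python
import socket
import time

def check_peer(host, port=8089, timeout=10.0):
    """Try a TCP connection to a peer; return (reachable, seconds_elapsed)."""
    start = time.monotonic()
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            reachable = True
    except OSError:
        reachable = False
    return reachable, time.monotonic() - start

# Example usage (the host name is a placeholder):
# ok, elapsed = check_peer("indexer01.example.com")
# print(f"reachable={ok} elapsed={elapsed:.2f}s")
```

Running this repeatedly and comparing the elapsed times across peers may help show whether one Indexer, or a device in front of it, is consistently slow to accept connections.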