Splunk Search

job timeout error - what to do without increasing the receiveTimeout in distsearch.conf

splunk_zen
Builder
06-08-2015 15:41:47.050 ERROR HttpClientRequest - HTTP client error: Read Timeout (while accessing     https://ip.1:port/services/streams/search?sh_sid=1433773905.807685)
06-08-2015 15:41:47.051 ERROR HttpClientRequest - HTTP client error: Read Timeout (while accessing     https://ip.2:port/services/streams/search?sh_sid=1433773905.807685)
06-08-2015 15:41:47.051 ERROR HttpClientRequest - HTTP client error: Read Timeout (while accessing https://ip.4:port/services/streams/search?sh_sid=1433773905.807685)

06-08-2015 15:41:47.056 WARN  SearchResultParserExecutor - Socket error during transaction. Timeout error. for collector=spkidx001.iggroup.local
06-08-2015 15:41:47.057 WARN  SearchResultParserExecutor - Socket error during transaction. Timeout error. for collector=spkidx002.iggroup.local
06-08-2015 15:41:47.057 WARN  SearchResultParserExecutor - Socket error during transaction. Timeout error. for collector=spkidx003.iggroup.local


06-08-2015 15:41:47.072 ERROR DispatchThread - sid:1433773905.807685 Timed out waiting for peer spkidx001.iggroup.local.  If this occurs frequently, receiveTimeout in distsearch.conf may need to be increased. Search results might be incomplete!
06-08-2015 15:41:47.075 ERROR DispatchThread - sid:1433773905.807685 Timed out waiting for peer spkidx002.iggroup.local.  If this occurs frequently, receiveTimeout in distsearch.conf may need to be increased. Search results might be incomplete!
06-08-2015 15:41:47.075 ERROR DispatchThread - sid:1433773905.807685 Timed out waiting for peer spkidx003.iggroup.local.  If this occurs frequently, receiveTimeout in distsearch.conf may need to be increased. Search results might be incomplete!

06-08-2015 15:42:46.481 INFO  DispatchThread - Download request for search.log from spkidx004.iggroup.local status=200, msg=OK
06-08-2015 15:42:46.522 INFO  DispatchThread - Download request for search.log from spk_bidx001.iggroup.local status=404, msg=Not Found
06-08-2015 15:42:46.522 ERROR DispatchThread - Failed to download     from 'https://ip_b.1:port/services/search/jobs/remote_shead001.iggroup.local_1433773905.807685/search.log'
06-08-2015 15:42:46.522 WARN  DispatchThread - Failed to download search.log from remote peer 'spk_bidx001.iggroup.local', uri='https://ip_b.14:port', sid='remote_shead001.iggroup.local_1433773905.807685'
06-08-2015 15:42:46.559 INFO  DispatchThread - Download request for search.log from spk_bidx002.iggroup.local status=404, msg=Not Found
06-08-2015 15:42:46.559 ERROR DispatchThread - Failed to download from 'https://ip_b.3:port/services/search/jobs/remote_shead001.iggroup.local_1433773905.807685/search.log'
06-08-2015 15:42:46.559 WARN  DispatchThread - Failed to download search.log from remote peer 'spk_bidx002.iggroup.local', uri='https://ip_b.27:port', sid='remote_shead001.iggroup.local_1433773905.807685'
06-08-2015 15:42:46.596 INFO  DispatchThread - Download request for search.log from spk_bidx003.iggroup.local status=404, msg=Not Found
06-08-2015 15:42:46.597 ERROR DispatchThread - Failed to download from 'https://ip_b.4:port/services/search/jobs/remote_shead001.iggroup.local_1433773905.807685/search.log'
06-08-2015 15:42:46.597 WARN  DispatchThread - Failed to download search.log from remote peer 'spk_bidx003.iggroup.local', uri='https://ip_b.45:port', sid='remote_shead001.iggroup.local_1433773905.807685'
06-08-2015 15:42:46.630 INFO  DispatchThread - Download request for search.log from spk_bidx004.iggroup.local status=404, msg=Not Found
06-08-2015 15:42:46.630 ERROR DispatchThread - Failed to download from 'https://ip_b.2:port/services/search/jobs/remote_shead001.iggroup.local_1433773905.807685/search.log'
06-08-2015 15:42:46.630 WARN  DispatchThread - Failed to download search.log from remote peer 'spk_bidx004.iggroup.local', uri='https://ip_b.57:port', sid='remote_shead001.iggroup.local_1433773905.807685'

The search takes this form (simplified as much as possible, since the error occurs even before the | stats command):

index=a sourcetype=b app.trial ("Complete" OR "Initiated") "finished"

The job inspector shows a runtime of around 603 seconds, which confuses me because other searches I tested kept running for over 1000 seconds.

lguinn2
Legend

My first thought is that the network connection between the search head and the indexers is slow, flaky, or misconfigured. Based on the messages, I don't think it has anything to do with the search that you are running.
A slow or flaky DNS service could also be the cause.

I would check these things and make sure the network and DNS are operating properly before you change any settings in distsearch.conf.
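
For the DNS part, a rough way to see whether name resolution is slow or intermittently failing is to time a few lookups from the search head. This is only a minimal sketch, not a Splunk tool: the function name `time_dns_lookup` is made up for illustration, and the peer names are simply the ones that appear in the log excerpt above.

```python
import socket
import time


def time_dns_lookup(hostname, attempts=3):
    """Resolve hostname several times; return a list of (seconds, ok) tuples."""
    results = []
    for _ in range(attempts):
        start = time.monotonic()
        try:
            # getaddrinfo goes through the system resolver, like most clients do
            socket.getaddrinfo(hostname, None)
            ok = True
        except socket.gaierror:
            ok = False
        results.append((time.monotonic() - start, ok))
    return results


if __name__ == "__main__":
    # Peer names taken from the log excerpt; substitute your own indexers.
    for peer in ("spkidx001.iggroup.local", "spkidx002.iggroup.local"):
        for seconds, ok in time_dns_lookup(peer):
            print(f"{peer}: resolved={ok} in {seconds:.3f}s")
```

Lookups that fail now and then, or that take a large fraction of a second, would point at DNS rather than at Splunk. Keep in mind that the operating system may cache results, so repeated lookups can look faster than the first one.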

Remember that ping, while a useful tool, uses ICMP, which is not the same protocol as an HTTPS/TCP connection. So use it, but a successful ping does not verify an HTTPS/TCP connection; it only shows that the server can be reached.
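
To test the actual TCP/HTTPS path rather than ICMP, something like the sketch below could be run from the search head against each indexer. It is an illustration under assumptions, not an official procedure: `check_tcp` is a hypothetical helper, the port value 8089 is only Splunk's usual default management port (the post elides the real one), and certificate verification is switched off because Splunk instances often use self-signed certificates.

```python
import socket
import ssl
import time


def check_tcp(host, port, timeout=5.0, use_tls=True):
    """Return (ok, seconds, detail) for a TCP connect, optionally with a TLS handshake."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            if use_tls:
                ctx = ssl.create_default_context()
                # Assumption: skip verification so self-signed certs do not fail the check.
                ctx.check_hostname = False
                ctx.verify_mode = ssl.CERT_NONE
                with ctx.wrap_socket(sock, server_hostname=host) as tls:
                    detail = tls.version()
            else:
                detail = "tcp connect ok"
        return True, time.monotonic() - start, detail
    except OSError as exc:
        # Covers refused connections, timeouts, DNS failures, and TLS errors.
        return False, time.monotonic() - start, str(exc)


if __name__ == "__main__":
    PORT = 8089  # assumption: default management port; use your own value
    for peer in ("spkidx001.iggroup.local", "spkidx002.iggroup.local"):
        ok, seconds, detail = check_tcp(peer, PORT)
        print(f"{peer}:{PORT} ok={ok} time={seconds:.3f}s detail={detail}")
```

Running this a number of times over several minutes gives a better picture than a single attempt, since a flaky link may only fail occasionally.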
