Hi,
We have a Splunk indexer cluster with two indexers in each data center. Occasionally, we see a traffic spike on the network backbone caused by Splunk replication. Specifically, according to the following log, only the .tsidx files are being replicated (not the rawdata itself).
06-17-2016 09:25:53.144 INFO CMSlave - sending search files for bucket bid=main~541~DE42B65D-16D8-4E25-9937-3F45CBCD2376 to guid=FCCBD56B-A38C-4AC8-B1E0-7BEE1291F340
06-17-2016 09:25:53.145 -0500 INFO BucketReplicator - Created asyncReplication task to replicate files="1466027268-1465947413-3839916974449336698.tsidx 1466027824-1465971628-3842925951557594666.tsidx 1466024974-1465947419-3843695670987834010.tsidx 1466010664-1465963140-3841590830700805022.tsidx 1466027849-1465966921-3851985394289068044.tsidx 1466027462-1465947413-3843855991050422627.tsidx 1466027852-1466018039-3853405925451469935.tsidx 1466027852-1465947414-3853405982510090481.tsidx Hosts.data Sources.data SourceTypes.data rawdata/slicemin.dat rawdata/slicesv2.dat merged_lexicon.lex bloomfilter Strings.data .rawSize" bid=main~541~DE42B65D-16D8-4E25-9937-3F45CBCD2376 to guid=FCCBD56B-A38C-4AC8-B1E0-7BEE1291F340 host=10.111.2.40 s2sport=9887
06-17-2016 09:25:53.145 -0500 INFO BucketReplicator - event=asyncSendFiles bid=main~541~DE42B65D-16D8-4E25-9937-3F45CBCD2376 jobId=7 files="1466027268-1465947413-3839916974449336698.tsidx 1466027824-1465971628-3842925951557594666.tsidx 1466024974-1465947419-3843695670987834010.tsidx 1466010664-1465963140-3841590830700805022.tsidx 1466027849-1465966921-3851985394289068044.tsidx 1466027462-1465947413-3843855991050422627.tsidx 1466027852-1466018039-3853405925451469935.tsidx 1466027852-1465947414-3853405982510090481.tsidx Hosts.data Sources.data SourceTypes.data rawdata/slicemin.dat rawdata/slicesv2.dat merged_lexicon.lex bloomfilter Strings.data .rawSize"
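To see at a glance what a replication task like the ones above actually carries, the files="..." field can be split and grouped by file type. The following is only a rough sketch: the function name and the three categories are my own, the sample line is a shortened copy of the asyncSendFiles event, and it assumes file names are space-separated and contain no quotes.

```python
import re
from collections import Counter

# Assumption: each event has a files="..." field with space-separated names
# and a bid=... field, as in the BucketReplicator lines above.
FILES_RE = re.compile(r'files="([^"]*)"')
BID_RE = re.compile(r'\bbid=(\S+)')


def summarize_replication_event(line):
    """Return (bucket id, Counter of file kinds) for one log line, or None."""
    files_match = FILES_RE.search(line)
    if files_match is None:
        return None  # not a line that lists replicated files
    bid_match = BID_RE.search(line)
    bid = bid_match.group(1) if bid_match else None
    kinds = Counter()
    for name in files_match.group(1).split():
        if name.endswith(".tsidx"):
            kinds["tsidx"] += 1
        elif name.startswith("rawdata/"):
            kinds["rawdata"] += 1
        else:
            kinds["other"] += 1
    return bid, kinds


# Shortened copy of the asyncSendFiles event, for illustration only.
sample = (
    "06-17-2016 09:25:53.145 -0500 INFO BucketReplicator - event=asyncSendFiles "
    "bid=main~541~DE42B65D-16D8-4E25-9937-3F45CBCD2376 jobId=7 "
    'files="1466027268-1465947413-3839916974449336698.tsidx Hosts.data '
    'rawdata/slicemin.dat rawdata/slicesv2.dat bloomfilter"'
)
print(summarize_replication_event(sample))
```

Run against the full event, this would count the .tsidx files separately from the rawdata/slicemin.dat and rawdata/slicesv2.dat entries and the remaining metadata files. It only counts file names; the log lines do not include file sizes.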
Since these index files are very large (a few gigabytes), this replication causes network congestion. I know rawdata is replicated gradually, but it looks like the tsidx files are replicated in one shot, which chokes the network when they are large. How can I control this replication so that it does not choke the network?
Thanks