Deployment Architecture

splunkd and splunkweb are up on a cluster peer, so why does it repeatedly download the cluster bundle with the following errors?

sat94541
Communicator

We have a clustered environment.
Version: v6.1.1

1 x CM (Multi-Site, rf=2, site_rf=origin:1,site1:1,site2:1,total:2, site_sf=origin:1,site1:1,site2:1,total:2)
2 x SHs (Mounted Bundle was originally enabled, but it had been disabled by the time this problem happened. A misconfiguration still existed on the SH.)
8 x CPs (Two Splunk instances per physical machine; one of the CPs is a search head.)
41 x Indexes db

While we were trying to fix a misconfiguration in indexes.conf and applying a cluster bundle to the peers, one of the cluster peers got stuck endlessly downloading the bundle, with the errors shown below:

07-11-2015 17:13:18.855 -0400 INFO  CMBundleMgr - Removed the untarred bundle folder=/backup/splunk1/splunk/var/run/splunk/cluster/remote-bundle/bce9740371ce42bcc4b9f4920bd99857-1436649198
07-11-2015 17:13:18.855 -0400 INFO  CMBundleMgr - Removed the bundle downloaded from master to '/backup/splunk1/splunk/var/run/splunk/cluster/remote-bundle/bce9740371ce42bcc4b9f4920bd99857-1436649198.bundle'
07-11-2015 17:13:19.880 -0400 ERROR CMBundleMgr - failed to compute checksum err=Truncated tar archive
07-11-2015 17:13:19.935 -0400 INFO  CMBundleMgr - Downloaded bundle to /backup/splunk1/splunk/var/run/splunk/cluster/remote-bundle/3d76761010429034685696b0faeed1f5-1436649199.bundle
07-11-2015 17:13:19.935 -0400 INFO  CMBundleMgr - untarring bundle=/backup/splunk1/splunk/var/run/splunk/cluster/remote-bundle/3d76761010429034685696b0faeed1f5-1436649199.bundle
07-11-2015 17:13:19.944 -0400 INFO  ClusterBundleValidator - Validating bundle path=/backup/splunk1/splunk/var/run/splunk/cluster/remote-bundle/3d76761010429034685696b0faeed1f5-1436649199/apps
......
07-11-2015 17:13:20.206 -0400 INFO  CMBundleMgr - Removed the untarred bundle folder=/backup/splunk1/splunk/var/run/splunk/cluster/remote-bundle/3d76761010429034685696b0faeed1f5-1436649199
07-11-2015 17:13:20.206 -0400 INFO  CMBundleMgr - Removed the bundle downloaded from master to '/backup/splunk1/splunk/var/run/splunk/cluster/remote-bundle/3d76761010429034685696b0faeed1f5-1436649199.bundle'
07-11-2015 17:13:21.234 -0400 ERROR CMBundleMgr - failed to compute checksum err=Truncated tar archive
07-11-2015 17:13:21.294 -0400 INFO  CMBundleMgr - Downloaded bundle to /backup/splunk1/splunk/var/run/splunk/cluster/remote-bundle/4685cbfb25fac1c4945bece7f3287b9d-1436649201.bundle
07-11-2015 17:13:21.294 -0400 INFO  CMBundleMgr - untarring bundle=/backup/splunk1/splunk/var/run/splunk/cluster/remote-bundle/4685cbfb25fac1c4945bece7f3287b9d-1436649201.bundle
07-11-2015 17:13:21.302 -0400 INFO  ClusterBundleValidator - Validating bundle path=/backup/splunk1/splunk/var/run/splunk/cluster/remote-bundle/4685cbfb25fac1c4945bece7f3287b9d-1436649201/apps
..........
07-11-2015 17:13:21.560 -0400 INFO  CMBundleMgr - Removed the untarred bundle folder=/backup/splunk1/splunk/var/run/splunk/cluster/remote-bundle/4685cbfb25fac1c4945bece7f3287b9d-1436649201
07-11-2015 17:13:21.560 -0400 INFO  CMBundleMgr - Removed the bundle downloaded from master to '/backup/splunk1/splunk/var/run/splunk/cluster/remote-bundle/4685cbfb25fac1c4945bece7f3287b9d-1436649201.bundle'
07-11-2015 17:13:22.587 -0400 ERROR CMBundleMgr - failed to compute checksum err=Truncated tar archive
07-11-2015 17:13:22.644 -0400 INFO  CMBundleMgr - Downloaded bundle to /backup/splunk1/splunk/var/run/splunk/cluster/remote-bundle/da769c2cc82d7154a445f326ca44e621-1436649202.bundle
07-11-2015 17:13:22.645 -0400 INFO  CMBundleMgr - untarring bundle=/backup/splunk1/splunk/var/run/splunk/cluster/remote-bundle/da769c2cc82d7154a445f326ca44e621-1436649202.bundle
07-11-2015 17:13:22.651 -0400 INFO  ClusterBundleValidator - Validating bundle path=/backup/splunk1/splunk/var/run/splunk/cluster/remote-bundle/da769c2cc82d7154a445f326ca44e621-1436649202/apps
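
To quantify the loop in an excerpt like the one above, a small script can count the checksum errors and bundle downloads. This is only a sketch: the line layout and the `<hash>-<epoch>.bundle` naming are inferred from the log lines in this post, and `summarize` is a hypothetical helper, not part of Splunk.

```python
import re

# Line layout inferred from the excerpt in this post:
# "<MM-DD-YYYY HH:MM:SS.mmm +-ZZZZ> <LEVEL> <Component> - <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2}\.\d{3} [+-]\d{4})\s+"
    r"(?P<level>[A-Z]+)\s+(?P<component>\w+) - (?P<msg>.*)$"
)
# Bundle files in the excerpt are named <32 hex chars>-<epoch seconds>.bundle
BUNDLE_RE = re.compile(r"([0-9a-f]{32})-(\d+)\.bundle")


def summarize(lines):
    """Count CMBundleMgr checksum errors and bundle downloads in log lines."""
    checksum_errors = 0
    downloaded_ids = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None or match["component"] != "CMBundleMgr":
            continue  # skip elided lines ("......") and other components
        msg = match["msg"]
        if match["level"] == "ERROR" and "failed to compute checksum" in msg:
            checksum_errors += 1
        elif msg.startswith("Downloaded bundle to"):
            bundle = BUNDLE_RE.search(msg)
            if bundle:
                downloaded_ids.append(bundle.group(1))
    return {
        "checksum_errors": checksum_errors,
        "downloads": len(downloaded_ids),
        "distinct_bundle_ids": len(set(downloaded_ids)),
    }


if __name__ == "__main__":
    import sys

    # Usage (path is up to you): python summarize_bundle_log.py splunkd.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        print(summarize(handle))
```

If the number of distinct bundle IDs keeps pace with the number of downloads, as it does in the excerpt above where every download carries a different hash prefix, the peer is fetching a new bundle each cycle rather than retrying the same file.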

We are making a change in etc/slave-apps/all_indexes/local/indexes.conf to fix some error messages. For the change to take effect:

Restart CM.
Rolling restart of the peer.
But 1 of the 8 peers shows a status of down, even though Splunk itself is up on that peer.

esix_splunk
Splunk Employee

Have you tried removing the peer from the cluster, restarting it, and then re-adding it?

Additionally, editing files under slave-apps directly can cause issues. Be careful which user you edit as and what you edit. When you apply a bundle, all the peers pull master-apps/* from the Cluster Master. If file-level permissions are wrong, there will be issues.
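
One way to look for the kind of permission problem described above is to walk the apps directory and flag anything not owned by, or not readable by, the account that runs Splunk. This is a rough sketch for a POSIX system: `find_permission_problems` is a hypothetical helper, and it assumes you run it as the Splunk user (or pass that user's uid explicitly).

```python
import os
import stat


def find_permission_problems(root, expected_uid=None):
    """Return (path, reason) pairs for entries under root that look wrong.

    expected_uid defaults to the uid of the user running this script,
    which is assumed to be the same account that runs Splunk.
    """
    if expected_uid is None:
        expected_uid = os.geteuid()
    problems = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            info = os.lstat(path)  # lstat: report on symlinks themselves
            if info.st_uid != expected_uid:
                problems.append((path, "owner uid %d" % info.st_uid))
            elif not info.st_mode & stat.S_IRUSR:
                problems.append((path, "not owner-readable"))
    return problems


if __name__ == "__main__":
    import sys

    # Usage: pass the directory to check, e.g. the master-apps or
    # slave-apps directory under your Splunk installation.
    for path, reason in find_permission_problems(sys.argv[1]):
        print(path, "-", reason)
```

An empty result does not rule out other causes; it only excludes the simplest ownership and read-permission mismatches.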
