Splunk Enterprise Security

Downloaded an old snapshot created 485320 seconds ago

sylim_splunk
Splunk Employee

After performing a destructive resync on the 2 members that were out of sync, I saw the messages below on the search heads (SHs) once the process completed.

They have downloaded a snapshot from the captain that is 5 days old.

Does this mean that the captain does not have a snapshot that is more recent than 5 days?

--- resync and results --
$ splunk resync shcluster-replicated-config
Your session is invalid. Please login.
Splunk username: admin
Password:

Downloaded an old snapshot created 485324 seconds ago; Check for clock skew on this member or the captain; If no clock skew is found, check the captain for possible snapshot creation failures
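
As a quick sanity check on the age reported in that message, the seconds value can be converted into days and hours. This is a minimal Python sketch; the helper name `format_age` is made up for illustration and is not part of Splunk.

```python
def format_age(seconds: int) -> str:
    """Break a duration in seconds into days, hours, minutes and seconds."""
    days, rem = divmod(seconds, 86400)   # 86400 seconds per day
    hours, rem = divmod(rem, 3600)       # 3600 seconds per hour
    minutes, secs = divmod(rem, 60)
    return f"{days}d {hours}h {minutes}m {secs}s"


# Value taken from the resync output above.
print(format_age(485324))  # 5d 14h 48m 44s
```

So the snapshot is a little over five and a half days old, which matches the "5 days" estimate.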

sylim_splunk
Splunk Employee

I found the error message below repeating in splunkd.log, which suggests that snapshot creation has been failing for days.

09-18-2019 18:35:58.803 +0000 ERROR ConfReplication - Error creating snapshot: /opt/splunk/var/run/splunk/snapshot/15831677-5b6c4f95a711c6431341ba397e4c6b012a.bundle.f3effb6944a1e.tmp; Configurations changed while generating snapshot, original_latest_change=5b6c4f95a711c6431341ba397e4c6b012a, new_latest_change=2f2baeb33f5867261227d7636d5c7ed3b0d38749; consecutiveRejectionFromNewChanges=336; Check conf.log to see if any app or client is making frequent configuration changes; Continuous snapshot creation failures can lead to configuration replication issues if this member becomes the captain

As the message above suggests, conf.log shows a large number of "addCommit" changes coming from the ES import. These changes update local.meta and interrupt snapshot creation.
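
To make the fields in that error easier to read, here is a small Python sketch that pulls the key=value pairs out of the message. It is only an illustration: `parse_fields` is a hypothetical helper, the sample line is an excerpt of the error quoted above (timestamp and snapshot path omitted), and nothing here is a Splunk API.

```python
import re

# Excerpt of the ERROR line quoted above (timestamp and snapshot path omitted).
LINE = (
    "ERROR ConfReplication - Error creating snapshot: "
    "Configurations changed while generating snapshot, "
    "original_latest_change=5b6c4f95a711c6431341ba397e4c6b012a, "
    "new_latest_change=2f2baeb33f5867261227d7636d5c7ed3b0d38749; "
    "consecutiveRejectionFromNewChanges=336;"
)


def parse_fields(line: str) -> dict:
    """Collect every key=value pair in the line into a dict of strings."""
    return dict(re.findall(r"(\w+)=(\w+)", line))


fields = parse_fields(LINE)
rejections = int(fields["consecutiveRejectionFromNewChanges"])
# Differing change IDs mean a new commit arrived while the snapshot was being built.
changed = fields["original_latest_change"] != fields["new_latest_change"]
print(rejections, changed)  # 336 True
```

A counter of 336 consecutive rejections, together with two different change IDs, is consistent with configuration commits arriving faster than a snapshot can be completed.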

== Use the searches below to identify the changes that interrupt the operation ==
index=_internal source=*/splunkd.log consecutiveRejectionFromNewChanges earliest=-1d latest=now

index=_internal source=*/conf.log* data.task=addCommit | timechart span=5m count by data.optype_desc

In this case the issue was caused by the ES import modular input, which updates several hundred apps and add-ons installed on the SH. The import operation is only needed when new apps or add-ons are installed on the server; without it, ES will not recognize the data to be monitored.
The workaround was to increase the interval of the ES import modular input, for example to 2 hours. The setting is in inputs.conf under /etc/apps/SplunkEnterpriseSecuritySuite. This import has been removed in the latest version, ESS 5.3.1.

The cause depends on the deployment environment: this time it was the ES import, but other apps or add-ons that frequently update configurations could trigger the same problem.
