Splunk Enterprise

SearcH Head WARN- ConfReplicationThread & ConfMetrics

venkateshparank
Path Finder

Can anyone help why we are seeing these WARN in logs and how to fix permanently.

We are performing manual resync whenever count of events is > 5 in 15mins time range using below query:

index=_internal host=searchhead* component=ConfReplicationThread log_level=WARN "Cannot accept push" |bin span=15m _time|stats max(consecutiveErrors) as count by host,_time|where count>5

LOGS:

==========

WARN ConfReplicationThread - Error pushing configurations to captain=https://searchhead01.domain.com:8089, consecutiveErrors=1 msg="Error in acceptPush: Non-200 status_code=400: ConfReplicationException: Cannot accept push with outdated_baseline_op_id=8d89fca5ef4520b00b8ffe8b1366a178b92b52fb; current_baseline_op_id=a948f0e3f0fcae707ce37ca7d7a73"

ConfReplicationThread - Error pushing configurations to captain=https://searchhead01.domain.com:8089, consecutiveErrors=1 msg="Error in acceptPush: Non-200 status_code=400: ConfReplicationException: Cannot accept push with outdated_baseline_op_id=66098bdc22c2bcacf951fb104558db365ac64820; current_baseline_op_id=085e675e4c9d8c9fafabee"

ConfReplicationThread - Error pulling configurations from captain=https://searchhead01.domain.com:8089, consecutiveErrors=1 msg="Error in fetchFrom, at=a6a747e7138353bd07873f04fe90f2c9b4564567: Network-layer error: Connect Timeout"

ConfReplicationThread - Error pulling configurations from captain=https://searchhead01.domain.com:8089, consecutiveErrors=1 msg="Error in fetchFrom, at=a6a747e7138353bd07873f04fe90f2c9b4564567: Network-layer error: Connect Timeout"

ConfReplicationThread - Error pulling configurations from captain=https://searchhead01.domain.com:8089, consecutiveErrors=1 msg="Error in fetchFrom, at=a6a747e7138353bd07873f04fe90f2c9b4564567: Network-layer error: Connect Timeout"

=============

Even tried to reduce the max_push count to 50 (default is 100). How can we resolve this permanently ?

==============

WARN ConfMetrics - single_action=PUSH_TO took wallclock_ms=1525! Consider a lower value of conf_replication_max_push_count in server.conf on all members.
WARN ConfMetrics - single_action=PUSH_TO took wallclock_ms=2644! Consider a lower value of conf_replication_max_push_count in server.conf on all members.
WARN ConfMetrics - single_action=PULL_FROM took wallclock_ms=2011! Consider a lower value of conf_replication_max_pull_count in server.conf on all members.
WARN ConfMetrics - single_action=PULL_FROM took wallclock_ms=1778! Consider a lower value of conf_replication_max_pull_count in server.conf on all members.

 

Below are the settings in server.conf

conf_replication_max_push_count = 50
conf_replication_purge.period = 3h
conf_replication_period = 10

 

we do not want to do resync everytime.

splunk resync shcluster-replicated-config

Tags (2)
0 Karma

Nisha18789
Builder

hi @venkateshparank , please check this resolved thread, it seems similar to what you are facing

https://community.splunk.com/t5/Deployment-Architecture/Search-Head-Cluster-Error-acceptPush-non-200...

 

0 Karma
Get Updates on the Splunk Community!

Introducing the Splunk Community Dashboard Challenge!

Welcome to Splunk Community Dashboard Challenge! This is your chance to showcase your skills in creating ...

Wondering How to Build Resiliency in the Cloud?

IT leaders are choosing Splunk Cloud as an ideal cloud transformation platform to drive business resilience,  ...

Updated Data Management and AWS GDI Inventory in Splunk Observability

We’re making some changes to Data Management and Infrastructure Inventory for AWS. The Data Management page, ...