Splunk Search

I disabled global metadata on some indexers and now my indexing queue is blocked. Why?

DerekB
Splunk Employee

We were running 4.2 and had a sources.data file that had grown to many GB in size. To resolve this, we upgraded to 4.3.5, which adds the ability to disable global metadata. When we disabled it, the behavior described in the title started. Why did this happen, and what can we do about it?

1 Solution

jbsplunk
Splunk Employee

This is a bug, tracked as SPL-60031, and it is slated to be fixed in 4.3.6; it is not a problem in 5.0. When global metadata is disabled, the marker file (.repairedBucket) is never cleared, because that file only gets cleared inside DatabasePartitionPolicy::rebuildMetaData. This causes the bucket manifest to be regenerated very frequently (several times a second in some cases), which blocks and fills the indexing queue and backs up all the other queues.

A workaround for the issue can be implemented by modifying this setting in indexes.conf:

serviceMetaPeriod = <nonnegative integer>

    Defines how frequently metadata is synced to disk, in seconds.
    Defaults to 25 (seconds).
    You may want to set this to a higher value if the sum of your metadata file sizes is larger than many
    tens of megabytes, to avoid the hit on I/O in the indexing fast path.
    Highest legal value is 4294967295.

After changing the setting and restarting the indexer, we no longer saw blocked indexing queues. Raising it from 25 to 150 seemed to help quite a bit in the case where I saw this behavior.
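As a rough sketch, the change could look like the following in indexes.conf; the index name main, the file location, and the value 150 are illustrative assumptions, not details from the original post:

    # Example indexes.conf (assumed location: $SPLUNK_HOME/etc/system/local/indexes.conf)
    # Raise serviceMetaPeriod from the 25-second default to sync metadata to disk less often.

    # Apply to a single index (hypothetical index name):
    [main]
    serviceMetaPeriod = 150

    # Or apply to all indexes via the default stanza:
    [default]
    serviceMetaPeriod = 150

Restart splunkd on the affected indexers after editing so the change takes effect.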

