What is causing the Monitoring Console Health Check warning "Saturation of event-processing queues"?

tcmarquesi
Explorer

Monitoring saturation of event-processing queues in Heavy Forwarders

I have a distributed environment with multiple indexers, search heads, and a pair of heavy forwarders. Over the last few days, one of my heavy forwarders has started reporting an issue: the Monitoring Console's Health Check warns "Saturation of event-processing queues". In addition, the heavy forwarder's performance has dropped considerably, delaying event delivery and causing script executions to fail. splunkd is consuming 100% of its CPU core all the time.

The docs (Identify and triage indexing performance problems) suggest determining the queue fill pattern through Monitoring Console > Indexing > Indexing Performance: Instance. But that seems to apply only to the indexers, not to the heavy forwarder.

How can I find out what is causing this issue, and how can I monitor it? How can I see when it starts and how long it lasts, so that I can correlate it with the behavior of other systems? Is this information available in the Monitoring Console?

Thanks in advance and regards,

Tiago

gjanders
SplunkTrust

In Alerts For Splunk Admins I have an alert called "IndexerLevel - Indexer Queues May Have Issues" (refer to the GitHub location if you don't want to download the app).

The Monitoring Console covers this under "Use the monitoring console to view indexing performance".

In terms of finding a cause, there are various posts on the Answers site; try this Google search for a start.
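
For seeing when saturation starts and how long it lasts on the heavy forwarder itself, one option is to read the queue samples that splunkd writes to $SPLUNK_HOME/var/log/splunk/metrics.log. The following is only a minimal sketch in Python, not part of the app mentioned above. It assumes the usual `group=queue` line layout with `name`, `max_size_kb`, `current_size_kb` and an optional `blocked=true` field, so check those field names against your own log first. The function names and the 90% threshold are arbitrary choices made for illustration.

```python
import re
import sys
from collections import defaultdict

# Matches key=value pairs in a metrics.log line, e.g. "name=parsingqueue".
_KV = re.compile(r"(\w+)=([^,\s]+)")


def parse_queue_line(line):
    """Return the fields of a group=queue line, or None for any other line."""
    fields = dict(_KV.findall(line))
    if fields.get("group") != "queue":
        return None
    # Assumption: the line starts with "<date> <time>", as in metrics.log.
    fields["timestamp"] = " ".join(line.split()[:2])
    return fields


def fill_percent(fields):
    """Queue fill as a percentage of its configured maximum size."""
    max_kb = float(fields.get("max_size_kb", 0))
    cur_kb = float(fields.get("current_size_kb", 0))
    return 100.0 * cur_kb / max_kb if max_kb else 0.0


def summarize(lines, threshold=90.0):
    """Per queue: sample count, saturated and blocked samples, peak fill,
    and the timestamp of the first sample at or above the threshold."""
    stats = defaultdict(lambda: {"samples": 0, "saturated": 0, "blocked": 0,
                                 "peak": 0.0, "first_saturated": None})
    for line in lines:
        fields = parse_queue_line(line)
        if fields is None:
            continue
        entry = stats[fields.get("name", "unknown")]
        pct = fill_percent(fields)
        entry["samples"] += 1
        entry["peak"] = max(entry["peak"], pct)
        if fields.get("blocked") == "true":
            entry["blocked"] += 1
        if pct >= threshold:
            entry["saturated"] += 1
            if entry["first_saturated"] is None:
                entry["first_saturated"] = fields["timestamp"]
    return dict(stats)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python queue_fill.py /path/to/metrics.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for name, entry in sorted(summarize(handle).items()):
            print(name, entry)
```

Run against the heavy forwarder's metrics.log, this lists each queue with its peak fill and the first time it crossed the threshold. In a chain of queues, the furthest-downstream queue that saturates is usually the first place to look.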

ddrillic
Ultra Champion

You should of course try to find the cause, but keep in mind that the default queue sizes are tiny. On our indexers we ended up with something like the following in $SPLUNK_HOME/etc/system/local/server.conf:

[queue=AEQ]
maxSize = 200MB

[queue=parsingQueue]
# Default maxSize = 6MB
maxSize = 3600MB

[queue=indexQueue]
maxSize = 4000MB

[queue=typingQueue]
maxSize = 2100MB

[queue=aggQueue]
# Default maxSize = 1MB
maxSize = 3500MB

This memory buffer helped us remain stable during peak usage.
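
Before copying sizes like these, it is worth adding up how much memory the queues could hold if they all filled at once. The following is a rough sketch in Python that parses the stanzas above and sums their `maxSize` values. The helper names are made up for illustration, and it only handles sizes written with a KB, MB or GB suffix (a bare integer, which is also accepted for `maxSize`, would raise an error here).

```python
import configparser
import re

# The stanzas quoted in the post above.
CONF = """
[queue=AEQ]
maxSize = 200MB

[queue=parsingQueue]
maxSize = 3600MB

[queue=indexQueue]
maxSize = 4000MB

[queue=typingQueue]
maxSize = 2100MB

[queue=aggQueue]
maxSize = 3500MB
"""

_UNITS_IN_MB = {"KB": 1 / 1024, "MB": 1, "GB": 1024}


def size_to_mb(value):
    """Convert a value such as '200MB' to megabytes."""
    match = re.fullmatch(r"\s*(\d+)\s*(KB|MB|GB)\s*", value, re.IGNORECASE)
    if match is None:
        raise ValueError(f"unsupported size: {value!r}")
    return int(match.group(1)) * _UNITS_IN_MB[match.group(2).upper()]


def queue_sizes_mb(conf_text):
    """Map each [queue=<name>] stanza to its maxSize in megabytes."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep 'maxSize' case-sensitive
    parser.read_string(conf_text)
    sizes = {}
    for section in parser.sections():
        if section.startswith("queue=") and parser.has_option(section, "maxSize"):
            name = section.split("=", 1)[1]
            sizes[name] = size_to_mb(parser.get(section, "maxSize"))
    return sizes


sizes = queue_sizes_mb(CONF)
for name, mb in sizes.items():
    print(f"{name}: {mb} MB")
print(f"total: {sum(sizes.values())} MB")
```

For the values above the total comes to 13400 MB, so the host needs roughly 13 GB of memory to spare before these queues can be allowed to fill completely.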
