Deployment Architecture

Memory leak on indexers in a cluster

gregbo
Communicator

I have a Splunk cluster (RHEL) with 2 indexers, and they seem to have a memory leak. Memory usage grows steadily until the box runs out of memory and the OOM killer kills the splunkd process. The only way to get the memory back is to restart the server (restarting Splunk doesn't help). I've got two clusters that do this, and a few single-instance Splunk servers that never do. Oddly enough, it's usually one of the indexers in the cluster that eats up memory first. ulimit and THP are set properly on all the servers. Has this happened to anyone?
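
For reference, THP status and the running splunkd's limits can be double-checked on RHEL along these lines (a rough sketch; adjust for your environment):

# Transparent Huge Pages should report [never] on a Splunk box
cat /sys/kernel/mm/transparent_hugepage/enabled

# Limits actually applied to the running splunkd process (not just the shell's ulimit)
cat /proc/$(pgrep -o -x splunkd)/limits

# Watch splunkd resident memory (RSS, in KB) grow over time
ps -o pid,rss,vsz,etime,args -p $(pgrep -d, -x splunkd)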


codebuilder
Influencer

On your indexers, create or update $SPLUNK_HOME/etc/system/local/limits.conf and add the following:

[default]
max_mem_usage_mb = 2000

Where 2000 is the maximum amount of memory, in MB, that Splunk will allow the search process to use on that node.
Cycle Splunk.
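
After the restart, btool can confirm the setting actually took effect (a quick sanity check; output formatting varies a bit by version):

# Show the effective value picked up from the [default] stanza
$SPLUNK_HOME/bin/splunk btool limits list default | grep max_mem_usage_mb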

You may also want to evaluate the number of scheduled searches you have running and the kinds of ad-hoc queries your users are running.
Too many of either can cause this issue as well.
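
To see which searches are the heaviest memory consumers, something along these lines against the _introspection index can help (a sketch only; the splunk_resource_usage field names are my assumption for a reasonably recent Splunk version and may differ on yours):

# Peak memory per search process over the last 24 hours
$SPLUNK_HOME/bin/splunk search 'index=_introspection sourcetype=splunk_resource_usage component=PerProcess data.process_type=search earliest=-24h | stats max(data.mem_used) AS peak_mem_mb BY data.search_props.sid data.search_props.type data.search_props.user | sort - peak_mem_mb'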

----
An upvote would be appreciated and Accept Solution if it helps!

niketn
Legend

@gregbo which version of Splunk are you running?

If you have a Splunk support entitlement, you should work with Splunk Support by providing them a heap dump, a diag file, and the dispatch directory with debug-level details.
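
For the diag and dispatch pieces, something like this on each affected indexer is a reasonable starting point (a sketch; Splunk Support will tell you exactly which components they want):

# Generate a diag bundle for Support (writes a diag-<host>-<timestamp>.tar.gz)
$SPLUNK_HOME/bin/splunk diag

# The dispatch directory Support will ask about lives here
ls -lt $SPLUNK_HOME/var/run/splunk/dispatch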

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"