
splunkd died every day with the same error

vincenty
Explorer

splunkd dies every day with the same error:
FATAL ProcessRunner - Unexpected EOF from process runner child!
ERROR ProcessRunner - helper process seems to have died (child killed by signal 9: Killed)!

I can't see anything that may have caused this. After a restart, splunkd doesn't last 24 hours before dying again.

Here's a partial log:
04-13-2013 13:37:03.498 +0000 WARN FilesystemChangeWatcher - error getting attributes of path "/home/c9logs/c9logs/edgdc2/sdi_slce28vmf6011/.zfs/snapshot/.auto-1365364800/config/m_domains/tasdc2_domain/servers/AdminServer/adr": Permission denied
04-13-2013 13:37:03.499 +0000 WARN FilesystemChangeWatcher - error getting attributes of path "/home/c9logs/c9logs/edgdc2/sdi_slce28vmf6011/.zfs/snapshot/.auto-1365364800/config/m_domains/tasdc2_domain/servers/AdminServer/sysman": Permission denied
04-13-2013 13:38:37.102 +0000 FATAL ProcessRunner - Unexpected EOF from process runner child!
04-13-2013 13:38:37.325 +0000 ERROR ProcessRunner - helper process seems to have died (child killed by signal 9: Killed)!


codebuilder
Influencer

Your ulimits are not set correctly, or are using the system defaults.
As a result, splunkd is likely using more memory than allowed or available, so the kernel kills the process in order to protect itself.
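
A quick way to compare the limits splunkd actually runs with against what you expect (the "splunk" username and the values below are examples, not recommendations; adjust for your environment):

# Limits of the running splunkd process itself
cat /proc/$(pgrep -o -x splunkd)/limits

# Effective limits for the account assumed to run splunkd (here "splunk")
sudo -u splunk bash -c 'ulimit -n; ulimit -u; ulimit -v'

# Example of raising them persistently in /etc/security/limits.conf (tune for your host):
#   splunk  soft  nofile  64000
#   splunk  hard  nofile  64000
#   splunk  soft  nproc   16000
#   splunk  hard  nproc   16000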


RishiMandal
Explorer

Did you get this resolved?
Can you confirm whether splunkd was getting killed right after an active session was terminated, i.e., as soon as someone logged out of your Splunk session or the server, and whether it died after that?


splunkreal
Motivator

We had this problem with an infinite loop inside a macro (the macro was calling itself), even though we had memory limits configured in the [search] stanza of limits.conf.
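
For reference, these are the kinds of [search] memory settings we mean. The setting names exist in recent Splunk versions (check limits.conf.spec for your release); the values below are only illustrative:

# In $SPLUNK_HOME/etc/system/local/limits.conf (illustrative values)
[search]
enable_memory_tracker = true
search_process_memory_usage_threshold = 4000
search_process_memory_usage_percentage_threshold = 25

You can confirm what splunkd actually picked up with "splunk btool limits list search --debug".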


RishiMandal
Explorer

How did you find the macro that was causing issues and calling itself? That would help me validate the same thing.


splunkreal
Motivator

We correlated it with the changes made that day.
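
If it helps, a rough way to narrow it down on a *nix install (paths are the usual defaults, not taken from the original posts):

# Show every macro definition and which file it comes from
$SPLUNK_HOME/bin/splunk btool macros list --debug

# List macros.conf files modified in the last few days
find $SPLUNK_HOME/etc -name macros.conf -mtime -7 -ls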


mweissha
Path Finder

My two cents: this is memory-related. I'm having the same issue, and a check of /var/log/messages shows:

Apr 20 01:59:06 splog1 kernel: Out of memory: Kill process 45929 (splunkd) score 17 or sacrifice child
Apr 20 01:59:06 splog1 kernel: Killed process 45934, UID 5000, (splunkd) total-vm:66104kB, anon-rss:1260kB, file-rss:4kB

This was happening on a new instance of Enterprise 6.5.3. I traced it to an input source that was particularly large and hadn't been indexed for a while due to the upgrade. I had to restart splunkd a few times on the indexer, and now it's running well.

rsolutions
Path Finder

Was this ever resolved?


rvenkatesh25
Engager

Check syslog/dmesg to see whether the kernel's oom_killer is being invoked:

Out of memory: Kill process 7575 (splunkd) score 201 or sacrifice child
Killed process 7576, UID 1000, (splunkd) total-vm:70232kB, anon-rss:392kB, file-rss:152kB
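
A quick check, assuming a typical Linux host (log file locations vary by distro, and dmesg -T needs a util-linux version that supports it):

# Kernel ring buffer
dmesg -T | grep -iE 'out of memory|oom-killer'

# Persistent syslog, depending on the distro
grep -iE 'out of memory|oom' /var/log/messages /var/log/syslog 2>/dev/null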

gkanapathy
Splunk Employee

Signal 9 is a KILL signal from an external process. It is likely that your OS has some kind of monitor or other setting on it that kills processes that do certain things. Perhaps your administrator is watching for memory usage, access to certain files, or other things. You should consult with your system admin to find out what they have put in place.
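
If the OOM killer is ruled out and you want to see what else is sending the SIGKILL, one option (this assumes auditd is installed; the -k key name here is arbitrary) is to audit the kill syscall:

# Log every kill() call that sends signal 9 (64-bit syscalls)
auditctl -a always,exit -F arch=b64 -S kill -F a1=9 -k whokilledsplunkd

# Later, see which process and user issued the SIGKILL
ausearch -k whokilledsplunkd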
