Monitoring Splunk

splunkd PID going "defunct" after restart

splunkuseradmin
Path Finder

Hi all,

I'm running into an issue that may be a known one; please help.
Every so often the splunkd service on one indexer or another goes into a "defunct" state. When I then try to restart the service, all of the PIDs related to Splunk end up in the defunct state as well, so I cannot restart the service either.

Initially it showed the "defunct" process below, so I tried to restart splunk.service.
[root@hostname ~]# ps -ef | grep defunct
root 23399 23167 0 21:49 pts/1 00:00:00 grep --color=auto defunct
svc.spl+ 32079 4383 0 09:07 ? 00:00:03 [splunkd] <defunct>

[root@hostname ~]#

I tried to restart and the status is below.

[root@hostname ~]# systemctl status splunk.service
● splunk.service - Splunk
Loaded: loaded (/usr/lib/systemd/system/splunk.service; enabled; vendor preset: disabled)
Active: activating (start) since Sun 2020-02-02 22:45:54 MST; 1min 9s ago
Process: 29434 ExecStop=/opt/splunk/bin/splunk stop (code=killed, signal=TERM)
Main PID: 4381; : 32121 (splunk)
CGroup: /system.slice/splunk.service
└─32121 /opt/splunk/bin/splunk start --answer-yes --no-prompt --accept-license
‣ 4381 [splunkd]

Feb 02 22:45:54 hostnme.domain.com systemd[1]: Starting Splunk...
Feb 02 22:45:54 hostnme.domain.com splunk[32121]: splunkd 4381 was not running.
Feb 02 22:45:54 hostnme.domain.com splunk[32121]: Stopping splunk helpers...

[root@hostname ~]# ps -ef | grep splunk
svc.spl+ 4381 1 99 Jan30 ? 5-12:21:04 [splunkd]
svc.spl+ 7638 1 0 09:05 ? 00:00:37 [splunkd]
svc.spl+ 7639 7638 0 09:05 ? 00:00:00 [splunkd]
svc.spl+ 26167 1 0 09:06 ? 00:00:14 [splunkd]
svc.spl+ 26168 26167 0 09:06 ? 00:00:00 [splunkd]
svc.spl+ 32079 1 0 09:07 ? 00:00:03 [splunkd]
svc.spl+ 32080 32079 0 09:07 ? 00:00:00 [splunkd]
svc.spl+ 32121 1 0 22:45 ? 00:00:00 /opt/splunk/bin/splunk start --answer-yes --no-prompt --accept-license
root 32281 23128 0 22:47 pts/1 00:00:00 grep --color=auto splunk
svc.spl+ 36590 1 0 09:08 ? 00:00:03 [splunkd]
svc.spl+ 36597 36590 0 09:08 ? 00:00:00 [splunkd]
svc.spl+ 36744 1 0 09:08 ? 00:00:08 [splunkd]
svc.spl+ 36746 36744 0 09:08 ? 00:00:00 [splunkd]
svc.spl+ 45222 1 0 09:10 ? 00:00:01 [splunkd]
svc.spl+ 45224 45222 0 09:10 ? 00:00:00 [splunkd]

Any help would be appreciated.

Thank you.


codebuilder
Influencer

This happens when you initially install/start Splunk as root, but then change the owner to "splunk", for example. To clean up:
Stop Splunk gracefully (systemctl stop splunk).
Check for any remaining processes (ps -ef | grep -i splunk) and kill any that remain (e.g. kill -9 <PID>).
Ensure that Splunk is set to run as the user you intend; check /opt/splunk/etc/splunk-launch.conf.
Restart Splunk and verify. Note: if using systemd, also update the user in the unit file (/etc/systemd/system/splunkd.service, or whatever you named it).
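
The steps above can be sketched as a few shell helpers. This is only a rough sketch under assumptions: the unit name splunk.service and the /opt/splunk path are taken from the question, SPLUNK_OS_USER is the splunk-launch.conf setting assumed to hold the run-as user, and the function names are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: unit name, paths, and function names are assumptions.

# Print the SPLUNK_OS_USER value from a splunk-launch.conf file (empty if unset).
launch_conf_user() {
    local conf="${1:-/opt/splunk/etc/splunk-launch.conf}"
    sed -n 's/^[[:space:]]*SPLUNK_OS_USER[[:space:]]*=[[:space:]]*//p' "$conf" | tail -n 1
}

# List splunkd processes with their state; a STAT starting with "Z" is defunct.
# The [s] in the pattern keeps awk from matching its own command line.
list_splunkd() {
    ps -eo pid,ppid,stat,user,args | awk 'NR == 1 || /[s]plunkd/'
}

# Stop the unit, show what is left, and print both configured users to compare.
cleanup_check() {
    local unit="${1:-splunk.service}"
    systemctl stop "$unit"
    list_splunkd
    echo "splunk-launch.conf user: $(launch_conf_user)"
    systemctl show -p User "$unit"   # prints User=<name> from the unit file
}
```

Run cleanup_check as root after sourcing the file. Keep in mind that kill -9 has no effect on a process that is already defunct; such an entry only disappears once its parent reaps it or exits.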


harsmarvania57
Ultra Champion

Have you looked at dmesg or /var/log/messages? It looks like the splunk process was killed, but you need to check the OS logs to find out why (for example, the OOM killer).
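
One possible way to narrow down those logs is a small filter like the one below. It is a sketch, not a complete check: the function name scan_kernel_log is made up, and the patterns cover only the usual OOM-killer and hung-task wording that the kernel logs.

```shell
#!/usr/bin/env bash
# Sketch only: scan_kernel_log is a hypothetical helper name.

# Filter kernel log text on stdin for OOM-killer and hung-task messages.
scan_kernel_log() {
    grep -E -i 'out of memory|oom-killer|killed process|blocked for more than'
}
```

Typical use would be dmesg -T | scan_kernel_log, or scan_kernel_log < /var/log/messages (reading that file usually needs root).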


splunkuseradmin
Path Finder

I see some hung-task kernel messages such as "Splunkd : is blocked for more than 120 secs".
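
Messages of that kind come from the kernel's hung-task detector, which reports tasks stuck in uninterruptible sleep (state D), often while waiting on storage I/O. A quick way to see which processes are in that state right now might look like this; filter_dstate is a made-up helper name and the ps column layout is an assumption.

```shell
#!/usr/bin/env bash
# Sketch only: filter_dstate is a hypothetical helper name.

# Keep the header line plus rows whose third column (STAT) starts with "D".
filter_dstate() {
    awk 'NR == 1 || $3 ~ /^D/'
}
```

For example: ps -eo pid,ppid,stat,wchan:32,comm | filter_dstate. The WCHAN column hints at what each blocked task is waiting on.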
