Monitoring Splunk

How to determine why Splunk 6.2.1 shut down unexpectedly?

sc0tt
Builder

Splunk (6.2.1) unexpectedly shut down and needed to be restarted. There were no issues with the server itself, so the problem seems to be specific to Splunk. Is there a way to determine the root cause? Are there any specific log entries that should be checked?

1 Solution

sc0tt
Builder

Thanks for all the suggestions. It seems to be a memory usage issue related to the fact that Splunk 6.2.1 ignores the user time zone setting for cron-scheduled searches and runs them against system time instead. We have many saved searches scheduled across multiple time zones, but now they all run at the same EST time, which uses more resources.
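For anyone trying to confirm the same thing, one rough check is to chart scheduler activity from the _internal index and look for a pile-up of concurrent scheduled searches at one time of day. The CLI path, time range, and span below are only examples, so adjust them to your environment:

$SPLUNK_HOME/bin/splunk search 'index=_internal sourcetype=scheduler earliest=-24h | timechart span=5m count by app'

A sharp spike in scheduled-search counts at a single hour would line up with the cron/time-zone behavior described above.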

damionsteen
New Member

I ran into the same issue on Red Hat:

Feb 20 15:46:09 DCTM1 kernel: Out of memory: Kill process 27016 (splunkd) score 193 or sacrifice child
Feb 20 15:46:09 DCTM1 kernel: Killed process 27016 (splunkd) total-vm:6018500kB, anon-rss:2398292kB, file-rss:1460kB, shmem-rss:0kB
Feb 20 15:46:09 DCTM1 kernel: splunkd: page allocation failure: order:0, mode:0x201da
Feb 20 15:46:09 DCTM1 kernel: CPU: 1 PID: 27016 Comm: splunkd Not tainted 3.10.0-514.el7.x86_64 #1

thomrs
Communicator

I had an issue where Red Hat was killing Splunk because of memory pressure; the messages were in /var/log/messages. As mentioned above, searching _internal (or even _*) should help determine the cause. The system activity dashboards are another good place to look.
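If the OOM killer is the suspect, something along these lines (assuming a standard RHEL syslog setup; paths and patterns may need adjusting) will surface the kill messages:

grep -iE 'out of memory|oom-killer|killed process' /var/log/messages
dmesg | grep -i 'killed process'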

esix_splunk
Splunk Employee

If the Splunk instance is back up and running, run a search for "index=_internal" over the time range when it crashed and start looking through the events.
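For example, from the CLI (the time window here is just a placeholder; point it at the window around the crash):

$SPLUNK_HOME/bin/splunk search 'index=_internal source=*splunkd.log* (log_level=ERROR OR log_level=WARN) earliest=-2h latest=now'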

harshilmarvani1
New Member

Have you checked the crash logs in the Splunk log directory?

Check your ulimit as well; on Linux the default is 1024, and Splunk suggests at least 8192 for the user that splunkd runs as.
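A quick way to check both; the paths assume a default $SPLUNK_HOME, and the "splunk" account name is only an example, so use whatever user splunkd actually runs as:

ls -lt $SPLUNK_HOME/var/log/splunk/crash-*.log
ulimit -n                     # limit for the current shell
su - splunk -c 'ulimit -n'    # limit for the (example) splunk user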
