Monitoring Splunk

Why is Splunk crashing whenever I try to start the splunkd service?

kiran331
Builder

Hi all,

Splunk crashes whenever I try to start the service. Here's the crash report.

Received fatal signal 6 (Aborted).
Cause:
   Signal sent by PID 3263 running under UID 31204.
Crashing thread: SplunkdSpecificInitThread
Registers:
    RIP:  [0x00007F3194A775F7] gsignal + 55 (/lib64/libc.so.6 + 0x355F7)
    RDI:  [0x0000000000000CBF]
    RSI:  [0x0000000000000CCF]
    RBP:  [0x00007F3194BC0288]
    RSP:  [0x00007F318E5FE458]
    RAX:  [0x0000000000000000]
    RBX:  [0x00007F3194A41000]
    RCX:  [0xFFFFFFFFFFFFFFFF]
    RDX:  [0x0000000000000006]
    R8:  [0x00007F3189E00000]
    R9:  [0x00007F318FFD3880]
    R10:  [0x0000000000000008]
    R11:  [0x0000000000000202]
    R12:  [0x00007F3197B82570]
    R13:  [0x00007F3197C36D60]
    R14:  [0x00007F318DE4A460]
    R15:  [0x00007F318E5FE950]
    EFL:  [0x0000000000000202]
    TRAPNO:  [0x0000000000000000]
    ERR:  [0x0000000000000000]
    CSGSFS:  [0xFFFF000000000033]
    OLDMASK:  [0x0000000000000000]

OS: Linux
Arch: x86-64

Backtrace (PIC build):
  [0x00007F3194A775F7] gsignal + 55 (/lib64/libc.so.6 + 0x355F7)
  [0x00007F3194A78CE8] abort + 328 (/lib64/libc.so.6 + 0x36CE8)
  [0x00007F3194A70566] ? (/lib64/libc.so.6 + 0x2E566)
  [0x00007F3194A70612] ? (/lib64/libc.so.6 + 0x2E612)
  [0x00007F3196A066CD] _ZN14IndexerService35disableIndexesAndReinitGlobalConfigERKN9__gnu_cxx17__normal_iteratorIPK3StrSt6vectorIS2_SaIS2_EEEESA_ + 1741 (splunkd + 0x9B76CD)
  [0x00007F3196A076E7] _ZN14IndexerService18initPerIndexConfigEP9StrVectorb + 455 (splunkd + 0x9B86E7)
  [0x00007F3196A09CB1] _ZN14IndexerService12reloadConfigERK14IndexConfigRef + 481 (splunkd + 0x9BACB1)
  [0x00007F3196FE4050] _ZN9EventLoop20internal_runInThreadEP13InThreadActorb + 256 (splunkd + 0xF95050)
  [0x00007F3196A05BA8] _ZN14IndexerService16loadLatestConfigEP14IndexConfigRef + 808 (splunkd + 0x9B6BA8)
  [0x00007F3196A05D1B] _ZN14IndexerService16loadLatestConfigEv + 43 (splunkd + 0x9B6D1B)
  [0x00007F3196A0A3AB] _ZN14IndexerServiceC2Ev + 859 (splunkd + 0x9BB3AB)
  [0x00007F3196A0A847] _ZN14IndexerService14_new_singletonEv + 55 (splunkd + 0x9BB847)
  [0x00007F31966AD84F] _ZN25SplunkdSpecificInitThread4mainEv + 159 (splunkd + 0x65E84F)
  [0x00007F31970A1490] _ZN6Thread8callMainEPv + 64 (splunkd + 0x1052490)
  [0x00007F3194E0ADC5] ? (/lib64/libpthread.so.0 + 0x7DC5)
  [0x00007F3194B3828D] clone + 109 (/lib64/libc.so.6 + 0xF628D)
Linux / pcpnplsplidx01 / 3.10.0-327.el7.x86_64 / #1 SMP Thu Oct 29 17:29:29 EDT 2015 / x86_64
Last few lines of stderr (may contain info on assertion failure, but also could be old):
    2016-05-21 21:37:57.820 -0500 splunkd started (build f2c836328108)
    splunkd: /home/build/build-src/galaxy/src/pipeline/indexer/IndexerService.cpp:921: void IndexerService::disableIndexesAndReinitGlobalConfig(const const_iterator&, const const_iterator&): Assertion `0 && "Cannot disable indexes on a clustering slave."' failed.
    2016-05-21 21:42:25.272 -0500 splunkd started (build f2c836328108)
    splunkd: /home/build/build-src/galaxy/src/pipeline/indexer/IndexerService.cpp:921: void IndexerService::disableIndexesAndReinitGlobalConfig(const const_iterator&, const const_iterator&): Assertion `0 && "Cannot disable indexes on a clustering slave."' failed.

/etc/redhat-release: Red Hat Enterprise Linux Server release 7.2 (Maipo)
glibc version: 2.17
glibc release: stable
Last errno: 2
Threads running: 23
Runtime: 2.965932s
argv: [splunkd -p 8089 start]
Thread: "SplunkdSpecificInitThread", did_join=0, ready_to_run=Y, main_thread=N
First 8 bytes of Thread token @0x7f3190276410:
00000000  00 f7 5f 8e 31 7f 00 00                           |.._.1...|
00000008

InThreadActor @0x7f318e5feaa0: _queuedOn=(nil), ran=N, wantWake=Y, wantFailIfLoopDone=N
First 128 bytes of InThreadActor object @0x7f318e5feaa0:
00000000  f8 78 17 98 31 7f 00 00  01 00 00 8e 31 7f 00 00  |.x..1.......1...|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000030  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000050  00 a0 e4 8d 31 7f 00 00  00 f0 e4 8d 31 7f 00 00  |....1.......1...|
00000060  e0 eb 5f 8e 31 7f 00 00  95 9a e4 72 7f 83 d3 1c  |.._.1......r....|
00000070  50 9b 04 96 31 7f 00 00  50 eb 5f 8e 31 7f 00 00  |P...1...P._.1...|
00000080


x86 CPUID registers:
         0: 0000000F 756E6547 6C65746E 49656E69
         1: 000306F2 03020800 9ED83203 1FABFBFF
         2: 76036301 00F0B5FF 00000000 00C10000
         3: 00000000 00000000 00000000 00000000
         4: 00000000 00000000 00000000 00000000
         5: 00000000 00000000 00000000 00000000
         6: 00000075 00000002 00000009 00000000
         7: 00000000 00000000 00000000 00000000
         8: 00000000 00000000 00000000 00000000
         9: 00000000 00000000 00000000 00000000
         A: 07300401 0000007F 00000000 00000000
         B: 00000000 00000000 000000CD 00000003
         C: 00000000 00000000 00000000 00000000
         D: 00000000 00000000 00000000 00000000
         E: 00000000 00000000 00000000 00000000
         F: 00000000 00000000 00000000 00000000
  80000000: 80000008 00000000 00000000 00000000
  80000001: 00000000 00000000 00000001 28100800
  80000002: 65746E49 2952286C 6F655820 2952286E
  80000003: 55504320 2D354520 30333632 20337620
  80000004: 2E322040 48473034 0000007A 00000000
  80000005: 00000000 00000000 00000000 00000000
  80000006: 00000000 00000000 01006040 00000000
  80000007: 00000000 00000000 00000000 00000100
  80000008: 00003028 00000000 00000000 00000000
terminating...

acharlieh
Influencer

It's funny getting the notification for this today. I actually just ran into the same crash myself recently. I opened support case 440220, which resulted in enhancement request ENH-6091. If you have a support account and want to be notified about it, you can log a case and ask to be added to the CC list for these.

In $SPLUNK_HOME/var/log/splunk/splunkd.log (or one of the rolled copies if it's been a while), at a timestamp just before my crash, I saw messages like this:

01-10-2017 17:30:01.936 -0600 ERROR DatabaseDirectoryManager - idx=idxname bucket=db_1484082640_1483977006_1_{guid} Detected directory manually copied into its database, causing id conflicts [path1='{idx:homePath}/db_1484082715_1483977061_1_{guid}' path2='/{idx:homePath}/db_1484082640_1483977006_1_{guid}'].
01-10-2017 17:30:01.936 -0600 ERROR IndexerService - Error intializing IndexerService: idx=idxname bucket=db_1484082640_1483977006_1_{guid} Detected directory manually copied into its database, causing id conflicts [path1='/{idx:homePath}/db_1484082715_1483977061_1_{guid}' path2='/{idx:homePath}/db_1484082640_1483977006_1_{guid}'].

After fixing the conflicting buckets, I was able to start successfully, as @kiran331 mentioned. I had to do a couple of rounds, since it reported only a single pair of buckets per crash.
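If it helps anyone, here is a minimal Python sketch for pulling those conflict pairs out of splunkd.log instead of reading through it by eye. The regular expression is only modeled on the two messages quoted above, and the function name `find_bucket_conflicts` is my own, so treat it as a starting point rather than anything official.

```python
import re

# Pattern modeled on the DatabaseDirectoryManager / IndexerService errors
# quoted above; the exact wording may differ between Splunk versions.
CONFLICT_RE = re.compile(
    r"idx=(?P<idx>\S+) bucket=\S+ Detected directory manually copied into its "
    r"database, causing id conflicts "
    r"\[path1='(?P<path1>[^']*)' path2='(?P<path2>[^']*)'\]"
)

def find_bucket_conflicts(lines):
    """Return a sorted list of unique (idx, path1, path2) tuples found in the lines."""
    found = set()
    for line in lines:
        match = CONFLICT_RE.search(line)
        if match:
            found.add((match.group("idx"), match.group("path1"), match.group("path2")))
    return sorted(found)
```

You could call it with an open file handle, for example `find_bucket_conflicts(open(path, errors="replace"))`, where `path` points at splunkd.log or one of its rolled copies. Since only one pair seems to be reported per crash, expect to rerun it after each failed start.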

MichaelRye
Engager

I did finally manage to find the offending bucket(s). After removing the ones that were manually copied in, startup works again and we're back up and running. Thank you!
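For anyone else hunting for offending buckets when the log only names one pair at a time, a rough Python sketch like the following can list every set of bucket directories that share an ID. It assumes directory names of the form `db_<newest>_<oldest>_<id>_<guid>`, which is inferred from the paths in the error message earlier in this thread; the `rb_` prefix for replicated copies and the function name `find_id_conflicts` are my assumptions as well.

```python
import re
from collections import defaultdict

# Assumed layout, inferred from the paths in the error message:
# <prefix>_<newest>_<oldest>_<id>[_<guid>], with prefix db or rb (assumption).
BUCKET_RE = re.compile(r"^(?:db|rb)_(\d+)_(\d+)_(\d+)(?:_(.+))?$")

def find_id_conflicts(dir_names):
    """Group bucket directory names by (id, guid); return groups with more than one member."""
    groups = defaultdict(list)
    for name in dir_names:
        match = BUCKET_RE.match(name)
        if match:
            groups[(int(match.group(3)), match.group(4))].append(name)
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}
```

Feed it the output of `os.listdir()` on an index's home path. It only reads names and never touches the buckets, so deciding which copy to move aside is still a manual step.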


acharlieh
Influencer

If you have a support contract, I would definitely log a case for this. I should be able to disable indexes across an entire cluster without a crash. (Disabling on an individual slave should not happen, but ideally Splunk would fail more gracefully when it detects this state instead of crashing.)


katanguriabhi
Explorer

@kiran331 what was the solution for this issue?


kiran331
Builder

In the crash log I saw that a replicated bucket was causing errors. I removed the bucket and the Splunk service started.


MichaelRye
Engager

I have the same problem here on one of my indexers, but I do not see a bucket name or ID. Where does the crash log show the bucket?


katanguriabhi
Explorer

I did the same, but it is still not coming up. I just don't know what else the problem might be.


kiran331
Builder

Ok. Better to file a case with Support.


Richfez
SplunkTrust

It appears you have a config in place that attempts to disable indexes on a clustered slave.

I would check what changes have taken place in your configs between the last restart of Splunk and this most recent one. A review of those changes will probably point out where it's being disabled from.
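As a quick way to narrow that down, here is a small Python sketch that reports which stanzas in an indexes.conf-style file set `disabled` to a truthy value. The function name `disabled_stanzas` is hypothetical, the parser is deliberately simplistic (no line continuations), and treating only `1` and `true` as truthy is an assumption.

```python
def disabled_stanzas(conf_text):
    """Return stanza names whose 'disabled' setting is truthy, in order of appearance."""
    truthy = {"1", "true"}   # assumption: other spellings are not handled here
    current = "default"      # settings before any [stanza] are treated as global
    result = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "disabled" and value.strip().lower() in truthy:
            if current not in result:
                result.append(current)
    return result
```

Run it against the text of each indexes.conf you find under $SPLUNK_HOME/etc. If you would rather see the merged view that Splunk itself computes, `splunk btool indexes list --debug` should show each setting along with the file it came from.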
