Restarted Splunk Now Missing All Data

andrewkenth
Communicator

I restarted Splunk, and now all of my data from before today is missing. (I believe the data I can still see was loaded after the restart.)

Can someone help me understand what happened (or could have happened) here?

Everything seems to be owned correctly:

[root@wnl-svr184b var]# ls -l /apps/wcm-splunk/var/lib/splunk
total 92
drwx------ 6 wcsplunku wcsplunku 4096 Nov  5 15:26 audit
-rw------- 1 wcsplunku wcsplunku    2 Dec 10 12:53 _audit.dat
drwx------ 2 wcsplunku wcsplunku 4096 Nov  5 15:26 authDb
drwx------ 6 wcsplunku wcsplunku 4096 Nov  5 15:26 blockSignature
-rw------- 1 wcsplunku wcsplunku    1 Dec 10 12:53 _blocksignature.dat
drwx------ 6 wcsplunku wcsplunku 4096 Nov  7 08:20 charlesriver
-rw------- 1 wcsplunku wcsplunku    2 Dec 10 12:53 charlesriver.dat
drwx------ 6 wcsplunku wcsplunku 4096 Nov  7 14:39 defaultdb
drwx------ 8 wcsplunku wcsplunku 4096 Dec 10 14:28 fishbucket
drwx------ 2 wcsplunku wcsplunku 4096 Nov  5 15:26 hashDb
-rw------- 1 wcsplunku wcsplunku    1 Dec 10 12:53 history.dat
drwx------ 6 wcsplunku wcsplunku 4096 Nov  5 15:26 historydb
-rw------- 1 wcsplunku wcsplunku    2 Dec 10 12:53 _internal.dat
drwx------ 6 wcsplunku wcsplunku 4096 Nov  5 15:26 _internaldb
-rw------- 1 wcsplunku wcsplunku    1 Dec 10 12:53 main.dat
drwx------ 3 wcsplunku wcsplunku 4096 Dec 10 14:17 persistentstorage
-rw------- 1 wcsplunku wcsplunku    1 Dec 10 12:53 summary.dat
drwx------ 6 wcsplunku wcsplunku 4096 Nov  5 15:26 summarydb
drwx------ 6 wcsplunku wcsplunku 4096 Nov  6 11:06 test
drwx------ 6 wcsplunku wcsplunku 4096 Nov  8 09:44 testapp
-rw------- 1 wcsplunku wcsplunku    1 Dec 10 12:53 testapp.dat
-rw------- 1 wcsplunku wcsplunku    1 Dec 10 12:53 test.dat
-rw------- 1 wcsplunku wcsplunku    1 Dec 10 12:53 _thefishbucket.dat

andrewkenth
Communicator

It appears that my index size limit was too small, so the older data was frozen (even though I have no frozen directory configured). I ran this query against the internal logs and found that buckets in the charlesriver index had been frozen:

index=_internal source="/apps/wcm-splunk/var/log/splunk/splunkd.log" charlesriver freeze

It showed records such as this:

11-28-2013 04:03:28.156 -0500 INFO  BucketMover - AsyncFreezer freeze succeeded for bkt='/apps/wcm-splunk/var/lib/splunk/charlesriver/db/db_1385355600_1384810904_22'

Now my question is: can I recover these buckets, or are they lost for good, given that I have no directory configured for frozen data?
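One way to see how the index is actually configured is to look at its effective retention settings. The sketch below filters `splunk btool` output for the size limit, the age limit, and the two frozen-archive settings from indexes.conf. It assumes SPLUNK_HOME is /apps/wcm-splunk (inferred from the paths above, not confirmed), and the helper name `retention_settings` is made up for illustration. As far as I understand the indexes.conf documentation, when neither coldToFrozenDir nor coldToFrozenScript is set, buckets that roll to frozen are deleted rather than archived, so it is worth confirming what is really in effect.

```shell
#!/bin/sh
# Keep only the retention-related lines from btool output read on stdin.
# The setting names come from indexes.conf; the filter itself is plain grep.
retention_settings() {
    grep -E '(maxTotalDataSizeMB|frozenTimePeriodInSecs|coldToFrozenDir|coldToFrozenScript)[[:space:]]*='
}

# Hypothetical usage (the SPLUNK_HOME path is an assumption):
#   /apps/wcm-splunk/bin/splunk btool indexes list charlesriver --debug | retention_settings
```

With `--debug`, btool prefixes each line with the .conf file it came from, which shows where a too-small limit was set.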

jpass
Contributor

Important: don't run the commands below if you aren't sure what they do. You could end up changing owner:group permissions on your entire system which is a pain in the arse.

Without much info to go on, it sounds like you might have restarted Splunk as the wrong user. Are you using Linux? I am, and I've done this before. On my setup I run Splunk as the user 'splunk'. All the files and folders should be owned by this user.

A few hours after IT restarted Splunk as the 'root' user, I found out that something was wrong. I restarted via the command line and specified which user (splunk) it should run as:

sudo -H -u splunk /$splunk_home_directory$/bin/splunk restart

This didn't solve the issue completely because, after IT restarted Splunk as 'root', newly indexed data and other files were now owned by 'root'. The symptom was that after I restarted Splunk as user 'splunk', I could not see anything indexed while Splunk was running as 'root'. My data only showed events from the day before and earlier.

To fix it, I stopped Splunk and changed owner:group on every single file and directory in the Splunk home directory:

From the parent directory of the splunk home directory:
sudo chown splunk:splunk -R splunk/

Then I restarted again:
sudo -H -u splunk /$splunk_home_directory$/bin/splunk restart

For some reason this didn't change some files, so I had to search for any files in the Splunk directory that weren't owned by the splunk user. I manually ran chown against these files, restarted Splunk correctly, and voila. Back to normal.
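The search for leftover files can be done with find. This is a minimal sketch, not the exact command I ran back then: the function name `not_owned_by`, the /opt/splunk path, and the user name 'splunk' are placeholders to adjust for your install.

```shell
#!/bin/sh
# Print every file and directory under $1 that is NOT owned by user $2.
not_owned_by() {
    dir=$1
    user=$2
    find "$dir" ! -user "$user" -print
}

# Hypothetical usage (path and user are placeholders):
#   not_owned_by /opt/splunk splunk
# To fix just those entries, as root and with Splunk stopped:
#   find /opt/splunk ! -user splunk -exec chown splunk:splunk {} +
```

An empty result means everything under the directory is already owned by that user.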

andrewkenth
Communicator

It appears all of the data files are owned appropriately (see the original post above, edited to include the ls -l output).

jpass
Contributor

Is your splunk/var directory on a mapped network drive, or is it symlinked? When I ran chown the first time, it didn't descend into the symlinked directory, so I had to go into that directory and run the command there.
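A quick way to check for this is to list the symlinks under the Splunk directory. The sketch below uses a made-up helper name, `list_symlinks`, and a placeholder path. By default a recursive chown does not traverse symlinks to directories; chown's -L option makes it follow them, but use that with care since it can reach outside the Splunk directory.

```shell
#!/bin/sh
# Print every symbolic link found under the directory given as $1.
list_symlinks() {
    find "$1" -type l -print
}

# Hypothetical usage (path is a placeholder):
#   list_symlinks /opt/splunk
# Inspect where a reported link points:
#   ls -ld /opt/splunk/var
```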

andrewkenth
Communicator

This is what I was thinking as well. I did manage to chown the directory correctly, but when I restart I am still missing my data. As evidence that ownership may not be the cause, I also started Splunk as root and am still missing the data.
