Hi all,
We're running an indexer cluster on 6.5.1. We found the following errors in splunkd.log on one of the search peers:
12-06-2016 15:32:53.654 +0800 ERROR BucketReplicator - Replication failed due to open failure: file=/opt/splunk/var/lib/splunk/_internaldb/db/db_1480966426_1480764958_87_BF4B1947-4FB6-4464-BD62-299457B51B72/1480941784-1480764958-4821280532600088189.tsidx error='No such file or directory'
12-06-2016 15:34:01.061 +0800 ERROR BucketReplicator - Replication failed due to open failure: file=/mnt/security/db_1481009551_1480929735_9_BF4B1947-4FB6-4464-BD62-299457B51B72/1481007010-1481003314-4825555822462713745.tsidx error='No such file or directory'
It seems some tsidx files are 'lost', although other tsidx files are still present in the same buckets. I have no idea what happened. Could anyone please help? Thanks.
Also, the cold bucket folder for a heavily used index looks like the following:
indexer1:
db_1479686070_1479451778_0_BF4B1947-4FB6-4464-BD62-299457B51B72
db_1479873491_1479686071_1_BF4B1947-4FB6-4464-BD62-299457B51B72
rb_1478498103_1478252227_4_7DE7B2FF-7653-48F6-8C1B-4F611554920C
rb_1478568321_1478498104_5_7DE7B2FF-7653-48F6-8C1B-4F611554920C
indexer2:
db_1478498103_1478252227_4_7DE7B2FF-7653-48F6-8C1B-4F611554920C
db_1478568321_1478498104_5_7DE7B2FF-7653-48F6-8C1B-4F611554920C
rb_1479686070_1479451778_0_BF4B1947-4FB6-4464-BD62-299457B51B72
rb_1479873491_1479686071_1_BF4B1947-4FB6-4464-BD62-299457B51B72
It seems buckets are renamed to "rb*" when they are replicated to a peer. Is that correct?
Sorry for the newbie questions.
Thanks a lot.
Regards,
/ST Wong
The first (quick) answer is yes: when a bucket is replicated, the copy's directory name begins with "rb_", while the copy on the originating peer keeps the "db_" prefix.
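To make the naming concrete, here is a rough Python sketch that splits a bucket directory name into its parts. It assumes the usual clustered layout of prefix, newest event time, oldest event time, local ID and originating peer GUID; the names BucketName, parse_bucket_name and same_bucket are made up for this illustration and are not part of Splunk.

```python
import re
from typing import NamedTuple, Optional


class BucketName(NamedTuple):
    kind: str      # "db" = copy on the originating peer, "rb" = replicated copy
    newest: int    # newest event time in the bucket (epoch seconds), assumed field order
    oldest: int    # oldest event time in the bucket (epoch seconds), assumed field order
    local_id: int  # bucket ID local to the originating peer
    guid: str      # GUID of the originating peer


# Assumed pattern: <db|rb>_<newest>_<oldest>_<localid>_<36-character GUID>
BUCKET_RE = re.compile(r"^(db|rb)_(\d+)_(\d+)_(\d+)_([0-9A-Fa-f-]{36})$")


def parse_bucket_name(name: str) -> Optional[BucketName]:
    """Return the parsed parts of a bucket directory name, or None if it does not match."""
    m = BUCKET_RE.match(name)
    if m is None:
        return None
    kind, newest, oldest, local_id, guid = m.groups()
    return BucketName(kind, int(newest), int(oldest), int(local_id), guid.upper())


def same_bucket(a: BucketName, b: BucketName) -> bool:
    """True when two names differ at most in the db/rb prefix."""
    return a[1:] == b[1:]


# Names taken from the listing in the question (indexer1 and indexer2).
origin = parse_bucket_name("db_1479686070_1479451778_0_BF4B1947-4FB6-4464-BD62-299457B51B72")
replica = parse_bucket_name("rb_1479686070_1479451778_0_BF4B1947-4FB6-4464-BD62-299457B51B72")
print(origin.kind, replica.kind, same_bucket(origin, replica))  # prints: db rb True
```

Run against the two listings in the question, every "db_" directory on one indexer pairs up with an "rb_" directory of the same times, ID and GUID on the other, which is what you'd expect from replication.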
As for the missing tsidx files, it may be possible to rebuild the bucket. From the CLI, you'd use something like:
splunk rebuild db_1479686070_1479451778_0_BF4B1947-4FB6-4464-BD62-299457B51B72
This rebuilds the tsidx files (and the *.data metadata files) from the raw data journal, from scratch.
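To work out which buckets are affected, the file= paths in the BucketReplicator errors can be pulled out of splunkd.log. The following is only a sketch: it assumes the ERROR lines look like the ones quoted in the question, and the function name buckets_with_open_failures is hypothetical.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath

# Matches the "open failure" lines quoted in the question (assumed format).
OPEN_FAILURE_RE = re.compile(
    r"ERROR BucketReplicator - Replication failed due to open failure: "
    r"file=(?P<file>\S+) error='(?P<error>[^']*)'"
)


def buckets_with_open_failures(lines):
    """Map each bucket directory to the list of files that could not be opened."""
    failures = defaultdict(list)
    for line in lines:
        m = OPEN_FAILURE_RE.search(line)
        if m is None:
            continue  # not a replication open-failure line
        path = PurePosixPath(m.group("file"))
        # The bucket directory is the parent of the tsidx file.
        failures[str(path.parent)].append(path.name)
    return dict(failures)


# Sample line copied from the question.
sample = [
    "12-06-2016 15:32:53.654 +0800 ERROR BucketReplicator - Replication failed "
    "due to open failure: file=/opt/splunk/var/lib/splunk/_internaldb/db/"
    "db_1480966426_1480764958_87_BF4B1947-4FB6-4464-BD62-299457B51B72/"
    "1480941784-1480764958-4821280532600088189.tsidx "
    "error='No such file or directory'",
]
result = buckets_with_open_failures(sample)
for bucket_dir, missing in result.items():
    print(bucket_dir, missing)
```

Each key in the result is a bucket directory that is a candidate for the rebuild above. In practice you would feed it the lines of splunkd.log instead of the sample list.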