Deployment Architecture

Are bucket corruption and configuration initialization errors related?

elliotproebstel
Champion

We have been getting a lot of errors of this nature lately:

[indexer] Failed to read size=1235 event(s) from rawdata in bucket='my_index~3~66FDB370-3E8C-4495-9F62-60F0490E21DF' path='/opt/splunk/var/lib/splunk/hotwarm/my_index/db/rb_1521159284_1520938932_3_66FDB370-3E8C-4495-9F62-60F0490E21DF. Rawdata may be corrupt, see search.log. Results may be incomplete!

We have seen that error roughly 2-3 times per week over the last month or so. Additionally (and maybe related?), for the last few months we have been seeing errors of this nature almost every time we run a search:

Dispatch Runner: Configuration initialization for /opt/splunk/var/run/searchpeers/my-server-1521808017 took longer than expected (1028ms) when dispatching a search (search ID: remote_my-server_1521808321.22898); this typically reflects underlying storage performance issues
  1. Are these likely to be related?
  2. Regardless of the answer to #1, is there good advice for fixing or avoiding these, other than routinely putting the system into maintenance mode and manually running fixups?
0 Karma
1 Solution

elliotproebstel
Champion

We have determined that these were not related. It turns out that our increase in corrupt bucket errors was actually caused by a Linux OS-level configuration error that was causing our indexers to hard restart unpredictably every day or two. We fixed the underlying issue, and we stopped getting the abundance of corrupt buckets.
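
For anyone who wants to check whether their own corrupt-bucket errors line up with unexpected restarts, here is a minimal Python sketch (not part of the original answer) that pulls the bucket ID, path, and time range out of an error message shaped like the one quoted in the question. The function and constant names are hypothetical. It assumes the usual bucket directory layout, in which the two numbers after `db_` or `rb_` are the newest and oldest event times as epoch seconds, and `rb_` marks a replicated copy. The resulting time ranges could then be compared by hand against host reboot times.

```python
import re
from datetime import datetime, timezone

# Matches the "Failed to read ... from rawdata" message quoted in the question.
# The closing quote after the path is optional because the quoted message lacks it.
CORRUPT_RE = re.compile(
    r"Failed to read size=(?P<size>\d+) event\(s\) from rawdata in "
    r"bucket='(?P<bucket>[^']+)' path='(?P<path>[^'\s]+?)'?\.? Rawdata may be corrupt"
)

# Assumed bucket directory layout: <db|rb>_<newest>_<oldest>_<local id>[_<guid>]
BUCKET_DIR_RE = re.compile(
    r"(?P<kind>db|rb)_(?P<newest>\d+)_(?P<oldest>\d+)_(?P<local_id>\d+)"
    r"(?:_(?P<guid>[0-9A-Fa-f-]+))?$"
)


def parse_corrupt_bucket_error(message):
    """Return a dict describing the bucket named in the error, or None if no match."""
    m = CORRUPT_RE.search(message)
    if m is None:
        return None
    info = {"size": int(m["size"]), "bucket": m["bucket"], "path": m["path"]}
    d = BUCKET_DIR_RE.search(m["path"])
    if d is not None:
        info["replicated"] = d["kind"] == "rb"
        # Epoch seconds from the directory name, converted to UTC datetimes.
        info["newest"] = datetime.fromtimestamp(int(d["newest"]), tz=timezone.utc)
        info["oldest"] = datetime.fromtimestamp(int(d["oldest"]), tz=timezone.utc)
    return info


# The message from the question, copied as posted.
SAMPLE = (
    "[indexer] Failed to read size=1235 event(s) from rawdata in "
    "bucket='my_index~3~66FDB370-3E8C-4495-9F62-60F0490E21DF' "
    "path='/opt/splunk/var/lib/splunk/hotwarm/my_index/db/"
    "rb_1521159284_1520938932_3_66FDB370-3E8C-4495-9F62-60F0490E21DF. "
    "Rawdata may be corrupt, see search.log. Results may be incomplete!"
)

if __name__ == "__main__":
    for key, value in parse_corrupt_bucket_error(SAMPLE).items():
        print(f"{key}: {value}")
```

Running the parser over every such message and grouping by bucket should show whether the same few buckets keep failing or whether new ones appear after each restart.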

dm1
Contributor

@elliotproebstel we are facing the exact same issue. Our deployment is on AWS.

Can you please share what the cause of this issue was in your environment and how you fixed it?

0 Karma