Proactively monitor for bucket corruption

jamesoconnell
Path Finder

I just repaired corrupt buckets for a partner index on one of our enterprise indexers.
The issue only became apparent after the customer saw the warnings on their reports.

My question is: are there easy, proactive warnings that administrators can receive to highlight index bucket corruption, rather than leaving it to our customers to find the problems?

1 Solution

bheemireddi
Communicator

If you are using the Monitoring Console, that would be a good starting point. It gives visibility into indexer clustering activity. The link below might get you started; it covers the relevant dashboards and searches, so maybe you can set up alerts on them. Also, on the cluster master, Settings > Indexer clustering might give you some insights too.
https://docs.splunk.com/Documentation/Splunk/6.6.2/Indexer/Viewindexerclusteringstatus

isoutamo
SplunkTrust

Hi

We can find corrupted buckets in a multisite cluster with the following search / alert:

index=_internal component=CMMaster state=Discard incoming_bucket_size=* earliest=-30d@d 
| dedup bid 
| table _time,bid,peer_name,existing_bucket_size,incoming_bucket_size
| sort bid,_time

This shows the bucket ID and the source peer.

r. Ismo

isoutamo
SplunkTrust

Even though this is an old thread, I would like to add what one can do with current versions.

Just run this:

| dbinspect index=* OR index=_* corruptonly=true 
| search state!=hot

Select a long enough time range to find all corrupted buckets.
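
If you export the results of that search to CSV, a small script can turn them into a per-indexer summary for the admins. This is only a rough sketch: it assumes the export keeps the bucketId, index, splunk_server and state columns that dbinspect returns, and the function names are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def summarize_corrupt_buckets(csv_text):
    """Group exported dbinspect rows by (splunk_server, index).

    Assumes the CSV has bucketId, index, splunk_server and state columns.
    """
    summary = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Mirror the `search state!=hot` filter in case the export skipped it.
        if row.get("state") == "hot":
            continue
        summary[(row["splunk_server"], row["index"])].append(row["bucketId"])
    return dict(summary)


def format_report(summary):
    """Return one human-readable line per (indexer, index) pair."""
    lines = []
    for (server, index), buckets in sorted(summary.items()):
        bucket_list = ", ".join(sorted(buckets))
        lines.append(f"{server} {index}: {len(buckets)} corrupt bucket(s): {bucket_list}")
    return lines
```

Feed it the exported CSV text and mail or log the lines from format_report(); an empty summary means the search found nothing outside hot buckets.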

r. Ismo 

sloshburch
Splunk Employee

A peer of mine shared this search. Does it jibe with your environment? I want to see if we can add these things to the MC as well, so I'm curious to hear how you make out.

index=_internal sourcetype=splunkd component=ProcessTracker (BucketBuilder OR JournalSlice) (NOT "rawdata was truncated")
|eval message=replace(message, "^\(child.*?\)\s+", "")
|bin _time span=1m
|stats c by _time, host, splunk_server, message
|fields - c
|rename splunk_server as Indexer, host as Host, message as Issue

jamesoconnell
Path Finder

Thank you Mr. Burch. I tried running this but didn't get any results.

This could either mean that we don't have any bucket issues, or your search isn't worth the paper it is written on -- not sure which.

I'm not sure where the truth lies yet, but I am guessing we must have some bucket issues somewhere given the amount of data we pump each day.

More testing required I think.

Thank you!

isoutamo
SplunkTrust

We could not see any issues with the previous search either, but there are still a couple of corrupted buckets (e.g. journal.gz was only a couple of bytes).
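
For the tiny journal.gz case, a filesystem check run on the indexer itself can catch what the log search misses. Below is a minimal sketch, assuming the usual bucket layout of <bucket>/rawdata/journal.gz; the function name and the 64-byte threshold are my own picks for illustration, and a small journal is only a hint that a bucket deserves a closer look, not proof of corruption.

```python
import os


def find_tiny_journals(index_root, min_bytes=64):
    """Return sorted (path, size) pairs for suspiciously small journal.gz files.

    index_root is a directory containing bucket directories (assumption:
    something under your index storage path). min_bytes is an arbitrary
    threshold chosen for this sketch.
    """
    suspects = []
    for dirpath, _dirnames, filenames in os.walk(index_root):
        # Only look at <bucket>/rawdata directories that hold a journal.gz.
        if os.path.basename(dirpath) != "rawdata" or "journal.gz" not in filenames:
            continue
        bucket_dir = os.path.basename(os.path.dirname(dirpath))
        # Hot buckets are still being written, so a small journal is normal there.
        if bucket_dir.startswith("hot_"):
            continue
        journal = os.path.join(dirpath, "journal.gz")
        size = os.path.getsize(journal)
        if size < min_bytes:
            suspects.append((journal, size))
    return sorted(suspects)
```

Run it from cron against the index storage path and alert on a non-empty result.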

sloshburch
Splunk Employee

Would you provide more detail on how you identified that the buckets were corrupted? That might shed light on an existing way to be notified.

jamesoconnell
Path Finder

There was an exclamation symbol / warning on the dashboard with a cryptic message saying there was an error related to the indexer in question: "[indexer_] Streamed search execute failed because: JournalSliceDirectory: Cannot seek to rawdata offset 0 ..."
This type of error scares the crap out of users, and they freak out to the admin...