
Indexer hosts with shared storage

danielwan
Explorer

I am planning to build a Splunk 6.5.x indexer "farm" to balance the workload. The "farm" can be either a bunch of individual indexer hosts or an indexer cluster. All indexer hosts are expected to read from/write to all indexes.

To comply with company policy, the VM template used to build the indexers has very limited disk space, but I can use a huge shared storage volume (NFS) to store index files.

Assuming this NFS share's performance and reliability satisfy the requirements, is there any practical solution to use shared storage with any type of indexer "farm"? I know Splunk usually recommends an indexer cluster that uses isolated disks attached to each host, but my index size is much larger than the disk space of any VM built from my template. In addition, index HA and index replication are a plus to me but NOT a must-have feature.

If it's OK to use shared storage, is it allowed to share the indexes among all indexer hosts? For example, for indexA, I would create a single directory called /IndexA on the shared storage and then mount it at the same mount point on every indexer host, such as SPLUNK_HOME/var/lib/splunk/. All indexer hosts would then read/write the same index.
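To make that concrete, this is roughly the layout I have in mind; the NAS hostname and export path below are only placeholders:

    # hypothetical /etc/fstab entry, identical on every indexer host
    nas01:/vol/splunk/IndexA  /opt/splunk/var/lib/splunk  nfs  rw,hard  0 0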

  • In the case of several individual indexer hosts using the shared storage, I am concerned that multiple hosts reading from and writing to the same index at the same time would corrupt the indexes.
  • In the case of an indexer cluster, my understanding is that each bucket directory has a GUID appended to the end of its name, and the GUID differs from host to host. So I guess the indexer hosts would NOT share bucket files; instead, each would read/write only the buckets carrying its own GUID. Is that correct? (See the naming sketch below.)
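For what it's worth, this is the bucket directory naming as I understand it; the timestamps and IDs below are made-up examples:

    # standalone indexer: db_<newest_event_time>_<oldest_event_time>_<local_id>
    db_1389230491_1389230488_5
    # cluster peer: the same fields plus the originating peer's GUID
    db_1389230491_1389230488_5_F03DE0DE-52BC-4D33-AF95-D0AA4B7E54BA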

lfedak_splunk
Splunk Employee

Hey @danielwan, if they answered your question, please remember to "√Accept" the answer to award karma points and to let other Splunkers know there’s a working solution. We’re hosting a karma point contest, so it’s particularly awesome to upvote on Answers these days. 😄


dwaddle
SplunkTrust

Your company policy is at odds with Splunk best practices. Those best practices exist for a reason - because they have been proven in the field to provide robust, supportable deployments. You deviate from them at your own risk.

Let's start straight outta the docs: http://docs.splunk.com/Documentation/Splunk/7.0.0/Installation/Systemrequirements#Considerations_reg...

  • Do not use NFS to host hot or warm index buckets, because a failure in NFS can cause data loss. NFS works best with cold or frozen buckets.
  • Do not use NFS to share cold or frozen index buckets amongst an indexer cluster, as this potentially creates a single point of failure.

Let me reiterate to be clear: for all of your indexes, homePath must be on a filesystem that is not NFS. Do not put the homePath of any index on NFS.
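As a minimal sketch, an indexes.conf stanza honoring that rule might look like this (the index name and the cold mount point are hypothetical):

    [indexA]
    # hot/warm buckets stay on local, non-NFS disk
    homePath   = $SPLUNK_DB/indexA/db
    # cold buckets may go to an NFS mount dedicated to this one indexer
    coldPath   = /mnt/splunk-nfs/indexA/colddb
    thawedPath = $SPLUNK_DB/indexA/thaweddb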

Under no circumstances should indexers see each other's files. Ever. If you are going to use a shared NFS volume for your cold and frozen buckets (remember: putting hot+warm on NFS is unsupported), then you should make sure that the directories are organized in a way that indexer1 can never see files for indexer2. For example, on the NAS you might have directories like "/vol/splunk/indexer1" and "/vol/splunk/indexer2", and each indexer mounts only its specific directory.
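In /etc/fstab terms, that separation might look something like this (hostnames and export paths are made up):

    # on indexer1 - it mounts only its own directory from the NAS
    nas01:/vol/splunk/indexer1  /mnt/splunk-nfs  nfs  rw,hard  0 0

    # on indexer2 - same local mount point, different export, so neither
    # host can ever see the other's buckets
    nas01:/vol/splunk/indexer2  /mnt/splunk-nfs  nfs  rw,hard  0 0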

These things are what you need to know in order to get a working configuration.

But there's way more to consider. Having decided to use NFS, your Splunk deployment is at the mercy of the shared storage's robustness, both for availability and for performance. Your performance will vary depending on how many other things are accessing the shared storage at once. And from an availability perspective, it is no fun when the tool you rely on for troubleshooting is itself impacted by the very outage you are trying to troubleshoot.

May you live in interesting times.

Richfez
SplunkTrust

There are experts in this and I am not one of those, but I'm happy to tell you what I DO know.

First, the following is all about indexers: no other type of Splunk instance needs massive IO capability, and for those, "regular system" environments are just fine for the most part.

NFS is not a particularly good storage location for Splunk at all. I believe you can often put cold on NFS with only moderate pain, but hot/warm, no. Just no. Bad Things will happen and you will not be happy. Even for cold, you must have separate mount points for each indexer, or at least 100% ensure they won't share anything at the filesystem level. Basically, NFS and Splunk together are a bad time waiting to happen, but if you have a great storage team, a good NFS implementation, LOTS of experience running tricky stuff on NFS, ONLY use it for cold, and have read the system requirements docs and warnings thoroughly, then maybe it'll be OK.

Shared storage in general and IOPS: The short of it is that Splunk needs ALL the IOPS. 🙂 Now, that being said, there's nothing wrong with running Splunk on shared storage, but in places that want to "require" this, there will invariably be a battle between Splunk's needs and the ridiculously underprovisioned shared storage the storage admins will give you. If you can ensure that each Splunk indexer can get 1200+ IOPS - all at the same time (even when backups are happening!) - then shared storage has a shot at being good enough. Usually, though, shared storage is HUGE yet not real fast. That's important - it IS fast in aggregate, but it relies on not everything needing lots of IOPS at the same time. We have a SAN here, and we have to stagger our SQL Server maintenance plans so that not too many are trying to "do things" at the same time, or else we saturate its 20k IOPS. I don't even think of running my Splunk indexers on there - they're on SSDs in separate boxes and work fabulously - they don't slow down anything else, and nothing else slows them down.
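If you want to verify a number like that yourself instead of taking the storage team's word for it, a quick fio run against the candidate volume is one way to do it; the directory, size, and job counts here are only illustrative:

    # mixed random read/write test, loosely approximating indexer IO
    fio --name=splunk-iops --directory=/mnt/splunk-nfs \
        --rw=randrw --bs=4k --size=4g --numjobs=4 \
        --iodepth=32 --ioengine=libaio --direct=1 \
        --runtime=60 --time_based --group_reporting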

Litmus test: Ask your storage admins what their peak IOPS (read, write, total) are during their peak hour, and during that period how long it takes to fulfill an IO request (full stack, from request to the OS receiving the data). They should be able to answer nearly immediately with numbers that sound good, or be able to bring up a report of the past 24 hours in the blink of an eye and tell you. If there's even hesitation, you are VERY unlikely to be happy with the shared storage, because either they won't work with you or they're not sufficiently clued in. Or both.

Litmus test part two: Then ask them what they do differently for their relational DBs. Do they run THOSE on NFS? How are they handled? If you have no SQL, substitute whatever other actual HIGH IO app you may have. If you have no other high-IO apps, well, run away from shared storage. You will bring it to its knees and your storage folks won't know what to do.

So, the crux of the matter: Companies have policies. This is fine. When a policy is "one size fits all, regardless of how poor the fit is," intelligently run companies have ways to grant exceptions. From your questions, I get the feeling you will need an exception AT LEAST for the NFS situation. If you can get past the NFS mounts, then for low to moderate ingestion and search rates on a SAN that's reasonably quick and has sufficient performance headroom, separate LUNs (especially if they can be pinned or separated onto different sets of spindles) attached to each VM can be OK. If no exceptions can be made, then ... well, there are bigger things wrong than just this problem.

Again - All this is just for your indexers. Search heads and things can totally be VMs.

traxxasbreaker
Communicator

It is not recommended to use NFS for hot/warm index buckets at all, and for cold or frozen buckets it's not recommended to share them between indexers. You should read through this very carefully if you want to continue going down that route. Also, if you're going to run indexers on VMs and can't get dedicated hardware for them, you'll also want to make sure that you follow the best practices for virtualized Splunk to keep the performance reasonable.
