Getting Data In

Sharding an index on shared storage for load balancing

aholzer
Motivator

Goal:

Load balance across two indexers writing to the same location on a NetApp Filer (NFS)

  • 2 Indexers
  • Multiple Forwarders
  • A sharded index on NFS
  • Search Head

Question: (I am new to Splunk, so I may be asking the wrong question to begin with)

How can I configure my Splunk setup with an index on shared storage to handle dynamic load balancing between two indexers?

My understanding:

  • In isolated storage, the indexes the two indexers write to would share the same name and be logically one index, just split between two locations.
  • Sharding the index manually would involve several steps: a) creating a new index, b) pointing different forwarders to each index separately (defeating the dynamic load-balancing between the two indexers), c) manually making searches combine data from both indexes.
  • While an indexer writes to an index, the indexer holds a lock on the index, not allowing any other indexer to write or read from the index.

Thank you very much!

1 Solution

gkanapathy
Splunk Employee

Your understanding is mostly correct. (Although it doesn't really change anything, you can read from and execute searches against an index shard that you're not writing to, at least in theory. There is an index setting in indexes.conf, isReadOnly, that supposedly makes an instance not write to an index, but I've never used it. You are correct that only one instance can write to an index shard; I'm not sure whether there is actually any lock on the files, or whether the instance simply assumes that it's the sole owner.)

You will need to set up basically four instances of Splunk, two on each node (one active and one failover) and two "shards" of each index:

  • On the NFS, two separate index "shards", call them iA and iB.
  • On server node 1, an active Splunk instance, call it sA-1, that reads from and writes to iA.
  • On server node 2, an active Splunk instance, call it sB-1, that reads from and writes to iB.
  • On server node 2, a standby failover Splunk instance, call it sA-2, that is normally shut down but configured to use iA.
  • On server node 1, a standby failover Splunk instance, call it sB-2, that is normally shut down but configured to use iB.
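The shard layout above could be sketched as an indexes.conf fragment on each instance. The index name "index1" and the mount paths are illustrative assumptions, not names from the original setup:

```ini
# indexes.conf on sA-1 and sA-2 (both point at the same NFS shard, iA).
# Index name and paths are hypothetical examples.
[index1]
homePath   = /mnt/netapp/splunk/shard1/index1/db
coldPath   = /mnt/netapp/splunk/shard1/index1/colddb
thawedPath = /mnt/netapp/splunk/shard1/index1/thaweddb

# indexes.conf on sB-1 and sB-2 would be identical except for the shard path:
#   homePath = /mnt/netapp/splunk/shard2/index1/db
#   ...and so on for coldPath/thawedPath.
```

Because the stanza name (the index name) is the same everywhere, searches see one logical index; only the storage paths differ per shard.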

You will have to adjust the network port numbers so that the sA-* instances don't conflict with the sB-* instances when both are running on the same node.
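As a sketch of that port separation, the sB-* instances might override the defaults like this (all port numbers here are arbitrary assumptions; the sA-* instances would keep the defaults):

```ini
# web.conf on sB-1 / sB-2: move Splunk Web and the splunkd management port
# off the defaults (8000 and 8089) so they can coexist with sA-* on one node.
[settings]
httpport     = 8001
mgmtHostPort = 127.0.0.1:8190

# inputs.conf on sB-1 / sB-2: a separate forwarder receiving port
# (assuming sA-* listens on the conventional 9997).
[splunktcp://9998]
```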

In case of a failure, you would ensure that the failed node and its splunkd process were stopped, then start up the corresponding standby instance on the other node. You would also do whatever is needed to switch the IP/hostname of the instance to point to the standby node. This can be done manually, via clustering software, or via a VIP on a network load balancer.
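A minimal manual-failover sketch, assuming the instance/node names from the layout above and hypothetical Splunk home directories:

```shell
#!/bin/sh
# Map a failed active instance to its standby peer and the node it lives on.
# Names (sA-1, sB-2, node1, node2) follow the layout above and are
# assumptions for illustration.
standby_for() {
  case "$1" in
    sA-1) echo "sA-2@node2" ;;
    sB-1) echo "sB-2@node1" ;;
    *)    echo "unknown"; return 1 ;;
  esac
}

# Manual failover for sA-1 might then look like (paths hypothetical):
#   ssh node1 '/opt/splunk-sA1/bin/splunk stop'   # ensure failed splunkd is down
#   ssh node2 '/opt/splunk-sA2/bin/splunk start'  # bring up the standby
#   ...then repoint the VIP/DNS name for sA at node2.
standby_for "sA-1"
```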

I will also warn that while indexing over NFS will work, it is harder to guarantee the IOPS you'd like to have for good search performance. If your NFS is up to it, it should work fine. Note that since no shard is used on more than one node at a time, you could also use SAN volumes rather than NFS for each index shard.

I will also add that most of what you get from this setup will be rendered unnecessary by index replication within the Splunk product in an upcoming release. It is quite different from what I've described here, but provides similar functionality.

aholzer
Motivator

In that case, the load balancing at the forwarders only needs to worry about the indexers, each indexer writes to its own shard of the index, and the search head(s) treat it as if the index were being written locally on each of the separate indexers.

That's a very simple and elegant solution. Thanks.


gkanapathy
Splunk Employee

Well, I would say sA-1 knows only iA (shard1/index1), and sA-2 is the standby on the other node but knows the same iA (shard1/index1). sB-1 is on the same node as sA-2 and knows iB (shard2/index1).


aholzer
Motivator

Am I understanding you correctly: the indexes are named the same, but have different paths on the NFS? In that case, we would have to make sure each indexer only knows about one of the indexes.

  • iA and iB are both named "index1", with paths (on the NFS) /splunk/shard1/index1 and /splunk/shard2/index1

and

  • SA-1 knows only /splunk/shard1/index1
  • SA-2 knows only /splunk/shard2/index1

Edit: (since we are not currently concerned about resiliency)

  • SA knows only /splunk/shard1/index1
  • SB knows only /splunk/shard2/index1

gkanapathy
Splunk Employee

The index name is the same on both sides. The shard does not have a distinct name; it's just a different path that is set in the indexer config. If you mean "indexer" rather than "index", though, you just list both instances (or rather the virtual names/IPs of each primary instance) and let forwarder load balancing deal with it.
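On the forwarder side, that could look like the outputs.conf sketch below, with inputs.conf identical on every forwarder since the index name is the same on both indexers. Hostnames, ports, and the monitored path are illustrative assumptions:

```ini
# outputs.conf on every forwarder: built-in load balancing across both
# active indexers. Use the virtual names/IPs that fail over with the
# instances, not the physical nodes. Hostnames/ports are hypothetical.
[tcpout]
defaultGroup = sharded_indexers

[tcpout:sharded_indexers]
server = sA.example.com:9997, sB.example.com:9997
autoLBFrequency = 30

# inputs.conf is the same on every forwarder, since "index1" exists
# (as a different shard) on both indexers:
[monitor:///var/log/app.log]
index = index1
```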

aholzer
Motivator

Thanks for the response.

Ignoring resiliency, and relying on the load-balancing feature on the forwarders, how would I specify the index name in the inputs.conf since we won't know which indexer it's feeding? Unless the intention is to hardcode the load-balancing...
