Getting Data In

In a Splunk cluster in the cloud, what are some best practices when increasing indexer disk space?

LordLeet
Path Finder

Hello,

I'm running my Splunk cluster in the cloud and I'm running out of disk space. I'm planning to increase the available disk space, but I'm wondering whether there are any side effects of doing this that I should prepare for.

Since this would be done in a production environment, I need to avoid losing access to the indexed data at all costs.
I'll also take a disk snapshot just in case.

All the indexes are set to:
maxDataSize = auto_high_volume

The steps involved would be:
1. Stop the Splunk Forwarder.
2. Stop the Splunk Indexer.
3. Take a disk snapshot of the Splunk indexer.
4. Increase the disk space on Splunk Indexer.
5. Wait for the change to be in effect.
6. Restart the Splunk Indexer.
7. Restart the consumers on the Splunk Forwarder.
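At the OS/CLI level, steps 2-6 would look roughly like this on a single indexer (the volume ID, mount point, and install path below are placeholders; this assumes an AWS EBS-backed disk and a default /opt/splunk install):

/opt/splunk/bin/splunk stop                                                            # 2. stop the indexer cleanly
aws ec2 create-snapshot --volume-id vol-0abc123 --description "pre-resize snapshot"    # 3. snapshot the data volume
aws ec2 modify-volume --volume-id vol-0abc123 --size 1000                              # 4. grow the volume
# 5. wait for the resize to finish, then grow the partition/filesystem and verify with df
df -h /opt/splunk
/opt/splunk/bin/splunk start                                                           # 6. restart the indexer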

Are there any other steps that I should perform?

Thanks in advance!

1 Solution

woodcock
Esteemed Legend

If you are already clustered and your cluster is in good health (replication factor and search factor met), there isn't necessarily any reason to do the snapshot. If something went wrong on one indexer, I would just rebuild it from scratch and let the Cluster Master (CM) rebuild the buckets. However, if your testing is flawed and you finish your work on all 3 indexers and only then find that something is toast, it would be good to be able to restore from a snapshot. We just got done doing this in Azure on RedHat and are using volume groups. Make sure that you run df and that this command reports the correct size. In our case, although the volume was increased and in use, an extra command was needed to make some disk tools aware of the space. Also, you did not say which volume you are resizing (hot/cold/archive), and that may make a difference. Finally:

There is no reason for your step #1 (why stop the forwarders?); definitely DO NOT do this.
Replace existing step #1 with "Put your Cluster Master into Maintenance Mode."
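For a volume-group setup like ours, the sequence was roughly this (the device, VG, and LV names are placeholders, and the grow command depends on your filesystem):

df -h /opt/splunk                                   # check what the filesystem reports before the change
pvresize /dev/sdc                                   # make LVM aware of the grown physical disk
lvextend -l +100%FREE /dev/vg_splunk/lv_splunk      # extend the logical volume into the new space
xfs_growfs /opt/splunk                              # grow the filesystem (use resize2fs for ext4)
df -h /opt/splunk                                   # confirm the new size is now reported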

LordLeet
Path Finder

Hello woodcock,

Thanks for your input!

Even though we planned for a clustered architecture, we are only running one indexer; that is why I mentioned stopping the Splunk forwarder and the consumers, to guarantee that there would be no data loss during the disk snapshot.

That being said, would you still advise putting the cluster into maintenance mode and then stopping the indexer?

We did it in our testing environment and, as you said, we had to run these commands:
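# grow partition 2 of /dev/xvdg into the new space, then resize the ext filesystem on it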
growpart /dev/xvdg 2
resize2fs /dev/xvdg2

Thank you

woodcock
Esteemed Legend

If you only have one indexer, then you probably do not have a Cluster Master, so forget about Maintenance Mode (that is a CM-only thing). There is never any reason to stop a forwarder; it should queue just fine if the indexer disappears. The only thing stopping it accomplishes is keeping it from complaining about unreachable indexers in its _internal log.
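
If you want to see the queueing happen, something like this on the forwarder will show whether its output queues are blocking while the indexer is down (the path assumes a default Universal Forwarder install):

grep 'group=queue' /opt/splunkforwarder/var/log/splunk/metrics.log | grep 'blocked=true' | tail -5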
