Deployment Architecture

What is the max recommended number of indexes on a single indexer in AWS?

kalibaba2021
Path Finder

Hello Community,

I am looking at deploying Splunk Enterprise on AWS on a HEFTY compute-optimized EC2 instance with attached EBS. I'd like to maximize the number of indexes on this instance, since search performance is of no concern.

I see the default index size is 500 GB, but I also know I can configure indexes.conf to whatever I want.

For example, if I think I'll have ~97 TB of data, I could set maxVolumeDataSizeMB = 101711872 (97 × 1024 × 1024) on a single BIG indexer. But of course, just because I can doesn't mean I should.
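For reference, maxVolumeDataSizeMB applies to a volume stanza in indexes.conf, not to an individual index (per-index caps use maxTotalDataSizeMB). A minimal sketch, with hypothetical paths and index names:

```ini
# indexes.conf -- illustrative only; adjust paths and sizes for your environment
[volume:primary]
path = /opt/splunk/var/lib/splunk
# cap the whole volume at ~97 TiB (97 * 1024 * 1024 MB)
maxVolumeDataSizeMB = 101711872

[big_index]
homePath   = volume:primary/big_index/db
coldPath   = volume:primary/big_index/colddb
# thawedPath cannot use a volume reference; it must be a literal path
thawedPath = $SPLUNK_DB/big_index/thaweddb
# per-index cap (the default is 500000 MB, i.e. ~500 GB)
maxTotalDataSizeMB = 101711872
```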

I see no clear recommendation on how to design multiple indexes in relation to indexers and search heads. Maybe because it always depends. 🤗

In my case, since I don't care about performance, can I put everything on one BIG EC2 instance? Or split it across 2 BIG machines? That is, install the indexer and SH on the same instance.

Thanks in advance👍

 


isoutamo
SplunkTrust

Hi

You are right, it depends 😉 When you use EBS-backed EC2 nodes with gp3-type EBS disks, you definitely get more than 1200 IOPS, which is enough for normal use. But how big can the indexes be on one node? That depends on, e.g., how much new data per day you are ingesting. If you are not concerned about searches, you could probably estimate 150-200 GB/day per indexer, but if you are using ITSI/ES then it's less than 100 GB/day.
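As a back-of-envelope illustration of that rule of thumb (the restore volume and time window are hypothetical, and treating the restore rate like ordinary ingest is a simplification):

```python
import math

# Hypothetical scenario: 97 TiB of archived data to bring back,
# spread over a 30-day window.
total_tb = 97
days = 30
daily_ingest_gb = total_tb * 1024 / days        # ~3311 GB/day

# Rough per-indexer daily capacity when search load is light
# (the 150-200 GB/day figure above; using the upper bound).
per_indexer_gb_per_day = 200

# Indexers needed to keep up with that rate.
indexers_needed = math.ceil(daily_ingest_gb / per_indexer_gb_per_day)
print(indexers_needed)  # 17
```

Stretching the window, or accepting a slower restore, shrinks that number accordingly.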

If you have an indexer cluster, you also need to think about how long you can wait for rolling restarts. Restarts cause disruption for longer when you have fewer nodes with more data, since primaries are reassigned after the reboot completes.

Those are only some of the items you should think about before making a decision. Fortunately, you can change your environment later if needed. To make that easier, I propose you install at least an indexer cluster, or even a multisite version.

r. Ismo

kalibaba2021
Path Finder

isoutamo, thank you for the quick response. We plan on restoring data from the archive S3 bucket into this Splunk instance only on a per-need basis. So a very large amount of data will be ingested in a relatively short time period, but only rarely. The exact volume of data is unknown, but 50-100 TB is possible.

But not on a continuous basis, so the "per-day" metric will only apply during these rare ingestion events. Neither redundancy nor fast searches are a major concern.

What I gather from your response is: start with at least one indexer cluster with 2 large nodes. Is a dedicated search head needed, or can I install the SH on one of the nodes in said cluster?

 


isoutamo
SplunkTrust

As this is just a restore instance for thawed data, you could start with one instance. If/when you use several instances for storing data, you must use your own scripts to retrieve the data from S3 and rebuild it into the thaweddb directories. This will probably be the most time-consuming part of your process. I propose you develop and test this environment and process early, to learn the time needed from request to searchable data. With this amount of data, it could be days before you can start searching!
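A sketch of that per-bucket restore flow (the index name, bucket directory, and S3 path are hypothetical; `splunk rebuild` is the standard command for making a thawed bucket searchable):

```shell
# Illustrative only -- adjust index name, bucket directory, and S3 path.
# 1. Copy an archived bucket from S3 into the index's thaweddb directory.
aws s3 cp --recursive \
    s3://my-archive-bucket/frozen/myindex/db_1389230491_1389230488_5/ \
    $SPLUNK_HOME/var/lib/splunk/myindex/thaweddb/db_1389230491_1389230488_5/

# 2. Rebuild the bucket's index files and metadata so it becomes searchable.
$SPLUNK_HOME/bin/splunk rebuild \
    $SPLUNK_HOME/var/lib/splunk/myindex/thaweddb/db_1389230491_1389230488_5

# 3. Restart Splunk so the thawed bucket is picked up.
$SPLUNK_HOME/bin/splunk restart
```

Multiplied across thousands of buckets and 50-100 TB, steps 1 and 2 are where the days go, which is why testing the end-to-end timing beforehand matters.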

kalibaba2021
Path Finder

@isoutamo - thanks very much for the info!


VatsalJagani
SplunkTrust

@kalibaba2021 

If you don't care about Splunk performance:

  • Do whatever you want.

 

If you care about Splunk performance and want to design the proper system:

 

I hope this helps!!! Kindly upvote if this helps!!!

kalibaba2021
Path Finder

@VatsalJagani - thank you for the info!
