Getting Data In

indexer sizing and virtualization

Steve_Litras
Path Finder

Hi -

I'm embarking on a re-organization in my splunk environment. I've come into possession of a couple big x86 boxes (4 socket, 8 core, 256GB RAM), and given that I heard multiple times that indexing is better distributed horizontally across smaller boxes, it leads me to wonder this: If I replace two of my 3 indexers with these boxes, can splunk take advantageof the hardware? Or am I better off partitioning these boxes via virtualization into a number of smaller boxes (but still using local disk resources, etc.)?

Thanks
Steve

Tags (1)
0 Karma
1 Solution

lguinn2
Legend

Splunk can definitely use all the resources of the box. It is high-performance multi-threaded code, I would not virtualize the box; this will not help Splunk use it better. The overhead of virtualization will not be offset by any performance improvements. (I am a VMware-certified VCP4, if that helps my credibility on this one.)

Think of the Splunk advice as "Spend your money on lots of average-sized servers, rather than a few giant servers." Why?

  • average-sized "commodity" servers are relatively cheap. When you start buying "extra large" memory, etc., it tends to come at a premium price. So "commodity" servers are the most effective way to spend the money. (But your servers were free!)
  • more boxes gives you more concurrent IO - for Splunk indexers, IO is usually the bottleneck
  • Splunk is designed to be distributed; adding more indexers increases search performance nearly linearly

Lucky you - I wish someone would give me some physical servers (that weren't ready for the junk heap)! I would probably make them all indexers, unless my search head was overloaded. Adding indexers = reduced search time = more searches per minute = can serve more users effectively.

View solution in original post

0 Karma

lguinn2
Legend

Splunk can definitely use all the resources of the box. It is high-performance multi-threaded code, I would not virtualize the box; this will not help Splunk use it better. The overhead of virtualization will not be offset by any performance improvements. (I am a VMware-certified VCP4, if that helps my credibility on this one.)

Think of the Splunk advice as "Spend your money on lots of average-sized servers, rather than a few giant servers." Why?

  • average-sized "commodity" servers are relatively cheap. When you start buying "extra large" memory, etc., it tends to come at a premium price. So "commodity" servers are the most effective way to spend the money. (But your servers were free!)
  • more boxes gives you more concurrent IO - for Splunk indexers, IO is usually the bottleneck
  • Splunk is designed to be distributed; adding more indexers increases search performance nearly linearly

Lucky you - I wish someone would give me some physical servers (that weren't ready for the junk heap)! I would probably make them all indexers, unless my search head was overloaded. Adding indexers = reduced search time = more searches per minute = can serve more users effectively.

0 Karma

RicoSuave
Builder

You might want to check out this link http://docs.splunk.com/Documentation/Splunk/4.2.3/Installation/CapacityplanningforalargerSplunkdeplo...

The real question is, how much data are you indexing and how many users do you have? I would probably just set up the "big" boxes up as dedicated indexers and the lesser box as a dedicated search head. The biggest consideration is Disk I/O performance. If you have a lot of users, it might make more sense to make one of the big boxes a dedicated search head instead to accommodate more concurrent searches.

0 Karma
Get Updates on the Splunk Community!

Introducing Splunk Enterprise 9.2

WATCH HERE! Watch this Tech Talk to learn about the latest features and enhancements shipped in the new Splunk ...

Adoption of RUM and APM at Splunk

    Unleash the power of Splunk Observability   Watch Now In this can't miss Tech Talk! The Splunk Growth ...

Routing logs with Splunk OTel Collector for Kubernetes

The Splunk Distribution of the OpenTelemetry (OTel) Collector is a product that provides a way to ingest ...