Splunk EPS

EricksonOng
Explorer

Indexing throughput.
Events-per-second (EPS) is a common throughput measurement, but consider that event sizes can vary from a few hundred bytes to a megabyte or more. EPS ratings are usually calculated at whatever size is optimal for one specific vendor’s appliance or solution. Look for vendors that index every byte in your data, without the need for custom parsers or connectors. If the vendor is unable or unwilling to quote you EPS figures based on this criteria, move on and find someone who will.

http://www.splunk.com/web_assets/pdfs/secure/Splunk_Guide_to_Operational_Intelligence.pdf
I got the above from Splunk's own Guide to Operational Intelligence and have a question about it.

Assume each log event is 512 bytes and the server is built to Splunk's recommended specs.
What EPS should we expect from that server?
Can the assumption below be made on that basis?

Change limits.conf as follows to remove the throughput cap, so that indexing runs at full speed and maxes out the indexer's processor:
[thruput]
maxKBps = 0

So the maximum indexing speed would be around 9-10 MB/s?

9 MB = 9 * 1024 * 1024 = 9437184 bytes.
9437184 bytes / 512 bytes = 18432 EPS?
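
The arithmetic above can be checked with a short script. This is only a sketch: the 9-10 MB/s throughput and the 512-byte event size are the figures assumed in the question, not measured values, and `eps_for` is a hypothetical helper name, not anything Splunk provides.

```python
# Bytes per MB, using the same binary convention (1024 * 1024) as the post.
BYTES_PER_MB = 1024 * 1024


def eps_for(throughput_mb_per_s: float, event_size_bytes: int) -> float:
    """Events per second for a sustained byte throughput and a fixed event size.

    Hypothetical helper for illustration only; both inputs are assumptions.
    """
    return throughput_mb_per_s * BYTES_PER_MB / event_size_bytes


if __name__ == "__main__":
    # The two ends of the assumed 9-10 MB/s range, at 512 bytes per event.
    for mb_per_s in (9, 10):
        print(f"{mb_per_s} MB/s at 512 B/event: {eps_for(mb_per_s, 512):.0f} EPS")
    # Same byte throughput, but with 1 MB events: EPS drops sharply.
    print(f"9 MB/s at 1 MB/event: {eps_for(9, BYTES_PER_MB):.0f} EPS")
```

The last line shows why the quoted guide warns about EPS ratings: at the same byte throughput, the EPS figure depends entirely on the event size chosen.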

Any comments on the above?


Drainy
Champion

Firstly, maxing out the indexer's processor may hurt performance. (By the way, that maxKBps setting relates to network output from a universal forwarder, or from a Splunk indexer that is forwarding. If you remove the cap on a work network, you risk flooding the network or blocking an indexer.)
There are different "queues" within Splunk that handle different jobs. For example, we're talking about things like:

  • Line breaking
  • Regex field extraction
  • Timestamping
  • Writing unwanted logs to null queue (sending them into the abyss)
  • Perhaps some additional processing on something like windows event logs

If a queue starts to block, it will delay other events or cause them to be dropped, depending on how they are being indexed. Also consider that Splunk can monitor and index local log files, which lets it read and index changes relatively quickly. Compare that with a network connection, where you may be forwarding events from another machine: there you are limited by the disk I/O on the remote server (which may be busy or idle) and by the network traffic. Forwarding also runs over TCP, so data dropped en route may have to be retransmitted.

You may also be receiving syslog from remote systems over a UDP connection, which is another potential source of problems.

I haven't looked at your maths in great detail. Assuming you've taken the right figures, it probably reflects a clean setup on a clean network reading local files. Realistically, the only way to get a real idea is to run a proof of concept and measure how your systems actually perform when indexing into Splunk, including a realistic EPS figure.
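
As a rough illustration of what gauging EPS in a proof of concept could look like, the sketch below averages EPS over a series of samples. It assumes you have some way of collecting (elapsed seconds, cumulative indexed event count) pairs during the test; `measured_eps` is a hypothetical helper and the sample numbers are placeholders, not real measurements.

```python
from typing import Sequence, Tuple

# Each sample is (seconds since the test started, cumulative events indexed).
Sample = Tuple[float, int]


def measured_eps(samples: Sequence[Sample]) -> float:
    """Average events per second between the first and last sample.

    Hypothetical helper for illustration; how the samples are gathered
    is left to whatever tooling the proof of concept uses.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t_start, count_start), (t_end, count_end) = samples[0], samples[-1]
    elapsed = t_end - t_start
    if elapsed <= 0:
        raise ValueError("samples must span a positive time interval")
    return (count_end - count_start) / elapsed


if __name__ == "__main__":
    # Placeholder numbers only, chosen to make the arithmetic easy to follow.
    example = [(0.0, 0), (60.0, 900_000), (120.0, 1_860_000)]
    print(f"average EPS over the run: {measured_eps(example):.0f}")
```

Running the test long enough to cover busy and idle periods matters more than the exact method, since the sustained average is what you would compare against a calculated figure.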

EDIT: Also, regarding the quote above, I believe (although I can't say for sure) the point is that vendors who quote EPS figures are giving overly optimistic results. Splunk knows that data varies, but it does have a very good indexing engine. The best way to see what you can do with it is to try 🙂
