Splunk Enterprise

Splunk Forwarder Performance

smahtha
Engager

We have a "complicated" directory structure for logging. I wanted to understand the impact on performance for Splunk Forwarder for the below case.

Lets say our logs have the below structure /foo/bar*/*/abc/def/ghi*.log and /foo/bar*/*/abc/def/jkl*.log

Now if we use the above wildcard structure for monitoring purpose, Splunk will translate it as
[monitor://foo/]
whitelist: <appropriately set for ghi*.log and jkl*.log files]

This will effectively monitor the directory and all the subdirectories inside /foo/. And as per my understanding, whenever a new files is added or updated anywhere inside /foo/ it will be compared against the whitelist by Splunk forwarder to decide whether to index it or not.

The number of files eligible for logging is not a lot (hardly 3-4). But the number of files that can be created inside /foo/ (all of which are eligible for whitelist comparison) can be in thousands.

Can anyone give a rough estimate of [if understood and known] the impact of this kind of directory structure will have on Splunk forwarder performance.

Tags (1)
0 Karma
1 Solution

smahtha
Engager

We did a performance profling to understand the impact of the "directory structure and number of files eligible for whitelist comparison" on Splunk forwarder performance. The overall idea is this:
a) If a higher level directory is monitored instead of 'n' lower level directories [eg. /foo/ vs (/foo/bar1/123/abc/def/ and /foo/bar2/123/abc/def/ ... 10-times ... /foo/bar10/123/abc/def/)] CPU usage of the forwarder is higher.
b) If the number of files eligible for whitelist comparison is more then also CPU usage is greater.

Now some hard numbers.
We compared monitoring /foo/ vs ~50 /foo/bar1/123/abc/def/. In the case of /foo/ the number of files eligible for monitoring were around ~1000 whereas the number of files eligible for monitoring in the case of 50 /foo/bar1/123/abc/def/ was ~200.

The CPU usage in the first case rose up to around 10% whereas in the second case it was around 2%. The memory footprint remained roughly the same arounf 120MB.

View solution in original post

smahtha
Engager

We did a performance profling to understand the impact of the "directory structure and number of files eligible for whitelist comparison" on Splunk forwarder performance. The overall idea is this:
a) If a higher level directory is monitored instead of 'n' lower level directories [eg. /foo/ vs (/foo/bar1/123/abc/def/ and /foo/bar2/123/abc/def/ ... 10-times ... /foo/bar10/123/abc/def/)] CPU usage of the forwarder is higher.
b) If the number of files eligible for whitelist comparison is more then also CPU usage is greater.

Now some hard numbers.
We compared monitoring /foo/ vs ~50 /foo/bar1/123/abc/def/. In the case of /foo/ the number of files eligible for monitoring were around ~1000 whereas the number of files eligible for monitoring in the case of 50 /foo/bar1/123/abc/def/ was ~200.

The CPU usage in the first case rose up to around 10% whereas in the second case it was around 2%. The memory footprint remained roughly the same arounf 120MB.

Get Updates on the Splunk Community!

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...

Introducing the 2024 Splunk MVPs!

We are excited to announce the 2024 cohort of the Splunk MVP program. Splunk MVPs are passionate members of ...

Splunk Custom Visualizations App End of Life

The Splunk Custom Visualizations apps End of Life for SimpleXML will reach end of support on Dec 21, 2024, ...