Monitoring Splunk

From forwarder to index to search is taking too long -- roughly 10 to 15 minutes

wwhitener
Communicator

Hey all,

I have a system that is generating a log that I need to have indexed and pulled into Splunk. The system runs on several individual boxes--each box spits out its own output and was set up with a 3.4.5 forwarder to send it to our central server.

The only problem is that it is now taking 10+ minutes for data to get from the system to the saved searches on the Splunk server. The system clocks are in sync, and there are no time zone differences to skew the timing. The entries appear in order, so it doesn't look like entries are being lost or anything.

I am thinking that adjusting some of the configuration could help reduce the delay. These options looked somewhat promising:

  • Increasing maxKBps (the forwarder's throughput cap in limits.conf)
  • Making sure that we are tailing the file, rather than re-reading the entire file
  • Setting indexAndForward = false so the local 3.4.5 forwarder doesn't index the data itself
  • Sending raw data instead of cooked data
  • Tuning the tcpout settings in outputs.conf (a sketch of these settings follows this list)
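
A rough sketch of what those forwarder-side settings might look like, assuming limits.conf and outputs.conf on the forwarder; the throughput value, group name, server name, and port below are placeholders, not recommendations:

# limits.conf on the forwarder -- raise the default throughput cap (value is KB per second)
[thruput]
maxKBps = 1024

# outputs.conf on the forwarder -- forward everything, do not index locally
[tcpout]
defaultGroup = central
indexAndForward = false
# sendCookedData = false would send raw data instead of cooked;
# the default (cooked) is usually what a Splunk indexer expects

[tcpout:central]
server = central-splunk.example.com:9997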

If anyone else has any suggestions on how we might improve the speed from soup to nuts, I'm very interested in hearing them.

Thanks!

0 Karma

Simeon
Splunk Employee

By default, light (universal) forwarders limit themselves to transferring data at a maximum of 256 KBps. In these scenarios, increasing that limit may help you get closer to real-time results.
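
If the forwarder is also sending its own _internal logs to the indexer, a search along these lines can show whether it is pegged near that cap (the host filter is a placeholder):

index=_internal source=*metrics.log group=thruput host=my-forwarder | timechart span=1m avg(instantaneous_kbps) avg(average_kbps)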

You should measure the true delay of the data and check for any indexing problems.

If Splunk is behind on indexing, you will see a delay like this. To check whether Splunk is behind, look for blocked or filled queues:

index=_internal source=*metrics.log blocked

OR

index=_internal source=*metrics.log group=queue | timechart avg(current_size) by name

If your queues are consistently blocked or filled (1000 is the maximum value), then you will need to debug why Splunk is queueing data.
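
To measure the true delay end to end, one option is to compare each event's index time with its event time; a sketch, with the index and sourcetype as placeholders for your data:

index=main sourcetype=my_app_log | eval lag_seconds = _indextime - _time | timechart span=5m avg(lag_seconds) max(lag_seconds)

If the lag here is small but results still show up late in saved searches, the delay is more likely on the search or scheduling side than in forwarding or indexing.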

wwhitener
Communicator

Back again with an update.....

We're down to about 5 to 7 minutes of delay getting from the log to the forwarder to the index. Our times are all synchronized, so there are no issues there.

We are still looking at the settings and tweaking them. Ideally, we want to get the delay down to about 3 minutes.

Thanks for allowing me to pick your great brains.
Wwhitener

0 Karma

wwhitener
Communicator

Thanks everyone.

We're looking at the issue of possible time lags and time zone difficulties right now.

0 Karma

Drainy
Champion

Is there any reason why you couldn't update the system to a more up-to-date universal forwarder?
The system footprint would be smaller (although it still clearly shouldn't take as long as you have indicated).
The number of files shouldn't slow the forwarder down, given how it CRC-checks each file. Have you looked in splunkd.log to see whether any errors or issues are occurring?
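
If the forwarder's _internal data reaches the indexer, a quick way to surface those errors is a search like the one below (the host filter is a placeholder); otherwise, tail $SPLUNK_HOME/var/log/splunk/splunkd.log on the forwarder itself:

index=_internal source=*splunkd.log host=my-forwarder (log_level=ERROR OR log_level=WARN) | stats count by component | sort -count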

I guess with issues like this you want to verify how often these logs are being written and be sure that they are updating very frequently. Perhaps even write a test entry into the log manually and track it through the system?
If it is a large log file you can play around with maxKBps; this shouldn't "restrict" events from showing up, but if the limit is too low for a large file it could delay some events.

Are you doing extractions on the data before sending? That is going to slow things down. I believe the speed of this has improved in 4.x, but in most circumstances it is best to simply define a target index for the data and let the indexer handle the rest.
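
For the "define a target index and let the indexer handle the rest" approach, the forwarder-side input could look roughly like this (the path, index name, and sourcetype are placeholders):

# inputs.conf on the forwarder -- just tail the file and ship it; no local extractions
[monitor:///var/log/myapp/app.log]
index = myapp
sourcetype = myapp_log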

woodcock
Esteemed Legend

You probably have too many files on the forwarder and Splunk is getting bogged down in the housekeeping of checking each one of them for changes (changes that will probably never happen). Try moving/deleting the old files and see if this helps.
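
Two ways to check this on the forwarder are sketched below: list what the tailing processor is currently tracking, and tell it to skip stale files. The path and age are placeholders, and both the CLI command and the ignoreOlderThan setting may not be available on a forwarder as old as 3.4.5.

# From $SPLUNK_HOME/bin on the forwarder: show each monitored file and its status
./splunk list inputstatus

# inputs.conf -- skip files that have not been modified in the last week
[monitor:///var/log/myapp]
ignoreOlderThan = 7d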

0 Karma