Getting Data In

Monitoring a large number of files

joonradley
Path Finder

We have a server that generates 100k log files a day. The logs must be forwarded to an indexer. Due to the critical nature of the server, we can only install a light forwarder. The files only need to be loaded once; ongoing monitoring is not needed.

Using monitor slows the server to a crawl, and we cannot use batch because the data must be preserved (batch deletes files after indexing). Sadly, we cannot copy the files to another directory for batch input.

I tried using fschange, but it does not forward the actual file contents to the indexer when sendCookData=false.

Any ideas?


eashwar
Communicator

Hello, you've got a spelling error!!

sendCookedData = false

I am learning Splunk!! I set up a forwarder and an indexer, and they are working perfectly. The forwarded logs get indexed in the main index, which is the default.

I want to know how to index the data in a custom index.

Thanks in advance.


stefandagerman
Path Finder

How about you create your own topic, given the completely different nature of your question, once you have determined that the Splunk documentation at http://docs.splunk.com/Documentation/Splunk/latest/admin/inputsconf does not answer it?

Please don't hijack threads, as it is unlikely that you will get a response.


brianirwin
Path Finder

With monitor, the setting I would look at is time_before_close; it exists to tell Splunk not to close a file until x seconds after the last write. The default is 3 seconds, and with only 86,400 seconds in a day, just opening and closing 100K files uses up more time than you have (100,000 files × 3 seconds = 300,000 seconds).

Looking at the manual, it seems that when you override this for monitor in inputs.conf you can only set it to an integer, so even if you go down to 1 you will still be in trouble.

You could try setting time_before_close = 1, but with 100K files it is still going to take longer than you want.
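
For reference, a minimal sketch of what that override might look like in inputs.conf on the forwarder; the monitored path /var/log/app is just a placeholder, not something from the thread:

[monitor:///var/log/app]
# time_before_close takes whole seconds; 1 is as low as it goes
time_before_close = 1

Even at 1 second per file, 100,000 files means 100,000 seconds of close delay, which is still more than the 86,400 seconds in a day.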

To the earlier point, you may need to tarball or cat x number of files together and send the result to a separate directory where you sinkhole/batch them, or do anything else to reduce the number of files to be eaten; a sketch follows below. If nothing else, I think your inode tables will thank you if you can combine some of these files.
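
As a rough sketch of that combine-then-batch idea (every path and name here is a placeholder, not from the thread): a scheduled job concatenates finished files into a staging directory, and a batch input then indexes and deletes only the combined copies, leaving the originals untouched:

# hypothetical cron step, run e.g. hourly:
#   cat /var/log/app/done/*.log > /opt/splunk_batch/app-$(date +%H).log
[batch:///opt/splunk_batch]
# batch requires sinkhole: files are deleted once they are indexed
move_policy = sinkhole
sourcetype = app_logs

Because only the staged copies are sinkholed, the original files are preserved.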

Genti
Splunk Employee

Perhaps you could tarball the files into a .gz archive and have Splunk monitor that instead.
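
A hedged sketch of that approach (the archive name and paths are placeholders): a nightly job packs the day's files into one archive and drops it where a monitor stanza is watching, since Splunk unpacks and indexes archive files it monitors:

# hypothetical nightly step:
#   tar -czf /opt/splunk_drop/app-$(date +%F).tar.gz /var/log/app/*.log
[monitor:///opt/splunk_drop]
sourcetype = app_logs

One caveat: if an existing archive is modified, Splunk re-indexes the whole thing, so writing a fresh dated filename each day avoids duplicates.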
