
Log file in /etc/log is reindexed resulting in duplicate events

elusive
Splunk Employee

I have Splunk monitoring a log directory, /etc/log. The logs in this directory are updated and rotated. However, Splunk keeps reindexing the files from the beginning, which results in many duplicate events.


elusive
Splunk Employee

The issue is caused by keeping the logs under the /etc directory. By default, Splunk assumes that files under an /etc directory are configuration files, and $SPLUNK_HOME/etc/system/default/props.conf ships with a stanza that reindexes the whole file whenever its modification time changes:

[source::(.../etc/...|....(config|conf|cfg|inii|cfg|emacs|ini|license|lng|plist|presets|properties|props|vim|wsdl))]
sourcetype=config_file
CHECK_METHOD = modtime

The workaround is to add a source stanza in props.conf with a higher priority than the default stanza, and to set CHECK_METHOD in it so that modtime is no longer used. A host or sourcetype stanza in props.conf will not work here.

For example:

[source::/etc/log/*] 
sourcetype=log_in_etc
CHECK_METHOD = endpoint_md5 
priority = 10 

The messages to look for in splunkd.log look like this:

02-09-2011 21:15:17.993 INFO  WatchedFile - Will use tracking rule=modtime for file='/etc/log/test.log'.
02-09-2011 21:15:17.994 INFO  WatchedFile - Modtime is newer than stored, will reread file='/etc/log/test.log'.
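
To find every file affected by this, not just one, you can scan splunkd.log for the tracking-rule message shown above. The following is a minimal Python sketch, not a Splunk tool. The function names are made up for illustration, and the regular expression assumes the message format matches the two sample lines exactly.

```python
import re

# Matches the WatchedFile message that reports the tracking rule for a file.
# The pattern is based only on the sample log lines in this thread.
TRACKING_RE = re.compile(
    r"WatchedFile - Will use tracking rule=(?P<rule>\w+) for file='(?P<path>[^']+)'"
)


def files_tracked_by_modtime(lines):
    """Return the sorted, de-duplicated file paths reported with rule=modtime."""
    paths = set()
    for line in lines:
        match = TRACKING_RE.search(line)
        if match and match.group("rule") == "modtime":
            paths.add(match.group("path"))
    return sorted(paths)


def scan_log(log_path):
    """Read a splunkd.log file and return the paths tracked by modtime."""
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        return files_tracked_by_modtime(handle)
```

For example, calling scan_log on $SPLUNK_HOME/var/log/splunk/splunkd.log (the usual location, adjust for your install) lists the monitored files that still use modtime. After the props.conf override is in place and Splunk has been restarted, files under /etc/log should no longer appear in newly written log lines.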