Log file in /etc/log is reindexed resulting in duplicate events

elusive
Splunk Employee

I have Splunk monitoring a log directory, /etc/log. The logs in this directory are updated and rotated. However, Splunk keeps reindexing the files from the beginning, resulting in many duplicate events.

elusive
Splunk Employee

The issue is caused by the logs living under an /etc directory. By default, Splunk assumes that files under an /etc directory are configuration files, and a default stanza in $SPLUNK_HOME/etc/system/default/props.conf makes it reindex the whole file whenever the modification time changes:

[source::(.../etc/...|....(config|conf|cfg|inii|cfg|emacs|ini|license|lng|plist|presets|properties|props|vim|wsdl))]
sourcetype=config_file
CHECK_METHOD = modtime

The workaround is to override the default with a source stanza in props.conf that has a higher priority and sets CHECK_METHOD to something other than modtime. A host or sourcetype stanza in props.conf will not work here.

For example:

[source::/etc/log/*]
sourcetype=log_in_etc
CHECK_METHOD = endpoint_md5
priority = 10

The messages to look for in splunkd.log are:

02-09-2011 21:15:17.993 INFO  WatchedFile - Will use tracking rule=modtime for file='/etc/log/test.log'.
02-09-2011 21:15:17.994 INFO  WatchedFile - Modtime is newer than stored, will reread file='/etc/log/test.log'.
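
If splunkd.log is large, a short script can list every file that Splunk reports as tracked by modtime. This is only a rough sketch: the function name is made up for illustration, and the pattern is taken from the two sample messages above, so other Splunk versions may word the message differently.

```python
import re

# Pattern assumed from the sample messages in this thread; adjust it if
# your splunkd.log words the message differently.
TRACKING_RULE = re.compile(
    r"WatchedFile - Will use tracking rule=modtime for file='([^']+)'"
)


def files_tracked_by_modtime(lines):
    """Return a sorted list of file paths that splunkd reports as tracked by modtime.

    `lines` is any iterable of log lines, e.g. an open splunkd.log file object.
    (Hypothetical helper, not part of Splunk.)
    """
    found = set()
    for line in lines:
        match = TRACKING_RULE.search(line)
        if match:
            found.add(match.group(1))
    return sorted(found)


# Example usage (the log path depends on your installation):
# with open("/path/to/splunkd.log", errors="replace") as log:
#     for path in files_tracked_by_modtime(log):
#         print(path)
```

Any path it prints that is really a log file rather than a configuration file is a candidate for a source stanza override like the one above.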