Hi
I am using syslog-ng to store my server logs, and Splunk monitors the syslog.log file under /opt/splunk/var/syslog-ng/. While Splunk is monitoring the file, I see duplicate events in my index. When I checked the splunkd log file, I found that at particular timestamps the same line is logged twice, for example:
06-17-2013 07:18:48.691 +0100 INFO WatchedFile - Will begin reading at offset=0 for file='/opt/splunk/var/syslog-ng/syslog.log'.
06-17-2013 07:18:48.691 +0100 INFO WatchedFile - Will begin reading at offset=0 for file='/opt/splunk/var/syslog-ng/syslog.log'.
So Splunk appears to read the file twice, hence the duplicate events in my index. Here is a snippet of the splunkd log file:
06-17-2013 07:18:30.689 +0100 INFO WatchedFile - Will begin reading at offset=0 for file='/opt/splunk/var/syslog-ng/syslog.log'.
06-17-2013 07:18:33.690 +0100 INFO WatchedFile - Will begin reading at offset=0 for file='/opt/splunk/var/syslog-ng/syslog.log'.
06-17-2013 07:18:36.690 +0100 INFO WatchedFile - Will begin reading at offset=0 for file='/opt/splunk/var/syslog-ng/syslog.log'.
06-17-2013 07:18:39.690 +0100 INFO WatchedFile - Will begin reading at offset=0 for file='/opt/splunk/var/syslog-ng/syslog.log'.
06-17-2013 07:18:42.690 +0100 INFO WatchedFile - Will begin reading at offset=0 for file='/opt/splunk/var/syslog-ng/syslog.log'.
06-17-2013 07:18:45.692 +0100 INFO WatchedFile - Will begin reading at offset=0 for file='/opt/splunk/var/syslog-ng/syslog.log'.
06-17-2013 07:18:48.691 +0100 INFO WatchedFile - Will begin reading at offset=0 for file='/opt/splunk/var/syslog-ng/syslog.log'.
06-17-2013 07:18:48.691 +0100 INFO WatchedFile - Will begin reading at offset=0 for file='/opt/splunk/var/syslog-ng/syslog.log'.
06-17-2013 07:18:50.551 +0100 INFO BatchReader - Removed from queue file='/opt/splunk/var/syslog-ng/syslog.log'.
06-17-2013 07:18:56.561 +0100 INFO TcpOutputProc - Connected to idx=host1:8089
06-17-2013 07:19:26.563 +0100 INFO TcpOutputProc - Connected to idx=host2:8089
06-17-2013 07:19:56.576 +0100 INFO TcpOutputProc - Connected to idx=host3:8089
Can anyone help me understand what is happening here? Why is Splunk reading the file twice and generating duplicate events?
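To quantify how often the file is picked up from the start, the WatchedFile lines in splunkd.log can be tallied. The following is a minimal sketch, assuming the message format matches the lines quoted above; the function name `count_offset_zero_reads` is hypothetical and not part of any Splunk tooling.

```python
import re
from collections import Counter

# Matches the WatchedFile message exactly as it is quoted in the post above.
OFFSET_ZERO = re.compile(
    r"^(?P<ts>\S+ \S+ \S+) INFO\s+WatchedFile - "
    r"Will begin reading at offset=0 for file='(?P<path>[^']+)'"
)


def count_offset_zero_reads(lines):
    """Return (reads per file, (file, timestamp) pairs logged more than once)."""
    per_file = Counter()
    per_stamp = Counter()
    for line in lines:
        m = OFFSET_ZERO.match(line.strip())
        if m:
            per_file[m.group("path")] += 1
            per_stamp[(m.group("path"), m.group("ts"))] += 1
    repeated = {key: n for key, n in per_stamp.items() if n > 1}
    return per_file, repeated


if __name__ == "__main__":
    # Three lines copied from the snippet in the question.
    sample = [
        "06-17-2013 07:18:45.692 +0100 INFO WatchedFile - Will begin reading at offset=0 for file='/opt/splunk/var/syslog-ng/syslog.log'.",
        "06-17-2013 07:18:48.691 +0100 INFO WatchedFile - Will begin reading at offset=0 for file='/opt/splunk/var/syslog-ng/syslog.log'.",
        "06-17-2013 07:18:48.691 +0100 INFO WatchedFile - Will begin reading at offset=0 for file='/opt/splunk/var/syslog-ng/syslog.log'.",
    ]
    per_file, repeated = count_offset_zero_reads(sample)
    print(per_file)
    print(repeated)
```

Running it over the whole of splunkd.log (for example with `open(path)` as the `lines` argument) would show whether the offset=0 restarts line up with the 5-minute rotation schedule or happen more often.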
For log rotation of the syslog-ng output I have defined the following configuration in /etc/logrotate.d/syslog-ng:
/opt/splunk/var/syslog-ng/syslog.log {
size 30M
copytruncate
create 750 splunk splunk
rotate 500
}
A crontab entry checks the syslog.log size every 5 minutes and rotates the file when needed:
#Added entry to rotate logs generated from syslog-ng
*/5 * * * * /usr/sbin/logrotate /etc/logrotate.d/syslog-ng
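For readers unfamiliar with `copytruncate`: logrotate copies the live log to the rotated name and then truncates the original in place, so the file keeps its name while its size drops. The sketch below only imitates that behaviour with temporary files to show what a tailing reader with a saved offset would observe; it is an illustration under that assumption and does not establish that this is the cause of the duplicates in this thread. The helper names are hypothetical.

```python
import os
import shutil
import tempfile


def copytruncate(path, rotated_path):
    """Imitate logrotate's copytruncate: copy the log, then empty it in place."""
    shutil.copyfile(path, rotated_path)
    with open(path, "r+") as f:
        f.truncate(0)


def demo():
    workdir = tempfile.mkdtemp()
    log = os.path.join(workdir, "syslog.log")
    rotated = log + ".1"

    with open(log, "w") as f:
        f.write("event 1\nevent 2\n")

    # Pretend a tailing reader has consumed the whole file so far.
    saved_offset = os.path.getsize(log)

    copytruncate(log, rotated)

    # The writer keeps appending to the same, now empty, file.
    with open(log, "a") as f:
        f.write("event 3\n")

    size_after = os.path.getsize(log)
    # The file is now shorter than the saved offset, so the reader
    # cannot resume where it left off and has to start over.
    must_restart = size_after < saved_offset

    with open(rotated) as f:
        rotated_content = f.read()

    shutil.rmtree(workdir)
    return saved_offset, size_after, must_restart, rotated_content


if __name__ == "__main__":
    print(demo())
```

Note also that the rotated copy (syslog.log.1) contains the same events the reader already consumed, which is why the replies in this thread focus on keeping rotated files out of the monitored set.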
I clearly see duplicates. You can see the same in the screenshot below.
I had the exact same problem with syslog (and others).
Try changing the inputs.conf for monitor:///var/log (or whichever stanza controls your syslogs):
blacklist = \.gz$
whitelist = \.log$
Basically, you want to ignore all rotated logs, and just tail the current log. It worked for me.
Try this in your inputs.conf. The blacklist is overkill, but it can't hurt. Also, it looks like you're missing a \ before the dot in your whitelist.
[monitor:///opt/splunk/var/syslog-ng/syslog.log]
blacklist = (\.log\.)
whitelist = (syslog\.log$)
queue = parsingQueue
index = xmlgapps
sourcetype = xmlg_syslog
There was a post recently about a bug that resulted in duplicate events, but I don't think it applies to you: http://answers.splunk.com/answers/100001/504-duplicate-blocks-of-events
_time indextime source _raw
1 8/22/13 3:50:44.287 AM 08/22/2013 03:54:22 /opt/splunk/var/syslog-ng/syslog.log 2013-08-22T03:50:44.287+01:00 10.35.90.213 08.22.2013 03:49:07,705 Id-00fbd1d452157c2343408da5 The filter 'Request document:' lXXXXXXXX
2 8/22/13 3:50:44.287 AM 08/22/2013 03:50:46 /opt/splunk/var/syslog-ng/syslog.log 2013-08-22T03:50:44.287+01:00 10.35.90.213 08.22.2013 03:49:07,705 Id-00fbd1d452157c2343408da5 The filter 'Request document:' lXXXXXXXX
I used the query below to check for duplicates, and it clearly shows them:
index="xmlgapps" xmlg_message="Request document" Id-00fbd1d452157c2343408da5 | convert ctime(_indextime) AS indextime | table _time indextime source _raw
In the output you can see the same source, the same _time, and the same _raw event, but a different _indextime, i.e. Splunk is indexing the same event twice.
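The same check can be done offline on exported search results. This is a minimal sketch, assuming the results have been loaded as dictionaries with the four columns from the table command above; `find_reindexed` is a hypothetical helper name, and the `_raw` text is shortened for readability.

```python
from collections import defaultdict


def find_reindexed(events):
    """Group events by (_time, source, _raw); keep groups with >1 index time."""
    index_times = defaultdict(set)
    for event in events:
        key = (event["_time"], event["source"], event["_raw"])
        index_times[key].add(event["indextime"])
    return {
        key: sorted(times)
        for key, times in index_times.items()
        if len(times) > 1
    }


if __name__ == "__main__":
    # Values taken from the two result rows posted in this thread.
    rows = [
        {
            "_time": "8/22/13 3:50:44.287 AM",
            "indextime": "08/22/2013 03:54:22",
            "source": "/opt/splunk/var/syslog-ng/syslog.log",
            "_raw": "2013-08-22T03:50:44.287+01:00 10.35.90.213 ... Id-00fbd1d452157c2343408da5",
        },
        {
            "_time": "8/22/13 3:50:44.287 AM",
            "indextime": "08/22/2013 03:50:46",
            "source": "/opt/splunk/var/syslog-ng/syslog.log",
            "_raw": "2013-08-22T03:50:44.287+01:00 10.35.90.213 ... Id-00fbd1d452157c2343408da5",
        },
    ]
    for key, times in find_reindexed(rows).items():
        print(key[1], times)
```

The gap between the index times in each group is worth comparing against the rotation schedule and the offset=0 messages in splunkd.log.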
By the way, this config is defined on a heavy forwarder running version 4.3.2. Is this a bug in Splunk?
There is no other input configuration; only one input is defined, and at present it looks like this:
[monitor:///opt/splunk/var/syslog-ng/syslog.log]
queue = parsingQueue
index = xmlgapps
sourcetype = xmlg_syslog
whitelist = (syslog.log$)
Log rotation is done by the above-mentioned /etc/logrotate.d/syslog-ng configuration and the cron job. By default the rotated logs are named syslog.log.1, syslog.log.2, and so on. I cannot find any clue as to where it is going wrong. 😞
Let's back up a bit. Earlier you said you 'removed the tail option from inputs.conf and still saw duplicates'. That is not possible unless there is another input configuration for the syslog folder. It could be a script, or a UDP input. What apps do you have installed on the heavy forwarder?
Also, please post your current input stanza, and how are you changing the log file names when you rotate them?
Yes, I ran it, and I can see only one monitored file and only one Splunk instance reading it. The heavy forwarder is installed on my syslog server.
Did you run the btool on the syslog server?
./splunk cmd btool inputs list
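When the btool output is long, it can help to pull out just the stanza headers and look for more than one input that touches the syslog-ng directory. The sketch below assumes the output has been saved as text and that stanza headers appear in the usual `[name]` form; the helper names and the sample output are hypothetical, not real output from this forwarder.

```python
import re

# A stanza header is a line consisting only of [something].
STANZA = re.compile(r"^\[(?P<name>[^\]]+)\]\s*$")


def list_stanzas(btool_output):
    """Return every stanza name found in saved btool output, in order."""
    names = []
    for line in btool_output.splitlines():
        m = STANZA.match(line.strip())
        if m:
            names.append(m.group("name"))
    return names


def stanzas_mentioning(btool_output, fragment):
    """Return the stanza names that contain the given path fragment."""
    return [name for name in list_stanzas(btool_output) if fragment in name]


if __name__ == "__main__":
    # Hypothetical sample, only to show the shape of the check.
    sample = """\
[monitor:///opt/splunk/var/syslog-ng/syslog.log]
index = xmlgapps
sourcetype = xmlg_syslog
[monitor:///opt/splunk/var/syslog-ng]
index = main
[udp://514]
sourcetype = syslog
"""
    print(stanzas_mentioning(sample, "/opt/splunk/var/syslog-ng"))
    print(list_stanzas(sample))
```

More than one stanza covering the same path, or an unexpected udp, tcp, or script input, would be the kind of second input configuration asked about earlier in the thread.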
Hmm, I did the same. In the "source" field I see only syslog.log, but I am still seeing duplicates. Please help. Can you share the log rotation configuration for your syslog-ng? Why is Splunk creating duplicates if they are not in my source file? 😞
Don't forget to escape the .
whitelist=(syslog\.log$)
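To see why the escape matters, the patterns from this thread can be tried against a few paths. This is a sketch using Python's `re` module, on the assumption that it behaves the same as Splunk's regex engine for patterns this simple and that the patterns are matched against the full file path; the `is_monitored` helper and the lookalike file name are hypothetical.

```python
import re

WHITELIST_UNESCAPED = r"(syslog.log$)"
WHITELIST_ESCAPED = r"(syslog\.log$)"
BLACKLIST = r"(\.log\.)"


def is_monitored(path, whitelist, blacklist=None):
    """True if the path matches the whitelist and not the blacklist."""
    if re.search(whitelist, path) is None:
        return False
    if blacklist is not None and re.search(blacklist, path) is not None:
        return False
    return True


if __name__ == "__main__":
    current = "/opt/splunk/var/syslog-ng/syslog.log"
    rotated = "/opt/splunk/var/syslog-ng/syslog.log.1"
    lookalike = "/opt/splunk/var/syslog-ng/syslog_log"  # hypothetical file name

    for path in (current, rotated, lookalike):
        print(
            path,
            is_monitored(path, WHITELIST_UNESCAPED),
            is_monitored(path, WHITELIST_ESCAPED, BLACKLIST),
        )
```

An unescaped `.` matches any character, so the unescaped whitelist also accepts the lookalike name. Both whitelists reject syslog.log.1 because of the `$` anchor, which is why the blacklist was described as overkill.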
I added the whitelist option to the above stanza, but even that didn't work. I used something like this:
[monitor:///opt/software/syslog-ng/syslog.log]
queue = parsingQueue
index = xmlgapps
sourcetype = xmlg_syslog
whitelist=syslog.log$
Sorry, the slashes were removed from the post.
You need an escape slash before each '.'
I'll try again.
blacklist = (\.log\.)
Try this.
[monitor:///opt/software/syslog-ng/syslog.log]
blacklist = (.log.)
queue = parsingQueue
index = xmlgapps
sourcetype = xmlg_syslog
The rotated log files are named syslog.log.1, syslog.log.2, and so on.
This is not a script, it is a stanza. And you have it set up to monitor your rotated files. A * is implied after the .log
Try adding the line:
whitelist=(.log$)
Sure. Can you tell me how you are naming the rotated files?
Hi lukejadamec,
I am also monitoring only the current log. The monitor script I have used is:
[monitor:///opt/software/syslog-ng/syslog.log]
queue = parsingQueue
index = xmlgapps
sourcetype = xmlg_syslog
Can you please tell me, based on the configuration I mentioned above, where the problem could be?
Hi rakesh_498115,
This is probably caused by a duplicated input/monitor config. Use the btool command-line tool to check for any duplicates:
./splunk cmd btool inputs list
see docs for more information http://docs.splunk.com/Documentation/Splunk/5.0.3/Troubleshooting/CommandlinetoolsforusewithSupport#...
cheers, MuS