Getting Data In

IIS log file data duplication - "Checksum for seekptr didn't match, will re-read entire file"

mParticle
Explorer

I have a base install of one indexer and a few UFs. The indexer and UFs are all version 6.0, build 182037 (the UFs are on Windows Server 2012, the indexer is on Ubuntu).

In the UF's .\etc\system\local\inputs.conf I have a basic stanza:

[monitor://C:\inetpub\logs\LogFiles\W3SVC1]
sourcetype = iis
index = iis_logs
disabled = false

After making the change above and restarting the UF, it starts reading the IIS logs, then logs the following entries:

12-02-2013 11:54:39.390 -0500 INFO  WatchedFile - Checksum for seekptr didn't match, will re-read entire file='C:\inetpub\logs\LogFiles\W3SVC1\u_ex131202.log'.
12-02-2013 11:54:39.390 -0500 INFO  WatchedFile - Will begin reading at offset=0 for file='C:\inetpub\logs\LogFiles\W3SVC1\u_ex131202.log'.
12-02-2013 11:54:39.437 -0500 INFO  WatchedFile - Resetting fd  to re-extract header.

and then a couple of minutes later the three lines above repeat, then again, and again, duplicating data, eating into the indexing quota, and chewing through disk space. I'm not the only person with this issue, judging from a quick search through Answers; here is one such post. I tried the workaround in that post and it worked, but since Splunk 6.0 changed the way IIS logs are handled (see this product announcement), I thought I'd try to use the new way instead of hacking around it and (probably) having it break once this gets fixed.
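
For reference, the "old way" amounts to monitoring with a sourcetype that does not go through the new structured (INDEXED_EXTRACTIONS) parsing on the forwarder. This is a minimal sketch only, not the exact content of the linked post: iis_plain is an illustrative name, and you would still need matching props.conf settings on the indexer for timestamps and field extraction.

# inputs.conf on the UF - sketch only
[monitor://C:\inetpub\logs\LogFiles\W3SVC1]
# iis_plain has no INDEXED_EXTRACTIONS stanza in props.conf on the UF, so the
# 6.0 structured-parsing path (where this re-read appears to be triggered) is
# bypassed and parsing happens on the indexer, as it did before 6.0
sourcetype = iis_plain
index = iis_logs
disabled = false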

Does anyone have any suggestions? An official fix maybe?

Thanks in advance!

1 Solution

ekost
Splunk Employee

There is an issue that causes duplicate IIS events to appear when using a new feature in Splunk 6.0. The Answers post here discusses the issue.


ekost
Splunk Employee

No, not yet. The core issue is still being investigated. A workaround has been identified for use with version 6 forwarders and is being validated.


mParticle
Explorer

Thanks. Do you have any details on when it is coming out?


buckeye07
Engager

I have the same issue, but my logs are other W3C formats and the files are much larger, so the impact is greater for me. No answers yet, but I'll report back if I figure something out.


stephanyespence
New Member

Us too. Anyone have a solution yet??


bruceclarke
Contributor

Did you ever find a solution to this? We're running into the same issue, and it's causing us to forward gigabytes of data from what should be only about 10 MB daily.


ShaneNewman
Motivator

I think everyone has this issue. The checksum is actually there to make sure that if a log rolls over and keeps the same name, Splunk doesn't mistake the new file for one it has already indexed. Chances are, if you see this message, the files are extremely small and won't really impact your indexing license unless that single UF is monitoring hundreds of thousands of files (not going to happen).
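
(For context on the mechanism above: the forwarder fingerprints each file by a CRC of its first bytes, and the relevant inputs.conf settings look like the sketch below. This is illustrative only, not a fix for the bug in this thread, and changing crcSalt on an existing input will itself cause re-indexing.)

[monitor://C:\inetpub\logs\LogFiles\W3SVC1]
# initCrcLength: how many leading bytes are hashed to identify a file (default 256);
# raising it helps when many files start with an identical header
initCrcLength = 1024
# crcSalt = <SOURCE> mixes the full path into the CRC; use with care, since
# changing it causes files that were already indexed to be read again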

We have hundreds of files being monitored via a Heavy Forwarder, and we get this message any time it is restarted. Unless we clean the fishbucket before starting the instance, the most I have seen re-indexed after a restart is about 15 MB.
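
(Roughly, "cleaning the fishbucket" on a full instance or heavy forwarder means something like the commands below, run from $SPLUNK_HOME/bin with the instance stopped. Treat the exact btprobe syntax as an assumption to verify against the docs for your version.)

# wipe the whole fishbucket - everything monitored is re-indexed on the next start
./splunk clean eventdata -index _thefishbucket

# or reset the tracking record for a single file
./splunk cmd btprobe -d $SPLUNK_HOME/var/lib/splunk/fishbucket/splunk_private_db --file <path to monitored file> --reset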


ShaneNewman
Motivator

I agree, if it is re-indexing files of that size, that is a bug.


mParticle
Explorer

Well, the files are small, but not THAT small: 5-10 MB each. I know Splunk can calculate a hash of a file so it knows whether the file rolled over, but these files do NOT roll over; they are only appended to, and this behavior was different (read "working properly") prior to version 6, so I'd say this is a bug. Also, it only takes a minute or two to ingest a 10 MB file, so multiply that by 24 hours and then by a few web servers, and you'll quickly see how it can become an issue for anyone who doesn't have a large Splunk license.
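
(To put rough, illustrative numbers on that: a ~10 MB file re-read every couple of minutes is on the order of 300 MB an hour, or roughly 7 GB a day from a single log file, before multiplying by the number of web servers.)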

Bottom line, I think this is a bug and should be addressed.
