Indexing issue with IIS logs (File will not be read, seekptr checksum did not match)

DaClyde
Contributor

I'm supporting a system where we have deployed servers that are uploading their IIS logs to a central location. The indexer is configured to monitor the central location where each deployed server has its own uniquely named folder structure. The deployed servers are configured to upload their IIS logs every 12 hours. The IIS logs are configured to roll every day, but because the servers are uploading the logs twice a day, that means each log should be updated at least once.

So far, we've not had any issues (that I'm aware of) with duplicate events. However, some logs are simply not being indexed, and checking the _internal log today, I noticed a lot of these entries for the "missing" logs:

File will not be read, seekptr checksum did not match (file=\FILESERVER\SHARE\DEPT\UNIQUE_SVR_NAME\_admin\iislogs\u_ex170518.log). Last time we saw this initcrc, filename was different. You may wish to use larger initCrcLen for this sourcetype, or a CRC salt on this source.

And also some of these, which I assume just means the total log length was shorter than the default 256 byte initCrcLength value?

File will not be read, is too small to match seekptr checksum (file=\FILESERVER\SHARE\DEPT\UNIQUE_SVR_NAME\_admin\iislogs\u_ex170515.log). Last time we saw this initcrc, filename was different. You may wish to use larger initCrcLen for this sourcetype, or a CRC salt on this source.

The vast majority of these logs are being indexed just fine. What do I need to do to clean up these outliers? Just set initCrcLength to something longer? I don't want any duplication, but I do want to be sure all of the logs are being indexed. I'm reading the documentation, but I don't understand crcSalt and initCrcLength well enough to know exactly what to do with them, or whether they would actually solve this problem.

1 Solution

somesoni2
Revered Legend

You get this error when different files share the same first 256 bytes (the default initCrcLength). One option is to increase initCrcLength for the input so that each day's file gets a unique CRC.
Assuming the file name contains the date and the files are updated in place (either the whole content is replaced or new lines are appended to the end of the file), you can instead use crcSalt = <SOURCE> (use that exact string), so that the CRC is salted with the file path and each day's file gets a unique CRC.
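
As a rough sketch of the crcSalt option, the stanza might look something like the following. The monitor path is only a hypothetical pattern based on the path in the error message, so adjust it to the real share layout; crcSalt = <SOURCE> is the setting named above.

```
# inputs.conf on the instance that monitors the share (sketch only).
# Hypothetical path pattern: one folder per deployed server under DEPT.
[monitor://\\FILESERVER\SHARE\DEPT\*\_admin\iislogs\*.log]
# Salt the initial CRC with the full source path, so files that begin
# with identical bytes are no longer mistaken for one another.
# <SOURCE> is a literal string, not a placeholder to fill in.
crcSalt = <SOURCE>
```

One thing to keep in mind: with crcSalt = <SOURCE>, a file that is later renamed or moved looks like a new file and is indexed again. That should not matter for date-stamped IIS logs that keep their names, but it is worth checking if anything in the upload process renames files.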

DaClyde
Contributor

So since my IIS logs have all of this stuff at the top:

#Software: Microsoft Internet Information Services 5.1
#Version: 1.0
#Date: 2004-09-29 00:13:03
#Fields: time c-ip cs-method cs-uri-stem sc-status

Is that included in the initCrcLength calculation, or since my transform is configured to ignore anything beginning with #, does the length calculation start at the actual event that is indexed?

somesoni2
Revered Legend

Yes, the header is included in the CRC. The CRC is calculated on the raw file at the forwarder level, while the transform is applied later at the indexer or heavy forwarder, so ignoring those lines doesn't affect the CRC calculation.

DaClyde
Contributor

These logs aren't being forwarded. Does your statement change if the files are monitored directly by the indexer?

somesoni2
Revered Legend

It still holds true: the CRC calculation and the transform run one after the other and are handled by different components of the Splunk engine. After any change you make, you need to restart Splunk so that it re-enumerates the list of monitored files and their CRCs.

niketn
Legend

It seems some of your logs are being identified as duplicates because their initial CRC (cyclic redundancy check) matches that of a file Splunk has already seen. Have you already applied crcSalt=<SOURCE> for your input?
If setting crcSalt to <SOURCE> does not work, then you may actually have to increase initCrcLength. Refer to the documentation: https://docs.splunk.com/Documentation/Splunk/latest/Admin/Inputsconf
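
If it comes to that, the initCrcLength variant could look roughly like this. Again, the monitor path is a hypothetical pattern taken from the error message, and 1024 is only an illustrative value; the idea is to pick a length that reaches past whatever the colliding files have in common at the top.

```
# inputs.conf (sketch only); hypothetical path pattern, adjust as needed.
[monitor://\\FILESERVER\SHARE\DEPT\*\_admin\iislogs\*.log]
# Use more than the default 256 bytes when computing the initial CRC.
# 1024 is an example value, not a recommendation.
initCrcLength = 1024
```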

Also check out the following answers, which discuss adding a custom string to make files unique instead of using the complete source path via <SOURCE>.
https://answers.splunk.com/answers/35210/crcsalt-issue.html
https://answers.splunk.com/answers/186232/how-to-configure-inputsconf-to-apply-crcsalt-for-o.html

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

DaClyde
Contributor

If I increase the initCrcLength setting, will Splunk automatically re-read the files it skipped or do I have to do something to get it to retry?

niketn
Legend

Manually move the files to a separate location where they will not be read by Splunk. Once crcSalt=<SOURCE> is in place, copy the files back to the folder being monitored.
