Why are my files being re-indexed?

SK110176
Path Finder

I'm noticing tons of duplicate events, and the following message in splunkd.log correlates with the time I started seeing the dupes. It also started after I upgraded from v4.0.9 to v4.1.4:

"File too small to check seekcrc, probably truncated. Will re-read entire file=....."

Does anyone know why this is occurring?

My settings in inputs.conf include:

crcSalt = <SOURCE>
followtail = 1

I've already checked for the following, and none of these apply:

Causes of reindexing:

File contents (especially the first 256 bytes) are modified in-place. This shouldn't happen for log files (they're supposed to be a record).

The CHECK_METHOD for the files was set to entire_md5 or modtime. This forces the files to be reindexed.

Some sourcetypes like 'text_file' intentionally set the CHECK_METHOD because it is desired to index the complete file each time.
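To illustrate the first cause in the list above, here is a minimal sketch, assuming the check is a checksum over the first 256 bytes of the file. `zlib.crc32` is only a stand-in (Splunk's actual algorithm isn't shown in this thread), and `head_crc` and the sample log lines are made up for the example:

```python
import zlib

HEAD_LEN = 256  # leading bytes that are checked, per the list above


def head_crc(data: bytes) -> int:
    """Checksum of the first HEAD_LEN bytes (stand-in for the real check)."""
    return zlib.crc32(data[:HEAD_LEN])


# Hypothetical log content, longer than 256 bytes.
original = b"2010-08-01 12:00:00 INFO start\n" * 20
appended = original + b"2010-08-01 12:00:01 INFO more\n"  # normal log growth
modified = b"X" + original[1:]                            # first byte edited in place

# Appending leaves the first 256 bytes untouched, so the checksum matches.
print(head_crc(original) == head_crc(appended))
# An in-place edit near the start changes the checksum, so the file looks new.
print(head_crc(original) == head_crc(modified))
```

Under this model, a file that only grows keeps its identity, while one whose beginning is rewritten is treated as a different file and read again from the start.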

Genti
Splunk Employee

crcSalt = <SOURCE>
followtail = 1

crcSalt = <string>
* Use this to force Splunk to consume files with matching CRCs.
* Set any string to add to the CRC.
* If set to "crcSalt = <SOURCE>", then the full source path is added to the CRC.
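A rough way to picture what the salt does, assuming the CRC is taken over the start of the file with the salt string mixed in. This is a sketch only: `zlib.crc32`, the `salted_crc` helper, the way the salt is combined, and the file paths are all assumptions for illustration, not Splunk's implementation:

```python
import zlib

HEAD_LEN = 256  # assumed number of leading bytes that feed the CRC


def salted_crc(head: bytes, salt: str = "") -> int:
    """CRC of the leading bytes with an optional salt mixed in (illustrative)."""
    return zlib.crc32(salt.encode("utf-8") + head[:HEAD_LEN])


# Two files that begin with the same header (hypothetical content and paths).
head = b"# standard header shared by every log file\n"
path_a = "/var/log/app/a.log"
path_b = "/var/log/app/b.log"

# Without a salt, identical beginnings give identical CRCs,
# so the second file looks like one that was already read.
unsalted_same = salted_crc(head) == salted_crc(head)

# With the source path as the salt, each path gets its own CRC.
salted_same = salted_crc(head, path_a) == salted_crc(head, path_b)

print(unsalted_same, salted_same)
```

In this model, the same file seen under a different path also gets a different CRC, so anything that changes the path or the way the CRC is computed makes files that were already indexed look new.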

I'm assuming that after the upgrade Splunk is computing a different CRC for these files, and this is causing the double indexing.
