Indexing XML files from universal forwarder

lmacneil76
Explorer

Hi all, I am attempting to index 44,833 XML files for parsing. I know Splunk needs some configuration changes to handle XML well, depending on your needs. In my case, each file is unique.

My problem is that, out of 44,833 XML files, about 300 are flagged as duplicates.

Log Error

06-13-2014 12:43:51.734 -0700 ERROR TailingProcessor - File will not be read, seekptr checksum did not match (file=C:\var\xml\new_production_items\2-10240750-Qti.xml).  Last time we saw this initcrc, filename was different.  You may wish to use a CRC salt on this source.  Consult the documentation or file a support case online at http://www.splunk.com/page/submit_issue for more info.

So I have tried crcSalt and initCrcLength, but I am failing on the implementation. I placed them in inputs.conf on the server, then moved them to the forwarder, to no avail.

These are the inputs.conf settings:

[monitor://c:\var\xml\*.xml]
disabled = 0
followTail = 0
crcSalt = <SOURCE>
initCrcLength = 2048

I have tried initCrcLength values from 1024 to 10500, but nothing seems to take effect.
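
For anyone following along, here is a rough Python sketch of why files that share a long identical header can be flagged as duplicates, and how a longer CRC window or a per-file salt changes that. It is an illustration only: `zlib.crc32` is a stand-in and not Splunk's actual checksum, the function name `initial_crc` and the sample file contents and paths are made up, and the 256-byte default is the documented default for initCrcLength.

```python
import zlib


def initial_crc(data: bytes, init_crc_length: int = 256, salt: str = "") -> int:
    """Toy model of an initial-CRC check (NOT Splunk's real algorithm).

    Hashes an optional salt plus the first `init_crc_length` bytes of the file.
    """
    head = data[:init_crc_length]
    return zlib.crc32(salt.encode("utf-8") + head)


# Two hypothetical XML files that share a long boilerplate header
# and differ only near the end.
header = b'<?xml version="1.0" encoding="UTF-8"?>\n' + b"<!-- shared boilerplate -->" * 20
file_a = header + b"<item id='1'/>"
file_b = header + b"<item id='2'/>"

# With a 256-byte window, both files look identical to the check.
same_without_salt = initial_crc(file_a) == initial_crc(file_b)

# A window long enough to reach the differing bytes tells them apart.
differs_with_longer_window = (
    initial_crc(file_a, len(file_a)) != initial_crc(file_b, len(file_b))
)

# Salting with the (hypothetical) source path also tells them apart,
# which is the idea behind crcSalt = <SOURCE>.
differs_with_salt = (
    initial_crc(file_a, salt=r"c:\var\xml\a.xml")
    != initial_crc(file_b, salt=r"c:\var\xml\b.xml")
)

print(same_without_salt, differs_with_longer_window, differs_with_salt)
```

The takeaway is that either setting only helps if the stanza carrying it actually applies to the files in question.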

Each time I make a change I run the following commands:

Splunk universal forwarder:

splunk stop
splunk clean all
splunk start

Splunk Server:

splunk stop
splunk clean eventdata
splunk start

Any help would be greatly appreciated!

(Update)
I just noticed that, even with the error, the file is still indexed in some cases. My final results still show that not all files are indexed, but maybe the error above is a red herring!

lmacneil76
Explorer

Found the solution. Each subfolder needs its own stanza.

So for files such as C:\var\xml\new_production_items\2-10240750-Qti.xml, the stanza would look like this:

[monitor://c:\var\xml\new_production_items\*.xml]
disabled = 0
followTail = 0
crcSalt = <SOURCE>
initCrcLength = 2048

This configuration goes in the inputs.conf on the forwarder.
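
With many subfolders, writing one stanza per folder by hand gets tedious. Below is a minimal Python sketch that generates the stanzas, assuming the same settings as above apply to every subfolder. The helper name `build_stanzas` and the second folder name `archived_items` are hypothetical and used only for illustration.

```python
from pathlib import PureWindowsPath
from typing import Iterable


def build_stanzas(root: str, subfolders: Iterable[str]) -> str:
    """Return inputs.conf text with one [monitor://...] stanza per subfolder."""
    stanzas = []
    for name in subfolders:
        # Build a Windows-style path such as c:\var\xml\<name>\*.xml
        pattern = PureWindowsPath(root) / name / "*.xml"
        stanzas.append("\n".join([
            f"[monitor://{pattern}]",
            "disabled = 0",
            "followTail = 0",
            "crcSalt = <SOURCE>",
            "initCrcLength = 2048",
        ]))
    return "\n\n".join(stanzas) + "\n"


if __name__ == "__main__":
    # "archived_items" is a made-up folder name for the example.
    print(build_stanzas(r"c:\var\xml", ["new_production_items", "archived_items"]))
```

The printed text can be reviewed and pasted into the forwarder's inputs.conf. The script does not touch any Splunk files itself.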

briansutherland
Explorer

Thanks. On Windows, a separate entry is required for each directory, and with that in place the 'initCrcLength' stanza error goes away!