Getting Data In

Why is my XML file in a monitored directory not being indexed?

adityaanand
Explorer

Hi,

I am monitoring a directory that contains some XML files.
Suppose a file 1.xml already exists in the directory. I then add another file, 2.xml, which contains almost the same data but differs in a few lines at the end of the file.
Both files are the same size.
The changes are within the last 256 bytes.
The first 256 bytes are identical.

As far as I understand:

The monitoring processor picks up new files and reads the first and last 256 bytes of the file. This data is hashed into a begin and end cyclic redundancy check (CRC).
The begin CRC is matched against a database that contains all the CRCs of files Splunk has seen before, but the end CRC does not match. This means that Splunk has previously read the file but that some of the material that it read has since changed. In this case, Splunk must re-read the whole file.  

But my file is not indexed.
Why is this happening?

Regards,
Aditya

tom_frotscher
Builder

Hi,

I don't know where you got this information, but as far as I know, Splunk only checks the first 256 bytes of a file, not the last.

See here: http://docs.splunk.com/Documentation/Splunk/6.2.3/Data/HowLogFileRotationIsHandled#How_Splunk_Enterp...

You should adjust crcSalt or initCrcLength for the corresponding input.
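
To illustrate why two files with identical beginnings can look like the same file, here is a minimal Python sketch. It is not Splunk's actual implementation: it assumes a plain CRC32 (from zlib) over the first 256 bytes, and the function name head_crc, its salt parameter, the sample file contents, and the paths are all hypothetical.

```python
import zlib

def head_crc(data: bytes, length: int = 256, salt: bytes = b"") -> int:
    """CRC32 over an optional salt plus the first `length` bytes.

    Simplified illustration only; Splunk's real check may differ.
    """
    return zlib.crc32(salt + data[:length])

# Two made-up files: same size, same first 256+ bytes, different endings.
shared_head = b"<?xml version='1.0'?>\n<records>\n" + b"x" * 300
file_1 = shared_head + b"<value>old</value></records>"
file_2 = shared_head + b"<value>new</value></records>"

# Default check: only the first 256 bytes are hashed, so the files collide.
same_default = head_crc(file_1) == head_crc(file_2)

# Hashing a longer prefix (here: the whole file) reaches the changed bytes.
same_longer = head_crc(file_1, length=len(file_1)) == head_crc(file_2, length=len(file_2))

# Mixing a per-file value such as the path into the hash separates them too.
same_salted = head_crc(file_1, salt=b"/data/1.xml") == head_crc(file_2, salt=b"/data/2.xml")

print(same_default, same_longer, same_salted)
```

Under these assumptions the default check treats 2.xml as already seen. A longer prefix only helps if it is long enough to reach the bytes that differ, while a per-file salt distinguishes the files regardless of where the change is.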

For more information, take a look at the inputs.conf reference in the Admin Manual:

http://docs.splunk.com/Documentation/Splunk/6.2.3/Admin/Inputsconf

Greetings

Tom


adityaanand
Explorer

What is the role of seekCRC?
Isn't it the hash value of the last 256 bytes of the file?
