[test_header]
INDEXED_EXTRACTIONS = CSV
HEADER_FIELD_LINE_NUMBER = 1
KV_MODE = none
NO_BINARY_CHECK = 1
SHOULD_LINEMERGE = false
pulldown_type = 1
TRANSFORMS-NoHeader = test_header
The first file is indexed correctly, with only the data captured and the header ignored, but subsequent files are not indexed at all.
At the moment I'm not using the crcSalt setting; as mentioned, I don't want any possibility of logs being re-indexed.
My working configuration...
PROPS.CONF:
[host::testcsvwithheader]
CHECK_METHOD = entire_md5
HEADER_FIELD_LINE_NUMBER = 1
INDEXED_EXTRACTIONS = CSV
KV_MODE = none
NO_BINARY_CHECK = 1
SHOULD_LINEMERGE = false
pulldown_type = 1
REPORT-AutoHeader = skipheader
INPUTS.CONF:
[monitor:///...]
disabled = false
followTail = 0
host = testcsvwithheader
index = test
sourcetype = testcsvwithheader
initCrcLength = 654
I'll test out the suggested configuration.
I installed a new instance of Splunk 6.0.2 on my laptop, created a test app, and, using the same configurations, tried indexing the same set of files. It WORKED! My header is 433 characters. I'm a bit stumped, but this feels like a bug.
[monitor:...]
disabled = false
followTail = 0
host = testheader
index = testheader
sourcetype = testheader
Above is my inputs.conf. I'll check out the "CHECK_METHOD = entire_md5" option, and thanks for pointing out the correct stanza it works with.
I had a similar problem because the first 260 characters of each file were always the same due to long headers.
I solved this in the inputs.conf like this:
[monitor:///........./appdir/SD*.ERR_*.Z]
disabled = false
followTail = 0
sourcetype = my_sourcetype
initCrcLength = 330
crcSalt = <SOURCE>
In my case, we had thousands of files being written to the same "appdir", and several times the "ERR" files were skipped because they had the same headers.
Marco
I read about the crcSalt option and decided not to use it. Thanks.
Using "CHECK_METHOD" and "initCrcLength" is better than using "crcSalt". Be cautious about using
Beware that this option is valid only in a stanza like [source::filename].
Add "CHECK_METHOD = entire_md5" to your props.conf file and retry.
By default, Splunk checks the first and last 256 bytes of the file. When it finds matches, Splunk lists the file as already indexed and indexes only new data, or ignores the file if there is no new data.
http://docs.splunk.com/Documentation/Splunk/6.0.2/admin/Propsconf
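To make the point about the first 256 bytes concrete, here is a rough Python sketch of why files with a long identical header can look "already indexed". It is only an illustration under assumptions: initial_crc is a hypothetical helper, zlib.crc32 stands in for whatever checksum Splunk really computes, and the file contents and paths are made up.

```python
import zlib


def initial_crc(data: bytes, length: int = 256, salt: str = "") -> int:
    """Checksum of the first `length` bytes, optionally prefixed with a salt.

    Hypothetical stand-in for the monitor's "seen this file before?" check;
    this is not Splunk's actual implementation.
    """
    return zlib.crc32(salt.encode("utf-8") + data[:length])


# Two made-up CSV files that share a 433-byte header but carry different rows.
header = b"h" * 432 + b"\n"
file_a = header + b"1,alpha\n"
file_b = header + b"2,alpha\n"

# Only the shared header falls inside the first 256 bytes,
# so both files produce the same checksum.
same_with_default = initial_crc(file_a) == initial_crc(file_b)

# A longer window (654, as in the inputs.conf above) reaches the differing rows.
same_with_longer = initial_crc(file_a, length=654) == initial_crc(file_b, length=654)

# Salting with the (made-up) path also separates the files, but the same
# content under a new name then looks like a new file as well.
salted_differs = (
    initial_crc(file_a, salt="/tmp/a.csv") != initial_crc(file_a, salt="/tmp/b.csv")
)

print(same_with_default, same_with_longer, salted_differs)
```

The last comparison is the reason to be careful with crcSalt = <SOURCE>: if a file is renamed or rotated, the salted checksum changes and the content can be indexed again.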
What does your inputs.conf look like?