We have a custom application log file that looks something like the sample below. The file is not indexed when the first 4 lines are present.
These logs are generated for a number of similar programs; the "XXXX..." entries vary from program to program.
=============================================================================================================================================================================================
Application XXXXX XXXXX - YYYY BC and XX XXXXXX XXXXXX XXXXXX XX (XXX XX)
Application ---------------------------------------------------------------
Application XXXXX XXXXX - YYYY BC and XX XXXXXX XXXXXX XXXXXX XX (XXX XX)
Application ---------------------------------------------------------------
Application prg="XXXXX XXXXX- Receipt Creation Program" phase=Running status=Normal start_date="09-APR-2013 17:01:02" end_date="N/A" requestid=37541696 curr_time="09-APR-2013 17:15:13"
Application prg="XXXXX XXXXX- Receipt Creation Program" phase=Running status=Normal start_date="09-APR-2013 17:01:03" end_date="N/A" requestid=37541697 curr_time="09-APR-2013 17:15:13"
==============================================================================================================================================================================================
The same file (below) with those 4 lines removed is indexed fine.
==============================================================================================================================================================================================
Application prg="XXXXX XXXXXX - Receipt Creation Program" phase=Running status=Normal start_date="09-APR-2013 17:01:02" end_date="N/A" requestid=37541696 curr_time="09-APR-2013 17:15:13"
Application prg="XXXXX XXXXXX- Receipt Creation Program" phase=Running status=Normal start_date="09-APR-2013 17:01:03" end_date="N/A" requestid=37541697 curr_time="09-APR-2013 17:15:13"
===============================================================================================================================================================================================
Is it possible to ignore or omit these 4 lines so that the file gets indexed? Removing them on the application side is not an option.
Thanks
If I read your question correctly, your file is not indexed at all.
Am I right in assuming that the first 256 bytes of the file are identical to the start of other files?
You might see lines in splunkd.log that look like this: "File will not be read, is too small to match seekptr checksum". I use the following search to find them:
index=_internal source=*splunkd.log "File will not be read, is too small to match seekptr checksum" component="TailingProcessor" | dedup host file | table _time host file | sort host
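Try looking at initCrcLength in inputs.conf, which controls how many bytes at the start of a file are used to decide whether the file has been seen before. This option was introduced in 5.0.1.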
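As a rough sketch of that suggestion, a monitor stanza in inputs.conf might look like the one below. The monitor path and the value 1024 are placeholders made up for illustration; the value only needs to be large enough to reach past the shared header lines into content that differs between files (the default is 256 bytes).

```ini
# inputs.conf (sketch; the path and sourcetype name are placeholders)
[monitor:///path/to/your/application/logs]
sourcetype = your_sourcetype
# Use more than the default 256 bytes for the initial CRC, so that files
# starting with the same header lines are not mistaken for the same file.
initCrcLength = 1024
```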
Oh, reading your answer, I think you have a better understanding of what the problem may be.
You should probably read the following for guidance on how to skip indexing of some events.
In props.conf:
[your_sourcetype]
TRANSFORMS-set= setnull,setparsing
In transforms.conf:
[setnull]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue
[setparsing]
REGEX = some_string
DEST_KEY = queue
FORMAT = indexQueue
You'll have to replace 'some_string' with something that distinguishes the lines you want to keep; in your example, "phase" or "curr_time" occur only in the events you want to keep.
When these transforms are called from props.conf, the order is important: first ALL events are set to be thrown away (setnull), then the second transform (setparsing) resets the destination of the matching events from the nullQueue back to the indexQueue.
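Applied to the sample log in the question, the keep rule could be sketched like this. It assumes that each line is broken into its own event and that every line worth keeping contains prg=", which holds for the sample but is only a guess about the real files.

```ini
# transforms.conf (sketch for the sample log; adjust the pattern to your data)
[setparsing]
# Keep only events that contain a prg="..." field. The header and dashed
# lines do not match, so they stay routed to the nullQueue by setnull.
REGEX = prg="
DEST_KEY = queue
FORMAT = indexQueue
```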
Hope this helps,
Kristian