Getting Data In

Log file is not getting indexed

yogonline
Engager

We have a custom application log file that looks like the sample below. The file is not indexed when the first four lines are present.
These logs are generated for a number of similar programs, so the entries shown as "XXXXX..." will vary per program.

=============================================================================================================================================================================================

Application XXXXX XXXXX - YYYY BC and XX XXXXXX XXXXXX XXXXXX XX (XXX XX)

Application ---------------------------------------------------------------

Application XXXXX XXXXX - YYYY BC and XX XXXXXX XXXXXX XXXXXX XX (XXX XX)

Application ---------------------------------------------------------------

Application prg="XXXXX XXXXX- Receipt Creation Program" phase=Running status=Normal start_date="09-APR-2013 17:01:02" end_date="N/A" requestid=37541696 curr_time="09-APR-2013 17:15:13"

Application prg="XXXXX XXXXX- Receipt Creation Program" phase=Running status=Normal start_date="09-APR-2013 17:01:03" end_date="N/A" requestid=37541697 curr_time="09-APR-2013 17:15:13"

==============================================================================================================================================================================================

The same file (below) with those lines removed is indexed fine.

==============================================================================================================================================================================================

Application prg="XXXXX XXXXXX - Receipt Creation Program" phase=Running status=Normal start_date="09-APR-2013 17:01:02" end_date="N/A" requestid=37541696 curr_time="09-APR-2013 17:15:13"

Application prg="XXXXX XXXXXX- Receipt Creation Program" phase=Running status=Normal start_date="09-APR-2013 17:01:03" end_date="N/A" requestid=37541697 curr_time="09-APR-2013 17:15:13"

===============================================================================================================================================================================================

Is it possible to ignore or omit those four lines so that the file gets indexed? It is not possible to remove these lines on the application side.
Thanks


las
Contributor

If I read your question correctly, your file is not being indexed at all.
Am I right in assuming that the start of the file is identical to other files for the first 256 bytes?

You might see lines in splunkd.log that look like this: "File will not be read, is too small to match seekptr checksum". I use the following search to find them:

index=_internal source=*splunkd.log "File will not be read, is too small to match seekptr checksum" component="TailingProcessor" | dedup host file | table _time host file | sort host

Take a look at initCrcLength in inputs.conf; this option was introduced in Splunk 5.0.1.
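A hedged example of what that could look like (the monitor path and sourcetype are illustrative, not from your setup). Splunk identifies files by a CRC of the first 256 bytes by default; raising initCrcLength makes it hash more of the file, so files with identical headers are still seen as distinct:

[monitor:///var/log/myapp/receipts_*.log]
sourcetype = my_custom_app
initCrcLength = 1024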


kristian_kolb
Ultra Champion

Oh... reading your answer, I think you understood the problem better than I did.


kristian_kolb
Ultra Champion

You should probably read the following for guidance on how to skip indexing of some events.

http://docs.splunk.com/Documentation/Splunk/5.0.2/Deploy/Routeandfilterdatad#Keep_specific_events_an...

In props.conf:

[your_sourcetype]
TRANSFORMS-set= setnull,setparsing

In transforms.conf:

[setnull]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

[setparsing]
REGEX = some_string
DEST_KEY = queue
FORMAT = indexQueue

You'll have to replace 'some_string' with something that distinguishes the lines you want to keep; in your example, "phase" or "curr_time" occur only in the events you want to keep.

When these transforms are called from props.conf, the order is important: first ALL events are routed to the nullQueue (setnull) to be thrown away, then the second transform (setparsing) re-routes the matching events back to the indexQueue.
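If you want to sanity-check your REGEX before deploying, the routing logic above can be mimicked outside Splunk. This is a minimal sketch (not Splunk itself); "phase=" is an assumed discriminator taken from the sample events in the question:

```python
import re

# Assumption: "phase=" appears only in the key=value event lines
# you want to keep, never in the header lines.
KEEP = re.compile(r"phase=")

def route(line: str) -> str:
    queue = "nullQueue"           # setnull: REGEX = . matches every event
    if KEEP.search(line):
        queue = "indexQueue"      # setparsing: re-route matching events
    return queue

header = "Application XXXXX XXXXX - YYYY BC and XX XXXXXX XXXXXX"
event = ('Application prg="XXXXX XXXXX- Receipt Creation Program" '
         'phase=Running status=Normal requestid=37541696')

print(route(header))   # nullQueue  -> discarded
print(route(event))    # indexQueue -> kept
```

If both sample lines route where you expect, the same pattern should be safe to put in transforms.conf.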

Hope this helps,

Kristian
