I have an application that creates XML log files. Each entry spans multiple lines and is enclosed in <error> </error>
tags, but there are other tags within it.
From reading other questions, I believe I define a source type in inputs.conf on the machine that has these log files (a light forwarder). If that is true, then I believe I define the specifics of this source type in props.conf. Should that props.conf file be on the machine with the logs or on the indexing machine? Also, what should that props.conf file look like for a file like this? Lastly, how do I clear the already indexed log entries (they were indexed incorrectly, as one event per line) and get Splunk to reindex them properly?
Line breaking / event breaking issues:
Your XML file is not being parsed correctly: Splunk is creating single-line events where each event should span multiple lines. I assume the timestamp recognition is also wrong?
What you need to do is use props.conf / transforms.conf to force Splunk to break the events at the correct boundaries.
Since your forwarder is a light forwarder, it does not parse the data itself, so you need to put these props/transforms on the INDEXER side.
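As a rough sketch of what this could look like: the sourcetype name app_error_xml and the monitored path below are hypothetical placeholders, and the props.conf stanza assumes every entry begins with an <error> tag. Treat it as a starting point under those assumptions, not a tested configuration for your logs.

```ini
# inputs.conf on the forwarder
# (path and sourcetype name are placeholders, not taken from your setup)
[monitor:///path/to/your/app/logs/*.xml]
sourcetype = app_error_xml

# props.conf on the indexer
[app_error_xml]
# Merge lines into one event and start a new event only where <error> appears
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE = <error>
# If timestamps are also wrong, TIME_PREFIX and TIME_FORMAT can point Splunk
# at the right field. The values depend on your XML, so none are guessed here.
```

These settings are applied at index time, so they only affect data indexed after the change is in place and the indexer has been restarted. Events that are already indexed stay broken until they are removed and the files are indexed again.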
Lastly, to remove your already indexed data you can use the clean command (splunk clean eventdata) on the indexer, with Splunk stopped.
Hope this helps.
NOTE: use the clean command at your own risk, i.e., you will lose data if you do not still have the raw data available.