I am struggling to break multi-line events correctly when the source is defined as a monitor input. Occasionally, Splunk breaks events incorrectly. If I clean the event index and the _thefishbucket index, the event that was broken incorrectly the first time gets broken correctly during reindexing.
My event log files are XML-formatted.
Here is my props.conf:
LINE_BREAKER=([\r\n]+)
TIME_PREFIX=
TZ=UTC
MAX_TIMESTAMP_LOOKAHEAD=27
NO_BINARY_CHECK=1
SHOULD_LINEMERGE=true
MUST_BREAK_AFTER=
TRUNCATE=0
This is how one of the events was broken:
estamp>2012-07-13T13:23:30.118117Z
As you might see the event should have got broken at "
I would appreciate a quick response.
Using your data, I used the following sourcetype configuration in props.conf:
[your_sourcetype]
BREAK_ONLY_BEFORE = <auditrecord>
SHOULD_LINEMERGE = true
TIME_PREFIX = <extended_timestamp>
pulldown_type = 1
This gave me clean extractions for what you provided (after applying "| xmlkv").
How does this fare with your larger data set?
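To make the idea behind BREAK_ONLY_BEFORE concrete, here is a rough Python sketch of the line-merging logic: lines are accumulated into one event until a line matching the pattern shows up, which starts the next event. This is only an illustration, not how Splunk implements it; the `merge_lines` helper, the tag casing, and the second timestamp in the sample are assumptions made for the example.

```python
import re

def merge_lines(text, break_before):
    """Group lines into events, starting a new event at each matching line.

    Hypothetical helper that mimics the idea of BREAK_ONLY_BEFORE;
    it is not Splunk's actual implementation.
    """
    pattern = re.compile(break_before)
    events, current = [], []
    for line in text.splitlines():
        # A matching line closes the event collected so far.
        if pattern.search(line) and current:
            events.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        events.append("\n".join(current))
    return events

# Assumed sample data: two audit records, one tag per line.
sample = "\n".join([
    "<AuditRecord>",
    "<Extended_Timestamp>2012-07-13T13:23:30.118117Z</Extended_Timestamp>",
    "</AuditRecord>",
    "<AuditRecord>",
    "<Extended_Timestamp>2012-07-13T13:23:31.000000Z</Extended_Timestamp>",
    "</AuditRecord>",
])

events = merge_lines(sample, r"<AuditRecord>")
for event in events:
    print(event)
    print("-----")
```

Note that the closing tag does not trigger a break, because the pattern requires `<` to be followed directly by `AuditRecord`.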
EDIT: Also, looking at this:
LINE_BREAKER=([rn]+)<auditrecord>
Do you mean to look for end-of-line characters? As written, this will match one or more instances of the letter 'r' or 'n'.
LINE_BREAKER=([\r\n]+)<auditrecord>
I'm no regex expert, but I think that's right.
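A quick way to check the difference between the two character classes is to try them outside Splunk. The sketch below uses Python's `re` module, whose syntax agrees with PCRE for these simple patterns; the sample strings are made up for the test.

```python
import re

# Assumed sample: two records separated by a Windows-style line ending.
sample = "</AuditRecord>\r\n<AuditRecord>"

# Without backslashes, the class matches the literal letters 'r' and 'n'.
letters = re.compile(r"([rn]+)<AuditRecord>")
# With backslashes, the class matches carriage returns and line feeds.
newlines = re.compile(r"([\r\n]+)<AuditRecord>")

letters_match = letters.search(sample)
newlines_match = newlines.search(sample)

print(letters_match)                  # None: no 'r' or 'n' sits right before the tag
print(repr(newlines_match.group(1)))  # the captured line ending

# The letter class only matches when real letters precede the tag.
letters_on_text = letters.search("turn<AuditRecord>")
print(letters_on_text.group(1))
```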
Hi Dilip. A couple of things I can think of:
Hi Turk, I do have REPORT-oracleaudit_xml defined in my props.conf. Here is my complete props.conf file.
[oracleaudit_xml]
LINE_BREAKER=([\r\n]+)<\AuditRecord>
TIME_PREFIX=<\Extended_Timestamp>
TZ=UTC
MAX_TIMESTAMP_LOOKAHEAD=27
NO_BINARY_CHECK=1
SHOULD_LINEMERGE=true
MUST_BREAK_AFTER=<\/AuditRecord>
TRUNCATE=0
TRANSFORMS-disc_xml_header=disc_xml_header
SEDCMD-disc_xml_end_tag=s/<\/Audit>//g
KV_MODE=none
REPORT-oracleaudit_xml=oracleaudit_xml_extractions
That was just something I put in there for testing purposes to manually select the sourcetype.
Hi R. Turk, what does pulldown_type=1 do in props.conf?
Hmm... this gives me the event extractions & fields:
props.conf
[your_sourcetype]
LINE_BREAKER = ([\r\n]+)
MUST_BREAK_AFTER =
REPORT-field_extract = oracleaudit_xml_extractions
SHOULD_LINEMERGE = true
TIME_PREFIX =
pulldown_type = 1
transforms.conf
[oracleaudit_xml_extractions]
REGEX = <(\w+)>([^<]+?)
FORMAT = $1::$2
Hope this is some help 😛
Yes, my XML records have line break before
In short my event starts with
Secondly, I do not use xmlkv but use the following field extraction stanza in my transforms.conf.
[oracleaudit_xml_extractions]
REGEX=<(\w+)>([^<]+?)</\1>
FORMAT=$1::$2
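For anyone testing this extraction outside Splunk, the closing `</\1>` is what makes the lazy value group work: without it, `([^<]+?)` is satisfied after a single character. A small Python sketch, assuming a made-up record fragment (the field names and values here are only for illustration):

```python
import re

# Assumed record fragment; field names and values are illustrative only.
sample = "<DB_User>SCOTT</DB_User><Action>100</Action>"

# The backreference \1 forces the match to run up to the matching close tag.
with_close = re.compile(r"<(\w+)>([^<]+?)</\1>")
# Without the close tag, the lazy group stops after one character.
without_close = re.compile(r"<(\w+)>([^<]+?)")

pairs_with = with_close.findall(sample)
pairs_without = without_close.findall(sample)

print(pairs_with)
print(pairs_without)
```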
This happens occasionally, especially for the active log file. For example, say I generate a new log file by logging in to my database, running some SQL statements, and then logging off. My DB generates a new XML log file for each session. If I go to Splunk and query the events for this newly generated log file, I see one of the events broken incorrectly. Now, if I stop Splunk, delete the event index along with the _thefishbucket index, and restart Splunk, the event that was broken incorrectly the first time is broken correctly.