The logs I'm trying to index are in a log4j style, and entries such as
2010-06-15 09:04:08,204 [[ACTIVE] ExecuteThread: '9' for queue: 'weblogic.kernel
.Default (self-tuning)'][intOrdId=17746,intVrsn=18846] DEBUG com.att.canopi.idis
.ordermanagement.sms.cramer.adapter.util.ReflectionsTools.ipagi1a1.snt.bst.bls.c
om - Set value1:Customer100 field name1:customerName
are properly split into unique entries, however some entries are multi-line, and have embedded XML, and these are split wherever a date (with a different format than the date at the start of an entry) is found, so the log entry
2010-06-15 09:04:07,686 [[ACTIVE] ExecuteThread: '9' for queue: 'weblogic.kernel
.Default (self-tuning)'][intOrdId=17746,intVrsn=18846] INFO fooHandler - <env:Envelope xmlns:env="
http://schemas.xmlsoap.org/soap/envelope/">
<env:Header/>
<env:Body>
<v1:getFooDetailRequest xmlns:v1="ns1" correlationId="coid" systemId="mybox" clientId="cid" mock="false" requestTime="2010-06-15T09:04:07.643-04:00">
<v1:fooId>foo<v1:fooId>
</v1:getFooDetailRequest>
</env:Body>
</env:Envelope>
gets split on the "getFooDetailRequest" line into two entries.
I've tried writing my own sourcetype in a props.conf with
MORE_THAN_100 = ^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} \[.*?
and setting the files to that sourcetype manually, but I get the same result.
Does anyone know how to modify the built-in log4j sourcetype (since it is so close to being perfect), or have any other suggestions?
... View more