Hello,
I'm trying to build a working props/transforms configuration that separates standard events from JSON-formatted logs by resetting the JSON events to their own sourcetype. Here is what I've tried so far. It does most of what I want, except for timestamp recognition on the JSON events. The configuration below trims the header from the JSON events only and resets those events to a separate sourcetype. With the header trimmed, Splunk does a great job of auto-extracting my JSON field/value pairs. I'm looking for help getting the timestamp (_time) to match my JSON field "log_time".
props.conf:
[mainlog]
MAX_TIMESTAMP_LOOKAHEAD = 25
TIME_PREFIX = (?=[20])|log_time:
SEDCMD-remove-jsonheader = s/^[0-9T\:Z]*.*?\s*{/{/g
TRANSFORMS-set_sourcetype = example_json
[mainlog:json]
TIME_PREFIX = log_time:
#TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3NZ
MAX_TIMESTAMP_LOOKAHEAD = 3
INDEXED_EXTRACTIONS = json
transforms.conf:
[example_json]
REGEX = \{\"json\"\:
FORMAT = sourcetype::mainlog:json
DEST_KEY = MetaData:Sourcetype
Sample log:
2023-08-21 11:59:10 TRACE [pool-12-thread-1] c.a.l.m.e.AbstractElasticSearchBatch$ElasticSearchBatch [Slf4jLogging.scala:13] Deadline time left is 302ms and record count is 72
2023-08-21 11:11:41 TRACE [pool-11-thread-1] c.a.l.m.e.AbstractElasticSearchBatch$ElasticSearchBatch [Slf4jLogging.scala:13] Indexing {"json":"s3://example/logs/2023/08/21/0111111a-2222-33ff-9e4e-c1a01dfdf448.gz","phase":"ingest","log_time":"2023-08-21T15:11:31.073Z","tick":"7777777777","id":"0111111a-2222-33ff-9e4e-c1a01dfdf448","source_time":"2023-08-21T11:11:25Z","status":"submitted","client":"555555","environment":"test","category":"changestream","account":"9","level":7}
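For anyone who wants to reproduce the header trimming and sourcetype routing outside Splunk, here is a rough Python sketch of what I believe the SEDCMD and the transform REGEX do to the two sample lines. It is only an approximation: Python's re module stands in for Splunk's PCRE engine, the route() helper and the constant names are made up for illustration, and the JSON payload is shortened to a few of the fields from the sample.

```python
import re

# The two sample events; the JSON payload is shortened to a few fields.
STANDARD = (
    "2023-08-21 11:59:10 TRACE [pool-12-thread-1] "
    "c.a.l.m.e.AbstractElasticSearchBatch$ElasticSearchBatch "
    "[Slf4jLogging.scala:13] Deadline time left is 302ms and record count is 72"
)
JSON_LINE = (
    "2023-08-21 11:11:41 TRACE [pool-11-thread-1] "
    "c.a.l.m.e.AbstractElasticSearchBatch$ElasticSearchBatch "
    "[Slf4jLogging.scala:13] Indexing "
    '{"json":"s3://example/logs/2023/08/21/0111111a-2222-33ff-9e4e-c1a01dfdf448.gz",'
    '"phase":"ingest","log_time":"2023-08-21T15:11:31.073Z","level":7}'
)

# Same pattern as SEDCMD-remove-jsonheader (the brace is escaped for clarity).
HEADER_RE = re.compile(r'^[0-9T\:Z]*.*?\s*\{')
# Same pattern as REGEX in the [example_json] transform.
SOURCETYPE_RE = re.compile(r'\{\"json\"\:')


def route(raw):
    """Hypothetical helper: trim the header, then pick a sourcetype."""
    trimmed = HEADER_RE.sub('{', raw, count=1)
    sourcetype = "mainlog:json" if SOURCETYPE_RE.search(trimmed) else "mainlog"
    return sourcetype, trimmed


for line in (STANDARD, JSON_LINE):
    sourcetype, trimmed = route(line)
    print(sourcetype, "->", trimmed[:60])
```

The standard event contains no opening brace, so the substitution leaves it untouched and it stays in mainlog. The JSON event loses everything before the first brace and is routed to mainlog:json.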
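The regular expression in TIME_PREFIX must match the event data, and in your example it does not: the raw event has a closing quote between the field name and the colon ("log_time":"...), so the pattern log_time: never matches. Try these settings: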
[mainlog:json]
TIME_PREFIX = log_time":"
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3N%Z
MAX_TIMESTAMP_LOOKAHEAD = 24
INDEXED_EXTRACTIONS = json
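Also, the "Z" at the end of the timestamp is a time zone abbreviation, which should be represented as "%Z" in TIME_FORMAT. MAX_TIMESTAMP_LOOKAHEAD is counted from the end of the TIME_PREFIX match, so it has to cover the full 24-character timestamp; a value of 3 is too short.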
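If you want to check the prefix and lookahead values before restarting Splunk, a small Python sketch like the one below can help. It is an approximation under stated assumptions, not Splunk's actual timestamp processor: find_timestamp() is a hypothetical helper, the event is shortened to a few fields from your sample, and Python's %f and %z are used as stand-ins for Splunk's %3N and %Z (parsing a literal "Z" with %z needs Python 3.7 or later).

```python
import re
from datetime import datetime, timezone

# Header-trimmed JSON event, shortened to a few fields from the sample log.
EVENT = (
    '{"json":"s3://example/logs/2023/08/21/0111111a-2222-33ff-9e4e-c1a01dfdf448.gz",'
    '"phase":"ingest","log_time":"2023-08-21T15:11:31.073Z","tick":"7777777777"}'
)


def find_timestamp(event, time_prefix, lookahead):
    """Hypothetical helper: return the text a timestamp parser would see.

    Mimics TIME_PREFIX (regex that must match before the timestamp) and
    MAX_TIMESTAMP_LOOKAHEAD (characters read after the prefix match).
    """
    match = re.search(time_prefix, event)
    if match is None:
        return None  # prefix never matches, so nothing is handed to the parser
    return event[match.end():match.end() + lookahead]


# Original settings: the prefix does not match because of the quotes.
old = find_timestamp(EVENT, r'log_time:', 3)

# Suggested settings: the prefix matches and 24 characters cover the timestamp.
new = find_timestamp(EVENT, r'log_time":"', 24)

# %f and %z approximate Splunk's %3N and %Z here.
parsed = datetime.strptime(new, "%Y-%m-%dT%H:%M:%S.%f%z")

print(old)     # None
print(new)     # 2023-08-21T15:11:31.073Z
print(parsed)  # 2023-08-21 15:11:31.073000+00:00
```

With the original prefix the helper returns nothing at all, which is consistent with the timestamp never being picked up from log_time. With the suggested prefix, the 24 characters after the match are exactly the log_time value.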