I came across a weird log format where the seconds and milliseconds are concatenated without padded zeros.
Example data
2019,8,6,9,31,1,event data
2019,8,6,9,31,12,event data
2019,8,6,9,31,123,event data
2019,8,6,9,31,1234,event data
2019,8,6,9,31,12345,event data
Problem
From my testing, TIME_FORMAT doesn't work correctly in this case. It would if this number were zero-padded (e.g. 00012).
Formats I tested and the results
%Y,%m,%d,%H,%M,%S%3N - works for the 5-digit value but not the others, which show the wrong number of seconds
%Y,%m,%d,%H,%M,%S - same as above; in most cases it shows the wrong number of seconds
%Y,%m,%d,%H,%M,%5N - doesn't extract anything after the minutes
How can I solve this without building a custom input or pre-processing the data before indexing it?
Starting with Splunk 7.2, it's possible to do some eval operations at index time using the INGEST_EVAL attribute in transforms.conf and applying the transform to the source type in question.
So, in this case, we can use the following configuration:
transforms.conf
[get_sec_msec]
REGEX = ^(?:\d+,){5}(?<sec_msec>\d+),
FORMAT = sec_msec::$1
WRITE_META = true
[eval_sec]
INGEST_EVAL = _time=round(_time+(sec_msec/1000),3)
props.conf
[your_sourcetype]
TRANSFORMS-evalingest = get_sec_msec, eval_sec
Explanation:
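The approach I used was to extract the number at index time and, using INGEST_EVAL, divide it by 1000 and add the result to _time. This assumes _time itself was parsed only down to the minute, so the addition supplies the seconds and milliseconds.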
Example
2019,8,6,9,31,1234,event data
the correct extraction would be 1 sec and 234 msec
1234/1000 = 1.234
_time = _time + 1.234
I use round() to force the .234 to be kept. In my testing, if I didn't round the result to 3 decimal places, I ended up with only the seconds added and not the milliseconds.
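To sanity-check the arithmetic outside Splunk, here is a minimal Python sketch that applies the same idea to the sample lines from the question. It is only an illustration, not something Splunk runs: the function name sec_msec_offset is hypothetical, and the regex uses Python's (?P<name>...) named-group syntax in place of the (?<name>...) form used in transforms.conf.

```python
import re

# Same pattern as the REGEX in transforms.conf, rewritten with Python's
# (?P<name>...) syntax for the named group.
SEC_MSEC_RE = re.compile(r"^(?:\d+,){5}(?P<sec_msec>\d+),")


def sec_msec_offset(line: str) -> float:
    """Return the sixth field as seconds with a millisecond fraction.

    Hypothetical helper that mirrors round(sec_msec/1000, 3), i.e. the
    amount the INGEST_EVAL expression adds to _time.
    """
    match = SEC_MSEC_RE.match(line)
    if match is None:
        raise ValueError(f"unexpected log line: {line!r}")
    return round(int(match.group("sec_msec")) / 1000, 3)


samples = [
    "2019,8,6,9,31,1,event data",
    "2019,8,6,9,31,12,event data",
    "2019,8,6,9,31,123,event data",
    "2019,8,6,9,31,1234,event data",
    "2019,8,6,9,31,12345,event data",
]

for line in samples:
    # Show the offset in seconds that would be added to the timestamp.
    print(line, "->", sec_msec_offset(line))
```

For the five samples this gives offsets of 0.001, 0.012, 0.123, 1.234 and 12.345 seconds, so the unpadded value is always read as a count of milliseconds.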