Getting Data In

How to extract an event timestamp where seconds and milliseconds are concatenated without padded zeros?

diogofgm
SplunkTrust

I came across a weird log format where the seconds and milliseconds are concatenated without padded zeros.

Example data

2019,8,6,9,31,1,event data
2019,8,6,9,31,12,event data
2019,8,6,9,31,123,event data
2019,8,6,9,31,1234,event data
2019,8,6,9,31,12345,event data

Problem
From my testing, TIME_FORMAT doesn't work correctly in this case. It would if the number were zero-padded to five digits (e.g., 00012).
Formats I tested and the results:
%Y,%m,%d,%H,%M,%S%3N - works on the 5-digit value but not on the others, which show the wrong number of seconds
%Y,%m,%d,%H,%M,%S - same as before; in most cases it shows the wrong number of seconds
%Y,%m,%d,%H,%M,%5N - doesn't extract anything after the minutes
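
To make the intended interpretation explicit, here is a minimal Python sketch of how the last numeric field should be read: the final three digits are milliseconds and anything before them is seconds, which is the same as zero-padding to five digits first. The function name split_sec_msec is hypothetical and only used for illustration; this is not something Splunk runs.

```python
from typing import Tuple


def split_sec_msec(raw: str) -> Tuple[int, int]:
    """Split a concatenated seconds+milliseconds field into (seconds, milliseconds)."""
    # Left-pad with zeros to 5 digits, e.g. "12" -> "00012".
    padded = raw.zfill(5)
    # The last three digits are milliseconds, the rest are seconds.
    return int(padded[:-3]), int(padded[-3:])


if __name__ == "__main__":
    for raw in ["1", "12", "123", "1234", "12345"]:
        seconds, millis = split_sec_msec(raw)
        print(f"{raw:>5} -> {seconds} s {millis} ms")
```
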

How can I solve this without building a custom input or pre-processing the data before indexing it?

------------
Hope I was able to help you. If so, some karma would be appreciated.
1 Solution

diogofgm
SplunkTrust

Starting with Splunk 7.2, it's possible to run some eval operations at index time by using the INGEST_EVAL attribute in transforms.conf and applying the transform to the source type in question.
So, in this case, we can use the following configuration:

transforms.conf

[get_sec_msec]
REGEX = ^(?:\d+,){5}(?<sec_msec>\d+),
FORMAT = sec_msec::$1
WRITE_META = true

[eval_sec]
INGEST_EVAL = _time=round(_time+(sec_msec/1000),3)

props.conf

[your_sourcetype]
TRANSFORMS-evalingest = get_sec_msec, eval_sec

Explanation:
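The approach I used was to extract the number at index time and, using INGEST_EVAL, divide it by 1000 and add the result to _time. Note that this only gives the right value if _time contains no seconds before the eval runs, i.e. the timestamp is parsed only down to the minute (for example with TIME_FORMAT = %Y,%m,%d,%H,%M); that setting is an assumption and is not part of the configuration shown here.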

Example
2019,8,6,9,31,1234,event data
the correct extraction would be 1 sec and 234 msec
1234/1000 = 1.234
_time = _time + 1.234

I use round() to force the value to keep the .234. In my testing, if I didn't wrap the expression in round(..., 3), I ended up with only the seconds added and not the milliseconds.
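
For anyone who wants to sanity-check the arithmetic outside Splunk, the following is a rough Python sketch that mirrors the two transforms: a regex pulls out the sixth field, and the value divided by 1000 is added to a timestamp built from the first five fields. It assumes the events are in UTC and that the base time is truncated to the minute; the names FIELD_RE and event_time are hypothetical. Python needs the (?P<name>...) syntax for named groups, unlike the (?<name>...) form used in transforms.conf.

```python
import re
from datetime import datetime, timezone

# Same idea as the REGEX in [get_sec_msec]: skip five numeric fields, capture the sixth.
FIELD_RE = re.compile(r"^(?:\d+,){5}(?P<sec_msec>\d+),")


def event_time(line: str) -> float:
    """Return the event time as epoch seconds with millisecond precision."""
    match = FIELD_RE.match(line)
    if match is None:
        raise ValueError(f"unexpected line format: {line!r}")
    # Base timestamp parsed only down to the minute (assumption: events are in UTC).
    year, month, day, hour, minute = (int(part) for part in line.split(",")[:5])
    base = datetime(year, month, day, hour, minute, tzinfo=timezone.utc).timestamp()
    # Same arithmetic as the INGEST_EVAL: _time + sec_msec / 1000, rounded to 3 decimals.
    return round(base + int(match.group("sec_msec")) / 1000, 3)


if __name__ == "__main__":
    print(event_time("2019,8,6,9,31,1234,event data"))
```
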
