Getting Data In

How to extract an event timestamp where seconds and milliseconds are concatenated without padded zeros?

diogofgm
SplunkTrust
SplunkTrust

I came across a weird log format where the seconds and milliseconds are concatenated without padded zeros.

Example data

2019,8,6,9,31,1,event data
2019,8,6,9,31,12,event data
2019,8,6,9,31,123,event data
2019,8,6,9,31,1234,event data
2019,8,6,9,31,12345,event data

Problem
From my testing TIME_FORMAT doesn't work correctly in this case. It would if this number had padded zeros (e.g 00012)
Formats I tested and the results
%Y,%m,%d,%H,%M,%S%3N - works on the 5 digit but not the others since they show the wrong amount of seconds
%Y,%m,%d,%H,%M,%S - same as before, in most cases it shows the wrong amount of seconds
%Y,%m,%d,%H,%M,%5N - doesn't extract anything after the minutes

How can I solve this without building a custom input or pre-processing the data before indexing it?

------------
Hope I was able to help you. If so, some karma would be appreciated.
1 Solution

diogofgm
SplunkTrust
SplunkTrust

Starting with Splunk 7.2 its possible to do some eval operations during index time using INGEST_EVAL attribute in transforms.conf and applying them to the source type in question.
So, in this case we can do the following configuration:

transforms.conf

[get_sec_msec]
REGEX = ^(?:\d+,){5}(?<sec_msec>\d+),
FORMAT = sec_msec::$1
WRITE_META = true

[eval_sec]
INGEST_EVAL = _time=round(_time+(sec_msec/1000),3)

props.conf

[your_sourcetype]
TRANSFORMS-evalingest = get_sec_msec, eval_sec

Explanation:
The approach I used was to extract the number in indextime and, using INGEST_EVAL, divide it by 1000 and adding it to _time.

Example
2019,8,6,9,31,1234,event data
the correct extraction would be 1 sec and 234 msec
1234/1000 = 1.234
_time = _time + 1.234

I use the round to force the value to add the .234. Testing I've done regarding this, if I didn't use the round(_time,3) I ended up only with the sec added and not the msec.

------------
Hope I was able to help you. If so, some karma would be appreciated.

View solution in original post

diogofgm
SplunkTrust
SplunkTrust

Starting with Splunk 7.2 its possible to do some eval operations during index time using INGEST_EVAL attribute in transforms.conf and applying them to the source type in question.
So, in this case we can do the following configuration:

transforms.conf

[get_sec_msec]
REGEX = ^(?:\d+,){5}(?<sec_msec>\d+),
FORMAT = sec_msec::$1
WRITE_META = true

[eval_sec]
INGEST_EVAL = _time=round(_time+(sec_msec/1000),3)

props.conf

[your_sourcetype]
TRANSFORMS-evalingest = get_sec_msec, eval_sec

Explanation:
The approach I used was to extract the number in indextime and, using INGEST_EVAL, divide it by 1000 and adding it to _time.

Example
2019,8,6,9,31,1234,event data
the correct extraction would be 1 sec and 234 msec
1234/1000 = 1.234
_time = _time + 1.234

I use the round to force the value to add the .234. Testing I've done regarding this, if I didn't use the round(_time,3) I ended up only with the sec added and not the msec.

------------
Hope I was able to help you. If so, some karma would be appreciated.
Get Updates on the Splunk Community!

Introducing Splunk Enterprise 9.2

WATCH HERE! Watch this Tech Talk to learn about the latest features and enhancements shipped in the new Splunk ...

Adoption of RUM and APM at Splunk

    Unleash the power of Splunk Observability   Watch Now In this can't miss Tech Talk! The Splunk Growth ...

Routing logs with Splunk OTel Collector for Kubernetes

The Splunk Distribution of the OpenTelemetry (OTel) Collector is a product that provides a way to ingest ...