How to apply an arbitrary offset to the timestamp at index time?

martin_mueller
SplunkTrust

I have an input that writes timestamps as the number of milliseconds elapsed since January 1st 1601. Sadly, the format cannot be changed to either a human-readable or a Unix timestamp.

For example, 12995561169293 corresponds to October 24th 2012, 14:06:09. Splunk interprets this as a Unix timestamp, reading the first ten digits as seconds and the last four as fractional seconds: 1299556116.929(3), which corresponds to March 8th 2011, 04:48:36.929.
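The arithmetic behind that example can be checked with a short Python sketch. It assumes the source counts milliseconds from 1601-01-01 00:00:00 UTC; the names `OFFSET_MS` and `to_unix_ms` are made up for illustration and are not part of Splunk.

```python
from datetime import datetime, timedelta, timezone

EPOCH_1601 = datetime(1601, 1, 1, tzinfo=timezone.utc)
EPOCH_1970 = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Constant to subtract: milliseconds between the two epochs (assumes UTC).
OFFSET_MS = (EPOCH_1970 - EPOCH_1601) // timedelta(milliseconds=1)


def to_unix_ms(ms_since_1601: int) -> int:
    """Convert milliseconds since 1601-01-01 to milliseconds since 1970-01-01."""
    return ms_since_1601 - OFFSET_MS


unix_ms = to_unix_ms(12995561169293)
print(OFFSET_MS)  # 11644473600000
print(unix_ms)    # 1351087569293
# Integer arithmetic on timedelta avoids float rounding in the sub-second part.
print((EPOCH_1970 + timedelta(milliseconds=unix_ms)).isoformat())
```

The last line prints 2012-10-24T14:06:09.293000+00:00, which matches the October 24th 2012, 14:06:09 reading above.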

I can convert "my" timestamp into a Unix timestamp by subtracting a constant with an external preprocessing application before loading a file into Splunk. However, I'd prefer it if I could teach Splunk to understand it directly.

The usual sed/regex transformations at index time cannot do maths to subtract the offset. Is there any other way to do the conversion within Splunk?

1 Solution

yannK
Splunk Employee

A regex will not be able to do subtractions for you.
It seems that the only method is to use a scripted input that will parse the events before indexing.
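As a rough illustration of the parsing step such a script would need, here is a minimal Python sketch of the line-rewriting core. It assumes every event starts with a 14-digit millisecond count since 1601-01-01 UTC, which the thread does not confirm, and the names `LEADING_TS` and `rewrite_line` are hypothetical. It reads stdin and writes stdout. A real scripted input would have to open and follow the log file itself, so log rotation and resuming after restarts are deliberately left out.

```python
import re
import sys

# Milliseconds between 1601-01-01 and 1970-01-01 (assumes both are UTC).
OFFSET_MS = 11_644_473_600_000

# Assumption: the timestamp is exactly 14 digits at the start of the line.
LEADING_TS = re.compile(r"^(\d{14})\b")


def rewrite_line(line: str) -> str:
    """Replace a leading 1601-based millisecond count with Unix seconds.millis."""
    match = LEADING_TS.match(line)
    if match is None:
        return line  # no leading timestamp: pass the line through unchanged
    unix_ms = int(match.group(1)) - OFFSET_MS
    if unix_ms < 0:
        return line  # before 1970: leave as-is rather than emit a negative time
    seconds, millis = divmod(unix_ms, 1000)
    return f"{seconds}.{millis:03d}{line[match.end():]}"


def main() -> None:
    for line in sys.stdin:
        sys.stdout.write(rewrite_line(line))


if __name__ == "__main__":
    main()
```

With this sketch, a line such as `12995561169293 user login` would come out as `1351087569.293 user login`, which Splunk can read as an epoch timestamp with milliseconds.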

woodcock
Esteemed Legend

You can set TZ=+NumberOfHoursToAddHere:NumberOfMinutesToAddHere in props.conf.
You can also look at a solution using Cribl:
https://www.cribl.io/

martin_mueller
SplunkTrust

Do you have a working example using TZ?

martin_mueller
SplunkTrust

Just under six years later, 7.2 promises a fix \o/

http://docs.splunk.com/Documentation/Splunk/7.2.0/Admin/transformsconf

INGEST_EVAL = <comma-separated list of evaluator expressions>
* NOTE: This setting is only valid for index-time field extractions.
* Optional. When you set INGEST_EVAL, this setting overrides all of the other 
  index-time settings (such as REGEX, DEST_KEY, etc) and declares the 
  index-time extraction to be evaluator-based.
* The expression takes a similar format to the search-time "|eval" command.
  For example "a=b+c*d" Just like the search-time operator, you can
  string multiple expressions together, separated by commas like
  "len=length(_raw), length_category=floor(log(len,2))".
* Keys which are commonly used with DEST_KEY or SOURCE_KEY (like
  "_raw", "queue", etc) can be used directly in the expression.
  Also available are values which would be populated by default when
  this event is searched ("source", "sourcetype", "host", "splunk_server",
  "linecount", "index"). Search-time calculated fields (the "EVAL-" settings
  in props.conf) are NOT available.
* When INGEST_EVAL accesses the "_time" variable, subsecond information is 
  included. This is unlike regular-expression-based index-time extractions, 
  where  "_time" values are limited to whole seconds.
...

yannK
Splunk Employee

A regex will not be able to do subtractions for you.
It seems that the only method is to use a scripted input that will parse the events before indexing.

martin_mueller
SplunkTrust

Using scripted inputs to do the conversion means I need to re-implement the handling of log rotations and correct tailing after restarts, right?

I was hoping to get around that with some kind of more-powerful-than-sed pre-processing at index time.
