Getting Data In

How to force Splunk to use the epoch time in the log file as index time

jgcsco
Path Finder

I have the following logs from a customer device:

0080101c40ba,10.10.1.2,1481421584,host1.labtest.com,error-message1,sev1
0080101c4114,10.33.1.3,1481421595,host2.labtest.com,error-message2,sev2

props.conf:

[csv]
FIELD_DELIMITER = ,
FIELD_NAMES = transactionId, hostIp, time, fqdn, MsgType, Severity
TIME_PREFIX = ^(?:[^,]*,){2}
MAX_TIMESTAMP_LOOKAHEAD = 10
TIME_FORMAT = %s

SHOULD_LINEMERGE = False
pulldown_type = 1
REPORT-getfields = testlog_fields

transforms.conf:

[testlog_fields]
DELIMS=","
FIELDS = "transactionId", "hostIp", "time", "fqdn", "MsgType", "Severity"

The log files I received have incorrect timestamps on them, meaning the file times do not reflect when the logs were generated. After ingesting the logs, I noticed Splunk is using the ingest time as the index time (as shown in _time). Is there any way to force Splunk to use the epoch time inside the logs as the index time, so that I can search for "last 7 days" or "last month" events?

Thanks

1 Solution

beatus
Communicator

jgcsco,
It looks like the "csv" sourcetype uses "INDEXED_EXTRACTIONS" by default. This causes much of the props work to happen at the very first Splunk instance the data hits, including Universal Forwarders. If that's the case, you'd want to put your props there and use a few other config settings. Namely:

[csv]
FIELD_DELIMITER = ,
FIELD_NAMES = transactionId, hostIp, time, fqdn, MsgType, Severity
MAX_TIMESTAMP_LOOKAHEAD = 10
TIME_FORMAT = %s
TIMESTAMP_FIELDS = time
INDEXED_EXTRACTIONS = csv

Note the "TIMESTAMP_FIELDS" and "INDEXED_EXTRACTIONS" settings. I'm setting "INDEXED_EXTRACTIONS" explicitly to be verbose and avoid confusion in the future. Additionally, you do not need a TIME_PREFIX, because you're naming the specific field Splunk should read the timestamp from.
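To make the difference between the two approaches concrete, here is a minimal Python sketch (not Splunk code, just an illustration). It pulls the epoch value out of one of the sample lines twice: once the way the original TIME_PREFIX regex plus a 10-character lookahead would, and once by field name, which is roughly what TIMESTAMP_FIELDS = time asks for. The function names are made up for this example.

```python
import re
from datetime import datetime, timezone

# Sample line and field names copied from the question.
SAMPLE = "0080101c40ba,10.10.1.2,1481421584,host1.labtest.com,error-message1,sev1"
FIELD_NAMES = ["transactionId", "hostIp", "time", "fqdn", "MsgType", "Severity"]


def epoch_via_prefix(line, lookahead=10):
    """Skip the first two comma-separated fields, then read `lookahead` chars."""
    match = re.match(r"^(?:[^,]*,){2}", line)
    if match is None:
        raise ValueError("line has fewer than three comma-separated fields")
    start = match.end()
    return int(line[start:start + lookahead])


def epoch_via_field(line, field="time"):
    """Split the line into named fields and read the timestamp by field name."""
    record = dict(zip(FIELD_NAMES, line.split(",")))
    return int(record[field])


def to_utc(epoch):
    """Interpret an epoch value (seconds) as a UTC datetime."""
    return datetime.fromtimestamp(epoch, tz=timezone.utc)


if __name__ == "__main__":
    print(epoch_via_prefix(SAMPLE), epoch_via_field(SAMPLE))
    print(to_utc(epoch_via_field(SAMPLE)).isoformat())
```

Both routes land on the same value for this data. The field-based one simply does not depend on the column position or on counting characters.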

You can check with btool what's being set on your sourcetype:

splunk btool props list csv --debug

This will show all the props settings for that stanza and where they're set.
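If the --debug output is long, a small script can group the settings by the file that sets them, which makes it easy to see whether your local props.conf is contributing anything. This is only a sketch: parse_btool_debug is a hypothetical helper, the sample lines are copied from the output posted later in this thread, and it assumes each line is "file path, whitespace, setting" with no spaces in the path.

```python
from collections import defaultdict

# A few lines in the shape that "splunk btool props list csv --debug" prints.
SAMPLE_OUTPUT = """\
/opt/splunk/etc/apps/search/local/props.conf [csv]
/opt/splunk/etc/system/default/props.conf INDEXED_EXTRACTIONS = csv
/opt/splunk/etc/system/default/props.conf MAX_TIMESTAMP_LOOKAHEAD = 128
/opt/splunk/etc/system/default/props.conf BREAK_ONLY_BEFORE =
"""


def parse_btool_debug(text):
    """Return {file_path: {setting: value}} for every 'key = value' line."""
    by_file = defaultdict(dict)
    for raw in text.splitlines():
        parts = raw.strip().split(None, 1)  # path, then the rest of the line
        if len(parts) != 2:
            continue
        path, rest = parts
        if rest.startswith("[") or "=" not in rest:
            continue  # stanza header or something that is not a setting
        key, _, value = rest.partition("=")
        by_file[path][key.strip()] = value.strip()
    return dict(by_file)


if __name__ == "__main__":
    for path, settings in parse_btool_debug(SAMPLE_OUTPUT).items():
        print(path)
        for key, value in sorted(settings.items()):
            print(f"    {key} = {value}")
```

A file that only shows up on the stanza header line, and never next to a setting, is not supplying any of the effective values.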

jgcsco
Path Finder

I added the following two lines to props.conf:

TIMESTAMP_FIELDS = time
INDEXED_EXTRACTIONS = csv

And here is the output:
/opt/splunk/etc/apps/search/local# /opt/splunk/bin/splunk btool props list csv --debug
/opt/splunk/etc/apps/search/local/props.conf [csv]
/opt/splunk/etc/system/default/props.conf ANNOTATE_PUNCT = True
/opt/splunk/etc/system/default/props.conf AUTO_KV_JSON = true
/opt/splunk/etc/system/default/props.conf BREAK_ONLY_BEFORE =
/opt/splunk/etc/system/default/props.conf BREAK_ONLY_BEFORE_DATE = True
/opt/splunk/etc/system/default/props.conf CHARSET = UTF-8
/opt/splunk/etc/system/default/props.conf DATETIME_CONFIG = /etc/datetime.xml
/opt/splunk/etc/system/default/props.conf HEADER_MODE =
/opt/splunk/etc/system/default/props.conf INDEXED_EXTRACTIONS = csv
/opt/splunk/etc/system/default/props.conf KV_MODE = none
/opt/splunk/etc/system/default/props.conf LEARN_MODEL = true
/opt/splunk/etc/system/default/props.conf LEARN_SOURCETYPE = true
/opt/splunk/etc/system/default/props.conf LINE_BREAKER_LOOKBEHIND = 100
/opt/splunk/etc/system/default/props.conf MAX_DAYS_AGO = 2000
/opt/splunk/etc/system/default/props.conf MAX_DAYS_HENCE = 2
/opt/splunk/etc/system/default/props.conf MAX_DIFF_SECS_AGO = 3600
/opt/splunk/etc/system/default/props.conf MAX_DIFF_SECS_HENCE = 604800
/opt/splunk/etc/system/default/props.conf MAX_EVENTS = 256
/opt/splunk/etc/system/default/props.conf MAX_TIMESTAMP_LOOKAHEAD = 128
/opt/splunk/etc/system/default/props.conf MUST_BREAK_AFTER =
/opt/splunk/etc/system/default/props.conf MUST_NOT_BREAK_AFTER =
/opt/splunk/etc/system/default/props.conf MUST_NOT_BREAK_BEFORE =
/opt/splunk/etc/system/default/props.conf SEGMENTATION = indexing
/opt/splunk/etc/system/default/props.conf SEGMENTATION-all = full
/opt/splunk/etc/system/default/props.conf SEGMENTATION-inner = inner
/opt/splunk/etc/system/default/props.conf SEGMENTATION-outer = outer
/opt/splunk/etc/system/default/props.conf SEGMENTATION-raw = none
/opt/splunk/etc/system/default/props.conf SEGMENTATION-standard = standard
/opt/splunk/etc/system/default/props.conf SHOULD_LINEMERGE = False
/opt/splunk/etc/system/default/props.conf TRANSFORMS =
/opt/splunk/etc/system/default/props.conf TRUNCATE = 10000
/opt/splunk/etc/system/default/props.conf category = Structured
/opt/splunk/etc/system/default/props.conf description = Comma-separated value format. Set header and other settings in "Delimited Settings"
/opt/splunk/etc/system/default/props.conf detect_trailing_nulls = false
/opt/splunk/etc/system/default/props.conf maxDist = 100
/opt/splunk/etc/system/default/props.conf priority =
/opt/splunk/etc/system/default/props.conf pulldown_type = true
/opt/splunk/etc/system/default/props.conf sourcetype =

After restarting Splunk, I am still not seeing _time change to match the epoch time in the logs. Am I missing anything else?

Thanks

beatus
Communicator

Was this change made on the UF sending the data or on the indexer? Indexed extractions have to be configured where the data is ingested, so if it's a UF then the props have to be there.

jgcsco
Path Finder

This is a single-node environment. The log files are in a directory on the Splunk node.

beatus
Communicator

Is the time off by a fixed number of hours? That is, does it look like the timezone is wrong but the minutes and seconds are correct?

jgcsco
Path Finder

The time that shows up in Splunk is the time the log was ingested. For example, over 100 log files were ingested yesterday, and they contain event information logged over the last two weeks. I am looking for a way to have Splunk use the epoch time associated with each event inside the log files as "_time".

beatus
Communicator

Timestamp extraction is an index-time operation, so modifying these settings won't change what's already in Splunk. Have you tested on new data after modifying the settings?
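One way to test on new data is to generate a small fresh file in the same six-field layout, with epoch values spread over the past few days, and ingest that under the csv sourcetype. The sketch below is only an illustration: make_test_lines, the transaction IDs, IPs, and host names are all made up, and only the column order comes from the sample logs above.

```python
import time


def make_test_lines(now, count=5, step_seconds=86400):
    """Build `count` CSV lines whose epoch field steps back one day per line."""
    lines = []
    for i in range(count):
        epoch = int(now) - i * step_seconds
        lines.append(
            f"{i:012x},10.10.1.{i + 1},{epoch},"
            f"host{i + 1}.labtest.com,error-message{i + 1},sev1"
        )
    return lines


if __name__ == "__main__":
    # Print the lines; redirect the output into a new file for Splunk to read.
    for line in make_test_lines(time.time()):
        print(line)
```

If the settings are taking effect, the events from this file should show up with _time spread across the past few days rather than all at the moment of ingestion.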

jgcsco
Path Finder

Thanks for pointing that out. It is working now! Thanks a lot for your help!
