Getting Data In

variance betweeen _time and date_* fields

kbrady
Explorer

I've got a situation where different date elements are providing inconsistent results for the same time data. I suspect this is a result of index time vs. search time processing and timezone differences between the data presented and data indexed. The data contains a date in UTC, which is converted to MDT by Splunk. The date_* data is incorrect when compared to the _time field, though is correct for the raw timestamp data.

If this activity were consistent amongst all system provided time fields, I could live with it pretty easily, but the fact that the data presented in the time field differs from the date* fields is problematic. Is this a bug or expected behavior?

Here's info on the search, data returned, props config, and splunk version

Search modifiers:
| eval wd=lower(strftime(time,"%A")) | table _time, date*, wd

Raw time data:
2013-08-16T05:10:05

Data presented:
_time==8/15/13 11:10:05.000 PM
wd==thursday
date_wday==friday
date_mday==16
date_hour==5

props.conf config:
[my_sourcetype]
KV_MODE = json
MAX_TIMESTAMP_LOOKAHEAD = 500
NO_BINARY_CHECK = 1
SHOULD_LINEMERGE = false
TRUNCATE = 100000
TIME_FORMAT = %Y-%m-%dT%H:%M:%S
TIME_PREFIX = ,\"start\":\"
TZ = UTC
pulldown_type = 1

Splunk version:
5.0.4, build 172409

1 Solution

lguinn2
Legend

This is expected behavior and is described in the manual here. The date* fields are a direct parsing of the text timestamp in the input stream. No timezone is applied to them - in any version of Splunk. These fields do not even exist for inputs that do not use text timestamps; for example, none of them will appear in an input that uses Linux epoch time.

_time is what I call the "normalized" time field. It is considered the true time that the event occurred and always takes timezone into consideration. All events have a _time field in Splunk, even if the input stream has no timestamp at all.

If you want to do accurate datetime arithmetic, IMO you should never use the date* fields. Instead, you can extract the necessary information from _time, as you did in your example search. For example

yoursearchhere
| eval weekDay = strftime(_time,"%a")
| eval HourOfDay = strftime(_time,"%H")
| table _time, weekDay, HourOfDay

You can also use the relative_time function to do some cool date arithmetic. For example, this calculates the beginning of the month for each event, and formats the times nicely:

yoursearchhere
| eval BeginningOfMonth = strftime(relative_time(_time,"@mon"),"%x %X")
| eval EventTime=strftime(_time,"%x %X")
| table EventTime, BeginningOfMonth

View solution in original post

halkelley
Path Finder

I think your macro needs to have:

        eval  date_wday = strftime(_time, "%A")

for date_wday (capital "A") to work correctly

0 Karma

lguinn2
Legend

Yes, "%a" gives the DOW abbreviation, not the full name of the DOW.

lguinn2
Legend

This is expected behavior and is described in the manual here. The date* fields are a direct parsing of the text timestamp in the input stream. No timezone is applied to them - in any version of Splunk. These fields do not even exist for inputs that do not use text timestamps; for example, none of them will appear in an input that uses Linux epoch time.

_time is what I call the "normalized" time field. It is considered the true time that the event occurred and always takes timezone into consideration. All events have a _time field in Splunk, even if the input stream has no timestamp at all.

If you want to do accurate datetime arithmetic, IMO you should never use the date* fields. Instead, you can extract the necessary information from _time, as you did in your example search. For example

yoursearchhere
| eval weekDay = strftime(_time,"%a")
| eval HourOfDay = strftime(_time,"%H")
| table _time, weekDay, HourOfDay

You can also use the relative_time function to do some cool date arithmetic. For example, this calculates the beginning of the month for each event, and formats the times nicely:

yoursearchhere
| eval BeginningOfMonth = strftime(relative_time(_time,"@mon"),"%x %X")
| eval EventTime=strftime(_time,"%x %X")
| table EventTime, BeginningOfMonth

lguinn2
Legend

You could create a macro to help your user base. I'd call the macro something like normalize_datetime and have it calculate/override the existing date* fields. This macro could also be used on data that has no date* fields, providing an added benefit.

eval date_hour = strftime(_time, "%H") | eval date_mday = strftime(_time, "%d") | eval date_minute = strftime(_time, "%M") | eval date_month = strftime(_time, "%m") | eval date_second = strftime(_time, "%S") | eval date_wday = strftime(_time, "%a") | eval date_year = strftime(_time, "%y") | eval date_zone = "UTC"

lguinn2
Legend

Update: you may find that these fields do exist for inputs that do not use text timestamps. So far, I have not figured out how to remove the date_* fields from automatic field extraction - and I have looked... I completely understand the comment "some of my user base will have a harder time"

0 Karma

kbrady
Explorer

Fair enough, at least it's documented. Not a big deal for me to stop using the date_* fields, but some of my user base will have a harder time.

Thank you for a solid explanation.

Get Updates on the Splunk Community!

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...

Introducing the 2024 Splunk MVPs!

We are excited to announce the 2024 cohort of the Splunk MVP program. Splunk MVPs are passionate members of ...

Splunk Custom Visualizations App End of Life

The Splunk Custom Visualizations apps End of Life for SimpleXML will reach end of support on Dec 21, 2024, ...