The timestamp of an application log file is always being set to midnight, an example line in the logs is:
02/05/17 14:47:21 IN:[(1)29RET_YSCO (2)(28)92 RINQ(28)1005183824(28)(28)10(28)990364911(28)457(28)7.50(28)Y(3)0CF9(4)]
This is given an event time of 02/05/2017 00:00:00.000
The sourcetype config for this log in props.conf is as follows:
[flat_file]
SHOULD_LINEMERGE = false
LINE_BREAKER = [\r\n]+
MAX_EVENTS = 1
TIME_FORMAT = %d/%m/%y %H:%M:%S
MAX_DAYS_AGO = 1
MAX_TIMESTAMP_LOOKAHEAD = 20
When I test the sourcetype config with the same log file using the Add Data / Set Source Type functioanlity in Splunk it assigns the time from the log file correctly!
I'm ingesting numerous other logs (of different format) from the same host without issue.
Can anyone see why the event time is always set to midnight for this log?
I strongly suspect that the whitespace between the day
and the time
is not as it seems. You are probably using a single space in your TIME_FORMAT
but maybe that is actually a tab or something else funky. Do a search for an event that is broken and then examine the timestartpos
and timeendpos
and this will give you insight into what the timeParser is thinking/doing.
I used an online Unicode editor UTF-8 for an example line from the logs,
02/05/17 08:02:03 OUT:[(1)82RET_SYCO (2)(28)13 NGLP(28)OK(28)124(28)13374739(28)C(28)MARCO(28)816697617(28)24(28)(3)(4)]
30 32 2F 30 35 2F 31 37 20 30 38 3A 30 32 3A 30 33 20 4F 55 54 3A 5B 28 31 29 38 32 52 45 54 5F 53 59 43 4F 20 20 20 20 20 20 28 32 29 28 32 38 29 31 33 20 4E 47 4C 50 28 32 38 29 4F 4B 28 32 38 29 31 32 34 28 32 38 29 31 33 33 37 34 37 33 39 28 32 38 29 43 28 32 38 29 4D 41 52 43 4F 28 32 38 29 38 31 36 36 39 37 36 31 37 28 32 38 29 32 34 28 32 38 29 28 33 29 28 34 29 5D
As you can see it appears to be a simple space between the date and time, I know think ome other extract format must be taking precedence as DalJeanis suggested above, what this could be is now the mystery.
The timestartpos is 0 and timeendpos is 8
So it is giving up (or being stopped, i.e. MAX_TIMESTAMP_LOOKAHEAD=8
) at the whitespace, but why? Try splunk cmd btool props list --debug
and maybe you can see that some settings are being defaulted some other way than you expect.
This is the entry for this sourcetype
ANNOTATE_PUNCT = True
AUTO_KV_JSON = true
BREAK_ONLY_BEFORE =
BREAK_ONLY_BEFORE_DATE = True
CHARSET = AUTO
CHECK_METHOD = modtime
DATETIME_CONFIG =
HEADER_MODE =
KV_MODE = none
LEARN_MODEL = false
LEARN_SOURCETYPE = false
LINE_BREAKER = [\r\n]+
LINE_BREAKER_LOOKBEHIND = 100
MATCH_LIMIT = 100000
MAX_DAYS_AGO = 1
MAX_DAYS_HENCE = 2
MAX_DIFF_SECS_AGO = 3600
MAX_DIFF_SECS_HENCE = 604800
MAX_EVENTS = 1
MAX_TIMESTAMP_LOOKAHEAD = 20
MUST_BREAK_AFTER =
MUST_NOT_BREAK_AFTER =
MUST_NOT_BREAK_BEFORE =
NO_BINARY_CHECK = true
SEGMENTATION = indexing
SEGMENTATION-all = whitespace-only
SEGMENTATION-inner = whitespace-only
SEGMENTATION-outer = whitespace-only
SEGMENTATION-raw = none
SEGMENTATION-standard = whitespace-only
SHOULD_LINEMERGE = false
TIME_FORMAT = %d/%m/%y %H:%M:%S
TIME_PREFIX = ^(?=\s*)
TRANSFORMS =
TRUNCATE = 1000000
TZ = Australia/Melbourne
category = Application
description = Micros Logs
detect_trailing_nulls = auto
disabled = false
maxDist = 100
priority =
pulldown_type = 1
Can you see any issues, it looked okay to me?
I currently have the props.conf entry set to:
TIME_PREFIX = ^(?=\s*)
MAX_TIMESTAMP_LOOKAHEAD = 20
TIME_FORMAT = %d/%m/%y %H:%M:%S
LINE_BREAKER = [\r\n]+
SHOULD_LINEMERGE = false
This looks exactly as it should; I've got nothing else. You should open a support case (be sure to upload a diag immediately).
Can you change MAX_TIMESTAMP_LOOKAHEAD = 20
to, say, MAX_TIMESTAMP_LOOKAHEAD = 40
? Just in case.
As a note, the few times I had such puzzling timestamp behavior I would usually just take out ALL of the "hints" on how to handle them and see what happens when Splunk does its own [not-recommended-for-production] magic.
I changed the MAX_TIMESTAMP_LOOKAHEAD to 40, no difference. I have also removed all timeformating hints from the props.conf for this sourcetype, again no change!
Okay, so let's try something really outrageous.
Change the timestamp format to...
TIME_FORMAT = %m/%d/%y
...and see if it properly extracts 6/5/17 to June 5 2017, or if it still gets May 6th. That will at least tell us whether this particular conf file is being heard at all.
If it does NOT properly swap the day/month to month/day, then we will know that there is some other extract format taking precedence, and we can start hunting down the culprit with our muskets and pitchforks. (I suspect that this has to be the case.)
Thanks for the suggestion, I changed the TIME_FORMAT to %m/%d/%y and this was NOT reflected in the date assigned. You are correct some other extract format must be taking precedence!
What happens if you don't specify the TIME_FORMAT?
I have tried this, no difference!
@somesoni2, @woodcock, @jkat54, any ideas on this one?
Hi romeo, any internal error regarding bad recognition timestamp or something similar ?
No internal errors for these logs!
Is this happening on a standalone server ?
Your timestamp is only 18 long including the space at the front (so it's < 20). There is exactly one space between data and time in the data and the format, so the basics are covered.
1) Try adonio's suggestion first. TIME_PREFIX = ^
If that doesn't fix the issue, then here's the next two shots in the increasingly pitch black dark...
2) Try setting MAX_DAYS_AGO to 2 just in case there's a UTC vs local time thing going on.
3) Any chance it's somehow not being handled as [flat_file]?
1) Added TIME_PREFIX = ^
2) Changed MAX_DAYS_AGO to 2
3) The sourcetype is being recorded correctly against the event.
Unfortunatley the time is still always assigned as midnight!
Example:
03/05/17 11:01:12 Msg: OUT:[(1)65RET_YSCO (2)(28)15 VSNG(28)OK(28)68975261(28)12689300(28)D(28)KALRA(28)(3)(4)]
Time
time - 2017-05-03T00:00:00.000+10:00
date_mday - 3
date_month - may
date_wday - wednesday
date_year - 2017
date_zone - 600
timeendpos - 8
timestartpos - 0
Default
host - gis-syco-01
index - crown
punct - //::::[()_____()()()()()()()()()()]
source - /sysC/logs/simphony/MICpst.debug
sourcetype - flat_file
splunk_server - MIT-SPLUNK-T1
Okay, one more shot in the dark. Suppose that innocent looking space is actually some other character? Try this one.
TIME_FORMAT = %d/%m/%y\s+%H:%M:%S
By the way, I assume you either made the conf changes through splunkweb and the CLI, or restarted splunkd to pick up the new conf file if you made the changes manually?
https://docs.splunk.com/Documentation/Splunk/6.5.3/Admin/Configurationfilechangesthatrequirerestart
The next shot in the next dark is all the way over the horizon and into the next ocean over... take your timestamp and verify exactly what characters in unicode are in the slashes and colons...
Hmm. Before I did that, I'd feed in a file with a 5/1 or prior date and see if it's getting 5/1 or 5/2 for that. I'm expecting it would have to get the current date, if it's not properly identifying the timestamp as a timestamp. If I'm wrong there, then we start beating the thickets for how it got the right date, and work from there.
just adding a thought that maybe (huge maybe but I have seen it happen before) this space is a tab
in that case:
TIME_FORMAT = %d/%m/%y\t%H:%M:%S