Getting Data In

Why is Splunk assigning the same and wrong timestamp to thousands of indexed events?

ben_davies2
New Member

Splunk n00b here.

Our Splunk system recently started indexing events with the wrong timestamp. I made some alterations to props.conf, and now thousands of events are indexed with the same timestamp, even though the times within the actual logs (i.e. the timestamp in the log itself rather than the Splunk timestamp) are completely different.

Here's an extract from my props.conf:

[default]
CHARSET = UTF-8
LINE_BREAKER_LOOKBEHIND = 100
TRUNCATE = 10000
DATETIME_CONFIG = datetime.xml
ANNOTATE_PUNCT = True
HEADER_MODE =
MAX_DAYS_HENCE=2
MAX_DAYS_AGO=2000
MAX_DIFF_SECS_AGO=3600
MAX_DIFF_SECS_HENCE=604800
MAX_TIMESTAMP_LOOKAHEAD = 128
SHOULD_LINEMERGE = True
BREAK_ONLY_BEFORE =
BREAK_ONLY_BEFORE_DATE = True
MAX_EVENTS = 256
MUST_BREAK_AFTER =
MUST_NOT_BREAK_AFTER =
MUST_NOT_BREAK_BEFORE =
TRANSFORMS =

woodcock
Esteemed Legend

The problem is that Splunk cannot find a timestamp (because you are not correctly telling it how to do so), so it is defaulting to the last timestamp it has from this sourcetype. You will find error logs that say something like "cannot identify timestamp; defaulting to timestamp of previous event".

There is a scoping problem: you are using a [default] stanza instead of one specifically targeted to your events, so let's fix that first. I assume this means that you are editing $SPLUNK_HOME/etc/system/local/props.conf, which is an exceedingly poor decision for many reasons. Pick a name for your thing, like MyApp, and create a directory structure like this: $SPLUNK_HOME/etc/apps/MyApp/default. Inside this, create a props.conf file with the settings below, replacing myAppSourcetype with whatever you specified in sourcetype= inside your inputs.conf file (inputs.conf should also be moved into the same MyApp directory structure, but on the Forwarders instead of the Indexers):

[myAppSourcetype]
SHOULD_LINEMERGE=false
TIME_PREFIX=^
TIME_FORMAT = %Y-%m-%dT%H:%M:%S%z
MAX_TIMESTAMP_LOOKAHEAD=20

Deploy this to all of your Indexers and restart all of their Splunk instances.
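
Before deploying, it can help to sanity-check the format string against a real line from your logs. Below is a minimal Python sketch of that check. The sample line is hypothetical (the actual log format is not shown in this thread), and Python's strptime is only a stand-in for Splunk's own timestamp parser, so treat a pass here as a hint rather than a guarantee.

```python
from datetime import datetime, timedelta

# Same format string as TIME_FORMAT in the stanza above.
TIME_FORMAT = "%Y-%m-%dT%H:%M:%S%z"

# Hypothetical sample event; substitute a real line from your own logs.
sample_line = "2015-06-01T12:34:56+0100 INFO something happened"


def parse_leading_timestamp(line):
    """Parse the first whitespace-delimited token of the line as a timestamp."""
    token = line.split(" ", 1)[0]
    return token, datetime.strptime(token, TIME_FORMAT)


token, ts = parse_leading_timestamp(sample_line)
print("timestamp text:", token)
print("characters used:", len(token))
print("parsed value:", ts.isoformat())
```

In this made-up sample, the timestamp with a numeric offset is 24 characters long, so it is worth comparing the printed length against your MAX_TIMESTAMP_LOOKAHEAD value and raising that value if your real timestamps are longer than the window.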


emiller42
Motivator

The fact that this is in a default stanza is kind of scary. Whatever you set in default is going to apply to everything that doesn't have explicit overrides, and could break a lot of configs that assume defaults. Create a stanza for this sourcetype and place configurations there.

That said, looking at the logs, this should be a pretty straightforward config:

[your_sourcetype]
SHOULD_LINEMERGE=false
LINE_BREAKER=([\r\n]+)(?:\d{4}-\d{2}-\d{2})
TIME_PREFIX=^
MAX_TIMESTAMP_LOOKAHEAD=25
TIME_FORMAT=%FT%T%z

This tells Splunk exactly where to split events, tells it exactly where to find the timestamp, and tells it exactly how the timestamp is formatted. It can usually figure this stuff out itself, but explicitly defining it makes parsing more efficient, and should be done wherever possible.
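
To see what that LINE_BREAKER pattern does, here is a rough Python sketch that applies the same regex to a made-up chunk of raw text. It assumes the documented behavior that the text matched by the first capture group is discarded and whatever follows it starts the next event. The sample data and the split_events helper are hypothetical, and Python's re module is standing in for the PCRE engine Splunk uses.

```python
import re

# Same pattern as the LINE_BREAKER setting above.
LINE_BREAKER = re.compile(r"([\r\n]+)(?:\d{4}-\d{2}-\d{2})")


def split_events(raw):
    """Split raw text into events, dropping only the first capture group."""
    events = []
    start = 0
    for match in LINE_BREAKER.finditer(raw):
        # Everything before the newline run belongs to the current event.
        events.append(raw[start:match.start(1)])
        # The date that follows the newline run starts the next event.
        start = match.end(1)
    events.append(raw[start:])
    return events


# Hypothetical raw input: one multi-line event followed by two single-line events.
raw = (
    "2015-06-01T12:34:56+0100 ERROR request failed\n"
    "  at handler.py line 10\n"
    "2015-06-01T12:34:57+0100 INFO retrying\r\n"
    "2015-06-01T12:34:58+0100 INFO ok"
)

events = split_events(raw)
for number, event in enumerate(events, start=1):
    print(f"--- event {number} ---")
    print(event)
```

The indented continuation line stays attached to the first event because the newline before it is not followed by a date, which is why the pattern can break events correctly with SHOULD_LINEMERGE=false.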

somesoni2
SplunkTrust

Just add TIME_FORMAT to your props.conf:

TIME_FORMAT = %Y-%m-%dT%H:%M:%S