How much do you care about NTP?

muebel
SplunkTrust

I am considering how we should implement NTP in a new environment. Time synchronization seems to be really important for event correlation, and I am wondering whether anybody else out there has put a lot of thought into this.

What does your NTP implementation look like? Have you ever been burned by not having an adequate NTP implementation?

jtrucks
Splunk Employee

If you can, get a rackmount time source. I've used GPS-based devices before, but you need to be able to run an antenna cable up to the roof. There are others with their own built-in atomic clock as well. I've personally used Galleon products, but there are several reliable, low-maintenance brands out there. Check with your hardware vendors to see what options they have. If you have a modest-size operation, you can likely get into a hardware solution for a few thousand or less. Obviously, if you have a large-scale environment, especially a distributed one, you may have to investigate multiple time source devices, one per location.

Whatever your top-tier time source is, make sure you designate several secondary sources within your environment. It isn't difficult to allocate one per networking area, floor of a building, or other logical demarcation that works for your operation. Then have all other systems in that region use the secondary source as their time source, with each secondary source pointing to the closest stratum 1 (primary) time source and also to every other secondary source as backup. This gives you the best chance of keeping time synchronized across your whole environment, or many environments, no matter how many systems are blocked from talking to one another by planned or unplanned catastrophic events.
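
As a rough illustration of the layout described above, here is a minimal Python sketch that renders ntp.conf lines for one secondary server. It assumes the classic ntpd configuration syntax (the `server` and `peer` directives and the `iburst` option); the function name and all hostnames are hypothetical placeholders, not anything from a real deployment.

```python
def secondary_ntp_conf(primaries, peers):
    """Render ntp.conf lines for one secondary (stratum 2) time server.

    primaries: hostnames of the closest stratum 1 sources (hypothetical).
    peers: hostnames of the other secondary servers (hypothetical).
    """
    lines = ["# Upstream stratum 1 sources"]
    lines += [f"server {host} iburst" for host in primaries]
    lines.append("# Other secondary servers, used as backup")
    lines += [f"peer {host}" for host in peers]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical hostnames, for illustration only.
    print(secondary_ntp_conf(
        ["gps1.example.internal"],
        ["ntp-floor2.example.internal", "ntp-floor3.example.internal"],
    ))
```

Ordinary clients in each region would then carry only `server` lines pointing at their local secondary.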

I first learned to implement the above process for time synching using NTP because we spent several weeks diagnosing a complex race condition in a multi-system application spanning several datacenters in two different cities only to find it wasn't a race condition after all. It was a time synchronization issue that caused errors in processing requests between three systems.

In security, time synching is crucial for reconstructing the chain of events, and even the packets themselves, during any event or incident. Your automated data analysis systems will also rely heavily on multiple systems having synchronized time in order to correctly rebuild and analyze events coming across the wire.
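
To make the ordering problem concrete, here is a small Python sketch with made-up data: two hosts log related events, one host's clock runs 2 seconds slow, and sorting by the raw timestamps puts the events in the wrong order. The hostnames, messages, and the `order_events` helper are all hypothetical; in practice you would not know the skew after the fact, which is the argument for fixing it with NTP up front.

```python
from datetime import datetime, timedelta


def order_events(events, skew_by_host=None):
    """Return messages sorted by timestamp, optionally correcting known skew.

    events: list of (host, local_timestamp, message) tuples.
    skew_by_host: host -> (local clock minus true time); missing hosts use 0.
    """
    skew_by_host = skew_by_host or {}

    def true_time(event):
        host, stamp, _ = event
        return stamp - skew_by_host.get(host, timedelta(0))

    return [message for _, _, message in sorted(events, key=true_time)]


base = datetime(2024, 1, 1, 12, 0, 0)
events = [
    # The firewall's clock is accurate: it logs at true time T+0.0 s.
    ("fw01", base, "connection allowed"),
    # The web server logs at true time T+0.5 s, but its clock is 2 s slow,
    # so the recorded timestamp is T-1.5 s.
    ("web01", base - timedelta(seconds=1.5), "login failed"),
]

if __name__ == "__main__":
    print(order_events(events))  # raw timestamps: effect appears before cause
    print(order_events(events, {"web01": timedelta(seconds=-2)}))  # corrected
```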

--
Jesse Trucks
Minister of Magic

lguinn2
Legend

Correct time is extremely important for correlating events, for audits, for forensics, for diagnostics. Thumbs up for thinking about this, although I don't think that you need more than just solid time server(s) for the environment, accurate to the millisecond if possible.

Note that Windows servers often do not time sync predictably. Make sure virtual machine hosts are all time syncing properly, as virtual machines often rely on their host for time (or set the VMs to rely directly on an NTP server).

Also on the subject of time, I personally love an environment where all the machines run on UTC and log in UTC. Splunk can, of course, translate from any OS time to UTC, etc. But as a Splunk admin, if I need to configure the TZ setting for each server, that's just one more thing I have to manage.
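
For applications you control, logging in UTC can be done at the application level too. As one possible sketch, assuming a Python application that uses the standard `logging` module, the formatter below renders timestamps in UTC regardless of the host's time zone; the `utc_formatter` helper name and the exact timestamp layout are illustrative choices, not a Splunk requirement.

```python
import logging
import time


def utc_formatter():
    """Build a log formatter whose timestamps are rendered in UTC."""
    formatter = logging.Formatter(
        fmt="%(asctime)s.%(msecs)03dZ %(levelname)s %(message)s",
        datefmt="%Y-%m-%dT%H:%M:%S",
    )
    # Use gmtime instead of the default localtime so asctime is in UTC.
    formatter.converter = time.gmtime
    return formatter


if __name__ == "__main__":
    handler = logging.StreamHandler()
    handler.setFormatter(utc_formatter())
    log = logging.getLogger("demo")  # hypothetical logger name
    log.addHandler(handler)
    log.setLevel(logging.INFO)
    log.info("service started")
```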
