Splunk Search

How to extract date:time format from raw data using REGEX?

jsuryaprakash
Path Finder

Below is part of my sample data .. I want to extract date and time from the data.

00.111.222.1 va10n40596.abcdefgt.com - - 443 [02/Jan/2018:18:25:41 -0500]

I want new filed called start_date as 02/Jan/2018:18:25:41 and delete semi-column between date and time.

need some thing like this start_date=02/Jan/2018 18:25:41 from above raw data.

Thanks.

0 Karma

micahkemp
Champion

@acharlieh makes a very good point. Considering that point, to configure this to be the timestamp used for the event at indextime, your props.conf may look like:

[<sourcetype>]
TIME_PREFIX = \[
TIME_FORMAT = %d/%b/%Y:%H:%M:%S %z

The time format above includes the GMT offset ( %z ), so if your results at search time appear to be off by exactly 5 hours that will explain why. I suggest leaving this in place, if possible, and setting your timezone in your user account settings to display events in your local timezone. Of course, you can't do this if you have other events indexed in the local timezone already.

acharlieh
Influencer

If this is the only date in your event, you should be configuring your sourcetype to properly parse this timestamp as _time. This comes with benefits of a lot of niceties in terms of becoming the thing Splunk searches by, built in commands like timechart and others.

The docs go a bit into parsing time values: http://docs.splunk.com/Documentation/Splunk/7.0.1/Data/Configuretimestamprecognition

With a value that is properly parsed (to epoch time), formatting it is just a function of the strftime eval function, or the convert search command. (In parsing and formatting you'll want to familiarize yourself with the Time format variables if you're not already familiar with them).

If this is not the only date in your event, and there is a better one that fills the role of event time, so you need to extract it as a string... first extract the entire field between the brackets (as that timezone offset is important). Then you can use either strptime or convert to turn that string into an Epoch time value. Once that's done then you can again use strftime or convert to format the timestamp how you wish.

(I mention the eval functions in addition to the commands, because the eval functions can be your friend in setting up Calculated Fields on your sourcetype).

micahkemp
Champion

Assuming your date string is the only thing that will ever be between square brackets, this seems like it should work for you:

| rex "\[(?<start_date>[^ ]+)" 
| rex field=start_date mode=sed "s/([^:]+):(.+)/\1 \2/"

To see this in action with a run anywhere example:

| makeresults 
| eval _raw="00.111.222.1 va10n40596.abcdefgt.com - - 443 [02/Jan/2018:18:25:41 -0500]" 
| rex "\[(?<start_date>[^ ]+)" 
| rex field=start_date mode=sed "s/([^:]+):(.+)/\1 \2/"
Get Updates on the Splunk Community!

Announcing Scheduled Export GA for Dashboard Studio

We're excited to announce the general availability of Scheduled Export for Dashboard Studio. Starting in ...

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics GA in US-AWS!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...