How can I extract the DATE from the middle of an event?

Ron_Naken
Splunk Employee

I have an ISA web log in the following format. Splunk doesn't correctly identify the timestamp in every event, even though the format doesn't change from event to event. Sometimes Splunk seems to use the time at which it indexed the event as the timestamp instead.

How can I tell Splunk to only extract the date/time from the 5th & 6th fields in this CSV?

Sample Data:
1.4.5.1, GOOBERS\MYUSER, Shockwave Flash, -, 4/23/2010, 18:54:00, -, ROCKY, -, 8.7.7.7, 8.7.7.7, 80, 531, 215, 192, http, -, POST, http://8.7.7.7/idle/eHqkwMkh1DNu-XXR/102752, -, Inet, 200, -, Allow Internet Access to Web Group, -, Internal, External, 0x780, Allowed
1.4.5.1, GOOBERS\YOURUSER, Shockwave Flash, -, 4/23/2010, 18:54:00, -, ROCKY, -, 8.7.7.7, 8.7.7.7, 80, 531, 215, 192, http, -, POST, http://8.7.7.7/idle/eJqkwMkh0DNeCn2f/102747, -

Props.conf:
[isa_web]
SHOULD_LINEMERGE = false
REPORT-isaw = isa-web

Transforms.conf:
[isa-web]
DELIMS = ","
FIELDS = "src_ip","username","agent","authenticated","date","time","service","server","referer","r-host","r-ip","r-port","tmp1","tmp2","tmp3","cs-protocol","tmp4","s-operation","cs-uri","tmp5","s-object-source","sc-status","s-cache-info","rule","filter-info","cs-network","sc-network","error-info","action"

Thanks!

1 Solution

dwaddle
SplunkTrust

Try something like this in props.conf, under your [isa_web] stanza:

TIME_FORMAT=%m/%d/%Y, %H:%M:%S
TIME_PREFIX=^([^,]*,){4}\s*

I have not tested this, but the TIME_PREFIX should tell Splunk to skip the first 4 comma-delimited fields - and the TIME_FORMAT should pick it up from CSV fields 5 and 6.
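
If you want to sanity-check the idea outside Splunk first, here is a rough Python sketch. It is only an approximation: Python's `re` and `strptime` are not Splunk's timestamp processor, and the variable names and the shortened sample event are my own. It skips four comma-terminated fields, then parses fields 5 and 6 with month/day/year directives.

```python
import re
from datetime import datetime

# One event from the question's sample data, shortened after field 8.
event = r"1.4.5.1, GOOBERS\MYUSER, Shockwave Flash, -, 4/23/2010, 18:54:00, -, ROCKY"

# Same idea as TIME_PREFIX: skip four comma-terminated fields, then any spaces.
time_prefix = re.compile(r"^([^,]*,){4}\s*")

# strptime-style format: %m is month and %d is day (%M would mean minutes).
time_format = "%m/%d/%Y, %H:%M:%S"

match = time_prefix.match(event)
rest = event[match.end():]

# Keep only the two fields that hold the date and the time.
stamp = ",".join(rest.split(",")[:2])
parsed = datetime.strptime(stamp, time_format)
print(parsed.isoformat())  # 2010-04-23T18:54:00
```

If the prefix regex and the format agree on a handful of real events here, they are at least worth trying in props.conf.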

Simeon
Splunk Employee

There are many ways to tune the timestamp extraction within Splunk. For your particular data, you should create the appropriate regex to correctly extract the timestamp. Details on how to set this can be found here:

http://www.splunk.com/base/Documentation/latest/Admin/TrainSplunktorecognizeatimestamp

For your scenario, you might be able to set the TIME_PREFIX and MAX_TIMESTAMP_LOOKAHEAD parameters:

MAX_TIMESTAMP_LOOKAHEAD = <integer>
* Specifies how far (in characters) into an event Splunk should look for a timestamp.
* Defaults to 150.

TIME_PREFIX = <regular expression>
* Specifies the necessary condition for timestamp extraction.
* The timestamping algorithm only looks for a timestamp after the first regex match.
* Defaults to empty.

Without seeing more of your data, it will be hard to suggest the exact REGEX, but I would imagine you could do something that searches for the 4th comma.
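
One rough way to pick a MAX_TIMESTAMP_LOOKAHEAD value is to measure how many characters the timestamp occupies in a few sample events. The sketch below is plain Python, not Splunk. The regexes and the `stamp_span` helper are hypothetical names of my own, and it assumes (as I read the props.conf documentation) that the lookahead is counted from the end of the TIME_PREFIX match when TIME_PREFIX is set.

```python
import re

# The two sample events from the question, shortened after field 8.
events = [
    r"1.4.5.1, GOOBERS\MYUSER, Shockwave Flash, -, 4/23/2010, 18:54:00, -, ROCKY",
    r"1.4.5.1, GOOBERS\YOURUSER, Shockwave Flash, -, 4/23/2010, 18:54:00, -, ROCKY",
]

# Candidate prefix: four comma-terminated fields plus any trailing spaces.
prefix = re.compile(r"^(?:[^,]*,){4}\s*")
# Shape of the date and time fields as they appear in the sample data.
stamp = re.compile(r"\d{1,2}/\d{1,2}/\d{4},\s*\d{2}:\d{2}:\d{2}")


def stamp_span(event):
    """Return (offset where the prefix ends, length of the timestamp text)."""
    p = prefix.match(event)
    s = stamp.match(event, p.end())
    return p.end(), s.end() - p.end()


spans = [stamp_span(e) for e in events]
longest = max(length for _, length in spans)
print(longest)  # 19
```

For these samples the timestamp text is 19 characters long, so a lookahead a little above that (leaving room for two-digit months) would be a reasonable starting value to test.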
