
How can I extract the DATE from the middle of an event?

Ron_Naken
Splunk Employee

I have an ISA web log in the following format. Splunk doesn't correctly identify the timestamp in every event, even though the format doesn't change from event to event. Sometimes it seems to use the time at which the event was indexed as the timestamp instead.

How can I tell Splunk to only extract the date/time from the 5th & 6th fields in this CSV?

Sample Data:

1.4.5.1, GOOBERS\MYUSER, Shockwave Flash, -, 4/23/2010, 18:54:00, -, ROCKY, -, 8.7.7.7, 8.7.7.7, 80, 531, 215, 192, http, -, POST, http://8.7.7.7/idle/eHqkwMkh1DNu-XXR/102752, -, Inet, 200, -, Allow Internet Access to Web Group, -, Internal, External, 0x780, Allowed
1.4.5.1, GOOBERS\YOURUSER, Shockwave Flash, -, 4/23/2010, 18:54:00, -, ROCKY, -, 8.7.7.7, 8.7.7.7, 80, 531, 215, 192, http, -, POST, http://8.7.7.7/idle/eJqkwMkh0DNeCn2f/102747, -

Props.conf:

[isa_web]
SHOULD_LINEMERGE = false
REPORT-isaw = isa-web

Transforms.conf:

[isa-web]
DELIMS = ","
FIELDS = "src_ip","username","agent","authenticated","date","time","service","server","referer","r-host","r-ip","r-port","tmp1","tmp2","tmp3","cs-protocol","tmp4","s-operation","cs-uri","tmp5","s-object-source","sc-status","s-cache-info","rule","filter-info","cs-network","sc-network","error-info","action"
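For reference, here is a rough Python sketch of what a comma-delimited extraction like the one above produces for the first six fields. It only illustrates that the date and time land in fields 5 and 6; the function name `split_event` and the shortened event are made up for the example, and this is not how Splunk implements DELIMS/FIELDS internally. Note that a REPORT- extraction runs at search time, so it does not by itself influence which timestamp Splunk assigns at index time.

```python
# First six field names, copied from the FIELDS setting in the question.
FIELDS = ["src_ip", "username", "agent", "authenticated", "date", "time"]

# Shortened copy of the first sample event (raw string keeps the backslash).
EVENT = r"1.4.5.1, GOOBERS\MYUSER, Shockwave Flash, -, 4/23/2010, 18:54:00, -, ROCKY"


def split_event(line):
    """Split on commas, trim whitespace, and pair values with FIELDS."""
    values = [value.strip() for value in line.split(",")]
    # zip stops at the shorter list, so only the first six values are named.
    return dict(zip(FIELDS, values))


fields = split_event(EVENT)
print(fields["date"], fields["time"])
```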

Thanks!

1 Solution

dwaddle
SplunkTrust

Try something like this in props.conf:

TIME_FORMAT=%m/%d/%Y, %H:%M:%S
TIME_PREFIX=^([^,]*,\s*){4}

I have not tested this, but the TIME_PREFIX should tell Splunk to skip the first 4 comma-delimited fields - and the TIME_FORMAT should pick it up from CSV fields 5 and 6.
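Since the settings are untested, one way to sanity-check the idea outside Splunk is a small Python sketch. It assumes Python's `re` and `datetime.strptime` behave closely enough to Splunk's regex and strptime handling for this purpose, which is an approximation. The format string uses lowercase `%m` and `%d` (month and day), because uppercase `%M` means minutes in strptime, and both the regex and the format allow for the space that follows each comma in the sample data. The helper name `extract_timestamp` is hypothetical.

```python
import re
from datetime import datetime

# Skip the first four comma-delimited fields, plus the space after each comma.
TIME_PREFIX = re.compile(r"^([^,]*,\s*){4}")
# Month/day/year, a comma and space, then 24-hour time.
TIME_FORMAT = "%m/%d/%Y, %H:%M:%S"

# Shortened copy of the first sample event from the question.
EVENT = r"1.4.5.1, GOOBERS\MYUSER, Shockwave Flash, -, 4/23/2010, 18:54:00, -, ROCKY, -"


def extract_timestamp(event):
    """Return the datetime found in fields 5 and 6, or None if no prefix match."""
    match = TIME_PREFIX.match(event)
    if match is None:
        return None
    rest = event[match.end():]
    # The date has a variable width (4/23 vs 12/23), so take the next two fields.
    candidate = ",".join(rest.split(",")[:2])
    return datetime.strptime(candidate, TIME_FORMAT)


print(extract_timestamp(EVENT))
```

If this parses the sample correctly, the same prefix and format are a reasonable starting point for props.conf, though they should still be verified against real indexed events.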


Simeon
Splunk Employee

There are many ways to tune the timestamp extraction within Splunk. For your particular data, you should create the appropriate regex to correctly extract the timestamp. Details on how to set this can be found here:

http://www.splunk.com/base/Documentation/latest/Admin/TrainSplunktorecognizeatimestamp

For your scenario, you might be able to set the TIME_PREFIX and MAX_TIMESTAMP_LOOKAHEAD parameters:

MAX_TIMESTAMP_LOOKAHEAD = <integer>
* Specifies how far (in characters) into an event Splunk should look for a timestamp.
* Defaults to 150.

TIME_PREFIX = <regular expression>
* Specifies the necessary condition for timestamp extraction.
* The timestamping algorithm only looks for a timestamp after the first regex match.
* Defaults to empty.

Without seeing more of your data, it will be hard to suggest the exact REGEX, but I would imagine you could do something that searches for the 4th comma.
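To judge whether MAX_TIMESTAMP_LOOKAHEAD needs changing, it can help to measure where the timestamp actually sits in an event. The Python sketch below is only an aid for that measurement, not a model of Splunk's timestamp processor. The regexes and the function name `timestamp_span` are assumptions made for this example. It reports the offset where the timestamp starts (after the fourth comma) and where it ends, so the lookahead can be compared against either offset depending on how your Splunk version counts it.

```python
import re

# Shortened copy of the first sample event from the question.
EVENT = r"1.4.5.1, GOOBERS\MYUSER, Shockwave Flash, -, 4/23/2010, 18:54:00, -, ROCKY, -"

# Everything up to and including the fourth comma (and the space after it).
PREFIX = re.compile(r"^(?:[^,]*,\s*){4}")
# A date like 4/23/2010 followed by a time like 18:54:00 in the next field.
STAMP = re.compile(r"\d{1,2}/\d{1,2}/\d{4},\s*\d{2}:\d{2}:\d{2}")


def timestamp_span(event):
    """Return (start, end) character offsets of the timestamp, or None."""
    prefix = PREFIX.match(event)
    if prefix is None:
        return None
    stamp = STAMP.match(event, prefix.end())
    if stamp is None:
        return None
    return prefix.end(), stamp.end()


start, end = timestamp_span(EVENT)
print("timestamp text:", EVENT[start:end])
print("starts at offset", start, "and ends at offset", end)
```

For this sample the timestamp ends well within the 150-character default quoted above, so the lookahead is unlikely to be the cause here; long usernames or agent strings in other events could shift that.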

