Hi,
I am unable to extract a valid _time from the following log:
0168 004 07:59:03 09:01:35 0062 asdfghj ee bonfanyti Y P1233443P 443386 0012 07:59:17 dial_in 1 1234 N N 34567654555 000523456778 0000 09/20/10 0 1624443 01
0344 003 07:58:33 09:01:36 0063 Ssdfas Fd asdfffftim Y P5243343P 455483 0032 07:58:48 dial_in 1 7950 N N 000234234218 0000 09/20/10 0 1624443 01
0433 007 08:00:14 09:01:36 0061 ewrwreerer asdfsdfff N P5243443P 451333 0061 08:00:30 dial_in 19 7952 N N 58916588270 000522349181 0000 09/20/10 0 5673443 01
The timestamps I would like to extract are:
1) 09/20/10 07:59:03
2) 09/20/10 07:58:33
3) 09/20/10 08:00:14
From reading the documentation, I have figured out that I can only extract it using a custom datetime.xml.
I have tried to construct a datetime.xml:
<datetime>
<define name="ccm_1_date" extract="month,day,year,">
<text><![CDATA[\s+\d+\s(\d+)/(\d+)/(\d+)]]></text>
</define>
<define name="ccm_1_time" extract="second,minute,hour,">
<text><![CDATA[\s\d+:\d+:\d+\s]]></text>
</define>
<timePatterns>
<use name="ccm_1_time"/>
</timePatterns>
<datePatterns>
<use name="ccm_1_date"/>
</datePatterns>
</datetime>
The date pattern is probably good, but the time pattern is suspicious.
props.conf:
[host::ccm]
SHOULD_LINEMERGE = false
DATETIME_CONFIG = /etc/apps/search/local/datetime.xml
MAX_TIMESTAMP_LOOKAHEAD = 300
Any help would be appreciated.
Try the following datetime.xml:
<datetime>
<define name="ccm_1_date" extract="month,day,year">
<text><![CDATA[0000\s(\d{2})/(\d{2})/(\d{2})]]></text>
</define>
<define name="ccm_1_time" extract="hour,minute,second">
<text><![CDATA[\*\*\*\s(\d{2}):(\d{2}):(\d{2})]]></text>
</define>
<define name="ccm_2_time" extract="hour,minute,second">
<text><![CDATA[\d{3}\s(\d{2}):(\d{2}):(\d{2})]]></text>
</define>
<timePatterns>
<use name="ccm_1_time"/>
<use name="ccm_2_time"/>
</timePatterns>
<datePatterns>
<use name="ccm_1_date"/>
</datePatterns>
</datetime>
Also, I don't recommend updating the default datetime.xml. During an upgrade your configuration will be overwritten. Name it something like datetime2.xml and specify this in your props.conf with DATETIME_CONFIG, e.g.:
[extracttime]
SHOULD_LINEMERGE = false
DATETIME_CONFIG = /etc/garfield.xml
MAX_TIMESTAMP_LOOKAHEAD = 1000
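If you want to sanity-check the patterns before restarting Splunk, a rough sketch in Python is below. It assumes whitespace (`\s`) between the three-digit field and the time in ccm_2_time, and it only tests the regexes against the sample events from the question. It does not reproduce Splunk's timestamp processor, and `extract_timestamp` is a made-up helper name.

```python
import re

# Sample events copied from the question.
EVENTS = [
    "0168 004 07:59:03 09:01:35 0062 asdfghj ee bonfanyti Y P1233443P 443386 0012 07:59:17 dial_in 1 1234 N N 34567654555 000523456778 0000 09/20/10 0 1624443 01",
    "0344 003 07:58:33 09:01:36 0063 Ssdfas Fd asdfffftim Y P5243343P 455483 0032 07:58:48 dial_in 1 7950 N N 000234234218 0000 09/20/10 0 1624443 01",
    "0433 007 08:00:14 09:01:36 0061 ewrwreerer asdfsdfff N P5243443P 451333 0061 08:00:30 dial_in 19 7952 N N 58916588270 000522349181 0000 09/20/10 0 5673443 01",
]

# Same patterns as ccm_1_date and ccm_2_time (assumption: \s after \d{3}).
DATE_RE = re.compile(r"0000\s(\d{2})/(\d{2})/(\d{2})")
TIME_RE = re.compile(r"\d{3}\s(\d{2}):(\d{2}):(\d{2})")


def extract_timestamp(event):
    """Return 'MM/DD/YY HH:MM:SS' built from the first match of each pattern."""
    date = DATE_RE.search(event)
    time = TIME_RE.search(event)
    if date is None or time is None:
        return None
    month, day, year = date.groups()      # extract="month,day,year"
    hour, minute, second = time.groups()  # extract="hour,minute,second"
    return f"{month}/{day}/{year} {hour}:{minute}:{second}"


for event in EVENTS:
    print(extract_timestamp(event))
```

For the three sample events this gives the timestamps listed in the question.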
Your regex in ccm_1_time does not capture any groups.
Try adding parentheses to capture each value, and make sure that the timestamp regex only matches the first set of colon-delimited digits...
<datetime>
<define name="ccm_1_date" extract="month,day,year,">
<text><![CDATA[\s+\d+\s(\d+)/(\d+)/(\d+)]]></text>
</define>
<define name="ccm_1_time" extract="hour,minute,second,">
<text><![CDATA[^(?:\d+\s)+(\d+):(\d+):(\d+)\s]]></text>
</define>
<timePatterns>
<use name="ccm_1_time"/>
</timePatterns>
<datePatterns>
<use name="ccm_1_date"/>
</datePatterns>
</datetime>
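A quick way to see the difference between the two time patterns is to run them in Python against the first sample event. This is only a sketch of the regex behaviour (Python's `re` module, not Splunk itself); the variable names are made up for the example.

```python
import re

# First sample event from the question.
EVENT = (
    "0168 004 07:59:03 09:01:35 0062 asdfghj ee bonfanyti Y P1233443P "
    "443386 0012 07:59:17 dial_in 1 1234 N N 34567654555 000523456778 "
    "0000 09/20/10 0 1624443 01"
)

# Time pattern from the question: matches, but has no capture groups.
ORIGINAL_TIME = re.compile(r"\s\d+:\d+:\d+\s")
# Time pattern from this answer: anchored at line start, three capture groups.
ANCHORED_TIME = re.compile(r"^(?:\d+\s)+(\d+):(\d+):(\d+)\s")

# The event contains several colon-delimited values, hence the anchoring.
candidates = re.findall(r"\d+:\d+:\d+", EVENT)
print(candidates)

print(ORIGINAL_TIME.groups)  # number of capture groups in the original pattern
print(ANCHORED_TIME.groups)  # number of capture groups in the anchored pattern

match = ANCHORED_TIME.search(EVENT)
hour, minute, second = match.groups()
print(hour, minute, second)
```

The original pattern reports zero capture groups, so there is nothing for hour, minute and second to be filled from, while the anchored one picks up only the first time on the line.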
The order is still wrong; it would need to be "month,day,year,". The regex looks like it should match, but in your sample data the second part is 20, which isn't a valid month.
Found a typo; the correct time line is:
Now the time part is correctly recognised, but the date part is still not working as it should. What could be the problem with:
Modified accordingly; _time is again 9/23/10 9:50:47.000 PM.
Didn't notice the field order; modified above to correct.
Also, the extract attribute must be in the order of the capture groups. Use hour,minute,second instead of second,minute,hour.
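To illustrate why the order matters, here is a small Python sketch that pairs the names in the extract attribute with the capture groups by position. It is only a model of positional pairing, not Splunk's actual implementation, and `label_groups` is a hypothetical helper.

```python
import re

TIME_RE = re.compile(r"(\d+):(\d+):(\d+)")


def label_groups(extract, match):
    """Pair each name in the extract list with the capture group at the same position."""
    names = [name.strip() for name in extract.split(",") if name.strip()]
    return dict(zip(names, match.groups()))


match = TIME_RE.search("0168 004 07:59:03 09:01:35")

wrong = label_groups("second,minute,hour,", match)
right = label_groups("hour,minute,second,", match)

print(wrong)  # hour and second are swapped
print(right)  # names line up with the groups
```

With "second,minute,hour," the value 07 ends up labelled as the second and 03 as the hour, which is not what the log means.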
Because if Splunk fails to get a date or time from the data, it next tries the file/source name, and then the modification time of the file. http://www.splunk.com/base/Documentation/latest/Admin/HowSplunkextractstimestamps#Precedence_rules_f...
9/23/10 9:50:47.000 PM is the time of last modification of the log file. Why is it used instead of the intended fields?
No, it does not. Every event has the same _time field:
9/23/10 9:50:47.000 PM