Getting Data In

How to extract the timestamp from an HTML file?

tfitzgerald_col
Engager

Howdy. I'm trying to index an HTML file, and I can not, for the life of me, get the timestamp to extract when using the preview. Here's the event:

<abbr class="dt" title="2013-05-27T04:24:58.979Z">May 27, 2013, 4:24:58 AM
GMT</abbr>:
<cite class="sender vcard"><a class="tel" href="tel:+*******"><span class="fn">+**********</span></a></cite>:
<q>Yeah, I'll be there</q></div> 

And here's what I'm using for settings.

TIME_FORMAT = %Y-%m-%dT%H:%M:%S
TIME_PREFIX = <abbr class="\w+" title="
MAX_TIMESTAMP_LOOKAHEAD = 19

It's just not finding the timestamp at all. Any idea why? I've tried a few other iterations, even going so far as to make the prefix <.*>, and setting the time format to match the second timestamp; still nothing. I'm getting pretty frustrated.

0 Karma

alacercogitatus
SplunkTrust
SplunkTrust

I would avoid using any kind of tag notation within TIME_PREFIX. Have you tried just as below?

TIME_PREFIX= title="
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%N%Z
0 Karma
Get Updates on the Splunk Community!

.conf24 | Registration Open!

Hello, hello! I come bearing good news: Registration for .conf24 is now open!   conf is Splunk’s rad annual ...

ICYMI - Check out the latest releases of Splunk Edge Processor

Splunk is pleased to announce the latest enhancements to Splunk Edge Processor.  HEC Receiver authorization ...

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...