I've got data that looks like this:
YCTC3|YCTC3|A277537|20131013|225102|316739|E|001|TP0|THPNBAV05|10.124.130.71|||||||PAR|A|0000119501|00|||
Date is the fourth column, and time is the fifth. Any ideas on how to set TIME_PREFIX, TIME_FORMAT and MAX_TIMESTAMP_LOOKAHEAD to get this right?
My latest try is:
TIME_PREFIX = ^[^|]|[^|]|[^|]*|
TIME_FORMAT = %Y%m%d|%H%M%S
MAX_TIMESTAMP_LOOKAHEAD = 20
I think the only issue is your TIME_PREFIX. The regex you have only matches a single non-pipe character between each pipe. For what you have, you want:
TIME_PREFIX = ^[^\|]+\|[^\|]+\|[^\|]+\|
Then the rest should work as intended.
(Replace + with * if any of the preceding fields might be empty, e.g. |||20131013|...)
EDIT: Also need to escape the pipes, as sowings mentioned.
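If you want to sanity-check the settings outside Splunk first, here's a rough Python sketch. It's only an approximation: Python's re module and datetime.strptime are standing in for Splunk's regex engine and TIME_FORMAT parsing, and the variable names are my own. It applies the prefix to the sample event, takes the next 20 characters as the lookahead window, and parses the timestamp out of it.

```python
import re
from datetime import datetime

# Sample event from the question.
line = ("YCTC3|YCTC3|A277537|20131013|225102|316739|E|001|TP0|THPNBAV05|"
        "10.124.130.71|||||||PAR|A|0000119501|00|||")

# Proposed TIME_PREFIX: three non-empty fields, each followed by a literal pipe.
time_prefix = re.compile(r"^[^\|]+\|[^\|]+\|[^\|]+\|")

# Assumption: mimic MAX_TIMESTAMP_LOOKAHEAD by slicing this many characters
# starting right after the prefix match.
lookahead = 20

match = time_prefix.match(line)
window = line[match.end():match.end() + lookahead]

# The date and time fields plus the pipe between them are 15 characters long.
timestamp = datetime.strptime(window[:15], "%Y%m%d|%H%M%S")

print(window)     # 20131013|225102|3167
print(timestamp)  # 2013-10-13 22:51:02
```

The timestamp only needs 15 of those 20 characters, so the lookahead value from the question leaves a little room to spare.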
Good call, I missed the "only one char" thing.
| has special meaning in a regex (it means alternation), so you'll have to escape it with a \.
TIME_PREFIX = ^[^\|]\|{3}
There are three groups of "non-pipe characters followed by a pipe".
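I think the {3} only applies to the previous token (the escaped pipe), so you'd have to group the pattern before the quantifier for it to apply to the whole thing. The character class also needs a + so it matches a whole field rather than one character: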
^(?:[^\|]+\|){3}
This is the one that eventually worked. I didn't test the others too hard. This one looked elegant.
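For anyone curious why the grouping matters, here's a quick Python comparison of the ungrouped and grouped patterns against the sample event. Again, this is just a sketch using Python's re as a stand-in for Splunk's regex engine, and the variable names are made up for illustration.

```python
import re

# Sample event from the question.
line = ("YCTC3|YCTC3|A277537|20131013|225102|316739|E|001|TP0|THPNBAV05|"
        "10.124.130.71|||||||PAR|A|0000119501|00|||")

# Ungrouped: {3} binds only to the escaped pipe, so this wants exactly one
# non-pipe character followed by three pipes in a row.
ungrouped = re.compile(r"^[^\|]\|{3}")

# Grouped: the non-capturing group makes "field then pipe" the repeated unit.
grouped = re.compile(r"^(?:[^\|]+\|){3}")

print(ungrouped.match(line))            # None
print(ungrouped.match("a|||").group())  # a|||

prefix = grouped.match(line)
print(prefix.group())                   # YCTC3|YCTC3|A277537|
print(line[prefix.end():][:15])         # 20131013|225102
```

So the ungrouped version never matches the sample event at all, while the grouped one stops right in front of the date field.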