Hello,
I am trying to extract a field and I have an error in my REGEX. The line looks like this:
6/26/2014 13:00:10.866 | 18636 | Cmd:ALARMS device:REED31_13HLOALP_RD, status:2:FAILED: Unresolved message: ALARMS, user:OKCLOI.MSS, parms: | UisCmdRequestManagerImpl.cpp | 747 | nLOG_EXCEPTIONS
I am trying to pull the 5th section out of the line. The extracted data in this log example would be 747. I have this REGEX in my field extraction:
(?i)^[^\|][^\|][^\|][^\|] (?P{FIELDNAME}\s\d+)
What have I done wrong? Is the data in the third piped section not available for an "any"? Do I need to break down that section?
There are a couple of minor issues with the regex as written:
\s
I like richalloway's approach, but I'm not sure the [\S\s] is quite what you want. I believe that would be interpreted as a character class that includes all non-spaces and all spaces (which pretty much includes everything, and could be written simply as "."). Also, this should be anchored to the beginning of the line (^).
Here's my suggestion: (Modified from richalloway's answer)
^(?:[^|]+\|){4}\s*(?<fieldname>\d+)
Basically this means: from the start of the line, look for one or more characters that are not a pipe, followed by a single pipe. Repeat that 4 times, which puts us at the start of field 5. Then skip over any whitespace characters and capture the following digits into a field named "fieldname".
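If you want to check the pattern outside of Splunk, here is a minimal sketch in Python run against the sample line from the question. Two assumptions to be aware of: Python spells named groups (?P<name>...) rather than (?<name>...), and the variable names are just for illustration.

```python
import re

# Sample event copied from the question.
line = (
    "6/26/2014 13:00:10.866 | 18636 | Cmd:ALARMS device:REED31_13HLOALP_RD, "
    "status:2:FAILED: Unresolved message: ALARMS, user:OKCLOI.MSS, parms: "
    "| UisCmdRequestManagerImpl.cpp | 747 | nLOG_EXCEPTIONS"
)

# Same logic as the suggested regex: skip four "non-pipes then pipe" groups,
# skip optional whitespace, then capture the digits of the fifth section.
pattern = re.compile(r"^(?:[^|]+\|){4}\s*(?P<fieldname>\d+)")

match = pattern.search(line)
value = match.group("fieldname") if match else None
print(value)  # 747
```

Because [^|]+ cannot cross a pipe, each repetition consumes exactly one section, so the long third section needs no special handling.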
Of course, delimiter-based field extractions using props.conf and transforms.conf are another option.
Don't forget to mark your question resolved by selecting the check mark next to one of the answers.
This worked great. I did not get a chance to try out the first two suggestions. When I looked at the answers these three were already posted so I of course took the one that referenced others. My REGEX is weak and I thank you all very much for the answers.
There's an eval function that does this.
eval temp=split(_raw,"|") | eval FieldX=mvindex(temp,4)
The first eval splits your _raw into a multivalue field on the pipe symbol; the second then pulls out the fifth of those values (mvindex is zero-based, so index 4 is the fifth section), calling it FieldX.
Obviously, rename as desired.
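To see what the split approach returns for the sample line, here is a rough Python equivalent (a sketch only, not SPL; the variable names are made up). It shows the zero-based indexing and also that the value keeps the spaces around the pipes, so in Splunk you would probably want to wrap the result in trim().

```python
# Sample event copied from the question.
line = (
    "6/26/2014 13:00:10.866 | 18636 | Cmd:ALARMS device:REED31_13HLOALP_RD, "
    "status:2:FAILED: Unresolved message: ALARMS, user:OKCLOI.MSS, parms: "
    "| UisCmdRequestManagerImpl.cpp | 747 | nLOG_EXCEPTIONS"
)

# Rough analogue of split(_raw, "|"): one list entry per piped section.
fields = line.split("|")

# Rough analogue of mvindex(temp, 4): indexes start at 0, so 4 is the fifth section.
field_x = fields[4]
print(repr(field_x))      # ' 747 ' (surrounding spaces are kept)

# Stripping the whitespace leaves just the digits.
print(field_x.strip())    # 747
```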
This worked for me.
(?:[\S\s]*|){4}\s(?<fieldname>\d+)