Hi
We need a regex that extracts col1, col2, col3, and col4 from every line, but the data does not always contain col3 onwards.
How can we write the regex so that it is forgiving and extracts whatever fields are present, rather than failing for the entire line?
(?<col1>[^\"]*?)\",\"(?<col2>[^\"]*?)\",\"(?<col3>[^\"]*?)\",\"(?<col4>[^\"]*?)\"
Below is the dataset:
"r1col1","r1col2"
"r1col1","r1col2","r1col3"
"r3col1","r3col2","r3col3","r3col4"
"r4col1","r4col2","r4col3","r4col4","r4col5","r4col6","r4col7"
The regex above fails for lines 1 and 2; we would prefer that it return at least col1 and col2 when the other fields are missing.
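How about this:
(?<col1>[^\"]*?)\",\"(?<col2>[^\"]*?)\"(,\"(?<col3>[^\"]*?)\")?(,\"(?<col4>[^\"]*?)\")?
This keeps col1 and col2 required and makes col3 and col4 optional: each optional field is wrapped in parentheses, together with its leading comma, and followed by a question mark, which indicates that the group may occur 0 or 1 times. The comma sits at the start of each optional group because the last field on a line has no trailing comma; if the group required a trailing comma, the last field of a short line would not be captured under its own name.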
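For anyone who wants to check the optional-group idea against the sample rows outside the search tool, here is a minimal Python sketch. It makes a few assumptions that are not in the thread: Python's `re` syntax (`(?P<name>...)` instead of `(?<name>...)`), every line starting with a double quote, col1 and col2 always being present, and the comma placed at the start of each optional group so the last field needs no trailing comma. The names `PATTERN` and `extract` are illustrative only.

```python
import re

# Assumptions (not from the thread): lines start with a double quote,
# col1 and col2 are always present, col3 and col4 may be missing.
# Python spells named groups (?P<name>...) rather than (?<name>...).
PATTERN = re.compile(
    r'"(?P<col1>[^"]*)","(?P<col2>[^"]*)"'  # required fields
    r'(?:,"(?P<col3>[^"]*)")?'              # optional: comma + quoted field
    r'(?:,"(?P<col4>[^"]*)")?'              # optional: comma + quoted field
)

def extract(line):
    """Return a dict of col1..col4 (missing fields are None), or None if no match."""
    m = PATTERN.match(line)
    return m.groupdict() if m else None

rows = [
    '"r1col1","r1col2"',
    '"r1col1","r1col2","r1col3"',
    '"r3col1","r3col2","r3col3","r3col4"',
    '"r4col1","r4col2","r4col3","r4col4","r4col5","r4col6","r4col7"',
]

for row in rows:
    print(extract(row))
```

Fields that are absent come back as `None`, and anything after the fourth field is simply ignored.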
Cheers, it works.