Hi,
I have log files containing Java stack traces, and I am trying to parse them so that the names of the exceptions that caused them are extracted into separate fields. Each event can be up to 200 lines long, and the initial exception appears early in the event. Much farther down the event, you can SOMETIMES find a line that reads: Caused by: another exception name: Reason.
The regex I am using to find the initial exception is as follows:
(?i)[a-z]+(\.[a-z]+)*\.(?=[a-z]+Exception:*\s*)(?P<FIELDNAME1>[^\n]+)
I want to extend the regex to capture this other line as well. This is my attempt, with both parts combined:
(?i)[a-z]+(\.[a-z]+)*\.(?=[a-z]+Exception:*\s*)(?P<FIELDNAME1>[^\n]+)((.+\n)+(?i)Caused by: [a-z]+(\.[a-z]+)*\.(?=[a-z]+Exception:*\s*)(?P<FIELDNAME2>[^\n]+))?
This regex returns no matches. The second match and its field need to be optional because the 'Caused by:' line is not always present, so I tried adding a ? after the entire group that contains FIELDNAME2. Here are other configurations I have tried:
(?i)[a-z]+(\.[a-z]+)*\.(?=[a-z]+Exception:*\s*)(?P<FIELDNAME1>[^\n]+)((.+\n)+(?i)Caused by: [a-z]+(\.[a-z]+)*\.(?=[a-z]+Exception:*\s*)?(?P<FIELDNAME2>[^\n]+)
This almost worked. It does the extraction correctly when it finds the line: Caused by: ....... However, it does not work correctly for events where 'Caused by:' is not found. In that case it chops the last character off the value of FIELDNAME1 and puts that character in FIELDNAME2.
(?i)[a-z]+(\.[a-z]+)*\.(?=[a-z]+Exception:*\s*)(?P<FIELDNAME1>[^\n]+)(.+\n)+(?i)Caused by: [a-z]+(\.[a-z]+)*\.(?=[a-z]+Exception:*\s*)?(?P<FIELDNAME2>[^\n]+)?
This attempt returns no results. I tried making both the second regex match and the field optional, and no matches are found at all.
Any help is appreciated.
I'd be tempted to extract the original exception and the optional second one (if it exists) as separate field extractions, then handle it in your search with something like coalesce.
I guess I am not sure what you mean. Coalesce is used to find the first value in a list that is not NULL. How will this help me here?
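To make the "separate extractions, then coalesce" suggestion concrete, here is a minimal sketch in Python rather than in Splunk configuration. Everything in it is an assumption for illustration: the simplified patterns are not the regex from the question, the sample events are made up, and the helper names (extract, coalesce, ROOT_CAUSE) are hypothetical. The idea is that each pattern runs on its own, so a missing 'Caused by:' line simply leaves FIELDNAME2 empty instead of breaking the FIELDNAME1 match, and coalesce then picks whichever value is present.

```python
import re

# Hypothetical, simplified patterns (not the poster's exact regex).
# Each one is applied independently, so neither depends on the other matching.
PRIMARY = re.compile(
    r"(?:[a-z]+\.)+(?P<FIELDNAME1>[a-z]+Exception[^\n]*)",
    re.IGNORECASE,
)
CAUSED_BY = re.compile(
    r"^Caused by: (?:[a-z]+\.)+(?P<FIELDNAME2>[a-z]+Exception[^\n]*)",
    re.IGNORECASE | re.MULTILINE,
)


def coalesce(*values):
    """Return the first value that is not None, like coalesce in a search."""
    for value in values:
        if value is not None:
            return value
    return None


def extract(event):
    """Run both extractions separately; FIELDNAME2 is None when absent."""
    first = PRIMARY.search(event)
    cause = CAUSED_BY.search(event)
    field1 = first.group("FIELDNAME1") if first else None
    field2 = cause.group("FIELDNAME2") if cause else None
    return {
        "FIELDNAME1": field1,
        "FIELDNAME2": field2,
        # Prefer the 'Caused by:' exception, fall back to the initial one.
        "ROOT_CAUSE": coalesce(field2, field1),
    }


# Made-up sample events for illustration only.
EVENT_WITH_CAUSE = (
    "ERROR request failed\n"
    "java.lang.IllegalStateException: could not start\n"
    "\tat com.example.App.run(App.java:10)\n"
    "Caused by: java.io.IOException: disk full\n"
    "\tat com.example.Store.write(Store.java:42)\n"
)
EVENT_WITHOUT_CAUSE = (
    "ERROR request failed\n"
    "java.lang.NullPointerException: value was null\n"
    "\tat com.example.App.run(App.java:10)\n"
)

if __name__ == "__main__":
    print(extract(EVENT_WITH_CAUSE))
    print(extract(EVENT_WITHOUT_CAUSE))
```

Note that search returns only the first match, so an event with several 'Caused by:' lines would yield the first of them here. In this sketch coalesce is what lets a single field hold "the most specific exception available" whether or not the second extraction found anything.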