My data sample is as below
C12345 my1Surname, my1First Role Access (Group1) - I
Role Access (Group2) - II
HR Department1
Email Contractors
P98765 Þórarinsson, Guðmundur Role Access (Group1) - I
IT Department1
Email Permanent
P12567 my3Surname, my3First Role Access (Group2) - II
IT Department1
AWS Access Role2
Email Permanent
The first column (is Surname and Firstname), then there is a tab and then it is "multiline" roles of the user
I'm using below regex. The regex splits correctly, but the "\n" of the roles are missed, so unable to split it afterwards
| rex "^(?<employeeID>.*)\t(?<user>.*)\t(?<roles>(?s).*)"
I'm looking for final output into a key-value format possibly to have a relational-value (or json hierarchy). eg C12345 has 4 roles
C12345 = Role Access (Group1) - I
C12345 = Role Access (Group2) - II
C12345 = HR Department1
C12345 = Email Contractors
PS: The data is encoded in iso-8859-1 (European languages)
This chops up your provided data into each field, format the final output as you need:
| makeresults | eval _raw = " C12345 my1Surname, my1First Role Access (Group1) - I
Role Access (Group2) - II
HR Department1
Email Contractors
P98765 Þórarinsson, Guðmundur Role Access (Group1) - I
IT Department1
Email Permanent
P12567 my3Surname, my3First Role Access (Group2) - II
IT Department1
AWS Access Role2
Email Permanent"
| rex max_match=0 "(?m)^(?<entry>\s{0,3}\S[^\r\n]+([\r\n]+\s{4,}[^\r\n]+)*)"
| mvexpand entry
| rex field=entry "(?s)^\s*(?<employeeid>\S+)\s+(?<user>.*?)\s\s+(?<roles>.*)"
| rex max_match=0 field=roles "(?<roles>.*?)(\s\s+|\s*$)"
| table employeeid user roles
Output:
employeeid user roles
C12345 my1Surname, my1First Role Access (Group1) - I
Role Access (Group2) - II
HR Department1
Email Contractors
P98765 Þórarinsson, Guðmundur Role Access (Group1) - I
IT Department1
Email Permanent
P12567 my3Surname, my3First Role Access (Group2) - II
IT Department1
AWS Access Role2
Email Permanent
This chops up your provided data into each field, format the final output as you need:
| makeresults | eval _raw = " C12345 my1Surname, my1First Role Access (Group1) - I
Role Access (Group2) - II
HR Department1
Email Contractors
P98765 Þórarinsson, Guðmundur Role Access (Group1) - I
IT Department1
Email Permanent
P12567 my3Surname, my3First Role Access (Group2) - II
IT Department1
AWS Access Role2
Email Permanent"
| rex max_match=0 "(?m)^(?<entry>\s{0,3}\S[^\r\n]+([\r\n]+\s{4,}[^\r\n]+)*)"
| mvexpand entry
| rex field=entry "(?s)^\s*(?<employeeid>\S+)\s+(?<user>.*?)\s\s+(?<roles>.*)"
| rex max_match=0 field=roles "(?<roles>.*?)(\s\s+|\s*$)"
| table employeeid user roles
Output:
employeeid user roles
C12345 my1Surname, my1First Role Access (Group1) - I
Role Access (Group2) - II
HR Department1
Email Contractors
P98765 Þórarinsson, Guðmundur Role Access (Group1) - I
IT Department1
Email Permanent
P12567 my3Surname, my3First Role Access (Group2) - II
IT Department1
AWS Access Role2
Email Permanent
Thank you perfect.
The missing thing in my search was:
| rex max_match=0 field=roles "(?<roles>.*?)(\s\s+|\s*$)"
So my whole search is now
<raw search>| rex "^(?<employeeID>.*)\t(?<user>.*)\t(?<roles>(?s).*)"| rex max_match=0 field=roles "(?<roles>.*?)(\s\s+|\s*$)"| mvexpand roles | table employeeID,user,roles