Splunk Search

Regex help on multiline fields and mvexpand

koshyk
Super Champion

My data sample is as below

C12345  my1Surname, my1First        Role Access (Group1) - I
                            Role Access (Group2) - II
                            HR Department1
                            Email Contractors
P98765  Þórarinsson, Guðmundur        Role Access (Group1) - I
                            IT Department1
                            Email Permanent                               
P12567  my3Surname, my3First        Role Access (Group2) - II
                            IT Department1
                            AWS Access Role2
                            Email Permanent  

The first column (is Surname and Firstname), then there is a tab and then it is "multiline" roles of the user
I'm using below regex. The regex splits correctly, but the "\n" of the roles are missed, so unable to split it afterwards

| rex "^(?<employeeID>.*)\t(?<user>.*)\t(?<roles>(?s).*)"

I'm looking for final output into a key-value format possibly to have a relational-value (or json hierarchy). eg C12345 has 4 roles

C12345 = Role Access (Group1) - I
C12345 = Role Access (Group2) - II
C12345 = HR Department1
C12345 = Email Contractors

PS: The data is encoded in iso-8859-1 (European languages)

0 Karma
1 Solution

martin_mueller
SplunkTrust
SplunkTrust

This chops up your provided data into each field, format the final output as you need:

| makeresults | eval _raw = " C12345  my1Surname, my1First        Role Access (Group1) - I
                             Role Access (Group2) - II
                             HR Department1
                             Email Contractors
 P98765  Þórarinsson, Guðmundur        Role Access (Group1) - I
                             IT Department1
                             Email Permanent                               
 P12567  my3Surname, my3First        Role Access (Group2) - II
                             IT Department1
                             AWS Access Role2
                             Email Permanent"
| rex max_match=0 "(?m)^(?<entry>\s{0,3}\S[^\r\n]+([\r\n]+\s{4,}[^\r\n]+)*)" 
| mvexpand entry 
| rex field=entry "(?s)^\s*(?<employeeid>\S+)\s+(?<user>.*?)\s\s+(?<roles>.*)" 
| rex max_match=0 field=roles "(?<roles>.*?)(\s\s+|\s*$)"
| table employeeid user roles

Output:

employeeid        user                      roles   
C12345            my1Surname, my1First      Role Access (Group1) - I
                                            Role Access (Group2) - II
                                            HR Department1
                                            Email Contractors
P98765            Þórarinsson, Guðmundur    Role Access (Group1) - I
                                            IT Department1
                                            Email Permanent
P12567            my3Surname, my3First      Role Access (Group2) - II
                                            IT Department1
                                            AWS Access Role2
                                            Email Permanent 

View solution in original post

martin_mueller
SplunkTrust
SplunkTrust

This chops up your provided data into each field, format the final output as you need:

| makeresults | eval _raw = " C12345  my1Surname, my1First        Role Access (Group1) - I
                             Role Access (Group2) - II
                             HR Department1
                             Email Contractors
 P98765  Þórarinsson, Guðmundur        Role Access (Group1) - I
                             IT Department1
                             Email Permanent                               
 P12567  my3Surname, my3First        Role Access (Group2) - II
                             IT Department1
                             AWS Access Role2
                             Email Permanent"
| rex max_match=0 "(?m)^(?<entry>\s{0,3}\S[^\r\n]+([\r\n]+\s{4,}[^\r\n]+)*)" 
| mvexpand entry 
| rex field=entry "(?s)^\s*(?<employeeid>\S+)\s+(?<user>.*?)\s\s+(?<roles>.*)" 
| rex max_match=0 field=roles "(?<roles>.*?)(\s\s+|\s*$)"
| table employeeid user roles

Output:

employeeid        user                      roles   
C12345            my1Surname, my1First      Role Access (Group1) - I
                                            Role Access (Group2) - II
                                            HR Department1
                                            Email Contractors
P98765            Þórarinsson, Guðmundur    Role Access (Group1) - I
                                            IT Department1
                                            Email Permanent
P12567            my3Surname, my3First      Role Access (Group2) - II
                                            IT Department1
                                            AWS Access Role2
                                            Email Permanent 

koshyk
Super Champion

Thank you perfect.
The missing thing in my search was:

| rex max_match=0 field=roles "(?<roles>.*?)(\s\s+|\s*$)"

So my whole search is now

<raw search>| rex "^(?<employeeID>.*)\t(?<user>.*)\t(?<roles>(?s).*)"| rex max_match=0 field=roles "(?<roles>.*?)(\s\s+|\s*$)"| mvexpand roles | table employeeID,user,roles
0 Karma
Get Updates on the Splunk Community!

ICYMI - Check out the latest releases of Splunk Edge Processor

Splunk is pleased to announce the latest enhancements to Splunk Edge Processor.  HEC Receiver authorization ...

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...

Introducing the 2024 Splunk MVPs!

We are excited to announce the 2024 cohort of the Splunk MVP program. Splunk MVPs are passionate members of ...