I have a custom log as below:
1 2017-11-27T09:42:05.449123+00:00 generus0002 Sonahock - - [timeQuality tzKnown="1" isSynced="1" syncAccuracy="235810"] "CTO","23-249691","2017-11-27 09:36:57",13,"Malware","Further investigation required","1.1.2.3",80,"1.2.3.4",64635,"Inbound","Generus IFGW","Eldos Wiper (OPTIONS /)",100.0,"1.1.2.3","","PC-myComputer","","aa:bb:cc:dd:ee:ff","East-West","1.1.2.3","/","DavClnt",""
Now I want to find a regular expression that parses the log into multiple field-value pairs. The fields are separated by commas (,), and the first part of the above string, i.e.
1 2017-11-27T09:42:05.449123+00:00 generus0002 Sonahock - - [timeQuality tzKnown="1" isSynced="1" syncAccuracy="235810"]
will be skipped.
I have created the regex below, but it's not working.
^(?:[^"\n]"){7}(?P[^"]+)","(?P[^"]+)","(?P[^"]+)[^"\n]",(?P\d+)[^"\n]"(?P[^"]+)[^,\n],"(?P[^"]+)[^,\n],"(?P[^"]+)",(?P[^,]+)[^,\n],"(?P\d+.\d+.\d+.\d+)",(?P[^,]+)[^"\n]"(?P\w+)","(?P\w+\s+\w+)[^,\n],"(?P[^"]+)",(?P\d+.\d+),"(?P[^"]+),"(?P[^"]),"(?P[^"]),"(?P[^"]),"(?P[^"]),"(?P[^"]),"(?P[^"]),"(?P[^"]),"(?P[^"]),"(?P[^"])*
The main problem I'm facing is that some fields may be empty, i.e. have the value "". I think the regex stops matching whenever it encounters such a field. Please suggest a fix.
Thanks in advance.
Try this regex. You'll want to replace each "field*" name with something more meaningful. Regular expressions handle empty strings with the * quantifier: it matches zero or more characters, whereas the + quantifier used in your pattern requires at least one character and therefore fails on a field like "". BTW, it's not necessary to start your regex at the beginning of the line (unlike what the Splunk regex tool insists on). In this case, the pattern starts with the closing square bracket (]) that ends the text you want to skip.
] "(?<field1>[^"]*)","(?<field2>[^"]*)","(?<field3>[^"]*)",(?<field4>\d*),"(?<field5>[^"]*)","(?<field6>[^"]*)","(?<field7>[^"]*)",(?<field8>\d*),"(?<field9>[^"]*)",(?<field10>\d*),"(?<field11>[^"]*)","(?<field12>[^"]*)","(?<field13>[^"]*)",(?<field14>[^,]*),"(?<field15>[^"]*)","(?<field16>[^"]*)","(?<field17>[^"]*)","(?<field18>[^"]*)","(?<field19>[^"]*)","(?<field20>[^"]*)","(?<field21>[^"]*)","(?<field22>[^"]*)","(?<field23>[^"]*)","(?<field24>[^"]*)"
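If you want to check the pattern outside Splunk, here is a minimal sketch in Python, assuming the sample event from the question. Two things in it are my own choices rather than part of the regex above: Python's re module spells named groups as (?P<name>...) instead of (?<name>...), and the pattern is assembled by a small hypothetical helper (group) instead of being typed out in full. The field1..field24 names are the same placeholders.

```python
import re

# The sample event from the question, verbatim.
LOG = (
    '1 2017-11-27T09:42:05.449123+00:00 generus0002 Sonahock - - '
    '[timeQuality tzKnown="1" isSynced="1" syncAccuracy="235810"] '
    '"CTO","23-249691","2017-11-27 09:36:57",13,"Malware",'
    '"Further investigation required","1.1.2.3",80,"1.2.3.4",64635,'
    '"Inbound","Generus IFGW","Eldos Wiper (OPTIONS /)",100.0,"1.1.2.3",'
    '"","PC-myComputer","","aa:bb:cc:dd:ee:ff","East-West","1.1.2.3",'
    '"/","DavClnt",""'
)


def group(n):
    """Return the capture group for field n (hypothetical helper)."""
    if n in (4, 8, 10):
        # Unquoted integer fields; * lets them be empty too.
        return rf'(?P<field{n}>\d*)'
    if n == 14:
        # Unquoted decimal field: anything up to the next comma.
        return rf'(?P<field{n}>[^,]*)'
    # Quoted fields: zero or more non-quote characters between the quotes.
    return rf'"(?P<field{n}>[^"]*)"'


# Start at the "] " that closes the header, then 24 comma-separated fields.
PATTERN = re.compile(r'\] ' + ','.join(group(n) for n in range(1, 25)))

match = PATTERN.search(LOG)
fields = match.groupdict() if match else {}

for name, value in fields.items():
    print(f'{name} = {value!r}')

# Why the original attempt failed: + needs at least one character,
# so it cannot match an empty quoted field, while * can.
plus_result = re.search(r'"(?P<x>[^"]+)"', '""')
star_result = re.search(r'"(?P<x>[^"]*)"', '""')
```

With this input, field16, field18 and field24 come back as empty strings, and plus_result is None while star_result matches.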
Your regex string was mangled by the system. Please edit your question and put the correct regex as code (highlight the text and click the '101010' button).