I have a data source that is pipe delimited, but some of the fields contain no data or even a blank space. I've created a regex expression that is able to extract all the fields, but is there a way to have the fields with no value or a blank space show up as null so I can handle them with fillnull? Currently the show up as blank such as event=""
or with a space like subject=" "
Raw data:
2016-02-18 09:35:09,491|OAuth| USER1| 1.1.1.1 | |USER1| OAuth20| pingfedserver4| AS| success| | | 18
2016-02-18 09:35:09,267|SSO| USER2| 2.2.2.2 | | https://testsite.test.com| SAML20| pingfedserver4| IdP| success| testuser| | 92
My rex statement:
rex "\d*\:\d*:\d*\,\d*\|(?<event>[^\|]*)\|(?<subject>[^\|]*)\|(?<ip>[^\|]*)\|(?<app>[^\|]*)\|(?<connectionid>[^\|]*)\|(?<protocol> [^\|]*)\|(?<host>[^\|]*)\|(?<role>[^\|]*)\|(?<status>[^\|]*)\|(?<adapterid>[^\|]*)\|(?<description>[^\|]*)\|(?<responsetime>[^\|]*)"
Try like this
Your base search | rex "^[^\|]+\|(?<event>[^\|]*)\|\s(?<subject>[^\|]*)\|\s(?<ip>[^\|]*)\|\s(?<app>[^\|]*)\|\s(?<connectionid>[^\|]*)\|\s(?<protocol>[^\|]*)\|\s(?<host>[^\|]*)\|\s(?<role>[^\|]*)\|\s(?<status>[^\|]*)\|\s(?<adapterid>[^\|]*)\|\s(?<description>[^\|]*)\|\s(?<responsetime>.\d*)" | replace "" with "NA"
Try like this
Your base search | rex "^[^\|]+\|(?<event>[^\|]*)\|\s(?<subject>[^\|]*)\|\s(?<ip>[^\|]*)\|\s(?<app>[^\|]*)\|\s(?<connectionid>[^\|]*)\|\s(?<protocol>[^\|]*)\|\s(?<host>[^\|]*)\|\s(?<role>[^\|]*)\|\s(?<status>[^\|]*)\|\s(?<adapterid>[^\|]*)\|\s(?<description>[^\|]*)\|\s(?<responsetime>.\d*)" | replace "" with "NA"
Worked perfectly! I didn't know about the replace command.
Hi,
I've noticed a typo in your regex. There's a space after the protocol label that it shouldn't be there.
Anyway, with regards to your questions, you could combine foreach with trim and isnull and it should give you what you are looking for:
| foreach * [eval <<FIELD>> = if(isnull(trim('<<FIELD>>')), "NULL", trim('<<FIELD>>'))]
Let me know if that helps
That didn't quite handle it, but I was able to get of rid of the spaces by changing my rex command a bit.
rex "\d*\:\d*:\d*\,\d*\|(?<event>[^\|]*)\|\s(?<subject>[^\|]*)\|\s(?<ip>[^\|]*)\|\s(?<app>[^\|]*)\|\s(?<connectionid>[^\|]*)\|\s(?<protocol>[^\|]*)\|\s(?<host>[^\|]*)\|\s(?<role>[^\|]*)\|\s(?<status>[^\|]*)\|\s(?<adapterid>[^\|]*)\|\s(?<description>[^\|]*)\|\s(?<responsetime>.\d*)"
Now I just need to be able to handle where myfield=""
Any ideas?