Splunk Search

How do I handle fields with no value or a blank space in a rex field extraction so they show up as null?

jedatt01
Builder

I have a pipe-delimited data source in which some fields contain no data or just a blank space. I've written a regular expression that extracts all the fields, but is there a way to have the fields with no value or a blank space show up as null so I can handle them with fillnull? Currently they show up as blank, such as event="", or with a space, like subject=" ".

Raw data:

2016-02-18 09:35:09,491|OAuth| USER1| 1.1.1.1 | |USER1| OAuth20| pingfedserver4| AS| success| | | 18 
2016-02-18 09:35:09,267|SSO| USER2| 2.2.2.2 | | https://testsite.test.com| SAML20| pingfedserver4| IdP| success| testuser| | 92 

My rex statement:

rex "\d*\:\d*:\d*\,\d*\|(?<event>[^\|]*)\|(?<subject>[^\|]*)\|(?<ip>[^\|]*)\|(?<app>[^\|]*)\|(?<connectionid>[^\|]*)\|(?<protocol> [^\|]*)\|(?<host>[^\|]*)\|(?<role>[^\|]*)\|(?<status>[^\|]*)\|(?<adapterid>[^\|]*)\|(?<description>[^\|]*)\|(?<responsetime>[^\|]*)"

somesoni2
Revered Legend

Try something like this:

Your base search | rex "^[^\|]+\|(?<event>[^\|]*)\|\s(?<subject>[^\|]*)\|\s(?<ip>[^\|]*)\|\s(?<app>[^\|]*)\|\s(?<connectionid>[^\|]*)\|\s(?<protocol>[^\|]*)\|\s(?<host>[^\|]*)\|\s(?<role>[^\|]*)\|\s(?<status>[^\|]*)\|\s(?<adapterid>[^\|]*)\|\s(?<description>[^\|]*)\|\s(?<responsetime>.\d*)"  | replace ""  with "NA"
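
A note on the replace step: without an in clause, replace is applied to every field in the results. As a rough sketch, assuming only app, adapterid, and description can be blank (as in the two sample events), the values could be trimmed first and the replacement limited to those fields. The leading "..." stands for the base search and rex above, and "NA" is just the placeholder used in this answer.

```spl
... | eval app=trim(app), adapterid=trim(adapterid), description=trim(description)
| replace "" with "NA" in app adapterid description
```

Limiting the field list keeps the placeholder from being written into unrelated fields that happen to be empty.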

jedatt01
Builder

Worked perfectly! I didn't know about the replace command.

javiergn
Super Champion

Hi,

I've noticed a typo in your regex: there's a space after the protocol label that shouldn't be there.

Anyway, with regard to your question, you could combine foreach with trim and isnull, which should give you what you are looking for:

| foreach * [eval <<FIELD>> = if(isnull(trim('<<FIELD>>')), "NULL", trim('<<FIELD>>'))]

Let me know if that helps

jedatt01
Builder

That didn't quite handle it, but I was able to get rid of the spaces by changing my rex command a bit.

rex "\d*\:\d*:\d*\,\d*\|(?<event>[^\|]*)\|\s(?<subject>[^\|]*)\|\s(?<ip>[^\|]*)\|\s(?<app>[^\|]*)\|\s(?<connectionid>[^\|]*)\|\s(?<protocol>[^\|]*)\|\s(?<host>[^\|]*)\|\s(?<role>[^\|]*)\|\s(?<status>[^\|]*)\|\s(?<adapterid>[^\|]*)\|\s(?<description>[^\|]*)\|\s(?<responsetime>.\d*)"

Now I just need to be able to handle where myfield=""

Any ideas?
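
One possible way to handle the remaining myfield="" case, offered as a sketch rather than a tested answer: trim each extracted field and turn an empty result into a real null with null(), so that fillnull can then supply the placeholder. The isnull(trim(...)) test suggested earlier does not catch these values, because trimming an empty string yields an empty string, not null. The field list is taken from the rex above, the leading "..." stands for the base search plus that rex, and "NA" is only an example placeholder.

```spl
... | foreach event subject ip app connectionid protocol host role status adapterid description responsetime
    [ eval <<FIELD>> = if(trim('<<FIELD>>') == "", null(), trim('<<FIELD>>')) ]
| fillnull value="NA" event subject ip app connectionid protocol host role status adapterid description responsetime
```

Naming the fields explicitly, rather than using foreach *, keeps the rewrite away from fields that did not come from this extraction.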
