Regex Only Returning Partial Field Values

IRHM73
Motivator

Hi,

I wonder whether someone may be able to help me please.

I've put together the following regex to extract the address line from the data below.

"\"lines\":\[\"(?<idaAddress>[^\"]+)"

Field Data

{"matchingDataset":{"surnames":[{"value":"Smith","verified":true}],"gender":{"value":"MALE","verified":true},"dateOfBirth":{"value":"1973-12-26","verified":true},"firstName":{"value":"John","verified":true},"addresses":[{"verified":true,"postCode":"AB1 1BC","lines":["1 A Street","A Town","GB"]}],"middleNames":{"value":"john","verified":true}},"hashedPid":"123","matchId":"_123","levelOfAssurance":"LEVEL_2"}

The problem I have is that it only extracts the first part of the address, i.e. "1 A Street" in the above, whereas I would like to extract "1 A Street, A Town, GB".

I'm sure that it's the back of the query which needs to change but despite trying I'm a little unsure about how to solve this.

I just wondered whether someone may be able to look at this please and offer some guidance on how I can go about this.

Many thanks and kindest regards

Chris

MuS
Legend

Hi IRHM73,

your regex is ending too early; you want to get everything until the next ] so use this regex:

\"lines\":\[\"(?<idaAddress>[^\]]+)

or in a search

base search here | rex "\"lines\":\[\"(?<idaAddress>[^\]]+)" | more splunk fu
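
The difference between the two character classes can be checked outside Splunk. Below is a minimal sketch in Python, not SPL, using a trimmed copy of the event from the question. Note the assumption that Python's `re` module spells named groups `(?P<name>...)`, whereas the PCRE syntax used by rex is `(?<name>...)`; the variable names are only illustrative.

```python
import re

# Trimmed copy of the event from the question.
event = (
    '{"addresses":[{"verified":true,"postCode":"AB1 1BC",'
    '"lines":["1 A Street","A Town","GB"]}],"hashedPid":"123"}'
)

# Original pattern: [^"]+ stops at the first double quote.
first_only = re.search(r'"lines":\["(?P<idaAddress>[^"]+)', event)

# Suggested pattern: [^\]]+ keeps going until the closing square bracket.
to_bracket = re.search(r'"lines":\["(?P<idaAddress>[^\]]+)', event)

print(first_only.group("idaAddress"))  # 1 A Street
print(to_bracket.group("idaAddress"))  # 1 A Street","A Town","GB"
```

The second capture runs to the `]`, so it also picks up the quotes and commas that separate the address lines.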

cheers, MuS

IRHM73
Motivator

Hi @MuS, thank you for coming back to me with this, it is greatly appreciated.

I'm going to persevere with this and see if I can get this to work. I've managed to get Regex101 working with the solutions you and Tom kindly provided, and I've just written my first regex.

I've accepted your answer and if I'm still unable to get this to work, I'll make another post.

Once again, sincere thanks for all your time, trouble and patience.

Have a good day and kind regards

Chris

MuS
Legend

No need to create a new question! I'll email you to get this working 😉

IRHM73
Motivator

Many thanks 🙂

IRHM73
Motivator

Hi @MuS, thank you once more for taking the time to come back to me with this and for the working solution. It feels like a very steep learning curve at the moment getting to grips with regex, but I'm sure I'll get there 🙂

Many thanks and kind Regards

Chris

MuS
Legend

Instead of just replying like this: I like to provide useful answers and explain what happens 😉
Just keep trying out regex101.com, which explains the regex very well, and also try Splunk's pcregextest command (http://docs.splunk.com/Documentation/Splunk/6.0/Troubleshooting/CommandlinetoolsforusewithSupport#pc...), which uses Splunk's internal regex engine and shows what Splunk will match.

cheers, MuS

IRHM73
Motivator

Fair point and many thanks.

IRHM73
Motivator

Hi @MuS, I'm very sorry to trouble you with this again. The solution you provided does work, but unfortunately the speech marks at the end of the expression are returned in the output e.g.

1 A Street"
" A Town"
"GB"

Is there any chance that these could be removed? As per your suggestion, I've searched for a solution on this forum and used regex101 to try to resolve this, but I just can't get it right without imbalanced ']' errors.

Could you possibly let me know where I've gone wrong please?

Many thanks and kind regards

Chris

tom_frotscher
Builder

Hey,

instead of putting it all in one field, put it in 3 fields for street, town and country. For that you can use a similar rex, just with more capturing groups:

\"lines\":\[\"(?<Street>[^\"]+)\",\"(?<Town>[^\"]+)\",\"(?<Country>[^\"]+)\"

Remember to wrap it in double quotes if you use it in Splunk's rex command. After the rex you have 3 new fields: Street, Town and Country. If you want, you can use them as they are, or you can use an eval to combine them into one field again:

... | eval address = 'Street'." ".'Town'." ".'Country'
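
To see what the three groups capture and how they recombine, here is a small sketch in Python rather than SPL. It assumes Python's `(?P<name>...)` spelling for named groups in place of the `(?<name>...)` form rex accepts, and it joins with ", " to match the output asked for in the question.

```python
import re

# Fragment of the event from the question.
event = '"postCode":"AB1 1BC","lines":["1 A Street","A Town","GB"]}]'

# Same structure as the rex above: one named group per address line.
pattern = re.compile(
    r'"lines":\["(?P<Street>[^"]+)","(?P<Town>[^"]+)","(?P<Country>[^"]+)"'
)

match = pattern.search(event)
fields = match.groupdict()

# Recombine the parts in order, like the eval does.
address = ", ".join(fields[name] for name in ("Street", "Town", "Country"))

print(fields)
print(address)  # 1 A Street, A Town, GB
```

One thing to keep in mind: this pattern only matches when "lines" holds exactly three entries, so an address with two or four lines would not be extracted at all.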

Greetings

Tom

IRHM73
Motivator

Hi @Tom, thank you for coming back to me with this. I've tried the solution you kindly provided but unfortunately I receive a 'Search Parser' error.

What also baffles me a little is that I ran the expression through Regex101 and it didn't show any errors?

The rex line is as follows:

 rex field="detail.input-ida-request" "\"lines\":\[\"(?<Street>[^\"]+)\",\"(?<Town>[^\"]+)\",\"(?<Country>[^\"]+)\"

Many thanks and kind regards

Chris

MuS
Legend

You're missing the last " so use it in the search this way:

base search here | rex field="detail.input-ida-request" "\"lines\":\[\"(?<Street>[^\"]+)\",\"(?<Town>[^\"]+)\",\"(?<Country>[^\"]+)\"" | more splunk fu

IRHM73
Motivator

Hi @MuS, thank you very much for this, but unfortunately it now no longer returns any of the address details.

Many thanks and kind regards

Chris

MuS
Legend

the regex matches on regex101 without problem, but in Splunk you must put the regex between a starting " and an ending ". Have you tried it without the field name, so that the regex is run against _raw?

base search here | rex  "\"lines\":\[\"(?<Street>[^\"]+)\",\"(?<Town>[^\"]+)\",\"(?<Country>[^\"]+)\"" | table Street Town Country

IRHM73
Motivator

Hi @MuS, unfortunately I'm still getting the same error even though I've taken the field name out.

I'm very conscious of taking up your time, so do you think we could go back to the solution you provided, which was "\"lines\":\[\"(?<idaAddress>[^\]]+)", because this works and will suit my purposes better.

The only problem I had with this is that it returned:

1 A Street"
" A Town"
"GB"

Where I would like please if possible:

1 A Street
A Town
GB

i.e. without the speech marks.
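
For reference, the clean-up being asked for amounts to dropping the trailing quote and splitting on the `","` separators. The sketch below shows that in Python, not SPL; the `(?P<name>...)` group syntax and the variable names are assumptions of the example, not something from the thread.

```python
import re

# Fragment of the event from the question.
event = '"lines":["1 A Street","A Town","GB"]}]'

# Capture everything up to the closing bracket, as in the accepted answer.
captured = re.search(r'"lines":\["(?P<idaAddress>[^\]]+)', event).group("idaAddress")
# captured is now: 1 A Street","A Town","GB"

# Drop the trailing quote, then split on the quote-comma-quote separators.
lines = captured.rstrip('"').split('","')
address = ", ".join(lines)

print(lines)
print(address)  # 1 A Street, A Town, GB
```

In Splunk the equivalent step would probably be an eval using the replace() function on idaAddress to remove the double quotes, though that is untested against this data.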

My apologies for being a thorn in your side 🙂

Many thanks and kind regards

MuS
Legend

can you post the lit search from the job inspector? the regex is not wrong; there must be some other problem.....

woodcock
Esteemed Legend

Also, make sure that you click "Accept".
