Hi,
I wonder whether someone may be able to help me please.
I've put together the following regex to extract the address line from the data below.
"\"lines\":\[\"(?<idaAddress>[^\"]+)"
Field Data
{"matchingDataset":{"surnames":[{"value":"Smith","verified":true}],"gender":{"value":"MALE","verified":true},"dateOfBirth":{"value":"1973-12-26","verified":true},"firstName":{"value":"John","verified":true},"addresses":[{"verified":true,"postCode":"AB1 1BC","lines":["1 A Street","A Town","GB"]}],"middleNames":{"value":"john","verified":true}},"hashedPid":"123","matchId":"_123","levelOfAssurance":"LEVEL_2"}
The problem I have is that it only extracts the first part of the address, i.e. in the above "1 A Street", whereas I would like to extract "1 A Street, A Town, GB".
I'm sure that it's the back of the query which needs to change but despite trying I'm a little unsure about how to solve this.
I just wondered whether someone may be able to look at this please and offer some guidance on how I can go about this.
Many thanks and kindest regards
Chris
Hi IRHM73,
your regex is ending too early; you want to get everything until the next ]
so use this regex:
\"lines\":\[\"(?<idaAddress>[^\]]+)
or in a search
base search here | rex "\"lines\":\[\"(?<idaAddress>[^\]]+)" | more splunk fu
cheers, MuS
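To see the difference between the two character classes outside Splunk, here is a minimal Python sketch run against a shortened copy of the sample event. It is only an illustration, not Splunk itself: Python's `re` module spells named groups `(?P<name>...)` rather than `(?<name>...)`, and the variable names are made up for this example.

```python
import re

# Shortened copy of the sample event from the question.
event = '{"addresses":[{"verified":true,"postCode":"AB1 1BC","lines":["1 A Street","A Town","GB"]}]}'

# Original pattern: [^"]+ stops at the first double quote it meets.
first_only = re.search(r'"lines":\["(?P<idaAddress>[^"]+)', event)

# Revised pattern: [^\]]+ keeps going until the closing square bracket.
whole_list = re.search(r'"lines":\["(?P<idaAddress>[^\]]+)', event)

print(first_only.group("idaAddress"))
print(whole_list.group("idaAddress"))
```

The first pattern captures only the first address line, while the second captures everything up to the `]`, including the quotes and commas that sit between the address lines.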
Hi @Mus, thank you for coming back to me with this, it is greatly appreciated.
I'm going to persevere with this and see if I can get this to work. I've managed to get Regex101 working with the solutions you and Tom kindly provided and I've just written my first regex.
I've accepted your answer and if I'm still unable to get this to work, I'll make another post.
Once again sincere thanks for all your time trouble and patience.
Have a good day and kind regards
Chris
No need to create a new question! I'll email you to get this working 😉
Many thanks 🙂
Hi @MuS, thank you once more for taking the time to come back to me with this and for the working solution. It feels like a very steep learning curve at the moment getting to grips with regex, but I'm sure I'll get there 🙂
Many thanks and kind Regards
Chris
Instead of just replying like this:
I like to provide useful answers and explain what happens 😉
Just keep trying out regex101.com, which explains the regex very well, and also try Splunk's pcregextest command (http://docs.splunk.com/Documentation/Splunk/6.0/Troubleshooting/CommandlinetoolsforusewithSupport#pc...), which uses Splunk's internal regex and shows what Splunk will match.
cheers, MuS
Fair point and many thanks.
Hi @MuS, I'm very sorry to trouble you with this again. The solution you provided does work, but unfortunately the speech marks at the end of the expression are returned in the output e.g.
1 A Street"
" A Town"
"GB"
Is there any chance that these could be removed? As per your suggestion I've searched for a solution on this forum and used regex101 to try to resolve this, but I just can't get it right without imbalanced ']' errors.
Could you possibly let me know where I've gone wrong please?
Many thanks and kind regards
Chris
Hey,
instead of putting it all in one field, put it in 3 fields for street, town and country. You can use a similar rex, but with more capturing groups:
\"lines\":\[\"(?<Street>[^\"]+)\",\"(?<Town>[^\"]+)\",\"(?<Country>[^\"]+)\"
Remember, if you use it in a Splunk rex, put " around it. After the rex you have 3 new fields: Street, Town and Country. You can use them as they are, or you can use an eval to combine them into one field again:
... | eval address = 'Street'." ".'Town'." ".'Country'
Greetings
Tom
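As a rough check of the three-group idea outside Splunk, the sketch below applies the same pattern in Python and joins the groups back into one string. It assumes Python's `(?P<name>...)` group syntax and a shortened sample event; the variable names are hypothetical.

```python
import re

# Shortened copy of the sample event from the question.
event = '"postCode":"AB1 1BC","lines":["1 A Street","A Town","GB"]}]'

# One named group per address line, each ending at the next double quote.
pattern = re.compile(
    r'"lines":\["(?P<Street>[^"]+)","(?P<Town>[^"]+)","(?P<Country>[^"]+)"'
)

match = pattern.search(event)
parts = match.groupdict()

# Recombine the three fields, similar in spirit to the eval step.
address = ", ".join([parts["Street"], parts["Town"], parts["Country"]])
print(address)
```

One thing to keep in mind with this approach: the pattern only matches when the list holds at least three lines, so an event with a shorter address would produce no fields at all.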
Hi @Tom, thank you for coming back to me with this. I've tried the solution you kindly provided but unfortunately I receive a 'Search Parser' error.
What also baffles me a little is that I ran the expression through Regex101 and it didn't show any errors?
The rex line is as follows:
rex field="detail.input-ida-request" "\"lines\":\[\"(?<Street>[^\"]+)\",\"(?<Town>[^\"]+)\",\"(?<Country>[^\"]+)\"
Many thanks and kind regards
Chris
You're missing the last "
so use it in the search this way:
base search here | rex field="detail.input-ida-request" "\"lines\":\[\"(?<Street>[^\"]+)\",\"(?<Town>[^\"]+)\",\"(?<Country>[^\"]+)\"" | more splunk fu
Hi @MuS, thank you very much for this, but unfortunately it now no longer returns any of the address details.
Many thanks and kind regards
Chris
the regex matches on regex101 without problem, but in Splunk you must put this regex between an opening " and a closing "
but have you tried it without the field name? So the regex is done on the _raw
base search here | rex "\"lines\":\[\"(?<Street>[^\"]+)\",\"(?<Town>[^\"]+)\",\"(?<Country>[^\"]+)\"" | table Street Town Country
Hi @MuS, unfortunately I'm still getting the same error even though I've taken the field name out.
I'm very conscious of taking up your time, so do you think we may go back to the solution you provided, which was "\"lines\":\[\"(?<idaAddress>[^\]]+)"
because this works and will suit my purposes better.
The only problem I had with this is that it returned:
1 A Street"
" A Town"
"GB"
Where I would like please if possible:
1 A Street
A Town
GB
i.e. without the speech marks.
My apologies for being a thorn in your side 🙂
Many thanks and kind regards
Can you post the litsearch from the job inspector? The regex is not wrong; there must be some other problem.
Also, make sure that you click "Accept".
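On the earlier question of removing the speech marks from the single idaAddress value, the logic can be sketched in Python as below. This is only an illustration of the clean-up step, assuming Python's `(?P<name>...)` syntax and hypothetical variable names. In Splunk the same idea would presumably be a follow-up step after the rex, for example eval's replace() function or rex in sed mode; that part is an assumption and has not been tested against this data.

```python
import re

# Shortened copy of the sample event from the question.
event = '"postCode":"AB1 1BC","lines":["1 A Street","A Town","GB"]}]'

# Capture everything between the first quote and the closing bracket.
match = re.search(r'"lines":\["(?P<idaAddress>[^\]]+)', event)
raw = match.group("idaAddress")

# Turn each "," separator into ", " and drop the trailing quote.
cleaned = raw.replace('","', ', ').strip('"')
print(cleaned)
```

Doing the clean-up as a second step keeps the capture pattern simple, which avoids fighting with nested quotes and brackets inside one expression.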