I am working with a field, <source_ip>, that contains three IP addresses, and I want to split the contents of that field into individual values.
The field data currently looks like this:
10.1.0.1 192.168.0.1, 192.168.2.1
10.1.0.1 192.168.3.1, 192.168.4.2
As you can see, the first and second IP addresses are separated by a space, and the second and third are separated by ", " (a comma and a space).
I have tried using SPL commands to split this data, but I feel that a command which uses REGEX may be more suitable.
Is it possible to split these IP addresses into individual values in the same field, i.e. so that the <source_ip> field contains a list of single IP addresses (rather than splitting the three values into three separate fields)?
If the field always contains exactly three IP addresses then this rex command should do the job.
... | rex field=foo "(?<ip1>[^\s]+)\s(?<ip2>[^,]+),\s(?<ip3>.*)" | ...
@richgalloway
One other question: it may well turn out that only the 2nd or 3rd IP address is relevant in the end. Can I tweak your regex so that it ignores the first and/or second IP and only extracts the third?
Extract just the third IP address with this regex
... | rex field=foo ", (?<ip>.*)" | ...
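If only the second address were needed instead, a similar pattern could anchor on the first space and the comma. This is a minimal sketch, assuming the field is named foo as in the examples in this thread and always has the "space, then comma and space" layout shown in the question; the field name ip2_only and the makeresults test event are made up for illustration.

```
| makeresults
| eval foo="10.1.0.1 192.168.0.1, 192.168.2.1"
| rex field=foo "\s(?<ip2_only>[^,]+),"
| table foo ip2_only
```

For the sample value, ip2_only should come out as 192.168.0.1. If the layout of the field varies between events, the pattern would need adjusting.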
You also requested an explanation of my original regex.
(?<ip1>[^\s]+)
takes everything up to the first white space and puts it into field 'ip1'
\s(?<ip2>[^,]+)
skips a space then puts everything up to the next comma into field 'ip2'
,\s(?<ip3>.*)
skips a comma and a space and puts the remaining characters into field 'ip3'.
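The breakdown above can be tried against one of the sample values from the question. This is only a sketch: makeresults creates a single dummy event, and foo stands in for the real field name.

```
| makeresults
| eval foo="10.1.0.1 192.168.0.1, 192.168.2.1"
| rex field=foo "(?<ip1>[^\s]+)\s(?<ip2>[^,]+),\s(?<ip3>.*)"
| table foo ip1 ip2 ip3
```

With that input, ip1 should be 10.1.0.1, ip2 should be 192.168.0.1 and ip3 should be 192.168.2.1.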
You are a gentleman and a scholar, thank you kindly!
Answer accepted and upvoted.
@richgalloway
Thank you for your response. A quick follow-up question: can I extract all three of the values into a SINGLE new field, perhaps similar to the below?
... | rex field=foo "(?<ip_new>[^\s]+)\s(?<ip_new>[^,]+),\s(?<ip_new>.*)" | ...
Also - not sure if you have the time, but care to explain the logic behind your REGEX code?
Regular expressions don't allow the same group name to be used more than once so your rex command won't work.
To split the field into a new (multi-valued) field, use the split function.
... | eval bar=split(replace(foo, ",", "")," ") | ...
The replace function removes the comma. The result is a multi-valued field containing the three IP addresses. You can use mv commands to access them.
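As a rough illustration of working with the resulting multi-valued field, the sketch below picks out one value with the mvindex eval function and counts the values with mvcount. The field names third_ip and ip_count and the makeresults test event are assumptions for the example, not part of the original search.

```
| makeresults
| eval foo="10.1.0.1 192.168.0.1, 192.168.2.1"
| eval bar=split(replace(foo, ",", ""), " ")
| eval third_ip=mvindex(bar, 2)
| eval ip_count=mvcount(bar)
| table foo bar third_ip ip_count
```

mvindex counts from zero, so index 2 is the third address. If one event per IP address is preferred, appending | mvexpand bar after the split should produce that.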