I want to extract a field that is in UUID format and name it instanceid.
props.conf settings
EXTRACT-fields_5 = \[[i]nstance:\s+(?P<instanceid>[0-9a-f]{8}\-[0-9a-f]{4}\-[0-9a-f]{4}\-[0-9a-f]{4}\-[0-9a-f]{12})
For logs like ...
2017-01-01 00:00:00.000 99999 INFO xxxxxxxxxxxx [-] [instance: 01234567-89ab-cdef-0123-456789abcdef] Instance destroyed successfully.
However, it works for some events but not for others.
When I changed the field name in the regex to nstanceid or istanceid, it worked for all events. I don't know what's wrong with the field name instanceid.
On the other hand, the rex command with the above regex (field name instanceid) works well.
Could somebody explain why?
The problem is one of two things: either some events do not contain what you think they all contain (non-conforming event data), or your RegEx is slightly off and does not accommodate all variations of the events (insufficient RegEx). In either case, here is how to figure it out. Deploy the version that works best; let's say that you are using a field name of instance_id. Then run a search like this:
... NOT instance_id="*"
This will show you all events that do not have a field called instance_id. Adjust your RegEx or ignore that type of event (by putting an exclusion for it in your base search), and keep repeating this cycle until that search returns no events.
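The same cycle can be rehearsed offline on exported raw events. The following is only a sketch in Python, whose re module is not the engine Splunk uses, so it tests the pattern itself rather than Splunk's extraction. The helper name unmatched and the second sample event are hypothetical and exist only for illustration.

```python
import re

# Same pattern as in the thread, with the field named instance_id.
UUID_RE = re.compile(
    r"\[instance:\s+(?P<instance_id>"
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"
)

def unmatched(lines):
    """Return the lines in which the pattern finds no instance_id."""
    return [line for line in lines if UUID_RE.search(line) is None]

sample = [
    # Event taken from the question.
    "2017-01-01 00:00:00.000 99999 INFO xxxxxxxxxxxx [-] "
    "[instance: 01234567-89ab-cdef-0123-456789abcdef] Instance destroyed successfully.",
    # Hypothetical non-conforming event (no instance tag), for illustration only.
    "2017-01-01 00:00:01.000 99999 INFO xxxxxxxxxxxx [-] Periodic task finished.",
]

# Print every event the pattern misses, like the NOT instance_id="*" search.
for line in unmatched(sample):
    print(line)
```

If an event that should carry the field shows up in this list, the pattern needs adjusting; if it does not, the pattern is probably not what is failing.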
Hmm... After I changed the extracted field name in the regex from instanceid to instance_id as a workaround, it still doesn't work for some events. It worked fine right after I made the change, but one hour later it didn't.
Could you provide us with the exact _raw payload of an event that doesn't match this regex?
Hi diavolo,
my guess would be that in some events there is actually a field called instanceid.
Try a completely new field name to test your field extraction; something like this should work for you:
\[instance:\s+(?<ThisIsMyTestFieldName>[^\]]+)
cheers, MuS
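To see what the looser pattern above captures, here is a small sketch in Python. It assumes the pattern is translated to Python's (?P<name>...) spelling, since Python's re module does not accept the (?<name>...) form; the event is the sample from the question.

```python
import re

# Python spells named groups (?P<name>...); the rest of the pattern is unchanged.
LOOSE = re.compile(r"\[instance:\s+(?P<ThisIsMyTestFieldName>[^\]]+)")

event = (
    "2017-01-01 00:00:00.000 99999 INFO xxxxxxxxxxxx [-] "
    "[instance: 01234567-89ab-cdef-0123-456789abcdef] Instance destroyed successfully."
)

# [^\]]+ takes everything up to, but not including, the closing bracket.
match = LOOSE.search(event)
captured = match.group("ThisIsMyTestFieldName") if match else None
print(captured)
```

Because the capture accepts anything up to the closing bracket, it does not depend on the value being a well-formed UUID, which makes it a useful test for whether the field name or the UUID pattern is at fault.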
Thanks MuS,
instanceid is not used anywhere. Changing the field name to something like instance_id works fine. But I was wondering why...
The problem may be the (?P at the beginning of the regex.
Also, I believe you can shorthand hex digits as \h, so your regex can look a bit cleaner if you try this -
EXTRACT-fields_5 = \[instance:\s+(?<instanceid>\h{8}\-\h{4}\-\h{4}\-\h{4}\-\h{12})
see this page for more details - http://www.regular-expressions.info/refext.html
Using (?<instanceid>...) instead of (?P<instanceid>...) didn't fix the problem... Also, \h for hex didn't work.
1) When you say "change the field name", are you talking about the underlying data, or the field name being extracted by the regex?
2) Can you post an example of an event that the extraction did NOT work for?
1) The latter. I changed the regex from (?P<instanceid>...) to (?P<nstanceid>...). It worked.
2)
- Worked:
2017-01-06 03:08:35.416 21995 INFO nova.virt.libvirt.driver [-] [instance: 40624b9c-8179-4cb0-82ec-924ee5362cc0] Instance destroyed successfully.
- Did not work:
2017-01-06 03:07:25.932 21995 DEBUG nova.network.neutronv2.api [-] [instance: 6708c71b-0f49-4b0b-8040-fec13e3e2a4b] get_instance_nw_info() _get_instance_nw_info /usr/lib/python2.7/site-packages/nova/network/neutronv2/api.py:602
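As a sanity check, the original pattern can be run against both events outside Splunk. This is only a sketch in Python, whose re module is not the engine Splunk uses, so it says nothing about Splunk's search-time extraction; it only shows whether the pattern itself matches the text. The dictionary keys are labels chosen for this example.

```python
import re

# The pattern from the props.conf setting in the question, unchanged.
PATTERN = re.compile(
    r"\[[i]nstance:\s+(?P<instanceid>"
    r"[0-9a-f]{8}\-[0-9a-f]{4}\-[0-9a-f]{4}\-[0-9a-f]{4}\-[0-9a-f]{12})"
)

events = {
    "worked": (
        "2017-01-06 03:08:35.416 21995 INFO nova.virt.libvirt.driver [-] "
        "[instance: 40624b9c-8179-4cb0-82ec-924ee5362cc0] Instance destroyed successfully."
    ),
    "did_not_work": (
        "2017-01-06 03:07:25.932 21995 DEBUG nova.network.neutronv2.api [-] "
        "[instance: 6708c71b-0f49-4b0b-8040-fec13e3e2a4b] get_instance_nw_info() "
        "_get_instance_nw_info /usr/lib/python2.7/site-packages/nova/network/neutronv2/api.py:602"
    ),
}

# Map each label to the captured UUID, or None when the pattern does not match.
extracted = {}
for label, raw in events.items():
    match = PATTERN.search(raw)
    extracted[label] = match.group("instanceid") if match else None

print(extracted)
```

The pattern matches both events here, which agrees with the earlier observation that the rex command works and suggests that the regex text alone is not what makes the extraction fail.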
Hi diavolo,
try the following.
(?:\[instance:\s+)(?P<instanceid>[0-9a-f]{8}\-[0-9a-f]{4}\-[0-9a-f]{4}\-[0-9a-f]{4}\-[0-9a-f]{12})(?:\])
Should work fine now.
Unfortunately, it doesn't work. The field can't be extracted in some events.