Splunk Search

How to extract key, field name, and value with regex?

tcmarquesi
Explorer

I'm wondering whether anybody has run into this odd behavior.

I want to extract both the key (the field name) and its value from my (pretty uncommon) log. To do this, I did the following:

First, I ran the search below just to test the regex, and it works perfectly.

... | rex max_match=0 field=_raw "(?<test1>\w+)\(.+\)=(?<test2>[^\(].*)[\n|\r]"

I then replaced the test1 and test2 group names with _KEY_1 and _VAL_1 so that each matched group would be assigned to the key and the value, as I wanted.

... | rex max_match=0 field=_raw "(?<_KEY_1>\w+)\(.+\)=(?<_VAL_1>[^\(].*)[\n|\r]"

From this point on, the extraction no longer worked.

Has anyone successfully handled the same problem using the _KEY_1 and _VAL_1 group names? It seems like a bug to me.

Thanks in advance,

Tiago

0 Karma

irenefdezbb
Observer

Maybe it's not working with _KEY_1 and _VAL_1 because Splunk reserves fields beginning with _ for its own use, if I remember correctly.

0 Karma

aguthrie1190
Path Finder

Late to the party here, but I had a similar need and saw that this question hadn't been answered. Basically, do your extractions, then use {} in an eval to get a variable field name.

| gentimes start=-2
| eval _raw="extract"+starttime+" this"+endtime
| rex field=_raw "(?<field_name>extract[0-9]+)\s(?<field_value>this[0-9]+)"
| eval {field_name}=field_value

Then if you care, you can get rid of the placeholder fields:

| gentimes start=-2
| fields - *human
| eval _raw="extract"+starttime+" this"+endtime
| rex field=_raw "(?<field_name>extract[0-9]+)\s(?<field_value>this[0-9]+)"
| eval {field_name}=field_value
| fields - field_name field_value

These searches should run anywhere. The idea came from here https://answers.splunk.com/answers/103700/how-do-i-create-a-field-whose-name-is-the-value-of-another....
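
For readers who want to see the idea outside Splunk, here is a minimal Python sketch of what `eval {field_name}=field_value` followed by `fields - field_name field_value` amounts to: the captured name becomes the key of a new field. The event dictionary, the sample `_raw` string, and the function name `promote_dynamic_field` are assumptions made for illustration. Python's `re` module stands in for Splunk's regex engine, so the named groups use `(?P<name>...)` rather than `(?<name>...)`.

```python
import re

# Same pattern as the rex above, with Python-style named groups.
NAME_AND_VALUE = re.compile(
    r"(?P<field_name>extract[0-9]+)\s(?P<field_value>this[0-9]+)"
)


def promote_dynamic_field(event):
    """Return a copy of event with a field named after the captured name.

    The placeholder captures are never stored, which mirrors removing
    field_name and field_value afterwards.
    """
    match = NAME_AND_VALUE.search(event["_raw"])
    if match is None:
        return dict(event)
    result = dict(event)
    result[match.group("field_name")] = match.group("field_value")
    return result


# Hypothetical event shaped like the gentimes example.
event = {"_raw": "extract1700000000 this1700086399"}
result = promote_dynamic_field(event)
print(result)
```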

tcmarquesi
Explorer

Just to keep everybody on the same page: using "_" is not a problem. Indeed, both _KEY_foo and _VAL_bar are reserved group names that let Splunk find the field name and its value in the text, as described in the docs.

http://docs.splunk.com/Documentation/Splunk/6.5.0/Data/Configureindex-timefieldextraction#Add_a_rege...

0 Karma

snoobzilla
Builder

Yes, I have done this, not with a variable delimiter, but I think a field transform will work.

I used this for logs with ]:[ as the key-value delimiter and ] [ as the pair delimiter, e.g. [KEY1]:[VALUE1] [KEY2]:[VALUE2] [KEY3.....

From the web UI, for the example above...

Create Transform...

Fields --> Field Transformations --> New
Regular Expression: \[([a-zA-Z0-9_]*?)\]\:\[([^\]]*?)\]
Source Key: _raw
Format: $1::$2

Create Extract
Then create a new field extraction, set its Type to use a transform, and point it to the transform you created.

Tip: use regex101.com or an equivalent to test your regex. It will work there and in a transform, but I get errors when using it inline.
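
As a rough way to check the transform's regex away from Splunk, the sketch below applies the same pattern in Python and pairs group 1 with group 2, which is the role the `$1::$2` format plays in the transform. The sample string is the shortened example from this post, and the function name `apply_format` is made up for illustration. This only imitates the pairing and is not how Splunk evaluates transforms internally.

```python
import re

# The Regular Expression from the transform above, unchanged.
BRACKET_PAIR = re.compile(r"\[([a-zA-Z0-9_]*?)\]\:\[([^\]]*?)\]")


def apply_format(raw):
    """Map group 1 (key) to group 2 (value) for every match in raw."""
    return {m.group(1): m.group(2) for m in BRACKET_PAIR.finditer(raw)}


fields = apply_format("[KEY1]:[VALUE1] [KEY2]:[VALUE2]")
print(fields)
```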

tcmarquesi
Explorer

I did this, but through transforms.conf. Indeed, I can see my stanza in the UI.

As for the regex, I tested it exhaustively on both regex101.com and regexr.com/v1, and it works perfectly.

0 Karma

snoobzilla
Builder

Did you try the method above with your regex without any named capturing groups? E.g.

(\w+)\(.+\)=([^\(].*)[\n|\r]

Note the Format field in transform: $1::$2

0 Karma

tcmarquesi
Explorer

Yes, I did. It was my starting point.

This issue really seems like a bug to me...

0 Karma

snoobzilla
Builder

Bummer. You may be right; it may be a limitation.

I assume you saw this... https://answers.splunk.com/answers/133561/multiple-key-value-pair-extraction.html

Good luck.

0 Karma

tcmarquesi
Explorer

Thanks for all the help. 🙂

0 Karma

snoobzilla
Builder

Or maybe

(\w+?)\(.+\)=([^\(].*?)[\n|\r]  
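
One thing worth checking before switching to the lazy variant: inside a character class, `|` is a literal pipe, so `[\n|\r]` also matches `|`. With a greedy `.*` this rarely shows, but with a lazy `.*?` the value can stop at the first pipe. A minimal Python sketch of the difference follows. The sample line is hypothetical, and Python's `re` module is used as a stand-in, so behavior in Splunk should be confirmed separately.

```python
import re

# Greedy pattern from earlier in the thread and the lazy variant above.
GREEDY = re.compile(r"(\w+)\(.+\)=([^\(].*)[\n|\r]")
LAZY = re.compile(r"(\w+?)\(.+\)=([^\(].*?)[\n|\r]")

# Hypothetical log line whose value contains a pipe character.
LINE = "key(foo 12)=left|right\n"

greedy_value = GREEDY.search(LINE).group(2)
lazy_value = LAZY.search(LINE).group(2)

print(greedy_value)  # value runs to the end of the line
print(lazy_value)    # value stops at the first '|'
```

If values may contain pipes, writing the terminator as `[\r\n]` avoids the difference.
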
0 Karma

tcmarquesi
Explorer

Just a few additional comments:

I need to use regex because my log is a little unusual; it can't be parsed automatically.

I don't want to change my log with sed or something like that; it is important to me to keep it in its original form.

In fact, I intend to implement it in transforms.conf. I asked the question using an SPL search because it behaves the same way and is easier to reproduce.

Regards,

Tiago

0 Karma

snoobzilla
Builder

You need to do this using a field transform and reference that transform in a field extraction. I can get these working on regex101.com but have not had luck using them inline.

See https://answers.splunk.com/answers/126754/transforms-field-value-extract-not-fully-working.html

0 Karma

sundareshr
Legend

Splunk regex does not like a leading _ in field names. Having said that, have you looked at the extract command? That may be a better option.

... | extract kvdelim="=" pairdelim="\n"

http://docs.splunk.com/Documentation/Splunk/6.5.0/SearchReference/Extract

tcmarquesi
Explorer

Thanks, but you missed that my log is not that simple. Between the key and the value there is some text like "(foo 12)=". So I have to use regex; extract is ineffective.
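
To make the log shape concrete, here is a minimal sketch of what the pattern from the question does with lines of this form, next to a plain split on `=`. The sample lines are hypothetical (only the "(foo 12)=" fragment comes from this thread), and Python's `re` module stands in for Splunk's regex engine, so the named groups use `(?P<name>...)`. The naive splitter is only a rough stand-in for delimiter-based parsing and is not a model of what `extract` does internally.

```python
import re

# Hypothetical sample; the real log is not shown in the thread.
SAMPLE_LOG = "host(foo 12)=alpha\nport(bar 7)=8080\n"

# The question's pattern, rewritten with Python-style named groups.
KEY_VALUE = re.compile(r"(?P<key>\w+)\(.+\)=(?P<val>[^\(].*)[\n|\r]")


def regex_pairs(text):
    """Collect every key/value pair matched by the pattern."""
    return {m.group("key"): m.group("val") for m in KEY_VALUE.finditer(text)}


def naive_pairs(text):
    """Split each line on its first '=' and keep both halves as-is."""
    pairs = {}
    for line in text.splitlines():
        key, sep, val = line.partition("=")
        if sep:
            pairs[key] = val
    return pairs


regex_result = regex_pairs(SAMPLE_LOG)
naive_result = naive_pairs(SAMPLE_LOG)
print(regex_result)
print(naive_result)
```

In this sketch the regex yields clean keys, while the plain split leaves the parenthesised text attached to each key.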

0 Karma

rjthibod
Champion

Leading underscores on field names are a no-no. Splunk uses leading underscores on field names for special / hidden fields.

Try renaming your fields to something with no leading underscore.

0 Karma

rjthibod
Champion

Here is a link with more details about internal fields. http://docs.splunk.com/Documentation/Splunk/6.5.1/Knowledge/Usedefaultfields

0 Karma