Splunk Search

Need advice on a complex field extraction

arkadyz1
Builder

I have some data in the following format:

CommonPrefix.1.name="Field1",CommonPrefix.1.type="STRING",CommonPrefix.1.status="alive",CommonPrefix.2.name="Field2",CommonPrefix.2.type="NUMBER",CommonPrefix.2.value="3",CommonPrefix.2.status="seen"

and so on. I would like to extract fields so that each name value above becomes a field name and the matching status value becomes that field's value. So the data above would yield two extra fields: Field1=alive and Field2=seen. I know that the numbers always go from 1 to 7, and that .name always precedes .status.
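
To make the target concrete, here is a minimal Python sketch of the desired name-to-status mapping applied to the sample event above. It is not Splunk configuration and says nothing about how Splunk evaluates transforms; the function name extract_status_fields and the PAIR pattern are hypothetical, introduced only for illustration.

```python
import re

# The sample event from the question, split across lines for readability.
SAMPLE = (
    'CommonPrefix.1.name="Field1",CommonPrefix.1.type="STRING",'
    'CommonPrefix.1.status="alive",CommonPrefix.2.name="Field2",'
    'CommonPrefix.2.type="NUMBER",CommonPrefix.2.value="3",'
    'CommonPrefix.2.status="seen"'
)

# Hypothetical helper pattern: matches every CommonPrefix.<n>.<attr>="<value>" pair.
PAIR = re.compile(r'CommonPrefix\.(\d+)\.(\w+)="([^"]*)"')


def extract_status_fields(event):
    """Return {name: status} for each numbered group that has both attributes."""
    groups = {}
    for index, attr, value in PAIR.findall(event):
        groups.setdefault(index, {})[attr] = value
    return {
        attrs["name"]: attrs["status"]
        for attrs in groups.values()
        if "name" in attrs and "status" in attrs
    }


print(extract_status_fields(SAMPLE))
```

Groups that lack either a name or a status are skipped, so a partial event does not produce a half-filled field.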

I tried to make a transform like this:
In props.conf:

[MySourceType]
TRANSFORMS-myfield1 = transformed_1
...
TRANSFORMS-myfield7 = transformed_7

and in transforms.conf:

[transformed_1]
REGEX = CommonPrefix\.1\.name=”([^”]*)”.*CommonPrefix\.1\.status=”([^”]*)”
FORMAT = $1::$2
LOOKAHEAD= 1048576
...
[transformed_7]
REGEX = CommonPrefix\.7\.name=”([^”]*)”.*CommonPrefix\.7\.status=”([^”]*)”
FORMAT = $1::$2
LOOKAHEAD= 1048576

I'm using LOOKAHEAD because my data are quite long. I tried to use _KEY_1 + _VAL_1 capturing groups as well, instead of or in addition to FORMAT. Nothing worked - the fields are not extracted.

Any ideas on what to fix here?

0 Karma
1 Solution

adamsaul
Communicator

arkadyz1,

Try this regex:

(?:CommonPrefix\.1\.name=\")(\w*)(?:\")(?:.*)(?:CommonPrefix\.1\.status=\")(\w*)(?:\")


MuS
Legend

Hi arkadyz1,

Your regex would work! But you have a formatting issue: your double quotes have been converted to curly ("smart") quotes and are therefore wrong 😉

This is working:

 CommonPrefix\.1\.name="([^"]*)".*CommonPrefix\.1\.status="([^"]*)"

This is not working:

 CommonPrefix\.1\.name=”([^”]*)”.*CommonPrefix\.1\.status=”([^”]*)”
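
The difference between the two patterns can be checked outside Splunk. The following is a small Python sketch, assuming Python's re module as a stand-in for the regex engine (Splunk uses PCRE, so this only illustrates the quote-character point). The names EVENT, STRAIGHT, and CURLY are made up for the example, and EVENT is a shortened version of the sample event.

```python
import re

# Shortened sample event; it contains only plain ASCII double quotes.
EVENT = (
    'CommonPrefix.1.name="Field1",CommonPrefix.1.type="STRING",'
    'CommonPrefix.1.status="alive"'
)

# Pattern written with plain ASCII double quotes (U+0022).
STRAIGHT = r'CommonPrefix\.1\.name="([^"]*)".*CommonPrefix\.1\.status="([^"]*)"'

# Same pattern with every quote swapped for a right curly quote (U+201D).
CURLY = STRAIGHT.replace('"', '\u201d')

straight_match = re.search(STRAIGHT, EVENT)
curly_match = re.search(CURLY, EVENT)

print(straight_match.groups())  # ('Field1', 'alive')
print(curly_match)              # None
```

The curly-quote pattern fails simply because U+201D never occurs in the event, so this only matters if the curly quotes are really present in the .conf file.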

Hope this helps ...

cheers, MuS

0 Karma

arkadyz1
Builder

The quotes are fine in transforms.conf; it's just this site that converted them to curly quotes. So no, it's not that. I tried escaping them with backslashes, which also didn't work.

0 Karma

MuS
Legend

Your regex works on the sample event you provided; see http://pasteboard.co/gzVlDIRjH.png

Make sure your sourcetype matches, that you placed props.conf on the parsing Splunk instance, and that you restarted Splunk afterwards.

0 Karma

arkadyz1
Builder

I added capturing groups as suggested by adamsaul in the accepted answer and it started working. I also escaped the double quotes with backslashes, but I had tried that before without success. Really strange...

0 Karma

MuS
Legend

Of course (facepalm) - good spotting in this case!

0 Karma

adamsaul
Communicator

The regex in my answer assumes you do not want to keep the surrounding double quotes in the extracted values.

0 Karma

arkadyz1
Builder

I'm not sure why adding capturing groups worked, but it did. Really weird...

0 Karma

adamsaul
Communicator

Technically you have capturing groups as well, but I also used non-capturing groups so that Splunk doesn't interpret any other data (not that it should).

Glad it worked for you!
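
For anyone comparing the two patterns later, here is a rough Python sketch, using Python's re module rather than Splunk's PCRE engine, so it shows regex semantics only and does not explain the Splunk behavior reported in this thread. ORIGINAL is the straight-quote pattern from the question and SUGGESTED is the accepted one. The second event, with the status value "not seen", is a made-up example and does not come from the original data.

```python
import re

EVENT = (
    'CommonPrefix.1.name="Field1",CommonPrefix.1.type="STRING",'
    'CommonPrefix.1.status="alive"'
)

# Pattern from the question (straight quotes) and the accepted pattern.
ORIGINAL = r'CommonPrefix\.1\.name="([^"]*)".*CommonPrefix\.1\.status="([^"]*)"'
SUGGESTED = (
    r'(?:CommonPrefix\.1\.name=\")(\w*)(?:\")(?:.*)'
    r'(?:CommonPrefix\.1\.status=\")(\w*)(?:\")'
)

# Both patterns expose exactly two capturing groups and capture the same text here.
original_groups = re.search(ORIGINAL, EVENT).groups()
suggested_groups = re.search(SUGGESTED, EVENT).groups()
print(original_groups, suggested_groups)

# Hypothetical event whose status contains a space: \w* cannot cross the
# space, while [^"]* accepts anything up to the closing quote.
SPACED = 'CommonPrefix.1.name="Field1",CommonPrefix.1.status="not seen"'
spaced_original = re.search(ORIGINAL, SPACED)
spaced_suggested = re.search(SUGGESTED, SPACED)
print(spaced_original.groups(), spaced_suggested)
```

In this sketch the only practical difference is the character class: \w* is stricter than [^"]*, so the accepted pattern would miss values containing spaces or punctuation.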
