How can I extract key value pairs with spaces in the keys?

jmartens
Path Finder

I am trying to extract data from plain text files which contain data like this:

Angle Transverse Current (A):   0.06143188
Position Radial Current (A):    -0.8885803
Position Transverse Current (A):    -0.5808258
Trim : (A)  -0.06399536
Accelerator Vacion Current (uA):    0.009002686
Positive 5V dc: 4.993916
Positive 24V dc:    24.4706
Analog Negative 5V dc:  5.054981
Analog Positive 5V dc:  4.97241
Negative 12V dc:    12.24357
Positive 3V dc: 3.303525
Node Power Supply Voltage (V):  24.58121
Water Level :   Normal
City Water Temperature (deg C): 20.54332
Internal Water Supply Temperature (deg C):  40.51909
Gas Pressure (PSI): 32.3421

I would like to extract the left hand part as the key and the right hand part as the value, but it seems the space in the key is causing havoc and I fail to get proper key value pairs.

I have now resorted to a regular expression solution, but even that does not seem to work. This is my configuration.

transforms.conf:

[kv-extraction-trajectory-text]
FORMAT = $1::$2
REGEX = (.*(?!\s))(?:(?<!\s\s)\:\s+)((?:[\-\+])?\d+\.\d+|\w+$|[\-\+]\d+|\w+\s\d+)

My regular expression seems to work pretty well as can be seen at https://regex101.com/r/iH2mG9/1

However, when I add it to my props.conf, or test it manually in the web interface using file upload, it does not seem to work.

props.conf:

[test]
DATETIME_CONFIG =
NO_BINARY_CHECK = true
EXTRACT-kv = kv-extraction-trajectory-text
SHOULD_LINEMERGE = false
category = Custom
description = Testing
disabled = false
pulldown_type = true

Any hints on what I might try next or what I am doing wrong?

1 Solution

jmartens
Path Finder

Thanks all for your input.

With all your hints and suggestions I was eventually able to put together something that seems workable for now, with the added bonus of not needing a transforms.conf anymore. All I need is this in props.conf:

SEDCMD-remove-whitespace = s/\s//g
SEDCMD-remove-unit = s/\(.*\)//g
SEDCMD-replace-colon-with-assign = s/:/=/g 
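
As a rough illustration of what these three substitutions do to each event, the following Python sketch applies equivalent `re.sub` calls to a few of the sample lines from the question. It only emulates the sed expressions and is not Splunk code; the helper name `normalize_event` is made up for this example, and it assumes each line arrives as its own event (SHOULD_LINEMERGE = false).

```python
import re

def normalize_event(raw: str) -> str:
    """Emulate the three SEDCMD substitutions on one single-line event."""
    raw = re.sub(r"\s", "", raw)       # s/\s//g     : drop all whitespace
    raw = re.sub(r"\(.*\)", "", raw)   # s/\(.*\)//g : drop the unit in parentheses
    raw = re.sub(r":", "=", raw)       # s/:/=/g     : turn "key:value" into "key=value"
    return raw

samples = [
    "Angle Transverse Current (A):   0.06143188",
    "Trim : (A)  -0.06399536",
    "Water Level :   Normal",
    "City Water Temperature (deg C): 20.54332",
]

for line in samples:
    print(normalize_event(line))
```

Each event ends up as a single key=value token without spaces (for example AngleTransverseCurrent=0.06143188), which is the form Splunk's automatic search-time key-value extraction is expected to pick up. That is presumably why no transforms.conf stanza is needed.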

DMohn
Motivator

If you want to use the part left of the colon as the key, you will have to remove the whitespaces. Either replace them by underscores or get rid of them completely (which I would recommend, as the keys are still readable then).

Then you can even simplify your extraction regex.

For props.conf:

 SEDCMD-removewhitespace = s/ //g
 EXTRACT-kv = kv-extraction-trajectory-text

For transforms.conf

 [kv-extraction-trajectory-text]
 FORMAT = $1::$2
 REGEX = ([^:]+):([\w\d-\.]+)
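
To see what this simplified pattern captures, here is a small Python sketch that strips whitespace and then applies an equivalent regex. Python's `re` module rejects the character class `[\w\d-\.]` as written (it reads `\d-\.` as a malformed range), so the sketch uses `[\w.-]`, which is intended to match the same characters for this data. The function name `extract_pair` is hypothetical.

```python
import re
from typing import Optional, Tuple

# Equivalent of ([^:]+):([\w\d-\.]+); the value class is rewritten for Python.
PAIR = re.compile(r"([^:]+):([\w.-]+)")

def extract_pair(raw: str) -> Optional[Tuple[str, str]]:
    """Strip all whitespace, then return (key, value), or None if nothing matches."""
    stripped = re.sub(r"\s", "", raw)
    match = PAIR.search(stripped)
    return (match.group(1), match.group(2)) if match else None

print(extract_pair("Positive 5V dc: 4.993916"))
print(extract_pair("Angle Transverse Current (A):   0.06143188"))
print(extract_pair("Trim : (A)  -0.06399536"))
```

With this pattern the unit stays part of the key, and a line such as `Trim : (A)  -0.06399536` is not matched at all, because the value class does not allow `(`.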

jmartens
Path Finder

I had to adjust your solution a little, because the SEDCMD line as written only replaces spaces, whereas the data seems to contain tabs as whitespace characters as well. By replacing your space with \s, all whitespace is removed.

After doing that I can see that the SEDCMD is working as the events are now listed without spaces in Splunk, as can be seen at this screenshot http://imagebin.ca/v/2VunPn0OEd0B

However, as you can see in the screenshot, field extraction still does not seem to work. I get 53 lines of log data as expected, but no fields are extracted.

I tested your regex for transforms.conf and, apart from some lines not matching as they should, I would expect some fields to be extracted, as shown here: https://regex101.com/r/mV9lB3/2

MuS
SplunkTrust

Hi jmartens,

un-tested and not sure if this will work, but your assumption about the spaces is correct - this causes havoc. What you could try is to use

  FORMAT = "$1"::$2

in transforms.conf. Another approach would be to remove or replace all whitespace in the left-hand part of the string with sed in a first transform, and then use a second transform to extract the left-hand side as the key and the right-hand side as the value.

Hope this helps ...

cheers, MuS

jmartens
Path Finder

I already tried quoting the key part in the FORMAT definition, to no avail unfortunately.

lvetter
Explorer

You should use | rex in a search to troubleshoot your regular expression. To be honest, though, I have tried setting the key by regex myself and was unsuccessful, so good luck.

To use rex for this, append this to your search string | rex field=_raw "(.*(?!\s))(?:(?

jmartens
Path Finder

Thanks for your suggestion, but since I am testing using the file upload data import, AFAIK data is not added to the index until after you complete the wizard.

Since I am not even halfway through the wizard, doing this at search time is impossible.

I have tested my regex using other tools, such as the website I mentioned. You should even be able to see the results at the link I provided in my opening post; here it is again for your convenience: https://regex101.com/r/iH2mG9/1

jmartens
Path Finder

After finding this (https://answers.splunk.com/answering/685/view.html) on Splunk Answers, I have been able to quickly add the data from the file to a test index.

After adding the | rex stanza and providing named capturing groups like this:

index=test | rex field=_raw "(?<key>.*(?!\s))(?:(?<!\s\s)\:\s+)(?<value>(?:[\-\+])?\d+\.\d+|\w+$|[\-\+]\d+|\w+\s\d+)"

it at least seems to parse key and value at search time. Still no luck at index time, however.
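
The same pattern can also be exercised outside Splunk, for example with Python's `re` module. The sketch below is a translation made under assumptions: Python spells named groups `(?P<name>...)` instead of `(?<name>...)`, and the helper `split_line` is a name invented here. It only shows what the two groups capture and does not model Splunk's search-time or index-time behaviour.

```python
import re
from typing import Optional, Tuple

# Python translation of the rex pattern above; only the group syntax changes.
KV = re.compile(
    r"(?P<key>.*(?!\s))(?:(?<!\s\s)\:\s+)"
    r"(?P<value>(?:[\-\+])?\d+\.\d+|\w+$|[\-\+]\d+|\w+\s\d+)"
)

def split_line(raw: str) -> Optional[Tuple[str, str]]:
    """Return (key, value) for one line, or None when the pattern does not match."""
    match = KV.search(raw)
    return (match["key"], match["value"]) if match else None

print(split_line("Angle Transverse Current (A):   0.06143188"))
print(split_line("Water Level :   Normal"))
print(split_line("Trim : (A)  -0.06399536"))
```

In this sketch the key keeps its internal spaces (and, for `Water Level :`, a trailing space before the colon), while the `Trim : (A)` line yields no match because none of the value alternatives accepts the leading `(A)`.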
