
How to parse fields with spaces at index time for metrics?

tlscelsi
Engager

Hello all,

I am having trouble getting my raw data into a metrics index. The raw data looks like this:

Date=2019-02-15_00:06:04_+0000,collection=Available Memory,object=Memory,counter=Available Bytes,metric_name=available_bytes,instance=0,Value=5155557376

My main issue is with the 'counter' and 'collection' fields, whose values contain spaces, e.g. Available Bytes.

I initially used the field_extraction transform to parse the data. Here are the relevant stanzas from my props.conf and transforms.conf:

props.conf:

[mkv:meminfo:Memory]
TRANSFORMS-EXTRACT = field_extraction
METRIC-SCHEMA-TRANSFORMS = metric-schema:extract_metrics
category = Log to Metrics

transforms.conf:

[metric-schema:extract_metrics]
METRIC-SCHEMA-MEASURES = Value

However, this only seemed to capture the first word of each value. For example, in Splunk, counter would only be 'Available'.

I then tried to extract the fields manually with a REGEX in the config files. This is what my props.conf and transforms.conf look like at this point:

Data:
Date=2019-02-15_00:06:04_+0000,collection=Available Memory,object=Memory,counter=Available Bytes,metric_name=available_bytes,instance=0,Value=5155557376

props.conf:

[mkv:meminfo:Memory]
TRANSFORMS-metricsfields = custom_field_extractor
METRIC-SCHEMA-TRANSFORMS = metric-schema:extract_metrics
category = Log to Metrics

transforms.conf:

[custom_field_extractor]
REGEX = ([a-zA-Z]+)=([^,]*)
FORMAT = $1::$2
WRITE_META = true
REPEAT_MATCH = true

[metric-schema:extract_metrics]
METRIC-SCHEMA-MEASURES = Value

This produces the same result: the counter and collection values are still only 'Available'.
Can anybody see a problem with the strategy I'm using?

NOTE: I have also added a stanza to fields.conf, although I'm not sure it's doing anything:

[metricsfields]
INDEXED=true
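
One quick way to separate a regex problem from an indexing problem is to run the pattern outside Splunk. The sketch below uses Python's `re` module, which is not the PCRE engine Splunk uses but should behave the same for a pattern this simple. The variable names are illustrative only and do not correspond to anything in Splunk.

```python
import re

# Sample event copied from the post above.
raw = ("Date=2019-02-15_00:06:04_+0000,collection=Available Memory,"
       "object=Memory,counter=Available Bytes,metric_name=available_bytes,"
       "instance=0,Value=5155557376")

# Same pattern as REGEX in [custom_field_extractor]. findall returns every
# (key, value) pair, roughly what REPEAT_MATCH = true asks Splunk to do.
pattern = re.compile(r"([a-zA-Z]+)=([^,]*)")
pairs = pattern.findall(raw)

for key, value in pairs:
    print(f"{key!r} -> {value!r}")
```

In this check the values come back whole ('Available Memory', 'Available Bytes'), which suggests the truncation happens after the regex has matched rather than in the regex itself. It also exposes a separate quirk: `[a-zA-Z]+` cannot match an underscore, so the key `metric_name` is captured as just `name`.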

woodcock
Esteemed Legend

Keep everything the same but change this:

REGEX = ,([^=]+)\s*=\s*([^,]+)
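
As a rough offline check of this pattern (a sketch using Python's `re` rather than Splunk's PCRE engine, with illustrative variable names), the sample event can be run through it directly:

```python
import re

# Sample event from the original question.
raw = ("Date=2019-02-15_00:06:04_+0000,collection=Available Memory,"
       "object=Memory,counter=Available Bytes,metric_name=available_bytes,"
       "instance=0,Value=5155557376")

# The suggested pattern: a leading comma, a key up to '=', a value up to ','.
suggested = re.compile(r",([^=]+)\s*=\s*([^,]+)")
fields = dict(suggested.findall(raw))

for key, value in fields.items():
    print(f"{key!r} -> {value!r}")
```

Because the pattern starts with a literal comma, the first pair in the event (Date=...) has no comma in front of it and is not captured. The remaining keys, including `metric_name`, and the values with spaces are matched in full.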

tlscelsi
Engager

Hey Gregg,

I made the REGEX change you suggested, and when I restarted Splunk it gave me this error:
Bad regex value: ',([^=]+)\s*=\s*(?[^,]+)', of param: transforms.conf / [custom_field_extractor] / REGEX; why: unrecognized character after (? or (?-
One or more regexes in your configuration are not valid. For details, please see btool.log or directly above.
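
The message points at the `(?[` sequence: `(?` normally introduces a special group such as `(?:...)` or `(?<name>...)`, and `[` is not a valid continuation. Python's `re` module rejects the same construct, so a small sketch like the one below (the helper name `compiles` is made up for illustration) can catch this kind of typo before a restart. Python's syntax is not identical to PCRE, so treat it as a first-pass check only.

```python
import re

broken = r",([^=]+)\s*=\s*(?[^,]+)"  # pattern as quoted in the error message
fixed = r",([^=]+)\s*=\s*([^,]+)"    # same pattern without the stray '?'

def compiles(pattern: str) -> bool:
    """Return True if the pattern is syntactically valid for Python's re."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

print(compiles(broken))  # False
print(compiles(fixed))   # True
```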


tlscelsi
Engager

Hey Gregg, it still doesn't seem to be working 😞 I am still only seeing 'Available' instead of 'Available Bytes'. Could this be some sort of Splunk bug?


landen99_gdms
Explorer

Indexed fields cannot span major segments. A space (" ") splits the value into multiple major segments, so the value to be indexed must not contain major breakers such as spaces.
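
To make the effect concrete, here is a toy model of that behaviour in Python. It is an assumption-laden sketch, not Splunk's actual segmenter: the function name `first_major_segment` is hypothetical, and only the space character is treated as a major breaker, whereas real segmentation is configured in segmenters.conf and covers more characters.

```python
def first_major_segment(value: str, major_breakers: str = " ") -> str:
    """Return the part of value before the first major breaker (toy model)."""
    for i, ch in enumerate(value):
        if ch in major_breakers:
            return value[:i]
    return value

# Values taken from the sample event in the question.
for v in ["Available Bytes", "Available Memory", "available_bytes"]:
    print(f"{v!r} -> {first_major_segment(v)!r}")
```

Under this simplified model, 'Available Bytes' and 'Available Memory' both collapse to 'Available', matching the symptom in the question, while 'available_bytes' survives intact because it contains no space.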


woodcock
Esteemed Legend

I can PROVE that this works. Run this search and look at the results:

| makeresults 
| eval _raw="Date=2019-02-15_00:06:04_+0000,collection=Available Memory,object=Memory,counter=Available Bytes,metric_name=available_bytes,instance=0,Value=5155557376" 
| rex max_match=0 ",(?<key>[^=]+)\s*=\s*(?<value>[^,]+)"

So, why might this not be working? Did you:
Use the ORIGINAL sourcetype value in your stanza header if you are doing sourcetype override/overwrite?
Deploy to the first full instance of Splunk that handles these events (HF or Indexers)?
Restart all Splunk instances there?
Send fresh data in after the restarts?
Test with a search using a Time picker value of All time and SPL like this:

index=* sourcetype=YourOriginalSourcetypeHere _index_earliest=-5m

woodcock
Esteemed Legend

I blew it and left a stray ? in there. I edited my original answer and fixed it. Try it now.
