Hello all,
I am currently having some problems with filtering my raw data into a metric index. My raw data currently looks like this:
Date=2019-02-15_00:06:04_+0000,collection=Available Memory,object=Memory,counter=Available Bytes,metric_name=available_bytes,instance=0,Value=5155557376
My main issue is with the 'counter' and 'collection' fields, whose values contain spaces, e.g. Available Bytes.
I was initially using the field_extraction TRANSFORM to parse the data. Here are the relevant stanzas from my props.conf and transforms.conf:
props.conf:
[mkv:meminfo:Memory]
TRANSFORMS-EXTRACT = field_extraction
METRIC-SCHEMA-TRANSFORMS = metric-schema:extract_metrics
category = Log to Metrics
transforms.conf:
[metric-schema:extract_metrics]
METRIC-SCHEMA-MEASURES = Value
But this only seemed to take the first word of the phrase, e.g. in Splunk, counter would only be 'Available'.
I then tried to manually extract the field using REGEX through the config files. This is what my transforms.conf and props.conf look like at this point:
Data:
Date=2019-02-15_00:06:04_+0000,collection=Available Memory,object=Memory,counter=Available Bytes,metric_name=available_bytes,instance=0,Value=5155557376
props.conf:
[mkv:meminfo:Memory]
TRANSFORMS-metricsfields = custom_field_extractor
METRIC-SCHEMA-TRANSFORMS = metric-schema:extract_metrics
category = Log to Metrics
transforms.conf:
[custom_field_extractor]
REGEX = ([a-zA-Z]+)=([^,]*)
FORMAT = $1::$2
WRITE_META = true
REPEAT_MATCH = true
[metric-schema:extract_metrics]
METRIC-SCHEMA-MEASURES = Value
This produces the same result: the counter and collection values are still only 'Available'.
Can anybody see a problem with the strategy that I'm implementing?
NOTE: I have also added a stanza to fields.conf, although I'm not sure it's doing anything:
[metricsfields]
INDEXED=true
Keep everything the same but change this:
REGEX = ,([^=]+)\s*=\s*([^,]+)
Hey Gregg,
I made the REGEX change you suggested, and when I restarted Splunk it gave me this error:
Bad regex value: ',([^=]+)\s*=\s*(?[^,]+)', of param: transforms.conf / [custom_field_extractor] / REGEX; why: unrecognized character after (? or (?-
One or more regexes in your configuration are not valid. For details, please see btool.log or directly above.
Hey Gregg, it still doesn't seem to be working 😞 I am still only seeing 'Available' instead of 'Available Bytes'. Could this be some sort of Splunk bug?
Indexed fields cannot span major segments. A space " " breaks the value into multiple major segments, so the value to be indexed must not contain major breakers such as a space " ".
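One way to picture this: if the key::value pairs are stored in a whitespace-delimited string and read back token by token, a value with a space in it loses everything after the first word. The Python sketch below is only a toy model under that assumption. The function names write_meta and read_meta are hypothetical, and this is not Splunk's actual segmenter.

```python
# Toy model only: plain whitespace tokenizing, NOT Splunk's real segmentation.

def write_meta(pairs):
    """Join key::value pairs into one space-delimited string."""
    return " ".join(f"{key}::{value}" for key, value in pairs)


def read_meta(meta):
    """Split on whitespace and keep only tokens shaped like key::value."""
    fields = {}
    for token in meta.split():
        key, sep, value = token.partition("::")
        if sep:  # tokens without '::' (e.g. the orphaned 'Bytes') are dropped
            fields[key] = value
    return fields


meta = write_meta([("counter", "Available Bytes"), ("instance", "0")])
recovered = read_meta(meta)
print(meta)
print(recovered["counter"])
```

In this toy model the recovered counter value is just the first word, which matches the symptom described in the question.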
I can PROVE that this works. Run this search and look at the results:
| makeresults
| eval _raw="Date=2019-02-15_00:06:04_+0000,collection=Available Memory,object=Memory,counter=Available Bytes,metric_name=available_bytes,instance=0,Value=5155557376"
| rex max_match=0 ",(?<key>[^=]+)\s*=\s*(?<value>[^,]+)"
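If you want to check the pattern itself outside Splunk, here is a rough Python equivalent of that rex. It is only a sketch: Python's re module is a different engine from the one Splunk uses, the named groups are written as (?P<name>...), and the helper name extract_pairs is made up for illustration.

```python
import re

# Sample event copied from the question.
RAW = (
    "Date=2019-02-15_00:06:04_+0000,collection=Available Memory,"
    "object=Memory,counter=Available Bytes,metric_name=available_bytes,"
    "instance=0,Value=5155557376"
)

# Same pattern as the rex above, using Python's named-group syntax.
PAIR = re.compile(r",(?P<key>[^=]+)\s*=\s*(?P<value>[^,]+)")


def extract_pairs(raw):
    """Return a dict of key -> value for every match in the event."""
    return {m.group("key"): m.group("value") for m in PAIR.finditer(raw)}


fields = extract_pairs(RAW)
print(fields["counter"])
print(fields["collection"])
print("Date" in fields)
```

The values come back whole, spaces included, so the pattern is not what truncates them. Note that because the pattern starts with a comma, the leading Date pair is not matched in this sketch.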
So, why might this not be working? Did you:
Use the ORIGINAL sourcetype value in your stanza header if you are doing sourcetype override/overwrite?
Deploy to the first full instance of Splunk that handles these events (HF or Indexers)?
Restart all Splunk instances there?
Send fresh data in after the restarts?
Test with a search using a Time picker value of All time and SPL like this:
index=* sourcetype=YourOriginalSourcetypeHere _index_earliest=-5m
I blew it and left a stray ? in there. I edited my original answer and fixed it. Try it now.
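For anyone hitting the same "unrecognized character after (?" error, the effect of the stray ? can be reproduced roughly in Python. This is a sketch under the assumption that Python's re module rejects the pattern for the same reason. It is a different regex engine from Splunk's, and the helper name compiles is hypothetical.

```python
import re

# Pattern from the error message, with the stray ? after the opening parenthesis.
BROKEN = r",([^=]+)\s*=\s*(?[^,]+)"
# Corrected pattern from the edited answer.
FIXED = r",([^=]+)\s*=\s*([^,]+)"


def compiles(pattern):
    """Return True if the pattern compiles, False if the engine rejects it."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


print(compiles(BROKEN))
print(compiles(FIXED))
```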