I am consuming facter data from Puppet in the format of "virtual => vmware". Each host has on average 180 unique data points all in the same format with different field names.
The format is almost a key value pair, but not quite. Is it possible to have Splunk extract the fields for Puppet facts i.e. virtual => vmware so that the field name is virtual and the value is vmware without having to define a field extraction for every single field? Would it make sense to use sed in props to remove ">" to help Splunk automatically extract the field at search time?
Thanks!
Using sed in props.conf is not a bad idea, but it is better to use search time field extractions. You could use props.conf with transforms.conf to do the search time field extractions, like so:
props.conf
[yourpuppetsourcetype]
KV_MODE = none
REPORT-yourpuppetsourcetype=extract-puppet-fields
transforms.conf
[extract-puppet-fields]
REGEX = (\S+?)=>(\S+?)\s+
FORMAT = $1::$2
Don't use the KV_MODE setting if you have some fields that can be auto-extracted (name=value). The REGEX above will extract name=>value pairs, but it will not properly recognize values that have embedded whitespace.
Using sed in props.conf is not a bad idea, but it is better to use search time field extractions. You could use props.conf with transforms.conf to do the search time field extractions, like so:
props.conf
[yourpuppetsourcetype]
KV_MODE = none
REPORT-yourpuppetsourcetype=extract-puppet-fields
transforms.conf
[extract-puppet-fields]
REGEX = (\S+?)=>(\S+?)\s+
FORMAT = $1::$2
Don't use the KV_MODE setting if you have some fields that can be auto-extracted (name=value). The REGEX above will extract name=>value pairs, but it will not properly recognize values that have embedded whitespace.
I removed the KV_MODE=none and changed the regex to
REGEX = (\S+)\s=>\s(\S+)
That works just fine. Thanks for pointing me in the right direction.
Ed