I'm trying to parse some input where the kv pairs are nested, but I'm also trying to cheat a little bit. Maybe.
With a data source of:
rec=buddies buddy=Jerry animal=cat(aloof, quadruped, canhaz, Frank) dog(doglike, quadruped, derp, Carol), bird(istheword, winged, flippable, MisterSquawk)
rec=buddies buddy=Augustine animal=horse(toothy, speaks, quadruped, Biscuits) dog(ladylike, tri-ped, rabid, Bubbles)
Can I have Splunk ingest both of these as single records but also return the appropriate record for the following searches:
animal=horse OR animal=bird
?
Next, can I have Splunk create a kv pair for each "attribute" of each animal in this input, to allow for searches like:
animal_funct=aloof (returns this record, and "animal_funct" is the field I'd like Splunk to 'discover' and index)
cat_funct=aloof (returns this record, and "cat_funct" is the field I'd like Splunk to discover and index)
and better yet, have a search that asks for 'dog_funct=doglike' and will return buddy=Jerry?
It seems like a regex-heavy solution is in my future, but the more Splunky angle is how can I make Splunk heed my nested input bidding?
Finally, with animal being defined 5 times across these two records, how can I make Splunk answer "rec=buddies | stats count(animal)" with 5 instead of 2, which is what it currently seems to return when a key is repeated within a single message?
Apologies for this question available upon request. I'm not even sure which terminology I should be using for parsing kv pairs from a list of attributes.
I would say that a regex is maybe not the right way to go here, at least not 100%. I would look at creating search macros instead, which would first filter for events that contain the right terms and then validate them against a regex, rather than trying to extract all the terms with the regex. If you need to report (rather than search) over the data, then I would suggest a custom search command to process the data rather than a pure regex field extraction solution.
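As a rough illustration of the parsing such a custom search command would have to do, here is a minimal Python sketch. It only covers the parsing core, not the Splunk command wrapper. The function names are made up for this example, and treating the first attribute of each animal as its "funct" value is an assumption based on the searches in the question.

```python
import re

# One animal entry looks like: name(attr, attr, ...)
ANIMAL_RE = re.compile(r"(\w+)\(([^)]*)\)")

def parse_animals(event):
    """Return a list of (animal, [attributes]) tuples found in one event."""
    animals = []
    for name, body in ANIMAL_RE.findall(event):
        attrs = [a.strip() for a in body.split(",") if a.strip()]
        animals.append((name, attrs))
    return animals

def derive_fields(event):
    """Build multivalue-style fields: animal, animal_funct, <animal>_funct."""
    fields = {"animal": [], "animal_funct": []}
    for name, attrs in parse_animals(event):
        fields["animal"].append(name)
        if attrs:
            # Assumption: the first attribute is the "funct" value.
            fields["animal_funct"].append(attrs[0])
            fields.setdefault(name + "_funct", []).append(attrs[0])
    return fields

events = [
    "rec=buddies buddy=Jerry animal=cat(aloof, quadruped, canhaz, Frank) "
    "dog(doglike, quadruped, derp, Carol), "
    "bird(istheword, winged, flippable, MisterSquawk)",
    "rec=buddies buddy=Augustine animal=horse(toothy, speaks, quadruped, Biscuits) "
    "dog(ladylike, tri-ped, rabid, Bubbles)",
]

parsed = [derive_fields(e) for e in events]

# Counting every animal value, not every event, gives 5 for the sample data.
total_animals = sum(len(p["animal"]) for p in parsed)
print(total_animals)
print(parsed[0]["dog_funct"])
```

Because each derived field holds a list, one event can carry several animal values, which is what a count of 5 over two records would require.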
You could do things differently if you were able to write the data out differently, but I don't know if you have any control over that.
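If you do control the output, one option would be to write one flat event per animal instead of nesting them. The sketch below shows the idea in Python. The attr1..attrN key names are hypothetical placeholders, since the source data does not name the attributes, and the assumption is that everything before "animal=" is shared by all animals in the record.

```python
import re

# One animal entry looks like: name(attr, attr, ...)
ANIMAL_RE = re.compile(r"(\w+)\(([^)]*)\)")

def flatten(event):
    """Split one nested record into one flat key=value line per animal."""
    # Assumption: everything before " animal=" is the shared prefix (rec, buddy).
    prefix = event.split(" animal=", 1)[0]
    lines = []
    for name, body in ANIMAL_RE.findall(event):
        attrs = [a.strip() for a in body.split(",") if a.strip()]
        # attr1..attrN are hypothetical positional key names.
        kv = " ".join(f"attr{i}={v}" for i, v in enumerate(attrs, 1))
        lines.append(f"{prefix} animal={name} {kv}")
    return lines

record = (
    "rec=buddies buddy=Augustine animal=horse(toothy, speaks, quadruped, Biscuits) "
    "dog(ladylike, tri-ped, rabid, Bubbles)"
)

flat = flatten(record)
for line in flat:
    print(line)
```

With plain key=value pairs and one animal per event, automatic key=value field discovery should have a much easier time, and counting animals becomes a matter of counting events. The trade-off is that one buddy is now spread across several events.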