We're ingesting structured JSON logs from a source and would like to run the equivalent of the extract command on one of the event's subfields. The events look something like this:
{
"field1":"value1",
"field2":"value2",
"field3":"value3",
"msg":"field4=value4 field5=value5 field6=value6"
}
The top-level field1/field2/field3/msg fields are all being extracted as expected. However, we'd also like to extract the arbitrary key/value pairs embedded in the msg field, ideally at index time so that they're available to all searches. The key/value pairs in the msg field are not known beforehand. Is it still possible to extract them at index time and make them available to searches?
We've been able to achieve the desired result with a search command chain like the following:
...base search...
| rename _raw AS _temp
| rename msg AS _raw
| extract pairdelim="?&" kvdelim="="
| rename _raw AS msg
| rename _temp AS _raw
However, we have some dashboards that run lots of searches, and we don't want to hack the above command chain into every individual search query.
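For anyone reading along, the result we're after can be sketched outside Splunk. The Python below is only an illustration of the intended outcome, not Splunk's implementation of extract: it assumes the pairs in msg are separated by whitespace and the values are unquoted, and the function name extract_msg_pairs is made up for this example.

```python
import json


def extract_msg_pairs(raw_event: str) -> dict:
    """Parse a JSON event and promote key=value pairs found in its msg field.

    Hypothetical helper for illustration only; assumes whitespace-separated,
    unquoted key=value tokens inside msg.
    """
    event = json.loads(raw_event)
    fields = dict(event)
    for token in event.get("msg", "").split():
        key, sep, value = token.partition("=")
        if sep and key:
            # Keep the top-level JSON field if a name collides.
            fields.setdefault(key, value)
    return fields


raw = (
    '{"field1":"value1","field2":"value2","field3":"value3",'
    '"msg":"field4=value4 field5=value5 field6=value6"}'
)
fields = extract_msg_pairs(raw)
print(sorted(fields))
```

The point is that the key names (field4, field5, field6 here) are discovered from the data rather than hard-coded.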
I was able to solve this by creating two field transforms like the following. One handles the case where the values are in quotes (e.g., key1="value1 with spaces"), and the other handles the case where they aren't (e.g., key1=value1withoutspaces).
json_msg_transform_with_quotes
(?P<_KEY_1>\w+)="(?P<_VAL_1>[^"]*)"
json_msg_transform_without_quotes
(?P<_KEY_1>\w+)=(?P<_VAL_1>[^"\s]+)
I then wired up two new field extractions that use those transforms on the desired source type, and I'm now seeing all the fields (both those from the raw JSON event as well as those embedded in the msg field) available at query time.
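To show what the two patterns match, here is a minimal Python sketch that applies them to a sample string. It only exercises the regular expressions; it does not reproduce how Splunk evaluates transforms or the special meaning Splunk gives to the _KEY_1/_VAL_1 group names. The sample msg value and the extract_pairs helper are assumptions made for the example.

```python
import re

# The two patterns from the transforms above, unchanged.
WITH_QUOTES = re.compile(r'(?P<_KEY_1>\w+)="(?P<_VAL_1>[^"]*)"')
WITHOUT_QUOTES = re.compile(r'(?P<_KEY_1>\w+)=(?P<_VAL_1>[^"\s]+)')


def extract_pairs(msg: str) -> dict:
    """Collect key/value pairs matched by either pattern (hypothetical helper)."""
    fields = {}
    for pattern in (WITH_QUOTES, WITHOUT_QUOTES):
        for match in pattern.finditer(msg):
            # The first match for a key wins, so quoted values take precedence.
            fields.setdefault(match.group("_KEY_1"), match.group("_VAL_1"))
    return fields


msg = 'key1="value1 with spaces" key2=value2withoutspaces'
pairs = extract_pairs(msg)
print(pairs)
```

One caveat: if a quoted value itself contains something like a=b, the unquoted pattern will also match inside it, so an extra field may appear.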
@ckarcher,
Can you please try adding the configuration below to props.conf?
File path: SPLUNK_HOME/etc/apps/YOUR_APP/local/props.conf
[YOUR_SOURCETYPE]
EXTRACT-field4,field5,field6 = ^[^=\n]*=(?P<field4>\w+)[^=\n]*=(?P<field5>\w+)[^=\n]*=(?P<field6>\w+)
Note: You may need to adjust the regular expression to suit your events and requirements.
Thanks
Per the original post, the names of the key/value pairs in the msg field are arbitrary and unknown beforehand.
@ckarcher,
You can try this also:
| makeresults | eval _raw="{\"field1\":\"value1\",\"field2\":\"value2\",\"field3\":\"value3\",\"msg\":\"field4=value4 field5=value5 field6=value6\"}" | extract | eval _raw=msg | extract
Hi @kamlesh_vaghela - we've already proven that it's possible to extract the K/V pairs from msg at search time with an extract command like the one you've provided. However, we have dashboards with lots of searches in them, and we want to avoid hacking the rename + extract command into each of them. Do you know if it's possible to do this in a way that works for all searches against a given source type?
@ckarcher,
Please check my answer below.
Hello @ckarcher,
If the format of msg does not change, you can use rex, as below:
| makeresults
| eval _raw="{\"field1\":\"value1\",\"field2\":\"value2\",\"field3\":\"value3\",\"msg\":\"field4=value4 field5=value5 field6=value6\"}"
| spath
| rex field=msg "field4=(?<field4>.*) field5=(?<field5>.*) field6=(?<field6>.*)"
Hi @poete - the format of the msg field is unknown beforehand. It may contain any number of arbitrary key/value pairs, and we want to extract them all. I've updated the question to reflect this.