The following query joins two source logs, one of which is in JSON format:
(sourcetype="request" AND application=vsp NOT (Agent.007) key_name1 )
OR
(sourcetype="response" key_name2)
| spath
| spath path=your_json_path output=your_output_key_name1
| spath path=your_json_path output=your_output_key_name2
| spath path=your_json_path output=your_output_key_name3
...
| spath path=your_json_path output=your_output_key_name4
| stats
first(your_output_key_name1) as your_output_key_name1
first(your_output_key_name2) as your_output_key_name2
first(your_output_key_name3) as your_output_key_name3
first(your_output_key_name4) as your_output_key_name4
first(key_name1) as key1
list(key_name2) as key2
dc(sourcetype) as dc by id
Problem:
The JSON path can vary, and so can the output variables. Is there a way in Splunk to discover them dynamically?
Regards,
Lp
Have you considered using the "KV_MODE" parameter in your props.conf for this sourcetype? Check it out at http://docs.splunk.com/Documentation/Splunk/5.0.2/Knowledge/Createandmaintainsearch-timefieldextract....
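For illustration, a minimal sketch of what that setting might look like, assuming the JSON events arrive under the sourcetype named response from the query above (substitute your own sourcetype name) and that the stanza goes in props.conf on the search head:

```
# props.conf (search-time setting)
# Assumption: "response" is the sourcetype that carries the JSON events.
[response]
# Ask Splunk to extract fields from JSON automatically at search time.
KV_MODE = json
```

With this in place the JSON fields should be extracted without an explicit spath for each path, though it is worth checking against your own data.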
It could be an option too... I have to test it.
This query works:
(sourcetype="request")//sourcelog. Key=value pair format.
OR
(sourcetype="json_response")//source log. JSON format.
|spath //spath command in auto-extract mode.
|stats list(*) as * by id //join source log by id.
However, I am not sure whether it is limited by the way spath is run without arguments. As documented:
When spath is run with no path argument, spath runs in "auto-extract" mode, where it finds and extracts all the fields from the first 5000 characters in the input field (which defaults to _raw if another input source isn't specified).
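If the 5000-character limit turns out to be a problem, it appears to be adjustable through the extraction_cutoff setting in the [spath] stanza of limits.conf. A minimal sketch, assuming you can edit limits.conf on the search head; the value 10000 is only an example, not a recommendation:

```
# limits.conf
[spath]
# Number of characters spath examines in auto-extract mode.
# The default is 5000; 10000 here is an illustrative value only.
extraction_cutoff = 10000
```

Events longer than the cutoff would still lose any fields that sit beyond it, so it may be worth checking the length of your largest JSON events first.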