I currently index a range of semi-structured log lines that contain a mix of plain text and JSON data. I recently upgraded to Splunk 4.3.4 and was hoping to use the Splunk 4.3+ "kv_mode=json" feature to avoid maintaining a series of regexes.
Unfortunately, it looks like the JSON parsing feature expects the entire log line to be JSON. Is this correct? If so, is it possible to configure the feature to extract from a named field instead? Alternatively, are there any other suggestions?
Example log line:
[08/Oct/2012:14:31:22.965314 +0100] UHLVqqwQ52oAAGX1I2oAAAAE - [mbet] NOTICE (5): [TIMER] request.run {"init":0.93698501586914,"run":25.791883468628,"uri":"/racing/home","time":31.274080276489}
As a workaround, I’ve used a transform to extract the JSON portion of each line into a “json” field, and I then use spath to decode it at search time. It would be great if the JSON could be parsed automatically, avoiding the spath.
Transform:
REGEX = (?
SOURCE_KEY = _raw
Example search:
index=web request.run | spath input=json
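To make the workaround concrete outside Splunk, here is a rough Python sketch of the same two steps: isolate the JSON tail of the line, then decode it. The helper name split_json_tail and the "first opening brace" rule are assumptions for illustration only; they are not the transform's actual regex, and this is not how spath is implemented.

```python
import json

# Sample line taken from the question above.
line = (
    '[08/Oct/2012:14:31:22.965314 +0100] UHLVqqwQ52oAAGX1I2oAAAAE - [mbet] '
    'NOTICE (5): [TIMER] request.run '
    '{"init":0.93698501586914,"run":25.791883468628,'
    '"uri":"/racing/home","time":31.274080276489}'
)


def split_json_tail(raw):
    """Split a log line into its text prefix and decoded JSON tail.

    Assumption: the JSON object starts at the first "{" and runs to the
    end of the line. Lines without a "{" yield an empty dict.
    """
    start = raw.find("{")
    if start == -1:
        return raw, {}
    return raw[:start].rstrip(), json.loads(raw[start:])


prefix, fields = split_json_tail(line)
print(prefix)          # text portion, including the timestamp
print(fields["uri"])   # decoded JSON value
```

Note that the text prefix, timestamp included, is kept separately rather than discarded.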
If you're just dealing with simple key-value pairs in the JSON, you can extract them with a second transform that uses MV_ADD:
[report-json]
REGEX = (?
[report-json-kv]
SOURCE_KEY = json
REGEX = "(\w+)":"?([^,}"]+)
FORMAT = $1::$2
MV_ADD = true
Reference both transforms in your props.conf, like so:
[sourcetype]
REPORT-json = report-json, report-json-kv
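To see what the key/value regex in [report-json-kv] captures, here is a small Python sketch that applies the same pattern to the JSON part of the example line. The function name extract_kv is made up for illustration, and collecting values into lists is only a rough stand-in for MV_ADD, not Splunk's actual behaviour.

```python
import re

# Same pattern as the REGEX in [report-json-kv]: group 1 is the key,
# group 2 is the value with any opening quote skipped.
KV_PATTERN = re.compile(r'"(\w+)":"?([^,}"]+)')


def extract_kv(json_text):
    """Return {key: [values...]} for every key/value match in json_text.

    Values are kept as strings. A repeated key appends to its list,
    which loosely mimics a multi-value field.
    """
    fields = {}
    for key, value in KV_PATTERN.findall(json_text):
        fields.setdefault(key, []).append(value)
    return fields


sample = (
    '{"init":0.93698501586914,"run":25.791883468628,'
    '"uri":"/racing/home","time":31.274080276489}'
)
fields = extract_kv(sample)
for key, values in fields.items():
    print(key, values)
```

Because the value group stops at the first comma, closing brace, or double quote, this only suits flat objects: string values containing commas are cut short, and nested objects or arrays are not handled.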
Wow, this one works like a miracle. Thanks a lot @dart
Just one thing: it totally ignores and removes the timestamp section at the beginning. How can I retain that and still be able to parse the JSON?