What are the options for parsing JSON data at index and search-time and are there any key constraints to be aware of?

ronak · ‎01-20-2015

Splunk Gurus -

I haven't ingested any JSON data in my setup yet, but I'm anticipating many sources generating a lot of JSON data in the near future. I wanted to gather some input from this group -

I've read that use of spath is the way to work with JSON objects. In this regard,

Do I have to use spath in every search when working with JSON data?
Do JSON data and spath impose any constraints on creating summaries, or on creating and searching data models?
What kind of performance impact should I anticipate when using spath vs. not using it?
Do Splunk users ever take the route of converting JSON into KV pairs before indexing? If so, what situation led you to do that?
What are the key constraints I should be aware of when using spath to work with JSON data?

thanks for your inputs

best, ronak

vbumgarner · ‎01-25-2015

Feature request...
Is there any way to make a single extract that runs before KV_MODE = JSON kicks in? I keep seeing cases where just the body of the message is json, listed after the date, level and logger. In that case, you have to run the rex first followed by the spath.

To the original question...
If the whole body of the event is JSON, then the fields will be extracted automatically. It "just works." You can summarize them like any other extracted field. I've never measured the performance, but it seems pretty good. Still, think about what it's doing... you probably don't want to convert all of your logs to JSON.

The only caveat I can give is to avoid complicated json documents if you can control the documents. For example, if you have arrays of objects that are actually each an event, you'll have to do some gymnastics with mvexpand to keep the fields in each nested event related.
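
A rough sketch of that mvexpand pattern, assuming a hypothetical event whose JSON has a top-level array called items, where each element is an object with name and value keys (all three names are placeholders, not from any real source). The first spath pulls each array element out as one value of a multivalue field, mvexpand splits those into separate results, and the second spath extracts the keys of each element so they stay together:

  ... | spath path=items{} output=item | mvexpand item | spath input=item | table _time name value

Without the mvexpand step, name and value would each come back as independent multivalue fields on the original event, and the pairing between them would be easy to lose.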

trsavela · ‎01-20-2015

If your data is all json then you want this in your props:

KV_MODE = JSON

http://docs.splunk.com/Documentation/Splunk/6.2.1/admin/Propsconf
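
As a minimal sketch, the setting goes under a sourcetype stanza in props.conf. The stanza names below are made up for illustration. The second stanza shows the index-time alternative, INDEXED_EXTRACTIONS, which is also a props.conf setting; it is usually paired with KV_MODE = none so the same fields are not extracted a second time at search time:

  # search-time JSON extraction (hypothetical sourcetype name)
  [my_json_sourcetype]
  KV_MODE = json

  # index-time JSON extraction (hypothetical sourcetype name)
  [my_indexed_json_sourcetype]
  INDEXED_EXTRACTIONS = json
  KV_MODE = none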

For my JSON data I do little else. For my large data sets I use a tsidx to store what I need for speed.

If it's not pure JSON, you need to get the JSON into a field first and then extract from it:

  | rex field=_raw "(?s)(?<xxx>match_json_here)" | spath input=xxx
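
As a concrete version of that, assume events shaped like this (a made-up layout, with date, level and logger in front of the JSON body):

  2015-01-20 10:15:00 INFO com.example.Foo {"user":"bob","status":200}

If the JSON starts at the first { and runs to the end of the event, something along these lines should capture it into a field (json_body is an arbitrary name) and hand it to spath:

  ... | rex field=_raw "(?s)^[^{]*(?<json_body>\{.*\})\s*$" | spath input=json_body

The regex is only a sketch under that assumption. If a { can appear in the text before the JSON body, anchor on something more specific to your log format instead.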

ronak · ‎01-20-2015

BTW, the version I have is 6.2.1.

What are the options for parsing JSON data at index and search-time and are there any key constraints to be aware of?
