Getting Data In

How can we extract a json document within an event?

ddrillic
Ultra Champion

We have events such as:

10.10.2017 09:40:39.651 *INFO* [10.86.208.119 [1507646439651] POST /apps/xxxx/yyyy HTTP/1.1] com.xxxx.yyyy.api.impl.logging.info.InfoLoggerServiceImpl {"id":{"access_token":"7ee2ea18-e72c-449d-9dec-28d02b116c92","uid":"zzzzz","jsessionID":"aaaaaaa","uuid":"12e255ac-35e9-4630-a36b-89aa27e9566e"},"request":{"url":"https://bbb.cccc.com/content/uuuuuu"..... }]}}

The json document is part of the event. Can we extract this json document?

sshelly_splunk
Splunk Employee

I took a quick look at this, and I think this transform might work for you. It will not extract the "id" or "request" fields, as I am not sure what they are. It did extract the following: access_token, uid, jsessionID, uuid and url.
In props, I added: REPORT-extract = json_embedded
The transforms stanza is:
[json_embedded]
REGEX = "(\w+)"."(\S+?)"
FORMAT = $1::$2

Hope this helps. Reply if it does not.
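
To see which pairs the regex above picks up, here is a minimal Python sketch that runs the same pattern outside Splunk. It is only an approximation of what the transform does. The event is a shortened stand-in: the original is elided after the "url" value, so the closing braces are an assumption.

```python
import re

# Shortened stand-in for the event in the question. The real event is elided
# after the "url" value, so the closing braces here are an assumption.
event = (
    '10.10.2017 09:40:39.651 *INFO* [10.86.208.119 [1507646439651] '
    'POST /apps/xxxx/yyyy HTTP/1.1] '
    'com.xxxx.yyyy.api.impl.logging.info.InfoLoggerServiceImpl '
    '{"id":{"access_token":"7ee2ea18-e72c-449d-9dec-28d02b116c92",'
    '"uid":"zzzzz","jsessionID":"aaaaaaa",'
    '"uuid":"12e255ac-35e9-4630-a36b-89aa27e9566e"},'
    '"request":{"url":"https://bbb.cccc.com/content/uuuuuu"}}'
)

# Same pattern as the REGEX line in the transforms stanza.
pattern = re.compile(r'"(\w+)"."(\S+?)"')

# Each match yields (field name, field value).
pairs = pattern.findall(event)
for name, value in pairs:
    print(f"{name} = {value}")
```

"id" and "request" are skipped because their values start with `{` rather than a double quote, so the second half of the pattern never matches for them.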

ddrillic
Ultra Champion

Just applied it and it works perfectly - much appreciated. Just wondering whether there is anything for JSON documents like the spath command that we use for XML documents, so we can reach nested elements.

sshelly_splunk
Splunk Employee

ddrillic - You can index just the json portion of the event, but it looks like the text before the json portion includes timestamp, etc. Since this log is not proper json, I think you're going to need to do regex on it for display purposes.

When looking at XML or JSON data (assuming it conforms to standards - sorry, not exactly sure what that all entails :)), you can set KV_MODE = xml or KV_MODE = json in props.conf, or use something like the above. My skills are really around getting data in, and not SPL proper (I know, I know :)), so I will defer to the SPL experts for the SPL-specific questions. My focus is really on making sure data comes in correctly, so the SPL doesn't need to be complex to get value out of the data. Sorry if that doesn't help.
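
A rough Python illustration of the "not proper JSON" point: the event as a whole does not parse, but the part starting at the first brace does, and nested elements are then reachable. This sketch runs outside Splunk and is not Splunk's own parser. The event is a shortened stand-in with assumed closing braces, since the original is elided, and the "first `{` to end of event" rule is an assumption that happens to hold for this sample.

```python
import json

# Shortened stand-in for the event; the closing braces are an assumption
# because the original event is elided after the "url" value.
event = (
    '10.10.2017 09:40:39.651 *INFO* [10.86.208.119 [1507646439651] '
    'POST /apps/xxxx/yyyy HTTP/1.1] '
    'com.xxxx.yyyy.api.impl.logging.info.InfoLoggerServiceImpl '
    '{"id":{"uid":"zzzzz","jsessionID":"aaaaaaa"},'
    '"request":{"url":"https://bbb.cccc.com/content/uuuuuu"}}'
)

# The full event is not valid JSON because of the text before the document.
try:
    json.loads(event)
    whole_event_is_json = True
except json.JSONDecodeError:
    whole_event_is_json = False

# Assumption: the JSON document starts at the first "{" and runs to the end.
doc = json.loads(event[event.index("{"):])

# Nested elements are now ordinary dictionary lookups.
uid = doc["id"]["uid"]
url = doc["request"]["url"]
print(whole_event_is_json, uid, url)
```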

ddrillic
Ultra Champion

Very interesting, so you are saying that if it's a "real" json document we can parse it as such - interesting.

ddrillic
Ultra Champion

Beautiful thing!!! I've wanted to ask for a while - is there a way to test these configurations from the search interface before adding them to the config files?

blacknight659
Explorer

You could take a raw copy of the logs and use the UI to upload it and test the event breaking and extraction. I think Splunk really likes JSON, since it auto-extracts the fields and values.

sshelly_splunk
Splunk Employee

I use regex101 to test all of my transforms (unless they are extremely simple:)).
Copy 2 events if available into the "Test String" window, and go to town.

sshelly_splunk
Splunk Employee

Sorry - just re-read your question. I sometimes test regex in the search bar, but not usually. The format is slightly different there, so I use regex101, but that might just be a preference.

ddrillic
Ultra Champion

Perfect, so I got the REGEX part. What does the FORMAT - $1::$2 mean?

ddrillic
Ultra Champion

For future reference -

FORMAT = $1::$2 (where the REGEX extracts both the field name and the field value)

from Create custom fields at index time
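
The effect of FORMAT = $1::$2 can be mimicked in Python: the first capture group becomes the field name and the second the field value. This is only an analogy, not how Splunk implements it. Python writes the groups as \1 and \2, and the `snippet` string is a hypothetical fragment built from values in the sample event.

```python
import re

# Same pattern as the REGEX line in the transforms stanza.
pattern = re.compile(r'"(\w+)"."(\S+?)"')

# Hypothetical fragment using two values from the sample event.
snippet = '{"uid":"zzzzz","jsessionID":"aaaaaaa"}'

# \1::\2 plays the role of $1::$2, giving name::value for each match.
formatted = [m.expand(r"\1::\2") for m in pattern.finditer(snippet)]
print(formatted)  # ['uid::zzzzz', 'jsessionID::aaaaaaa']
```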

sbbadri
Motivator

https://regex101.com/r/FPxKuU/1

or

| makeresults | eval test="10.10.2017 09:40:39.651 INFO [10.86.208.119 [1507646439651] POST /apps/xxxx/yyyy HTTP/1.1] com.xxxx.yyyy.api.impl.logging.info.InfoLoggerServiceImpl {\"id\":{\"access_token\":\"7ee2ea18-e72c-449d-9dec-28d02b116c92\",\"uid\":\"zzzzz\",\"jsessionID\":\"aaaaaaa\",\"uuid\":\"12e255ac-35e9-4630-a36b-89aa27e9566e\"},\"request\":{\"url\":\"https://bbb.cccc.com/content/uuuuuu\"..... }]}}" | rex field=test "(?P\"(\w+)\".\"(\S+))\""

ddrillic
Ultra Champion

Wow - man. very pretty!!!
