Splunk Search

Parsing a JSON string in search object

bshega
Explorer

We changed how our data was getting into splunk instead of dealing with full JSON we're just importing the data straight from the database. We have a dashboard that lets our consumer services team search by address, we're using spath currently to parse the JSON. We don't have to do that anymore with the new format but the additional_information part of our object is still JSON, how can I parse this?

This is that the data looks like

2017-01-04 16:41:34.439, id="60", created_at="2017-01-04 16:41:34.43926", updated_at="2017-01-04 16:41:34.43926", email="", encrypted_password="", token="", first_name="Brandon", last_name="Shega", additional_info=""{\"address_2\":\"\",\"city\":\"North Olmsted\",\"zip_code\":\"44070\",\"country\":\"us\",\"address_1\":\"23500 Al Moen Dr\",\"state\":\"Ohio\"}""

So additional_info is really a string that contains a JSON object, is this possible to be parsed to be able to create a dropdown with the full address and search on that?

Tags (1)
1 Solution

niketn
Legend

@bshega, please try the following search

 index=iot-productiondb source=Users
|  eval _raw=replace(_raw,"\\\\\"","\"")
|  rex "additional_info=\"\"(?<additional_info>[^}]+})"
|  spath input=additional_info

Following is a run anywhere search to extract JSON data using rex (first _raw data is cleaned up using replace() function). Then additional_info field is extracted from _raw event using rex command. Finally spath is applied on the additonal_info field:

|  makeresults
|  eval _raw=" 2017-01-04 16:41:34.439, id=\"60\", created_at=\"2017-01-04 16:41:34.43926\", updated_at=\"2017-01-04 16:41:34.43926\", email=\"\", encrypted_password=\"\", token=\"\", first_name=\"Brandon\", last_name=\"Shega\", additional_info=\"\"{\\\"address_2\\\":\\\"\\\",\\\"city\\\":\\\"North Olmsted\\\",\\\"zip_code\\\":\\\"44070\\\",\\\"country\\\":\\\"us\\\",\\\"address_1\\\":\\\"23500 Al Moen Dr\\\",\\\"state\\\":\\\"Ohio\\\"}\"\""
|  eval _raw=replace(_raw,"\\\\\"","\"")
|  rex "additional_info=\"\"(?<additional_info>[^}]+})"
|  spath input=additional_info

PS: Sample data generate using | makeresults and |eval raw=.... has escaped characters in order to generate raw data as per the question.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

View solution in original post

0 Karma

niketn
Legend

@bshega, please try the following search

 index=iot-productiondb source=Users
|  eval _raw=replace(_raw,"\\\\\"","\"")
|  rex "additional_info=\"\"(?<additional_info>[^}]+})"
|  spath input=additional_info

Following is a run anywhere search to extract JSON data using rex (first _raw data is cleaned up using replace() function). Then additional_info field is extracted from _raw event using rex command. Finally spath is applied on the additonal_info field:

|  makeresults
|  eval _raw=" 2017-01-04 16:41:34.439, id=\"60\", created_at=\"2017-01-04 16:41:34.43926\", updated_at=\"2017-01-04 16:41:34.43926\", email=\"\", encrypted_password=\"\", token=\"\", first_name=\"Brandon\", last_name=\"Shega\", additional_info=\"\"{\\\"address_2\\\":\\\"\\\",\\\"city\\\":\\\"North Olmsted\\\",\\\"zip_code\\\":\\\"44070\\\",\\\"country\\\":\\\"us\\\",\\\"address_1\\\":\\\"23500 Al Moen Dr\\\",\\\"state\\\":\\\"Ohio\\\"}\"\""
|  eval _raw=replace(_raw,"\\\\\"","\"")
|  rex "additional_info=\"\"(?<additional_info>[^}]+})"
|  spath input=additional_info

PS: Sample data generate using | makeresults and |eval raw=.... has escaped characters in order to generate raw data as per the question.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

somesoni2
Revered Legend

You should be able to use | spath input=additional_info to parse that embedded json data and extract fields. If those escaped double quotes are causing issue with spath, you may have to correct it before using spath (either by eval-replace or rex-sed).

0 Karma

bshega
Explorer

I actually noticed something else, no matter what I do, Splunk always puts the additional_info as blank, I'm thinking it's because it's a string within a string (notice the double quotes at the beginning and end). So I think Splunk is just taking the first set of quotes and saying that it's an empty string. Is there a way to massage the data already in Splunk to remove the outer set of quotes?

0 Karma

bshega
Explorer

So, I've got something like this

index=iot-productiondb source=Users | eval additional_info = replace(additional_info, "\\/", "") | eval additional_info = ltrim(additional_info, "\"") | eval additional_info = rtrim(additional_info, "\"") | spath input=additional_info | table additional_info

I'm mostly just trying to make sure the JSON is parsing correctly with the table command. Does this look correct? I'm not sure how to actually access the information after the spath. I've tried something like additional_info.country

0 Karma
Get Updates on the Splunk Community!

Stay Connected: Your Guide to May Tech Talks, Office Hours, and Webinars!

Take a look below to explore our upcoming Community Office Hours, Tech Talks, and Webinars this month. This ...

They're back! Join the SplunkTrust and MVP at .conf24

With our highly anticipated annual conference, .conf, comes the fez-wearers you can trust! The SplunkTrust, as ...

Enterprise Security Content Update (ESCU) | New Releases

Last month, the Splunk Threat Research Team had two releases of new security content via the Enterprise ...