Splunk Search

How do you fetch JSON embedded in plain text logs using regex or spath?

maulikdesai21
Engager

I have been running into a problem where I need to fetch the value from JSON data in the log. I am aware of spath but I believe spath expects JSON as an input. However, my data has lots of plain text and JSON mixed together. I have seen similar questions being asked before:

https://answers.splunk.com/answers/151040/how-to-parse-json-mixed-in-with-text-data-or-a-timestamp.h...

I am not sure what's the clean way to do this. Below is a sample log:

mysite/844e7cca96f7 EventTime=2019-03-22T20:36:53.920Z LogLevel=error iLogger uievent=unhandledrejection {"url":"http://mysite.com","error_message":"Blocked a frame with origin \"https://mysite.com\" from accessing a cross-origin frame.","errStack":"{\"isTrusted\":true}","urCounter":1} sm_serversessionid=iVML0bvuCTjsdkfSW/sEp7WgjNKwRPpc= sm_transactionid=000000000000000000000000cb2c10000b-0d5d-5c954765-f7fef700-a123457d0351 samaname=mark id=2492341 sm_user=EZ\mark

I need to group stuff by error_message

0 Karma
1 Solution

woodcock
Esteemed Legend

Why not just do this:

index=YouShouldAlwaysSpecifyAnIndex sourcetype=AndSourcetypeToo
| rex max_match=0 ",\"error_message\":\"(?<error_message>.*?)\",\"(?<!\\\\\")"
| rex field=error_message mode=sed "s/\\\\\"/\"/g"
| stats count by error_message

See this run-anywhere example:

| makeresults
| eval _raw="mysite/844e7cca96f7 EventTime=2019-03-22T20:36:53.920Z LogLevel=error iLogger uievent=unhandledrejection {\"url\":\"http://mysite.com\",\"error_message\":\"Blocked a frame with origin \\\"https://mysite.com\\\" from accessing a cross-origin frame.\",\"errStack\":\"{\\\"isTrusted\\\":true}\",\"urCounter\":1} sm_serversessionid=iVML0bvuCTjsdkfSW/sEp7WgjNKwRPpc= sm_transactionid=000000000000000000000000cb2c10000b-0d5d-5c954765-f7fef700-a123457d0351 samaname=mark id=2492341 sm_user=EZ\mark"
| rex ",\"error_message\":\"(?<error_message>.*?)\",\"(?<!\\\\\")"
| rex field=error_message mode=sed "s/\\\\\"/\"/g"

View solution in original post

woodcock
Esteemed Legend

Why not just do this:

index=YouShouldAlwaysSpecifyAnIndex sourcetype=AndSourcetypeToo
| rex max_match=0 ",\"error_message\":\"(?<error_message>.*?)\",\"(?<!\\\\\")"
| rex field=error_message mode=sed "s/\\\\\"/\"/g"
| stats count by error_message

See this run-anywhere example:

| makeresults
| eval _raw="mysite/844e7cca96f7 EventTime=2019-03-22T20:36:53.920Z LogLevel=error iLogger uievent=unhandledrejection {\"url\":\"http://mysite.com\",\"error_message\":\"Blocked a frame with origin \\\"https://mysite.com\\\" from accessing a cross-origin frame.\",\"errStack\":\"{\\\"isTrusted\\\":true}\",\"urCounter\":1} sm_serversessionid=iVML0bvuCTjsdkfSW/sEp7WgjNKwRPpc= sm_transactionid=000000000000000000000000cb2c10000b-0d5d-5c954765-f7fef700-a123457d0351 samaname=mark id=2492341 sm_user=EZ\mark"
| rex ",\"error_message\":\"(?<error_message>.*?)\",\"(?<!\\\\\")"
| rex field=error_message mode=sed "s/\\\\\"/\"/g"

maulikdesai21
Engager

Thanks @woodcock, that works 🙂

However, I having bit hard time understanding the regex, the part below:

| rex max_match=0 ",\"error_message\":\"(?.*?)\",\"(?

0 Karma

woodcock
Esteemed Legend

Click Accept to close the question. Throw the RegEx into RegEx101.com and it will explain all.

0 Karma
Get Updates on the Splunk Community!

Threat Hunting Unlocked: How to Uplevel Your Threat Hunting With the PEAK Framework ...

WATCH NOWAs AI starts tackling low level alerts, it's more critical than ever to uplevel your threat hunting ...

Splunk APM: New Product Features + Community Office Hours Recap!

Howdy Splunk Community! Over the past few months, we’ve had a lot going on in the world of Splunk Application ...

Index This | Forward, I’m heavy; backward, I’m not. What am I?

April 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...