Getting Data In

Parse JSON - relate parent key to child-array values

n_young
New Member

Source JSON Structure:

{
      "working": {
        "https://site.number.one": [
          {
            "metric": "html",
            "type": "requests",
            "value": 1,
            "friendlyValue": 1,
            "limit": 1,
            "friendlyLimit": 1,
            "limitType": "max",
            "status": "working"
          },
          {
            "metric": "html",
            "type": "transferSize",
            "value": 10662,
            "friendlyValue": "10.4 KB",
            "limit": 20000,
            "friendlyLimit": "19.5 KB",
            "limitType": "max",
            "status": "working"
          }
        ],
        "https://site.number.two": [
          {
            "metric": "html",
            "type": "requests",
            "value": 1,
            "friendlyValue": 1,
            "limit": 1,
            "friendlyLimit": 1,
            "limitType": "max",
            "status": "working"
          },
          {
            "metric": "html",
            "type": "transferSize",
            "value": 10662,
            "friendlyValue": "10.4 KB",
            "limit": 20000,
            "friendlyLimit": "19.5 KB",
            "limitType": "max",
            "status": "working"
          }
        ]
  },
  "failing": {
    "https://site.number.two": [
      {
        "metric": "firstPaint",
        "type": "timings",
        "value": 1392,
        "friendlyValue": "1.392 s",
        "limit": 1000,
        "friendlyLimit": "1.000 s",
        "limitType": "max",
        "status": "failing"
      }
    ],
    "https://site.number.four": [
      {
        "metric": "firstPaint",
        "type": "timings",
        "value": 1665,
        "friendlyValue": "1.665 s",
        "limit": 1000,
        "friendlyLimit": "1.000 s",
        "limitType": "max",
        "status": "failing"
      }
  ]
  }
}

So basically there are two root-level objects ("working" and "failing"); within each of those there are "site"/URL keys, and under each URL there's an array of metric/value objects.

What I'm trying to do is parse out the URLs (sites) and their related metrics for comparison/charting. I can't figure out the right combination of spath / field extraction / etc., so that I can chart things like the "html" size as it relates to the "firstPaint" metric. There are hundreds of other metrics; I just pulled out a few in that source JSON as an example.

Ideally, what I'm left with after the parsing is the ability to work with these fields: site, metric name, value, status. If I can figure that out, I'm sure I can expand the logic to include the other fields as needed.
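For reference, the flattening being asked for can be sketched outside Splunk in a few lines of Python. This is purely illustrative (the sample below is a trimmed copy of the JSON above); the point is the three nested loops: status group, then arbitrary site key, then metric entry.

```python
import json

# Trimmed copy of the sample structure from the question above.
sample = '''
{
  "working": {
    "https://site.number.one": [
      {"metric": "html", "type": "requests", "value": 1, "status": "working"},
      {"metric": "html", "type": "transferSize", "value": 10662, "status": "working"}
    ]
  },
  "failing": {
    "https://site.number.two": [
      {"metric": "firstPaint", "type": "timings", "value": 1392, "status": "failing"}
    ]
  }
}
'''

def flatten(doc):
    """Return one flat row per metric entry: site, metric, value, status."""
    rows = []
    for group in doc.values():               # the "working" / "failing" objects
        for site, metrics in group.items():  # arbitrary URL keys
            for entry in metrics:            # array of metric objects
                rows.append({
                    "site": site,
                    "metric": entry["metric"],
                    "value": entry["value"],
                    "status": entry["status"],
                })
    return rows

rows = flatten(json.loads(sample))
```

Whatever Splunk search ends up doing the extraction, this is the target shape: one row per metric, each carrying its parent site key.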

Hopefully I've explained this well enough to describe what I'm trying to do.


to4kawa
Ultra Champion
| makeresults 
| eval _raw="{
       \"working\": {
         \"https://site.number.one\": [
           {
             \"metric\": \"html\",
             \"type\": \"requests\",
             \"value\": 1,
             \"friendlyValue\": 1,
             \"limit\": 1,
             \"friendlyLimit\": 1,
             \"limitType\": \"max\",
             \"status\": \"working\"
               },
           {
             \"metric\": \"html\",
             \"type\": \"transferSize\",
             \"value\": 10662,
             \"friendlyValue\": \"10.4 KB\",
             \"limit\": 20000,
             \"friendlyLimit\": \"19.5 KB\",
             \"limitType\": \"max\",
             \"status\": \"working\"
               }
         ],
         \"https://site.number.two\": [
           {
             \"metric\": \"html\",
             \"type\": \"requests\",
             \"value\": 1,
             \"friendlyValue\": 1,
             \"limit\": 1,
             \"friendlyLimit\": 1,
             \"limitType\": \"max\",
             \"status\": \"working\"
               },
           {
             \"metric\": \"html\",
             \"type\": \"transferSize\",
             \"value\": 10662,
             \"friendlyValue\": \"10.4 KB\",
             \"limit\": 20000,
             \"friendlyLimit\": \"19.5 KB\",
             \"limitType\": \"max\",
             \"status\": \"working\"
               }
         ]
   },
   \"failing\": {
     \"https://site.number.two\": [
       {
         \"metric\": \"firstPaint\",
         \"type\": \"timings\",
         \"value\": 1392,
         \"friendlyValue\": \"1.392 s\",
         \"limit\": 1000,
         \"friendlyLimit\": \"1.000 s\",
         \"limitType\": \"max\",
         \"status\": \"failing\"
           }
     ],
     \"https://site.number.four\": [
       {
         \"metric\": \"firstPaint\",
         \"type\": \"timings\",
         \"value\": 1665,
         \"friendlyValue\": \"1.665 s\",
         \"limit\": 1000,
         \"friendlyLimit\": \"1.000 s\",
         \"limitType\": \"max\",
         \"status\": \"failing\"
           }
   ]
   }
 }" 
| spath path=working output=raw 
| appendpipe 
    [ spath path=failing output=raw] 
| table raw 
| eval raw=replace(raw,"\]\,","] }#{") 
| makemv delim="#" raw 
| stats count by raw 
| fields - count 
| rex field=raw "(?<site_name>http.+)\":" 
| rex field=raw mode=sed "s/(?s)^\{.+\[//g" 
| rex field=raw mode=sed "s/(?s)\].+$//g" 
| eval raw=replace(raw,"\}\,","}#") 
| makemv delim="#" raw 
| eval tmp=mvmap(raw,raw."Z".random()) 
| stats values(site_name) as site_name by tmp 
| spath input=tmp 
| fields - tmp 
| table site_name metric type value friendlyValue limit friendlyLimit limitType status

Hi, @n_young
From there, you can carry out your tallying and charting.


vik_splunk
Communicator

@n_young

What particular use case are you trying to satisfy? Some information could help.

Natively, Splunk should be able to parse the fields necessary without having to use spath/regex.

I was able to ingest the JSON provided, and a table plus transpose produces the fields for the most part. Based on the use case, we can tweak the query to produce the necessary output.

(screenshot: table/transpose output showing the extracted fields)


n_young
New Member

@vik_splunk The issue is that the "site" names are diverse/variable; I just used those as examples for posting the question here. The actual URLs/sites will be completely different, and there will be hundreds of them in the same JSON source file(s).

So, while I could do something like | table site.number.two | transpose | ..., that would only work for the sites I specify.

I figured I'd have to use something like spath / field extraction to pull each of the sites out into an individual field.


vik_splunk
Communicator

I am a bit confused, @n_young, because it seems to me you are referring to parsing and event breaking at index time, as opposed to search-time field extraction.

To achieve the desired event breaking, I am afraid you will have to modify the source data so it is delivered in the desired format, rather than attempting to have Splunk parse the events into that format.

I do understand that the field names would differ from the sample above. What I was trying to understand was the use case, to see if the same could be accomplished within the field extraction that Splunk performs inherently.
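If reshaping the feed before ingest ever becomes an option, one sketch of that pre-processing step (plain Python; function name and structure are illustrative, not a shipped tool) is to emit one JSON object per line, each carrying its parent site key, so Splunk's native JSON line breaking and KV extraction can take over:

```python
import json

def to_ndjson(doc):
    """Flatten the nested working/failing -> site -> [metrics] structure
    into newline-delimited JSON: one self-contained event per metric."""
    lines = []
    for group in doc.values():
        for site, metrics in group.items():
            for entry in metrics:
                # Copy the metric object and attach the parent site key.
                lines.append(json.dumps({"site": site, **entry}))
    return "\n".join(lines)

nested = {
    "working": {
        "https://site.number.one": [
            {"metric": "html", "type": "requests", "value": 1, "status": "working"}
        ]
    }
}
print(to_ndjson(nested))
```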


n_young
New Member

@vik_splunk I don't have the ability to change the source JSON structure. It's coming directly out of a product, and I don't have control over the format/structure.

My question is most closely aligned with this earlier post, but its solution doesn't seem to work with my source structure.

https://answers.splunk.com/answers/174698/leveraging-spath-with-arbitrary-json-fields.html


n_young
New Member

What I'm trying to do is convert each of the sub-arrays into events, so that I can parse/chart things as you normally would with individual events.

I guess ultimately that means figuring out how to parse the source JSON into individual events, where each event's time would be that of the file, and each event would also include the parent "site".

So, the event would end up being something like this:

     {
         "site": "https://site.number.one",
         "metric": "html",
         "type": "requests",
         "value": 1,
         "friendlyValue": 1,
         "limit": 1,
         "friendlyLimit": 1,
         "limitType": "max",
         "status": "working"
     }
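That per-event shape, with the file's timestamp attached, can be sketched in Python (outside Splunk; using the file's mtime as the event time is an assumption for illustration, and the function name is hypothetical):

```python
import json
import os

def events_from_file(path):
    """Load the nested JSON file and return one event dict per metric entry,
    stamped with the file's modification time as the event time (assumption)."""
    with open(path) as f:
        doc = json.load(f)
    file_time = os.path.getmtime(path)
    events = []
    for group in doc.values():               # "working" / "failing"
        for site, metrics in group.items():  # arbitrary URL keys
            for entry in metrics:            # array of metric objects
                events.append({"_time": file_time, "site": site, **entry})
    return events
```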