All Apps and Add-ons

Extracting multiple events from a large single JSON output sent via a REST service

jpanderson
Path Finder

I have a service which runs an SQL query over a database and returns a simple table with two columns, “Item” and “Item count”, which contain the name of an item and its quantity. These tables will have no fixed number of rows. It is output as a Json file of the format below:

{ "odata.metadata":"<service link information goes here>”:[ 
{ "ItemName":"Item1","ItemCount":23
}
{ "ItemName":"Item2","ItemCount":12
   // etc…
}
{ “ItemName”:”Item71”,”ItemCount”:2
}

I have managed to pull this data into Splunk via the modular REST API app. However, in it’s current state it’s not very useful in search. For example I cannot do a search for items with item count less than 10, or I cannot search for an item of a specific name, since each item row is grouped into one massive event in Splunk. Splunk has managed to extract the fields, but they also have an ugly name which isn’t attractive for search either, e.g. “value{}.ItemCount” rather than just “itemCount”.
What I would like to do is extract each individual row of this table/JSON data and assign it to an event, so each event from one table would have the tables timestamp, and a field for each attribute (item name and item count). So each event will look more like a traditional log file, like below:

15/12/2015 09:47  Item1  23
15/12/2015 09:47  Item2  12
15/12/2015 09:47  Item71  2

Is this possible with the Custom Response Handlers? I’ve taken a look at the example code in responsehandlers.py and they all seem to be very simple and convert it into another format and then just print it. I need to extract each field and then submit it as a distinct event! Also, another priority is being able to extract this data into the other format before search time, if I could handle it in the "Response Handler" that would be perfect.

Many thanks.

1 Solution

Damien_Dallimor
Ultra Champion

Simple example , you can jazz it up as you require. Paste into responsehandlers.py and declare it in your stanza setup.

class MyJSONArrayHandler:

    def __init__(self,**args):
        pass

    def __call__(self, response_object,raw_response_output,response_type,req_args,endpoint):
        if response_type == "json":
            output = json.loads(raw_response_output)

            for entry in output['value']:
                print_xml_stream(json.dumps(entry))
        else:
            print_xml_stream(raw_response_output)

alt text

View solution in original post

JDukeSplunk
Builder

We have been using this app for some JSON data, and "spath" works very well for breaking the data down without the need for a lot of extraction/regex.

index=foo sourcetype=bar |spath

0 Karma

Damien_Dallimor
Ultra Champion

Simple example , you can jazz it up as you require. Paste into responsehandlers.py and declare it in your stanza setup.

class MyJSONArrayHandler:

    def __init__(self,**args):
        pass

    def __call__(self, response_object,raw_response_output,response_type,req_args,endpoint):
        if response_type == "json":
            output = json.loads(raw_response_output)

            for entry in output['value']:
                print_xml_stream(json.dumps(entry))
        else:
            print_xml_stream(raw_response_output)

alt text

jpanderson
Path Finder

Is the print_xml_stream neccessary? Would a normal print work, and could I then print it in any format I like?

0 Karma

Damien_Dallimor
Ultra Champion

Yes.
It is not reformatting the event in XML.
It is for wrapping the event(in any format) in an internal Splunk XML based protocol for transporting to Splunk.

jpanderson
Path Finder

Many thanks for your help today. If you're still about, I've just asked another question through a co-workers account here...
https://answers.splunk.com/answers/334524/accessing-response-handler-arguments-from-response-1.html

Cheers!

0 Karma

jpanderson
Path Finder

Thanks for your answer, I spent a bit of time on a python program and came up with something similar, basically the same as yours but I'm extracting it all into variables beforehand, and adding a timestamp, should be straightforward to move it across to the response handler file now.

0 Karma

Damien_Dallimor
Ultra Champion

Can you post an actual complete JSON response example ?

I will then show you a custom response handler you can plug in.

0 Karma

jpanderson
Path Finder
{
  "odata.metadata":"https://","value":[
    {
      "UnitShortName":"<short name>","EmployeeCount":27
    },{
      "UnitShortName":"<short name>","EmployeeCount":59
    },{
      "UnitShortName":"<short name>","EmployeeCount":23
    },{
      "UnitShortName":"<short name>","EmployeeCount":23
    }
  ]
}

I've had to censor some things such as the ip address and rest link, and the unit names, as this is customer data. But I think you can get an idea of the structure from this. Thanks!

0 Karma
Get Updates on the Splunk Community!

What's new in Splunk Cloud Platform 9.1.2312?

Hi Splunky people! We are excited to share the newest updates in Splunk Cloud Platform 9.1.2312! Analysts can ...

What’s New in Splunk Security Essentials 3.8.0?

Splunk Security Essentials (SSE) is an app that can amplify the power of your existing Splunk Cloud Platform, ...

Let’s Get You Certified – Vegas-Style at .conf24

Are you ready to level up your Splunk game? Then, let’s get you certified live at .conf24 – our annual user ...