I have a service which runs an SQL query over a database and returns a simple table with two columns, "Item" and "Item count", which contain the name of an item and its quantity. These tables have no fixed number of rows. The result is output as a JSON file of the format below:
{
  "odata.metadata":"<service link information goes here>","value":[
    { "ItemName":"Item1","ItemCount":23 },
    { "ItemName":"Item2","ItemCount":12 },
    // etc…
    { "ItemName":"Item71","ItemCount":2 }
  ]
}
I have managed to pull this data into Splunk via the Modular REST API app. However, in its current state it's not very useful in search. For example, I cannot search for items with an item count less than 10, nor can I search for an item of a specific name, since all the item rows are grouped into one massive event in Splunk. Splunk has managed to extract the fields, but they also have an ugly name which isn't attractive for search either, e.g. "value{}.ItemCount" rather than just "ItemCount".
What I would like to do is extract each individual row of this table/JSON data and turn it into its own event, so each event from one table would have the table's timestamp and a field for each attribute (item name and item count). Each event would then look more like a traditional log line, as below:
15/12/2015 09:47 Item1 23
15/12/2015 09:47 Item2 12
15/12/2015 09:47 Item71 2
Is this possible with the custom response handlers? I've taken a look at the example code in responsehandlers.py, and the examples all seem very simple: they convert the data into another format and just print it. I need to extract each field and then submit it as a distinct event! Another priority is extracting this data into the other format before search time; if I could handle it in the response handler, that would be perfect.
Many thanks.
Simple example; you can jazz it up as you require. Paste it into responsehandlers.py and declare it in your stanza setup.
class MyJSONArrayHandler:

    def __init__(self, **args):
        pass

    def __call__(self, response_object, raw_response_output, response_type, req_args, endpoint):
        if response_type == "json":
            output = json.loads(raw_response_output)
            # emit each element of the "value" array as its own event
            for entry in output['value']:
                print_xml_stream(json.dumps(entry))
        else:
            # not JSON: pass the raw response through unchanged
            print_xml_stream(raw_response_output)
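To see what the handler emits, here's a standalone check fed a response shaped like yours. Note that `print_xml_stream` is stubbed with a plain `print` here, since the real helper lives in responsehandlers.py:

```python
import json

def print_xml_stream(s):
    # stand-in for the helper provided by responsehandlers.py
    print(s)

class MyJSONArrayHandler:
    def __init__(self, **args):
        pass

    def __call__(self, response_object, raw_response_output, response_type, req_args, endpoint):
        if response_type == "json":
            output = json.loads(raw_response_output)
            # one event per element of the "value" array
            for entry in output['value']:
                print_xml_stream(json.dumps(entry))
        else:
            print_xml_stream(raw_response_output)

sample = ('{"odata.metadata":"https://","value":['
          '{"ItemName":"Item1","ItemCount":23},'
          '{"ItemName":"Item2","ItemCount":12}]}')
MyJSONArrayHandler()(None, sample, "json", None, None)
# prints one JSON object per line, one per row of the table
```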
We have been using this app for some JSON data, and "spath" works very well for breaking the data down without the need for a lot of extraction/regex.
index=foo sourcetype=bar |spath
Is the print_xml_stream necessary? Would a normal print work, and could I then print it in any format I like?
Yes.
It is not reformatting the event as XML. It wraps the event (in any format) in an internal Splunk XML-based protocol used to transport it to Splunk.
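For the curious, here is a rough sketch of what a helper like print_xml_stream is doing; the exact envelope attributes here are my assumption for illustration, not copied from the app:

```python
from xml.sax.saxutils import escape

def print_xml_stream(s):
    # Wrap the event payload (any text format) in the XML envelope that
    # splunkd's modular-input transport expects. The payload itself is
    # only XML-escaped, not reformatted. Envelope attributes are assumed.
    print('<stream><event unbroken="1"><data>%s</data><done/></event></stream>' % escape(s))

print_xml_stream('{"ItemName":"Item1","ItemCount":23}')
# the JSON payload arrives intact inside <data>…</data>
```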
Many thanks for your help today. If you're still about, I've just asked another question through a co-workers account here...
https://answers.splunk.com/answers/334524/accessing-response-handler-arguments-from-response-1.html
Cheers!
Thanks for your answer. I spent a bit of time on a Python program and came up with something similar, basically the same as yours, but I'm extracting it all into variables beforehand and adding a timestamp. It should be straightforward to move it across to the response handler file now.
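In case it helps anyone else, here's a sketch of what that timestamp-adding variant might look like once moved into responsehandlers.py. `print_xml_stream` is stubbed so the snippet runs standalone, and the field names are taken from the example above:

```python
import json
import time

def print_xml_stream(s):
    # stand-in for the helper provided by responsehandlers.py
    print(s)

class TimestampedJSONArrayHandler:
    def __init__(self, **args):
        pass

    def __call__(self, response_object, raw_response_output, response_type, req_args, endpoint):
        if response_type == "json":
            output = json.loads(raw_response_output)
            # stamp every row with the time the table was collected
            stamp = time.strftime("%d/%m/%Y %H:%M")
            for entry in output.get("value", []):
                print_xml_stream(json.dumps({
                    "timestamp": stamp,
                    "ItemName": entry.get("ItemName"),
                    "ItemCount": entry.get("ItemCount"),
                }))
        else:
            print_xml_stream(raw_response_output)
```

Each emitted event then carries its own timestamp field alongside ItemName and ItemCount, matching the log-style layout described in the question.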
Can you post an actual, complete JSON response example?
I will then show you a custom response handler you can plug in.
{
"odata.metadata":"https://","value":[
{
"UnitShortName":"<short name>","EmployeeCount":27
},{
"UnitShortName":"<short name>","EmployeeCount":59
},{
"UnitShortName":"<short name>","EmployeeCount":23
},{
"UnitShortName":"<short name>","EmployeeCount":23
}
]
}
I've had to censor some things, such as the IP address, the REST link, and the unit names, as this is customer data. But I think you can get an idea of the structure from this. Thanks!