REST API Modular Input: How to remove metadata from JSON and split it into multiple events

Aftab_alam
Explorer

Hi All,
I am pulling this JSON from a REST input.
{"meta":{"bucket":"second","bucketsize":"1","tstart":1510090302753,"tend":1510093902753,"group":{"mname":{"desc":"Monitor name of measurement","type":"string","name":"Monitor Name"},"monid":{"desc":"Slot ID of measurement","type":"number","name":"Slot ID"}},"monid":[22762032],"metrics":{"count":{"desc":"Number of total hits or datapoints","unit":"number","name":"Total number of hits"},"avail":{"desc":"Average Availability of selected Measurements","unit":"%","name":"Availability"},"nwtme":{"desc":"Total time of all network traffic measured by the agent","unit":"ms","name":"Total Network time"},"uxtme":{"desc":"Full User Experience time as reported by the browser","unit":"ms","name":"User Experience"}},"limit":100000,"dbtime":3,"dbname":"db_dt_wa_raw_5 ","apiversion":"48.0.0.201709291816.154491-8"},"data":[{"mname":"NM - SearchAndPurchase_Guest - Chrome Agent","monid":22762032,"count":1,"nwtme":37874,"uxtme":118764,"avail":1,"mtime":1510090652753},{"mname":"NM - SearchAndPurchase_Guest - Chrome Agent","monid":22762032,"count":1,"nwtme":33711,"uxtme":120795,"avail":1,"mtime":1510091094753},{"mname":"NM - SearchAndPurchase_Guest - Chrome Agent","monid":22762032,"count":1,"nwtme":44886,"uxtme":134951,"avail":1,"mtime":1510091548753},{"mname":"NM - SearchAndPurchase_Guest - Chrome Agent","monid":22762032,"count":1,"nwtme":32109,"uxtme":114532,"avail":1,"mtime":1510091999753},{"mname":"NM - SearchAndPurchase_Guest - Chrome Agent","monid":22762032,"count":1,"nwtme":31346,"uxtme":116506,"avail":1,"mtime":1510092457753},{"mname":"NM - SearchAndPurchase_Guest - Chrome Agent","monid":22762032,"count":1,"nwtme":34060,"uxtme":126494,"avail":1,"mtime":1510092909753},{"mname":"NM - SearchAndPurchase_Guest - Chrome Agent","monid":22762032,"count":1,"nwtme":34784,"uxtme":120856,"avail":1,"mtime":1510093359753}]}

I would like to remove the header up to and including "data":[ as well as the trailing ]} at the end of the JSON,
and create a separate event for each element of the array, for example:
{"mname":"NM - SearchAndPurchase_Guest - Chrome Agent","monid":22762032,"count":1,"nwtme":34784,"uxtme":120856,"avail":1,"mtime":1510093359753}

I have already added the settings below to props.conf and restarted Splunk, but it does not work:
TIME_PREFIX = \"mtime\":
SHOULD_LINEMERGE = false
LINE_BREAKER = ,{|}(,){
TRUNCATE = 200000
SEDCMD-remove_header = s/{\"meta[^[][//g
SEDCMD-remove_footer = s/]}//g
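
One way to sanity-check patterns like these outside Splunk is to emulate them in Python. The sketch below is only an approximation: it uses Python's re module as a stand-in for Splunk's regex engine, splits wherever a closing brace is followed by a comma and an opening brace (discarding only the captured comma, which is how LINE_BREAKER treats its first capture group), and then strips the header and footer from each resulting event. The function names and the shortened sample payload are hypothetical. Note that the patterns here escape the literal braces and brackets, and that the header pattern runs up to "data":[ rather than stopping at the first [, because the meta block itself contains a [ in "monid":[...].

```python
import re

# Escaped patterns: literal {, }, [ and ] must be backslash-escaped in a regex.
HEADER = re.compile(r'^\{"meta".*?"data":\[')   # everything up to and including "data":[
FOOTER = re.compile(r'\]\}$')                   # the closing ]} of the payload
BREAKER = re.compile(r'\}(,)\{')                # break between },{ and drop the comma


def emulate_line_breaker(raw):
    """Split raw text at each BREAKER match, discarding only the captured group."""
    events, start = [], 0
    for match in BREAKER.finditer(raw):
        events.append(raw[start:match.start(1)])  # text before the comma stays in this event
        start = match.end(1)                      # next event starts right after the comma
    events.append(raw[start:])
    return events


def emulate_sedcmd(event):
    """Apply the header and footer substitutions to a single, already-broken event."""
    return FOOTER.sub("", HEADER.sub("", event))


# Shortened, hypothetical payload with the same shape as the response above.
raw = ('{"meta":{"monid":[22762032],"limit":100000},'
       '"data":[{"mname":"a","mtime":1},{"mname":"b","mtime":2}]}')

events = [emulate_sedcmd(e) for e in emulate_line_breaker(raw)]
for event in events:
    print(event)
```

If the emulation produces one clean JSON object per event, the same escaped patterns are a reasonable starting point for props.conf, though they still need to be verified against Splunk itself.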

hardikJsheth
Motivator

I assume you have a Python script that pulls this data. The best approach would be to make this modification in that script.

Damien_Dallimor
Ultra Champion

The REST API Modular Input App has been the de facto standard for getting data from REST APIs into Splunk for years now. It has a custom pre-processor framework (as detailed in my answer below regarding response handlers) for manipulating the raw HTTP response into the format in which you want to index it in Splunk.

Damien_Dallimor
Ultra Champion

You should use a custom response handler added to rest_ta/bin/responsehandlers.py for this.
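
The screenshots that accompanied this answer are not preserved here. As a rough sketch only, a handler along the following lines would emit one event per element of the "data" array and drop the "meta" block. It assumes the handler interface used by the classes already present in responsehandlers.py (a class with __init__(self, **args) and a __call__ that receives the response object, the raw output, the response type, the request args and the endpoint) and that file's print_xml_stream helper; check both against your installed copy. The names DynatraceDataHandler and split_data_events are hypothetical, and print_xml_stream is replaced here by a plain print stand-in so the snippet runs on its own.

```python
import json


def print_xml_stream(s):
    # Stand-in (assumption): in rest_ta this helper is expected to wrap the
    # string in the modular input XML stream format. Here it simply prints.
    print(s)


def split_data_events(raw_response_output):
    """Return each element of the top-level "data" array as a compact JSON string."""
    payload = json.loads(raw_response_output)
    return [json.dumps(item, separators=(",", ":")) for item in payload.get("data", [])]


class DynatraceDataHandler:
    """Hypothetical handler: one event per "data" element, "meta" discarded."""

    def __init__(self, **args):
        pass

    def __call__(self, response_object, raw_response_output, response_type, req_args, endpoint):
        if response_type == "json":
            for event in split_data_events(raw_response_output):
                print_xml_stream(event)
        else:
            # Pass anything that is not JSON through unchanged.
            print_xml_stream(raw_response_output)


if __name__ == "__main__":
    sample = '{"meta":{"limit":100000},"data":[{"mname":"a","mtime":1},{"mname":"b","mtime":2}]}'
    DynatraceDataHandler()(None, sample, "json", {}, "https://example.invalid/")
```

Because each array element is handed over as its own event, the sourcetype should no longer need the LINE_BREAKER and SEDCMD settings to split the payload or strip the header.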

Aftab_alam
Explorer

Thanks for the help. It works perfectly.

Aftab_alam
Explorer

Hi Damien,
How do I increase the line-length limit for the REST input? I am getting the warning below, and increasing the TRUNCATE limit in props.conf does not help.
11-09-2017 10:50:53.087 -0800 WARN LineBreakingProcessor - Truncating line because limit of 10000 bytes has been exceeded with a line length >= 130439 - data_source="rest://Dynatrace Synthetic Monitors", data_host="portal.dynatrace.com", data_sourcetype="dynatrace_api"

Thanks for the help.

Aftab_alam
Explorer

I added this to props.conf in /etc/apps/search/local/:
[dynatrace_api]
TIME_PREFIX = \"mtime\":
SHOULD_LINEMERGE = false
BREAK_ONLY_BEFORE = ,{
LINE_BREAKER = }(,){
TRUNCATE = 200000
SEDCMD-remove_header = s/{\"meta.*\"data\":[//g
SEDCMD-remove_footer = s/]}//g

The same sed expressions work at search time in the search UI:
index=main sourcetype=dynatrace_api host="dynatrace-api" | rex field=_raw mode=sed "s/{\"meta.*\"data\":[//g" | rex field=_raw mode=sed "s/]}//g" | table _raw
but I still see the complete JSON getting indexed.

Aftab_alam
Explorer

I tried changing the SEDCMD settings:
[dynatrace_api]
TIME_PREFIX = \"mtime\":
SHOULD_LINEMERGE = false
BREAK_ONLY_BEFORE = ,{
LINE_BREAKER = }(,){
TRUNCATE = 200000
SEDCMD-remove_header = s/{\"meta.*\"data\":[//g
SEDCMD-remove_footer = s/]}//g

It works in the search UI, but I am still seeing the complete JSON getting indexed:
index=main sourcetype=dynatrace_api host="dynatrace-api" | rex field=_raw mode=sed "s/{\"meta.*\"data\":[//g" | rex field=_raw mode=sed "s/]}//g" | table _raw
