I'm having serious issues in Splunk with searching JSON structures.
I really don't understand why JSON isn't easier to search, considering that Splunk recognises the structure and lets you expand it with syntax highlighting.
The issue I'm having is that it appears to be impossible to produce a simple table of
value.data.timestamp value.name.localizedValue value.data.count
ultimately across multiple value.name.localizedValue entries, in order to build a timechart that shows metrics over time, or to use this data inside ITSI as metrics for KPIs.
Above is one expanded data node. Inside this value node there are multiple data nodes, each with this structure, and each data node contains one type of name.localizedValue (effectively the metric name).
I have tried, unsuccessfully, to use spath and mvexpand (admittedly, I don't fully understand these commands and how to use them to best effect) and I can't get them to format the data the way I want to display it. But at the same time, why should we have to put over 500 characters into a search bar in order to search this type of structured data?
If the solution is ultimately to flatten the data so that you can search it, why have JSON at all?
The way I'd expect to see the data is:
Table:
| value.data.timestamp | value.name.localizedValue | value.data.count |
| 2019-01-23T13:10:00Z | CPU Time | 15 |
| 2019-01-23T13:11:00Z | CPU Time | 16 |
As an aside, is there any way to make this process easier?
Ok, so there are even different measurements in the same event.
So:
step 1: split out the different measurements (value{})
step 2: extract the name of each measurement (name.value) and split out the datapoints (data{})
step 3: extract the timestamp and count of each data point
...your search to get to this data...
| spath output=value path=value{}
| table value
| mvexpand value
| spath input=value path=name.value output=name
| spath input=value path=data{} output=data
| table data,name
| mvexpand data
| spath input=data
| fields - data
| eval _time = strptime(timeStamp,"%Y-%m-%dT%H:%M:%S%Z")
| table _time,name,count
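To go from this flattened table to the timechart mentioned in the question, one could append something like the following (the span and the aggregation function are assumptions; adjust them to the data's granularity):

```
| timechart span=1m max(count) BY name
```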
Field aliases helped us a lot. Something like -
FIELDALIAS-alias01 = "apiRequest.reqInfo.clientID" AS clientID
FIELDALIAS-alias02 = "apiResponse.apiResponse.services{}.servHttpStatus" AS servHttpStatus
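For context, FIELDALIAS settings like these live in props.conf under the relevant sourcetype stanza (the stanza name below is a placeholder, not from the original post):

```
# props.conf -- stanza name is an assumption
[your_json_sourcetype]
FIELDALIAS-alias01 = "apiRequest.reqInfo.clientID" AS clientID
FIELDALIAS-alias02 = "apiResponse.apiResponse.services{}.servHttpStatus" AS servHttpStatus
```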
Took this as an exercise in my spare time but, to be honest, I'm sceptical about how well this scales.
... your search for events...
| fields _time _raw
| spath path="value{}"
| mvexpand value{}
| streamstats count AS level1
| eval _raw='value{}'
| spath
| fields - _raw
| mvexpand data{}.timeStamp
| streamstats count AS level2 BY level1
| eval value=mvindex('data{}.count',level2-1)
| eval _time=strptime(('data{}.timeStamp'),"%Y-%m-%dT%H:%M:%S%Z")
| fields _time value name.* type unit
| table *
This does exactly as expected!
Thank you so much.
I'm starting to get a better appreciation for JSON now; I just wish it were easier to search in Splunk.
@Lazarix
Can you please try the below search?
YOUR_SEARCH
| spath path=value{} output=data
| mvexpand data
| eval _raw=data
| kv
| spath path=data{} output=data
| mvexpand data
| eval _raw=data
| kv
| table id "name.localizedValue" "name.value" type unit data timeStamp count
My Sample Search:
| makeresults | eval _raw="{\"value\": [{\"data\": [{\"timeStamp\": \"2019-01-23T13:10:00Z\",\"count\": 15.0},{\"timeStamp\": \"2019-01-23T13:11:00Z\",\"count\": 16.0},{\"timeStamp\": \"2019-01-23T13:12:00Z\",\"count\": 15.0},{\"timeStamp\": \"2019-01-23T13:13:00Z\",\"count\": 12.0},{\"timeStamp\": \"2019-01-23T13:14:00Z\",\"count\": 13.0}],\"id\": \"1\",\"name\": {\"value\": \"CpuTime\",\"localizedValue\": \"CPU Time\"},\"type\": \"Microsoft.Insights/metrics\",\"unit\": \"Seconds\"},{\"data\": [{\"timeStamp\": \"2019-01-23T13:10:00Z\",\"count\": 2.0},{\"timeStamp\": \"2019-01-23T13:11:00Z\",\"count\": 3.0},{\"timeStamp\": \"2019-01-23T13:12:00Z\",\"count\": 2.0},{\"timeStamp\": \"2019-01-23T13:13:00Z\",\"count\": 2.0},{\"timeStamp\": \"2019-01-23T13:14:00Z\",\"count\": 2.0}],\"id\": \"2\",\"name\": {\"value\": \"Requests\",\"localizedValue\": \"Requests\"},\"type\": \"Microsoft.Insights/metrics\",\"unit\": \"Count\"},{\"data\": [{\"timeStamp\": \"2019-01-23T13:10:00Z\",\"count\": 12.0},{\"timeStamp\": \"2019-01-23T13:11:00Z\",\"count\": 12.0},{\"timeStamp\": \"2019-01-23T13:12:00Z\",\"count\": 12.0},{\"timeStamp\": \"2019-01-23T13:13:00Z\",\"count\": 12.0},{\"timeStamp\": \"2019-01-23T13:14:00Z\",\"count\": 12.0}],\"id\": \"3\",\"name\": {\"value\": \"AppConnections\",\"localizedValue\": \"Connections\"},\"type\": \"Microsoft.Insights/metrics\",\"unit\": \"Count\"},{\"data\": [{\"timeStamp\": \"2019-01-23T13:10:00Z\",\"count\": 0.0},{\"timeStamp\": \"2019-01-23T13:11:00Z\",\"count\": 0.0},{\"timeStamp\": \"2019-01-23T13:12:00Z\",\"count\": 0.0},{\"timeStamp\": \"2019-01-23T13:13:00Z\",\"count\": 0.0},{\"timeStamp\": \"2019-01-23T13:14:00Z\",\"count\": 0.0}],\"id\": \"4\",\"name\": {\"value\": \"Http4xx\",\"localizedValue\": \"Http 4xx\"},\"type\": \"Microsoft.Insights/metrics\",\"unit\": \"Count\"},{\"data\": [{\"timeStamp\": \"2019-01-23T13:10:00Z\",\"count\": 0.0},{\"timeStamp\": \"2019-01-23T13:11:00Z\",\"count\": 0.0},{\"timeStamp\": 
\"2019-01-23T13:12:00Z\",\"count\": 0.0},{\"timeStamp\": \"2019-01-23T13:13:00Z\",\"count\": 0.0},{\"timeStamp\": \"2019-01-23T13:14:00Z\",\"count\": 0.0}],\"id\": \"5\",\"name\": {\"value\": \"Http5xx\",\"localizedValue\": \"Http Server Errors\"},\"type\": \"Microsoft.Insights/metrics\",\"unit\": \"Count\"},{\"data\": [{\"timeStamp\": \"2019-01-23T13:10:00Z\",\"count\": 17.0},{\"timeStamp\": \"2019-01-23T13:11:00Z\",\"count\": 17.0},{\"timeStamp\": \"2019-01-23T13:12:00Z\",\"count\": 18.0},{\"timeStamp\": \"2019-01-23T13:13:00Z\",\"count\": 13.0},{\"timeStamp\": \"2019-01-23T13:14:00Z\",\"count\": 17.0}],\"id\": \"6\",\"name\": {\"value\": \"MemoryWorkingSet\",\"localizedValue\": \"Memory working set\"},\"type\": \"Microsoft.Insights/metrics\",\"unit\": \"Bytes\"},{\"data\": [{\"timeStamp\": \"2019-01-23T13:10:00Z\",\"count\": 2.0},{\"timeStamp\": \"2019-01-23T13:11:00Z\",\"count\": 3.0},{\"timeStamp\": \"2019-01-23T13:12:00Z\",\"count\": 2.0},{\"timeStamp\": \"2019-01-23T13:13:00Z\",\"count\": 2.0},{\"timeStamp\": \"2019-01-23T13:14:00Z\",\"count\": 2.0}],\"id\": \"7\",\"name\": {\"value\": \"AverageResponseTime\",\"localizedValue\": \"Average Response Time\"},\"type\": \"Microsoft.Insights/metrics\",\"unit\": \"Seconds\"}]}" | spath path=value{} output=data | mvexpand data | eval _raw= data | kv | spath path=data{} output=data | mvexpand data | eval _raw=data | kv | table id "name.localizedValue" "name.value" type unit data timeStamp count
Thanks
Generically, Splunk works best with more atomic events, so to speak: events with a single timestamp and some fields with values. Formatting those as JSON works fine.
Splunk doesn't work as easily when you have events like this, which actually contain multiple events (or data points) in one. You'll then need to do some processing to split them out. Not impossible, but not always straightforward if you're not experienced with it.
It would help if you could share a sample raw json event, so we can play a bit with that and give you some suggestions on how to best split it into the individual data points.
On a high level, what I would try is to extract the data points into a multivalued field 'data', then mvexpand on that field, then parse out the individual fields of each data point (time, count), parse the timestamp and assign it to _time. Then you should be able to apply the timechart command.
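A minimal SPL sketch of that approach, assuming the field names from the sample JSON (value{}, data{}, timeStamp, count); the timechart span and aggregation are placeholders:

```
...your search...
| spath output=metric path=value{}
| mvexpand metric
| spath input=metric path=name.value output=name
| spath input=metric path=data{} output=point
| mvexpand point
| spath input=point
| eval _time=strptime(timeStamp,"%Y-%m-%dT%H:%M:%S%Z")
| timechart span=1m max(count) BY name
```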
Thanks for the information Frank.
I've uploaded an example of the json here:
https://gist.github.com/IanMoroney/34369128701f67ce0a1b159332356665
I understand the approach around atomic data points, and I agree that Splunk is more suited to that. I suppose I wouldn't expect to be forced to flatten a structured, relational data source in order to read it, as the associations are contained within the structure.
but, you live and learn!
@Lazarix
Can you please share sample JSON event? So we can help you.
Did you try using the spath command?
https://docs.splunk.com/Documentation/SplunkCloud/7.2.3/SearchReference/Spath
As mentioned in the question, yes, I did, and I also tried combinations of spath and mvexpand.