I've got a complicated structure.
Start of the log file:
{
"dataUpdatedTime" : "2017-12-28T12:07:00+02:00",
"links" : [ {
"id" : 27,
"linkMeasurements" : [ {
"fluencyClass" : 5,
"minute" : 329,
"averageSpeed" : 75.851,
"medianTravelTime" : 158,
"measuredTime" : "2017-12-27T05:29:00+02:00"
....
}, {
"fluencyClass" : 5,
"minute" : 1289,
"averageSpeed" : 75.374,
"medianTravelTime" : 159,
"measuredTime" : "2017-12-27T21:29:00+02:00"
}, {
"fluencyClass" : 5,
"minute" : 1358,
"averageSpeed" : 72.633,
"medianTravelTime" : 165,
"measuredTime" : "2017-12-27T22:38:00+02:00"
} ],
"measuredTime" : "2017-12-27T22:38:00+02:00"
}, {
"id" : 30,
"linkMeasurements" : [ {
"fluencyClass" : 5,
"minute" : 0,
"averageSpeed" : 43.548,
"medianTravelTime" : 124,
"measuredTime" : "2017-12-27T00:00:00+02:00"
Notice that the id doesn't change for a certain period. How can I index events based on id, which is a unique identifier but doesn't appear in every JSON array?
Assuming I have interpreted your JSON right, spath is parsing it correctly.
I tested with this:
| makeresults | eval samplejson="{
\"dataUpdatedTime\": \"2017-12-28T12:07:00+02:00\",
\"links\": [{
\"id\": 27,
\"linkMeasurements\": [{
\"fluencyClass\": 5,
\"minute\": 329,
\"averageSpeed\": 75.851,
\"medianTravelTime\": 158,
\"measuredTime\": \"2017-12-27T05:29:00+02:00\"
}, {
\"fluencyClass\": 5,
\"minute\": 331,
\"averageSpeed\": 75.851,
\"medianTravelTime\": 158,
\"measuredTime\": \"2017-12-27T05:31:00+02:00\"
}, {
\"fluencyClass\": 5,
\"minute\": 354,
\"averageSpeed\": 83.807,
\"medianTravelTime\": 143,
\"measuredTime\": \"2017-12-27T05:54:00+02:00\"
}],
\"measuredTime\": \"2017-12-27T22:38:00+02:00\"
}, {
\"id\": 30,
\"linkMeasurements\": [{
\"fluencyClass\": 5,
\"minute\": 0,
\"averageSpeed\": 43.548,
\"medianTravelTime\": 124,
\"measuredTime\": \"2017-12-27T00:00:00+02:00\"
}]
}]
}"|spath input=samplejson|table *
I wonder if your issue is truncation: very large JSON events that exceed 10,000 bytes often cause complications.
Run this:
index=_internal sourcetype=splunkd LineBreakingProcessor - Truncating line because limit of 10000 has been exceeded
Do you see this for your JSON sourcetype?
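If you do see those messages, the usual fix is to raise the truncation limit for that sourcetype in props.conf on the parsing tier. A minimal sketch (the stanza name is a placeholder for your actual sourcetype):
[your_json_sourcetype]
# per-event byte limit; the default is 10000, 0 disables truncation entirely
TRUNCATE = 500000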
Are you looking to index only the events that have the unique identifier? If that is the case, then you probably want to do something like this:
https://answers.splunk.com/answers/477356/how-to-only-index-events-that-contain-specific-fie.html
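In short, that approach routes everything to the nullQueue and then sends only the events you care about back to the indexQueue. A rough sketch, assuming you key off the "id" field (stanza and sourcetype names are placeholders):
props.conf
[your_json_sourcetype]
TRANSFORMS-filter_id = setnull, keep_id
transforms.conf
[setnull]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue
[keep_id]
REGEX = "id"\s*:
DEST_KEY = queue
FORMAT = indexQueue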
If you are looking to index all the JSON files, and then trace events with the same ID, then you probably want to use the TRANSACTION command:
https://docs.splunk.com/Documentation/SplunkCloud/6.6.3/SearchReference/Transaction
Note: If you have a lot of events per ID, you may want to use STATS instead of TRANSACTION.
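For example, assuming each indexed event ends up with an id field after spath (with this JSON the raw field name is links{}.id, so it is renamed first, and if one event contains several links you would mvexpand them as in the earlier sketch), the two approaches look roughly like:
<your base search>
| spath
| rename "links{}.id" AS id
| transaction id
or, the stats version, which scales much better when there are many events per id:
<your base search>
| spath
| rename "links{}.id" AS id
| stats count min(_time) AS first_seen max(_time) AS last_seen BY id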
Can you paste a complete JSON block?
Ideally, confirm it's well formed first with https://jsonlint.com/
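If the file is too large for the web form, a local check such as
python -m json.tool yourfile.json
(the file name is just a placeholder) will also report the first syntax error it finds.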
Since the log file is huge in terms of events, I will not post the whole log file, but here is a little bit more.
{
"dataUpdatedTime" : "2017-12-28T12:07:00+02:00",
"links" : [ {
"id" : 27,
"linkMeasurements" : [ {
"fluencyClass" : 5,
"minute" : 329,
"averageSpeed" : 75.851,
"medianTravelTime" : 158,
"measuredTime" : "2017-12-27T05:29:00+02:00"
}, {
"fluencyClass" : 5,
"minute" : 330,
"averageSpeed" : 75.851,
"medianTravelTime" : 158,
"measuredTime" : "2017-12-27T05:30:00+02:00"
}, {
"fluencyClass" : 5,
"minute" : 331,
"averageSpeed" : 75.851,
"medianTravelTime" : 158,
"measuredTime" : "2017-12-27T05:31:00+02:00"
}, {
"fluencyClass" : 5,
"minute" : 332,
"averageSpeed" : 75.851,
"medianTravelTime" : 158,
"measuredTime" : "2017-12-27T05:32:00+02:00"
}, {
"fluencyClass" : 5,
"minute" : 333,
"averageSpeed" : 75.851,
"medianTravelTime" : 158,
"measuredTime" : "2017-12-27T05:33:00+02:00"
}, {
"fluencyClass" : 5,
"minute" : 352,
"averageSpeed" : 83.807,
"medianTravelTime" : 143,
"measuredTime" : "2017-12-27T05:52:00+02:00"
}, {
"fluencyClass" : 5,
"minute" : 353,
"averageSpeed" : 83.807,
"medianTravelTime" : 143,
"measuredTime" : "2017-12-27T05:53:00+02:00"
}, {
"fluencyClass" : 5,
"minute" : 354,
"averageSpeed" : 83.807,
"medianTravelTime" : 143,
"measuredTime" : "2017-12-27T05:54:00+02:00"
}, {
"fluencyClass" : 5,
"minute" : 355,
"averageSpeed" : 83.807,
"medianTravelTime" : 143,
"measuredTime" : "2017-12-27T05:55:00+02:00"
....
}, {
"fluencyClass" : 5,
"minute" : 1274,
"averageSpeed" : 70.496,
"medianTravelTime" : 170,
"measuredTime" : "2017-12-27T21:14:00+02:00"
}, {
"fluencyClass" : 5,
"minute" : 1275,
"averageSpeed" : 70.496,
"medianTravelTime" : 170,
"measuredTime" : "2017-12-27T21:15:00+02:00"
}, {
"fluencyClass" : 5,
"minute" : 1276,
"averageSpeed" : 70.496,
"medianTravelTime" : 170,
"measuredTime" : "2017-12-27T21:16:00+02:00"
}, {
"fluencyClass" : 5,
"minute" : 1277,
"averageSpeed" : 70.496,
"medianTravelTime" : 170,
"measuredTime" : "2017-12-27T21:17:00+02:00"
}, {
"fluencyClass" : 5,
"minute" : 1278,
"averageSpeed" : 70.496,
"medianTravelTime" : 170,
"measuredTime" : "2017-12-27T21:18:00+02:00"
}, {
"fluencyClass" : 5,
"minute" : 1287,
"averageSpeed" : 75.374,
"medianTravelTime" : 159,
"measuredTime" : "2017-12-27T21:27:00+02:00"
}, {
"fluencyClass" : 5,
"minute" : 1288,
"averageSpeed" : 75.374,
"medianTravelTime" : 159,
"measuredTime" : "2017-12-27T21:28:00+02:00"
}, {
"fluencyClass" : 5,
"minute" : 1289,
"averageSpeed" : 75.374,
"medianTravelTime" : 159,
"measuredTime" : "2017-12-27T21:29:00+02:00"
}, {
"fluencyClass" : 5,
"minute" : 1358,
"averageSpeed" : 72.633,
"medianTravelTime" : 165,
"measuredTime" : "2017-12-27T22:38:00+02:00"
} ],
"measuredTime" : "2017-12-27T22:38:00+02:00"
}, {
"id" : 30,
"linkMeasurements" : [ {
"fluencyClass" : 5,
"minute" : 0,
"averageSpeed" : 43.548,
"medianTravelTime" : 124,
"measuredTime" : "2017-12-27T00:00:00+02:00"
You get the idea?
"Id" is basically is in unique place in geolocation .