Where does Splunk log errors about malformed JSON input data?

Graham_Hanningt
Builder

I sent two events in JSON format to Splunk (Enterprise 6.4) via TCP. The second event was deliberately malformed: a string value was missing its closing quote.

The first event was successfully indexed. As expected, the second wasn't.

How do I troubleshoot this? For example, which Splunk log records the failure to ingest the second event?

If I send similarly malformed event data to the HTTP Event Collector (EC) as two events batched in a single request:

{"time":1459241926.498019000,"sourcetype":"my_test","index":"test","event":{"myfield":"good"}}
{"time":1459241926.498019000,"sourcetype":"my_test","index":"test","event":{"myfield":"bad}}

(note the deliberately missing closing quote after the bad value)

then, again, as expected, only the first event gets indexed. Unexpectedly, though, EC responds with:

{"text":"Success","code":0}

whereas, if I reverse the order of the JSON lines (putting the event with the bad value first), I get:

{"text":"Invalid data format","code":6,"invalid-event-number":0}

(For JSON parsing errors in EC input, I've seen that the data.num_of_parser_errors metric in the _introspection index for that time period gets incremented. But that's all the evidence I can see: I don't see the specific error details logged anywhere.)
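Since the response above does not say what was wrong with the rejected event, one workaround is to validate each event on the client before sending it. The following is a minimal sketch, assuming one JSON event per line as in the example above (the collector itself may not require that layout). The function name find_malformed_events is hypothetical and is not part of any Splunk API.

```python
import json


def find_malformed_events(payload: str) -> list[tuple[int, str]]:
    """Return (zero-based event number, parser message) for each invalid line.

    Assumption: the payload holds one JSON event per line.
    """
    problems = []
    # Ignore blank lines so event numbers match the non-empty lines sent.
    lines = [line for line in payload.splitlines() if line.strip()]
    for number, line in enumerate(lines):
        try:
            json.loads(line)
        except json.JSONDecodeError as err:
            problems.append((number, err.msg))
    return problems


# The two events from the question; the second is missing a closing quote.
batch = "\n".join([
    '{"time":1459241926.498019000,"sourcetype":"my_test","index":"test","event":{"myfield":"good"}}',
    '{"time":1459241926.498019000,"sourcetype":"my_test","index":"test","event":{"myfield":"bad}}',
])

for number, message in find_malformed_events(batch):
    print(f"event {number}: {message}")
```

This only checks that each line is well-formed JSON. It does not reproduce whatever additional checks the collector applies, so a batch that passes here could still be rejected.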

Graham_Hanningt
Builder

I think I'll leave this question up for a few days longer as a testament to my own ignorance, and then delete it.

I might ask a new question later around similar issues, based on my recent, slightly better understanding. (For example, although much of my question is based on bogus assumptions, that HEC behavior I reported still looks dodgy to me.)

On with the self-flagellation:

"I sent two events..."

No, I didn't.

I sent two lines of JSON, each ending in \r\n, but, in props.conf, I had failed to specify SHOULD_LINEMERGE = false. So the two lines were being treated as a single event.

If I had bothered to look at the _raw field, I would have noticed that the JSON line with the bad value (the one missing its closing quote) had been appended to the "good" line, forming a single event.

After adding SHOULD_LINEMERGE = false and resending the data, I get two events. The first event has a myfield value of good. The second event has no myfield value.

"The first event was successfully indexed. As expected, the second wasn't."

My expectation was wrong.

As described above, after adding SHOULD_LINEMERGE = false, the second event (with the missing closing quote) is indexed. It just doesn't have a myfield value, because the JSON is malformed.
