Splunk Enterprise

Splunk DB Connect timeout - Unable to read next record

edoardo_vicendo
Contributor

Hello,

We had this error on an output query set-up on Splunk DB Connect.

Basically the Splunk query is inserting data into an external database.

 

 

2023-11-08 01:58:32.712 +0100  [QuartzScheduler_Worker-9] ERROR org.easybatch.core.job.BatchJob - Unable to read next record
java.lang.RuntimeException: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[836463,5]
Message: Premature EOF
                at com.splunk.ResultsReaderXml.getNextEventInCurrentSet(ResultsReaderXml.java:128)
                at com.splunk.ResultsReader.getNextElement(ResultsReader.java:87)
                at com.splunk.ResultsReader.getNextEvent(ResultsReader.java:64)
                at com.splunk.dbx.server.dboutput.recordreader.DbOutputRecordReader.readRecord(DbOutputRecordReader.java:82)
                at org.easybatch.core.job.BatchJob.readRecord(BatchJob.java:189)
                at org.easybatch.core.job.BatchJob.readAndProcessBatch(BatchJob.java:171)
                at org.easybatch.core.job.BatchJob.call(BatchJob.java:101)
                at org.easybatch.extensions.quartz.Job.execute(Job.java:59)
                at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
                at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573)
Caused by: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[836463,5]
Message: Premature EOF
                at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:599)
                at com.sun.xml.internal.stream.XMLEventReaderImpl.nextEvent(XMLEventReaderImpl.java:83)
                at com.splunk.ResultsReaderXml.getResultKVPairs(ResultsReaderXml.java:306)
                at com.splunk.ResultsReaderXml.getNextEventInCurrentSet(ResultsReaderXml.java:124)
                ... 9 common frames omitted

 

 

The issue was related to a query timeout. We have set-up the upsert_id in the Splunk DB Connect output configuration so that Splunk can go in insert_update.

Looking into _internal log we understood that Splunk, when using the upsert_id, performs a select query for each record it has to insert and then commits every 1000 records (by default):

 

2023-11-10 01:22:28.215 +0100  [QuartzScheduler_Worker-12] INFO  com.splunk.dbx.connector.logger.AuditLogger - operation=dboutput connection_name=SPLUNK_CONN stanza_name=SPLUNK_OUTPUT state=success sql='SELECT FIELD01,FIELD02,FIELD03 FROM MYSCHEMA.MYTABLE WHERE UPSERT_ID=?'

 

 

 

2023-11-10 01:22:28.258 +0100  [QuartzScheduler_Worker-12] INFO  com.splunk.dbx.connector.logger.AuditLogger - operation=dboutput connection_name=SPLUNK_CONN stanza_name=SPLUNK_OUTPUT state=success sql='INSERT INTO MYSCHEMA.MYTABLE (FIELD01,FIELD02,FIELD03) values (?,?,?)'

 

 

Upsert_id is very useful to avoid an sql duplicate key error, and whenever you want to recover data in case the output is failing for some reason. You basically re-run the output query and if the record already exists it is replaced in the sql table.

But the side effect is that the WHERE condition of the SELECT statement can be very inefficient if the Database table start to be huge.

The solution is to create in the output Database table an SQL index on the upsert_id field.

 

The output run passed from 11 minutes to 11 seconds, avoiding to hit the timeout of the Splunk DB Connect (30 seconds by default, calculated for every commit).

 

Best Regards,

Edoardo

_JP
Contributor

Thanks for the tip!  Non-streaming  type pushes like this are often a challenge, and this is one way to manage the coupling of something that likes to be working in the real-time space (Splunk) versus  more of a batch space (the DB).

Get Updates on the Splunk Community!

Stay Connected: Your Guide to May Tech Talks, Office Hours, and Webinars!

Take a look below to explore our upcoming Community Office Hours, Tech Talks, and Webinars this month. This ...

They're back! Join the SplunkTrust and MVP at .conf24

With our highly anticipated annual conference, .conf, comes the fez-wearers you can trust! The SplunkTrust, as ...

Enterprise Security Content Update (ESCU) | New Releases

Last month, the Splunk Threat Research Team had two releases of new security content via the Enterprise ...