I am querying a Websense database (MSSQL) with 5 DB inputs: 1 Tail and 4 Dump. The 4 Dump inputs work fine, but the Tail input seems to freeze and sometimes crashes Splunk. This is the query we are using for the Tail input (running every 60s):
SELECT top 100000 date_time,port,protocol_id,bytes_sent,bytes_received,duration,category,url,full_url,user_id,hits,source_server_ip_int,source_ip_int,destination_ip_int,record_number from incoming {{WHERE $rising_column$ > ?}} ORDER by record_number
The number of records we request is high because the database produces a high volume of events per minute. I tried lowering it to 20000, but after some time (and a growing lag in the data) the input stopped bringing in events. When I try to run a DB query through the dbx app, no results return; the search just runs and runs without ever producing a result. I also tried to look at "Database Info", but once I select a database, the schema table does not load.
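For anyone unfamiliar with Tail inputs, here is a minimal Python sketch of the rising-column pattern the query above relies on. It is an illustration only, not how DB Connect is implemented: it uses an in-memory SQLite table as a stand-in for the Websense `incoming` table, `LIMIT` in place of MSSQL's `TOP`, and the names `fetch_batch`, `checkpoint` and `batch_size` are made up for the example.

```python
import sqlite3

def fetch_batch(conn, checkpoint, batch_size):
    """Return up to batch_size rows whose record_number is above the checkpoint."""
    cur = conn.execute(
        "SELECT record_number, url FROM incoming "
        "WHERE record_number > ? ORDER BY record_number LIMIT ?",
        (checkpoint, batch_size),
    )
    return cur.fetchall()

# Stand-in table with 10 rows (assumption: the real table is far larger).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE incoming (record_number INTEGER PRIMARY KEY, url TEXT)")
conn.executemany(
    "INSERT INTO incoming VALUES (?, ?)",
    [(i, f"http://example.com/{i}") for i in range(1, 11)],
)

checkpoint = 0   # last record_number already indexed
batches = []
while True:
    rows = fetch_batch(conn, checkpoint, batch_size=4)
    if not rows:
        break
    batches.append([r[0] for r in rows])
    checkpoint = rows[-1][0]   # advance the rising column to the last row seen

print(batches, checkpoint)
```

Each run picks up where the previous one stopped. If more rows arrive per interval than the batch size allows, the checkpoint falls further behind on every run, which would match the growing lag seen with the smaller limit.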
Once we see that dbx is frozen, we stop Splunk. Splunkweb usually does not stop successfully, so we kill the following 2 processes:
4305 ? Sl 0:08 /usr/java/jre1.7.0_21/bin/java -cp /opt/splunk/splunk/etc/apps/dbx/bin/lib/xstream-1.4.1.jar:/opt/splunk/splunk/etc/apps/dbx/bin/lib/sqlite-jdbc-3.7.2.jar:/opt/splunk/splunk/etc/apps/dbx/bin/lib/jtds-1.2.6.jar:/opt/splunk/splunk/etc/apps/dbx/bin/lib/hsqldb.jar:/opt/splunk/splunk/etc/apps/dbx/bin/lib/postgresql-9.0-801.jdbc3.jar:/opt/splunk/splunk/etc/apps/dbx/bin/lib/stringtemplate-3.2.1.jar:/opt/splunk/splunk/etc/apps/dbx/bin/lib/log4j-1.2.15.jar:/opt/splunk/splunk/etc/apps/dbx/bin/lib/jdbm-2.2.jar:/opt/splunk/splunk/etc/apps/dbx/bin/lib/commons-logging-1.0.4.jar:/opt/splunk/splunk/etc/apps/dbx/bin/lib/commons-pool-1.5.6.jar:/opt/splunk/splunk/etc/apps/dbx/bin/lib/h2-1.3.162.jar:/opt/splunk/splunk/etc/apps/dbx/bin/lib/antlr-2.7.7.jar:/opt/splunk/splunk/etc/apps/dbx/bin/lib/dbx.jar -Xmx256m -Dfile.encoding=UTF-8 -server -Duser.language=en -Duser.region= -Dsplunk.app.ctx=dbx com.splunk.bridge.JavaBridgeServer 4303
21155 ? Sl 2:44 python -O /opt/splunk/splunk/lib/python2.7/site-packages/splunk/appserver/mrsparkle/root.py start
Has anyone else seen this? Is there a resolution?
The first thing to check is the dbx.log and jbridge.log files in $SPLUNK_HOME/var/log/splunk for any relevant errors. You can also enable DEBUG logging by adding the following entries to $SPLUNK_HOME/etc/apps/dbx/local/java.conf:
[logging]
level = DEBUG
A splunkd restart is required after the above change. The DEBUG output is also written to dbx.log.
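If the logs are large, a small script can pull out the lines worth reading first. The following is a rough sketch, not a Splunk tool: the keywords `ERROR` and `Exception` are guesses at what a relevant line contains, the function names are made up, and `/opt/splunk` is only a fallback when `SPLUNK_HOME` is not set.

```python
import os

def find_error_lines(lines, keywords=("ERROR", "Exception")):
    """Return the lines that contain any of the given keywords."""
    return [line.rstrip("\n") for line in lines if any(k in line for k in keywords)]

def scan_logs(log_dir, names=("dbx.log", "jbridge.log")):
    """Scan the named log files in log_dir; missing files are skipped."""
    hits = {}
    for name in names:
        path = os.path.join(log_dir, name)
        if os.path.exists(path):
            with open(path, errors="replace") as fh:
                hits[name] = find_error_lines(fh)
    return hits

if __name__ == "__main__":
    splunk_home = os.environ.get("SPLUNK_HOME", "/opt/splunk")  # fallback is an assumption
    log_dir = os.path.join(splunk_home, "var", "log", "splunk")
    for name, lines in scan_logs(log_dir).items():
        print(f"== {name}: {len(lines)} matching lines")
        for line in lines[-20:]:   # show only the most recent matches
            print(line)
```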
Otherwise, further troubleshooting may require gathering a Java thread dump to submit to Splunk Support for further investigation.
FYI: There is also a known bug (DBX-151) reported for DB Connect 1.0.10 with similar symptoms.
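One way to request a thread dump on Linux is to send SIGQUIT to the Java bridge process; a HotSpot JVM responds by printing a thread dump to its own standard output and keeps running. The Python sketch below finds the bridge PID from `ps ax` output by looking for the `com.splunk.bridge.JavaBridgeServer` class name seen in the process listing above. The function names are made up, and where the dump ends up (which log, if any, captures the bridge's stdout) is something I have not verified, so treat this as a starting point. If a full JDK is installed, `jstack <pid>` is an alternative.

```python
import os
import signal
import subprocess

BRIDGE_MARKER = "com.splunk.bridge.JavaBridgeServer"

def find_bridge_pids(ps_output, marker=BRIDGE_MARKER):
    """Parse `ps ax` output and return PIDs whose command line contains marker."""
    pids = []
    for line in ps_output.splitlines():
        fields = line.split(None, 1)   # PID, then the rest of the line
        if len(fields) == 2 and fields[0].isdigit() and marker in fields[1]:
            pids.append(int(fields[0]))
    return pids

def request_thread_dumps():
    """Send SIGQUIT to every Java bridge process found; returns the PIDs signalled."""
    ps_output = subprocess.check_output(["ps", "ax"], text=True)
    pids = find_bridge_pids(ps_output)
    for pid in pids:
        os.kill(pid, signal.SIGQUIT)   # the JVM prints a thread dump and continues
    return pids
```

Calling `request_thread_dumps()` while the query is hung, two or three times a few seconds apart, gives dumps that can be compared to see which threads are stuck.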
If the jbridge server is running but the query won't complete, kill java while the query is running, then start the query again after the 47 error (sometimes you have to start it twice). This fixes the problem until the next restart/reboot.
One thing to note: I was having the same issue running the dbquery search command on 1.0.10. It would freeze and never finalize. There was no improvement after I upgraded to 1.0.11. However, once I upgraded Java from 1.7.0_17 to 1.7.0_25, the problem went away.