All Apps and Add-ons

Splunk DB Connect Batch Input Vs Rising Column

nquba
Explorer

Hi,

I am using Splunk DB Connect version 4.2 and dealing with a table of 28 million entries, with about a million new entries added every day. While creating the DB input as a batch input, I can choose the table and click Continue to go to the next section, 'Set Parameters'. I am worried that we can't index the whole 28-million-row table, because under Operations --> Set Parameters it says "Enter an integer between 1 and 10000000.". Does this mean it can only index 10 million rows at a time, or would it slowly catch up on subsequent runs?

This is a MySQL server. Do you think Splunk DB Connect is even the proper way to go here, or should I consider setting up a forwarder instead? I want to make sure I am not hitting the limitations of DB Connect.


nquba
Explorer

I fixed the title. I didn't quite follow you. Regarding the limit "Enter an integer between 1 and 10000000.": is this the limit for a single fetch? If so, then the next run (depending on the frequency we set) could fetch the same rows again, and this way it would never finish the full list (especially in the case of a batch input). A couple of things about the database I am dealing with: it is growing at a rate of more than half a million new entries every day. My two main options are batch input vs. rising column. Let's discuss both of them.

Follow-up questions:

A) Batch Input
A1: One problem is that if we pull 10K rows every 5 minutes, as proposed above, it will take a long time to catch up, and with the rate at which the DB is growing (half a million entries per day) this seems even more challenging.

A2: Even if it completes at some point, the data would start duplicating, since a batch input re-runs the full query on every execution.
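To put A1 in numbers, here is a back-of-the-envelope sketch using the figures from this thread (28M existing rows, ~0.5M new rows/day, 10K rows per 5-minute run). Note this arithmetic only applies if each run fetches *new* rows, i.e. rising-column behavior; a plain batch input re-pulls everything, which is exactly the duplication problem in A2.

```python
# Rough catch-up arithmetic, using figures from the thread.
backlog = 28_000_000            # existing rows in the table
daily_growth = 500_000          # new rows per day (per the thread)
fetch_size = 10_000             # rows pulled per scheduled run
runs_per_day = 24 * 60 // 5     # one run every 5 minutes -> 288 runs/day

pulled_per_day = fetch_size * runs_per_day            # rows ingested per day
net_progress_per_day = pulled_per_day - daily_growth  # progress against backlog
days_to_catch_up = backlog / net_progress_per_day

print(pulled_per_day)               # 2880000
print(round(days_to_catch_up, 1))   # 11.8
```

So even under the optimistic incremental assumption, 10K per 5 minutes takes roughly 12 days to clear the backlog.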

B) Rising Column Input
B1: When I run the same exact query as a rising column input, it runs for about 7–10 minutes before throwing an error like the one below:

Invalid Query
External search command 'dbxquery' returned error code 1. Script output = "RuntimeError: Failed to run query: "SELECT * FROM (SELECT * from db.apps) t", params: "None", caused by: Exception(" java.sql.SQLException: Incorrect key file for table 'C:\Windows\TEMP\#sql22c_2b711e_2.MYI'; try to repair it.",). "

B2: Let's say we solve the above issue. I am still not sure how the 10 million limit would play out here, given the rate at which the table is growing.
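On B1: the `SELECT * FROM (SELECT * from db.apps) t` wrapper forces MySQL to materialize the whole inner result into an on-disk temporary table, which is what the "Incorrect key file for table 'C:\Windows\TEMP\...'" error typically points to (the temp table got corrupted or ran out of space). A rising-column input instead filters and orders by the rising column, so each run only reads rows past the last checkpoint. A minimal sketch of that paging behavior follows; the column name `id` and the batch size are assumptions, not details from the thread.

```python
# Sketch of how a rising-column input pages through a table.
# In DB Connect, "?" is replaced by the stored checkpoint value.
RISING_QUERY = (
    "SELECT * FROM db.apps "
    "WHERE id > ? "       # only rows past the last checkpoint ("id" is assumed)
    "ORDER BY id ASC"     # required so the checkpoint only ever moves forward
)

def run_once(rows, checkpoint, limit):
    """Simulate one scheduled run: fetch up to `limit` rows with id > checkpoint."""
    batch = sorted(r for r in rows if r > checkpoint)[:limit]
    new_checkpoint = batch[-1] if batch else checkpoint
    return batch, new_checkpoint

rows = list(range(1, 26))     # pretend table with ids 1..25
checkpoint, limit = 0, 10
seen = []
for _ in range(4):            # four scheduled runs
    batch, checkpoint = run_once(rows, checkpoint, limit)
    seen.extend(batch)

print(checkpoint)                    # 25 -> caught up
print(len(seen) == len(set(seen)))   # True -> no row fetched twice
```

Because only rows beyond the checkpoint are scanned, the query never materializes the full 28M-row table, and no row is ingested twice.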


gjanders
SplunkTrust

What are you trying to do here?
Your subject mentions rising column and then your post refers to batch input.

Also, Splunk DB Connect v3 has been released, but not 4.2.

If you use a rising column input, it will run on the schedule and pull up to the configured row limit each time. For example, you could pull 10K rows at a time every 5 minutes; at that rate it would take a long time to pull down 28 million rows, so per your description you might want to use a larger number.
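On "use a larger number": a quick break-even calculation shows the minimum fetch size needed just to keep pace with growth, before making any progress on the backlog. The figures (one million new rows/day from the original post, one run every 5 minutes) are taken from this thread.

```python
# Minimum rows-per-run needed just to keep pace with table growth.
import math

daily_growth = 1_000_000       # new rows per day (from the original post)
runs_per_day = 24 * 60 // 5    # one run every 5 minutes -> 288 runs/day

break_even = math.ceil(daily_growth / runs_per_day)
print(break_even)  # 3473 rows/run just to break even
```

Anything below ~3.5K rows per run falls behind forever; the fetch size should be comfortably above that to also eat into the 28M backlog.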

Furthermore, you mention using a forwarder on the MySQL server. What data are you trying to get in? Is it in a file or in a database table?
