In step 2 of the "DB Input" setup there is a parameter, "Max Rows to Retrieve", which accepts values from 1 to 10000000. My initial pulls have more than 10000000 records. Is there an alternative way to pull more than the maximum?
Thanks.
Use a rising column as the parameter type. Set the max rows to the maximum allowed and set the execution frequency to a short duration. This will sequentially run through the database and pull all the records over time instead of all at once as it would with a batch input. I have used this method before when creating new inputs for large databases. After it has caught up to current entries, I set a lower limit for faster queries and increase the frequency.
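To make the rising-column idea concrete, here is a minimal sketch of the checkpoint logic. It is not DB Connect's actual implementation. It uses an in-memory SQLite table, and the table name `events`, the column `id`, and the helper names `pull_batch` and `drain` are all made up for illustration. Each run fetches at most `max_rows` rows past the last checkpoint, so repeated runs work through the whole table.

```python
import sqlite3


def pull_batch(conn, checkpoint, max_rows):
    """Fetch up to max_rows rows whose rising column (id) exceeds the checkpoint."""
    rows = conn.execute(
        "SELECT id, payload FROM events WHERE id > ? ORDER BY id LIMIT ?",
        (checkpoint, max_rows),
    ).fetchall()
    # Advance the checkpoint to the largest id seen; keep it if nothing came back.
    new_checkpoint = rows[-1][0] if rows else checkpoint
    return rows, new_checkpoint


def drain(conn, max_rows):
    """Run capped pulls back to back until the table is exhausted."""
    checkpoint, total, runs = 0, 0, 0
    while True:
        rows, checkpoint = pull_batch(conn, checkpoint, max_rows)
        if not rows:
            break
        total += len(rows)
        runs += 1
    return total, runs, checkpoint


# Hypothetical demo data: 25 rows, with a cap of 10 rows per pull.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO events (id, payload) VALUES (?, ?)",
    [(i, f"row {i}") for i in range(1, 26)],
)
total, runs, checkpoint = drain(conn, max_rows=10)
print(total, runs, checkpoint)
```

With 25 rows and a cap of 10, the loop needs three non-empty pulls (10, 10, and 5 rows) to catch up. In a scheduled input, each iteration would be one scheduled execution rather than a tight loop.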
But this does not rule out a future scheduled delta pull containing more than 10000000 records.
How frequently are you pulling?
The data is pulled on a daily basis.
Try setting the max_rows option in inputs.conf and then restart. If this fails to retrieve more than 10,000,000 records per pull, then you can set the frequency to pull more often.
http://docs.splunk.com/Documentation/DBX/2.3.0/DeployDBX/inputsspec
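If the cap cannot be raised, a rough way to decide how often to pull is to divide the expected daily volume by the per-pull cap. The sketch below is simple arithmetic and not a Splunk feature. The function name `max_interval_seconds` and the 25,000,000 rows/day figure are hypothetical, and it assumes rows arrive at a roughly even rate through the day.

```python
import math

SECONDS_PER_DAY = 86400


def max_interval_seconds(rows_per_day, max_rows_per_pull):
    """Longest pull interval (in seconds) that keeps each pull under the cap,
    assuming rows arrive evenly across the day."""
    pulls_per_day = math.ceil(rows_per_day / max_rows_per_pull)
    return SECONDS_PER_DAY // pulls_per_day


# Hypothetical volume: 25,000,000 new rows per day against a 10,000,000 cap.
interval = max_interval_seconds(25_000_000, 10_000_000)
print(interval)
```

For these example numbers, at least three pulls per day are needed, so the interval should be no longer than 28800 seconds (8 hours). If the data arrives in bursts, a shorter interval leaves more headroom.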