
Splunk DB Connect 2 - dbxquery only returns 1001 rows

jeff
Contributor

I'm trying to return all of the data from a table in a Microsoft SQL Server database using the sqljdbc4.jar driver. Running

 SELECT * from table

in SQL Management Studio correctly returns >11,000 records. However, when I try to run the same search from Splunk via dbxquery:

| dbxquery connection=landesk shortnames=t maxrows=15000 query="SELECT%20*%20from%20table"

Splunk returns exactly 1001 rows, regardless of any maxrows value greater than 1000. It seems like it's trying to page results, but there's no paging control in dbxquery... What am I missing?
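For reference, a quick way to confirm the cap from the Splunk side is to pipe the results through a row count (same search as above, just with stats appended):

 | dbxquery connection=landesk shortnames=t maxrows=15000 query="SELECT%20*%20from%20table" | stats count

The count comes back as 1001 no matter what maxrows is set to.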

1 Solution

jeff
Contributor

So dbxquery.py has this in its code:

from splunk import rest
from splunk import util

from functools import partial

REST_BASE='db_connect'
MAX_ROWS=1001

....

def generate(self):
        ....
        maxrows_opt = int(self.maxrows or 0)
        maxrows = MAX_ROWS
        ....
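        # note: a user-supplied maxrows only wins when it is BELOW the cap;
        # anything larger silently falls back to MAX_ROWS (1001)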
        if maxrows_opt and maxrows_opt < MAX_ROWS:
            maxrows = maxrows_opt

        ....

So it seems there's an intentional 1001-row hard limit in place. Anyone care to elaborate why? I was able to change this value to 100001 and pull in all of my records with no errors. I'm sure there are cases where we want to gate the rows imported (paging, or filtering on an incrementing column), but we need more flexibility to work directly with the data.
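To make the behavior concrete, here's a minimal, self-contained sketch of that clamp (my own simplification for illustration, not the actual command class):

 MAX_ROWS = 1001  # the shipped hard cap

 def effective_maxrows(user_maxrows=None):
     # simplified from the logic in dbxquery.py's generate()
     maxrows_opt = int(user_maxrows or 0)
     maxrows = MAX_ROWS
     # the user's value only wins when it is smaller than the cap
     if maxrows_opt and maxrows_opt < MAX_ROWS:
         maxrows = maxrows_opt
     return maxrows

 print(effective_maxrows(15000))  # -> 1001: maxrows=15000 is silently ignored
 print(effective_maxrows(500))    # -> 500: values under the cap are honored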

There may be a different way of thinking about this problem, but my goal was simply to access the data directly and extract it into a state table (CSV lookup) for Enterprise Security. The built-in data inputs / data lookups didn't seem to fit the bill exactly.
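Concretely, what I want is a single search along these lines (the lookup file name is just a placeholder; maxrows=100000 works here only because I raised the cap in the code):

 | dbxquery connection=landesk shortnames=t maxrows=100000 query="SELECT%20*%20from%20table" | outputlookup landesk_state.csv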


jcoates_splunk
Splunk Employee

Greetings LANDesk user! DBX 2.3.0 should no longer have these limitations; please give it a shot.


benton
Path Finder

DBX still has a default maxrows of 100,000. If you want to get past that limit and return, say, 100,001 rows, set the maxrows parameter explicitly, like this: '| dbxquery maxrows=100001'
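Using the connection from the original question, that would look something like this (connection and query carried over from above for illustration):

 | dbxquery connection=landesk maxrows=100001 query="SELECT%20*%20from%20table"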

alex_loffler
Engager

Agreed - forcing the use of DB Input has two side effects:

1) It ingests the entire dataset from the source DB into a Splunk index - data duplication, synchronization issues, and ETL is painful when datasets are large (think Phoenix on HBase!)
2) DB Input counts toward one's Splunk license quota - sorry, that's double dipping...


delink
Communicator

It seems like the latest version upped this limit to 5001, but the problem bit me too, and I was going crazy trying to figure out why my query was behaving differently between Splunk and the mysql client. I also raised the limit inside the code to the maximum allowed value for maxSetRows, which seems to be 50,000,000 for this particular driver.


clayramey
Engager

Agreed. We ended up changing the hard-coded 1001 to 1,000,000 to get around this issue, but we'll have to remember to reapply that change every time we upgrade. Hard-limiting the rows to 1001 makes the app pointless for most practical uses; it's more than just a "preview" tool. The hard limit affects the dbxquery command as well.

araitz
Splunk Employee

dbxquery is intended for use as a preview tool, and isn't guaranteed to be around forever. If you want to get data from a relational database, I'd recommend using a dbinput (seems like batch mode would fit the bill) or dblookup.


jeff
Contributor

Yeah, I'm not sold on that...

a: That's not how it's documented.
b: dbindex and dblookup both seem pointless for my use case. While relatively insignificant, using dbindex would count against index licensing limits when all I need is the current state of the data, and dblookup requires mapping against a Splunk search... again, all I want to do is extract data from the SQL tables to build a current-state table.

Honestly, the only reason I looked at v2 of DB Connect is its support for search head clusters. If dbxquery isn't fully supported as an interface for extracting data (despite the documentation), then I'll script another solution and uninstall...

monkeydust
Engager

I'm going to have to agree with Jeff here. We use dbquery throughout our Splunk installation to show real-time data from our database. We have no interest in indexing this data, as we only care about what it looks like at this moment in time (for example, we use it heavily to show which help desk tickets currently need to be worked, and then to launch our help desk system when users click on one of those rows).

pj
Contributor

Agree - not sure what the point of maxrows is if there's a hard-coded limit of 1001. The DBX GUI-based lookups and inputs don't allow for advanced SQL queries. Sometimes you just want to rip a SQL query, run it once per day, and do an outputlookup or something. There shouldn't be a hard-coded limit like this.


pj
Contributor

Actually - you can pass some advanced SQL to get what you need. You just have to index the data first, and then you can do an outputlookup from there.
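Something along these lines, for example (the index, sourcetype, and field names here are placeholders for illustration):

 index=landesk_db sourcetype=dbx_batch_input | dedup ticket_id | table ticket_id, status, owner | outputlookup landesk_state.csv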
