Splunk Search

API search limits to 1000 results

j6white
Path Finder

When I use the Splunk API (from node.js) to query a given sid, I only get back 1000 results, even when supplying the count=0 argument. This particular sid happens to return a great many records -- over 6 million. When trying with an sid that returns much fewer records, say 5,000, they all are returned.

Here is my code:

var Request = require('request');  // 2.34.x

var options = {
    url: 'htttps://splunksvr:8089/servicesNS/hector/search/search/jobs/[sid]/results?output_mode=json&count=0',
    method: 'GET',
    auth: {
        user: 'hector',
        pass: 'wouldntyouliketoknow'
    },
    rejectUnauthorized: false,
    requestCert: true,
    agent: false
};

Request(options, function(err, response, body) {
    err && console.log('Error calling Splunk: ' + err);
    body = JSON.parse(body);
    body && body.results && console.log('query result count: ' + body.results.length);
});

Output:

query result count: 1000

I've also tried using pagination like this:

var Request = require('request');

var internals = {};

var options = {
    url: 'htttps://splunksvr:8089/servicesNS/hector/search/search/jobs/[sid]/results?output_mode=json&count=300&offset={offset}',
    method: 'GET',
    auth: {
        user: 'hector',
        pass: 'wouldntyouliketoknow'
    },
    rejectUnauthorized: false,
    requestCert: true,
    agent: false
};

internals.querySplunk = function (options, offset, callback) {

    options.originalUrl = options.url;
    options.url = options.url.replace('{offset}', offset);
    console.log(options.url);

    Request(options, function(err, response, body) {

        err && console.log('Error calling Splunk: ' + err);

        body = JSON.parse(body);
        var resultCount = body.results.length;
        console.log('query result count: ' + resultCount);

        if (resultCount === 0) {
            console.log('Done getting results.');
            return callback();
        }

        offset += 300;
        options.url = options.originalUrl;
        internals.querySplunk(options, offset, callback);
    });
};

internals.querySplunk(options, 0, function () {
    console.log('Exiting.');
});

Output:
htttps://splunksvr:8089/servicesNS/hector/search/search/jobs/[sid]/results?output_mode=json&count=300&offset=0
query result count: 300
htttps://splunksvr:8089/servicesNS/hector/search/search/jobs/[sid]/results?output_mode=json&count=300&offset=300
query result count: 300
htttps://splunksvr:8089/servicesNS/hector/search/search/jobs/[sid]/results?output_mode=json&count=300&offset=600
query result count: 300
htttps://splunksvr:8089/servicesNS/hector/search/search/jobs/[sid]/results?output_mode=json&count=300&offset=900
query result count: 100
htttps://splunksvr:8089/servicesNS/hector/search/search/jobs/[sid]/results?output_mode=json&count=300&offset=1200
query result count: 0
Done getting results.
Exiting.

As you can see, it still only pages to 1,000 results. What gives?

Tags (3)
1 Solution

hexx
Splunk Employee
Splunk Employee

If you search is non-transforming (i.e: it returns events but not results, as it doesn't use commands like stats or timechart to perform aggregation), you don't want to hit the /services/search/jobs/{SID}/results endpoint:

Returns the results of the search specified by {search_id}. This is the table that exists after all processing from the search pipeline has completed.

This is the primary method for a client to fetch a set of TRANSFORMED events. If the dispatched search does not include a transforming command, the effect is the same as get_events, however with fewer options.

...but rather the /services/search/jobs/{SID}/events endpoint:

Returns the events of the search specified by {search_id}. These events are the data from the search pipeline before the first "transforming" search command. This is the primary method for a client to fetch a set of UNTRANSFORMED events for the search job.

This endpoint is only valid if the status_buckets > 0 or the search has no transforming commands.

Also, if your goal is to perform massive event export, the best method to leverage is /services/search/export endpoint.

View solution in original post

hexx
Splunk Employee
Splunk Employee

If you search is non-transforming (i.e: it returns events but not results, as it doesn't use commands like stats or timechart to perform aggregation), you don't want to hit the /services/search/jobs/{SID}/results endpoint:

Returns the results of the search specified by {search_id}. This is the table that exists after all processing from the search pipeline has completed.

This is the primary method for a client to fetch a set of TRANSFORMED events. If the dispatched search does not include a transforming command, the effect is the same as get_events, however with fewer options.

...but rather the /services/search/jobs/{SID}/events endpoint:

Returns the events of the search specified by {search_id}. These events are the data from the search pipeline before the first "transforming" search command. This is the primary method for a client to fetch a set of UNTRANSFORMED events for the search job.

This endpoint is only valid if the status_buckets > 0 or the search has no transforming commands.

Also, if your goal is to perform massive event export, the best method to leverage is /services/search/export endpoint.

j6white
Path Finder

hexx: Thanks for your very succinct and insightful answer. It looks like the export endpoint is best for my needs.

0 Karma

j6white
Path Finder
  1. Why, specifically, is the search string relevant? I'm not sure it's necessary that I post it to the public internet.
  2. I executed the query using the Splunk web browser client. Sorry, I thought "running a search in the Splunk client" was clear. My mistake.
0 Karma

ineeman
Splunk Employee
Splunk Employee

I explained my question poorly - I apologize.

  1. What search are you running (i.e. what is the search string)?
  2. How are you executing it (are you running it through the UI and getting the , or running it through the API)?
  3. If it is from the API, what code is running it?

The details you gave above only talk about how you are requesting results from a specific search, not about how that search was created, which is what we need to know.

0 Karma

j6white
Path Finder

I am running a search in the Splunk client and then referencing its sid via the REST API query parameters.

As you can see from the code above, the API query (with parameters) is:

https://splunksvr:8089/servicesNS/hector/search/search/jobs/[sid]/results?output_mode=json&count=0

0 Karma

ineeman
Splunk Employee
Splunk Employee

What search are you running? How are you running it? If it is from the REST API, what parameters are you sending to the REST API?

0 Karma
Get Updates on the Splunk Community!

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...

What's new in Splunk Cloud Platform 9.1.2312?

Hi Splunky people! We are excited to share the newest updates in Splunk Cloud Platform 9.1.2312! Analysts can ...