Splunk Dev

Export/stream massive results from splunk REST API

karan1337
Path Finder

I need to export a massive number of events from Splunk, so for performance reasons I resorted to using the REST API directly in my Python code rather than the Splunk SDK itself.

I found the following curl command to export results:

curl -ku username:password \
  https://splunk_host:port/servicesNS/admin/search/search/jobs/export \
  -d search="search index%3D_internal | head 3" -d output_mode=json

My attempt at simulating this using Python's HTTP functions is as follows:

# assume I have authenticated to Splunk and have a session key
base_url = 'https://splunkhost:port'

search_job_urn = '/services/search/jobs/export'

myhttp = httplib2.Http(disable_ssl_certificate_validation=True)

searchjob = myhttp.request(
    base_url + search_job_urn, 'POST',
    headers={'Authorization': 'Splunk %s' % sessionKey},
    body=urllib.urlencode({'search': 'search index=indexname sourcetype=sourcename'}))[1]

print searchjob

The last print keeps printing results until the job is done. For large queries I get "Memory Error"s. I need to be able to read results in chunks (say, 50,000 events), write them to a file, and reset the buffer for searchjob. How can I accomplish that?

1 Solution

karan1337
Path Finder

I solved the above using Python's requests library. Refer: http://docs.python-requests.org/en/latest/api/

You just need to set stream=True on the request, then loop over iter_content (looping until a valid chunk is received) and write each chunk to a file.
Also see here for more info: http://stackoverflow.com/questions/16694907/how-to-download-large-file-in-python-with-requests-py
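A minimal sketch of that approach, using the export endpoint and Authorization header from the original post (the host, port, session key, index, and sourcetype are placeholders you would replace with your own):

```python
import requests


def stream_to_file(response, path, chunk_size=1024 * 1024):
    """Write a streamed requests response to a file, one chunk at a time,
    so the full result set is never held in memory."""
    with open(path, 'wb') as out:
        for chunk in response.iter_content(chunk_size=chunk_size):
            if chunk:  # skip keep-alive chunks
                out.write(chunk)


if __name__ == '__main__':
    base_url = 'https://splunkhost:8089'      # placeholder host:port
    session_key = 'YOUR_SESSION_KEY'          # placeholder session key

    resp = requests.post(
        base_url + '/services/search/jobs/export',
        headers={'Authorization': 'Splunk %s' % session_key},
        data={'search': 'search index=indexname sourcetype=sourcename',
              'output_mode': 'json'},
        verify=False,   # mirrors curl -k / disabled cert validation above
        stream=True)    # stream=True: don't buffer the whole body in memory

    stream_to_file(resp, 'export_results.json')
```

With stream=True, requests only downloads data as you iterate, so even a very large export is written out in fixed-size chunks instead of triggering a MemoryError.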



guilmxm
Influencer

Hello karan1337,

Would you mind sharing a copy of your Python script that calls the REST API and reads results in chunks?

I'm trying to get the same behavior, and that would be very cool 🙂

Thank you anyway!

Guilhem


martin_mueller
SplunkTrust
SplunkTrust

Have you considered mass-exporting from the CLI?

$SPLUNK_HOME/bin/splunk export eventdata -index indexname -sourcetype sourcetypename -dir /path/to/write/to

More info is available by running splunk help export.


karan1337
Path Finder

I don't have access to the box running Splunk, so I cannot use the CLI; I need to do this remotely. I fixed the above problem by using the requests API and writing chunks of results to a file, but I see that for large searches the job status sometimes auto-finalizes when there is a huge number of results.
