Splunk Dev

Export/stream massive results from the Splunk REST API

karan1337
Path Finder

I need to export a massive number of events from Splunk. For performance reasons, I resorted to using the REST API directly from my Python code rather than going through the Splunk SDK.

I found the following curl command to export results:

curl -ku username:password https://splunk_host:port/servicesNS/admin/search/search/jobs/export -d search="search index%3D_internal | head 3" -d output_mode=json

My attempt at simulating this using Python's HTTP libraries is as follows:

import httplib2
import urllib

# assume I have already authenticated to Splunk and have a session key
base_url = "http://splunkhost:port"

search_job_urn = '/services/search/jobs/export'

myhttp = httplib2.Http(disable_ssl_certificate_validation=True)

# httplib2 buffers the entire response body in memory before returning it
searchjob = myhttp.request(
    base_url + search_job_urn, 'POST',
    headers={'Authorization': 'Splunk %s' % sessionKey},
    body=urllib.urlencode({'search': 'search index=indexname sourcetype=sourcename'}))[1]

print searchjob

The last print statement emits all results at once, and for large queries I get "MemoryError". I need to be able to read the results in chunks (say, 50,000 at a time), write each chunk to a file, and release the buffer before reading the next one. How can I accomplish that?

1 Solution

karan1337
Path Finder

I solved the above using Python's requests library. Refer: http://docs.python-requests.org/en/latest/api/

Just set stream=True on the request itself, then loop over iter_content() on the response, writing each valid chunk to a file as it arrives.
Also refer here for more info: http://stackoverflow.com/questions/16694907/how-to-download-large-file-in-python-with-requests-py
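A minimal sketch of that approach. The host, port, session key, index, and sourcetype below are placeholders for your own environment, and verify=False mirrors curl's -k flag:

```python
import requests

def write_chunks(chunk_iter, out_path):
    """Write an iterable of byte chunks to out_path, skipping empty
    keep-alive chunks, and return the total number of bytes written."""
    written = 0
    with open(out_path, "wb") as f:
        for chunk in chunk_iter:
            if chunk:
                f.write(chunk)
                written += len(chunk)
    return written

def stream_export(base_url, session_key, search, out_path, chunk_size=50000):
    """POST to /services/search/jobs/export and stream the response to
    disk chunk by chunk, so the full result set is never held in memory."""
    url = base_url + "/services/search/jobs/export"
    headers = {"Authorization": "Splunk %s" % session_key}
    data = {"search": search, "output_mode": "json"}
    # stream=True defers downloading the body; verify=False skips
    # certificate validation, like curl -k
    resp = requests.post(url, headers=headers, data=data,
                         stream=True, verify=False)
    resp.raise_for_status()
    return write_chunks(resp.iter_content(chunk_size=chunk_size), out_path)

# Example usage (all values are placeholders):
# stream_export("https://splunkhost:8089", sessionKey,
#               "search index=indexname sourcetype=sourcename",
#               "results.json")
```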



guilmxm
SplunkTrust
SplunkTrust

Hello karan1337,

Would you mind sharing a copy of your Python script that calls the REST API and streams the results in chunks?

I'm trying to get the same behavior, and that would be very cool 🙂

Thank you anyway!

Guilhem


martin_mueller
SplunkTrust
SplunkTrust

Have you considered mass-exporting from the CLI?

$SPLUNK_HOME/bin/splunk export eventdata -index indexname -sourcetype sourcetypename -dir /path/to/write/to

More info is available by running splunk help export.


karan1337
Path Finder

I don't have access to the box running Splunk, so I cannot use the CLI; I need to do this remotely. I fixed the problem by using the requests library and writing chunks of results to a file. However, I see that for very large searches the job sometimes auto-finalizes before returning all of the results.
