I am attempting to run the back fill script to populate a summary index however some jobs seem to stall after reading around half the number of events expected, then are killed when they time out.
For example, I am using the following backfill command:
splunk/bin> splunk cmd python fill_summary_index.py -app DAMSAnalyticsTool -name 'DAMS_Client_IP_Summary' -et 1296655199 -lt 1297778399 -showprogress true -dedup true -auth 'admin:secret'
After some time, I will see:
:
:
Executing DAMS_Client_IP_Summary for UTC = 1296678900 (Thu Feb 3 07:35:00 2011)
waiting for job sid = 'admin_nobody_DAMSAnalyticsTool_REFNU19DbGllbnRfSVBfU3VtbWFyeQ_at_1296678900_1925827782'
... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0% ... 50.0%Traceback (most recent call last):
File "fill_summary_index.py", line 401, in
sys.stdout.write(" ... %.1f%%" % (job.doneProgress * 100))
File "/opt/splunk/lib/python2.6/site-packages/splunk/search/init.py", line 506, in getattr
self._getStatus()
File "/opt/splunk/lib/python2.6/site-packages/splunk/search/init.py", line 750, in getStatus
serverResponse, serverContent = rest.simpleRequest(uri, sessionKey=self.sessionKey, raiseAllErrors=True)
File "/opt/splunk/lib/python2.6/site-packages/splunk/rest/init.py", line 414, in simpleRequest
raise splunk.ResourceNotFound, uri
splunk.ResourceNotFound: [HTTP 404] _nobody__DAMSAnalyticsTool_REFNU19DbGllbnRfSVBfU3VtbWFyeQ_at_1296678900_1925827782">https://127.0.0.1:8089/services/search/jobs/admin_nobody_DAMSAnalyticsTool_REFNU19DbGllbnRfSVBfU3VtbWFyeQ_at_1296678900_1925827782; None
(I suspect the error above is because the job timed out rather than an issue with the back fill script ... )
If I run the same saved search over the same time period from the BUI, it returns output within a second or so ...
Does anyone have an idea why this might occur? We have noticed that other searches also take a long time (>20m) to return content and will appear to get 'stuck' at a consistent point (say, 20%) each time they are run before finally producing any output.
Regards,
Malcolm
... View more