Splunk not transferring data to Hadoop cluster

jonnim
Explorer

I am trying to export data from the main index to a Cloudera Hadoop cluster. When I start the scheduled export, it saves a cursor on the Hadoop side, but no files are transferred even though I have 230 MB of data indexed. I tried re-installing Splunk, but the same thing happens. I suspect that Splunk thinks the data has already been exported. It says the last completed run was 33 days ago.

  1. How do I reset this so that it assumes nothing has been exported?
  2. How do I get the data exported?

hyan_splunk
Splunk Employee

It might be because your export job's start time (the "Export from" field on the export create/edit page) is set too early, so the first run or first few runs did not pick up any events. You can manually run the job a few times (by clicking the 'run' link on the job list page) and see whether any events get exported.

You can also check HadoopConnect.log under $SPLUNK_HOME/var/log/splunk/. Each time an export runs, you will get two lines like these:

2014-04-14 13:20:14,868 INFO run_export.py [main] [878] - search:search (_indextime=1397494147 OR _indextime=1397494148 ... OR _indextime=1397495946) host=foo| eval _dstpath=strftime(_time, "%Y%m%d") + "/" + strftime(_time, "%H") + "/" + host | fields _dstpath host source host,source,sourcetype | dump basefilename=foobar ... exportname="Host"

2014-04-14 13:21:51,304 INFO run_export.py [renameTmpFiles] [354] - Renaming 2 temporary files in HDFS...

You can run the first part of the search query in the search bar to verify whether any events fall into that export time frame:

(_indextime=1397494147 OR _indextime=1397494148 ... OR _indextime=1397495946) host=foo| eval _dstpath=strftime(_time, "%Y%m%d") + "/" + strftime(_time, "%H") + "/" + host | fields _dstpath host source host,source,sourcetype
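To see which time frame a given run actually covered, it can help to convert the _indextime values in the logged search into readable timestamps. Here is a minimal Python sketch of that idea. The helper name indextime_window is made up for illustration, and the search string is a shortened copy of the sample above (the elided "..." part is simply left out), so treat it as a rough aid rather than anything Hadoop Connect ships.

```python
import re
from datetime import datetime, timezone


def indextime_window(search):
    """Return the earliest and latest _indextime in a search string as UTC datetimes.

    Hypothetical helper: it only looks for literal `_indextime=<epoch>` terms.
    """
    stamps = [int(value) for value in re.findall(r"_indextime=(\d+)", search)]
    if not stamps:
        raise ValueError("no _indextime terms found in the search string")
    earliest = datetime.fromtimestamp(min(stamps), tz=timezone.utc)
    latest = datetime.fromtimestamp(max(stamps), tz=timezone.utc)
    return earliest, latest


# Shortened version of the logged search; the elided terms are omitted.
search = (
    "(_indextime=1397494147 OR _indextime=1397494148 "
    "OR _indextime=1397495946) host=foo"
)

earliest, latest = indextime_window(search)
print("export window (UTC):", earliest.isoformat(), "to", latest.isoformat())
```

If the window printed here does not overlap the period in which your data was indexed, the run had nothing to export, which would match the "Export from" explanation above.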

The second log statement tells you how many event files were exported to HDFS.
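If you want a quick overview across many runs, the "Renaming N temporary files" lines can be pulled out of the log with a small script. The sketch below assumes the log format matches the two sample lines shown earlier; the function name renamed_file_counts and the regular expression are illustrative guesses based only on those samples, and other versions may word the line differently.

```python
import re

# Pattern inferred from the single sample line in this thread (an assumption).
RENAME_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+ INFO run_export\.py "
    r"\[renameTmpFiles\] \[\d+\] - Renaming (?P<count>\d+) temporary files?"
)


def renamed_file_counts(lines):
    """Return (timestamp, file_count) for every 'Renaming ...' line found."""
    results = []
    for line in lines:
        match = RENAME_RE.match(line)
        if match:
            results.append((match.group("ts"), int(match.group("count"))))
    return results


# Sample line copied from the log excerpt above, plus an unrelated line.
sample_lines = [
    "2014-04-14 13:21:51,304 INFO run_export.py [renameTmpFiles] [354] - "
    "Renaming 2 temporary files in HDFS...",
    "2014-04-14 13:21:52,000 INFO run_export.py [main] [900] - something else",
]

for timestamp, count in renamed_file_counts(sample_lines):
    print(timestamp, "exported files:", count)
```

To use it on the real log, pass an open file handle for $SPLUNK_HOME/var/log/splunk/HadoopConnect.log in place of sample_lines. A run that logs the search line but never a rename line, or only very small counts, is a candidate for the empty-window case described above.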
