
Hadoop Connect error when connecting to CDH5 CentOS Linux cluster

mikejf12
New Member

I installed Splunk today on both a Windows 7 64-bit server and a 32-bit CentOS Linux machine. I also installed version 122 of the Hadoop Connect app from Splunk as a tgz file.

I want to connect to a CDH5 CentOS Linux based Hadoop cluster built with the CDH5 manager. I have set my HDFS URI, my Java home and Hadoop home, and the name node HTTP port. When I click Save, on both Windows and Linux, I get this error:

Unable to connect to Hadoop cluster 'hdfs://hc2nn:8020/' with principal 'None': Invalid HADOOP_HOME. Cannot find Hadoop command under bin directory HADOOP_HOME='/opt/cloudera/parcels/CDH-5.1.3-1.cdh5.1.3.p0.12'.

I can see the hadoop command in the $HADOOP_HOME/bin directory, and I have checked that the command works, i.e. I can connect to HDFS and do a listing. Has anyone seen this error before?

0 Karma

mikejf12
New Member

I was running Hunk from a Linux host that didn't have HDFS access to the remote cluster. Rather than sort out that access, I installed Hunk on the cluster, and it runs OK.

0 Karma

apatil_splunk
Splunk Employee

HADOOP_HOME should be set to a directory on the local machine where Splunk is installed.

It looks like you are setting HADOOP_HOME to a directory on the remote cluster where Hadoop is installed?
HADOOP_HOME='/opt/cloudera/parcels/CDH-5.1.3-1.cdh5.1.3.p0.12'
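One way to confirm this is to check, on the Splunk machine itself, that the configured directory exists locally and contains bin/hadoop. The following is only a minimal sketch: the function name check_hadoop_home is made up for illustration, and it assumes the app looks for a file named hadoop under the bin directory, as the error message suggests. It is not how the app itself performs the check.

```python
import os


def check_hadoop_home(hadoop_home):
    """Return a list of problems with a local HADOOP_HOME (empty if none found).

    Hypothetical helper for illustration; run it on the machine where Splunk runs.
    """
    problems = []
    # The directory must exist on this machine, not only on the remote cluster.
    if not os.path.isdir(hadoop_home):
        problems.append("HADOOP_HOME is not a directory on this machine: " + hadoop_home)
        return problems
    # The error message refers to the hadoop command under the bin directory.
    hadoop_cmd = os.path.join(hadoop_home, "bin", "hadoop")
    if not os.path.isfile(hadoop_cmd):
        problems.append("No hadoop command found at " + hadoop_cmd)
    return problems


if __name__ == "__main__":
    home = os.environ.get("HADOOP_HOME", "")
    for line in check_hadoop_home(home) or ["HADOOP_HOME looks usable: " + home]:
        print(line)
```

If the first message is printed, the path probably exists only on the cluster nodes and not on the Splunk host.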

0 Karma

csharp_splunk
Splunk Employee

Does the user Splunk is running as have permissions to access the $HADOOP_HOME/bin directory?
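A quick way to answer that is to run a small script while logged in as the user that Splunk runs as. This is a rough sketch only: can_run_hadoop is a hypothetical name, and the result reflects the permissions of whoever runs the script, so it is only meaningful when run as the Splunk user.

```python
import getpass
import os


def can_run_hadoop(hadoop_home):
    """Report whether the current user can enter bin/ and execute bin/hadoop.

    Hypothetical helper for illustration; os.access checks the calling user's
    permissions, so run this as the same user that Splunk runs as.
    """
    bin_dir = os.path.join(hadoop_home, "bin")
    hadoop_cmd = os.path.join(bin_dir, "hadoop")
    return {
        # Execute permission on a directory means it can be entered/searched.
        "bin_dir_searchable": os.access(bin_dir, os.X_OK),
        "hadoop_executable": os.access(hadoop_cmd, os.X_OK),
    }


if __name__ == "__main__":
    home = os.environ.get("HADOOP_HOME", "")
    print("Checking as user:", getpass.getuser())
    for name, ok in can_run_hadoop(home).items():
        print(name, "=", ok)
```

If either value is False, the permissions on the directory or the command are worth checking before looking at the app configuration.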

0 Karma