Hadoop Connect to Hortonworks HDP for Windows

owenrumney
New Member

Hi

I've created a Hortonworks HDP cluster with a master and 2 slaves on Windows 2012. This configuration is working fine.

I wanted to connect to this HDFS cluster from Splunk using Hadoop Connect, so I've installed Splunk on a CentOS box along with the Hadoop Connect app. When I try to configure it, I get a load of errors about finding the local Hadoop version.

Is it possible to use Hadoop Connect to explore HDP on Windows, or am I fighting a losing battle?

Thanks,
Owen

EDIT: Given the error "Failed to get local Hadoop version: /usr/bin/env: bash : No such file or directory", I'm inclined to think the issue is that it's using bash instead of python to execute the underlying config file. Do I need to add a shebang at the top of one of the .py files?
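
For reference, here's roughly how the pieces named in the error can be checked from a shell on the Splunk box ($HADOOP_HOME being set in my environment; adjust paths to suit):

    # Confirm that env can actually resolve bash at all
    command -v bash
    /usr/bin/env bash -c 'echo env can find bash'

    # Reproduce the version check the app appears to be running
    $HADOOP_HOME/bin/hadoop version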


barakreeves
Splunk Employee

Regarding the "Failed to get local Hadoop version: /usr/bin/env: bash : No such file or directory" error, two things:
- Check the .py file for execute permissions using "ls -la", then run "chmod +x filename.py" as necessary.
- If the Python files have a shebang, you should be able to execute them directly; otherwise you have to precede the command with "python". See the example below.
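
For example (the path and script name here are just placeholders; substitute whichever script is failing, assuming a default install under $SPLUNK_HOME/etc/apps/HadoopConnect):

    # Check permissions on the app's scripts, then make the failing one executable
    ls -la $SPLUNK_HOME/etc/apps/HadoopConnect/bin/
    chmod +x $SPLUNK_HOME/etc/apps/HadoopConnect/bin/filename.py

    # With a shebang (e.g. #!/usr/bin/env python) it can run directly:
    $SPLUNK_HOME/etc/apps/HadoopConnect/bin/filename.py
    # Without one, invoke the interpreter explicitly:
    python $SPLUNK_HOME/etc/apps/HadoopConnect/bin/filename.py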

HTH.


owenrumney
New Member

I'll give that a go when I get back in tomorrow. My only doubt is that none of the Python files in Hadoop Connect have a shebang at the top, and they all seem to work fine for running the web UI, so I don't see why I would have to modify the distributed files.

I'll certainly give it a go though, thanks for the input!


sdaniels
Splunk Employee

You need to have the proper utilities installed on the Splunk server. Have you checked the following?

Software dependencies

Splunk Hadoop Connect requires that you install the following additional software packages on the Splunk instance on which the app runs:

- Hadoop client utilities (Hadoop CLI)
- Oracle Java Development Kit (JDK) v1.6u31 or higher (required for the Hadoop CLI)
- Kerberos client utilities (to connect to clusters that require Kerberos authentication)

You must make sure you run the correct Hadoop client utilities for your specific Hadoop distribution and version.
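
A quick way to sanity-check each dependency from a shell on the Splunk instance (assuming the tools are on the PATH; klist -V is the MIT Kerberos client):

    java -version      # should report JDK 1.6u31 or higher
    hadoop version     # Hadoop CLI; must match your cluster's distribution and version
    klist -V           # Kerberos client utilities; only needed for Kerberized clusters

If any of these fail, fix that before configuring the app.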


owenrumney
New Member

Hi, thanks for your answer. I've got the JDK and the CLI installed. From the Splunk box I can run $HADOOP_HOME/bin/hadoop dfs -ls hdfs://server:port and it lists the contents of HDFS. So to me, that means the CLI is working okay and Java is fine.

So I'm a bit stumped as to why the config tool in the web front end is having problems.

It seems to be an issue calling the /usr/bin/env command, which is there, so I guess it's a permissions issue.
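
Next thing I'll try is running the exact invocation from the error as the user the Splunk service runs under ("splunk" below is an assumption; worth confirming with ps first):

    ls -l /usr/bin/env
    # Run the failing call as the Splunk service user ("splunk" is a guess)
    sudo -u splunk /usr/bin/env bash -c 'echo OK'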
