I installed Hunk and Hadoop 2.2.0 on my Hunk node and launched an EMR cluster with Hadoop 2.2.0. In indexes.conf, I set vix.fs.default.name to my S3 bucket. This results in the following error message:
cause:org.apache.hadoop.fs.UnsupportedFileSystemException: No AbstractFileSystem for scheme: s3
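For reference, the relevant indexes.conf configuration looked roughly like this (the provider name, bucket name, and environment paths are placeholders, not my actual values):

```
# Sketch of the Hunk provider and virtual index setup; names are placeholders.
[provider:my-emr-provider]
vix.family = hadoop
vix.env.HADOOP_HOME = /opt/hadoop-2.2.0
vix.fs.default.name = s3://my-bucket

[my-s3-vix]
vix.provider = my-emr-provider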
To work around this, I configured core-site.xml to define fs.s3.impl, and added several jars from AWS (e.g. emr-fs-1.0.0.jar). I then started getting some classpath errors and fixed them in hadoop-env.sh.
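The core-site.xml change amounted to something like the following (the class name is the EMRFS implementation AWS ships; treat this as a sketch of what I tried). Note that the exception mentions AbstractFileSystem: in Hadoop 2.x the FileContext API resolves schemes via fs.AbstractFileSystem.&lt;scheme&gt;.impl, which is a separate setting from fs.&lt;scheme&gt;.impl, so defining only the latter may not be enough:

```xml
<!-- Sketch of the core-site.xml workaround; EmrFileSystem is the
     EMRFS class from the AWS jars. -->
<property>
  <name>fs.s3.impl</name>
  <value>com.amazon.ws.emr.hadoop.fs.EmrFileSystem</value>
</property>
<property>
  <name>fs.s3n.impl</name>
  <value>com.amazon.ws.emr.hadoop.fs.EmrFileSystem</value>
</property>
```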
However, I am running into the same error message again:
cause:org.apache.hadoop.fs.UnsupportedFileSystemException: No AbstractFileSystem for scheme: s3n
What else do I need to do?
Added fs.s3n.* and fs.s3.* properties in core-site.xml to provide AWS S3 credentials. I am now able to use hadoop fs -ls to get a listing of my bucket.
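The credential properties were along these lines (the key values are placeholders):

```xml
<!-- Sketch of the S3 credential properties in core-site.xml;
     the keys below are placeholders. -->
<property>
  <name>fs.s3n.awsAccessKeyId</name>
  <value>YOUR_ACCESS_KEY</value>
</property>
<property>
  <name>fs.s3n.awsSecretAccessKey</name>
  <value>YOUR_SECRET_KEY</value>
</property>
<property>
  <name>fs.s3.awsAccessKeyId</name>
  <value>YOUR_ACCESS_KEY</value>
</property>
<property>
  <name>fs.s3.awsSecretAccessKey</name>
  <value>YOUR_SECRET_KEY</value>
</property>
```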
Are you able to use the Hadoop CLI to access the s3 filesystem (hadoop fs -ls s3://... )?
Not sure how I messed up the formatting. It should say "$> hadoop fs -ls s3n://xxx/"
Can you please provide us the full stacktrace so we can see during which step is Hunk failing?
I also tried setting vix.fs.s3n.impl = com.amazon.ws.emr.hadoop.fs.EmrFileSystem and vix.fs.s3.impl = com.amazon.ws.emr.hadoop.fs.EmrFileSystem in the provider stanza.
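In indexes.conf terms, that looked roughly like this (the provider name is a placeholder; Hunk passes vix.fs.* settings through to the Hadoop configuration):

```
# Sketch of the provider stanza with the EMRFS overrides; provider name is a placeholder.
[provider:my-emr-provider]
vix.fs.s3n.impl = com.amazon.ws.emr.hadoop.fs.EmrFileSystem
vix.fs.s3.impl = com.amazon.ws.emr.hadoop.fs.EmrFileSystem
```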
Fixed that, but I am still getting the same error.
Great idea. Unfortunately, I cannot: ls: 's3://xxx': No such file or directory. This tells me that the problem is in my Hadoop config, not my Hunk config.