Using Hunk with a simple search like index=myindex retrieves all the expected results. But as soon as I add something else (e.g. sourcetype=mysourcetype, or something like | stats count by user), the search stops at some point after 400,000 or more events with an error like:
[myprovider] JobStartException - Failed to start MapReduce job. Please consult search.log for more information. Message: [ Failed to start MapReduce job, name=SPLK_myclient_XXXXXXXX.XX_X ] and [ Failed on local exception: java.io.IOException: Couldn't setup connection for myuser/myclient@myrealm to myuser/hadoopnamenode@myrealm; Host Details : local host is: "myclient/myclientIP"; destination host is: "hadoopNameNode":9001; ]
I tried the solution related to the ephemeral port needed to send confirmation of MapReduce job termination, by shutting down the firewalls on both the Hadoop and client machines, but the same error occurred.
search.log shows several entries like:
09-09-2014 17:35:31.620 DEBUG ERP.idoop11 - Client$Connection$1 - Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - UNKNOWN_SERVER)]
But it continues to parse the log files until it crashes.
Any clue where to look?
Thanks,
Happy to hear things are working
The reason you are not seeing the error when you just run ' index=xyz ' is that you are not running MR jobs; you are just running Splunk in streaming mode on Hadoop.
To run MR jobs you need ' index=xyz | ... ' (a search with a command after the pipe runs as a MapReduce job).
Before you can use Hunk, it requires:
1) Hadoop libraries
2) These Hadoop libraries must be the exact same version as on the Hadoop server !! Important: otherwise your MR jobs will fail
3) Java
4) A user that can install Hunk and that also exists on HDFS under /user/
Once you have the above working, we can talk about Kerberos:
1) Make sure the Hadoop client node (Hunk) has a keytab and a fully working Kerberos setup (kinit, kadmin, etc.) before you configure Hunk to use it
2) Find the file /etc/krb5.conf
3) From that file you can find many of these values, which Hunk needs in order to use Kerberos:
vix.java.security.krb5.kdc =
vix.java.security.krb5.realm =
vix.kerberos.principal =
vix.kerberos.keytab =
vix.hadoop.security.authentication =
vix.hadoop.security.authorization =
vix.dfs.namenode.kerberos.principal =
vix.mapreduce.jobtracker.kerberos.principal =
vix.hadoop.security.auth_to_local =
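For illustration, a filled-in provider stanza in indexes.conf might look like the sketch below. All hostnames, realms, paths, and principal names here are placeholders, not values from this thread; substitute the values from your own /etc/krb5.conf and cluster configuration.

```ini
[provider:myprovider]
# Placeholder paths; point these at your actual Java and Hadoop installs
vix.env.JAVA_HOME = /usr/lib/jvm/java
vix.env.HADOOP_HOME = /usr/lib/hadoop
vix.fs.default.name = hdfs://hadoopnamenode:8020
vix.mapred.job.tracker = hadoopnamenode:9001
# Kerberos settings derived from /etc/krb5.conf and the cluster's service principals
vix.hadoop.security.authentication = kerberos
vix.hadoop.security.authorization = true
vix.kerberos.principal = myuser@MYREALM
vix.kerberos.keytab = /etc/security/keytabs/myuser.keytab
vix.dfs.namenode.kerberos.principal = hdfs/_HOST@MYREALM
vix.mapreduce.jobtracker.kerberos.principal = mapred/_HOST@MYREALM
```

Note that the two service principals at the bottom must match what the NameNode and JobTracker actually run as on the cluster, not the user running Hunk.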
Thanks, I was able to fix the provider configuration. The only error I made was in the key
vix.mapreduce.jobtracker.kerberos.principal
where I had set MyUser/_HOST@myrealm rather than mapred/_HOST@myrealm. MyUser is the user running the Hunk client; it is not defined on Hadoop, even though it has credentials to access it.
There is no need for the following keys as long as I use the Kerberos conf file:
vix.java.security.krb5.kdc
vix.java.security.krb5.realm
vix.hadoop.security.auth_to_local
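Concretely, the fix described above amounts to changing one line in the provider stanza (realm and names are the placeholders used in this thread):

```ini
# Wrong: the Hunk client user is not a service principal the Hadoop KDC knows,
# which produces "Server not found in Kerberos database (7) - UNKNOWN_SERVER"
vix.mapreduce.jobtracker.kerberos.principal = MyUser/_HOST@myrealm

# Correct: the service principal the JobTracker actually runs as
vix.mapreduce.jobtracker.kerberos.principal = mapred/_HOST@myrealm
```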
hi rdagan, as Benoit mentioned above, the search works fine when executed just against the index, but shows this error whenever any additional search parameter is added. Wouldn't it produce the same error and return no results at all if the configs you mentioned were not in place?