Known Issues for Big Data Service

Known issues have been identified in Big Data Service.

Synchronize Hive Databases Task Fails When Specifying Wildcard Character in Apache Ambari

Details
In Big Data Service clusters that use Oracle Distribution including Apache Hadoop, if you synchronize Hive databases by specifying the wildcard character * for the Synchronize Hive Databases property in Apache Ambari, you receive an error stating that the synchronization of Hive metadata failed.
Workaround
We are aware of the issue and working on a resolution. Until then, don't use the wildcard character * for the Synchronize Hive Databases property. Instead, explicitly specify the Hive databases you want to synchronize as a comma-separated list with no spaces. For example: db1,db2.
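When a cluster has many Hive databases, the comma-separated value can be built from a plain list of names rather than typed by hand. The helper below is a minimal sketch, not part of the product; it assumes the database names are available one per line (for example, from the output of SHOW DATABASES in Beeline).

```shell
# Hedged sketch: join database names (one per line on stdin) into the
# comma-separated, space-free format the Synchronize Hive Databases
# property expects.
join_db_list() {
    # paste joins stdin lines with a comma and no surrounding spaces
    paste -s -d ',' -
}

# Example: printf 'db1\ndb2\n' | join_db_list   prints db1,db2
```

Paste the resulting value (for example, db1,db2) into the Synchronize Hive Databases property in Apache Ambari.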

Restarting Kafka Broker Fails

Details
A Kafka broker might fail to start up after a restart.
Workaround
Remove the .lock file manually:
  1. SSH to the failing broker node.
  2. Run:

    rm -f /u01/kafka-logs/.lock

Spark Job Might Fail With a 401 Error While Trying to Download the Ranger-Spark Policies

Details
In a Big Data Service HA cluster with the Ranger-Spark plugin enabled, when you try any operation on Hive tables using the spark-submit command in cluster mode, the Spark job might fail with a 401 error while trying to download the Ranger-Spark policies. This issue arises from a known delegation token problem on the Ranger side.
Workaround
We recommend that you include the user's keytab and principal in the spark-submit command. Spark then uses the provided keytab and principal for authentication and can communicate with Ranger to download policies without relying on delegation tokens.

Example:

spark-submit --master yarn --deploy-mode cluster --name SparkHiveQueryJob --keytab <keytab-path> --principal <keytab-principal> --class com.oracle.SparkHiveQuery ./SparkTests-1.0-SNAPSHOT.jar
Note

  • The provided user (keytab user/principal) must have the necessary permissions to download Ranger policies and tags. These permissions can be configured using the Ranger-admin UI.

    In Ranger, click Edit for the Spark repository and go to the Add New Configurations section. Ensure the specified user appears in both the policy.download.auth.users and tag.download.auth.users lists; if not, add the user and save.

    Example:

    spark,jupyterhub,hue,livy,trino

  • Grant the required permissions to the same user in Ranger-Spark policies to access the necessary tables.
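Before rerunning the Spark job, you can check whether the user is authorized to download policies by calling the Ranger policy-download REST endpoint directly. The sketch below is an assumption-laden illustration: the endpoint path is the standard Ranger plugin download path, but the host, port, repository name, and credentials are placeholders you must replace with your cluster's values.

```shell
# Hedged sketch: compose the Ranger policy-download REST URL so curl can
# verify authorization. HTTP 200 means the user may download policies;
# HTTP 401 suggests it is missing from policy.download.auth.users.
ranger_policy_download_url() {
    # $1 = Ranger admin base URL, $2 = Ranger repository (service) name
    echo "$1/service/plugins/policies/download/$2"
}

# Example check (hypothetical host, repository, and credentials):
#   curl -s -o /dev/null -w '%{http_code}\n' -u 'user:password' \
#     "$(ranger_policy_download_url https://ranger-host:6182 bds_spark)"
```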

For more information on Ranger plugins, see Using Ranger Plugins.