Planning and Understanding ODH Clusters

Before creating Big Data Service clusters, plan the cluster layout and make sure you understand instance types, shapes, and cluster profiles.

For more information, see the following:

Planning the Cluster Layout, Shape, and Storage

Before you start the process to create a cluster, you must plan the layout of the cluster, the node shapes, and storage.

Understanding Instance Types and Shapes

Big Data Service cluster nodes run in Oracle Cloud Infrastructure compute instances (servers).

When you create a cluster, you choose an instance type, which determines whether the instance runs directly on bare metal hardware or in a virtualized environment. You also choose a shape, which determines the resources assigned to the instance.

Understanding Cluster Profiles

Cluster profiles enable you to create optimal clusters for a specific workload or technology. After creating a cluster with a specific cluster profile, more Hadoop services can be added to the cluster.

Cluster Profile Types

Oracle Big Data Service enables you to create clusters using several cluster profile types. The following table lists the components included in each profile.

| Cluster profile | Components (Secure and Highly Available) | Components |
|---|---|---|
| HADOOP_EXTENDED [1] | Hive, Spark, HDFS, Yarn, ZooKeeper, MapReduce2, Ambari Metrics, Ranger, Hue, Oozie, Tez | Hive, Spark, HDFS, Yarn, ZooKeeper, MapReduce2, Ambari Metrics, Hue, Oozie, Tez |
| HADOOP | HDFS, Yarn, ZooKeeper, MapReduce2, Ambari Metrics, Ranger, Hue | HDFS, Yarn, ZooKeeper, MapReduce2, Ambari Metrics, Hue |
| HIVE | Hive, HDFS, Yarn, ZooKeeper, MapReduce2, Ambari Metrics, Ranger, Hue, Tez | Hive, HDFS, Yarn, ZooKeeper, MapReduce2, Ambari Metrics, Hue, Tez |
| SPARK | Spark, Hive [2], HDFS, Yarn, ZooKeeper, MapReduce2, Ambari Metrics, Ranger, Hue | Spark, Hive [2], HDFS, Yarn, ZooKeeper, MapReduce2, Ambari Metrics, Hue |
| HBASE | HBase, HDFS, Yarn, ZooKeeper, MapReduce2, Ambari Metrics, Ranger, Hue | HBase, HDFS, Yarn, ZooKeeper, MapReduce2, Ambari Metrics, Hue |
| TRINO | Trino, Hive [3], HDFS, ZooKeeper, Ambari Metrics, Ranger, Hue | Trino, Hive [3], HDFS, ZooKeeper, Ambari Metrics, Hue |
| KAFKA | Kafka Broker, HDFS, ZooKeeper, Ambari Metrics, Ranger, Hue | Kafka Broker, HDFS, ZooKeeper, Ambari Metrics, Hue |

[1] HADOOP_EXTENDED consists of the components that were included in clusters created before cluster profiles were available.

[2] The Hive metastore component from the Hive service is used for managing the metadata in Spark.

[3] The Hive metastore component from the Hive service is used for managing the Hive metadata entities in Trino.
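When comparing profiles, it can help to hold the table above as data. The following is a minimal Python sketch, not part of any Oracle SDK or CLI: the names `SECURE_HA_COMPONENTS`, `components_for`, and `profiles_with` are hypothetical. It relies on the observation that, in the table above, the two component columns differ only by Ranger, and it assumes the second Components column describes clusters created without the secure and highly available option.

```python
# Components per cluster profile for secure and highly available clusters,
# transcribed from the table above. All names here are illustrative only.
SECURE_HA_COMPONENTS = {
    "HADOOP_EXTENDED": ["Hive", "Spark", "HDFS", "Yarn", "ZooKeeper", "MapReduce2",
                        "Ambari Metrics", "Ranger", "Hue", "Oozie", "Tez"],
    "HADOOP": ["HDFS", "Yarn", "ZooKeeper", "MapReduce2", "Ambari Metrics",
               "Ranger", "Hue"],
    "HIVE": ["Hive", "HDFS", "Yarn", "ZooKeeper", "MapReduce2", "Ambari Metrics",
             "Ranger", "Hue", "Tez"],
    "SPARK": ["Spark", "Hive", "HDFS", "Yarn", "ZooKeeper", "MapReduce2",
              "Ambari Metrics", "Ranger", "Hue"],
    "HBASE": ["HBase", "HDFS", "Yarn", "ZooKeeper", "MapReduce2", "Ambari Metrics",
              "Ranger", "Hue"],
    "TRINO": ["Trino", "Hive", "HDFS", "ZooKeeper", "Ambari Metrics", "Ranger", "Hue"],
    "KAFKA": ["Kafka Broker", "HDFS", "ZooKeeper", "Ambari Metrics", "Ranger", "Hue"],
}


def components_for(profile, secure_ha):
    """Return the component list for a profile.

    Assumption: the second table column equals the first minus Ranger,
    which holds for every row of the table above.
    """
    components = SECURE_HA_COMPONENTS[profile.upper()]
    if secure_ha:
        return list(components)
    return [name for name in components if name != "Ranger"]


def profiles_with(component):
    """Return the sorted names of profiles whose component list includes `component`."""
    return sorted(
        profile
        for profile, components in SECURE_HA_COMPONENTS.items()
        if component in components
    )


if __name__ == "__main__":
    print(components_for("kafka", secure_ha=False))
    print(profiles_with("Tez"))
```

For example, `profiles_with("Tez")` lists the profiles that already include Tez, which is a quick way to see which profiles would need that service added after cluster creation.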

Apache Hadoop Versions in Cluster Profiles

The following tables list the Hadoop component versions included in each cluster profile, by ODH version.

ODH 1.x

| Cluster profile | Version |
|---|---|
| HADOOP_EXTENDED | HDFS 3.1, Hive 3.1, Spark 3.0.2 |
| HADOOP | HDFS 3.1 |
| HIVE | Hive 3.1 |
| SPARK | Spark 3.0.2 |
| HBASE | HBase 2.2 |
| TRINO | Trino 360 |
| KAFKA | Kafka 2.1.0 |

ODH 2.x

| Cluster profile | Version |
|---|---|
| HADOOP_EXTENDED | HDFS 3.3, Hive 3.1, Spark 3.2 |
| HADOOP | HDFS 3.3 |
| HIVE | Hive 3.1 |
| SPARK | Spark 3.2 |
| HBASE | HBase 2.2 |
| TRINO | Trino 389 |
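
To see at a glance which component versions differ between ODH 1.x and ODH 2.x, the two version tables can be compared programmatically. The sketch below is illustrative only: `COMPONENT_VERSIONS` and `changed_between` are hypothetical names, and the data is transcribed from the tables above. Profiles that aren't listed for both ODH versions are skipped rather than guessed.

```python
# Component versions per ODH version and cluster profile, transcribed from
# the tables above. Only rows that appear in those tables are included.
COMPONENT_VERSIONS = {
    "1.x": {
        "HADOOP_EXTENDED": {"HDFS": "3.1", "Hive": "3.1", "Spark": "3.0.2"},
        "HADOOP": {"HDFS": "3.1"},
        "HIVE": {"Hive": "3.1"},
        "SPARK": {"Spark": "3.0.2"},
        "HBASE": {"HBase": "2.2"},
        "TRINO": {"Trino": "360"},
        "KAFKA": {"Kafka": "2.1.0"},
    },
    "2.x": {
        "HADOOP_EXTENDED": {"HDFS": "3.3", "Hive": "3.1", "Spark": "3.2"},
        "HADOOP": {"HDFS": "3.3"},
        "HIVE": {"Hive": "3.1"},
        "SPARK": {"Spark": "3.2"},
        "HBASE": {"HBase": "2.2"},
        "TRINO": {"Trino": "389"},
    },
}


def changed_between(old_odh, new_odh):
    """Return {profile: {component: (old_version, new_version)}} for changed versions.

    Profiles or components missing from either ODH version are skipped.
    """
    old_profiles = COMPONENT_VERSIONS[old_odh]
    new_profiles = COMPONENT_VERSIONS[new_odh]
    changes = {}
    for profile, old_versions in old_profiles.items():
        new_versions = new_profiles.get(profile)
        if new_versions is None:
            continue  # profile not listed for the newer ODH version
        diff = {
            name: (old_versions[name], new_versions[name])
            for name in old_versions
            if name in new_versions and old_versions[name] != new_versions[name]
        }
        if diff:
            changes[profile] = diff
    return changes


if __name__ == "__main__":
    for profile, diff in changed_between("1.x", "2.x").items():
        print(profile, diff)
```

Profiles whose listed versions are identical in both tables, such as HIVE and HBASE, don't appear in the result.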