Using GPU

Data Flow supports fully managed, GPU-based Spark for speeding up and unifying the data and AI/ML pipeline.

NVIDIA's Spark RAPIDS accelerator transparently speeds up ETL on Data Flow and improves price-performance. Existing Data Flow Spark applications can run on GPUs with zero code changes to benefit from NVIDIA RAPIDS acceleration.

Prerequisite

GPU shapes on Data Flow require Spark 3.2.1 or later. To use GPU shapes, file a service request for Oracle Cloud Infrastructure Data Flow that names the GPU shape you want and describes your use case. This information is used to complete the necessary limit and quota increase for your tenancy.

Supported Shapes

  • VM.GPU.A10: X9-based GPU compute.
    • GPU: NVIDIA A10 24 GB
    • CPU: Intel Xeon Platinum 8358. Base frequency 2.6 GHz, max turbo frequency 3.4 GHz.
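Supported GPU Shapes
Shape                     | Number of OCPUs | GPU Memory (GB) | CPU Memory (GB) | Block Storage (GB) | Maximum Network Bandwidth (Gbps) | Maximum Total Number of VNICs (Linux)
--------------------------|-----------------|-----------------|-----------------|--------------------|----------------------------------|--------------------------------------
VM.GPU.A10.1 (GPU: 1xA10) | 15              | 24              | 240             | 1575               | 24                               | 15
VM.GPU.A10.2 (GPU: 2xA10) | 30              | 48              | 480             | 3075               | 48                               | 24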

For more information, see the Compute documentation.
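When preparing the limit and quota request, it can help to total the resources a run would consume. The following is a minimal sketch that copies the OCPU, GPU, and memory figures from the Supported GPU Shapes table; the `GPU_SHAPES` dictionary and the `run_totals` helper are hypothetical names used only for illustration and are not part of Data Flow.

```python
# Values copied from the "Supported GPU Shapes" table above.
GPU_SHAPES = {
    "VM.GPU.A10.1": {"gpus": 1, "ocpus": 15, "gpu_memory_gb": 24, "cpu_memory_gb": 240},
    "VM.GPU.A10.2": {"gpus": 2, "ocpus": 30, "gpu_memory_gb": 48, "cpu_memory_gb": 480},
}


def run_totals(driver_shape, executor_shape, num_executors):
    """Sum the resources of one driver plus `num_executors` executors."""
    if num_executors < 1:
        raise ValueError("num_executors must be at least 1")
    driver = GPU_SHAPES[driver_shape]      # KeyError for shapes not in the table
    executor = GPU_SHAPES[executor_shape]
    return {key: driver[key] + num_executors * executor[key] for key in driver}


if __name__ == "__main__":
    # One VM.GPU.A10.1 driver and two VM.GPU.A10.2 executors.
    print(run_totals("VM.GPU.A10.1", "VM.GPU.A10.2", num_executors=2))
```

For one VM.GPU.A10.1 driver and two VM.GPU.A10.2 executors, the totals come to 5 GPUs, 75 OCPUs, 120 GB of GPU memory, and 1200 GB of CPU memory.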

Getting Started with GPU Shapes for Spark on Data Flow

When creating an application or run, select the appropriate GPU shape for the driver and the executors. Data Flow preconfigures GPUs for the driver and executors using Apache Spark's resource-aware scheduling. Configure the application using the Spark-submit options.
Note

When you mix and match CPU and GPU shapes of different architectures for your driver and executors, ensure that the application and all dependencies are architecture-agnostic.
Here are two examples of Spark-submit-compatible commands for enabling GPU acceleration:
Using --jars and --conf options
This Spark-submit-compatible command uses the included NVIDIA RAPIDS accelerator for Apache Spark:
oci --profile <cli-profile> --auth security_token data-flow run submit \
--compartment-id <compartment-id> \
--driver-shape "VM.GPU.A10.1" \
--executor-shape "VM.GPU.A10.1" \
--num-executors 1 \
--spark-version "3.2.1" \
--execute "--jars oci://dataflow_sample_apps@bigdatadatasciencelarge/rapids-4-spark_2.12-23.06.0.jar \
--conf spark.rapids.sql.explain=ALL \
--conf spark.plugins=com.nvidia.spark.SQLPlugin \
--class com.oracle.oci.dataflow.samples.DataFlowJavaSample \
oci://dataflow_sample_apps@bigdatadatasciencelarge/dataflow-java-sample-1.0-SNAPSHOT.jar"
Using --packages and --conf options
This Spark-submit-compatible command pulls the NVIDIA RAPIDS accelerator for Apache Spark (version 23.06.0 in this example) from the Maven repository:
oci --profile <cli-profile> --auth security_token data-flow run submit \
--compartment-id <compartment-id> \
--driver-shape "VM.GPU.A10.1" \
--executor-shape "VM.GPU.A10.1" \
--num-executors 1 \
--spark-version "3.2.1" \
--execute "--packages com.nvidia:rapids-4-spark_2.12:23.06.0 \
--conf spark.rapids.sql.explain=ALL \
--conf spark.plugins=com.nvidia.spark.SQLPlugin \
--class com.oracle.oci.dataflow.samples.DataFlowJavaSample \
oci://dataflow_sample_apps@bigdatadatasciencelarge/dataflow-java-sample-1.0-SNAPSHOT.jar"
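The quoting in these commands is easy to get wrong, because the Spark-submit options must be passed as a single --execute argument while the shape options belong to the oci command itself. The sketch below assembles the same argument list in Python, using only the options that appear in the commands above. The helper name `build_run_submit` and its parameters are hypothetical, and the --profile and --auth options are left out for brevity. The version check reflects the Spark 3.2.1 minimum stated under Prerequisite.

```python
import shlex

MIN_SPARK_VERSION = (3, 2, 1)  # minimum stated in the Prerequisite section


def build_run_submit(compartment_id, main_class, app_uri, *,
                     driver_shape="VM.GPU.A10.1", executor_shape="VM.GPU.A10.1",
                     num_executors=1, spark_version="3.2.1",
                     jars=(), packages=(), conf=None):
    """Return the argv list for `oci data-flow run submit` (hypothetical helper)."""
    if tuple(int(part) for part in spark_version.split(".")) < MIN_SPARK_VERSION:
        raise ValueError(f"GPU shapes need Spark 3.2.1 or later, got {spark_version}")

    # The RAPIDS plugin is always enabled; caller-supplied conf is merged on top.
    spark_conf = {"spark.plugins": "com.nvidia.spark.SQLPlugin"}
    spark_conf.update(conf or {})

    # Spark-submit options travel inside ONE --execute argument.
    execute = []
    if jars:
        execute += ["--jars", ",".join(jars)]
    if packages:
        execute += ["--packages", ",".join(packages)]
    for key, value in spark_conf.items():
        execute += ["--conf", f"{key}={value}"]
    execute += ["--class", main_class, app_uri]

    return [
        "oci", "data-flow", "run", "submit",
        "--compartment-id", compartment_id,
        "--driver-shape", driver_shape,
        "--executor-shape", executor_shape,
        "--num-executors", str(num_executors),
        "--spark-version", spark_version,
        "--execute", " ".join(execute),
    ]


if __name__ == "__main__":
    argv = build_run_submit(
        "<compartment-id>",
        "com.oracle.oci.dataflow.samples.DataFlowJavaSample",
        "oci://dataflow_sample_apps@bigdatadatasciencelarge/dataflow-java-sample-1.0-SNAPSHOT.jar",
        packages=["com.nvidia:rapids-4-spark_2.12:23.06.0"],
        conf={"spark.rapids.sql.explain": "ALL"},
    )
    # shlex.join quotes the --execute string so that a shell sees one argument.
    print(shlex.join(argv))
```

Building the command as a list and quoting it once at the end avoids nesting double quotes inside the --execute string.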

For the latest Spark RAPIDS version, check NVIDIA Spark RAPIDS download version. For more tuning options, see the RAPIDS Accelerator for Apache Spark Tuning Guide.

When you use NVIDIA's Spark RAPIDS accelerator for GPU runs on Data Flow, the Spark UI marks each operation that runs on the GPU instead of the CPU with a 'GPU' prefix. For example:
Figure 1. Example of the Spark UI showing the GPU prefix on operations that replaced their CPU equivalents.
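The same check can be made on a textual physical plan, such as the one logged when spark.rapids.sql.explain=ALL is set. The sketch below splits plan operators into GPU and CPU groups, assuming GPU operator names start with `Gpu` (for example `GpuFilter`). The `split_gpu_cpu` helper and the `SAMPLE_PLAN` text are hypothetical illustrations, not output captured from a real Data Flow run.

```python
import re
from collections import Counter

# Matches the operator name at the start of a plan line, skipping tree-drawing
# characters ("+- ", ":- ", "|") and whole-stage-codegen markers such as "*(1) ".
_OPERATOR = re.compile(r"^[\s:+\-|]*(?:\*\(\d+\)\s*)?([A-Za-z]\w*)")

# Illustrative plan text only; a real plan will differ.
SAMPLE_PLAN = """\
== Physical Plan ==
*(1) Project [id#0L]
+- GpuColumnarToRow false
   +- GpuFilter (id#0L > 5)
      +- GpuRange (0, 10, step=1, splits=8)
"""


def split_gpu_cpu(plan_text):
    """Count operators per name, separated into GPU-prefixed and other operators."""
    gpu, cpu = Counter(), Counter()
    for line in plan_text.splitlines():
        if line.startswith("=="):          # section header, not an operator
            continue
        match = _OPERATOR.match(line)
        if not match:                      # blank or unrecognized line
            continue
        name = match.group(1)
        (gpu if name.startswith("Gpu") else cpu)[name] += 1
    return gpu, cpu


if __name__ == "__main__":
    gpu_ops, cpu_ops = split_gpu_cpu(SAMPLE_PLAN)
    print("On GPU:", sorted(gpu_ops))
    print("On CPU:", sorted(cpu_ops))
```

Operators that land in the CPU group are candidates to look up in the RAPIDS tuning guide.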