Migrating Spark Applications to Oracle Cloud Infrastructure Data Flow

This tutorial shows you how to migrate your existing Spark applications to Oracle Cloud Infrastructure Data Flow.

Before You Begin

To complete this tutorial successfully, you must have set up your tenancy (see Set Up Your Tenancy) and be able to access Data Flow (see Access Data Flow).

Allowed Spark Variables

Data Flow automatically configures many Spark variables based on factors such as the infrastructure you choose for a run. To ensure proper operation, some Spark variables can't be set or overridden when running jobs. For more information, see Supported Spark Properties in Data Flow.
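If you want to confirm what was configured for a particular run, you can read the effective Spark properties from the session at runtime. This is a minimal Scala sketch under that assumption; the object name is only illustrative:

import org.apache.spark.sql.SparkSession

object ShowEffectiveConfig {
  def main(args: Array[String]): Unit = {
    // Obtain the session that Data Flow created for this run.
    val spark = SparkSession.builder().getOrCreate()

    // Print every effective Spark property so you can see what the
    // service configured (for example, executor memory and core counts).
    spark.conf.getAll.toSeq.sortBy(_._1).foreach { case (key, value) =>
      println(s"$key=$value")
    }
  }
}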

Compatibility Limitations

You can't set environment variables in Data Flow jobs. Instead, you can pass the values as command-line arguments and add them to the environment within the application, as in the sketch below.
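The following is a minimal Scala sketch of this pattern. It reads "--key value" pairs from the application arguments and uses them directly in place of environment variables; the argument names and the path shown are only illustrative:

import org.apache.spark.sql.SparkSession

object ArgsInsteadOfEnv {
  def main(args: Array[String]): Unit = {
    // Parse "--key value" pairs passed as application arguments,
    // for example: --input-path oci://bucket@namespace/input
    val params = args.grouped(2).collect {
      case Array(key, value) if key.startsWith("--") =>
        key.stripPrefix("--") -> value
    }.toMap

    val spark = SparkSession.builder().getOrCreate()

    // Use the parsed values where you would otherwise have read
    // environment variables.
    val inputPath = params("input-path")
    spark.read.text(inputPath).show(10)
  }
}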

1. Supported Ways to Access the Spark Session

Data Flow creates the Spark session before your Spark application runs. It ensures that your application takes full advantage of all the hardware you configured your run to use.
Important

Don't try to create a Spark session within your application. A session that you create yourself doesn't use the hardware you provisioned for the run, and other unpredictable behavior might result.

The following are supported ways of accessing your Spark session within applications:
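For example, in Scala, the usual pattern is to obtain the session that Data Flow has already created by calling getOrCreate rather than constructing a new session. This is a minimal sketch, and the application name shown is only illustrative:

import org.apache.spark.sql.SparkSession

object GetSessionExample {
  def main(args: Array[String]): Unit = {
    // getOrCreate returns the session that Data Flow created for this run;
    // it does not build a second, unmanaged session.
    val spark = SparkSession
      .builder()
      .appName("example-application") // illustrative name
      .getOrCreate()

    // Use the session as usual.
    spark.range(10).show()
  }
}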

2. Managing Java Dependencies for Apache Spark Applications in Data Flow

In Data Flow, when you run Java or Scala applications that rely on JARs not included with Spark, you must create uber (fat) JARs that bundle the dependencies your code needs. The Data Flow runtime already includes several popular open source libraries that your Java or Scala applications might also use. To avoid runtime conflicts between the library versions that Data Flow provides and the versions in your application, use a process called shading. You might need to recompile your Java or Scala applications with shading rules for them to run correctly in Data Flow; one way to express such rules is sketched after the note below.
Note

Shading is not needed if you are using Spark 3.5.0 or 3.2.1.
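If you build with sbt, one way to express shading rules is with the sbt-assembly plugin. This is a minimal build.sbt sketch, and the relocated package and Spark version shown are examples only, not the actual rules Data Flow requires:

// build.sbt -- minimal sketch of shading with the sbt-assembly plugin.
// The relocated package below is an example only; apply rules that
// match the conflicting libraries in your application.
assembly / assemblyShadeRules := Seq(
  ShadeRule
    .rename("com.google.protobuf.**" -> "shaded.com.google.protobuf.@1")
    .inAll
)

// Exclude Spark itself from the fat JAR; Data Flow provides it at runtime.
libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.2.1" % "provided"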

What's Next

Now you can start migrating your Spark applications to run in Data Flow.