Migrating Spark Applications to Oracle Cloud Infrastructure Data Flow
This tutorial shows you how to migrate your existing Spark applications to Oracle Cloud Infrastructure Data Flow.
Before You Begin
To successfully perform this tutorial, you must have Set Up Your Tenancy, and be able to Accessing Data Flow.
Allowed Spark Variables
Data Flow automatically configures many Spark variables based on factors including the infrastructure you choose when running jobs. To ensure proper operation, some Spark variables can't be set or overridden when running jobs. For more information, see Supported Spark Properties in Data Flow.
Compatibility Limitations
You cannot set environment variables in Data Flow jobs. Instead, you can pass variables as command line arguments, and add them to the environment in the application.
1. Supported Ways to Access the Spark Session
Do not attempt to create a Spark session within your application. It doesn't use the hardware you provisioned for it, and other unpredictable behavior might result.
The following are supported ways of accessing your Spark session within applications:
- Basic session creation.
Builder builder = SparkSession.builder().appName("My App"); SparkSession session = builder.getOrCreate();
- Set configurations while getting the session.
Builder builder = SparkSession.builder().appName("My App"); builder.config("spark.sql.orc.impl", "hive"); SparkSession session = builder.getOrCreate();
- The spark.sql settings can be changed after the session is
retrieved.
Builder builder = SparkSession.builder().appName("My App"); SparkSession session = builder.getOrCreate(); session.conf().set("spark.sql.crossJoin.enabled", "true");
- Basic session creation.
spark_builder = SparkSession.builder.appName("My App") spark_session = spark_builder.getOrCreate()
- Set configurations while getting the
session.
spark_builder = SparkSession.builder.appName("My App") spark_builder.config("spark.sql.orc.impl", "hive") spark_session = spark_builder.getOrCreate()
- The spark.sql settings can be changed after the session is
retrieved.
spark_builder = SparkSession.builder.appName("My App") spark_session = spark_builder.getOrCreate() spark_session.conf.set("spark.sql.crossJoin.enabled", "true")
- Basic session creation.
val builder = SparkSession.builder.appName("My App") val session = builder.getOrCreate()
- Set configurations while getting the
session.
val builder = SparkSession.builder.appName("My App") builder.config("spark.sql.orc.impl", "hive") val session = builder.getOrCreate()
- The spark.sql settings can be changed after the session is
retrieved.
val builder = SparkSession.builder.appName("My App") val session = builder.getOrCreate() session.conf.set("spark.sql.crossJoin.enabled", "true")
No action is required. SQL manages the Spark session for you automatically.
2. Managing Java Dependencies for Apache Spark Applications in Data Flow
Shading is not needed if you are using Spark 3.2.1.
To shade your application using Maven, use the template pom.xml to build Data Flow Applications on your development machine, and as the basis for your own application libraries.
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<artifactId>project</artifactId>
<groupId>org.example</groupId>
<version>1.0-SNAPSHOT</version>
<!-- README -->
<!--
This template pom will build Spark on your local machine Spark libraries that match Data Flow.
Using a web browser:
Download Spark to your project base directory - https://archive.apache.org/dist/spark/
Use this exact version to match Data Flow - spark-3.2.1-bin-hadoop3.2.tgz
Alternatively:
$ wget https://archive.apache.org/dist/spark/spark-3.2.1/spark-3.2.1-bin-hadoop3.2.tgz
To test and debug on a local dev machine
$ mvn -P dev clean install
OR
Select dev profile under maven->profiles
To build for Data Flow
$ mvn clean install
You may run your Spark application in two ways, either by running your application main() method
directly in your IDE, e.g. main() -> right-click -> Run, or from the command-line
using spark-submit. Remember, if you change the pom artifactId, also change the jar name in the
command below. You may also need to explicitly select the "dev" Maven profile in your favorite
IDE to avoid ClassNotFoundException when you setup the example project.
$ ./target/spark-3.2.1-bin-hadoop3.2/bin/spark-submit -\
-class example.Example target/project-1.0-SNAPSHOT.jar
-->
<properties>
<oci-java-sdk-version>2.12.1-SNAPSHOT</oci-java-sdk-version>
<spark-scope>provided</spark-scope>
<spark-download>${project.build.directory}/spark-3.2.1-bin-hadoop3.2/jars</spark-download>
</properties>
<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.12</artifactId>
<version>3.2.1</version>
<scope>system</scope>
<systemPath>${basedir}/spark-3.2.1-bin-hadoop3.2.tgz</systemPath>
<classifier>bin</classifier>
<type>tgz</type>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-log4j12</artifactId>
<version>1.7.30</version>
<scope>${spark-scope}</scope>
</dependency>
<dependency>
<groupId>com.oracle.oci.sdk</groupId>
<artifactId>oci-hdfs-connector</artifactId>
<version>3.3.1.0.3.2</version>
<scope>${spark-scope}</scope>
<exclusions>
<exclusion>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
</exclusion>
<exclusion>
<groupId>com.fasterxml.woodstox</groupId>
<artifactId>woodstox-core</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
</exclusion>
<exclusion>
<groupId>io.netty</groupId>
<artifactId>netty-all</artifactId>
</exclusion>
<!-- this is required for com.oracle.bmc.auth -->
<!-- <exclusion>
<groupId>com.oracle.oci.sdk</groupId>
<artifactId>oci-java-sdk-addons-apache</artifactId>
</exclusion>-->
<exclusion>
<groupId>io.netty</groupId>
<artifactId>netty-codec-http</artifactId>
</exclusion>
<exclusion>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-servlet</artifactId>
</exclusion>
<exclusion>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-security</artifactId>
</exclusion>
<exclusion>
<groupId>net.minidev</groupId>
<artifactId>json-smart</artifactId>
</exclusion>
<exclusion>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-log4j12</artifactId>
</exclusion>
<exclusion>
<groupId>org.slf4j</groupId>
<artifactId>jul-to-slf4j</artifactId>
</exclusion>
<exclusion>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
</exclusion>
<exclusion>
<groupId>log4j</groupId>
<artifactId>log4j</artifactId>
</exclusion>
<exclusion>
<groupId>com.oracle.oci.sdk</groupId>
<artifactId>oci-java-sdk-common</artifactId>
</exclusion>
<exclusion>
<groupId>org.bouncycastle</groupId>
<artifactId>bcprov-jdk15on</artifactId>
</exclusion>
<exclusion>
<groupId>org.bouncycastle</groupId>
<artifactId>bcpkix-jdk15on</artifactId>
</exclusion>
<exclusion>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-log4j12</artifactId>
</exclusion>
<exclusion>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-util</artifactId>
</exclusion>
<exclusion>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-util-ajax</artifactId>
</exclusion>
</exclusions>
</dependency>
<!-- Data Flow Spark runtime dependency upgrades that are not transitive dependencies of Spark in Maven -->
<dependency>
<groupId>org.apache.hadoop.thirdparty</groupId>
<artifactId>hadoop-shaded-guava</artifactId>
<version>1.1.1</version>
<scope>${spark-scope}</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-yarn-server-web-proxy</artifactId>
<version>3.3.1</version>
<scope>${spark-scope}</scope>
<exclusions>
<exclusion>
<groupId>javax.servlet</groupId>
<artifactId>javax.servlet-api</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-yarn-server-common</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-yarn-common</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-yarn-api</artifactId>
</exclusion>
<exclusion>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-server</artifactId>
</exclusion>
<exclusion>
<groupId>org.bouncycastle</groupId>
<artifactId>bcprov-jdk15on</artifactId>
</exclusion>
<exclusion>
<groupId>org.bouncycastle</groupId>
<artifactId>bcpkix-jdk15on</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>commons-collections</groupId>
<artifactId>commons-collections</artifactId>
<version>3.2.2</version>
<scope>${spark-scope}</scope>
</dependency>
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>27.0-jre</version>
<scope>${spark-scope}</scope>
<exclusions>
<exclusion>
<groupId>*</groupId>
<artifactId>*</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpcore</artifactId>
<version>4.4.14</version>
<scope>${spark-scope}</scope>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-annotations</artifactId>
<version>2.13.1</version>
<scope>${spark-scope}</scope>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
<version>2.13.1</version>
<scope>${spark-scope}</scope>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-core</artifactId>
<version>2.13.1</version>
<scope>${spark-scope}</scope>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.dataformat</groupId>
<artifactId>jackson-dataformat-yaml</artifactId>
<version>2.13.1</version>
<scope>${spark-scope}</scope>
<exclusions>
<exclusion>
<groupId>*</groupId>
<artifactId>*</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.module</groupId>
<artifactId>jackson-module-scala_2.12</artifactId>
<version>2.13.1</version>
<scope>${spark-scope}</scope>
<exclusions>
<exclusion>
<groupId>*</groupId>
<artifactId>*</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.yaml</groupId>
<artifactId>snakeyaml</artifactId>
<version>1.28</version>
<scope>${spark-scope}</scope>
</dependency>
<dependency>
<groupId>com.squareup.okhttp3</groupId>
<artifactId>okhttp</artifactId>
<version>4.9.3</version>
<scope>${spark-scope}</scope>
</dependency>
<dependency>
<groupId>com.google.code.gson</groupId>
<artifactId>gson</artifactId>
<version>2.9.0</version>
<scope>${spark-scope}</scope>
</dependency>
<dependency>
<groupId>commons-dbcp</groupId>
<artifactId>commons-dbcp</artifactId>
<version>1.2</version>
<scope>${spark-scope}</scope>
<exclusions>
<exclusion>
<groupId>commons-pool</groupId>
<artifactId>commons-pool</artifactId>
</exclusion>
</exclusions>
</dependency>
<!-- Spark transitive dependencies exclude/upgrade section -->
<dependency>
<groupId>org.apache.curator</groupId>
<artifactId>curator-framework</artifactId>
<version>2.13.0</version>
<scope>${spark-scope}</scope>
<exclusions>
<exclusion>
<groupId>*</groupId>
<artifactId>*</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.curator</groupId>
<artifactId>curator-recipes</artifactId>
<version>2.13.0</version>
<scope>${spark-scope}</scope>
</dependency>
<dependency>
<groupId>com.squareup.okio</groupId>
<artifactId>okio</artifactId>
<version>2.8.0</version>
<scope>${spark-scope}</scope>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-compress</artifactId>
<version>1.21</version>
<scope>${spark-scope}</scope>
</dependency>
<dependency>
<groupId>org.objenesis</groupId>
<artifactId>objenesis</artifactId>
<version>2.6</version>
<scope>${spark-scope}</scope>
</dependency>
<dependency>
<groupId>jline</groupId>
<artifactId>jline</artifactId>
<version>2.14.6</version>
<scope>${spark-scope}</scope>
</dependency>
<!-- Spark 3.2.1 -->
<!-- ********************************************************************** -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.12</artifactId>
<version>3.2.1</version>
<scope>${spark-scope}</scope>
<exclusions>
<exclusion>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
</exclusion>
<exclusion>
<groupId>org.objenesis</groupId>
<artifactId>objenesis</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.curator</groupId>
<artifactId>curator-framework</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.curator</groupId>
<artifactId>curator-recipes</artifactId>
</exclusion>
<exclusion>
<groupId>com.squareup.okio</groupId>
<artifactId>okio</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.commons</groupId>
<artifactId>commons-lang2</artifactId>
</exclusion>
<exclusion>
<groupId>org.yaml</groupId>
<artifactId>snakeyaml</artifactId>
</exclusion>
<exclusion>
<groupId>com.squareup.okhttp3</groupId>
<artifactId>okhttp</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.commons</groupId>
<artifactId>commons-compress</artifactId>
</exclusion>
<exclusion>
<groupId>org.spark-project.spark</groupId>
<artifactId>unused</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>jakarta.validation</groupId>
<artifactId>jakarta.validation-api</artifactId>
<version>2.0.2</version>
<scope>${spark-scope}</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.12</artifactId>
<version>3.2.1</version>
<scope>${spark-scope}</scope>
<exclusions>
<exclusion>
<groupId>org.spark-project.spark</groupId>
<artifactId>unused</artifactId>
</exclusion>
<exclusion>
<groupId> org.apache.yetus</groupId>
<artifactId>audience-annotations</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-hive_2.12</artifactId>
<version>3.2.1</version>
<scope>${spark-scope}</scope>
<exclusions>
<exclusion>
<groupId>org.apache.hive</groupId>
<artifactId>hive-llap-client</artifactId>
</exclusion>
<exclusion>
<groupId>org.spark-project.spark</groupId>
<artifactId>unused</artifactId>
</exclusion>
<exclusion>
<groupId>jline</groupId>
<artifactId>jline</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpcore</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.hive.shims</groupId>
<artifactId>hive-shims-0.23</artifactId>
</exclusion>
</exclusions>
</dependency>
<!-- ********************************************************************** -->
<!-- README
This section is not required. These dependencies are here for runtime completeness
when testing within an IDE, to exactly match all of the Data Flow classpath libraries.
-->
<dependency>
<groupId>org.apache.curator</groupId>
<artifactId>curator-client</artifactId>
<version>2.13.0</version>
<scope>${spark-scope}</scope>
<exclusions>
<exclusion>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-yarn_2.12</artifactId>
<version>3.2.1</version>
<scope>${spark-scope}</scope>
<exclusions>
<exclusion>
<groupId>org.spark-project.spark</groupId>
<artifactId>unused</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.12</artifactId>
<version>3.2.1</version>
<scope>${spark-scope}</scope>
<exclusions>
<exclusion>
<groupId>org.spark-project.spark</groupId>
<artifactId>unused</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-graphx_2.12</artifactId>
<version>3.2.1</version>
<scope>${spark-scope}</scope>
<exclusions>
<exclusion>
<groupId>org.spark-project.spark</groupId>
<artifactId>unused</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-mllib-local_2.12</artifactId>
<version>3.2.1</version>
<scope>${spark-scope}</scope>
<exclusions>
<exclusion>
<groupId>org.spark-project.spark</groupId>
<artifactId>unused</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-tags_2.12</artifactId>
<version>3.2.1</version>
<scope>${spark-scope}</scope>
<exclusions>
<exclusion>
<groupId>org.spark-project.spark</groupId>
<artifactId>unused</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-hive-thriftserver_2.12</artifactId>
<version>3.2.1</version>
<scope>${spark-scope}</scope>
<exclusions>
<exclusion>
<groupId>org.spark-project.spark</groupId>
<artifactId>unused</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpcore</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-avro_2.12</artifactId>
<version>3.2.1</version>
<scope>${spark-scope}</scope>
<exclusions>
<exclusion>
<groupId>org.spark-project.spark</groupId>
<artifactId>unused</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-kubernetes_2.12</artifactId>
<version>3.2.1</version>
<scope>${spark-scope}</scope>
<exclusions>
<exclusion>
<groupId>org.spark-project.spark</groupId>
<artifactId>unused</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-repl_2.12</artifactId>
<version>3.2.1</version>
<scope>${spark-scope}</scope>
<exclusions>
<exclusion>
<groupId>org.spark-project.spark</groupId>
<artifactId>unused</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>javax.xml.bind</groupId>
<artifactId>jaxb-api</artifactId>
<version>2.2.11</version>
<scope>${spark-scope}</scope>
</dependency>
<dependency>
<groupId>log4j</groupId>
<artifactId>log4j</artifactId>
<version>1.2.17-16</version>
<scope>${spark-scope}</scope>
</dependency>
<dependency>
<groupId>org.jetbrains.kotlin</groupId>
<artifactId>kotlin-stdlib</artifactId>
<version>1.4.10</version>
<scope>${spark-scope}</scope>
<exclusions>
<exclusion>
<groupId>org.jetbrains</groupId>
<artifactId>annotations</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.jetbrains.kotlin</groupId>
<artifactId>kotlin-stdlib-common</artifactId>
<version>1.4.0</version>
<scope>${spark-scope}</scope>
</dependency>
<dependency>
<groupId>xerces</groupId>
<artifactId>xercesImpl</artifactId>
<version>2.12.2</version>
<scope>${spark-scope}</scope>
</dependency>
<dependency>
<groupId>xml-apis</groupId>
<artifactId>xml-apis</artifactId>
<version>1.4.01</version>
<scope>${spark-scope}</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-avro_2.12</artifactId>
<version>3.2.1</version>
<scope>${spark-scope}</scope>
<exclusions>
<exclusion>
<groupId>org.spark-project.spark</groupId>
<artifactId>unused</artifactId>
</exclusion>
</exclusions>
</dependency>
<!-- ********************************************************************** -->
<!-- ############################################## -->
<!-- README: Application dependencies should be added below -->
<!-- ############################################## -->
<!-- THESE ARE NOT REQUIRED, JUST AN EXAMPLE, UNCOMMENT AS NEEDED -->
<dependency>
<groupId>com.oracle.oci.sdk</groupId>
<artifactId>oci-java-sdk-core</artifactId>
<version>${oci-java-sdk-version}</version>
</dependency>
<dependency>
<groupId>com.oracle.oci.sdk</groupId>
<artifactId>oci-java-sdk-common</artifactId>
<version>${oci-java-sdk-version}</version>
<exclusions>
<exclusion>
<groupId>com.fasterxml.jackson.datatype</groupId>
<artifactId>jackson-datatype-jsr310</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>com.oracle.oci.sdk</groupId>
<artifactId>oci-java-sdk-circuitbreaker</artifactId>
<version>${oci-java-sdk-version}</version>
</dependency>
<dependency>
<groupId>com.oracle.oci.sdk</groupId>
<artifactId>oci-java-sdk-objectstorage</artifactId>
<version>${oci-java-sdk-version}</version>
</dependency>
<dependency>
<groupId>com.oracle.oci.sdk</groupId>
<artifactId>oci-java-sdk-objectstorage-generated</artifactId>
<version>${oci-java-sdk-version}</version>
</dependency>
<dependency>
<groupId>com.oracle.oci.sdk</groupId>
<artifactId>oci-java-sdk-objectstorage-extensions</artifactId>
<version>${oci-java-sdk-version}</version>
</dependency>
<dependency>
<groupId>com.oracle.oci.sdk</groupId>
<artifactId>oci-java-sdk-secrets</artifactId>
<version>${oci-java-sdk-version}</version>
</dependency>
<dependency>
<groupId>com.oracle.oci.sdk</groupId>
<artifactId>oci-java-sdk-vault</artifactId>
<version>${oci-java-sdk-version}</version>
</dependency>
<dependency>
<groupId>com.oracle.database.jdbc</groupId>
<artifactId>ojdbc8</artifactId>
<version>18.3.0.0</version>
</dependency>
<!-- README Add archive.zip jar dependencies here with <scope>${spark-scope}</scope> or <scope>system</scope>-->
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.8.0</version>
<configuration>
<source>1.8</source>
<target>1.8</target>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<version>2.22.0</version>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>3.1.1</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
</execution>
</executions>
<configuration>
<filters>
<filter>
<artifact>*:*</artifact>
<excludes>
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
<exclude>META-INF/versions/11/org/roaringbitmap/ArraysShim.class</exclude>
<exclude>META-INF/versions/11/org/glassfish/jersey/internal/jsr166/**</exclude>
</excludes>
</filter>
</filters>
<artifactSet>
<excludes>
<exclude>org.bouncycastle:bcpkix-jdk15on</exclude>
<exclude>org.bouncycastle:bcprov-jdk15on</exclude>
<exclude>com.google.code.findbugs:jsr305</exclude>
</excludes>
</artifactSet>
</configuration>
</plugin>
<!-- Unpack Spark compressed tgz -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-dependency-plugin</artifactId>
<version>3.1.1</version>
<executions>
<execution>
<phase>generate-resources</phase>
<goals>
<goal>unpack-dependencies</goal>
</goals>
<configuration>
<overWriteIfNewer>true</overWriteIfNewer>
<includeTypes>tgz</includeTypes>
<includeArtifactIds>spark-core_2.12</includeArtifactIds>
<outputDirectory>${project.build.directory}</outputDirectory>
<includes>**/**</includes>
<excludes>
**/jars/guava-14.0.1.jar,
**/jars/jackson-annotations-2.12.3.jar,
**/jars/jackson-core-2.12.3.jar,
**/jars/jackson-databind-2.12.3.jar,
**/jars/jackson-dataformat-yaml-2.12.3.jar,
**/jars/jackson-module-scala_2.12-2.12.3.jar,
**/jars/mesos-1.4.0-shaded-protobuf.jar,
**/jars/snakeyaml-1.27.jar,
**/jars/spark-mesos_2.12-3.2.1.jar,
**/jars/log4j-1.2.17.jar,
**/jars/okhttp-3.12.12.jar,
**/jars/okio-1.14.0.jar,
**/jars/gson-2.2.4.jar,
**/jars/commons-dbcp-1.4.jar,
**/examples/jars/*
</excludes>
</configuration>
</execution>
</executions>
</plugin>
<!--
We need to move the correct versions of the jars that we excluded above into the Spark
jars directory so that spark-submit from the command line will run correctly.
Note: some jars like mesos-1.4.0 are removed because Data Flow does not use Mesosphere.
-->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-dependency-plugin</artifactId>
<version>3.1.1</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>copy</goal>
</goals>
<configuration>
<artifactItems>
<artifactItem>
<groupId>io.github.resilience4j</groupId>
<artifactId>resilience4j-circuitbreaker</artifactId>
<version>1.2.0</version>
<type>jar</type>
<outputDirectory>${spark-download}</outputDirectory>
</artifactItem>
<artifactItem>
<groupId>io.github.resilience4j</groupId>
<artifactId>resilience4j-core</artifactId>
<version>1.2.0</version>
<type>jar</type>
<outputDirectory>${spark-download}</outputDirectory>
</artifactItem>
<artifactItem>
<groupId>org.glassfish.jersey.media</groupId>
<artifactId>jersey-media-json-jackson</artifactId>
<version>2.27</version>
<type>jar</type>
<outputDirectory>${spark-download}</outputDirectory>
</artifactItem>
<artifactItem>
<groupId>com.oracle.oci.sdk</groupId>
<artifactId>oci-java-sdk-core</artifactId>
<version>${oci-java-sdk-version}</version>
<type>jar</type>
<outputDirectory>${spark-download}</outputDirectory>
</artifactItem>
<artifactItem>
<groupId>com.oracle.oci.sdk</groupId>
<artifactId>oci-java-sdk-common</artifactId>
<version>${oci-java-sdk-version}</version>
<type>jar</type>
<outputDirectory>${spark-download}</outputDirectory>
</artifactItem>
<artifactItem>
<groupId>com.oracle.oci.sdk</groupId>
<artifactId>oci-java-sdk-circuitbreaker</artifactId>
<version>${oci-java-sdk-version}</version>
<type>jar</type>
<outputDirectory>${spark-download}</outputDirectory>
</artifactItem>
<artifactItem>
<groupId>com.oracle.oci.sdk</groupId>
<artifactId>oci-java-sdk-objectstorage</artifactId>
<version>${oci-java-sdk-version}</version>
<type>jar</type>
<outputDirectory>${spark-download}</outputDirectory>
</artifactItem>
<artifactItem>
<groupId>com.oracle.oci.sdk</groupId>
<artifactId>oci-java-sdk-objectstorage-generated</artifactId>
<version>${oci-java-sdk-version}</version>
<type>jar</type>
<outputDirectory>${spark-download}</outputDirectory>
</artifactItem>
<artifactItem>
<groupId>com.oracle.oci.sdk</groupId>
<artifactId>oci-java-sdk-objectstorage-extensions</artifactId>
<version>${oci-java-sdk-version}</version>
<type>jar</type>
<outputDirectory>${spark-download}</outputDirectory>
</artifactItem>
<artifactItem>
<groupId>com.oracle.oci.sdk</groupId>
<artifactId>oci-java-sdk-secrets</artifactId>
<version>${oci-java-sdk-version}</version>
<type>jar</type>
<outputDirectory>${spark-download}</outputDirectory>
</artifactItem>-->
<artifactItem>
<groupId>com.oracle.oci.sdk</groupId>
<artifactId>oci-hdfs-connector</artifactId>
<version>3.3.1.0.3.2</version>
<type>jar</type>
<outputDirectory>${spark-download}</outputDirectory>
</artifactItem>
<artifactItem>
<groupId>org.bouncycastle</groupId>
<artifactId>bcpkix-jdk15on</artifactId>
<version>1.60</version>
<type>jar</type>
<outputDirectory>${spark-download}</outputDirectory>
</artifactItem>
<artifactItem>
<groupId>org.bouncycastle</groupId>
<artifactId>bcprov-jdk15on</artifactId>
<version>1.60</version>
<type>jar</type>
<outputDirectory>${spark-download}</outputDirectory>
</artifactItem>
<artifactItem>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>27.0-jre</version>
<type>jar</type>
<outputDirectory>${spark-download}</outputDirectory>
</artifactItem>
<artifactItem>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-annotations</artifactId>
<version>2.13.1</version>
<type>jar</type>
<outputDirectory>${spark-download}</outputDirectory>
</artifactItem>
<artifactItem>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
<version>2.13.1</version>
<type>jar</type>
<outputDirectory>${spark-download}</outputDirectory>
</artifactItem>
<artifactItem>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-core</artifactId>
<version>2.13.1</version>
<type>jar</type>
<outputDirectory>${spark-download}</outputDirectory>
</artifactItem>
<artifactItem>
<groupId>com.fasterxml.jackson.dataformat</groupId>
<artifactId>jackson-dataformat-yaml</artifactId>
<version>2.13.1</version>
<type>jar</type>
<outputDirectory>${spark-download}</outputDirectory>
</artifactItem>
<artifactItem>
<groupId>com.fasterxml.jackson.module</groupId>
<artifactId>jackson-module-scala_2.12</artifactId>
<version>2.13.1</version>
<type>jar</type>
<outputDirectory>${spark-download}</outputDirectory>
</artifactItem>
<artifactItem>
<groupId>org.yaml</groupId>
<artifactId>snakeyaml</artifactId>
<version>1.28</version>
<type>jar</type>
<outputDirectory>${spark-download}</outputDirectory>
</artifactItem>
<artifactItem>
<groupId>com.squareup.okhttp3</groupId>
<artifactId>okhttp</artifactId>
<version>4.9.3</version>
<type>jar</type>
<outputDirectory>${spark-download}</outputDirectory>
</artifactItem>
<artifactItem>
<groupId>com.squareup.okio</groupId>
<artifactId>okio</artifactId>
<version>2.8.0</version>
<type>jar</type>
<outputDirectory>${spark-download}</outputDirectory>
</artifactItem>
<artifactItem>
<groupId>log4j</groupId>
<artifactId>log4j</artifactId>
<version>1.2.17-16</version>
<type>jar</type>
<outputDirectory>${spark-download}</outputDirectory>
</artifactItem>
<artifactItem>
<groupId>org.glassfish.jersey.connectors</groupId>
<artifactId>jersey-apache-connector</artifactId>
<version>2.34</version>
<type>jar</type>
<outputDirectory>${spark-download}</outputDirectory>
</artifactItem>
<artifactItem>
<groupId>io.vavr</groupId>
<artifactId>vavr</artifactId>
<version>0.10.0</version>
<type>jar</type>
<outputDirectory>${spark-download}</outputDirectory>
</artifactItem>
<artifactItem>
<groupId>org.jetbrains.kotlin</groupId>
<artifactId>kotlin-stdlib</artifactId>
<version>1.4.10</version>
<type>jar</type>
<outputDirectory>${spark-download}</outputDirectory>
</artifactItem>
<artifactItem>
<groupId>org.jetbrains.kotlin</groupId>
<artifactId>kotlin-stdlib-common</artifactId>
<version>1.4.0</version>
<type>jar</type>
<outputDirectory>${spark-download}</outputDirectory>
</artifactItem>
<artifactItem>
<groupId>xerces</groupId>
<artifactId>xercesImpl</artifactId>
<version>2.12.2</version>
<type>jar</type>
<outputDirectory>${spark-download}</outputDirectory>
</artifactItem>
<artifactItem>
<groupId>xml-apis</groupId>
<artifactId>xml-apis</artifactId>
<version>1.4.01</version>
<type>jar</type>
<outputDirectory>${spark-download}</outputDirectory>
</artifactItem>
<artifactItem>
<groupId>com.google.code.gson</groupId>
<artifactId>gson</artifactId>
<version>2.9.0</version>
<type>jar</type>
<outputDirectory>${spark-download}</outputDirectory>
</artifactItem>
<artifactItem>
<groupId>org.apache.spark</groupId>
<artifactId>spark-avro_2.12</artifactId>
<version>3.2.1</version>
<type>jar</type>
<outputDirectory>${spark-download}</outputDirectory>
</artifactItem>
<artifactItem>
<groupId>commons-dbcp</groupId>
<artifactId>commons-dbcp</artifactId>
<version>1.2</version>
<type>jar</type>
<outputDirectory>${spark-download}</outputDirectory>
</artifactItem>
<artifactItem>
<groupId>org.example</groupId>
<artifactId>Test</artifactId>
<version>2.0-SNAPSHOT</version>
<type>jar</type>
<outputDirectory>${spark-download}</outputDirectory>
</artifactItem>
</artifactItems>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>
<profiles>
<profile>
<id>dev</id>
<activation>
<activeByDefault>false</activeByDefault>
</activation>
<properties>
<spark-scope>compile</spark-scope>
</properties>
<build>
<plugins>
<plugin>
<artifactId>maven-antrun-plugin</artifactId>
<version>1.8</version>
<executions>
<execution>
<phase>validate</phase>
<configuration>
<tasks>
<taskdef resource="net/sf/antcontrib/antcontrib.properties" />
<if>
<available file="archive.zip"/>
<then>
<unzip src="archive.zip" dest="${project.build.directory}/archive" />
</then>
<else>
<echo>The archive.zip does not exist</echo>
</else>
</if>
</tasks>
</configuration>
<goals>
<goal>run</goal>
</goals>
</execution>
</executions>
<dependencies>
<dependency>
<groupId>ant-contrib</groupId>
<artifactId>ant-contrib</artifactId>
<version>20020829</version>
</dependency>
</dependencies>
</plugin>
</plugins>
</build>
</profile>
</profiles>
</project>
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.example</groupId>
<artifactId>project</artifactId>
<version>1.0-SNAPSHOT</version>
<repositories>
<repository>
<id>project.local</id>
<name>project</name>
<url>file:${project.basedir}/repo</url>
</repository>
</repositories>
<properties>
<oci-java-sdk-version>1.25.2</oci-java-sdk-version>
</properties>
<dependencies>
<dependency>
<groupId>com.oracle.oci.sdk</groupId>
<artifactId>oci-hdfs-connector</artifactId>
<version>3.2.1.3</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>com.oracle.oci.sdk</groupId>
<artifactId>oci-java-sdk-core</artifactId>
<version>${oci-java-sdk-version}</version>
</dependency>
<dependency>
<groupId>com.oracle.oci.sdk</groupId>
<artifactId>oci-java-sdk-objectstorage</artifactId>
<version>${oci-java-sdk-version}</version>
</dependency>
<dependency>
<groupId>com.oracle.oci.sdk</groupId>
<artifactId>oci-java-sdk-secrets</artifactId>
<version>${oci-java-sdk-version}</version>
</dependency>
<!-- Spark 3.0.2 -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.12</artifactId>
<version>3.0.2</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.12</artifactId>
<version>3.0.2</version>
<scope>provided</scope>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.8.0</version>
<configuration>
<source>1.8</source>
<target>1.8</target>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<version>2.22.0</version>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>3.1.1</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
</execution>
</executions>
<configuration>
<filters>
<filter>
<artifact>*:*</artifact>
<excludes>
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
</excludes>
</filter>
</filters>
<relocations>
<relocation>
<pattern>org.apache.http</pattern>
<shadedPattern>shaded.oracle.org.apache.http</shadedPattern>
</relocation>
<relocation>
<pattern>org.apache.commons</pattern>
<shadedPattern>shaded.oracle.org.apache.commons</shadedPattern>
</relocation>
<relocation>
<pattern>com.fasterxml</pattern>
<shadedPattern>shaded.oracle.com.fasterxml</shadedPattern>
</relocation>
<relocation>
<pattern>com.google</pattern>
<shadedPattern>shaded.oracle.com.google</shadedPattern>
</relocation>
<relocation>
<pattern>javax.ws.rs</pattern>
<shadedPattern>shaded.oracle.javax.ws.rs</shadedPattern>
</relocation>
<relocation>
<pattern>org.glassfish</pattern>
<shadedPattern>shaded.oracle.org.glassfish</shadedPattern>
</relocation>
<relocation>
<pattern>org.jvnet</pattern>
<shadedPattern>shaded.oracle.org.jvnet</shadedPattern>
</relocation>
<relocation>
<pattern>javax.annotation</pattern>
<shadedPattern>shaded.oracle.javax.annotation</shadedPattern>
</relocation>
<relocation>
<pattern>javax.validation</pattern>
<shadedPattern>shaded.oracle.javax.validation</shadedPattern>
</relocation>
<relocation>
<pattern>com.oracle.bmc</pattern>
<shadedPattern>shaded.com.oracle.bmc</shadedPattern>
<includes>
<include>com.oracle.bmc.**</include>
</includes>
<excludes>
<exclude>com.oracle.bmc.hdfs.**</exclude>
</excludes>
</relocation>
</relocations>
<artifactSet>
<excludes>
<exclude>org.bouncycastle:bcpkix-jdk15on</exclude>
<exclude>org.bouncycastle:bcprov-jdk15on</exclude>
<exclude>com.google.code.findbugs:jsr305</exclude>
</excludes>
</artifactSet>
</configuration>
</plugin>
</plugins>
</build>
</project>
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.example</groupId>
<artifactId>project</artifactId>
<version>1.0-SNAPSHOT</version>
<repositories>
<repository>
<id>project.local</id>
<name>project</name>
<url>file:${project.basedir}/repo</url>
</repository>
</repositories>
<properties>
<oci-java-sdk-version>1.15.4</oci-java-sdk-version>
</properties>
<dependencies>
<dependency>
<groupId>com.oracle.oci.sdk</groupId>
<artifactId>oci-hdfs-connector</artifactId>
<version>2.9.2.6</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>com.oracle.oci.sdk</groupId>
<artifactId>oci-java-sdk-core</artifactId>
<version>${oci-java-sdk-version}</version>
</dependency>
<dependency>
<groupId>com.oracle.oci.sdk</groupId>
<artifactId>oci-java-sdk-objectstorage</artifactId>
<version>${oci-java-sdk-version}</version>
</dependency>
<dependency>
<groupId>com.oracle.oci.sdk</groupId>
<artifactId>oci-java-sdk-secrets</artifactId>
<version>${oci-java-sdk-version}</version>
</dependency>
<!-- Spark 2.4.4 -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.12</artifactId>
<version>2.4.4</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.12</artifactId>
<version>2.4.4</version>
<scope>provided</scope>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.8.0</version>
<configuration>
<source>1.8</source>
<target>1.8</target>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<version>2.22.0</version>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>3.1.1</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
</execution>
</executions>
<configuration>
<filters>
<filter>
<artifact>*:*</artifact>
<excludes>
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
</excludes>
</filter>
</filters>
<relocations>
<relocation>
<pattern>org.apache.http</pattern>
<shadedPattern>shaded.oracle.org.apache.http</shadedPattern>
</relocation>
<relocation>
<pattern>org.apache.commons</pattern>
<shadedPattern>shaded.oracle.org.apache.commons</shadedPattern>
</relocation>
<relocation>
<pattern>com.fasterxml</pattern>
<shadedPattern>shaded.oracle.com.fasterxml</shadedPattern>
</relocation>
<relocation>
<pattern>com.google</pattern>
<shadedPattern>shaded.oracle.com.google</shadedPattern>
</relocation>
<relocation>
<pattern>javax.ws.rs</pattern>
<shadedPattern>shaded.oracle.javax.ws.rs</shadedPattern>
</relocation>
<relocation>
<pattern>org.glassfish</pattern>
<shadedPattern>shaded.oracle.org.glassfish</shadedPattern>
</relocation>
<relocation>
<pattern>org.jvnet</pattern>
<shadedPattern>shaded.oracle.org.jvnet</shadedPattern>
</relocation>
<relocation>
<pattern>javax.annotation</pattern>
<shadedPattern>shaded.oracle.javax.annotation</shadedPattern>
</relocation>
<relocation>
<pattern>javax.validation</pattern>
<shadedPattern>shaded.oracle.javax.validation</shadedPattern>
</relocation>
<relocation>
<pattern>com.oracle.bmc</pattern>
<shadedPattern>shaded.com.oracle.bmc</shadedPattern>
<includes>
<include>com.oracle.bmc.**</include>
</includes>
<excludes>
<exclude>com.oracle.bmc.hdfs.**</exclude>
</excludes>
</relocation>
</relocations>
<artifactSet>
<excludes>
<exclude>org.bouncycastle:bcpkix-jdk15on</exclude>
<exclude>org.bouncycastle:bcprov-jdk15on</exclude>
<exclude>com.google.code.findbugs:jsr305</exclude>
</excludes>
</artifactSet>
</configuration>
</plugin>
</plugins>
</build>
</project>
To Shade your application using Scala Build Tool (sbt), you must add the
sbt-assembly
plugin to your project, and create an
assemblyMergeStrategy
and an assemblyShadeRules
in
your sbt build file.
- The
plugins.sbt
file - In your project’s root directory, add this line to the
project/plugins.sbt
file. Create the file if it does not exist.addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.10")
- The
build.sbt
file Add the following items to your
build.sbt
file. While this configuration addresses the most common conflicts, you might need to add more items to theassemblyShadeRules
section if you experience run-time failures.For Spark 2.4.4:libraryDependencies ++= Seq( "org.apache.spark" %% "spark-core" % "2.4.4" % "provided", "org.apache.spark" %% "spark-sql" % "2.4.4" % "provided", "com.oracle.oci.sdk" % "oci-hdfs-connector" % "2.9.2.1" % "provided", "com.oracle.oci.sdk" % "oci-java-sdk-core" % "1.15.4" % "provided", "com.oracle.oci.sdk" % "oci-java-sdk-objectstorage" % "1.15.4" % "provided", "com.oracle.oci.sdk" % "oci-java-sdk-secrets" % "1.15.4", ) assemblyMergeStrategy in assembly := { case PathList("javax", "inject", xs @ _*) => MergeStrategy.last case "module-info.class" => MergeStrategy.last case x => val oldStrategy = (assemblyMergeStrategy in assembly).value oldStrategy(x) } assemblyShadeRules in assembly := Seq( ShadeRule.rename("org.apache.http.**" -> "shaded.oracle.org.apache.http.@1").inAll, ShadeRule.rename("org.apache.commons.**" -> "shaded.oracle.org.apache.commons.@1").inAll, ShadeRule.rename("com.fasterxml.**" -> "shaded.oracle.com.fasterxml.@1").inAll, ShadeRule.rename("com.google.**" -> "shaded.oracle.com.google.@1").inAll, ShadeRule.rename("javax.ws.rs.**" -> "shaded.oracle.javax.ws.rs.@1").inAll, ShadeRule.rename("org.glassfish.**" -> "shaded.oracle.org.glassfish.@1").inAll, ShadeRule.rename("org.jvnet.**" -> "shaded.oracle.org.jvnet.@1").inAll, ShadeRule.rename("javax.annotation.**" -> "shaded.oracle.javax.annotation.@1").inAll, ShadeRule.rename("javax.validation.**" -> "shaded.oracle.javax.validation.@1").inAll, ShadeRule.rename("com.oracle.bmc.hdfs.**" -> "com.oracle.bmc.hdfs.@1").inAll, ShadeRule.rename("com.oracle.bmc.**" -> "shaded.com.oracle.bmc.@1").inAll, ShadeRule.zap("org.bouncycastle").inAll, )
For Spark 3.0.2:For more information on Shading with sbt, see, the sbt-assembly plugin.libraryDependencies ++= Seq( "org.apache.spark" %% "spark-core" % "3.0.2" % "provided", "org.apache.spark" %% "spark-sql" % "3.0.2" % "provided", "com.oracle.oci.sdk" % "oci-hdfs-connector" % "3.2.1.3" % "provided", "com.oracle.oci.sdk" % "oci-java-sdk-core" % "1.25.2" % "provided", "com.oracle.oci.sdk" % "oci-java-sdk-objectstorage" % "1.25.2" % "provided", "com.oracle.oci.sdk" % "oci-java-sdk-secrets" % "1.25.2", ) assemblyMergeStrategy in assembly := { case PathList("javax", "inject", xs @ _*) => MergeStrategy.last case "module-info.class" => MergeStrategy.last case x => val oldStrategy = (assemblyMergeStrategy in assembly).value oldStrategy(x) } assemblyShadeRules in assembly := Seq( ShadeRule.rename("org.apache.http.**" -> "shaded.oracle.org.apache.http.@1").inAll, ShadeRule.rename("org.apache.commons.**" -> "shaded.oracle.org.apache.commons.@1").inAll, ShadeRule.rename("com.fasterxml.**" -> "shaded.oracle.com.fasterxml.@1").inAll, ShadeRule.rename("com.google.**" -> "shaded.oracle.com.google.@1").inAll, ShadeRule.rename("javax.ws.rs.**" -> "shaded.oracle.javax.ws.rs.@1").inAll, ShadeRule.rename("org.glassfish.**" -> "shaded.oracle.org.glassfish.@1").inAll, ShadeRule.rename("org.jvnet.**" -> "shaded.oracle.org.jvnet.@1").inAll, ShadeRule.rename("javax.annotation.**" -> "shaded.oracle.javax.annotation.@1").inAll, ShadeRule.rename("javax.validation.**" -> "shaded.oracle.javax.validation.@1").inAll, ShadeRule.rename("com.oracle.bmc.hdfs.**" -> "com.oracle.bmc.hdfs.@1").inAll, ShadeRule.rename("com.oracle.bmc.**" -> "shaded.com.oracle.bmc.@1").inAll, ShadeRule.zap("org.bouncycastle").inAll, )
What's Next
Now you can start migrating your Spark applications to run in Data Flow.