Publishing to OCI Data Flow

Only integration tasks and data loader tasks can be published from the Data Integration service to the Oracle Cloud Infrastructure (OCI) Data Flow service.

When you publish a task to OCI Data Flow, a JAR file is created in OCI Object Storage, and an application that points to the JAR file is created in the Data Flow service.

After publishing, you can run the application in OCI Data Flow, where you can choose compute shapes and monitor and diagnose data flow runs. If the task has assigned parameters, the OCI Data Flow application is created using the default parameter values, and you cannot enter different parameter values when you run the application in OCI Data Flow.

Required Setup and Policies

Before you publish a task to the OCI Data Flow service, ensure that you have the following:

  • An Object Storage data asset to publish the executables to

  • A bucket in Object Storage for the JAR

  • The relevant permissions and IAM policies to access Object Storage, as described in Policy Examples to Enable Access to OCI Object Storage.

  • The relevant permissions and IAM policies to create and run applications in OCI Data Flow:

    allow any-user to manage dataflow-application in compartment <compartment-name> where ALL {request.principal.type = 'disworkspace', request.principal.id = '<workspace-ocid>'}
    allow any-user to manage dataflow-run in compartment <compartment-name> where ALL {request.principal.type = 'disworkspace', request.principal.id = '<workspace-ocid>'}
    allow group <group-name> to read dataflow-application in compartment <compartment-name>
    allow group <group-name> to manage dataflow-run in compartment <compartment-name>
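
The example statements above differ only in the compartment name, workspace OCID, and group name. As a minimal sketch, the following Python helper fills in those placeholders. The function name render_publish_policies and the sample values in the usage note are hypothetical, and the helper only formats text; it does not create policies in IAM.

```python
# Hypothetical helper: fills the placeholders in the example policy statements.
# It only builds strings; review them before adding them to an IAM policy.
WORKSPACE_CONDITION = (
    "where ALL {{request.principal.type = 'disworkspace', "
    "request.principal.id = '{workspace_ocid}'}}"
)


def render_publish_policies(compartment, workspace_ocid, group):
    """Return the four example statements for publishing to OCI Data Flow."""
    condition = WORKSPACE_CONDITION.format(workspace_ocid=workspace_ocid)
    return [
        # Statements that let the Data Integration workspace act in Data Flow.
        f"allow any-user to manage dataflow-application in compartment {compartment} {condition}",
        f"allow any-user to manage dataflow-run in compartment {compartment} {condition}",
        # Statements for the user group that publishes and runs the task.
        f"allow group {group} to read dataflow-application in compartment {compartment}",
        f"allow group {group} to manage dataflow-run in compartment {compartment}",
    ]
```

For example, render_publish_policies("my-compartment", "<workspace-ocid>", "my-group") returns the four statements with the compartment and group names filled in and the workspace OCID placeholder left for you to replace.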

Private Endpoint

By default, the OCI Data Flow application is created with public internet access. You can choose to publish using a private endpoint in OCI Data Flow. For example, if your tasks use data sources that are hosted in private networks, you can publish to OCI Data Flow using a private endpoint.

To publish to OCI Data Flow using a private endpoint, ensure that you also have the following:

  • An existing private endpoint in OCI Data Flow for the application to use. See Creating a Private Endpoint.

    For required policies to use OCI Data Flow with private endpoints, see Private Endpoint Policies.

  • The policy required to publish from OCI Data Integration to OCI Data Flow applications that are enabled by private endpoints:

    allow any-user to read dataflow-private-endpoint in compartment <compartment-name> where ALL {request.principal.type = 'disworkspace', request.principal.id = '<workspace-ocid>'}

  • The policy that enables a group of users who are not administrators to list existing private endpoints at the compartment level (when publishing from OCI Data Integration to OCI Data Flow):

    allow group <group-name> to inspect dataflow-private-endpoint in compartment <compartment-name>

  • The data assets used in the task you're publishing to OCI Data Flow must be:

    • Configured to use OCI Vault secrets that contain the passwords to connect to the data sources. This is required for passing credentials securely across OCI services. See OCI Vault Secrets and Oracle Wallets.

      The policy required to use secrets in OCI Vault:

      allow any-user to read secret-bundles in compartment <compartment-name> where ALL {request.principal.type = 'disworkspace', request.principal.id = '<workspace-ocid>'}

      The following policy enables a group of users who are not administrators to use secrets with Oracle Autonomous Data Warehouse and Oracle Autonomous Transaction Processing:

      allow group <group-name> to read secret-bundles in compartment <compartment-name>

    • Specified using the fully qualified domain name (FQDN) for the database hosts. OCI Data Flow does not allow connections through direct IP addresses.
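
Before publishing with a private endpoint, it can help to check that no data asset host is entered as an IP address. The following Python sketch flags hosts that are literal IPv4 or IPv6 addresses, using only the standard library. The function names and the host values in the usage note are hypothetical, and the check detects IP literals only; it does not confirm that a name is fully qualified or resolvable.

```python
import ipaddress


def is_ip_literal(host):
    """Return True if host is a literal IPv4 or IPv6 address."""
    try:
        # Strip the brackets sometimes written around IPv6 addresses.
        ipaddress.ip_address(host.strip("[]"))
        return True
    except ValueError:
        # Not parseable as an IP address, so treat it as a host name.
        return False


def hosts_needing_fqdn(hosts):
    """Return the hosts that are IP addresses and need an FQDN instead."""
    return [host for host in hosts if is_ip_literal(host)]
```

For example, hosts_needing_fqdn(["db1.example.com", "10.0.0.12"]) returns only "10.0.0.12", which would need to be replaced with the FQDN of the database host.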

Note

  • The policy statements provided in this topic are examples only. Ensure that you write policies that meet your own requirements.

  • Cross-tenancy policies are required if your resources (such as Object Storage objects and buckets) and your Data Integration workspace are in different tenancies. See Policy Examples and the Policies blog to identify the right policies for your needs.

  • After you add IAM components (for example, dynamic groups and policy statements), don't try to perform the associated tasks immediately. New IAM policies require about 5 to 10 minutes to take effect.