Before you create a Data Integration workspace, review the prerequisites and list of tasks that you, the customer, are responsible for.
Customer Responsibility Checklist
You must have the following resources and minimum policies in the tenancy. If you don't have the proper rights, have the administrator create them for you.
Before You Begin
Before you start setting up the Data Integration service for use, you must have:
An Oracle Cloud Infrastructure account with administrator privileges
Access to the Data Integration service
List of Customer Tasks
This section summarizes the responsibilities of Data Integration customers before setting up and using Data Integration for the first time.
In Oracle Cloud Infrastructure Identity and Access Management (IAM) with Identity Domains, create the compartments, users, and groups of users.
You can set up virtual cloud networks (VCNs) and subnets in Oracle Cloud Infrastructure Networking for Data Integration. Only regional subnets are supported, and DNS hostnames must be used in the subnets. Depending on the location of the data sources that you're using, you might have to create other network objects such as service gateways, network security groups, and Network Address Translation (NAT) gateways.
In Oracle Cloud Infrastructure Identity and Access Management (IAM) with Identity Domains, create the required policies that give groups of users proper access to Data Integration resources.
Data Integration must also have permission to manage the virtual networks and subnets that you set up for integration.
Learn how control plane and data plane management tasks for Data Integration are shared between Oracle and you, the customer.
Generally speaking, the control plane is responsible for provisioning OCI resources and managing metadata operations to get, create, update, and delete Data Integration workspaces. The data plane is responsible for design time and runtime operations related to data assets, data flows, pipelines, tasks, and applications in Data Integration.
Task
Who
Description
Workspace resources provisioning
Oracle and Customer
Oracle is responsible for provisioning Oracle Cloud Infrastructure resources for Data Integration workspaces, including compute instances and their connectivity to a subnet (if provided) through a secondary VNIC.
You, the customer, are responsible for:
Setting up the infrastructure resources beforehand, such as creating a compartment and networking resources.
Creating the Data Integration workspaces that you need by specifying the appropriate configuration characteristics.
For the list of customer responsibilities to set up the Data Integration service before first use, see Customer Responsibility Checklist.
Backup and recovery of workspaces and applications
Oracle and Customer
Oracle continuously backs up content, solely for disaster recovery of Data Integration service resource metadata and the operation of the service. These backups include customer workspace backups, but they aren't made available to customers.
You, the customer, are responsible for making backups of the application data, by copying the applications to the same workspace, another workspace, or another compartment. This is especially important for cross-region disaster recovery.
Service patching and upgrading
Oracle
Oracle is responsible for patching and upgrading the Data Integration service and its agent components.
Scaling
Oracle
Oracle is responsible for scaling the control and data planes.
You, the customer, can request scaling the OCI resources in the data plane for agent computation.
Health monitoring
Oracle and Customer
Oracle is responsible for monitoring the health of workspace resources and for ensuring their availability.
You, the customer, are responsible for monitoring the health and performance of tasks and applications at all levels, including the availability of dependent resources that are referenced in the data plane during task runs.
Application security
Oracle and Customer
Oracle ensures that data stored in OCI is encrypted and ensures that connections to Data Integration require SSL encryption.
You, the customer, are responsible for the security of applications at all levels. This responsibility includes access to workspace resources, network access to those resources, and access to dependent data.
Auditing
Oracle and Customer
Oracle is responsible for logging REST API calls that are made to workspace resources and for making those logs available to you for auditing purposes.
You, the customer, are responsible for configuring access to audit logs in the audit log service, and using the logs to audit usage and monitor activity within the tenancy.
Alerts and notifications
Oracle and Customer
Oracle provides service events and notifications.
You, the customer, are responsible for configuring alerts and notifications for service events and for monitoring alerts that might be of interest.
Creating Resources
To create resources for Data Integration activities:
Create a compartment in the tenancy for Data Integration activities.
If the data sources are in a private network, create a VCN with at least one subnet in the compartment.
Note
The VCN and subnet you create here are the ones you select when you create a workspace. The subnet must be regional, spanning all availability domains.
If you don't see the subnet listed, go back and check that it was created as a regional subnet.
Create a group for users in charge of workspaces, and then add users to the group.
Take note of the group name. You create policies for the group in the next section. For more information, see Managing Groups.
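The subnet requirements above (regional, with DNS hostnames in use) can be checked before you try to select the subnet for a workspace. The following is a minimal sketch, not an official tool. It assumes the subnet is described by a mapping with `availability_domain` and `dns_label` keys, and that a regional subnet reports no availability domain; both the key names and the function `subnet_problems` are assumptions made for this example.

```python
from typing import List, Mapping, Optional


def subnet_problems(subnet: Mapping[str, Optional[str]]) -> List[str]:
    """Return reasons why a subnet might not be usable for a workspace.

    Assumption: a regional subnet has no availability domain set, and a
    subnet that uses DNS hostnames has a non-empty DNS label.
    """
    problems = []
    if subnet.get("availability_domain") is not None:
        problems.append("subnet is tied to one availability domain, not regional")
    if not subnet.get("dns_label"):
        problems.append("subnet has no DNS label, so DNS hostnames are unavailable")
    return problems


if __name__ == "__main__":
    # Hypothetical subnet description used only for illustration.
    candidate = {"availability_domain": None, "dns_label": "disubnet"}
    print(subnet_problems(candidate) or "no problems found")
```

An empty list means that neither of the two checks found a problem; it doesn't guarantee that the subnet appears in the workspace creation page.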
Creating Policies
To control non-administrator user access to Data Integration resources and functions, you create groups in Oracle Cloud Infrastructure Identity and Access Management (IAM) with Identity Domains. Then you write IAM policies that give the groups proper access.
You can use Data Integration policy templates in the IAM Policy Builder to create a policy, or you can manually enter the policy statements in the manual editor. See Writing Policy Statements with the Policy Builder for information about how to use the Policy Builder and policy templates.
To understand the syntax used in writing a policy statement, see Policy Syntax. Ensure that you understand the relationship between Permissions and Verbs.
You can create most of the Data Integration policies at the tenancy level or at the compartment level. The policies listed here are examples, which you can modify to suit access needs.
After you add IAM components (for example, dynamic groups and policy statements), don't try to perform the associated tasks immediately. New IAM policies require about 5 to 10 minutes to take effect.
This policy gives permission to a group to create Data Integration workspaces.
allow group <group-name> to manage dis-workspaces in compartment <compartment-name>
Users with the inspect permission can only list dis-workspaces. Users with the manage permission for dis-workspaces can create and delete workspaces. Users with the use permission can only perform integration activities within workspaces. View more examples to create a policy for specific requirements.
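To make the difference between the three verbs concrete, the following Python sketch renders a workspace policy statement for a chosen access level. The function `workspace_policy` and the `ACCESS_LEVELS` table are hypothetical helpers written for this example, not part of any OCI SDK. The table covers only the verbs described in this section, and the group and compartment names are placeholders.

```python
# Verbs described in this section and what each allows for dis-workspaces.
ACCESS_LEVELS = {
    "inspect": "list workspaces only",
    "use": "perform integration activities within workspaces",
    "manage": "create and delete workspaces",
}


def workspace_policy(verb: str, group: str, compartment: str) -> str:
    """Render a dis-workspaces policy statement for one group and compartment."""
    if verb not in ACCESS_LEVELS:
        raise ValueError(f"unsupported verb: {verb!r}")
    return f"allow group {group} to {verb} dis-workspaces in compartment {compartment}"


if __name__ == "__main__":
    # Hypothetical group and compartment names.
    for verb, meaning in ACCESS_LEVELS.items():
        print(f"{workspace_policy(verb, 'di-users', 'di-compartment')}  # {meaning}")
```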
This policy gives Data Integration access to list users' names in the Created by field when they create projects, data assets, and applications in the workspace.
allow service dataintegration to inspect users in tenancy
To check whether the subnet has enough IP addresses to allocate when you create a workspace with private networking enabled, add the following policy:
allow group <group-name> to inspect instance-family in compartment <compartment-name>
To restrict the permission to a specific API call, add the following policy:
allow group <group-name> to inspect instance-family in compartment <compartment-name> where ALL {request.operation = 'ListVnicAttachments'}
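The restricted statement is the broader statement with a `where ALL {...}` clause appended, in which every listed condition must hold. As a rough sketch of that composition, the hypothetical helper `with_conditions` below (a name invented for this example) appends such a clause to any statement. It only formats text and doesn't validate the statement against IAM.

```python
from typing import Mapping


def with_conditions(statement: str, conditions: Mapping[str, str]) -> str:
    """Append a `where ALL {...}` clause built from variable/value pairs."""
    if not conditions:
        return statement
    clause = ", ".join(f"{name} = '{value}'" for name, value in conditions.items())
    return f"{statement} where ALL {{{clause}}}"


if __name__ == "__main__":
    base = "allow group <group-name> to inspect instance-family in compartment <compartment-name>"
    print(with_conditions(base, {"request.operation": "ListVnicAttachments"}))
```

The same helper can produce the multi-condition clauses used in the workspace policies later in this section, because conditions are joined in the order given.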
Data Integration can be in a different tenancy from data resources. To run a task, Data Integration sends a request to the tenancy. In return, you must give Data Integration permission to manage the virtual networks that you have set up for integration. Create Data Integration workspaces in the same region as the network and securely access the network through private IP addresses. Without a policy to accept this request, data integration fails.
allow service dataintegration to use virtual-network-family in compartment <compartment-name>
The following policy gives permission to a group to manage networking resources in the compartment.
allow group <group-name> to manage virtual-network-family in compartment <compartment-name>
Or, for non-admin users:
allow group <group-name> to use virtual-network-family in compartment <compartment-name>
allow group <group-name> to inspect instance-family in compartment <compartment-name>
You can limit user activities within the network when you assign the inspect permission for VCNs and subnets within the compartment instead of manage. Users can then view existing VCNs and subnets and select them when creating a workspace. View more examples to create a policy for specific requirements.
Create these policies to allow Data Integration to access Object Storage resources, such as objects and buckets.
allow group <group-name> to use object-family in compartment <compartment-name>
allow any-user to use buckets in compartment <compartment-name> where ALL {request.principal.type = 'disworkspace', request.principal.id = '<workspace-ocid>'}
allow any-user to manage objects in compartment <compartment-name> where ALL {request.principal.type = 'disworkspace', request.principal.id = '<workspace-ocid>'}
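The statements above contain placeholders such as `<compartment-name>` and `<workspace-ocid>` that must be replaced before the policy is created. The following sketch shows one way to fill them in and to fail loudly when a value is missing. The helper `fill_placeholders` and the sample values, including the OCID string, are hypothetical and exist only for illustration.

```python
import re
from typing import Mapping

# Matches placeholders written as <lowercase-words-with-hyphens>.
PLACEHOLDER = re.compile(r"<([a-z][a-z-]*)>")

TEMPLATE = (
    "allow any-user to use buckets in compartment <compartment-name> "
    "where ALL {request.principal.type = 'disworkspace', "
    "request.principal.id = '<workspace-ocid>'}"
)


def fill_placeholders(template: str, values: Mapping[str, str]) -> str:
    """Replace each <name> placeholder; raise if a value is not supplied."""
    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for <{name}>")
        return values[name]

    return PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    # Hypothetical values; substitute the real compartment name and OCID.
    print(fill_placeholders(TEMPLATE, {
        "compartment-name": "di-compartment",
        "workspace-ocid": "ocid1.disworkspace.oc1..exampleuniqueid",
    }))
```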
If the Data Integration workspace and Object Storage data source are in different tenancies, then you must also create the following policies for compartments:
In the workspace tenancy:
Endorse any-user to inspect compartments in tenancy <tenancy-name> where ALL {request.principal.type = 'disworkspace'}
In the Object Storage tenancy:
Admit any-user of tenancy <tenancy-name> to inspect compartments in tenancy
Note
Different types of policies (resource principal and on behalf of) are required for using Object Storage. The policies required also depend on whether the Object Storage instance and Data Integration instance are in the same tenancy or different tenancies, and whether you create the policies at the compartment level or tenancy level. Review more examples and the blog Policies in Oracle Cloud Infrastructure (OCI) Data Integration to identify the policies that you need.
Create these policies to allow Data Integration to access buckets and objects in Oracle Cloud Infrastructure Object Storage. The policies are required for staging extracted data, which needs pre-authentication to complete the operations.
allow group <group-name> to use object-family in compartment <compartment-name>
allow any-user to use buckets in compartment <compartment-name> where ALL {request.principal.type = 'disworkspace', request.principal.id = '<workspace-ocid>'}
allow any-user to manage objects in compartment <compartment-name> where ALL {request.principal.type = 'disworkspace', request.principal.id = '<workspace-ocid>'}
allow any-user to manage buckets in compartment <compartment-name> where ALL {request.principal.type = 'disworkspace', request.principal.id = '<workspace-ocid>', request.permission = 'PAR_MANAGE'}
Note
Different types of policies (resource principal and on behalf of) are required for using Object Storage. The policies required also depend on whether the Object Storage instance and Data Integration instance are in the same tenancy or different tenancies, and whether you create the policies at the compartment level or tenancy level. Review more examples and the blog Policies in Oracle Cloud Infrastructure (OCI) Data Integration to identify the policies that you need.
Create this policy to use secrets in OCI Vault for sensitive information.
allow any-user to read secret-bundles in compartment <compartment-name> where ALL {request.principal.type = 'disworkspace', request.principal.id = '<workspace-ocid>'}
The following policy lets a group of non-administrator users use secrets with Oracle Autonomous Data Warehouse and Oracle Autonomous Transaction Processing:
allow group <group-name> to read secret-bundles in compartment <compartment-name>
Create this policy if you use an autonomous database as a target. Autonomous databases use Object Storage for staging data and need pre-authentication to complete operations.
allow any-user to manage buckets in compartment <compartment-name> where ALL {request.principal.type = 'disworkspace', request.principal.id = '<workspace-ocid>', request.permission = 'PAR_MANAGE'}
Create this policy if you want the autonomous database credentials to be retrieved automatically when you create an autonomous database data asset.
allow group <group-name> to read autonomous-database-family in compartment <compartment-name>
Create these policies to publish Data Integration tasks from Data Integration to the OCI Data Flow service.
allow any-user to manage dataflow-application in compartment <compartment-name> where ALL {request.principal.type = 'disworkspace', request.principal.id = '<workspace-ocid>'}
allow any-user to read dataflow-private-endpoint in compartment <compartment-name> where ALL {request.principal.type = 'disworkspace', request.principal.id = '<workspace-ocid>'}
allow group <group-name> to read dataflow-application in compartment <compartment-name>
allow group <group-name> to manage dataflow-run in compartment <compartment-name>
For non-administrator users to publish to OCI Data Flow using a private endpoint, this policy is required to show private endpoints:
allow group <group-name> to inspect dataflow-private-endpoint in compartment <compartment-name>
Creating a Workspace
Before you can get started with Data Integration, you or the administrator must first create a workspace for the data integration projects.
Create a workspace after the connectivity requirements for Data Integration are satisfied. See Creating Resources.
Ensure that you also have the required policies for creating workspaces, as described in Creating Policies. For example, if you're creating a workspace that uses virtual cloud network (VCN) resources, you must create policies to allow Data Integration access to the VCN in the compartment.
If you haven't added the required policies, workspace creation fails. In the Unauthorized access information box that appears, click Manage policies to view the details of the required policy statements. Specify the correct group name and compartment in the statements. If you're an administrator, you can add the policies by clicking Add policies. If you're not an administrator, click Copy policies and then send them to an administrator to add.
You're returned to the Workspaces page. It might take a few minutes before the workspace is ready for you to access. When the status is Active, you can select the workspace from the list.
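If you script workspace creation instead of watching the Workspaces page, the wait for the Active status can be expressed as a simple polling loop. The sketch below is generic: `get_status` is a caller-supplied function that returns the current status as text, and how it obtains that status (for example, through an SDK or CLI call) is left open. The function name `wait_until_active`, the status string `ACTIVE`, and the default timeout and interval are assumptions for this example.

```python
import time
from typing import Callable


def wait_until_active(
    get_status: Callable[[], str],
    timeout_s: float = 600.0,
    interval_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll get_status until it reports ACTIVE or the timeout elapses.

    Returns True if the status became ACTIVE, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if get_status().upper() == "ACTIVE":
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated status sequence standing in for a real status lookup.
    statuses = iter(["CREATING", "CREATING", "ACTIVE"])
    ready = wait_until_active(lambda: next(statuses), timeout_s=5, interval_s=0)
    print("workspace ready" if ready else "timed out")
```

Passing `sleep` as a parameter keeps the loop easy to exercise without real delays.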
For information about navigating and searching in a workspace, see Navigating a Workspace.
To create the workspace later using Resource Manager and Terraform, click Save as stack to save the resource definition as a Terraform configuration.
Use the workspace to create design-time artifacts such as data assets, data flows, and tasks in one or more projects or folders. For information about using projects in a workspace, see Using Projects and Folders.
After creating data assets for the source and target data systems, you create the data integration processes for extracting, loading, and transforming data.
In Data Integration, to ingest and transform data, you create data loader tasks, data flows, integration tasks, and other tasks. To orchestrate a set of tasks in a sequence or in parallel, you create pipelines and pipeline tasks. You can use the following tasks as a guideline.
Create a data loader task from the Tasks section of a project or folder details page. A data loader task takes data from a source, transforms the data, then loads the data into a target.
In the data flow designer, build the logical flow of data from source data assets to target data assets. Add data operators to specify the source and target data sources. Add shaping operators such as filter and join to cleanse, transform, and enrich data.
In the Details tab of an operator in the data flow designer, assign parameters to externalize and override values. By using parameters, different configurations of sources, targets, and transformations can be reused at design time and runtime.
After completing a data flow design, from the Tasks section of a project or folder details page, create an integration task that uses the data flow. Wrapping the data flow in an integration task lets you run the data flow, and you can choose the parameter values you want to use at runtime.
Create a pipeline from the Pipelines section of a project or folder details page. In the pipeline designer, use operators to add the tasks and activities you want to orchestrate as a set of processes in a sequence or in parallel. You can also use parameters to override values at design time and runtime.
After completing a pipeline design, from the Tasks section of a project or folder details page, create a pipeline task that uses the pipeline. Wrapping the pipeline in a pipeline task lets you run the pipeline, and you can choose the parameter values you want to use at runtime.