Set Up Identity and Access Management Policies

Data Flow requires common policies to be set up in Identity and Access Management (IAM) to manage and run Spark applications.

You can use the policy templates in IAM or manually create the policies. For more information on how IAM policies work, see the Identity and Access Management without Identity Domains documentation or the Identity and Access Management with Identity Domains documentation. Create the following four policies:
  • dataflow-service-level-policy
  • dataflow-admins-policy
  • dataflow-data-engineers-policy
  • dataflow-sql-users-policy

Data Flow Policy Templates

Data Flow has four Common Policy Templates. They are listed in the order in which you need to create the policies.

Let Data Flow admins manage all Applications and Runs
  For administrator-like users (super users) of the service, who can take any action on the service, including managing applications owned by other users and runs started by any user within their tenancy, subject to the policies assigned to the group.
Let Data Flow users manage their own Applications and Runs
  For all other users, who are authorized to create and delete only their own applications. They can run any application within their tenancy, but have no other administrative rights, such as deleting applications owned by other users or canceling runs started by other users.
Allow Data Flow service to perform actions on behalf of the user or group on objects within the tenancy
  The Data Flow service needs permission to perform actions on behalf of the user or group on objects within the tenancy.
(Optional) Allow Data Flow users to create, edit, or modify private endpoints
  This policy template allows use of the virtual-network-family, allows access to more specific resources, allows access to specific operations, and allows changing the network configuration.

Creating Policies Using IAM Policy Builder Templates

Use the IAM Policy Builder templates to create your policies for Data Flow.

To create a policy in the Console using the Policy Builder templates, in IAM with or without Identity Domains, follow these steps:
1. From the navigation menu, click Identity & Security.
2. Under Identity, click Policies.
3. Creating dataflow-service-level-policy
  1. From the Policies page, click Create Policy.
  2. Give the policy the Name dataflow-service-level-policy.
  3. (Optional) Enter a Description to help you find the policy. Do not include confidential information.
  4. Select the compartment at the root of your tenancy.
  5. In Policy Use Cases, select Data Flow.
  6. From Common Policy Templates, select Allow Data Flow service to perform common actions on behalf of users.
    Figure 1. Allow Data Flow service to perform common actions on behalf of user Template
    The policy Allow Data Flow service to perform common actions on behalf of user is selected, and the policy statements for it are displayed.
  7. Click Create.
4. Creating dataflow-admins-policy
  1. From the Policies page, click Create Policy.
  2. Give the policy the Name dataflow-admins-policy.
  3. (Optional) Enter a Description to help you find the policy. Don't include confidential information.
  4. Select the compartment at the root of the tenancy.
  5. In Policy Use Cases, select Data Flow.
  6. From Common Policy Templates, select Let dataflow-admins perform common admin operations related to Data Flow service.
    Figure 2. Let dataflow-admins perform common admin operations related to Data Flow service Template
    The policy Let dataflow-admins perform common admin operations related to Data Flow service is selected, and the policy statements for it are displayed.
  7. Click Create.
5. Creating dataflow-data-engineers-policy
  1. From the Policies page, click Create Policy.
  2. Give the policy the Name dataflow-data-engineers-policy.
  3. (Optional) Enter a Description to help you find the policy. Do not include confidential information.
  4. For Compartment, select dataflow-compartment.
  5. In Policy Use Cases, select Data Flow.
  6. From Common Policy Templates, select Let dataflow-data-engineers perform common data engineering operations using Data Flow service.
    Figure 3. Let dataflow-data-engineers perform common data engineering operations using Data Flow service Template
    The policy Let dataflow-data-engineers perform common data engineering operations using Data Flow service is selected, and the policy statements for it are displayed.
  7. Click Groups.
  8. From the list of groups, select dataflow-data-engineers.
  9. From Location, select dataflow-compartment.
  10. Click Create.
6. Creating dataflow-sql-users-policy
  1. From the Policies page, click Create Policy.
  2. Give the policy the Name dataflow-sql-users-policy.
  3. (Optional) Enter a Description to help you find the policy. Do not include confidential information.
  4. For Compartment, select dataflow-compartment.
  5. In Policy Use Cases, select Data Flow.
  6. From Common Policy Templates, select Let dataflow-sql-users read and connect to Data Flow Interactive SQL clusters via JDBC or ODBC.
    Figure 4. Let dataflow-sql-users read and connect to Data Flow Interactive SQL clusters via JDBC or ODBC Template
    The policy Let dataflow-sql-users read and connect to Data Flow Interactive SQL clusters via JDBC or ODBC is selected, and the policy statements for it are displayed.
  7. Click Groups.
  8. From the list of groups, select dataflow-sql-users.
  9. From Location, select dataflow-compartment.
  10. Click Create.
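
The four Console procedures above differ mainly in the policy name, the compartment the policy is created in, and whether a group is selected. As a planning aid, the following minimal Python sketch records those choices as data and prints them as a checklist. The PolicyPlan class and the checklist function are hypothetical names used only for illustration, "root" stands for the compartment at the root of the tenancy, and a group is listed only where the steps select one explicitly.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PolicyPlan:
    """One policy from the steps above (hypothetical helper type)."""
    name: str
    compartment: str
    group: Optional[str] = None


# Creation order and settings taken from the Console steps above.
PLAN = [
    PolicyPlan("dataflow-service-level-policy", "root"),
    PolicyPlan("dataflow-admins-policy", "root"),
    PolicyPlan("dataflow-data-engineers-policy", "dataflow-compartment",
               "dataflow-data-engineers"),
    PolicyPlan("dataflow-sql-users-policy", "dataflow-compartment",
               "dataflow-sql-users"),
]


def checklist(plan: List[PolicyPlan]) -> List[str]:
    """Render each planned policy as a numbered checklist line."""
    lines = []
    for step, policy in enumerate(plan, start=1):
        group = f", group {policy.group}" if policy.group else ""
        lines.append(f"{step}. {policy.name} in {policy.compartment}{group}")
    return lines


for line in checklist(PLAN):
    print(line)
```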

Manually Create Policies

Rather than using the templates in IAM to create the policies for Data Flow, you can create them yourself in IAM Policy Builder.

Follow the steps in Managing Policies in IAM with Identity Domains or without Identity Domains to manually create the following policies:

Data Flow User Policies
As a general practice, categorize your Data Flow users into two groups for clear separation of authority:
  • The first category is for administrator-like users (super users) of the service, who can take any action on the service, including managing applications owned by other users and runs started by any user within their tenancy, subject to the policies assigned to the group.
    • Create a group in your identity service called dataflow-admin and add users to this group.
    • Create a policy called dataflow-admin and add the following statements:
      ALLOW GROUP dataflow-admin TO READ buckets IN <TENANCY>
      ALLOW GROUP dataflow-admin TO MANAGE dataflow-family IN <TENANCY>
      ALLOW GROUP dataflow-admin TO MANAGE objects IN <TENANCY> WHERE ALL
                {target.bucket.name='dataflow-logs', any {request.permission='OBJECT_CREATE',
                request.permission='OBJECT_INSPECT'}}
    This policy includes access to the dataflow-logs bucket.
  • The second category is for all other users, who are authorized to create and delete only their own applications. They can run any application within their tenancy, but have no other administrative rights, such as deleting applications owned by other users or canceling runs started by other users.
    • Create a group in your identity service called dataflow-users and add users to this group.
    • Create a policy called dataflow-users and add the following statements:
      ALLOW GROUP dataflow-users TO READ buckets IN <TENANCY>
      ALLOW GROUP dataflow-users TO USE dataflow-family IN <TENANCY>
      ALLOW GROUP dataflow-users TO MANAGE dataflow-family IN <TENANCY> WHERE ANY 
      {request.user.id = target.user.id, request.permission = 'DATAFLOW_APPLICATION_CREATE', 
      request.permission = 'DATAFLOW_RUN_CREATE'}
      ALLOW GROUP dataflow-users TO MANAGE objects IN <TENANCY> WHERE ALL 
      {target.bucket.name='dataflow-logs', any {request.permission='OBJECT_CREATE', 
      request.permission='OBJECT_INSPECT'}}
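
The two statement sets above differ only in the group name and in how much of dataflow-family the group can manage. The following minimal Python sketch generates both sets from one function, each statement on a single line. The function name user_policy_statements and its parameters are hypothetical; the scope default keeps the <TENANCY> placeholder from the statements above, and the log bucket name is assumed to be dataflow-logs as in this section.

```python
from typing import List


def user_policy_statements(group: str, admin: bool,
                           scope: str = "<TENANCY>",
                           log_bucket: str = "dataflow-logs") -> List[str]:
    """Return the policy statements for one Data Flow user group."""
    statements = [f"ALLOW GROUP {group} TO READ buckets IN {scope}"]
    if admin:
        # Admins manage every Data Flow resource in scope.
        statements.append(
            f"ALLOW GROUP {group} TO MANAGE dataflow-family IN {scope}")
    else:
        # Other users can use everything, but manage only what they own
        # (plus creating applications and runs).
        statements.append(
            f"ALLOW GROUP {group} TO USE dataflow-family IN {scope}")
        statements.append(
            f"ALLOW GROUP {group} TO MANAGE dataflow-family IN {scope} WHERE ANY "
            "{request.user.id = target.user.id, "
            "request.permission = 'DATAFLOW_APPLICATION_CREATE', "
            "request.permission = 'DATAFLOW_RUN_CREATE'}"
        )
    # Both groups get limited object access to the log bucket.
    statements.append(
        f"ALLOW GROUP {group} TO MANAGE objects IN {scope} WHERE ALL "
        f"{{target.bucket.name='{log_bucket}', "
        "any {request.permission='OBJECT_CREATE', "
        "request.permission='OBJECT_INSPECT'}}"
    )
    return statements


for statement in user_policy_statements("dataflow-users", admin=False):
    print(statement)
```

Calling the function with group "dataflow-admin" and admin=True yields the three administrator statements instead.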

Data Flow Service Policy

The Data Flow service needs permission to perform actions on behalf of the user or group on objects within the tenancy.

To set it up, create a policy called dataflow-service and add the following statement:
ALLOW SERVICE dataflow TO READ objects IN tenancy WHERE target.bucket.name='dataflow-logs'
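
If you keep policies in version control or pass them to tooling, it can help to store each policy as a name plus a list of statement strings. The following minimal Python sketch serializes the dataflow-service policy to JSON. The policy_document function and the dictionary layout are illustrative assumptions, not an official API shape, and the description text is made up for the example.

```python
import json
from typing import Dict, List

# The service statement from this section, kept on one line.
SERVICE_STATEMENT = (
    "ALLOW SERVICE dataflow TO READ objects IN tenancy "
    "WHERE target.bucket.name='dataflow-logs'"
)


def policy_document(name: str, statements: List[str],
                    description: str = "") -> Dict[str, object]:
    """Bundle a policy name, description, and statements (hypothetical layout)."""
    return {
        "name": name,
        "description": description,
        "statements": list(statements),
    }


doc = policy_document(
    "dataflow-service",
    [SERVICE_STATEMENT],
    description="Lets the Data Flow service read objects in dataflow-logs",
)
print(json.dumps(doc, indent=2))
```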

Oracle Cloud Infrastructure Logging Policies

These policies allow you to use Oracle Cloud Infrastructure Logging with Data Flow.

To enable service logs, you must grant your user manage access on the log group, and access to the resource. Logs and log groups use the log-group resource-type, but to search the contents of logs, you must use the log-content resource-type. Add the following policies:
allow group dataflow-users to manage log-groups in compartment <compartment_name>
allow group dataflow-users to manage log-content in compartment <compartment_name>
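
The <compartment_name> placeholder must be replaced before the statements are usable. The following minimal Python sketch fills angle-bracket placeholders and raises an error if any are left unfilled. The fill function is a hypothetical helper, and dataflow-compartment is used as the example compartment name because it appears elsewhere in this document.

```python
import re
from typing import List

LOGGING_TEMPLATES = [
    "allow group dataflow-users to manage log-groups in compartment <compartment_name>",
    "allow group dataflow-users to manage log-content in compartment <compartment_name>",
]

# Matches placeholders such as <compartment_name>.
PLACEHOLDER = re.compile(r"<[A-Za-z_]+>")


def fill(template: str, **values: str) -> str:
    """Replace <name> placeholders with values; fail if any remain."""
    filled = PLACEHOLDER.sub(
        lambda match: values.get(match.group(0)[1:-1], match.group(0)),
        template,
    )
    leftover = PLACEHOLDER.findall(filled)
    if leftover:
        raise ValueError(f"unfilled placeholders: {leftover}")
    return filled


statements: List[str] = [
    fill(template, compartment_name="dataflow-compartment")
    for template in LOGGING_TEMPLATES
]
for statement in statements:
    print(statement)
```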

Setting Up a Policy for Spark Streaming

To use Spark Streaming with Data Flow, you need more than the common policies.

You must have created the common policies either using the IAM Policy Builder templates or manually.

You can use the IAM Policy Builder to manage access to the sources and sinks that your streaming applications consume from or produce to, for example a specific stream pool or a specific Object Storage bucket in a location you choose. Alternatively, follow these steps to create a policy manually:

  1. Create a policy called dataflow-streaming-policy at the root of your tenancy.
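  2. Add the following statements to allow Data Flow Runs from the dataflow-compartment compartment to consume from or produce to a specific stream pool and to manage objects in a specific Object Storage bucket. In this example, the stream pool has the ID stream-pool-ocid1 and the bucket is named stream-bucket-1. Replace <compartment_id> with the ID of the compartment and <bucket_name> with the name of the bucket.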
    ALLOW ANY-USER TO {STREAM_INSPECT, STREAM_READ, STREAM_CONSUME, STREAM_PRODUCE} IN TENANCY WHERE ALL
    {request.principal.type='dataflowrun', request.resource.compartment.id = '<compartment_id>', target.streampool.id = 'stream-pool-ocid1'}
    ALLOW ANY-USER TO MANAGE OBJECTS IN TENANCY WHERE ALL 
    {request.principal.type='dataflowrun', request.resource.compartment.id = '<compartment_id>', target.bucket.name = '<bucket_name>'}
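
Both streaming statements share the same two conditions on the requesting Data Flow Run and differ only in the target resource. The following minimal Python sketch builds the WHERE ALL clause from a dictionary of conditions so the shared part is written once. The where_all and streaming_statements functions are hypothetical helpers; spacing around the equals sign is normalized, and the example values are the ones named in step 2.

```python
from typing import Dict, List


def where_all(conditions: Dict[str, str]) -> str:
    """Render variable/value pairs as a WHERE ALL {...} clause."""
    parts = [f"{variable} = '{value}'" for variable, value in conditions.items()]
    return "WHERE ALL {" + ", ".join(parts) + "}"


def streaming_statements(compartment_id: str, stream_pool_id: str,
                         bucket_name: str) -> List[str]:
    """Return the stream pool and bucket statements for Data Flow Runs."""
    # Conditions shared by both statements: the caller must be a
    # Data Flow Run from the given compartment.
    run_principal = {
        "request.principal.type": "dataflowrun",
        "request.resource.compartment.id": compartment_id,
    }
    stream = (
        "ALLOW ANY-USER TO "
        "{STREAM_INSPECT, STREAM_READ, STREAM_CONSUME, STREAM_PRODUCE} "
        "IN TENANCY "
        + where_all({**run_principal, "target.streampool.id": stream_pool_id})
    )
    objects = (
        "ALLOW ANY-USER TO MANAGE OBJECTS IN TENANCY "
        + where_all({**run_principal, "target.bucket.name": bucket_name})
    )
    return [stream, objects]


for statement in streaming_statements(
        "<compartment_id>", "stream-pool-ocid1", "stream-bucket-1"):
    print(statement)
```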