Connect to Data Sources

In this tutorial, you create data assets for the data sources by using Object Storage as the source and Autonomous Data Warehouse as the target.

Before You Begin

You must have the following:

  • Access to a Data Integration workspace. See Connect to Data Integration.
  • Imported sample data in an Object Storage bucket.
  • Policies in the required compartment as follows:

    • To use the Object Storage service:
      allow group <group_name> to use object-family in compartment <compartment-name>
      allow any-user to read buckets in compartment <compartment-name> where ALL {request.principal.type = 'disworkspace', request.principal.id = '<workspace_ocid>', request.operation = 'GetBucket'}
      allow any-user to manage objects in compartment <compartment-name> where ALL {request.principal.type = 'disworkspace', request.principal.id = '<workspace_ocid>'}
    • To use Oracle Autonomous Data Warehouse or Oracle Autonomous Transaction Processing as a target:
      allow any-user to manage buckets in compartment <compartment-name> where ALL {request.principal.type = 'disworkspace', request.principal.id = '<workspace_ocid>', request.permission = 'PAR_MANAGE'}
      

    For reference, see creating a policy and policy examples to enable access to Oracle Object Storage.

  • Access to an Autonomous Data Warehouse database, including credentials and the wallet file. See Download a Wallet.
  • The user name and password to establish connection while creating a new data asset. You create the user name and password in this tutorial.

1. Creating the Source Data Asset

For the purposes of this tutorial, Oracle Cloud Infrastructure Object Storage serves as the source data asset.

To create a data asset for Oracle Cloud Infrastructure Object Storage:

  1. On the workspace Home page, click Create data asset on the Design tile.
  2. On the Select Data asset type panel, with All selected under Connector Type, click the OCI Object Storage icon.

    If you don't see OCI Object Storage, enter obj in the search field.

  3. On the Create data asset page, complete the fields in the General information section:
    1. For Name, enter Data_Lake without any spaces. You can use alphanumeric characters, hyphens, periods, and underscores only.
    2. For Description, enter a description about the data asset.
    3. The Type menu shows OCI Object Storage selected.
    4. The Tenancy OCID and Namespace fields are automatically filled for you.
      Note

      This procedure assumes that the workspace and Object Storage resources are within the same tenancy.

      The namespace value is based on the tenancy OCID value. If the Namespace field isn't automatically filled, you can enter the tenancy name.

    5. For OCI region, select the region for the Object Storage resource, or start typing a name to filter the list and then select a region from the filtered list.
      Important

      The Object Storage data source that you set up as a Source or Target data asset in Data Integration is identified using the region ID, namespace, and bucket name. You don't encounter issues if the Source and Target data assets have different region IDs, different namespaces, and different bucket names. If the Source and Target have the same bucket name but different region IDs and namespaces, or if they have the same namespace but different region IDs and different bucket names, then also you won't encounter any issues. However, if only the Source and Target data asset region ID is different and the namespace and bucket name are the same, the Data Integration tasks would fail.
    6. Ignore the Enable policies to use data asset information box, if you have already added the required policies. Otherwise, click Show more information to view the details of the policy name and policy statements required to use the Object Storage data asset. Add or copy the policies, specifying the group name and compartment in the statements.
  4. In the Default connection information section, enter a name and description (optional).
  5. (Optional) Click Test connection.
    A successful test isn't required to create the data asset.
  6. Click Create.

2. Preparing the Target Database

To complete all the tutorials in this series, configure the target autonomous database to add a schema and a table.

To prepare the target autonomous database with the BETA schema:

  1. In the Oracle Cloud Infrastructure Console navigation menu, go to Oracle Database, and then click Autonomous Data Warehouse.
  2. Select the compartment that has the autonomous database you're going to use.
  3. From the list of autonomous databases, select the database.
  4. On the database details page, click Database Actions.
  5. When prompted, log in with the administrator credentials for the autonomous database.
  6. Under Development, click SQL.
  7. To create the BETA user, enter and then run the following script in the SQL worksheet:
    create user BETA identified by "<example-password>";
    grant DWROLE to BETA;
    alter user BETA quota 200M on data;
    Note

    Ensure that you enter a password in place of <example-password>.
  8. Run the following SQL statements to create the CUSTOMERS_TARGET table.
    CREATE TABLE "BETA"."CUSTOMERS_TARGET"
       ("CUST_ID" NUMBER,
        "LAST_NAME" VARCHAR2(200 BYTE),
        "FIRST_NAME" VARCHAR2(200 BYTE),
        "FULL_NAME" VARCHAR2(200 BYTE),
        "STREET_ADDRESS" VARCHAR2(400 BYTE),
        "POSTAL_CODE" VARCHAR2(10 BYTE),
        "CITY_ID" NUMBER,
        "CITY" VARCHAR2(100 BYTE),
        "STATE_PROVINCE_ID" NUMBER,
        "STATE_PROVINCE" VARCHAR2(100 BYTE),
        "COUNTRY_ID" NUMBER,
        "COUNTRY" VARCHAR2(400 BYTE),
        "CONTINENT_ID" NUMBER,
        "CONTINENT" VARCHAR2(400 BYTE),
        "AGE" NUMBER,
        "COMMUTE_DISTANCE" NUMBER,
        "CREDIT_BALANCE" NUMBER,
        "EDUCATION" VARCHAR2(40 BYTE),
        "EMAIL" VARCHAR2(416 BYTE),
        "FULL_TIME" VARCHAR2(40 BYTE),
        "GENDER" VARCHAR2(6 BYTE),
        "HOUSEHOLD_SIZE" NUMBER,
        "INCOME" NUMBER,
        "INCOME_LEVEL" VARCHAR2(20 BYTE),
        "INSUFF_FUNDS_INCIDENTS" NUMBER,
        "JOB_TYPE" VARCHAR2(200 BYTE),
        "LATE_MORT_RENT_PMTS" NUMBER,
        "MARITAL_STATUS" VARCHAR2(8 BYTE),
        "MORTGAGE_AMT" NUMBER,
        "NUM_CARS" NUMBER,
        "NUM_MORTGAGES" NUMBER,
        "PET" VARCHAR2(40 BYTE),
        "PROMOTION_RESPONSE" NUMBER,
        "RENT_OWN" VARCHAR2(40 BYTE),
        "SEG" NUMBER,
        "WORK_EXPERIENCE" NUMBER,
        "YRS_CURRENT_EMPLOYER" NUMBER,
        "YRS_CUSTOMER" NUMBER,
        "YRS_RESIDENCE" NUMBER,
        "COUNTRY_CODE" VARCHAR2(2 BYTE),
        "ORDER_NUMBER" NUMBER,
        "REVENUE" NUMBER
       ) SEGMENT CREATION IMMEDIATE
      PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255
     NOCOMPRESS LOGGING
      STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
      PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
      BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
      TABLESPACE "USERS" ;
     
    --------------------------------------------------------
    --  Constraints for Table CUSTOMERS_TARGET
    --------------------------------------------------------
      ALTER TABLE "BETA"."CUSTOMERS_TARGET" MODIFY ("CUST_ID" NOT NULL ENABLE);
      ALTER TABLE "BETA"."CUSTOMERS_TARGET" MODIFY ("SEG" NOT NULL ENABLE);
  9. In the Navigator tab on the left side, switch to the BETA schema to verify that the table was created successfully.
    You might have to log out and then log in again.
  10. Log out of the SQL page.

3. Creating the Target Data Asset

Navigate back to Data Integration to create the target data asset.

To create a data asset for the target:

  1. In the Oracle Cloud Infrastructure Console navigation menu, go to Analytics & AI, then click Data Integration.
  2. Click Workspaces.
  3. Navigate to the compartment where you created the workspace, and select the workspace.
  4. On the workspace Home page, click Create data asset from the Quick actions tile.

    You can also click Open tab (plus icon) in the tab bar and select Data assets. Then on the Data assets page, click Create data asset.

  5. On the Create data asset page, for General information, set the following:
    • Name: Data_Warehouse (You can use alphanumeric characters, hyphens, periods, and underscores only).
    • Identifier: A generated identifier based on the value you enter for Name. You can change the generated value, but after you save the data asset, you aren't allowed to update the identifier.
    • Description: Optional
    • Type: Oracle Autonomous Data Warehouse
    • Upload wallet and Wallet file: Drop or browse to select the wallet file. See Download a Wallet.
  6. In the Default connection information section, enter or select the following:
    • Name: Default connection (Optionally, you can rename the connection)
    • Description: Optional (For example, Connect with BETA user)
    • Username: BETA
    • Password: The password you created for BETA
    • TNS alias: The database service to use
  7. In the Default staging location section, enter or select the following:
    • Object Storage data asset: The Object Storage data asset that has the bucket you want to use as the default staging location
    • Connection: The connection
    • Compartment: The compartment
    • Bucket: The bucket name
  8. (Optional) Click Test connection.
    A successful test isn't required to create the data asset.
  9. Click Create.