Replicate data from PostgreSQL to Snowflake using Streaming Handler

Discover how to use OCI GoldenGate to replicate data from PostgreSQL to Snowflake using the Streaming Handler.

Before you begin

To successfully complete this quickstart, you must have the following:

  • Access to OCI Database with PostgreSQL
  • Open port 5432 in the security list of the VCN used by OCI Database with PostgreSQL
  • Access to Snowflake

Set up OCI Database with PostgreSQL

  1. Create an OCI Database with PostgreSQL.
    1. In OCI PostgreSQL:
      1. Click Configurations.
      2. Use an existing configuration, such as PostgreSQL.VM.Standard.E5.Flex-14-0_51.
      3. Click Copy configuration, rename it, add wal_level under User variables (read/write), and set it to 'logical'.
      4. Click Create.
      5. See Copying a Configuration for more information.
    2. Use the configuration with wal_level set to logical when creating the DB system. See Creating a Database System for more information.
  2. Connect to OCI PostgreSQL. See Connecting to a Database for more information.
  3. Create a database and user for GoldenGate in OCI PostgreSQL:
    1. Database
      1. create database ociggll;
      2. \c ociggll;
      3. create schema src_ociggll;
      4. Run the sample script (seedSRCOCIGGLL_PostgreSQL.sql) to create and populate the source tables.
    2. User
      1. create user ggadmin with password '<password>';
      2. GRANT ALL PRIVILEGES ON DATABASE ociggll TO ggadmin;
      3. GRANT SELECT ON ALL TABLES IN SCHEMA src_ociggll TO ggadmin;
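
As a quick sanity check, confirm that logical decoding is enabled and that ggadmin can read the source schema. The following psql sketch assumes the database and schema names above; the GRANT USAGE and ALTER DEFAULT PRIVILEGES statements are not part of the steps above and are shown only as additions that are commonly needed when ggadmin cannot yet select from the source tables:

  -- Connected to the ociggll database as an administrative user
  SHOW wal_level;  -- should return 'logical'
  -- Optional, environment-dependent: let ggadmin use the schema and read
  -- tables created after the grants above were issued
  GRANT USAGE ON SCHEMA src_ociggll TO ggadmin;
  ALTER DEFAULT PRIVILEGES IN SCHEMA src_ociggll GRANT SELECT ON TABLES TO ggadmin;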

Set up Snowflake database

  1. Create Snowflake database.
  2. Create a public and private key pair to authenticate to Snowflake.
  3. Create a user specifically for GoldenGate in Snowflake with appropriate privileges.
  4. Add the public key to the Snowflake user, for example: ALTER USER example_user SET RSA_PUBLIC_KEY='MIIBIjANBgkqh...';
  5. Create the target tables using the sample schema. (A SQL sketch of these Snowflake setup steps follows this list.)
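
The following Snowflake SQL is a minimal sketch of the steps above. The role name (GGADMIN_ROLE), user name (GGADMIN), warehouse name (COMPUTE_WH), and the SRC_CITY column types are assumptions for illustration only; adjust them, supply your own public key value, and repeat the CREATE TABLE pattern for the remaining tables in the sample schema:

  -- Target database and schema (names taken from the Replicat mappings used later in this quickstart)
  CREATE DATABASE OCIGGLL;
  CREATE SCHEMA OCIGGLL.SRCMIRROR_OCIGGLL;

  -- GoldenGate user with key pair authentication
  -- (generate the key pair beforehand, for example with openssl, per the Snowflake documentation)
  CREATE ROLE GGADMIN_ROLE;
  CREATE USER GGADMIN DEFAULT_ROLE = GGADMIN_ROLE;
  ALTER USER GGADMIN SET RSA_PUBLIC_KEY='MIIBIjANBgkqh...';
  GRANT ROLE GGADMIN_ROLE TO USER GGADMIN;
  GRANT USAGE ON WAREHOUSE COMPUTE_WH TO ROLE GGADMIN_ROLE;
  GRANT USAGE ON DATABASE OCIGGLL TO ROLE GGADMIN_ROLE;
  GRANT ALL ON SCHEMA OCIGGLL.SRCMIRROR_OCIGGLL TO ROLE GGADMIN_ROLE;

  -- One of the target tables (column names come from the sample data; the types are assumptions)
  CREATE TABLE OCIGGLL.SRCMIRROR_OCIGGLL.SRC_CITY (
    CITY_ID    NUMBER,
    CITY       VARCHAR,
    REGION_ID  NUMBER,
    POPULATION NUMBER
  );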

Task 1: Create the OCI GoldenGate resources

This quickstart example requires deployments and connections for both the source and the target. Note that GoldenGate for PostgreSQL 23ai is required for the source deployment.
  1. Create a PostgreSQL deployment for the source PostgreSQL database.
  2. Create a Big Data deployment for the target Snowflake database.
  3. Create a PostgreSQL connection with the following values:
    1. For Type, select OCI PostgreSQL from the dropdown.
    2. For Database name, enter ociggll.
    3. For Username, enter ggadmin.
    4. For Password, enter your password.
    5. For Security Protocol, select TLS from the dropdown, and then select Prefer for the SSL mode.
  4. Create a Snowflake connection with the following values:
    1. For Connection URL, enter jdbc:snowflake://<account_identifier>.snowflakecomputing.com/?warehouse=<warehouse name>&db=OCIGGLL.
      Note

      Ensure you replace <account_identifier> and <warehouse name> with the appropriate values. An illustrative example appears after this list.
    2. For Authentication Type, select Key pair authentication from the dropdown.
      Note

      Key pair authentication is the only supported authentication type for Snowflake Streaming.
    3. For Username, enter the Snowflake user created earlier (the user whose public key you registered).
    4. Upload the Private key you created earlier.
    5. Enter the private key passphrase in the Private key password field.
  5. Create a GoldenGate connection that points to the target Big Data deployment, and then assign it to the source PostgreSQL deployment.
  6. Assign the source PostgreSQL connection to the PostgreSQL deployment.
  7. Assign the Snowflake connection to the target Big Data deployment.
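
For reference, an illustrative Snowflake connection URL, using a placeholder account identifier and warehouse name rather than values from this quickstart, might look like:

  jdbc:snowflake://xy12345.us-east-1.snowflakecomputing.com/?warehouse=COMPUTE_WH&db=OCIGGLL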

Task 2: Enable supplemental logging

To enable supplemental logging:
  1. Launch the PostgreSQL GoldenGate deployment console:
    1. From the Deployments page, select the PostgreSQL deployment to view its details.
    2. On the PostgreSQL deployment details page, click Launch console.
    3. On the deployment console sign in page, enter the GoldenGate admin credentials provided in Task 1, step 1.
      Note

      Sign in is required if IAM wasn't selected as the credential store when creating a deployment.
  2. In the GoldenGate 23ai deployment console, click DB Connections in the left navigation, select the source PostgreSQL database connection, and then click Trandata.
  3. On the Trandata page, next to TRANDATA Information, click Add Trandata (plus icon).
  4. In the Trandata panel, for Schema Name, enter src_ociggll.*, and then click Submit.
    Note

    Use the search field to search for src_ociggll and verify the tables were added.
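
For PostgreSQL, enabling trandata corresponds to setting REPLICA IDENTITY (typically FULL) on the source tables so that update and delete records carry the column data GoldenGate needs. If you want to confirm the result from psql, the following query is a sketch that assumes all tables of interest live in src_ociggll; a relreplident value of 'f' means FULL:

  SELECT c.relname, c.relreplident
  FROM pg_class c
  JOIN pg_namespace n ON n.oid = c.relnamespace
  WHERE n.nspname = 'src_ociggll' AND c.relkind = 'r';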

Task 3: Create the Extracts

  1. Add the Change Data Capture Extract:
    1. In the left navigation, click Extracts.
    2. On the Extracts page, click Add Extract (plus icon), and then complete the fields as follows:
      • On the Extract Information page:
        1. For Extract type, select Change Data Capture Extract.
        2. For Process Name, enter a name for the Extract, such as ECDC.
        3. Click Next.
      • On the Extract Options page:
        1. For Source credentials, select Oracle GoldenGate from the Domain dropdown.
        2. Select the source PostgreSQL database from the Alias dropdown.
        3. For Extract Trail Name, enter a two-character trail name, such as C1.
        4. Click Register, and then click Next.
      • On the Extract Parameters page, add:
        TABLE src_ociggll.*;
    3. Click Create and Run.
  2. Add the Initial Load Extract:
    1. On the Extracts page, click Add Extract, and then complete the Add Extract form as follows:
      • On the Extract Information page:
        1. For Extract type, select Initial Load Extract.
        2. For Process Name, enter a name, such as EIL.
        3. Click Next.
      • On the Extract Options page:
        1. For Source credentials, select Oracle GoldenGate from the Domain dropdown.
        2. Select the PostgreSQL database from the Alias dropdown.
        3. For Extract Trail Name, enter a two-character trail name, such as I1.
        4. Click Next.
      • On the Extract Parameters page, replace TABLE *.* with the following:
        TABLE src_ociggll.*;
    2. Click Create and Run.
You return to the Extracts page, where you can observe the Extract starting.
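
When you click Register for the Change Data Capture Extract, GoldenGate registers the Extract with the source database, which creates a logical replication slot. If you want to confirm this from psql, the following query lists the slots; the slot name itself is generated by GoldenGate:

  SELECT slot_name, plugin, slot_type, active
  FROM pg_replication_slots;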

Task 4: Create the Distribution Path for Initial Load Extract

To create a Distribution Path for Initial Load Extract:
  1. In the Oracle Cloud console, on the Deployments page, select the target Big Data deployment.
  2. On the deployment details page, click Launch console. Log in with the admin user details created in Task 1, step 2.
  3. If using the IAM credential store, proceed to step 5 (Create a Distribution Path). If using the GoldenGate credential store, create a user that the source GoldenGate deployment uses to connect to the target GoldenGate deployment:
    1. In the navigation menu, click User Administration.
    2. Click Add New User (plus icon), complete the fields as follows, and then click Submit:
      • For Username, enter ggsnet.
      • For Role, select Operator.
      • Enter the password twice for verification.
  4. In the source PostgreSQL deployment console, create a Path Connection for the user created in the previous step.
    1. In the navigation menu, click Path Connections.
    2. Click Add Path Connection (plus icon), complete the fields as follows, and then click Submit:
      • For Credential Alias, enter dpuser.
      • For User ID, enter ggsnet.
      • For Password, enter the same password used in the previous step.
  5. Create a Distribution Path.
    1. In the service menu bar, click Distribution Service, and then click Add Distribution Path (plus icon).
    2. Complete the Add Path form as follows:
      • On the Path Information page:
        1. For Path Name, enter a name for this path.
        2. Click Next.
      • On the Source Options page:
        1. For Source Extract, leave blank.
        2. For Trail Name, enter the Initial Load Extract trail name (I1).
        3. Click Next.
      • On the Target Options page:
        1. For Target, select wss.
        2. For Target Host, enter the target deployment console URL, without the https:// prefix or any trailing slashes (see the illustrative example at the end of this task).
        3. For Port Number, enter 443.
        4. For Trail Name, enter I1.
        5. For Target Authentication Method, select UserID Alias.
        6. For Domain, enter the domain name created in the previous step.
        7. For Alias, enter the alias created in the previous step (dpuser).
        8. Click Next.
    3. Click Create and Run.
    You return to the Distribution Service page where you can review the path created.
  6. In the target Big Data deployment console, review the Receiver Path created as a result of the Distribution path:
    1. Click Receiver Service.
    2. Review the Receiver Path details.
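
For example, for a hypothetical target deployment whose console URL is https://target-bigdata.example.oraclecloud.com/, the Target Options would be (the host name here is purely illustrative):

  Target Host: target-bigdata.example.oraclecloud.com
  Port Number: 443
  Trail Name:  I1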

Task 5: Add the Replicat for Initial Load

  1. In the target Big Data deployment console, add the Initial Load Replicat.
    1. In the navigation menu, click Replicats, and then click Add Replicat (plus icon).
    2. Complete the Add Replicat form as follows:
      1. On the Replicat Information page:
        1. For Replicat type, select Coordinated Replicat.
        2. For Process Name, enter a name, such as RIL.
        3. Click Next.
      2. On the Replicat Options page:
        1. For Replicat Trail Name, enter the name of the Trail from Task 3 (I1).
        2. For Target, select Snowflake.
        3. For Target Credentials, select the Domain and Alias for the Snowflake connection.
        4. For Available aliases, select an alias from the dropdown, such as Snowflake.
        5. Select Streaming.
        6. Click Next.
      3. On the Parameter File page, add the following mapping:
        INSERTALLRECORDS
        MAP src_ociggll.src_city, TARGET SRCMIRROR_OCIGGLL.SRC_CITY;
        MAP src_ociggll.src_region, TARGET SRCMIRROR_OCIGGLL.SRC_REGION;
        MAP src_ociggll.src_customer, TARGET SRCMIRROR_OCIGGLL.SRC_CUSTOMER;
        MAP src_ociggll.src_orders, TARGET SRCMIRROR_OCIGGLL.SRC_ORDERS;
        MAP src_ociggll.src_order_lines, TARGET SRCMIRROR_OCIGGLL.SRC_ORDER_LINES;
        MAP src_ociggll.src_product, TARGET SRCMIRROR_OCIGGLL.SRC_PRODUCT;
      4. On the Properties page, review the properties, and add jvm.bootoptions= -Djdk.lang.processReaperUseDefaultStackSize=true.
      5. Click Create and Run.

    You return to the Replicats page, where you can review the Replicat details.

  2. To verify the Initial Load, connect to the Snowflake database and run the following queries:
    select * from SRCMIRROR_OCIGGLL.SRC_CITY;
    select * from SRCMIRROR_OCIGGLL.SRC_CUSTOMER;

    The output should return the data that was loaded into the target database tables as a result of the Initial Load.
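
To go a step further, you can compare row counts between the source and the target once the Initial Load Replicat finishes; a simple sketch (run the first query in PostgreSQL and the second in Snowflake):

  -- Source (PostgreSQL)
  SELECT COUNT(*) FROM src_ociggll.src_city;
  -- Target (Snowflake); the counts should match
  SELECT COUNT(*) FROM SRCMIRROR_OCIGGLL.SRC_CITY;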

Task 6: Create the Distribution Path for Change Data Capture

To create a Distribution Path for Change Data Capture:
  1. In the source PostgreSQL deployment console, click Distribution Service.
  2. Click Add Distribution Path.
  3. Complete the Add Path form as follows:
    1. On the Path Information page:
      1. For Path Name, enter a name.
      2. Click Next.
    2. On the Source Options page:
      1. For Source Extract, select the Change Data Capture Extract (ECDC).
      2. For Trail Name, select the Change Data Capture Extract trail (C1).
      3. Click Next.
    3. On the Target Options page:
      1. For Target, select wss.
      2. For Target Host, enter the target deployment console URL (found on the deployment details page), without the https:// prefix or any trailing slashes.
      3. For Port Number, enter 443.
      4. For Trail Name, enter C1.
      5. For Target Authentication Method, select UserID Alias.
      6. For Domain, enter the domain name.
      7. For Alias, enter the alias.
    4. Click Create Path and Run.
  4. In the target Big Data deployment console, click Receiver Service, and then review the Receiver path created.

Task 7: Add a Replicat for Change Data Capture

Add a Replicat for Change Data Capture, and then perform updates on the source PostgreSQL database to verify that the changes replicate to Snowflake.
  1. Add the Replicat.
    1. In the target Big Data deployment console, click Administration Service, and then in the navigation menu, click Replicats.
    2. On the Replicats page, click Add Replicat (plus icon), and then complete the Add Replicat form as follows:
      • On the Replicat Information page:
        1. For Replicat type, select Classic or Coordinated.
        2. For Process Name, enter a name, such as RCDC.
        3. Click Next.
      • On the Replicat Options page:
        1. For Replicat Trail Name, enter the name of the Trail from Task 3 (C1).
        2. For Target, select Snowflake.
        3. For Target Credentials, select the Domain and Alias for the Snowflake connection.
        4. Select Streaming.
      • On the Parameter Files page, add the following mapping, and then click Next:
        INSERTALLRECORDS 
        MAP src_ociggll.src_city, TARGET SRCMIRROR_OCIGGLL.SRC_CITY;
        MAP src_ociggll.src_region, TARGET SRCMIRROR_OCIGGLL.SRC_REGION;
        MAP src_ociggll.src_customer, TARGET SRCMIRROR_OCIGGLL.SRC_CUSTOMER;
        MAP src_ociggll.src_orders, TARGET SRCMIRROR_OCIGGLL.SRC_ORDERS;
        MAP src_ociggll.src_order_lines, TARGET SRCMIRROR_OCIGGLL.SRC_ORDER_LINES;
        MAP src_ociggll.src_product, TARGET SRCMIRROR_OCIGGLL.SRC_PRODUCT;
      • On the Properties page, review the properties, add the following mapping, and then click Create and Run:
        jvm.bootoptions= -Djdk.lang.processReaperUseDefaultStackSize=true

    You return to the Replicats page, where you can review the Replicat details.

  2. Verify Change Data Capture:
    1. Perform updates to the source PostgreSQL database to verify replication to Snowflake. Run the following script to perform inserts into the PostgreSQL database:
      Insert into src_ociggll.src_city (CITY_ID,CITY,REGION_ID,POPULATION) values (1000,'Houston',20,743113);
      Insert into src_ociggll.src_city (CITY_ID,CITY,REGION_ID,POPULATION) values (1001,'Dallas',20,822416);
      Insert into src_ociggll.src_city (CITY_ID,CITY,REGION_ID,POPULATION) values (1002,'San Francisco',21,157574);
      Insert into src_ociggll.src_city (CITY_ID,CITY,REGION_ID,POPULATION) values (1003,'Los Angeles',21,743878);
      Insert into src_ociggll.src_city (CITY_ID,CITY,REGION_ID,POPULATION) values (1004,'San Diego',21,840689);
      Insert into src_ociggll.src_city (CITY_ID,CITY,REGION_ID,POPULATION) values (1005,'Chicago',23,616472);
      Insert into src_ociggll.src_city (CITY_ID,CITY,REGION_ID,POPULATION) values (1006,'Memphis',23,580075);
      Insert into src_ociggll.src_city (CITY_ID,CITY,REGION_ID,POPULATION) values (1007,'New York City',22,124434);
      Insert into src_ociggll.src_city (CITY_ID,CITY,REGION_ID,POPULATION) values (1008,'Boston',22,275581);
      Insert into src_ociggll.src_city (CITY_ID,CITY,REGION_ID,POPULATION) values (1009,'Washington D.C.',22,688002);
    2. In the source PostgreSQL deployment console, select the ECDC Extract, and then click Statistics. Verify that src_ociggll.src_city has 10 inserts.
      Note

      If the Extract captured no inserts, then restart the ECDC Extract.
    3. In the target Big Data deployment console, select the RCDC Replicat, and then review its Details and Statistics to verify the number of inserts.
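
You can also confirm the new rows directly in Snowflake; for example, the following query should return the ten cities inserted above:

  SELECT * FROM SRCMIRROR_OCIGGLL.SRC_CITY
  WHERE CITY_ID BETWEEN 1000 AND 1009
  ORDER BY CITY_ID;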

Task 8: Monitor and maintain processes