Harvest On-Premises Data Sources
Harvesting is a process that extracts technical metadata from your data source into your data catalog. This tutorial provides the steps to harvest from an on-premises Oracle Database data source that is connected to Oracle Cloud Infrastructure using a Virtual Cloud Network (VCN).
In this tutorial, you:
- Create the policies needed to harvest on-premises data sources.
- Obtain the on-premises database access details.
- Create a private endpoint in Data Catalog.
- Attach the private endpoint to your data catalog.
- Create a data asset.
- Add a connection for the data asset.
- Harvest the data asset.
For more information, see Configuring a Private Network.
Before You Begin
To successfully perform this tutorial, you must have the following:
- An Oracle Cloud Infrastructure account. See signing up for Oracle Cloud Infrastructure.
- Access to use the Data Catalog resources. See prerequisites and policy examples.
- A created data catalog instance. See creating a data catalog instance.
Before you can harvest an on-premises data source, you must connect your on-premises data source to Oracle Cloud Infrastructure.
Connecting an On-Premises Data Source to Oracle Cloud Infrastructure
You create policies in Oracle Cloud Infrastructure to allow access to the various resources. Before you can create a private network in your tenancy, you must have the required networking permissions.
In this setup, you create a policy to allow you to perform all networking operations in any compartment in your tenancy.
Perform the following steps:
A Virtual Cloud Network (VCN) is a virtual, private network that you set up in a single Oracle Cloud Infrastructure region. A VCN has a single, contiguous IPv4 CIDR block of your choice.
The allowable VCN size range is /16 to /30. Decide on the CIDR block before you create a VCN. You can't change the CIDR value later. For your reference, here's a CIDR Calculator.
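Before you commit to a CIDR block, you can sanity-check it offline. The following is a minimal sketch using Python's standard `ipaddress` module. The function names and the sample CIDR values are illustrative assumptions, not part of any Oracle tooling. The overlap check reflects the common recommendation that the VCN range not collide with the on-premises range you plan to route to.

```python
import ipaddress

# VCN size range stated in this tutorial: /16 (largest) to /30 (smallest).
MIN_PREFIX = 16
MAX_PREFIX = 30


def check_vcn_cidr(cidr: str) -> ipaddress.IPv4Network:
    """Return the parsed network, or raise ValueError if it is not a usable VCN CIDR."""
    # strict=True rejects values with host bits set, such as 10.0.0.5/16.
    network = ipaddress.IPv4Network(cidr, strict=True)
    if not MIN_PREFIX <= network.prefixlen <= MAX_PREFIX:
        raise ValueError(
            f"{cidr}: prefix must be between /{MIN_PREFIX} and /{MAX_PREFIX}"
        )
    return network


def overlaps_on_premises(vcn_cidr: str, on_prem_cidr: str) -> bool:
    """Return True if the candidate VCN range collides with the on-premises range."""
    return check_vcn_cidr(vcn_cidr).overlaps(ipaddress.IPv4Network(on_prem_cidr))


if __name__ == "__main__":
    # Both CIDR values below are hypothetical examples.
    print(check_vcn_cidr("10.0.0.0/16"))
    print(overlaps_on_premises("10.0.0.0/16", "172.16.0.0/12"))
```

Running the check before you create the VCN matters because the CIDR value can't be changed afterward.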
To create a VCN, complete the following steps:
The VCN is created and the Virtual Cloud Networks Details page for the VCN displays.
By default, a route table, DHCP options, and a security list are automatically created for the VCN.
A Dynamic Routing Gateway (DRG) is a virtual router that you must add to your VCN to provide a path for private network traffic between your VCN and on-premises network.
After you create a DRG, you configure access to the on-premises network using either VPN Connect or FastConnect. In this setup, instructions to use DRG with FastConnect are provided.
To create a DRG, complete the following steps:
- Open the navigation menu and click Networking. Under Customer connectivity, click Dynamic routing gateways.
- Click Create Dynamic Routing Gateway.
- Ensure you have permissions to work in the selected compartment or select the compartment where you want to create the DRG.
- Enter a NAME for the DRG.
- Click Create Dynamic Routing Gateway. The DRG is created and the details page displays. You now configure the DRG with FastConnect.
- Click Virtual Circuits.
- Click the FastConnect link in the Virtual Circuits table. Alternatively, from the navigation menu select Networking and then click FastConnect.
- Create your FastConnect connections depending on whether you want to connect with an Oracle Partner, a third-party provider, or colocate with Oracle. While creating your FastConnect connections, select a Private Virtual Circuit and the DRG you created previously.
After you have configured the Virtual Circuits for your DRG, you attach this DRG to your VCN. You can attach only one DRG to a VCN at a time, and you can attach a DRG to only one VCN at a time.
Complete the following steps to attach a DRG to a VCN:
- Open the navigation menu, click Networking, and then click Virtual cloud networks.
- Click the VCN you created before.
- Click Dynamic Routing Gateways.
- Click Attach Dynamic Routing Gateway.
- Select the compartment where you created the DRG and then select the DRG you created and configured before.
- Click Attach.
You route your private subnet traffic to the DRG using a route table for the private subnet.
Complete the following steps to update the default route table for your private subnet:
- Open the navigation menu, click Networking, and then click Virtual cloud networks.
- Click the VCN you created previously to view the VCN details.
- Click Route Tables.
- Click Default Route Table for <your-vcn> or the route table you specified for the private subnet.
- Click Add Route Rules.
- In the Add Route Rules panel, select Dynamic Routing Gateway for TARGET TYPE.
- In DESTINATION CIDR BLOCK, enter the CIDR for your on-premises network.
- Click Add Route Rules.
You use DHCP options for a VCN to automatically provide configuration information to instances when they boot up.
To create DHCP options, complete the following steps:
- Open the navigation menu, click Networking, and then click Virtual cloud networks.
- Select the VCN you're configuring for your on-premises data source.
- Click DHCP Options.
- Click Create DHCP Options.
- Enter a name for the DHCP options.
- Ensure you have permissions to work in the selected compartment or select the compartment where you want to create the DHCP options.
- Select Internet and VCN Resolver for DNS Type.
- Click Create DHCP Options.
You use a Network Address Translation (NAT) gateway to give instances in your private network access to the internet without exposing your instances to incoming internet connections. Creating a NAT gateway is an optional step that you can complete for your VCN. You can harvest the on-premises Oracle Database without creating a NAT gateway.
To create a NAT gateway and route traffic to it, complete the following steps:
- Open the navigation menu, click Networking, and then click Virtual cloud networks.
- Select the VCN you are configuring for your on-premises data source.
- Click NAT Gateways.
- Click Create NAT Gateway.
- Enter a name for the NAT Gateway.
- Ensure you have permissions to work in the selected compartment or select the compartment where you want to create the NAT Gateway.
- Click Create NAT Gateway.
- Create a route rule that directs the required traffic from your private subnet to the NAT gateway.
- Click Route Tables.
- Select the route table for your VCN.
- Click Add Route Rule.
- Select NAT Gateway for Target Type.
- Enter 0.0.0.0/0 for Destination CIDR Block.
- Select the compartment where you created the NAT Gateway previously.
- Select the NAT Gateway you created previously for Target NAT Gateway.
- Click Add Route Rule.
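If the DRG rule for your on-premises CIDR and the optional 0.0.0.0/0 NAT gateway rule end up in the same route table, the more specific rule is the one that applies to on-premises traffic. The sketch below models that selection with a longest-prefix match. It is an illustration only: the rule list, the 172.16.0.0/12 on-premises CIDR, and the function name are assumptions, and the code doesn't query any Oracle Cloud Infrastructure service.

```python
import ipaddress

# Hypothetical route rules mirroring the DRG and NAT gateway rules in this tutorial.
ROUTE_RULES = [
    ("172.16.0.0/12", "DRG"),      # assumed on-premises network CIDR
    ("0.0.0.0/0", "NAT Gateway"),  # all other destinations
]


def select_target(destination_ip, rules=ROUTE_RULES):
    """Return the target of the most specific rule matching destination_ip, or None."""
    address = ipaddress.ip_address(destination_ip)
    matches = []
    for cidr, target in rules:
        network = ipaddress.ip_network(cidr)
        if address in network:
            matches.append((network.prefixlen, target))
    if not matches:
        return None
    # The rule with the longest prefix is the most specific match.
    return max(matches, key=lambda match: match[0])[1]


if __name__ == "__main__":
    print(select_target("172.16.4.10"))  # on-premises database host (hypothetical)
    print(select_target("203.0.113.7"))  # public address
```

With these sample rules, traffic to the on-premises database goes through the DRG even though the 0.0.0.0/0 rule also matches.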
When you create a VCN, a security list is created by default for the VCN. You can add more security rules to this default security list or create a security list to permit traffic in and out of your VCN. In this tutorial, you add a security rule to the default security list.
Complete the following steps to add the required security rule to the default security list:
- Open the navigation menu, click Networking, and then click Virtual cloud networks.
- Click the VCN you created previously to view the VCN details.
- Click Security Lists from the Virtual Cloud Networks Details page of the VCN you created before.
- Click the Default Security List for <your vcn>.
- Click Egress Rules.
- Ensure you have the default egress rule created to allow traffic for all protocols.
- Click Ingress Rules.
- Click Add Ingress Rules.
- Enter the CIDR of your on-premises network that contains the DNS server IP used to resolve the FQDNs of your on-premises data sources.
- Select TCP for IP PROTOCOL.
- Enter 1521–1522 for DESTINATION PORT RANGE.
- Click Add Ingress Rules.
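To reason about what the ingress rule above permits, it can help to model it. The following sketch assumes a hypothetical on-premises CIDR of 172.16.0.0/12 and represents the rule as a small data class. The class and field names are illustrative and don't correspond to an Oracle API.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class IngressRule:
    """A simplified stateful ingress rule: source CIDR, protocol, destination port range."""

    source_cidr: str
    protocol: str
    port_min: int
    port_max: int

    def allows(self, source_ip: str, protocol: str, port: int) -> bool:
        """Return True if traffic from source_ip to the given port matches this rule."""
        in_source = ipaddress.ip_address(source_ip) in ipaddress.ip_network(self.source_cidr)
        same_protocol = protocol.upper() == self.protocol.upper()
        in_port_range = self.port_min <= port <= self.port_max
        return in_source and same_protocol and in_port_range


# Rule from the steps above; the source CIDR is an assumed example value.
ORACLE_LISTENER_RULE = IngressRule("172.16.0.0/12", "TCP", 1521, 1522)

if __name__ == "__main__":
    print(ORACLE_LISTENER_RULE.allows("172.16.8.20", "tcp", 1521))
    print(ORACLE_LISTENER_RULE.allows("172.16.8.20", "tcp", 1523))
```

A listener on any port outside 1521 to 1522 would need its own rule or a wider port range.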
Subnets are divisions you create in a VCN. Each subnet consists of a contiguous range of IP addresses that don't overlap with other subnets in the VCN. You create a private subnet when you don't want the resources created in the subnet to have public IP addresses.
Complete the following steps to create a private subnet:
- Click Create Subnet from the Virtual Cloud Networks Details page of the VCN you created in the previous step.
- Enter a name for the private subnet.
- Retain the default regional selection for SUBNET TYPE.
- Enter the CIDR block for the private subnet.
- Select the default route table or the route table you updated with the DRG.
- Select PRIVATE SUBNET for SUBNET ACCESS.
- Select Use DNS Hostnames in this Subnet for DNS RESOLUTION.
- Enter a DNS label.
- Select the DHCP options you created previously.
- Select the default security lists where you added the security rule for your VCN.
- Click Create Subnet.
The private subnet is created and displayed on the Subnets page in the compartment you chose.
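Because a subnet must sit inside the VCN range and must not overlap other subnets, you may want to validate the subnet CIDR before entering it. This is a minimal sketch with Python's `ipaddress` module. The function name and all CIDR values are hypothetical.

```python
import ipaddress


def validate_subnet(vcn_cidr, new_subnet_cidr, existing_subnet_cidrs=()):
    """Return the parsed subnet, or raise ValueError if it cannot be used in the VCN."""
    vcn = ipaddress.ip_network(vcn_cidr)
    subnet = ipaddress.ip_network(new_subnet_cidr)

    # The subnet must be fully contained in the VCN's CIDR block.
    if not subnet.subnet_of(vcn):
        raise ValueError(f"{subnet} is not inside VCN range {vcn}")

    # The subnet must not overlap any subnet that already exists in the VCN.
    for existing in existing_subnet_cidrs:
        if subnet.overlaps(ipaddress.ip_network(existing)):
            raise ValueError(f"{subnet} overlaps existing subnet {existing}")

    return subnet


if __name__ == "__main__":
    # Hypothetical VCN with one existing subnet.
    print(validate_subnet("10.0.0.0/16", "10.0.1.0/24", ["10.0.0.0/24"]))
```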
1. Create Access Policies
To configure Data Catalog to access the private network of a data source, you need access to networking and data catalog resources.
If you already have access to perform all Data Catalog and Networking operations in your required compartments, you may skip this step.
To create the policy needed to configure a private network in data catalog, perform the following steps:
2. Obtain Data Source Details
You need the private network and database connection information for the on-premises Oracle Database you want to harvest.
Obtain the following details for the on-premises Oracle Database from your administrator:
- For configuring the private network, you need the VCN and subnet name and the URL of the Oracle Database.
- For creating the data asset, you need the Oracle Database host, port, and database service name or SID.
- For adding a connection, you need the database login credentials.
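Data Catalog takes the host, port, and service name or SID as separate fields, but when you confirm the details with your administrator it can be handy to express them as a single connection string that another client can test. The sketch below builds an Oracle JDBC thin-driver URL in either the service-name or the SID form. The function name and the sample host are assumptions for illustration.

```python
def oracle_jdbc_url(host, port, service_name=None, sid=None):
    """Build a JDBC thin URL from host, port, and exactly one of service_name or sid."""
    if (service_name is None) == (sid is None):
        raise ValueError("provide exactly one of service_name or sid")
    if service_name is not None:
        # Service-name form: jdbc:oracle:thin:@//host:port/service_name
        return f"jdbc:oracle:thin:@//{host}:{port}/{service_name}"
    # SID form: jdbc:oracle:thin:@host:port:SID
    return f"jdbc:oracle:thin:@{host}:{port}:{sid}"


if __name__ == "__main__":
    # Hypothetical host and identifiers.
    print(oracle_jdbc_url("db1.corp.example.com", 1521, service_name="orclpdb1"))
    print(oracle_jdbc_url("db1.corp.example.com", 1521, sid="ORCL"))
```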
3. Create a Private Endpoint
You create a Data Catalog private endpoint to configure the network access details for the on-premises Oracle Database data source you want to harvest.
To create a private endpoint in Data Catalog, perform the following steps:
- Open the navigation menu and click Analytics & AI. Under Data Lake, click Data Catalog.
- Click Private Endpoints.
- On the Private Endpoints page, click Create Private Endpoint.
- In the Create Private Endpoint panel, ensure you have permission to work in the selected compartment, and enter a name for the private endpoint. For example, XYZ Private Endpoint.
- Select the VCN and subnet that are used to connect your on-premises Oracle Database to Oracle Cloud Infrastructure.
- Enter the DNS zone for the Oracle Database. Use a comma to enter more than one data source DNS zone.
- Click Create.
The private endpoint is created successfully when it shows the ACTIVE status. If the private endpoint status changes to FAILED, ensure you have created the access policies and set up your private network correctly.
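A DNS zone entry only helps if the database hostname actually falls under one of the zones you list. The following sketch checks that relationship and builds the comma-separated value for the DNS zone field. The hostnames, zones, and function names are hypothetical, and the suffix check is a plain string comparison rather than a DNS lookup.

```python
def host_in_zones(fqdn, zones):
    """Return True if fqdn equals, or is a subdomain of, one of the given DNS zones."""
    name = fqdn.rstrip(".").lower()
    for zone in zones:
        candidate = zone.strip().rstrip(".").lower()
        # Match on a label boundary so "badexample.com" is not treated as "example.com".
        if candidate and (name == candidate or name.endswith("." + candidate)):
            return True
    return False


def dns_zone_field(zones):
    """Join zones with commas, dropping blanks and surrounding whitespace."""
    return ",".join(zone.strip() for zone in zones if zone.strip())


if __name__ == "__main__":
    zones = ["corp.example.com", "example.net"]  # assumed zones
    print(host_in_zones("db1.corp.example.com", zones))
    print(dns_zone_field(zones))
```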
4. Attach a Private Endpoint
You attach a private endpoint to a data catalog to allow data assets to be created for data sources available in the private network.
To attach a private endpoint to a data catalog, perform the following steps:
- Click Data Catalogs.
- Click the Actions menu for the data catalog where you want to attach the private endpoint and select Attach Private Endpoint.
- Select the private endpoint you created in the previous step and click Attach.
The status of the data catalog changes to Updating, and the private endpoint is being attached. After the private endpoint is attached successfully, the status of the data catalog changes to Active.
5. Create an Oracle Database Data Asset
You are now ready to register your on-premises Oracle Database data source with Data Catalog as a data asset.
To create an Oracle Database data asset, perform the following steps:
- Click the data catalog instance where you attached the private endpoint in the previous step.
- From your data catalog Home tab, click Create Data Asset from the Quick Actions tile.
- In the Create Data Asset panel, enter a name to uniquely identify your data asset. Optionally, enter a description.
- From the Type list, select Oracle Database.
- In the Host field, enter the database hostname.
- In the Port field, enter the database port.
- In the Database field, enter the database service name or SID.
- Select the Use private endpoint check box.
- Click Create.
6. Add a Connection
After creating the Oracle Database data asset, you add a connection for the data asset.
For Oracle Database, you can use secrets in Oracle Cloud Infrastructure Vault to store the password that you need to connect to the source using a connection. By using OCI Vault, you provide the OCID of the secret when specifying the connection details, so you don't have to enter the actual password when you create the data asset.
A vault is a container for keys and secrets. Secrets store credentials such as required passwords for connecting to data sources. You use an encryption key in a vault to encrypt and import secret contents to the vault. Secret contents are base64-encoded. Data Catalog uses the same key to retrieve and decrypt secrets while connecting a data asset to the data source. For more information about vault, key, and secret, see Overview of Vault. For information about copying the secret OCID, see View Secret Details.
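To make the base64 step concrete, the sketch below encodes and decodes a secret value with Python's standard `base64` module. It only illustrates the encoding. It doesn't call the Vault service, and the function names and sample value are assumptions.

```python
import base64


def encode_secret(plaintext: str) -> str:
    """Return the base64 text form of a UTF-8 secret value."""
    return base64.b64encode(plaintext.encode("utf-8")).decode("ascii")


def decode_secret(encoded: str) -> str:
    """Reverse encode_secret; raises an error if the input is not valid base64."""
    return base64.b64decode(encoded, validate=True).decode("utf-8")


if __name__ == "__main__":
    encoded = encode_secret("example-password")  # placeholder, not a real credential
    print(encoded)
    print(decode_secret(encoded))
```

Base64 is an encoding, not encryption. The protection of the secret comes from the vault's encryption key, as described above.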
To add a connection for the Oracle Database data asset, follow these steps:
7. Harvest the Data Asset
You are now ready to harvest your Oracle Database data asset.
To harvest your Oracle Database data asset, perform the following steps:
What's Next
Now, you can explore the data asset, create a glossary, and link terms and tags to data objects.