DBMS_DCAT Package

The DBMS_DCAT package provides functions and procedures to help Autonomous Database users leverage the data discovery and centralized metadata management system of OCI Data Catalog.

Data Catalog harvests metadata from a data lake's object storage assets. The harvesting process creates logical entities, which can be thought of as tables with columns and associated data types. DBMS_DCAT procedures and functions connect Autonomous Database to Data Catalog and then synchronize the assets with the database, creating protected schemas and external tables. You can then query object store using those external tables, easily joining external data with data stored in Autonomous Database. This dramatically simplifies the management process; there is a single, centrally managed metadata store that is shared across multiple OCI services (including Autonomous Databases). There are also Autonomous Database dictionary views that allow you to inspect the contents of Data Catalog using SQL, and show you how these Data Catalog entities map to your Autonomous Database schemas and tables.

Data Catalog Users and Roles

The DBMS_DCAT package supports synced users/schemas, dcat_admin users and local users. Users must have the dcat_sync role to be able to use this package.

Data Catalog Users

  • Synced users/schemas

    The synced external tables are organized into database schemas corresponding to Data Asset/Bucket combinations, or according to custom properties set by the user. The synced schemas are automatically created/dropped during Data Catalog synchronization. They are created as no authentication users without the CREATE SESSION privilege. The synced schemas are also created using the protected clause, so that they cannot be altered by local users (not even the PDB admin) and can only be modified through the synchronization.

  • User dcat_admin

    User dcat_admin is a local database user that can run a sync and grant READ privilege on synced tables to other users or roles. The user is created as a no authentication user without the CREATE SESSION privilege.

  • Local users

    Database users querying the external tables must be explicitly granted READ privileges on the synced external tables by users dcat_admin or ADMIN. By default, after the sync is completed, only users dcat_admin and ADMIN have access to the synced external tables.

Data Catalog Roles

  • dcat_sync

    The dcat_sync role has all the required privileges for using the DBMS_DCAT package. Users must have this role to be able to use the API for navigating the Data Catalog and running the sync.

Required Credentials and IAM Policies

This topic describes the Oracle Cloud Infrastructure Identity and Access Management (IAM) user credentials and policies required to give Autonomous Database users permission to manage a data catalog and to read from object storage.

OCI Data Catalog Credential and Policy Requirements:

AWS Glue Data Catalog Credential and Policy Requirements

The following user credentials and policies are required to give Autonomous Database users permission to access Amazon Web Services (AWS) Glue Data Catalogs and to read from the S3 object storage:
  • A credential object with permission to access an AWS Glue Data Catalog is required. For information on managing credentials, see DBMS_CLOUD for Access Management.

    For accessing an AWS Glue Data Catalog the following privileges are required: glue:GetDatabases , glue:GetTables , and glue:GetTable.

    In addition, privilege s3:GetBucketLocation is needed during synchronization for generating resolvable https urls pointing to the underlying S3 objects.
  • A credential object with permission to access the files stored in S3 is required so that Autonomous Database can query data files.
  • AWS credentials are supported. AWS Amazon Resource Names (ARN) credentials are not supported.

Example: Creating an OCI Native Authentication Credential Object

In this example, we create an OCI native authentication credential that can be used when creating a data catalog or an object store credential object. For more details, see DBMS_DCAT SET_DATA_CATALOG_CREDENTIAL Procedure and DBMS_DCAT SET_OBJECT_STORE_CREDENTIAL Procedure respectively.

In OCI native authentication, the DBMS_CLOUD.CREATE_CREDENTIAL procedure includes these parameters: credential_name, user_ocid, tenancy_ocid, private_key, and fingerprint. See DBMS_CLOUD CREATE_CREDENTIAL Procedure for a complete description of this procedure.

The credential_name is the name of the credential object. The user_ocid and tenancy_ocid parameters correspond to the user's and tenancy's OCIDs respectively.

The private_key parameter specifies the generated private key in PEM format. Private keys created with a passphrase are not supported. Therefore, we need to make sure we generate a key with no passphrase. See How to Generate an API Signing Key for more details on how to create a private key with no passphrase. Also, the private key that we provide for this parameter must only contain the key itself without any header or footer (e.g. ‘-----BEGIN RSA PRIVATE KEY-----', ‘-----END RSA PRIVATE KEY-----’).

The fingerprint parameter specifies the fingerprint that is obtained either after uploading the public key to the console, or using the OpenSSL commands. See How to Upload the Public Key and How to Get the Key's Fingerprint for further details on obtaining the fingerprint.

Once all the necessary information is gathered and the private key is generated, we're ready to run the following CREATE_CREDENTIAL procedure:

BEGIN
  DBMS_CLOUD.CREATE_CREDENTIAL (
    credential_name => 'OCI_NATIVE_CRED',
    user_ocid              => 'ocid1.user.oc1..aaaaaaaatfn77fe3fxux3o5lego7glqjejrzjsqsrs64f4jsjrhbsk5qzndq',
    tenancy_ocid           => 'ocid1.tenancy.oc1..aaaaaaaapwkfqz3upqklvmelbm3j77nn3y7uqmlsod75rea5zmtmbl574ve6a',
    private_key            => 'MIIEogIBAAKCAQEA...t9SH7Zx7a5iV7QZJS5WeFLMUEv+YbYAjnXK+dOnPQtkhOblQwCEY3Hsblj7Xz7o=',
    fingerprint            => '4f:0c:d6:b7:f2:43:3c:08:df:62:e3:b2:27:2e:3c:7a');
END;
/
After creating the credential object, it displays in the dba_credentials table:
SELECT owner, credential_name
FROM dba_credentials 
WHERE credential_name LIKE '%NATIVE%';

OWNER CREDENTIAL_NAME
----- ---------------
ADMIN OCI_NATIVE_CRED

Example: Using Autonomous Database Resource Principal

In this example, a dynamic group is created that includes appropriate resource members, the dynamic group is given permission to manage a Data Catalog, and then the dynamic group is given permission to read from object storage.

  1. Create a dynamic group named adb-grp-1. Add a matching rule to adb-grp-1 that includes the Autonomous Database instance with OCID ocid1.autonomousdatabase.oc1.iad.abuwcljr...fjkfe as a resource member.

    Dynamic group matching rule:

    resource.id = 'ocid1.autonomousdatabase.oc1.iad.abuwcljr...fjkfe'
  2. Define a policy granting the adb-grp-1 dynamic group full access to the Data Catalog instances, in the mycompartment compartment.

    allow dynamic-group adb-grp-1 to manage data-catalog-family in compartment mycompartment
  3. Define a policy that allows the adb-grp-1 dynamic group to read any bucket in the compartment named mycompartment.
    allow dynamic-group adb-grp-1 to read objects in compartment mycompartment

Example: Using User Principals

In this example, user1 is a member of the group adb-admins. All members of this group are given permission to manage all data catalogs in mycompartment, and to read from object-store in mycompartment.

  1. Allow users that are members of adb-admins to manage all data catalogs within mycompartment.
    allow group adb-admins to manage data-catalog-family in compartment mycompartment
  2. Allow users that are members of adb-admins to read any object in any bucket within mycompartment.
    allow group adb-admins to read objects in compartment mycompartment

Summary of Connection Management Subprograms

This table lists the DBMS_DCAT package procedures used to create, query and drop Data Catalog connections.

Subprogram Description
SET_DATA_CATALOG_CONN Procedure Create a connection to the given data catalog
SET_DATA_CATALOG_CREDENTIAL Procedure Set the data catalog access credential used by a specific connection to the data catalog
SET_OBJECT_STORE_CREDENTIAL Procedure Set the credential used by the given unique connection identifier for accessing the Object Store
UNSET_DATA_CATALOG_CONN Procedure Remove an existing Data Catalog connection

SET_DATA_CATALOG_CREDENTIAL Procedure

This procedure sets the Data Catalog access credential used by a specific connection to the Data Catalog.

Syntax

PROCEDURE DBMS_DCAT.SET_DATA_CATALOG_CREDENTIAL(
    credential_name VARCHAR2(128) DEFAULT NULL,
    dcat_con_id     VARCHAR2 DEFAULT NULL
  );

Parameters

Parameter Description

credential_name

(Optional) The credential used for accessing the Data Catalog.

dcat_con_id The unique Data Catalog connection identifier. This credential is used for the connection identified by dcat_con_id. The default is Null.

Usage

This credential must have Manage Data Catalog permissions; see Data Catalog Policies. The default is the resource principal; see Access Cloud Resources by Configuring Policies and Roles.

SET_OBJECT_STORE_CREDENTIAL Procedure

This procedure sets the credential that is used by the given unique connection identifier for accessing the Object Store. Changing the Object Store access credential alters all existing synced tables to use the new credential.

Syntax

PROCEDURE DBMS_DCAT.SET_OBJECT_STORE_CREDENTIAL(
    credential_name  VARCHAR2(128),
    dcat_con_id      IN VARCHAR2 DEFAULT NULL
  );

Parameters

Parameter Description

credential_name

The credential used by the external tables for accessing the Object Store.

dcat_con_id The unique Data Catalog connection identifier. The default is NULL.

SET_DATA_CATALOG_CONN Procedure

This procedure creates a connection to the given Data Catalog. The connection is required to synchronize metadata with Data Catalog. An Autonomous Database instance can connect to multiple Data Catalog instances and supports connecting to OCI Data Catalogs and AWS Glue Data Catalogs.

Syntax

PROCEDURE DBMS_DCAT.SET_DATA_CATALOG_CONN (
   region       VARCHAR2 DEFAULT NULL,
   endpoint     VARCHAR2 DEFAULT NULL,
   catalog_id   VARCHAR2 DEFAULT NULL,
   dcat_con_id  VARCHAR2 DEFAULT NULL,
   catalog_type VARCHAR2 DEFAULT NULL
 );

Parameters

Parameter Description

region

The Data Catalog region. If the endpoint is specified, region is optional. If both endpoint and region are given, then the endpoint takes precedence. Default is NULL.

endpoint

The Data Catalog endpoint. If the region is specified, endpoint is optional. If both endpoint and region are given, then the endpoint takes precedence. Default is NULL.

catalog_id

The unique Oracle Cloud Identifier (OCID) for the Data Catalog instance. When connecting to AWS Glue Data Catalogs, catalog_id is optional.

dcat_con_id A unique Data Catalog connection identifier. This identifier is required when connecting to multiple Data Catalogs and is optional when connecting to only one. It is used to refer to the Data Catalog connection in subsequent calls or when querying views. If no identifier is specified this procedure generates a NULL connection identifier. The following restrictions apply for dcat_con_id:
  • It must be unique within the Autonomous Database instance.
  • It must start with a letter.
  • It may contain alphanumeric characters, underscores (_), dollar signs ($), and pound signs (#).
  • It must be at least 16 characters long.
catalog_type The type of data catalog to connect. Allowed values:
  • OCI_DCAT - OCI Data Catalog
  • AWS_GLUE - AWS Glue Data Catalog
  • NULL - The catalog type is automatically detected from the provided region or endpoint.

Usage

You only need to call this procedure once to set the connection. As part of the connection process, Autonomous Database adds custom properties to Data Catalog. These custom properties are accessible to Data Catalog users and allow you to override default names (for schemas, tables and columns) and column data types.

Before creating a connection, credentials must be created and set. For a description of the connection process, see Typical Workflow with Data Catalog for OCI Data Catalogs and User Workflow for Querying with AWS Glue Data Catalog for AWS Glue Data Catalogs.

Example: Connecting with a known OCID

In this example, Autonomous Database is connecting to Data Catalog in the uk-london-1 region. The catalog_id parameter uses the Oracle Cloud Identifier (ocid) for the Data Catalog instance. The type of Data Catalog is automatically determined: AWS Glue Data Catalog or OCI Data Catalog.

BEGIN
  DBMS_DCAT.SET_DATA_CATALOG_CONN(
    region=>'uk-london-1',
    catalog_id=>'ocid1.datacatalog.oc1.uk-london-1...');
END;
/

Example: Connecting to an AWS Glue Data Catalog

A connection is the association between an Autonomous Database instance and an AWS Glue Data Catalog. After a successful connection, the Autonomous Database instance is able to synchronize with AWS Glue. Each AWS account has one AWS Glue Data Catalog per region and each catalog can be accessed using the corresponding service endpoint for each region. An Autonomous Database instance can be associated with an AWS Glue Data Catalog by invoking the API DBMS_DCAT.SET_DATA_CATALOG_CONN and specify the endpoint for the region where the catalog resides.

See AWS Glue endpoints and quotas.

In this example, Autonomous Database is connecting to an AWS Glue Data Catalog in the uk-london-1 region. Because this is an AWS Glue Data Catalog connection, the catalog_id parameter is not needed.

BEGIN
  DBMS_DCAT.SET_DATA_CATALOG_CONN(
    region=>'uk-london-1',
    catalog_type=>'AWS_GLUE'
END;
/

UNSET_DATA_CATALOG_CONN Procedure

This procedure removes an existing Data Catalog connection.

Syntax

Note

Invoking this procedure drops all of the protected schemas and external tables that were created as part of previous synchronizations. It does not impact the metadata in Data Catalog.
PROCEDURE DBMS_DCAT.UNSET_DATA_CATALOG_CONN (
    dcat_con_id IN VARCHAR2 DEFAULT NULL
);

Parameters

Parameter Description
dcat_con_id The unique Data Catalog connection identifier. Default is Null.

Summary of Synchronization Subprograms

Running a synchronization, creating and dropping a synchronization job, and dropping synchronized schemas can be performed with the procedures listed in this table.

Note

On April 4, 2022, the sync_option and grant_read parameters were added to the DBMS_DCAT.RUN_SYNC procedure. To ensure correct performance of scheduled sync jobs created prior to that date, you need to drop and recreate the scheduled sync jobs. See DBMS_DCAT.DROP_SYNC_JOB Procedure and DBMS_DCAT.CREATE_SYNC_JOB Procedure.
Subprogram Description
CREATE_SYNC_JOB Procedure Create a scheduler job to invoke RUN_SYNC periodically
DROP_SYNC_JOB Procedure Drop an existing sync job for the given unique connection identifier
DROP_SYNCED_SCHEMAS Procedure Drop all previously synchronized schemas for the given unique connection identifier
RUN_SYNC Procedure Run a synchronization operation

RUN_SYNC Procedure

This procedure runs a synchronization operation and is the entry point to the synchronization. As an input, it takes lists of selected data catalog assets, folders and entities and materializes them by creating, dropping, and altering external tables.

The sync_option parameter specifies which operation the RUN_SYNC procedure performs: SYNC, DELETE or REPLACE. The operation is performed over entities within the scope of the synced_objects parameter.

Every call to the RUN_SYNC procedure returns a unique operation_id that can be used to query the USER_LOAD_OPERATIONS view to obtain information about the status of the sync and the corresponding log_table. The DBMS_DCAT$SYNC_LOG view can be queried for easy access to the log_table for the last sync operation executed by the current user. For further details, see DBMS_DCAT$SYNC_LOG View, and Monitoring and Troubleshooting Loads.

Note

On April 4, 2022, the sync_option and grant_read parameters were added to the RUN_SYNC procedure. To ensure correct performance of scheduled sync jobs created prior to that date, you need to drop and recreate the scheduled sync jobs. See DBMS_DCAT.DROP_SYNC_JOB Procedure and DBMS_DCAT.CREATE_SYNC_JOB Procedure.

Synchronizing Partitioned Logical Entities or Glue Tables

The RUN_SYNC procedure creates a partitioned external table for each logical entity or Glue table when all three of the following apply:

  1. The OCI data catalog logical entity or Glue table has one or more partitioned attributes.
  2. For OCI data catalogs, the logical entity is derived from a prefix-based filename pattern. Partitioned logical entities derived from regex-based patterns are not supported.
  3. For OCI data catalogs, the logical entity is based on partitioned data that follows the hive-style or non-hive folder format. Logical entities based on partitioned data that follow the non-hive style format using object names are not supported.
    • Example 1. Logical entities based on harvested objects that follow the Hive style partitioning format with prefix-based filename patterns.

      Consider the following objects:
      Bucket: MYBUCKET
      cluster1/db1.db/sales/country=USA/year=2020/month=01/sales1.csv
      cluster1/db1.db/sales/country=USA/year=2020/month=01/sales2.csv
      cluster1/db1.db/sales/country=USA/year=2020/month=02/sales1.csv

      Harvesting the bucket using a filename pattern with a starting folder prefix of cluster1/db1.db generates a logical entity named SALES with three partition attributes: country, year, and month. The type for partitioned attributes is Partition while the type for non-partitioned attributes is Primitive.

    • Example 2. Logical entities based on harvested objects that follow the non-Hive style partitioning format with prefix-based filename patterns.
      Consider the following objects:
      Bucket: MYBUCKET
      cluster2/db2.db/sales/USA/2020/01/sales1.csv
      cluster2/db2.db/sales/USA/2020/01/sales2.csv
      cluster2/db2.db/sales/USA/2020/02/sales1.csv

      Harvesting the bucket using a filename pattern with a starting folder prefix of cluster2/db2.db generates a logical entity named SALES with three partition attributes: name0, name1, and name2. The only difference between the generated logical entity compared to Example 1, is that the names of partitioned attributes are auto generated, while in Example 1 they are extracted from the URL (country, year, and month respectively).

For a complete end to end example of synchronizing partitioned logical entities, see Example: A Partitioned Data Scenario.

Incremental Synchronization of Partitioned Logical Entities/Glue Tables

Every call to the RUN_SYNC procedure specifies a set of OCI data catalog logical entities or AWS Glue tables to be synced with the database. When a logical entity or Glue table is present in two RUN_SYNC calls, the second call preserves and possibly alters existing external tables. The following table shows which logical entity or Glue table changes are supported when the logical entity or Glue table is partitioned:

Logical Entity or Glue Table Change Action

Addition, removal, or update of a partition

All partitions of the external partitioned table are updated, regardless of whether a change has been detected by the data catalog.

Addition of a partitioned attribute

Adding a partitioned column to an external partitioned table is not supported. An exception is raised.

Deletion of a partition attribute

Dropping a partitioned column from an external partitioned table is not supported. An exception is raised.

Renaming of a partitioned attribute

Renaming a partitioned column in an external partitioned table is not supported. An exception is raised.

Syntax

PROCEDURE DBMS_DCAT.RUN_SYNC (
    synced_objects   IN  CLOB,
    sync_option      IN VARCHAR2 DEFAULT 'SYNC',
    error_semantics  IN VARCHAR2 DEFAULT 'SKIP_ERRORS',
    log_level        IN VARCHAR2 DEFAULT 'INFO',
    grant_read       IN VARCHAR2 DEFAULT NULL,
    dcat_con_id      IN VARCHAR2 DEFAULT NULL
 );
PROCEDURE DBMS_DCAT.RUN_SYNC (
    synced_objects   IN  CLOB,
    sync_option      IN VARCHAR2 DEFAULT 'SYNC',
    error_semantics  IN VARCHAR2 DEFAULT 'SKIP_ERRORS',
    log_level        IN VARCHAR2 DEFAULT 'INFO',
    grant_read       IN VARCHAR2 DEFAULT NULL,
    operation_id     OUT NOCOPY NUMBER,
    dcat_con_id      IN VARCHAR2 DEFAULT NULL
 );

Parameters

Parameter Description

synced_objects

This parameter is a JSON document that specifies the data catalog objects to synchronize.

For OCI Data Catalogs, the JSON document specifies a set of entities in multiple granularity: data assets, folders (Object Store buckets) or logical entities. It contains an asset_list that is either an array of asset objects or an array containing a single "*" string that stands for 'sync all (object store) data assets in the catalog'.

For AWS Glue Data Catalogs, the JSON document specifies a list of tables in multiple granularity: databases, tables. The document specifies a list of databases. Users can restrict the set of tables to be synced by specifying individual tables within a database.

sync_option (Optional) There are three options:
  • SYNC (Default) - This option ensures that what is in the data catalog, over the synced_objects scope, is represented in the Autonomous Database. If a logical entity or Glue table was deleted from the data catalog, since the last sync operation, then it is deleted in the Autonomous Database. The following operations are performed over the synced_objects scope:
    • Adds tables for new data catalog entities
    • Removes tables for deleted data catalog entities
    • Updates properties (such as name, columns and data types) for existing tables
  • DELETE - Deletes tables within the synced_objects scope.
  • REPLACE - Replaces all currently synced objects with the objects within the synced_objects scope.

error_semantics

(Optional) This parameter specifies the error behavior. If set to SKIP_ERRORS, the sync attempts to continue despite errors encountered for individual entities. If set to STOP_ON_ERROR, the procedure fails on the first encountered error. The default is SKIP_ERRORS.

log_level

(Optional) This parameter specifies the following values in increasing level of logging detail: (OFF, FATAL, ERROR, WARN, INFO, DEBUG, TRACE, ALL). The default is INFO.
grant_read (Optional) This parameter is a list of users/roles that are automatically granted READ privileges on all external tables processed by this invocation of RUN_SYNC. All users/roles in the grant_read list are given READ privileges on all new or already existing external tables that correspond to entities specified by the synced_objects parameter. The RUN_SYNC procedure preserves already granted privileges on synced external tables.

operation_id

(Optional) This parameter is used to find the corresponding entry in USER_LOAD_OPERATIONS for the sync and determine the name of the log table.

Note: A version of RUN_SYNC that does not return an operation_id is available so users can query USER_LOAD_OPERATIONS for the latest sync.

dcat_con_id This parameter is the unique data catalog connection identifier that was specified when the connection to the data catalog was created. See DBMS_DCAT SET_DATA_CATALOG_CONN Procedure. This parameter identifies which connection is used for synchronization and becomes a part of the derived schema name. See Synchronization Mapping for a description of how the schema name is derived. The parameter default is NULL.

Example: Synchronize All OCI Data Catalog Entities

In the following example, all Data Catalog entities are synchronized.

EXEC DBMS_DCAT.RUN_SYNC(synced_objects=>'{"asset_list":["*"]}');

Example: synced_objects Parameter for Synchronizing All OCI Data Catalog Data Assets

The following is an example synced_objects parameter for synchronizing all (Object Storage) data assets in the Data Catalog.

{"asset_list" : ["*"]}

Example: synced_objects Parameter for Synchronizing Specific OCI Data Catalog Data Assets

The following is an example synced_objects parameter for synchronizing two data assets.

{"asset_list": [
        {
            "asset_id":"0b320de9-8411-4448-91fb-9e2e7f78fd5f"
        },
        {
            "asset_id":"0b320de9-8411-4448-91fb-9e2e7f74523"
        }
    ]}

Example: synced_objects Parameter for Synchronizing Specific OCI Data Catalog Entities within a Data Asset

The following shows an example synced_objects parameter for synchronizing two entities within the data asset.

{"asset_list": [
        {
            "asset_id":"0b320de9-8411-4448-91fb-9e2e7f78fd5f",
            "folder_list":[
                "f1",         
                "f2" 
           ]
        }   
    ]}

Example: synced_objects Parameter for Synchronizing Specific OCI Data Catalog Folders and Entities within a Data Asset

The following shows an example synced_objects parameter for synchronizing two folders and two entities within the data asset.

{"asset_list":[
        {
            "asset_id":"0b320de9-8411-4448-91fb-9e2e7f78fd5f",
            "entity_list": [
                "entity1",     
                "entity2"
            ],
             "folder_list": [
                "f1",         
                "f2"
            ]
        }   
    ]}

Example: synced_objects Parameter for Synchronizing All AWS Glue Data Catalog Databases

The following shows an example synced_objects parameter for synchronizing all databases in the AWS Glue Data Catalog.

{"database_list":["*"]}

Example: synced_objects Parameter for Synchronizing Two AWS Glue Data Catalog Databases

The following shows an example synced_objects parameter for synchronizing two AWS Glue Data Catalog databases.

{"database_list":[
    {"database":"tpcdscsv"},
    {"database":"tpcdsparquet"} ]}

Example: synced_objects Parameter for Synchronizing Three AWS Glue Data Catalog Databases

The following shows an example synced_objects parameter for synchronizing three tables from an AWS Glue Data Catalog database.

{"database_list":[
 {"database":"tpcdsparquet",
     "table_list": [ "tpcdsparquet_customer",
                     "tpcdsparquet_item",
                     "tpcdsparquet_web_sales" ] } ]}

CREATE_SYNC_JOB Procedure

This procedure creates a scheduler job to invoke RUN_SYNC periodically.

It takes as input the set of objects to be synced, the error semantics, the log level, and a repeat interval. See DBMS_DCAT RUN_SYNC Procedure for further details on how synchronization works.

There can only be a single sync job. The CREATE_SYNC_JOB procedure fails if another job is already specified, unless the force parameter is set to TRUE. If force is set to TRUE the previous job is dropped.

If a scheduler job attempts to run while another sync is in progress, the scheduler job fails.

Note

On April 4, 2022, the sync_option and grant_read parameters were added to the RUN_SYNC procedure. To ensure correct performance of scheduled sync jobs created prior to that date, you need to drop and recreate the scheduled sync jobs. See DBMS_DCAT.DROP_SYNC_JOB Procedure and DBMS_DCAT.CREATE_SYNC_JOB Procedure.

Syntax

PROCEDURE DBMS_DCAT.CREATE_SYNC_JOB (
    synced_objects   IN CLOB,
    error_semantics  IN VARCHAR2 DEFAULT 'SKIP_ERRORS',
    log_level        IN VARCHAR2 DEFAULT 'INFO',
    repeat_interval  IN VARCHAR2,
    force            IN VARCHAR2 DEFAULT 'FALSE',
    grant_read       IN VARCHAR2 DEFAULT NULL,
    sync_option      IN VARCHAR2 DEFAULT 'SYNC',
    dcat_con_id      IN VARCHAR2 DEFAULT NULL
);

Parameters

Parameter Description

synced_objects

A JSON object specifying the objects to be synced, as described in the RUN_SYNC procedure.

error_semantics

(Optional) Error behavior, as specified for RUN_SYNC. Default is SKIP_ERRORS.

log_level

(Optional) Logging level, as specified for RUN_SYNC. Default is INFO.

repeat_interval

Repeat interval for the job, with the same semantics as the repeat interval parameter of the DBMS_SCHEDULER.CREATE_JOB procedure. For details on the repeat_interval, see Overview of Creating Jobs.

force

(Optional) If TRUE, existing sync jobs are deleted first. If FALSE, the CREATE_SYNC_JOB procedure fails if a sync job already exists. Default is FALSE.

grant_read (Optional) List of users/roles to be granted READ on the synced external tables, as described for procedure RUN_SYNC. See DBMS_DCAT.RUN_SYNC Procedure.
sync_option (Optional) Behavior with respect to entities that have already been synced through a previous RUN_SYNC operation, as described for procedure RUN_SYNC. See DBMS_DCAT.RUN_SYNC Procedure.
dcat_con_id This parameter is the unique Data Catalog connection identifier that was specified when the connection to Data Catalog was created. See DBMS_DCAT SET_DATA_CATALOG_CONN Procedure. This parameter identifies which connection is used for synchronization and becomes a part of the derived schema name. See Synchronization Mapping for a description of how the schema name is derived. The parameter default is NULL.

DROP_SYNC_JOB Procedure

This procedure drops an existing sync job for the given unique connection identifier.

Syntax

PROCEDURE DBMS_DCAT.DROP_SYNC_JOB (
    dcat_con_id IN VARCHAR2 DEFAULT NULL
  );

Parameters

Parameter Description
dcat_con_id The unique Data Catalog connection identifier. The default is NULL.

DROP_SYNCED_SCHEMAS Procedure

This procedure drops all previously synchronized schemas for the given unique connection identifier.

Syntax

PROCEDURE DBMS_DCAT.DROP_SYNCED_SCHEMAS (
    dcat_con_id IN VARCHAR2 DEFAULT NULL
  );

Parameters

Parameter Description
dcat_con_id The unique Data Catalog connection identifier. The default is NULL.

Summary of Data Catalog Views

Data Catalog integration with Autonomous Database provides numerous tables and views.

These tables and views help you understand:

  • Available Data Catalog assets. Get information about any type of Data Catalog asset - including databases, object stores, and more.
  • Information about the Data Catalog Object Storage assets and entities that have been synchronized with Autonomous Database. This includes details about how Data Catalog items (assets, folders and entities) map to Autonomous Database objects (i.e. schemas and external tables).
  • Metadata sync executions. Review details about sync jobs, including any issues that may have occurred during synchronization.

This table lists the tables and views provided by the DBMS_DCAT package.

View Description
ALL_CLOUD_CATALOG_DATABASES View Display information about OCI Data Catalog data assets and AWS Glue Data Catalog databases
ALL_CLOUD_CATALOG_TABLES View Used to display information about data entities for OCI Data Catalogs and tables for AWS Glue Data Catalogs
ALL_DCAT_ASSETS View List data catalog assets that this database is authorized to access
ALL_DCAT_ATTRIBUTES View List data catalog attributes this database is authorized to access
ALL_DCAT_CONNECTIONS View A view that contains information about the data catalog(s) connected to this instance
ALL_DCAT_ENTITIES View

Lists logical entities this database is authorized to access

ALL_DCAT_FOLDERS View List metadata for the Object Storage buckets containing the data files for the Logical Entities
ALL_DCAT_GLOBAL_ACCESSIBLE_CATALOGS View List all accessible catalogs across all regions, along with the level of access privileges for each catalog
ALL_DCAT_LOCAL_ACCESSIBLE_CATALOGS View List all accessible catalogs in the current region, along with the level of access privileges for each catalog
ALL_GLUE_DATABASES View Lists the AWS Glue Data Catalog databases that the data catalog credential is authorized to access
ALL_GLUE_TABLES View Shows all AWS Glue Data Catalog tables that the data catalog credential is authorized to access
DCAT_ATTRIBUTES View List the mapping of logical entity attributes to external table columns
DCAT_ENTITIES View Describes the mapping of logical entities to external tables
DBMS_DCAT$SYNC_LOG View Provides easy access to the log table for the last sync operation executed by the current user

ALL_CLOUD_CATALOG_DATABASES View

Use view ALL_CLOUD_CATALOG_DATABASES to display information about OCI Data Catalog data assets and AWS Glue Data Catalog databases.

Column Description
DCAT_CON_ID CON1
CATALOG_ID Data catalog unique identifier.

OCI Data Catalog example:

ocid1.datacatalog.oc1.ap-mumbai-1.….y35a

AWS Glue Data Catalog example:

NULL

579294766787

NAME Name of the data asset (OCI)/ database (AWS Glue).

OCI Data Catalog example:

OBJECT_STORE_AT_ASHBURN

AWS Glue Data Catalog example:

OBJECT_STORE_AT_N_CALIFORNIA

DESCRIPTION Description of the data asset (OCI)/ database (AWS Glue).

OCI Data Catalog example:

Data stored in S3 (N. California)

AWS Glue Data Catalog example:

Data stored in S3 (N. California)
TIME_CREATED The date and time the data asset (OCI) / databases (AWS Glue) were created in the data catalog.

OCI Data Catalog example:

26-SEP-22 10.56.01.395000 PM +00:00

AWS Glue Data Catalog example:

2022-06-15T09:45:35+01:00

DETAILS JSON document with metadata about each data entity (OCI) / database (AWS Glue).

OCI Data Catalog example:

{
  "catalog-id": "ocid1.datacatalog.oc1.ap-mumbai-1.amaaa...",
  "description": null,
  "display-name": "OBJECT_STORE_AT_ASHBURN",
  "external-key": "https://swiftobjectstorage.us-ashburn-1....",
  "key": "bc95181c-3ac3-4959-9e5f-4e460d3fb82a",
  "lifecycle-state": "ACTIVE",
  "time-created": "2022-09-26T22:56:01.395000+00:00",
  "type-key": "3ea65bc5-f60d-477a-a591-f063665339f9",
  "uri": "/dcat/20190325/dataAssets/bc95181c-3ac3-4959-9e5f-4e460d3fb82a"
}

AWS Glue Data Catalog example:

{
    "Name": "dbmsdcatpoc",
    "Parameters": {
        "somekey": "somevalue"
    },
    "CreateTime": "2022-06-15T09:45:35+01:00",
    "CreateTableDefaultPermissions": [
        {
            "Principal": {
                "DataLakePrincipalIdentifier": "IAM_ALLOWED_PRINCIPALS"
            },
            "Permissions": [
                "ALL"
            ]
        }
    ],
    "CatalogId": "579294766787"
}

ALL_CLOUD_CATALOG_TABLES View

View ALL_CLOUD_CATALOG_TABLES is used to display information about data entities for OCI Data Catalogs and tables for AWS Glue Data Catalogs.

Column Description
DCAT_CON_ID Unique identifier of the data catalog. The connection id.

OCI Data Catalog example: CON1

AWS Glue Data Catalog example: CON1

CATALOG_ID Data catalog unique identifier.

OCI Data Catalog example: ocid1.datacatalog.oc1.ap-mumbai-1.….y35a

AWS Glue Data Catalog example: NULL

579294766787

DATABASE_NAME Name of the data asset (OCI)/ database (AWS Glue).

OCI Data Catalog example: OBJECT_STORE_AT_ASHBURN

AWS Glue Data Catalog example: OBJECT_STORE_AT_N_CALIFORNIA

NAME Name of the data entity (OCI) / table (AWS Glue).

OCI Data Catalog example: BIKES_TRIPS

AWS Glue Data Catalog example: BIKES_TRIPS

DESCRIPTION Description of the data entity (OCI) / table (AWS Glue).

OCI Data Catalog example: Table storing bike trips

AWS Glue Data Catalog example: Table storing bike trips

TIME_CREATED The date and time the data entity (OCI) / table (AWS Glue) was created in the data catalog.

OCI Data Catalog example: 26-SEP-22 10.56.01.395000 PM +00:00

AWS Glue Data Catalog example: 2022-06-15T09:45:35+01:00

TIME_UPDATED Last time a change was made to the data entity (OCI) / table (AWS Glue).

OCI Data Catalog example: 26-SEP-22 10.56.01.395000 PM +00:00

AWS Glue Data Catalog example: 2022-06-15T09:45:35+01:00

DETAILS JSON document with metadata about each each data entity (OCI) / table (AWS Glue)

OCI Data Catalog example:

{  
  "business-name": null,
  "data-asset-key": "bc95181c-3ac3-4959-9e5f-...",
  "description": null,
  "display-name": "bikes_trips",
  "external-key": "LE: https://swiftobjectstorage.us-ashburn-1.oraclecloud.com/v1/..._trips",
  "folder-key": "9c4b542d-d6eb-4b83-bf59-...",
  "folder-name": "hive",
  "is-logical": true,
  "is-partition": false,
  "key": "fde30a69-a07c-478a-ab62-...",
  "lifecycle-state": "ACTIVE",
  "object-storage-url": "https://objectstorage.us-ashburn-1.oraclecloud.com/n/...",
  "path": "OBJECT_STORE_AT_ASHBURN/hive/hive",
  "pattern-key": "db21b3f1-1508-4045-aa80-...",
  "properties": {
    "default": {
      "CONTENT-LENGTH": "4310321",
      "LAST-MODIFIED": "Fri, 9 Oct 2020 20:16:52 UTC",
      "archivedPECount": "0",
      "dataEntityExpression": "{logicalEntity:[^/]+}.db/{logicalEntity:[^/]+}/.*",
      "harvestedFile": "bikes.db/trips/p_start_month=2019-09/000000_0",
      "patternName": "bikes_trips"
    },
    "harvestProps": {
      "characterset": "UTF8",
      "compression": "none",
      "type": "PARQUET"
    }
  },
  "realized-expression": "bikes.db/trips/.*",
  "time-created": "2022-09-26T22:56:35.063000+00:00",
  "time-updated": "2022-09-26T22:56:35.063000+00:00",
  "type-key": "6753c3af-7f88-44b9-be52-1d57bef462fb",
  "updated-by-id": "ocid1.user.oc1..r5l3tov7a",
  "uri": "/dcat/20190325/dataAssets/bc95181c-3ac3-4959-9e5f-..."
}

AWS Glue Data Catalog example:

{
    "Name": "bikes_trips",
    "DatabaseName": "dbmsdcatpoc",
    "Owner": "owner",
    "CreateTime": "2022-06-23T13:24:20+01:00",
    "UpdateTime": "2022-06-23T13:24:20+01:00",
    "LastAccessTime": "2022-06-23T13:24:20+01:00",
    "Retention": 0,
    "StorageDescriptor": {
        "Columns": [
            {
                "Name": "trip_duration",
                "Type": "int"
            },
            {
                "Name": "start_month",
                "Type": "string"
            }, ...
        ],
        "Location": "s3://dbmsdcatpoc/hive/bikes.db/trips/",
        "InputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
        "OutputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat",
        "Compressed": false,
        "NumberOfBuckets": -1,
        "SerdeInfo":
 {            "SerializationLibrary": "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe",
            "Parameters": {
                "serialization.format": "1"
            }
        },
        "BucketColumns": [],
        "SortColumns": [],
        "Parameters": {
            "CrawlerSchemaDeserializerVersion": "1.0",
            "CrawlerSchemaSerializerVersion": "1.0",
            "UPDATED_BY_CRAWLER": "crawler-bikes",
            "averageRecordSize": "86",
            "classification": "parquet",
            "compressionType": "none",
            "objectCount": "12",
            "recordCount": "404947",
            "sizeKey": "35312159",
            "typeOfData": "file"
        },
        "StoredAsSubDirectories": false
    },
    "PartitionKeys": [
        {
            "Name": "p_start_month",
            "Type": "string"
        }
    ],
    "TableType": "EXTERNAL_TABLE",
    "Parameters": {
        "CrawlerSchemaDeserializerVersion": "1.0",
        "CrawlerSchemaSerializerVersion": "1.0",
        "UPDATED_BY_CRAWLER": "crawler-bikes",
        "averageRecordSize": "86",
        "classification": "parquet",
        "compressionType": "none",
        "objectCount": "12",
        "recordCount": "404947",
        "sizeKey": "35312159",
        "typeOfData": "file"
    },
    "CreatedBy": "arn:aws:sts::579294766787:assumed-role/AWSGlueServiceRole-dbmsdcat/AWS-Crawler",
    "IsRegisteredWithLakeFormation": false,
    "CatalogId": "579294766787",
    "VersionId": "0"
}

Example

ALL_DCAT_ASSETS View

The Data Catalog assets that this database is authorized to access.

Column Datatype Description
DCAT_CON_ID VARCHAR2 (4000) Connection identifier that is unique within the instance

KEY

VARCHAR2(4000)

Asset key

DISPLAY_NAME VARCHAR2(4000) Asset display name
DESCRIPTION VARCHAR2(4000) Asset description
CATALOG_ID VARCHAR2(4000) OCID for the Data Catalog containing the asset
EXTERNAL_KEY VARCHAR2(4000) Base Object Storage URI for the asset
URI VARCHAR2(4000) Asset URI for the Data Catalog API
TIME_CREATED TIMESTAMP(6) WITH TIMEZONE The date and time the data asset was created
TYPE_KEY VARCHAR2(4000) The key of the data asset type (currently, only Object Storage data assets are supported). Type keys can be found via the '/types' Data Catalog endpoint.
LIFECYCLE_STATE VARCHAR2(4000) The current state of the data asset. For more information on possible life cycle states, see the Data Catalog DataAsset Reference for a list of possible states for lifecycleState.

ALL_DCAT_ATTRIBUTES View

The Data Catalog attributes this database is authorized to access.

Column Datatype Description
DCAT_CON_ID VARCHAR2 (4000) Connection identifier that is unique within the instance
KEY NUMBER Attribute key

DISPLAY_NAME

VARCHAR2(4000)

Attribute display name

BUSINESS_NAME VARCHAR2(4000) Attribute business name
DESCRIPTION VARCHAR2(4000) Attribute description
DATA_ASSET_KEY VARCHAR2(4000) Data asset key
FOLDER_KEY VARCHAR2(4000) Folder key
ENTITY_KEY VARCHAR2(4000) Entity key
EXTERNAL_KEY VARCHAR2(4000) Unique external key for the attribute
LENGTH NUMBER Maximum allowed length of the attribute value
PRECISION NUMBER Precision of the attribute value (usually applies to float data type)
SCALE NUMBER Scale of the attribute value (usually applies to float data type)
IS_NULLABLE NUMBER Identifies if this attribute can be assigned null values
URI VARCHAR2(4000) URI to the attribute instance in the Data Catalog API
LIFECYCLE_STATE VARCHAR2(4000) The current state of the attribute. For more information on possible life cycle states, see the Data Catalog Attribute Reference for a list of possible states for lifecycleState.
TIME_CREATED TIMESTAMP(6) WITH TIME ZONE The date and time the attribute was created
EXTERNAL_DATA_TYPE VARCHAR2(4000) Data type of the attribute as defined in the external system
MIN_COLLECTION_COUNT NUMBER Minimum number of elements, if the type of the attribute is a collection type
MAX_COLLECTION_COUNT NUMBER Maximum number of elements, if the type of the attribute is a collection type
DATATYPE_ENTITY_KEY VARCHAR2(4000) Entity key that represents the datatype of this attribute, applicable if this attribute is a complex type
EXTERNAL_DATATYPE_ENTITY_KEY VARCHAR2(4000) External entity key that represents the datatype of this attribute, applicable if this attribute is a complex type
PARENT_ATTRIBUTE_KEY VARCHAR2(4000) Attribute key that represents the parent attribute of this attribute, applicable if the parent attribute is of complex datatype
EXTERNAL_PARENT_ATTRIBUTE_KEY VARCHAR2(4000) External attribute key that represents the parent attribute of this attribute, applicable if the parent attribute is of complex type
PATH VARCHAR2(4000) Full path of the attribute

ALL_DCAT_CONNECTIONS View

A view that contains information about the data catalog(s) connected to this instance.

Column Datatype Description
DCAT_CON_ID VARCHAR2(4000) Connection identifier that is unique within the instance
COMPARTMENT_ID VARCHAR2(4000) OCID for the compartment where the Data Catalog instance resides
INSTANCE_ID VARCHAR2(4000)

OCID for the Data Catalog instance

REGION

VARCHAR2(4000)

Region for the Data Catalog instance

ENDPOINT VARCHAR2(4000) Endpoint for the Data Catalog instance
CREATED TIMESTAMP When the Data Catalog instance was created
NAME VARCHAR2(4000) Name of the Data Catalog instance
LAST_UPDATED TIMESTAMP Timestamp of the last update of the connection to the Data Catalog instance
LATEST_OPERATION_ID NUMBER The id of the last synchronization operation
DATA_CATALOG_CREDENTIAL VARCHAR2(128) Credential used for accessing the Data Catalog
OBJECT_STORE_CREDENTIAL VARCHAR2(128) Credential used by the external table driver for accessing the Object Store

ALL_DCAT_ENTITIES View

The Data Catalog logical entities this database is authorized to access.

Column Datatype Description
DCAT_CON_ID VARCHAR2(4000) Connection identifier that is unique within the instance
CATALOG_ID VARCHAR2(4000) OCID for the Data Catalog containing the asset

KEY

VARCHAR2(4000)

Entity key

DISPLAY_NAME VARCHAR2(4000)

Entity display name

BUSINESS_NAME VARCHAR2(4000)

Entity business name

DESCRIPTION VARCHAR2(4000)

Logical entity description

DATA_ASSET_KEY

VARCHAR2(4000)

Asset key

FOLDER_KEY

VARCHAR2(4000)

Folder unique key

FOLDER_NAME VARCHAR2(4000) Folder name (bucket)
EXTERNAL_KEY VARCHAR2(4000) External key for the logical entity
PATTERN_KEY VARCHAR2(4000) Key of the associated pattern for the logical entity
REALIZED_EXPRESSION VARCHAR2(4000) The regular expression used to obtain the files for this logical entity
PATH VARCHAR2(4000) Full path for the logical entity
TIME_CREATED TIMESTAMP(6) WITH TIME ZONE Date and time the entity was created
TIME_UPDATED TIMESTAMP(6) WITH TIME ZONE Last time a change was made to the data entity
UPDATED_BY_ID VARCHAR2(4000) OCID of the user who updated this object in the Data Catalog
URI VARCHAR2(4000) URI of the entity instance in the API
LIFECYCLE_STATE VARCHAR2(4000) The current state of the entity. For more information on possible life cycle states, see the Data Catalog Entity Reference for a list of possible states for lifecycleState.

ALL_DCAT_FOLDERS View

Metadata for the Object Storage buckets containing the data files for the Logical Entities.

Column Datatype Description
DCAT_CON_ID VARCHAR2(4000) Connection identifier that is unique within the instance
CATALOG_ID VARCHAR2(4000) OCID for the Data Catalog containing the asset

KEY

VARCHAR2(4000)

Folder key

DISPLAY_NAME

VARCHAR2(4000)

Folder display name

BUSINESS_NAME VARCHAR2(4000) Folder business name
DESCRIPTION VARCHAR2(4000) Folder description
DATA_ASSET_KEY VARCHAR2(4000) Key for the data asset containing the folder
PARENT_FOLDER_KEY VARCHAR2(4000) Key for the parent folder (currently, this is the data asset key)
PATH VARCHAR2(4000) Full path for the folder
EXTERNAL_KEY VARCHAR2(4000) Object Storage URI for the bucket
TIME_EXTERNAL TIMESTAMP(6) WITH TIMEZONE The last modified timestamp of this folder
TIME_CREATED TIMESTAMP(6) WITH TIMEZONE The date/time the folder was created
URI VARCHAR2(4000) URI to the folder instance in the Data Catalog API.
LIFECYCLE_STATE VARCHAR2(4000) The current state of the folder. For more information on possible life cycle states, see the Data Catalog Folder Reference for a list of possible states for lifecycleState.

ALL_DCAT_GLOBAL_ACCESSIBLE_CATALOGS View

This view lists all accessible catalogs across all regions, along with the level of access privileges for each catalog.

Column Datatype Description

CATALOG_ID

VARCHAR2(4000)

Catalog OCID

CATALOG_NAME VARCHAR2(4000) Name of the catalog
CATALOG_REGION VARCHAR2(4000) Name of the catalog region
CATALOG_SCORE NUMBER The catalog score is a numeric value calculated from the privileges configured for the Data Catalog access credential. A higher catalog score means greater privileges, which may equate to a higher likelihood that this catalog is intended for use with this Autonomous Database instance.

ALL_DCAT_LOCAL_ACCESSIBLE_CATALOGS View

This view lists all accessible catalogs in the current region, along with the level of access privileges for each catalog.

Column Datatype Description

CATALOG_ID

VARCHAR2(4000)

Catalog OCID

CATALOG_NAME VARCHAR2(4000) Name of the catalog
CATALOG_SCORE NUMBER The catalog score is a numeric value calculated from the privileges configured for the Data Catalog access credential. A higher catalog score means greater privileges, which may equate to a higher likelihood that this catalog is intended for use with this Autonomous Database instance.

ALL_GLUE_DATABASES View

The AWS Glue Data Catalog databases that the data catalog credential is authorized to access.

Column Data Type Description
DCAT_CON_ID VARCHAR2(4000) Unique identifier of data catalog connection id.
CATALOG_ID VARCHAR2(255) Data Catalog unique identifier.
NAME VARCHAR2(255) Name of the database.
DESCRIPTION VARCHAR2(2048) Description of the database.
LOCATION_URI VARCHAR2(1024) The location of the database.
CREATE_TIME TIMESTAMP The time that the database was created in the data catalog.
PARAMETERS CLOB JSON document with key-value pairs that define parameters and properties of the database.
TARGET_DATABASE VARCHAR2(4000) JSON document that describes a target database for resource linking in AWS.

ALL_GLUE_TABLES View

This view shows all AWS Glue Data Catalog tables that the data catalog credential is authorized to access.

Column Data Type Description
DCAT_CON_ID VARCHAR2(4000) Unique identifier of data catalog connection id.
CATALOG_ID VARCHAR2(255) Catalog identifier
DATABASE_NAME VARCHAR2(255) Database name
NAME VARCHAR2(255) Table name
TABLE_TYPE VARCHAR2(255) Table type
CLASSIFICATION VARCHAR2(255)  
DESCRIPTION VARCHAR2(2048) Table description
OWNER VARCHAR2(255) Table owner
CREATED_BY VARCHAR2(255) Table creator
CREATE_TIME TIMESTAMP The time the table was created in the data catalog.
LAST_ANALYZED_TIME TIMESTAMP The last time column statistics were computed for this table.
LAST_ACCESS_TIME TIMESTAMP The last time the table was accessed.
UPDATE_TIME TIMESTAMP The last time the table was updated.
IS_REGISTERED_WITH_LAKE_FORMATION NUMBER Indicates whether the table is registered with AWS lake formation.
PARAMETERS CLOB JSON document with key-value pairs that define properties of the table.
PARTITION_KEYS CLOB JSON document with a list of columns by which the table is partitioned.
RETENTION NUMBER The retention time for this table.
STORAGE_DESCRIPTION CLOB JSON document with information about the physical storage of a table.
TARGET_TABLE VARCHAR2(4000) JSON document describing a target table used for resource linking in AWS.
VERSION_ID VARCHAR2(255) The version identifier for the table.
VIEW_EXPANDED_TEXT CLOB Introduced by AWS Glue for compatibility with Hive. Not used by AWS Glue.
VIEW_ORIGINAL_TEXT CLOB Introduced by AWS Glue for compatibility with Hive. Not used by AWS Glue.

DCAT_ATTRIBUTES View

Lists the mapping of logical entity attributes to external table columns.

Column Datatype Description
DCAT_CON_ID VARCHAR2 (4000) Connection identifier that is unique within the instance
ASSET_KEY VARCHAR2(4000) Data Catalog asset key

ENTITY_KEY

VARCHAR2(4000)

Data Catalog entity key

ATTRIBUTE_KEY

VARCHAR2(4000)

Data Catalog attribute key

ORACLE_COLUMN_NAME VARCHAR2(128) Mapped column name

DCAT_ENTITIES View

Describes the mapping of logical entities to external tables.

Column Datatype Description
DCAT_CON_ID VARCHAR2(4000) Connection identifier that is unique within the instance
ASSET_KEY VARCHAR2(4000) Data Catalog asset key

ENTITY_KEY

VARCHAR2(4000)

Data Catalog entity key

FOLDER_KEY

VARCHAR2(4000)

Data Catalog folder key

ORACLE_TABLE_NAME VARCHAR2(128) Mapped table name
ORACLE_SCHEMA_NAME VARCHAR2(128) Mapped schema name
ENTITY_ORACLE_DB_SCHEMA VARCHAR2(4000) The entity's oracle-db-schema custom property used to derive the schema
ASSET_ORACLE_DB_SCHEMA VARCHAR2(4000) The data asset's oracle-db-schema custom property used to derive the schema
FOLDER_ORACLE_DB_SCHEMA VARCHAR2(4000) The folder's oracle-db-schema custom property used to derive the schema

DBMS_DCAT$SYNC_LOG View

The DBMS_DCAT$SYNC_LOG view provides easy access to the log table for the last sync operation executed by the current user.

Every call to the RUN_SYNC procedure is logged to a new log table, pointed to by the LOGFILE_TABLE field of USER_LOAD_OPERATIONS. The log tables are automatically dropped after 2 days, and users can clear all sync logs using the DELETE_ALL_OPERATIONS Procedure where type is DCAT_SYNC.

The DBMS_DCAT$SYNC_LOG view automatically identifies the latest log table. The schema for the DBMS_DCAT$SYNC_LOG view is described below and the access permissions are identical to those of the individual log tables. By default READ is granted to the dbms_dcat role and to the ADMIN user.

The log tables have the following format:

Column Datatype Description

LOG_TIMESTAMP

TIMESTAMP

Timestamp for the log entry.

LOG_LEVEL

VARCHAR2(32)

The entry log level can have one of the following values: OFF, FATAL, ERROR, WARN, INFO, DEBUG, TRACE, ALL.

LOG_DETAILS VARCHAR2(32767) The log message.