Query External Tables with Implicit Partitioning
On Autonomous AI Database you can create implicit partitioned external tables from Hive style partitioned data or from simple folder partitioned data stored on your Cloud Object Store.
- About External Tables with Implicit Partitioning
On Autonomous AI Database use theDBMS_CLOUD.CREATE_EXTERNAL_TABLE
procedure to create implicit partitioned external tables from Hive style partitioned data or from simple folder partitioned data stored on your Cloud Object Store. - Query External Implicit Partitioned Data with Hive Format Source File Organization
Use theDBMS_CLOUD.CREATE_EXTERNAL_TABLE
procedure to create an implicit partitioned external table from data in Object Store that was generated from Hive data. - Query External Implicit Partitioned Non-Hive Style Data
Use theDBMS_CLOUD.CREATE_EXTERNAL_TABLE
procedure to create an implicit partitioned external table from data in object storage that has been generated from non-Hive data.
Parent topic: Query External Data with Autonomous AI Database
About External Tables with Implicit Partitioning
On Autonomous AI Database use the
DBMS_CLOUD.CREATE_EXTERNAL_TABLE
procedure to create implicit partitioned
external tables from Hive style partitioned data or from simple folder partitioned data stored
on your Cloud Object Store.
By passing the appropriate options to this procedure, the partitions are derived
from the source data. The partitioned external tables support runtime discovery of partition
columns and their values. Runtime discovery of changes in the underlying object store
structure, such as adding or removing objects, simplifies the maintenance process by
eliminating the need for additional synchronization procedures required by
DBMS_CLOUD.CREATE_EXTERNAL_PART_TABLE
. This ensures that the data is up to
date at query runtime.
With implicit partitioning, Autonomous AI Database
automatically determines the columns a table is "partitioned on" based on the object storage
source hierarchical file structure. There is no need to explicitly declare a partitioning
scheme. Implicit partitioning provides partition-table-like performance benefits without the
need to explicitly define a partitioned external table using the
DBMS_CLOUD.CREATE_EXTERNAL_PART_TABLE
procedure.
In order to optimize query response time for queries of very large data sets
using implicit partitioning, you can enable optimization by setting
strict_column_order
to true. The strict_column_order
parameter is a sub-option of the implicit_partition_config
format option.
This optimization only applies when partition_type
is hive. See Query External Implicit Partitioned Data with Hive Format Source File Organization for an example and DBMS_CLOUD Package Format Options for more information on the implicit partitioning format parameters.
Implicit partitioned external tables support the following naming styles for partitioned objects in object store:
- Hive Naming Format: Data in object storage that has been generated from Hive
has the following format.
OBJBUCKET/<table>/<partcol1>=<value1>/<partcol2>=<value2>/file.parquet
For an example, see Query External Implicit Partitioned Data with Hive Format Source File Organization
- Non-Hive "Pathtail" Naming Format: A second format that is often found in
data lakes is similar to the hive format, but the folder name does not include the partition
columns.
OBJBUCKET/<table>/<value1>/<value2>/file.parquet
For an example, see Query External Implicit Partitioned Non-Hive Style Data.
For more information on naming formats, see About External Tables with Source File Partitioning.
DBMS_CLOUD.CREATE_EXTERNAL_TABLE
procedure, implicit partitioned external tables can be created in the following ways:
- Optimize implicit partitioning. Use the
implicit_partition_config
option to enable implicit partitioning and optimization of elapsed query time.In this example, implicit partitioning is enabled by setting
partition_type
to hive. The only valid value forpartition_type
is hive. Optimization is enabled by settingsetting strict_column_order
to true. Thepartition_columns
option specifies which columns are partitioned.For example:
BEGIN dbms_cloud.create_external_table ( table_name => 'partitions1', file_uri_list => 'https://swiftobjectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname/*.parquet', column_list => 'c varchar2(100), y number, total_sales number', format => '{"type":"parquet", "implicit_partition_config":{ "partition_type":"hive", "strict_column_order":true, "partition_columns":["org","tenant","dbtype","year","month","day"] } }'); END; /
- Set the partition type to hive with a given list of partition columns
In this case,
implicit_partition_type
is set tohive
andimplicit_partition_columns
provides a list of partition columns.For example:
BEGIN DBMS_CLOUD.CREATE_EXTERNAL_TABLE ( table_name => 'mysales', credential_name => 'MY_CREDENTIAL', file_uri_list => 'https://swiftobjectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname/*.parquet', column_list => 'product varchar2(100), units number, country varchar2(100),year (number), month varchar2(2)', format => '{"type":"parquet", "implicit_partition_type":"hive", "implicit_partition_columns":["country","year","month"]}'); END; /
- Set the partition type to hive without providing a list of partition
columns
In this case,
implicit_partition_type
is set tohive
andimplicit_partition_columns
is not provided. The partition columns are automatically detected by searching for '=' in the path specified byfile_uri_list
. Thecolumn_name
is to the left of the '=' and the value is on the right side. If acolumn_name
is not found in the path, an error is thrown.For example:
BEGIN DBMS_CLOUD.CREATE_EXTERNAL_TABLE ( table_name => 'mysales', credential_name => 'MY_CREDENTIAL', file_uri_list => 'https://swiftobjectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname/*.parquet', column_list => 'product varchar2(100), units number, country varchar2(100),year (number), month varchar2(2)', format => '{"type":"parquet", "implicit_partition_type":"hive"'); END; /
- Provide a list of partition columns without specifying the type
In this case,
implicit_partition_type
is not set andimplicit_partition_columns
provides a list of columns.For example:
BEGIN DBMS_CLOUD.CREATE_EXTERNAL_TABLE ( table_name => 'mysales', credential_name => 'MY_CREDENTIAL', file_uri_list => 'https://swiftobjectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname/*.parquet', column_list => 'product varchar2(100), units number, country varchar2(100),year (number), month varchar2(2)', format => '{"type":"parquet", "implicit_partition_columns":["country","year","month"]}'); END; /
See Query External Implicit Partitioned Data with Hive Format Source File Organization and CREATE_EXTERNAL_TABLE Procedure for further information.
Parent topic: Query External Tables with Implicit Partitioning
Query External Implicit Partitioned Data with Hive Format Source File Organization
Use the DBMS_CLOUD.CREATE_EXTERNAL_TABLE
procedure to
create an implicit partitioned external table from data in Object Store that was
generated from Hive data.
The sample source files in the follow example, use this naming format:
OBJBUCKET/<table>/<partcol1>=<value1>/<partcol2>=<value2>/file.parquet
Consider the following sample source files:
OBJBUCKET/sales/country=USA/year=2024/month=01/sales-2024-01.parquet
OBJBUCKET/sales/country=USA/year=2024/month=02/sales-2024-02.parquet
OBJBUCKET/sales/country=USA/year=2024/month=02/sales-2024-03.parquet
To create an implicit partitioned external table with data stored in this sample Hive format, do the following:
- Optimize Object Store Query Planning with the strict_column_order option
When a query targets an object-store folder with a very large number of files and subfolders, the planning and listing phase can become the primary cost before any data is actually scanned. This is common with Hive-style folder layouts where partition values are embedded in the path.
Parent topic: Query External Tables with Implicit Partitioning
Optimize Object Store Query Planning with the strict_column_order option
When a query targets an object-store folder with a very large number of files and subfolders, the planning and listing phase can become the primary cost before any data is actually scanned. This is common with Hive-style folder layouts where partition values are embedded in the path.
-
The
strict_column_order
setting is disabled by default. Enable it only if the path layout is consistent. -
If folder conventions change (such as adding, removing, or reordering partition columns), you must update the
partition_columns
option and may need to disable this option.
Enable the strict_column_order
option by the following:
If your data uses Hive-style partitioned paths and the partition columns always appear in a fixed, consistent order with no missing segments, enable the optimization by the following setting:
strict_column_order = true (with partition_type = "hive")
The database can skip unnecessary directories by following the order of partitions that are already defined. This means it does not have to list every single object, which can significantly speed up the planning time for large datasets.
Notes for using the strict_column_order option:
-
Paths follow Hive naming and order, for example:
.../country=US/year=2025/month=09/...
-
The partition columns maintain a fixed set and sequence, with no reordering or skipped prefixes.
- You need to optimize plan-time listing for folders containing a very large number of objects.
- You cannot use this option in datasets where not all prefixes are present. For example, some folders include year=, while others begin with month=).
Examples: Use strict_column_order option for Hive-Style Partitions with DBMS_CLOUD.CREATE_EXTERNAL_TABLE
BEGIN
DBMS_CLOUD.CREATE_EXTERNAL_TABLE (
table_name => 'sales_xt',
credential_name => 'MY_CREDENTIAL',
file_uri_list => 'https://objectstorage.example.com/n/tenant/b/lake/o/sales/',
-- Data is stored as: .../country=US/year=2025/month=09/*.parquet
column_list => 'product VARCHAR2(100),
units NUMBER,
amount NUMBER,
country VARCHAR2(30),
year NUMBER,
month NUMBER',
format => '{
"type": "parquet",
"implicit_partition_config": {
"partition_type": "hive",
"partition_columns": ["country","year","month"],
"strict_column_order": true
}
}'
);
END;
/
Verify that the planning logic avoids listing unrelated top-level paths (e.g., paths belonging to other countries or years):
- Run a selective query that constrains leading partitions, for example:
withSELECT COUNT(*) FROM sales_xt WHERE country = ''US'' AND year = 2025;
.strict_column_order = true
- If the layout assumptions are not met, disable the option and retry.
Query External Implicit Partitioned Non-Hive Style Data
Use the DBMS_CLOUD.CREATE_EXTERNAL_TABLE
procedure
to create an implicit partitioned external table from data in object storage that
has been generated from non-Hive data.
The sample source files in the follow example, use this naming format:
OBJBUCKET/<table>/<value1>/<value2>/file.parquet
Consider the following sample source files:
OBJBUCKET/sales/USA/2024/01/sales-2024-01.parquet
OBJBUCKET/sales/USA/2024/01/sales-2024-02.parquet
OBJBUCKET/sales/USA/2024/01/sales-2024-03.parquet
To create an implicit partitioned external table with data stored in this sample Hive format, do the following:
Parent topic: Query External Tables with Implicit Partitioning