Load Data from Files in the Cloud
The PL/SQL package DBMS_CLOUD provides support for loading data from files in the cloud to tables created in your Autonomous Database on Dedicated Exadata Infrastructure. DBMS_CLOUD supports loading:
- Text files in the cloud, using the DBMS_CLOUD.COPY_DATA procedure
- JSON files in the cloud, using the DBMS_CLOUD.COPY_COLLECTION procedure
In both cases, the following requirements apply:
- The source file is available as a local file on your client computer or uploaded to a cloud-based object store, such as Oracle Cloud Infrastructure Object Storage, and is accessible to the database user attempting to load data.
- Your Cloud Object Storage credentials are stored using the DBMS_CLOUD.CREATE_CREDENTIAL procedure. See Create Credentials for more information.
The package DBMS_CLOUD supports loading from files in the following cloud services: Oracle Cloud Infrastructure Object Storage, Oracle Cloud Infrastructure Object Storage Classic, Azure Blob Storage, and Amazon S3.
- Create Credentials
  Learn how to store your Cloud Object Storage credential using the DBMS_CLOUD.CREATE_CREDENTIAL procedure.
- Load Data from Text Files
  Learn how to load data from text files in the cloud to your Autonomous Database using the DBMS_CLOUD.COPY_DATA procedure.
- Load a JSON File of Delimited Documents into a Collection
  Learn how to load a JSON file of delimited documents into a collection in your Autonomous Database using the DBMS_CLOUD.COPY_COLLECTION procedure.
- Load an Array of JSON Documents into a Collection
  Learn how to load an array of JSON documents into a collection in your Autonomous Database using the DBMS_CLOUD.COPY_COLLECTION procedure.
- Copy JSON Data into an Existing Table
  Use DBMS_CLOUD.COPY_DATA to load JSON data in the cloud into a table.
- Textual JSON Objects That Represent Extended Scalar Values
  Native binary JSON data (OSON format) extends the JSON language by adding scalar types, such as date, that correspond to SQL types and are not part of the JSON standard. Oracle Database also supports the use of textual JSON objects that represent JSON scalar values, including such nonstandard values.
- Monitor and Troubleshoot Data Loading
  All data load operations done using the PL/SQL package DBMS_CLOUD are logged in the tables dba_load_operations and user_load_operations.
Create Credentials
Learn how to store your Cloud Object Storage credential using the DBMS_CLOUD.CREATE_CREDENTIAL procedure.
To store your credentials, execute the DBMS_CLOUD.CREATE_CREDENTIAL procedure using any database tool, such as SQL*Plus, SQL Developer, or Database Actions (a web-based SQL Developer tool). For example:

BEGIN
  DBMS_CLOUD.CREATE_CREDENTIAL(
    credential_name => 'DEF_CRED_NAME',
    username        => 'adb_user@oracle.com',
    password        => 'password'
  );
END;
/
The values you provide for username and password depend on the Cloud Object Storage service you are using:
- Oracle Cloud Infrastructure Object Storage: username is your Oracle Cloud Infrastructure user name and password is your Oracle Cloud Infrastructure auth token. See Working with Auth Tokens.
- Oracle Cloud Infrastructure Object Storage Classic: username is your Oracle Cloud Infrastructure Classic user name and password is your Oracle Cloud Infrastructure Classic password.
This operation stores the credentials in the database in an encrypted format. You can use any name for the credential. This step is required only once, unless your object store credentials change; after you store the credentials, you can use the same credential name for all data loads.
Load Data from Text Files
Learn how to load data from text files in the cloud to your Autonomous Database using the DBMS_CLOUD.COPY_DATA procedure.
The source file in this example, channels.txt, has the following data:

S,Direct Sales,Direct
T,Tele Sales,Direct
C,Catalog,Indirect
I,Internet,Indirect
P,Partners,Others
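The following sketch shows how such a load might be run with DBMS_CLOUD.COPY_DATA, assuming channels.txt has already been uploaded to your object store; the table definition, credential name, and object store URL here are illustrative, not prescriptive:

-- Illustrative target table matching the three fields in channels.txt.
CREATE TABLE channels (
  channel_id    CHAR(1),
  channel_desc  VARCHAR2(20),
  channel_class VARCHAR2(20)
);

BEGIN
  DBMS_CLOUD.COPY_DATA(
    table_name      => 'CHANNELS',
    credential_name => 'DEF_CRED_NAME',
    -- Hypothetical URL; replace with the URI of your uploaded file.
    file_uri_list   => 'https://objectstorage.us-phoenix-1.oraclecloud.com/n/namespace/b/bucketname/o/channels.txt',
    -- The source file is comma-delimited.
    format          => json_object('delimiter' value ',')
  );
END;
/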
Load a JSON File of Delimited Documents into a Collection
Learn how to load a JSON file of delimited documents into a collection in your Autonomous Database using the DBMS_CLOUD.COPY_COLLECTION procedure.
This example loads JSON values from the line-delimited file myCollection.json. Each value (each line) is loaded into a collection on your Autonomous Database as a single document.

Here is an example of such a file. It has three lines, with one object per line. Each of those objects is loaded as a separate JSON document.

{ "name" : "apple", "count": 20 }
{ "name" : "orange", "count": 42 }
{ "name" : "pear", "count": 10 }
Load an Array of JSON Documents into a Collection
Learn how to load an array of JSON documents into a collection in your Autonomous Database using the DBMS_CLOUD.COPY_COLLECTION procedure.
This example uses the JSON file fruit_array.json, which has the following contents:

[{"name" : "apple", "count": 20 }, {"name" : "orange", "count": 42 }, {"name" : "pear", "count": 10 }]
Copy JSON Data into an Existing Table
Use DBMS_CLOUD.COPY_DATA to load JSON data in the cloud into a table.
The source file in this example is a JSON data file.
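One possible shape for such a load, sketched under the assumption that each record in the source file is one JSON value; the table definition, field_list, and URL are illustrative:

-- Illustrative target table; the IS JSON check constraint enforces valid JSON.
CREATE TABLE fruit_table (
  json_data CLOB CHECK (json_data IS JSON)
);

BEGIN
  DBMS_CLOUD.COPY_DATA(
    table_name      => 'FRUIT_TABLE',
    credential_name => 'DEF_CRED_NAME',
    -- Hypothetical URL; replace with the URI of your uploaded file.
    file_uri_list   => 'https://objectstorage.us-phoenix-1.oraclecloud.com/n/namespace/b/bucketname/o/myCollection.json',
    -- Map each whole record to the single JSON column.
    field_list      => 'json_data CHAR(5000)',
    format          => json_object('recorddelimiter' value '0x''0A''')
  );
END;
/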
Textual JSON Objects That Represent Extended Scalar Values
Native binary JSON data (OSON format) extends the JSON language by adding scalar types, such as date, that correspond to SQL types and are not part of the JSON standard. Oracle Database also supports the use of textual JSON objects that represent JSON scalar values, including such nonstandard values.
When you create native binary JSON data from textual JSON data that contains such extended objects, they can optionally be replaced with corresponding (native binary) JSON scalar values.
An example of an extended object is {"$numberDecimal":31}. It represents a JSON scalar value of the nonstandard type decimal number, and when interpreted as such it is replaced by a decimal number in native binary format.
For example, when you use the JSON data type constructor, JSON, if you use the keyword EXTENDED, then recognized extended objects in the textual input are replaced with corresponding scalar values in the native binary JSON result. If you do not include the keyword EXTENDED, then no such replacement occurs; the textual extended JSON objects are simply converted as-is to JSON objects in the native binary format.
In the opposite direction, when you use Oracle SQL function json_serialize to serialize binary JSON data as textual JSON data (VARCHAR2, CLOB, or BLOB), you can use the keyword EXTENDED to replace (native binary) JSON scalar values with corresponding textual extended JSON objects.
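A small sketch of both directions, assuming a database release in which the JSON constructor and json_serialize accept the EXTENDED keyword as described above:

-- Ingest: the extended object is replaced by a native binary date scalar.
SELECT JSON('{"dob" : {"$oracleDate" : "2000-01-02"}}' EXTENDED) AS doc
FROM dual;

-- Serialize with EXTENDED: the native date is written back as an extended
-- object; without EXTENDED it would appear as a plain ISO date string.
SELECT json_serialize(
         JSON('{"dob" : {"$oracleDate" : "2000-01-02"}}' EXTENDED)
         EXTENDED) AS doc
FROM dual;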
If the database you use is an Oracle Autonomous Database, then you can use PL/SQL procedure DBMS_CLOUD.COPY_COLLECTION to create a JSON document collection from a file of JSON data such as that produced by common NoSQL databases, including Oracle NoSQL Database. If you use ejson as the value of the type parameter of the procedure, then recognized extended JSON objects in the input file are replaced with corresponding scalar values in the resulting native binary JSON collection. In the other direction, you can use function json_serialize with keyword EXTENDED to replace scalar values with extended JSON objects in the resulting textual JSON data.
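For example, a hedged sketch of an ejson load (the collection name and URL are illustrative):

BEGIN
  DBMS_CLOUD.COPY_COLLECTION(
    collection_name => 'from_nosql',
    credential_name => 'DEF_CRED_NAME',
    -- Hypothetical URL; replace with the URI of your exported JSON file.
    file_uri_list   => 'https://objectstorage.us-phoenix-1.oraclecloud.com/n/namespace/b/bucketname/o/export.json',
    -- type ejson: recognized extended objects become native binary scalars.
    format          => '{"type" : "ejson"}'
  );
END;
/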
These are the two main use cases for extended objects:

- Exchange (import/export):
  - Ingest existing JSON data (from somewhere) that contains extended objects.
  - Serialize native binary JSON data as textual JSON data with extended objects, for some use outside the database.
- Inspection of native binary JSON data: see what you have by looking at corresponding extended objects.

For exchange purposes, you can ingest JSON data from a file produced by common NoSQL databases, including Oracle NoSQL Database, converting extended objects to native binary JSON scalars. In the other direction, you can export native binary JSON data as textual data, replacing Oracle-specific scalar JSON values with corresponding textual extended JSON objects.
Tip:
As an example of inspection, consider an object such as {"dob" : "2000-01-02T00:00:00"} as the result of serializing native JSON data. Is "2000-01-02T00:00:00" the result of serializing a native binary value of type date, or is the native binary value just a string? Using json_serialize with keyword EXTENDED lets you know.
The mapping of extended object fields to scalar JSON types is, in general, many-to-one: more than one kind of extended JSON object can be mapped to a given scalar value. For example, the extended JSON objects {"$numberDecimal":"31"} and {"$numberLong":"31"} are both translated as the value 31 of JSON-language scalar type number, and item method type() returns "number" for each of those JSON scalars.
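For instance, a sketch of this many-to-one mapping (assuming the JSON constructor accepts the EXTENDED keyword and that json_value supports the type() item method, as described here):

-- Both extended objects are translated to the JSON-language number 31,
-- so type() reports "number" for each.
SELECT json_value(JSON('{"n" : {"$numberDecimal" : "31"}}' EXTENDED),
                  '$.n.type()') AS t1,
       json_value(JSON('{"n" : {"$numberLong" : "31"}}' EXTENDED),
                  '$.n.type()') AS t2
FROM dual;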
Item method type() reports the JSON-language scalar type of its targeted value (as a JSON string). Some scalar values are distinguishable internally, even when they have the same scalar type. This generally allows function json_serialize (with keyword EXTENDED) to reconstruct the original extended JSON object. Such scalar values are distinguished internally either by using different SQL types to implement them or by tagging them with the kind of extended JSON object from which they were derived.
When json_serialize reconstructs the original extended JSON object, the result is not always textually identical to the original, but it is always semantically equivalent. For example, {"$numberDecimal":"31"} and {"$numberDecimal":31} are semantically equivalent, even though the field values differ in type (string and number). They are translated to the same internal value, and each is tagged as being derived from a $numberDecimal extended object (same tag). But when serialized, the result for both is {"$numberDecimal":31}. Oracle always uses the most directly relevant type for the field value, which in this case is the JSON-language value 31, of scalar type number.
Table 4-1 presents correspondences among the various types used. It maps across (1) types of extended objects used as input, (2) types reported by item method type(), (3) SQL types used internally, (4) standard JSON-language types used as output by function json_serialize, and (5) types of extended objects output by json_serialize when keyword EXTENDED is specified.
Table 4-1 Extended JSON Object Type Relations
Extended Object Type (Input) | Oracle JSON Scalar Type (Reported by type()) | SQL Scalar Type | Standard JSON Scalar Type (Output) | Extended Object Type (Output) |
---|---|---|---|---|
$numberDouble with value a JSON number, a string representing the number, or one of these strings: "Infinity", "-Infinity", "Inf", "-Inf", "Nan" (Footnote 1) | double | BINARY_DOUBLE | number | $numberDouble with value a JSON number or one of these strings: "Inf", "-Inf", "Nan" (Footnote 2) |
$numberFloat with value the same as for $numberDouble | float | BINARY_FLOAT | number | $numberFloat with value the same as for $numberDouble |
$numberDecimal with value the same as for $numberDouble | number | NUMBER | number | $numberDecimal with value the same as for $numberDouble |
$numberInt with value a signed 32-bit integer or a string representing the number | number | NUMBER | number | $numberInt with value the same as for $numberDouble |
$numberLong with value a JSON number or a string representing the number | number | NUMBER | number | $numberLong with value the same as for $numberDouble |
$binary with value a string of base-64 characters. When the value is a string of base-64 characters, the extended object can also have an additional field. | binary | BLOB or RAW | string. Conversion is equivalent to the use of SQL function rawtohex. | One of the following: |
$oid with value a string of 24 hexadecimal characters | binary | RAW(12) | string. Conversion is equivalent to the use of SQL function rawtohex. | $rawid with value a string of 24 hexadecimal characters |
$rawhex with value a string with an even number of hexadecimal characters | binary | RAW | string. Conversion is equivalent to the use of SQL function rawtohex. | $binary with value a string of base-64 characters, right-padded with = characters |
$rawid with value a string of 24 or 32 hexadecimal characters | binary | RAW | string. Conversion is equivalent to the use of SQL function rawtohex. | $rawid |
$oracleDate with value an ISO 8601 date string | date | DATE | string | $oracleDate with value an ISO 8601 date string |
$oracleTimestamp with value an ISO 8601 timestamp string | timestamp | TIMESTAMP | string | $oracleTimestamp with value an ISO 8601 timestamp string |
$oracleTimestampTZ with value an ISO 8601 timestamp string with a numeric time zone offset or with Z | timestamp with time zone | TIMESTAMP WITH TIME ZONE | string | $oracleTimestampTZ with value an ISO 8601 timestamp string with a numeric time zone offset or with Z |
 | timestamp with time zone | TIMESTAMP WITH TIME ZONE | string | $oracleTimestampTZ with value an ISO 8601 timestamp string with a numeric time zone offset or with Z |
$intervalDaySecond with value an ISO 8601 interval string as specified for SQL function to_dsinterval | daysecondInterval | INTERVAL DAY TO SECOND | string | $intervalDaySecond with value an ISO 8601 interval string as specified for SQL function to_dsinterval |
$intervalYearMonth with value an ISO 8601 interval string as specified for SQL function to_yminterval | yearmonthInterval | INTERVAL YEAR TO MONTH | string | $intervalYearMonth with value an ISO 8601 interval string as specified for SQL function to_yminterval |
Two fields: | vector | VECTOR | array of numbers | Two fields: |
Footnote 1: The string values are interpreted case-insensitively. For example, "NAN", "nan", and "nAn" are accepted and equivalent, and similarly "INF", "inFinity", and "iNf". Infinitely large ("Infinity" or "Inf") and small ("-Infinity" or "-Inf") numbers are accepted with either the full word or the abbreviation.
Footnote 2: On output, only these string values are used; the full word Infinity and letter-case variants are not used.
Monitor and Troubleshoot Data Loading
All data load operations done using the PL/SQL package DBMS_CLOUD are logged in the tables dba_load_operations and user_load_operations:
- dba_load_operations: shows all load operations.
- user_load_operations: shows the load operations in your schema.
Query these tables to see information about ongoing and completed data loads. For example, the following SELECT statement with a WHERE clause predicate on the TYPE column shows load operations of type COPY:
SELECT table_name, owner_name, type, status, start_time, update_time, logfile_table, badfile_table
FROM user_load_operations WHERE type = 'COPY';

TABLE_NAME OWNER_NAME TYPE STATUS    START_TIME                          UPDATE_TIME                         LOGFILE_TABLE BADFILE_TABLE
---------- ---------- ---- --------- ----------------------------------- ----------------------------------- ------------- -------------
CHANNELS   SH         COPY COMPLETED 04-MAR-21 07.38.30.522711000 AM GMT 04-MAR-21 07.38.30.522711000 AM GMT COPY$1_LOG    COPY$1_BAD
The LOGFILE_TABLE column shows the name of the table you can query to look at the log of a load operation. For example, the following query shows the log of the load operation above:

select * from COPY$1_LOG;
The BADFILE_TABLE column shows the name of the table you can query to look at the rows that reported errors during loading. For example, the following query shows the rejected records for the load operation above:

select * from COPY$1_BAD;
Depending on the errors shown in the log and the rows shown in the specified BADFILE_TABLE table, you can correct the error by specifying the correct format options in DBMS_CLOUD.COPY_DATA.
The LOGFILE_TABLE and BADFILE_TABLE tables are stored for two days for each load operation and then removed automatically.
See DELETE_ALL_OPERATIONS Procedure for information on clearing the user_load_operations table.
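For example, a minimal sketch that clears logged COPY operations (assuming your DBMS_CLOUD version supports filtering DELETE_ALL_OPERATIONS by operation type):

BEGIN
  -- Removes logged entries (and their log/bad tables) for COPY operations.
  DBMS_CLOUD.DELETE_ALL_OPERATIONS(type => 'COPY');
END;
/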