The PL/SQL package DBMS_CLOUD enables you to build a text index on the object store files, which allows you to search the text and use wildcards with your search.
About Full-Text Search on Files in Object Storage You can create a text index on files in object storage. A text index allows you to perform a word-based search across very large data sets in object store.
Text Index Reference Table A local table is created within your database with a standard suffix INDEX_NAME$TXTIDX. This table is created internally when you run DBMS_CLOUD.CREATE_EXTERNAL_TEXT_INDEX.
Monitor Text Index Creation When you run DBMS_CLOUD.CREATE_EXTERNAL_TEXT_INDEX the text index creation operation is logged in the ALL_SCHEDULER_JOB_RUN_DETAILS view.
About Full-Text Search on Files in Object Storage 🔗
You
can create a text index on files in object storage. A text index allows you to perform a
word-based search across very large data sets in object store.
DBMS_CLOUD provides
fast and efficient ways to manage data in object store. The DBMS_CLOUD APIs let you create,
copy, download, delete, and traverse files present in object store. When you define
external tables you can run SQL queries on data stored in your object store (or with
hybrid partitioned external tables, across data in your database and in object store).
When you use DBMS_CLOUD to define
a text index, this allows you to search your data for text and use wildcards.
Autonomous Database support for word-based
search works for commonly used data formats, for example CSV or JSON and with formatted
documents (binary), for example PDF and DOC (MS Word) formats. You can configure a
refresh rate that indicates the frequency in minutes at which the index is refreshed for
any new uploads or deletes.
A local table with the standard suffix INDEX_NAME$TXTIDX is
created when you create an index on the object storage, and you can utilize the table to
perform a search using the CONTAINS keyword.
Run the DBMS_CLOUD.CREATE_EXTERNAL_TEXT_INDEX
procedure to create a text index on the object storage
files.
BEGIN
DBMS_CLOUD.CREATE_EXTERNAL_TEXT_INDEX (
credential_name => 'DEFAULT_CREDENTIAL',
location_uri => 'https://objectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname/o/ts_data/'
index_name => 'EMP',
format => JSON_OBJECT ('refresh_rate' value 10)
);
END;
/
This example creates a text index
EMP on the object storage files
located at the URI specified in the
location_uri parameter. The
refresh_rate option in the
format parameter specifies that
the index EMP is refreshed at an
interval of 10 minutes.
This creates a local table
INDEX_NAME$TXTIDX. You can
utilize the table
INDEX_NAME$TXTIDX to
perform a search using CONTAINS.
For example:
SELECT object_name FROM EMP$TXTIDX
WHERE CONTAINS(object_name,'king') > 0;
This query returns the object or file names
that contain the string king.
A local table is created within your database with a standard suffix
INDEX_NAME$TXTIDX. This table is created internally when you
run DBMS_CLOUD.CREATE_EXTERNAL_TEXT_INDEX.
You can query the INDEX_NAME$TXTIDX table to search for a
string using the CONTAINS keyword. For example, when you call DBMS_CLOUD.CREATE_EXTERNAL_TEXT_INDEX
procedure the INDEX_NAME value as EMP, this
creates the EMP$TXTIDX the text reference table.
The text reference table has the following columns:
object_name: is the file name on the object
storage that contains the searched text string.
object_path: is the object storage bucket or
folder URI that contains the object storage file.
mtime: is the last modified timestamp of the
object storage file. This is the time when the file was last accessed by
DBMS_CLOUD.
For
example:
SELECT object_path, object_name FROM EMP$TXTIDX WHERE CONTAINS(OBJECT_NAME, 'king') > 0;
OBJECT_PATH OBJECT_NAME
------------------------------------------------------------------------------------------ ------------------------------------
https://objectstorage.us-phoenix-1.oraclecloud.com/n/example1/b/adbs_data_share/o/ts_data/ data_2_20221026T195313585601Z.json
This query returns the file names and location URI on the object storage which
contains the text string king, in either upper or lowercase.
When you run DBMS_CLOUD.CREATE_EXTERNAL_TEXT_INDEX the text index creation
operation is logged in the ALL_SCHEDULER_JOB_RUN_DETAILS view.
You can query the ALL_SCHEDULER_JOB_RUN_DETAILS view to obtain the
status and any error reported by the index creation job.
The name of the DBMS_SCHEDULER job is derived from the
INDEX_NAME parameter specified when you call DBMS_CLOUD.CREATE_EXTERNAL_TEXT_INDEX.
To query the ALL_SCHEDULER_JOB_RUN_DETAILS view, you must be logged
in as the ADMIN user or have READ privilege on the
ALL_SCHEDULER_JOB_RUN_DETAILS view.
For example, the following SELECT statement with a
WHERE clause on job_name shows the run details for
the job:
SELECT status, additional_info
FROM all_scheduler_job_run_details WHERE LOWER(job_name) = LOWER('index_name$JOB');
You can also query for the existence of an index creation scheduler job.
For
example:
SELECT status
FROM all_scheduler_jobs where LOWER(job_name) = LOWER('index_name$JOB');