oracle.oci.oci_ai_document_analyze_document_result_actions – Perform actions on an AnalyzeDocumentResult resource in Oracle Cloud Infrastructure
Note
This plugin is part of the oracle.oci collection (version 5.0.0).
You might already have this collection installed if you are using the ansible package.
It is not included in ansible-core.
To check whether it is installed, run ansible-galaxy collection list.
To install it, use: ansible-galaxy collection install oracle.oci.
To use it in a playbook, specify: oracle.oci.oci_ai_document_analyze_document_result_actions.
New in version 2.9.0 of oracle.oci
Synopsis
Perform actions on an AnalyzeDocumentResult resource in Oracle Cloud Infrastructure
For action=analyze_document, perform different types of document analysis.
Requirements
The following requirements are needed on the host that executes this module.
python >= 3.6
Python SDK for Oracle Cloud Infrastructure https://oracle-cloud-infrastructure-python-sdk.readthedocs.io
Parameters
Parameter | Choices/Defaults | Comments | |||||||
---|---|---|---|---|---|---|---|---|---|
action
string
/ required
|
|
The action to perform on the AnalyzeDocumentResult.
|
|||||||
api_user
string
|
The OCID of the user on whose behalf OCI APIs are invoked. If not set, then the value of the OCI_USER_ID environment variable, if any, is used. This option is required if the user is not specified through a configuration file (see
config_file_location ). To get the user's OCID, refer to https://docs.us-phoenix-1.oraclecloud.com/Content/API/Concepts/apisigningkey.htm. |
||||||||
api_user_fingerprint
string
|
Fingerprint for the key pair being used. If not set, then the value of the OCI_USER_FINGERPRINT environment variable, if any, is used. This option is required if the key fingerprint is not specified through a configuration file (see
config_file_location ). To get the key pair's fingerprint value, refer to https://docs.us-phoenix-1.oraclecloud.com/Content/API/Concepts/apisigningkey.htm. |
||||||||
api_user_key_file
string
|
Full path and filename of the private key (in PEM format). If not set, then the value of the OCI_USER_KEY_FILE variable, if any, is used. This option is required if the private key is not specified through a configuration file (See
config_file_location ). If the key is encrypted with a pass-phrase, the api_user_key_pass_phrase option must also be provided. |
||||||||
api_user_key_pass_phrase
string
|
Passphrase used by the key referenced in
api_user_key_file , if it is encrypted. If not set, then the value of the OCI_USER_KEY_PASS_PHRASE variable, if any, is used. This option is required if the key passphrase is not specified through a configuration file (See config_file_location ). |
||||||||
auth_purpose
string
|
|
The auth purpose which can be used in conjunction with 'auth_type=instance_principal'. The default auth_purpose for instance_principal is None.
|
|||||||
auth_type
string
|
|
The type of authentication to use for making API requests. By default,
auth_type="api_key" based authentication is performed and the API key (see api_user_key_file) in your config file will be used. If this 'auth_type' module option is not specified, the value of the OCI_ANSIBLE_AUTH_TYPE variable, if any, is used. Use auth_type="instance_principal" to use instance principal based authentication when running Ansible playbooks within an OCI compute instance. |
|||||||
cert_bundle
string
|
The full path to a CA certificate bundle to be used for SSL verification. This will override the default CA certificate bundle. If not set, then the value of the OCI_ANSIBLE_CERT_BUNDLE variable, if any, is used.
|
||||||||
compartment_id
string
|
The compartment identifier.
|
||||||||
config_file_location
string
|
Path to configuration file. If not set then the value of the OCI_CONFIG_FILE environment variable, if any, is used. Otherwise, defaults to ~/.oci/config.
|
||||||||
config_profile_name
string
|
The profile to load from the config file referenced by
config_file_location . If not set, then the value of the OCI_CONFIG_PROFILE environment variable, if any, is used. Otherwise, defaults to the "DEFAULT" profile in config_file_location . |
||||||||
document
dictionary
/ required
|
|||||||||
bucket_name
string
|
The Object Storage bucket name.
Required when source is 'OBJECT_STORAGE'
|
||||||||
data
string
|
Raw document data with Base64 encoding.
Required when source is 'INLINE'
|
||||||||
namespace_name
string
|
The Object Storage namespace.
Required when source is 'OBJECT_STORAGE'
|
||||||||
object_name
string
|
The Object Storage object name.
Required when source is 'OBJECT_STORAGE'
|
||||||||
source
string
/ required
|
|
The location of the document data. The allowed values are:
- `INLINE`: The data is included directly in the request payload.
- `OBJECT_STORAGE`: The document is in OCI Object Storage.
|
|||||||
document_type
string
|
|
The document type.
|
|||||||
features
list
/ elements=dictionary / required
|
The types of document analysis requested.
|
||||||||
feature_type
string
/ required
|
|
The type of document analysis requested. The allowed values are:
- `LANGUAGE_CLASSIFICATION`: Detect the language.
- `TEXT_EXTRACTION`: Recognize text.
- `TABLE_EXTRACTION`: Detect and extract data in tables.
- `KEY_VALUE_EXTRACTION`: Extract form fields.
- `DOCUMENT_CLASSIFICATION`: Identify the type of document.
|
|||||||
generate_searchable_pdf
boolean
|
|
Whether or not to generate a searchable PDF file.
Applicable when feature_type is 'TEXT_EXTRACTION'
|
|||||||
max_results
integer
|
The maximum number of results to return.
Applicable when feature_type is one of ['DOCUMENT_CLASSIFICATION', 'LANGUAGE_CLASSIFICATION']
|
||||||||
model_id
string
|
The custom model ID.
Applicable when feature_type is one of ['KEY_VALUE_EXTRACTION', 'DOCUMENT_CLASSIFICATION']
|
||||||||
tenancy_id
string
|
The custom model tenancy ID when modelId represents aliasName.
Applicable when feature_type is one of ['KEY_VALUE_EXTRACTION', 'DOCUMENT_CLASSIFICATION']
|
||||||||
language
string
|
The document language, abbreviated according to the BCP 47 syntax.
|
||||||||
ocr_data
dictionary
|
|||||||||
detected_document_types
list
/ elements=dictionary
|
An array of detected document types.
|
||||||||
confidence
float
/ required
|
The confidence score between 0 and 1.
|
||||||||
document_id
string
|
The OCID of the Key-Value Extraction model that was used to extract the key-value pairs.
|
||||||||
document_type
string
/ required
|
The document type.
|
||||||||
detected_languages
list
/ elements=dictionary
|
An array of detected languages.
|
||||||||
confidence
float
/ required
|
The confidence score between 0 and 1.
|
||||||||
language
string
/ required
|
The document language, abbreviated according to the BCP 47 syntax.
|
||||||||
document_classification_model_version
string
|
The document classification model version.
|
||||||||
document_metadata
dictionary
/ required
|
|||||||||
mime_type
string
/ required
|
The result data format.
|
||||||||
page_count
integer
/ required
|
The number of pages in the document.
|
||||||||
errors
list
/ elements=dictionary
|
The errors encountered during document analysis.
|
||||||||
code
string
/ required
|
The error code.
|
||||||||
message
string
/ required
|
The error message.
|
||||||||
key_value_extraction_model_version
string
|
The document keyValue extraction model version.
|
||||||||
language_classification_model_version
string
|
The document language classification model version.
|
||||||||
pages
list
/ elements=dictionary / required
|
The array of pages.
|
||||||||
detected_document_types
list
/ elements=dictionary
|
An array of detected document types.
|
||||||||
confidence
float
/ required
|
The confidence score between 0 and 1.
|
||||||||
document_id
string
|
The OCID of the Key-Value Extraction model that was used to extract the key-value pairs.
|
||||||||
document_type
string
/ required
|
The document type.
|
||||||||
detected_languages
list
/ elements=dictionary
|
An array of detected languages.
|
||||||||
confidence
float
/ required
|
The confidence score between 0 and 1.
|
||||||||
language
string
/ required
|
The document language, abbreviated according to the BCP 47 syntax.
|
||||||||
dimensions
dictionary
|
|||||||||
height
float
/ required
|
The height of a page.
|
||||||||
unit
string
/ required
|
|
The unit of length.
|
|||||||
width
float
/ required
|
The width of a page.
|
||||||||
document_fields
list
/ elements=dictionary
|
The form fields detected on the page.
|
||||||||
field_label
dictionary
|
|||||||||
confidence
float
|
The confidence score between 0 and 1.
|
||||||||
name
string
/ required
|
The name of the field label.
|
||||||||
field_name
dictionary
|
|||||||||
bounding_polygon
dictionary
|
|||||||||
normalized_vertices
list
/ elements=dictionary / required
|
An array of normalized points defining the polygon's perimeter, with an implicit segment between subsequent points and between the first and last point. Rectangles are defined with four points. For example, `[{"x": 0, "y": 0}, {"x": 1, "y": 0}, {"x": 1, "y": 0.5}, {"x": 0, "y": 0.5}]` represents the top half of an image.
|
||||||||
x
float
/ required
|
The X-axis normalized coordinate.
|
||||||||
y
float
/ required
|
The Y-axis normalized coordinate.
|
||||||||
confidence
float
|
The confidence score between 0 and 1.
|
||||||||
name
string
/ required
|
The name of the field.
|
||||||||
word_indexes
list
/ elements=integer
|
The indexes of the words in the field name.
|
||||||||
field_type
string
/ required
|
|
The field type.
|
|||||||
field_value
dictionary
/ required
|
|||||||||
bounding_polygon
dictionary
/ required
|
|||||||||
normalized_vertices
list
/ elements=dictionary / required
|
An array of normalized points defining the polygon's perimeter, with an implicit segment between subsequent points and between the first and last point. Rectangles are defined with four points. For example, `[{"x": 0, "y": 0}, {"x": 1, "y": 0}, {"x": 1, "y": 0.5}, {"x": 0, "y": 0.5}]` represents the top half of an image.
Required when value_type is 'TIME'
|
||||||||
x
float
/ required
|
The X-axis normalized coordinate.
Required when value_type is 'TIME'
|
||||||||
y
float
/ required
|
The Y-axis normalized coordinate.
Required when value_type is 'TIME'
|
||||||||
confidence
float
/ required
|
The confidence score between 0 and 1.
|
||||||||
items
list
/ elements=dictionary
|
The array of values.
Required when value_type is 'ARRAY'
|
||||||||
field_label
dictionary
|
|||||||||
field_name
dictionary
|
|||||||||
field_type
string
/ required
|
|
The field type.
|
|||||||
field_value
dictionary
/ required
|
|||||||||
text
string
|
The detected text of a field.
|
||||||||
value
string
|
The time field value as yyyy-mm-dd hh-mm-ss.
Required when value_type is one of ['DATE', 'NUMBER', 'STRING', 'TIME', 'PHONE_NUMBER', 'INTEGER']
|
||||||||
value_type
string
/ required
|
|
The type of data detected.
|
|||||||
word_indexes
list
/ elements=integer / required
|
The indexes of the words in the field value.
|
||||||||
lines
list
/ elements=dictionary
|
The lines of text detected on the page.
|
||||||||
bounding_polygon
dictionary
/ required
|
|||||||||
normalized_vertices
list
/ elements=dictionary / required
|
An array of normalized points defining the polygon's perimeter, with an implicit segment between subsequent points and between the first and last point. Rectangles are defined with four points. For example, `[{"x": 0, "y": 0}, {"x": 1, "y": 0}, {"x": 1, "y": 0.5}, {"x": 0, "y": 0.5}]` represents the top half of an image.
|
||||||||
x
float
/ required
|
The X-axis normalized coordinate.
|
||||||||
y
float
/ required
|
The Y-axis normalized coordinate.
|
||||||||
confidence
float
/ required
|
The confidence score between 0 and 1.
|
||||||||
text
string
/ required
|
The text recognized.
|
||||||||
word_indexes
list
/ elements=integer / required
|
The array of words.
|
||||||||
page_number
integer
/ required
|
The document page number.
|
||||||||
tables
list
/ elements=dictionary
|
The tables detected on the page.
|
||||||||
body_rows
list
/ elements=dictionary / required
|
The body rows.
|
||||||||
cells
list
/ elements=dictionary / required
|
The cells in the row.
|
||||||||
bounding_polygon
dictionary
/ required
|
|||||||||
normalized_vertices
list
/ elements=dictionary / required
|
An array of normalized points defining the polygon's perimeter, with an implicit segment between subsequent points and between the first and last point. Rectangles are defined with four points. For example, `[{"x": 0, "y": 0}, {"x": 1, "y": 0}, {"x": 1, "y": 0.5}, {"x": 0, "y": 0.5}]` represents the top half of an image.
|
||||||||
x
float
/ required
|
The X-axis normalized coordinate.
|
||||||||
y
float
/ required
|
The Y-axis normalized coordinate.
|
||||||||
column_index
integer
/ required
|
The index of the cell inside the column.
|
||||||||
confidence
float
/ required
|
The confidence score between 0 and 1.
|
||||||||
row_index
integer
/ required
|
The index of the cell inside the row.
|
||||||||
text
string
/ required
|
The text recognized in the cell.
|
||||||||
word_indexes
list
/ elements=integer / required
|
The words detected in the cell.
|
||||||||
bounding_polygon
dictionary
/ required
|
|||||||||
normalized_vertices
list
/ elements=dictionary / required
|
An array of normalized points defining the polygon's perimeter, with an implicit segment between subsequent points and between the first and last point. Rectangles are defined with four points. For example, `[{"x": 0, "y": 0}, {"x": 1, "y": 0}, {"x": 1, "y": 0.5}, {"x": 0, "y": 0.5}]` represents the top half of an image.
|
||||||||
x
float
/ required
|
The X-axis normalized coordinate.
|
||||||||
y
float
/ required
|
The Y-axis normalized coordinate.
|
||||||||
column_count
integer
/ required
|
The number of columns.
|
||||||||
confidence
float
/ required
|
The confidence score between 0 and 1.
|
||||||||
footer_rows
list
/ elements=dictionary / required
|
The footer rows.
|
||||||||
cells
list
/ elements=dictionary / required
|
The cells in the row.
|
||||||||
bounding_polygon
dictionary
/ required
|
|||||||||
normalized_vertices
list
/ elements=dictionary / required
|
An array of normalized points defining the polygon's perimeter, with an implicit segment between subsequent points and between the first and last point. Rectangles are defined with four points. For example, `[{"x": 0, "y": 0}, {"x": 1, "y": 0}, {"x": 1, "y": 0.5}, {"x": 0, "y": 0.5}]` represents the top half of an image.
|
||||||||
x
float
/ required
|
The X-axis normalized coordinate.
|
||||||||
y
float
/ required
|
The Y-axis normalized coordinate.
|
||||||||
column_index
integer
/ required
|
The index of the cell inside the column.
|
||||||||
confidence
float
/ required
|
The confidence score between 0 and 1.
|
||||||||
row_index
integer
/ required
|
The index of the cell inside the row.
|
||||||||
text
string
/ required
|
The text recognized in the cell.
|
||||||||
word_indexes
list
/ elements=integer / required
|
The words detected in the cell.
|
||||||||
header_rows
list
/ elements=dictionary / required
|
The header rows.
|
||||||||
cells
list
/ elements=dictionary / required
|
The cells in the row.
|
||||||||
bounding_polygon
dictionary
/ required
|
|||||||||
normalized_vertices
list
/ elements=dictionary / required
|
An array of normalized points defining the polygon's perimeter, with an implicit segment between subsequent points and between the first and last point. Rectangles are defined with four points. For example, `[{"x": 0, "y": 0}, {"x": 1, "y": 0}, {"x": 1, "y": 0.5}, {"x": 0, "y": 0.5}]` represents the top half of an image.
|
||||||||
x
float
/ required
|
The X-axis normalized coordinate.
|
||||||||
y
float
/ required
|
The Y-axis normalized coordinate.
|
||||||||
column_index
integer
/ required
|
The index of the cell inside the column.
|
||||||||
confidence
float
/ required
|
The confidence score between 0 and 1.
|
||||||||
row_index
integer
/ required
|
The index of the cell inside the row.
|
||||||||
text
string
/ required
|
The text recognized in the cell.
|
||||||||
word_indexes
list
/ elements=integer / required
|
The words detected in the cell.
|
||||||||
row_count
integer
/ required
|
The number of rows.
|
||||||||
words
list
/ elements=dictionary
|
The words detected on the page.
|
||||||||
bounding_polygon
dictionary
/ required
|
|||||||||
normalized_vertices
list
/ elements=dictionary / required
|
An array of normalized points defining the polygon's perimeter, with an implicit segment between subsequent points and between the first and last point. Rectangles are defined with four points. For example, `[{"x": 0, "y": 0}, {"x": 1, "y": 0}, {"x": 1, "y": 0.5}, {"x": 0, "y": 0.5}]` represents the top half of an image.
|
||||||||
x
float
/ required
|
The X-axis normalized coordinate.
|
||||||||
y
float
/ required
|
The Y-axis normalized coordinate.
|
||||||||
confidence
float
/ required
|
The confidence score between 0 and 1.
|
||||||||
text
string
/ required
|
The string of text characters in the word.
|
||||||||
searchable_pdf
string
|
The searchable PDF file that was generated.
|
||||||||
table_extraction_model_version
string
|
The document table extraction model version.
|
||||||||
text_extraction_model_version
string
|
The document text extraction model version.
|
||||||||
output_location
dictionary
|
|||||||||
bucket_name
string
/ required
|
The Object Storage bucket name.
|
||||||||
namespace_name
string
/ required
|
The Object Storage namespace.
|
||||||||
prefix
string
/ required
|
The Object Storage folder name.
|
||||||||
realm_specific_endpoint_template_enabled
boolean
|
|
Enable or disable the realm-specific endpoint template for the service client. By default, the realm-specific endpoint template is disabled. If not set, then the value of the OCI_REALM_SPECIFIC_SERVICE_ENDPOINT_TEMPLATE_ENABLED variable, if any, is used.
|
|||||||
region
string
|
The Oracle Cloud Infrastructure region to use for all OCI API requests. If not set, then the value of the OCI_REGION variable, if any, is used. This option is required if the region is not specified through a configuration file (See
config_file_location ). Please refer to https://docs.us-phoenix-1.oraclecloud.com/Content/General/Concepts/regions.htm for more information on OCI regions. |
||||||||
tenancy
string
|
OCID of your tenancy. If not set, then the value of the OCI_TENANCY variable, if any, is used. This option is required if the tenancy OCID is not specified through a configuration file (see
config_file_location ). To get the tenancy OCID, refer to https://docs.us-phoenix-1.oraclecloud.com/Content/API/Concepts/apisigningkey.htm |
Notes
Note
For OCI Python SDK configuration, refer to https://oracle-cloud-infrastructure-python-sdk.readthedocs.io/en/latest/configuration.html
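The authentication options listed under Parameters can also be set per task rather than through environment variables. The following is a minimal sketch, not a definitive configuration: the config file path and the profile name are hypothetical, and the Object Storage names are placeholders.

```yaml
# Variant 1: API key authentication with a named profile.
# /etc/oci/config and the profile name DOCS are hypothetical values.
- name: Detect the document language using a named config profile
  oracle.oci.oci_ai_document_analyze_document_result_actions:
    config_file_location: /etc/oci/config
    config_profile_name: DOCS
    action: analyze_document
    features:
      - feature_type: LANGUAGE_CLASSIFICATION
        max_results: 3
    document:
      source: OBJECT_STORAGE
      namespace_name: namespace_name_example
      bucket_name: bucket_name_example
      object_name: object_name_example

# Variant 2: instance principal authentication, intended for playbooks
# that run on an OCI compute instance.
- name: Detect the document language using instance principal authentication
  oracle.oci.oci_ai_document_analyze_document_result_actions:
    auth_type: instance_principal
    action: analyze_document
    features:
      - feature_type: LANGUAGE_CLASSIFICATION
    document:
      source: OBJECT_STORAGE
      namespace_name: namespace_name_example
      bucket_name: bucket_name_example
      object_name: object_name_example
```

With instance principal authentication, the api_user, api_user_fingerprint and api_user_key_file options are presumably not needed, since no API signing key is involved; treat that as an assumption to verify against your environment.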
Examples
- name: Perform action analyze_document on analyze_document_result
  oci_ai_document_analyze_document_result_actions:
    # required
    features:
    - # required
      feature_type: DOCUMENT_CLASSIFICATION
      # optional
      model_id: "ocid1.model.oc1..xxxxxxEXAMPLExxxxxx"
      tenancy_id: "ocid1.tenancy.oc1..xxxxxxEXAMPLExxxxxx"
      max_results: 56
    document:
      # required
      namespace_name: namespace_name_example
      bucket_name: bucket_name_example
      object_name: object_name_example
      source: OBJECT_STORAGE
    action: analyze_document
    # optional
    compartment_id: "ocid1.compartment.oc1..xxxxxxEXAMPLExxxxxx"
    output_location:
      # required
      namespace_name: namespace_name_example
      bucket_name: bucket_name_example
      prefix: prefix_example
    language: language_example
    document_type: INVOICE
    ocr_data:
      # required
      document_metadata:
        # required
        page_count: 56
        mime_type: mime_type_example
      pages:
      - # required
        page_number: 56
        # optional
        dimensions:
          # required
          width: 3.4
          height: 3.4
          unit: PIXEL
        detected_document_types:
        - # required
          document_type: document_type_example
          confidence: 3.4
          # optional
          document_id: "ocid1.document.oc1..xxxxxxEXAMPLExxxxxx"
        detected_languages:
        - # required
          language: language_example
          confidence: 3.4
        words:
        - # required
          text: text_example
          confidence: 3.4
          bounding_polygon:
            # required
            normalized_vertices:
            - # required
              x: 3.4
              y: 3.4
        lines:
        - # required
          text: text_example
          confidence: 3.4
          bounding_polygon:
            # required
            normalized_vertices:
            - # required
              x: 3.4
              y: 3.4
          word_indexes: [ "word_indexes_example" ]
        tables:
        - # required
          row_count: 56
          column_count: 56
          header_rows:
          - # required
            cells:
            - # required
              text: text_example
              row_index: 56
              column_index: 56
              confidence: 3.4
              bounding_polygon:
                # required
                normalized_vertices:
                - # required
                  x: 3.4
                  y: 3.4
              word_indexes: [ "word_indexes_example" ]
          body_rows:
          - # required
            cells:
            - # required
              text: text_example
              row_index: 56
              column_index: 56
              confidence: 3.4
              bounding_polygon:
                # required
                normalized_vertices:
                - # required
                  x: 3.4
                  y: 3.4
              word_indexes: [ "word_indexes_example" ]
          footer_rows:
          - # required
            cells:
            - # required
              text: text_example
              row_index: 56
              column_index: 56
              confidence: 3.4
              bounding_polygon:
                # required
                normalized_vertices:
                - # required
                  x: 3.4
                  y: 3.4
              word_indexes: [ "word_indexes_example" ]
          confidence: 3.4
          bounding_polygon:
            # required
            normalized_vertices:
            - # required
              x: 3.4
              y: 3.4
        document_fields:
        - # required
          field_type: LINE_ITEM_GROUP
          field_value:
            # required
            value: value_example
            value_type: TIME
            confidence: 3.4
            bounding_polygon:
              # required
              normalized_vertices:
              - # required
                x: 3.4
                y: 3.4
            word_indexes: [ "word_indexes_example" ]
            # optional
            text: text_example
          # optional
          field_label:
            # required
            name: name_example
            # optional
            confidence: 3.4
          field_name:
            # required
            name: name_example
            # optional
            confidence: 3.4
            bounding_polygon:
              # required
              normalized_vertices:
              - # required
                x: 3.4
                y: 3.4
            word_indexes: [ "word_indexes_example" ]
      # optional
      detected_document_types:
      - # required
        document_type: document_type_example
        confidence: 3.4
        # optional
        document_id: "ocid1.document.oc1..xxxxxxEXAMPLExxxxxx"
      detected_languages:
      - # required
        language: language_example
        confidence: 3.4
      document_classification_model_version: document_classification_model_version_example
      language_classification_model_version: language_classification_model_version_example
      text_extraction_model_version: text_extraction_model_version_example
      key_value_extraction_model_version: key_value_extraction_model_version_example
      table_extraction_model_version: table_extraction_model_version_example
      errors:
      - # required
        code: code_example
        message: message_example
      searchable_pdf: searchable_pdf_example
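When source is INLINE, the data option expects the raw document encoded as Base64. One way to obtain that value is the ansible.builtin.slurp module, which returns file content already Base64-encoded in its content field. The following is a minimal sketch under that approach; the file path and the registered variable names are hypothetical, and the compartment OCID is a placeholder.

```yaml
# Read the document from the managed host; slurp returns it Base64-encoded.
# /tmp/invoice.pdf is a hypothetical path.
- name: Read the document as Base64
  ansible.builtin.slurp:
    src: /tmp/invoice.pdf
  register: invoice_file

# Pass the Base64 content inline and request text extraction only.
- name: Extract text from the inline document
  oracle.oci.oci_ai_document_analyze_document_result_actions:
    action: analyze_document
    compartment_id: "ocid1.compartment.oc1..xxxxxxEXAMPLExxxxxx"
    features:
      - feature_type: TEXT_EXTRACTION
        generate_searchable_pdf: false
    document:
      source: INLINE
      data: "{{ invoice_file.content }}"
  register: analysis
```

Because slurp loads the whole file into memory and the encoded data travels in the request payload, this pattern is probably best kept for small documents; larger files are likely better referenced with source: OBJECT_STORAGE.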
Authors
Oracle (@oracle)