oracle.oci.oci_ai_document_analyze_document_result_actions – Perform actions on an AnalyzeDocumentResult resource in Oracle Cloud Infrastructure

Note

This plugin is part of the oracle.oci collection (version 5.0.0).

You might already have this collection installed if you are using the ansible package. It is not included in ansible-core. To check whether it is installed, run ansible-galaxy collection list.

To install it, use: ansible-galaxy collection install oracle.oci.

To use it in a playbook, specify: oracle.oci.oci_ai_document_analyze_document_result_actions.
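
For repeatable installs, the collection named above can also be pinned in a requirements file. This is a minimal sketch: the file name requirements.yml is a common convention assumed here, and the pinned version simply mirrors the version this page documents.

```yaml
# requirements.yml (assumed file name; any path can be passed to ansible-galaxy)
collections:
  - name: oracle.oci
    version: "5.0.0"  # the version this page documents; adjust as needed
```

It could then be installed with ansible-galaxy collection install -r requirements.yml.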

New in version 2.9.0 of oracle.oci

Synopsis

  • Perform actions on an AnalyzeDocumentResult resource in Oracle Cloud Infrastructure

  • For action=analyze_document, perform different types of document analysis.

Requirements

The below requirements are needed on the host that executes this module.

Parameters

Parameter Choices/Defaults Comments
action
string / required
    Choices:
  • analyze_document
The action to perform on the AnalyzeDocumentResult.
api_user
string
The OCID of the user on whose behalf OCI APIs are invoked. If not set, then the value of the OCI_USER_ID environment variable, if any, is used. This option is required if the user is not specified through a configuration file (See config_file_location). To get the user's OCID, refer to https://docs.us-phoenix-1.oraclecloud.com/Content/API/Concepts/apisigningkey.htm.
api_user_fingerprint
string
Fingerprint for the key pair being used. If not set, then the value of the OCI_USER_FINGERPRINT environment variable, if any, is used. This option is required if the key fingerprint is not specified through a configuration file (See config_file_location). To get the key pair's fingerprint value, refer to https://docs.us-phoenix-1.oraclecloud.com/Content/API/Concepts/apisigningkey.htm.
api_user_key_file
string
Full path and filename of the private key (in PEM format). If not set, then the value of the OCI_USER_KEY_FILE variable, if any, is used. This option is required if the private key is not specified through a configuration file (See config_file_location). If the key is encrypted with a pass-phrase, the api_user_key_pass_phrase option must also be provided.
api_user_key_pass_phrase
string
Passphrase used by the key referenced in api_user_key_file, if it is encrypted. If not set, then the value of the OCI_USER_KEY_PASS_PHRASE variable, if any, is used. This option is required if the key passphrase is not specified through a configuration file (See config_file_location).
auth_purpose
string
    Choices:
  • service_principal
The auth purpose which can be used in conjunction with 'auth_type=instance_principal'. The default auth_purpose for instance_principal is None.
auth_type
string
    Choices:
  • api_key ←
  • instance_principal
  • instance_obo_user
  • resource_principal
  • security_token
The type of authentication to use for making API requests. By default auth_type="api_key" based authentication is performed and the API key (see api_user_key_file) in your config file will be used. If this 'auth_type' module option is not specified, the value of the OCI_ANSIBLE_AUTH_TYPE, if any, is used. Use auth_type="instance_principal" to use instance principal based authentication when running ansible playbooks within an OCI compute instance.
cert_bundle
string
The full path to a CA certificate bundle to be used for SSL verification. This will override the default CA certificate bundle. If not set, then the value of the OCI_ANSIBLE_CERT_BUNDLE variable, if any, is used.
compartment_id
string
The compartment identifier.
config_file_location
string
Path to configuration file. If not set then the value of the OCI_CONFIG_FILE environment variable, if any, is used. Otherwise, defaults to ~/.oci/config.
config_profile_name
string
The profile to load from the config file referenced by config_file_location. If not set, then the value of the OCI_CONFIG_PROFILE environment variable, if any, is used. Otherwise, defaults to the "DEFAULT" profile in config_file_location.
document
dictionary / required
bucket_name
string
The Object Storage bucket name.
Required when source is 'OBJECT_STORAGE'
data
string
Raw document data with Base64 encoding.
Required when source is 'INLINE'
namespace_name
string
The Object Storage namespace.
Required when source is 'OBJECT_STORAGE'
object_name
string
The Object Storage object name.
Required when source is 'OBJECT_STORAGE'
source
string / required
    Choices:
  • OBJECT_STORAGE
  • INLINE
The location of the document data. The allowed values are:
  • `INLINE`: The data is included directly in the request payload.
  • `OBJECT_STORAGE`: The document is in OCI Object Storage.
document_type
string
    Choices:
  • INVOICE
  • RECEIPT
  • RESUME
  • TAX_FORM
  • DRIVER_LICENSE
  • PASSPORT
  • BANK_STATEMENT
  • CHECK
  • PAYSLIP
  • OTHERS
The document type.
features
list / elements=dictionary / required
The types of document analysis requested.
feature_type
string / required
    Choices:
  • DOCUMENT_CLASSIFICATION
  • KEY_VALUE_EXTRACTION
  • LANGUAGE_CLASSIFICATION
  • TEXT_EXTRACTION
  • TABLE_EXTRACTION
The type of document analysis requested. The allowed values are:
  • `LANGUAGE_CLASSIFICATION`: Detect the language.
  • `TEXT_EXTRACTION`: Recognize text.
  • `TABLE_EXTRACTION`: Detect and extract data in tables.
  • `KEY_VALUE_EXTRACTION`: Extract form fields.
  • `DOCUMENT_CLASSIFICATION`: Identify the type of document.
generate_searchable_pdf
boolean
    Choices:
  • no
  • yes
Whether or not to generate a searchable PDF file.
Applicable when feature_type is 'TEXT_EXTRACTION'
max_results
integer
The maximum number of results to return.
Applicable when feature_type is one of ['DOCUMENT_CLASSIFICATION', 'LANGUAGE_CLASSIFICATION']
model_id
string
The custom model ID.
Applicable when feature_type is one of ['KEY_VALUE_EXTRACTION', 'DOCUMENT_CLASSIFICATION']
tenancy_id
string
The custom model tenancy ID when modelId represents aliasName.
Applicable when feature_type is one of ['KEY_VALUE_EXTRACTION', 'DOCUMENT_CLASSIFICATION']
language
string
The document language, abbreviated according to the BCP 47 syntax.
ocr_data
dictionary
detected_document_types
list / elements=dictionary
An array of detected document types.
confidence
float / required
The confidence score between 0 and 1.
document_id
string
The OCID of the Key-Value Extraction model that was used to extract the key-value pairs.
document_type
string / required
The document type.
detected_languages
list / elements=dictionary
An array of detected languages.
confidence
float / required
The confidence score between 0 and 1.
language
string / required
The document language, abbreviated according to the BCP 47 syntax.
document_classification_model_version
string
The document classification model version.
document_metadata
dictionary / required
mime_type
string / required
The result data format.
page_count
integer / required
The number of pages in the document.
errors
list / elements=dictionary
The errors encountered during document analysis.
code
string / required
The error code.
message
string / required
The error message.
key_value_extraction_model_version
string
The document keyValue extraction model version.
language_classification_model_version
string
The document language classification model version.
pages
list / elements=dictionary / required
The array of pages.
detected_document_types
list / elements=dictionary
An array of detected document types.
confidence
float / required
The confidence score between 0 and 1.
document_id
string
The OCID of the Key-Value Extraction model that was used to extract the key-value pairs.
document_type
string / required
The document type.
detected_languages
list / elements=dictionary
An array of detected languages.
confidence
float / required
The confidence score between 0 and 1.
language
string / required
The document language, abbreviated according to the BCP 47 syntax.
dimensions
dictionary
height
float / required
The height of a page.
unit
string / required
    Choices:
  • PIXEL
  • INCH
The unit of length.
width
float / required
The width of a page.
document_fields
list / elements=dictionary
The form fields detected on the page.
field_label
dictionary
confidence
float
The confidence score between 0 and 1.
name
string / required
The name of the field label.
field_name
dictionary
bounding_polygon
dictionary
normalized_vertices
list / elements=dictionary / required
An array of normalized points defining the polygon's perimeter, with an implicit segment between subsequent points and between the first and last point. Rectangles are defined with four points. For example, `[{"x": 0, "y": 0}, {"x": 1, "y": 0}, {"x": 1, "y": 0.5}, {"x": 0, "y": 0.5}]` represents the top half of an image.
x
float / required
The X-axis normalized coordinate.
y
float / required
The Y-axis normalized coordinate.
confidence
float
The confidence score between 0 and 1.
name
string / required
The name of the field.
word_indexes
list / elements=integer
The indexes of the words in the field name.
field_type
string / required
    Choices:
  • LINE_ITEM_GROUP
  • LINE_ITEM
  • LINE_ITEM_FIELD
  • KEY_VALUE
The field type.
field_value
dictionary / required
bounding_polygon
dictionary / required
normalized_vertices
list / elements=dictionary / required
An array of normalized points defining the polygon's perimeter, with an implicit segment between subsequent points and between the first and last point. Rectangles are defined with four points. For example, `[{"x": 0, "y": 0}, {"x": 1, "y": 0}, {"x": 1, "y": 0.5}, {"x": 0, "y": 0.5}]` represents the top half of an image.
Required when value_type is 'TIME'
x
float / required
The X-axis normalized coordinate.
Required when value_type is 'TIME'
y
float / required
The Y-axis normalized coordinate.
Required when value_type is 'TIME'
confidence
float / required
The confidence score between 0 and 1.
items
list / elements=dictionary
The array of values.
Required when value_type is 'ARRAY'
field_label
dictionary
field_name
dictionary
field_type
string / required
    Choices:
  • LINE_ITEM_GROUP
  • LINE_ITEM
  • LINE_ITEM_FIELD
  • KEY_VALUE
The field type.
field_value
dictionary / required
text
string
The detected text of a field.
value
string
The time field value as yyyy-mm-dd hh-mm-ss.
Required when value_type is one of ['DATE', 'NUMBER', 'STRING', 'TIME', 'PHONE_NUMBER', 'INTEGER']
value_type
string / required
    Choices:
  • TIME
  • INTEGER
  • DATE
  • NUMBER
  • STRING
  • PHONE_NUMBER
  • ARRAY
The type of data detected.
word_indexes
list / elements=integer / required
The indexes of the words in the field value.
lines
list / elements=dictionary
The lines of text detected on the page.
bounding_polygon
dictionary / required
normalized_vertices
list / elements=dictionary / required
An array of normalized points defining the polygon's perimeter, with an implicit segment between subsequent points and between the first and last point. Rectangles are defined with four points. For example, `[{"x": 0, "y": 0}, {"x": 1, "y": 0}, {"x": 1, "y": 0.5}, {"x": 0, "y": 0.5}]` represents the top half of an image.
x
float / required
The X-axis normalized coordinate.
y
float / required
The Y-axis normalized coordinate.
confidence
float / required
The confidence score between 0 and 1.
text
string / required
The text recognized.
word_indexes
list / elements=integer / required
The array of words.
page_number
integer / required
The document page number.
tables
list / elements=dictionary
The tables detected on the page.
body_rows
list / elements=dictionary / required
The body rows.
cells
list / elements=dictionary / required
The cells in the row.
bounding_polygon
dictionary / required
normalized_vertices
list / elements=dictionary / required
An array of normalized points defining the polygon's perimeter, with an implicit segment between subsequent points and between the first and last point. Rectangles are defined with four points. For example, `[{"x": 0, "y": 0}, {"x": 1, "y": 0}, {"x": 1, "y": 0.5}, {"x": 0, "y": 0.5}]` represents the top half of an image.
x
float / required
The X-axis normalized coordinate.
y
float / required
The Y-axis normalized coordinate.
column_index
integer / required
The index of the cell inside the column.
confidence
float / required
The confidence score between 0 and 1.
row_index
integer / required
The index of the cell inside the row.
text
string / required
The text recognized in the cell.
word_indexes
list / elements=integer / required
The words detected in the cell.
bounding_polygon
dictionary / required
normalized_vertices
list / elements=dictionary / required
An array of normalized points defining the polygon's perimeter, with an implicit segment between subsequent points and between the first and last point. Rectangles are defined with four points. For example, `[{"x": 0, "y": 0}, {"x": 1, "y": 0}, {"x": 1, "y": 0.5}, {"x": 0, "y": 0.5}]` represents the top half of an image.
x
float / required
The X-axis normalized coordinate.
y
float / required
The Y-axis normalized coordinate.
column_count
integer / required
The number of columns.
confidence
float / required
The confidence score between 0 and 1.
footer_rows
list / elements=dictionary / required
The footer rows.
cells
list / elements=dictionary / required
The cells in the row.
bounding_polygon
dictionary / required
normalized_vertices
list / elements=dictionary / required
An array of normalized points defining the polygon's perimeter, with an implicit segment between subsequent points and between the first and last point. Rectangles are defined with four points. For example, `[{"x": 0, "y": 0}, {"x": 1, "y": 0}, {"x": 1, "y": 0.5}, {"x": 0, "y": 0.5}]` represents the top half of an image.
x
float / required
The X-axis normalized coordinate.
y
float / required
The Y-axis normalized coordinate.
column_index
integer / required
The index of the cell inside the column.
confidence
float / required
The confidence score between 0 and 1.
row_index
integer / required
The index of the cell inside the row.
text
string / required
The text recognized in the cell.
word_indexes
list / elements=integer / required
The words detected in the cell.
header_rows
list / elements=dictionary / required
The header rows.
cells
list / elements=dictionary / required
The cells in the row.
bounding_polygon
dictionary / required
normalized_vertices
list / elements=dictionary / required
An array of normalized points defining the polygon's perimeter, with an implicit segment between subsequent points and between the first and last point. Rectangles are defined with four points. For example, `[{"x": 0, "y": 0}, {"x": 1, "y": 0}, {"x": 1, "y": 0.5}, {"x": 0, "y": 0.5}]` represents the top half of an image.
x
float / required
The X-axis normalized coordinate.
y
float / required
The Y-axis normalized coordinate.
column_index
integer / required
The index of the cell inside the column.
confidence
float / required
The confidence score between 0 and 1.
row_index
integer / required
The index of the cell inside the row.
text
string / required
The text recognized in the cell.
word_indexes
list / elements=integer / required
The words detected in the cell.
row_count
integer / required
The number of rows.
words
list / elements=dictionary
The words detected on the page.
bounding_polygon
dictionary / required
normalized_vertices
list / elements=dictionary / required
An array of normalized points defining the polygon's perimeter, with an implicit segment between subsequent points and between the first and last point. Rectangles are defined with four points. For example, `[{"x": 0, "y": 0}, {"x": 1, "y": 0}, {"x": 1, "y": 0.5}, {"x": 0, "y": 0.5}]` represents the top half of an image.
x
float / required
The X-axis normalized coordinate.
y
float / required
The Y-axis normalized coordinate.
confidence
float / required
The confidence score between 0 and 1.
text
string / required
The string of text characters in the word.
searchable_pdf
string
The searchable PDF file that was generated.
table_extraction_model_version
string
The document table extraction model version.
text_extraction_model_version
string
The document text extraction model version.
output_location
dictionary
bucket_name
string / required
The Object Storage bucket name.
namespace_name
string / required
The Object Storage namespace.
prefix
string / required
The Object Storage folder name.
realm_specific_endpoint_template_enabled
boolean
    Choices:
  • no
  • yes
Enable or disable the realm-specific endpoint template for the service client. By default, the realm-specific endpoint template is disabled. If not set, then the value of the OCI_REALM_SPECIFIC_SERVICE_ENDPOINT_TEMPLATE_ENABLED variable, if any, is used.
region
string
The Oracle Cloud Infrastructure region to use for all OCI API requests. If not set, then the value of the OCI_REGION variable, if any, is used. This option is required if the region is not specified through a configuration file (See config_file_location). Please refer to https://docs.us-phoenix-1.oraclecloud.com/Content/General/Concepts/regions.htm for more information on OCI regions.
tenancy
string
OCID of your tenancy. If not set, then the value of the OCI_TENANCY variable, if any, is used. This option is required if the tenancy OCID is not specified through a configuration file (See config_file_location). To get the tenancy OCID, refer to https://docs.us-phoenix-1.oraclecloud.com/Content/API/Concepts/apisigningkey.htm.
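
Several of the options under features apply only to particular feature_type values. The sketch below requests three analyses in one task and sets a feature-specific option only where the parameter notes allow it. It assumes instance principal authentication, and the OCID, namespace, bucket, and object names are placeholders rather than real resources.

```yaml
- name: Analyze a stored document with several features (illustrative sketch)
  oracle.oci.oci_ai_document_analyze_document_result_actions:
    action: analyze_document
    auth_type: instance_principal  # assumes the playbook runs on an OCI compute instance
    compartment_id: "ocid1.compartment.oc1..xxxxxxEXAMPLExxxxxx"  # placeholder OCID
    document:
      source: OBJECT_STORAGE
      namespace_name: namespace_name_example  # placeholder
      bucket_name: bucket_name_example        # placeholder
      object_name: object_name_example        # placeholder
    features:
      - feature_type: TEXT_EXTRACTION
        generate_searchable_pdf: true  # applicable when feature_type is TEXT_EXTRACTION
      - feature_type: LANGUAGE_CLASSIFICATION
        max_results: 3                 # applicable to the classification feature types
      - feature_type: TABLE_EXTRACTION
```

With auth_type set to instance_principal, the api_user, api_user_fingerprint, and api_user_key_file options are left out, since they belong to API key authentication.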

Examples

- name: Perform action analyze_document on analyze_document_result
  oracle.oci.oci_ai_document_analyze_document_result_actions:
    # required
    features:
    - # required
      feature_type: DOCUMENT_CLASSIFICATION

      # optional
      model_id: "ocid1.model.oc1..xxxxxxEXAMPLExxxxxx"
      tenancy_id: "ocid1.tenancy.oc1..xxxxxxEXAMPLExxxxxx"
      max_results: 56
    document:
      # required
      namespace_name: namespace_name_example
      bucket_name: bucket_name_example
      object_name: object_name_example
      source: OBJECT_STORAGE
    action: analyze_document

    # optional
    compartment_id: "ocid1.compartment.oc1..xxxxxxEXAMPLExxxxxx"
    output_location:
      # required
      namespace_name: namespace_name_example
      bucket_name: bucket_name_example
      prefix: prefix_example
    language: language_example
    document_type: INVOICE
    ocr_data:
      # required
      document_metadata:
        # required
        page_count: 56
        mime_type: mime_type_example
      pages:
      - # required
        page_number: 56

        # optional
        dimensions:
          # required
          width: 3.4
          height: 3.4
          unit: PIXEL
        detected_document_types:
        - # required
          document_type: document_type_example
          confidence: 3.4

          # optional
          document_id: "ocid1.document.oc1..xxxxxxEXAMPLExxxxxx"
        detected_languages:
        - # required
          language: language_example
          confidence: 3.4
        words:
        - # required
          text: text_example
          confidence: 3.4
          bounding_polygon:
            # required
            normalized_vertices:
            - # required
              x: 3.4
              y: 3.4
        lines:
        - # required
          text: text_example
          confidence: 3.4
          bounding_polygon:
            # required
            normalized_vertices:
            - # required
              x: 3.4
              y: 3.4
          word_indexes: [ 56 ]
        tables:
        - # required
          row_count: 56
          column_count: 56
          header_rows:
          - # required
            cells:
            - # required
              text: text_example
              row_index: 56
              column_index: 56
              confidence: 3.4
              bounding_polygon:
                # required
                normalized_vertices:
                - # required
                  x: 3.4
                  y: 3.4
              word_indexes: [ 56 ]
          body_rows:
          - # required
            cells:
            - # required
              text: text_example
              row_index: 56
              column_index: 56
              confidence: 3.4
              bounding_polygon:
                # required
                normalized_vertices:
                - # required
                  x: 3.4
                  y: 3.4
              word_indexes: [ 56 ]
          footer_rows:
          - # required
            cells:
            - # required
              text: text_example
              row_index: 56
              column_index: 56
              confidence: 3.4
              bounding_polygon:
                # required
                normalized_vertices:
                - # required
                  x: 3.4
                  y: 3.4
              word_indexes: [ 56 ]
          confidence: 3.4
          bounding_polygon:
            # required
            normalized_vertices:
            - # required
              x: 3.4
              y: 3.4
        document_fields:
        - # required
          field_type: LINE_ITEM_GROUP
          field_value:
            # required
            value: value_example
            value_type: TIME
            confidence: 3.4
            bounding_polygon:
              # required
              normalized_vertices:
              - # required
                x: 3.4
                y: 3.4
            word_indexes: [ 56 ]

            # optional
            text: text_example

          # optional
          field_label:
            # required
            name: name_example

            # optional
            confidence: 3.4
          field_name:
            # required
            name: name_example

            # optional
            confidence: 3.4
            bounding_polygon:
              # required
              normalized_vertices:
              - # required
                x: 3.4
                y: 3.4
            word_indexes: [ 56 ]
      detected_document_types:
      - # required
        document_type: document_type_example
        confidence: 3.4

        # optional
        document_id: "ocid1.document.oc1..xxxxxxEXAMPLExxxxxx"
      detected_languages:
      - # required
        language: language_example
        confidence: 3.4
      document_classification_model_version: document_classification_model_version_example
      language_classification_model_version: language_classification_model_version_example
      text_extraction_model_version: text_extraction_model_version_example
      key_value_extraction_model_version: key_value_extraction_model_version_example
      table_extraction_model_version: table_extraction_model_version_example
      errors:
      - # required
        code: code_example
        message: message_example
      searchable_pdf: searchable_pdf_example
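
The example above uses Object Storage and lists every optional field. For the INLINE source, which expects Base64 data, a much shorter task may be enough. The following is a sketch under stated assumptions: the file path is hypothetical, ansible.builtin.slurp is used because it returns file content already Base64-encoded, and the result is assumed to be returned under analyze_document_result with a pages list, as the return paths in the parser errors at the end of this page suggest.

```yaml
- name: Read a document from the managed host as Base64
  ansible.builtin.slurp:
    src: /tmp/invoice.pdf  # hypothetical path
  register: invoice_file

- name: Extract text from the inline document
  oracle.oci.oci_ai_document_analyze_document_result_actions:
    action: analyze_document
    compartment_id: "ocid1.compartment.oc1..xxxxxxEXAMPLExxxxxx"  # placeholder OCID
    document:
      source: INLINE
      data: "{{ invoice_file.content }}"  # slurp returns Base64-encoded content
    features:
      - feature_type: TEXT_EXTRACTION
  register: analysis

- name: Report how many pages were analyzed
  ansible.builtin.debug:
    # Assumes the return key is analyze_document_result and that it contains pages
    msg: "Pages analyzed: {{ analysis.analyze_document_result.pages | length }}"
```

Registering the result keeps the analysis available to later tasks, for example to read recognized lines or table cells from each page.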

Authors

  • Oracle (@oracle)

There were some errors parsing the documentation for this plugin. Please file a bug with the collection.

The errors were:

  • Unable to normalize oci_ai_document_analyze_document_result_actions: return due to: 3 validation errors for PluginReturnSchema
    return -> analyze_document_result -> contains -> pages -> contains -> document_fields -> contains -> field_value -> contains -> items -> contains -> field_label -> type
      string does not match regex "^(any|bits|bool|bytes|complex|dict|float|int|json|jsonarg|list|path|sid|str|pathspec|pathlist)$" (type=value_error.str.regex; pattern=^(any|bits|bool|bytes|complex|dict|float|int|json|jsonarg|list|path|sid|str|pathspec|pathlist)$)
    return -> analyze_document_result -> contains -> pages -> contains -> document_fields -> contains -> field_value -> contains -> items -> contains -> field_name -> type
      string does not match regex "^(any|bits|bool|bytes|complex|dict|float|int|json|jsonarg|list|path|sid|str|pathspec|pathlist)$" (type=value_error.str.regex; pattern=^(any|bits|bool|bytes|complex|dict|float|int|json|jsonarg|list|path|sid|str|pathspec|pathlist)$)
    return -> analyze_document_result -> contains -> pages -> contains -> document_fields -> contains -> field_value -> contains -> items -> contains -> field_value -> type
      string does not match regex "^(any|bits|bool|bytes|complex|dict|float|int|json|jsonarg|list|path|sid|str|pathspec|pathlist)$" (type=value_error.str.regex; pattern=^(any|bits|bool|bytes|complex|dict|float|int|json|jsonarg|list|path|sid|str|pathspec|pathlist)$)