oracle.oci.oci_ai_document_analyze_document_result_actions – Perform actions on an AnalyzeDocumentResult resource in Oracle Cloud Infrastructure

Note

This plugin is part of the oracle.oci collection (version 5.0.0).

You might already have this collection installed if you are using the ansible package. It is not included in ansible-core. To check whether it is installed, run ansible-galaxy collection list.

To install it, use: ansible-galaxy collection install oracle.oci.

To use it in a playbook, specify: oracle.oci.oci_ai_document_analyze_document_result_actions.

New in version 2.9.0 of oracle.oci
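As an alternative to installing the collection ad hoc, it can be declared in a requirements file. The sketch below assumes a file named requirements.yml (a common convention, not something this page prescribes) and pins the collection to the version documented here.

```yaml
# requirements.yml (assumed file name)
collections:
  - name: oracle.oci
    version: "5.0.0"   # the version this page documents; adjust as needed
```

It would then be installed with: ansible-galaxy collection install -r requirements.yml.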

Synopsis
Requirements
Parameters
Notes
Examples

Synopsis

Perform actions on an AnalyzeDocumentResult resource in Oracle Cloud Infrastructure.
For action=analyze_document, perform different types of document analysis.

Requirements

The following requirements must be met on the host that executes this module.

Python >= 3.6
Python SDK for Oracle Cloud Infrastructure: https://oracle-cloud-infrastructure-python-sdk.readthedocs.io
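One way to satisfy the SDK requirement from a playbook is to install it with pip before the module runs. This is a minimal sketch, assuming pip is available for the interpreter Ansible uses on the executing host; the SDK is published on PyPI as oci.

```yaml
- name: Ensure the OCI Python SDK is present on the executing host
  ansible.builtin.pip:
    name: oci
    state: present
```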

Parameters

Parameter								Choices/Defaults	Comments
action string / required								Choices: analyze_document	The action to perform on the AnalyzeDocumentResult.
api_user string									The OCID of the user on whose behalf OCI APIs are invoked. If not set, then the value of the OCI_USER_ID environment variable, if any, is used. This option is required if the user is not specified through a configuration file (See `config_file_location`). To get the user's OCID, refer to https://docs.us-phoenix-1.oraclecloud.com/Content/API/Concepts/apisigningkey.htm.
api_user_fingerprint string									Fingerprint for the key pair being used. If not set, then the value of the OCI_USER_FINGERPRINT environment variable, if any, is used. This option is required if the key fingerprint is not specified through a configuration file (See `config_file_location`). To get the key pair's fingerprint value, refer to https://docs.us-phoenix-1.oraclecloud.com/Content/API/Concepts/apisigningkey.htm.
api_user_key_file string									Full path and filename of the private key (in PEM format). If not set, then the value of the OCI_USER_KEY_FILE variable, if any, is used. This option is required if the private key is not specified through a configuration file (See `config_file_location`). If the key is encrypted with a pass-phrase, the `api_user_key_pass_phrase` option must also be provided.
api_user_key_pass_phrase string									Passphrase used by the key referenced in `api_user_key_file`, if it is encrypted. If not set, then the value of the OCI_USER_KEY_PASS_PHRASE variable, if any, is used. This option is required if the key passphrase is not specified through a configuration file (See `config_file_location`).
auth_purpose string								Choices: service_principal	The auth purpose which can be used in conjunction with 'auth_type=instance_principal'. The default auth_purpose for instance_principal is None.
auth_type string								Choices: api_key ← instance_principal instance_obo_user resource_principal security_token	The type of authentication to use for making API requests. By default, `auth_type="api_key"` based authentication is performed and the API key (see api_user_key_file) in your config file will be used. If this 'auth_type' module option is not specified, the value of the OCI_ANSIBLE_AUTH_TYPE variable, if any, is used. Use `auth_type="instance_principal"` to use instance principal based authentication when running Ansible playbooks within an OCI compute instance.
cert_bundle string									The full path to a CA certificate bundle to be used for SSL verification. This will override the default CA certificate bundle. If not set, then the value of the OCI_ANSIBLE_CERT_BUNDLE variable, if any, is used.
compartment_id string									The compartment identifier.
config_file_location string									Path to the configuration file. If not set, then the value of the OCI_CONFIG_FILE environment variable, if any, is used. Otherwise, defaults to ~/.oci/config.
config_profile_name string									The profile to load from the config file referenced by `config_file_location`. If not set, then the value of the OCI_CONFIG_PROFILE environment variable, if any, is used. Otherwise, defaults to the "DEFAULT" profile in `config_file_location`.
document dictionary / required
	bucket_name string								The Object Storage bucket name. Required when source is 'OBJECT_STORAGE'
	data string								Raw document data with Base64 encoding. Required when source is 'INLINE'
	namespace_name string								The Object Storage namespace. Required when source is 'OBJECT_STORAGE'
	object_name string								The Object Storage object name. Required when source is 'OBJECT_STORAGE'
	source string / required							Choices: OBJECT_STORAGE INLINE	The location of the document data. The allowed values are: - `INLINE`: The data is included directly in the request payload. - `OBJECT_STORAGE`: The document is in OCI Object Storage.
document_type string								Choices: INVOICE RECEIPT RESUME TAX_FORM DRIVER_LICENSE PASSPORT BANK_STATEMENT CHECK PAYSLIP OTHERS	The document type.
features list / elements=dictionary / required									The types of document analysis requested.
	feature_type string / required							Choices: DOCUMENT_CLASSIFICATION KEY_VALUE_EXTRACTION LANGUAGE_CLASSIFICATION TEXT_EXTRACTION TABLE_EXTRACTION	The type of document analysis requested. The allowed values are: - `LANGUAGE_CLASSIFICATION`: Detect the language. - `TEXT_EXTRACTION`: Recognize text. - `TABLE_EXTRACTION`: Detect and extract data in tables. - `KEY_VALUE_EXTRACTION`: Extract form fields. - `DOCUMENT_CLASSIFICATION`: Identify the type of document.
	generate_searchable_pdf boolean							Choices: no yes	Whether or not to generate a searchable PDF file. Applicable when feature_type is 'TEXT_EXTRACTION'
	max_results integer								The maximum number of results to return. Applicable when feature_type is one of ['DOCUMENT_CLASSIFICATION', 'LANGUAGE_CLASSIFICATION']
	model_id string								The custom model ID. Applicable when feature_type is one of ['KEY_VALUE_EXTRACTION', 'DOCUMENT_CLASSIFICATION']
	tenancy_id string								The custom model tenancy ID when modelId represents aliasName. Applicable when feature_type is one of ['KEY_VALUE_EXTRACTION', 'DOCUMENT_CLASSIFICATION']
language string									The document language, abbreviated according to the BCP 47 syntax.
ocr_data dictionary
	detected_document_types list / elements=dictionary								An array of detected document types.
		confidence float / required							The confidence score between 0 and 1.
		document_id string							The OCID of the Key-Value Extraction model that was used to extract the key-value pairs.
		document_type string / required							The document type.
	detected_languages list / elements=dictionary								An array of detected languages.
		confidence float / required							The confidence score between 0 and 1.
		language string / required							The document language, abbreviated according to the BCP 47 syntax.
	document_classification_model_version string								The document classification model version.
	document_metadata dictionary / required
		mime_type string / required							The result data format.
		page_count integer / required							The number of pages in the document.
	errors list / elements=dictionary								The errors encountered during document analysis.
		code string / required							The error code.
		message string / required							The error message.
	key_value_extraction_model_version string								The document key-value extraction model version.
	language_classification_model_version string								The document language classification model version.
	pages list / elements=dictionary / required								The array of pages.
		detected_document_types list / elements=dictionary							An array of detected document types.
			confidence float / required						The confidence score between 0 and 1.
			document_id string						The OCID of the Key-Value Extraction model that was used to extract the key-value pairs.
			document_type string / required						The document type.
		detected_languages list / elements=dictionary							An array of detected languages.
			confidence float / required						The confidence score between 0 and 1.
			language string / required						The document language, abbreviated according to the BCP 47 syntax.
		dimensions dictionary
			height float / required						The height of a page.
			unit string / required					Choices: PIXEL INCH	The unit of length.
			width float / required						The width of a page.
		document_fields list / elements=dictionary							The form fields detected on the page.
			field_label dictionary
				confidence float					The confidence score between 0 and 1.
				name string / required					The name of the field label.
			field_name dictionary
				bounding_polygon dictionary
					normalized_vertices list / elements=dictionary / required				An array of normalized points defining the polygon's perimeter, with an implicit segment between subsequent points and between the first and last point. Rectangles are defined with four points. For example, `[{"x": 0, "y": 0}, {"x": 1, "y": 0}, {"x": 1, "y": 0.5}, {"x": 0, "y": 0.5}]` represents the top half of an image.
						x float / required			The X-axis normalized coordinate.
						y float / required			The Y-axis normalized coordinate.
				confidence float					The confidence score between 0 and 1.
				name string / required					The name of the field.
				word_indexes list / elements=integer					The indexes of the words in the field name.
			field_type string / required					Choices: LINE_ITEM_GROUP LINE_ITEM LINE_ITEM_FIELD KEY_VALUE	The field type.
			field_value dictionary / required
				bounding_polygon dictionary / required
					normalized_vertices list / elements=dictionary / required				An array of normalized points defining the polygon's perimeter, with an implicit segment between subsequent points and between the first and last point. Rectangles are defined with four points. For example, `[{"x": 0, "y": 0}, {"x": 1, "y": 0}, {"x": 1, "y": 0.5}, {"x": 0, "y": 0.5}]` represents the top half of an image. Required when value_type is 'TIME'
						x float / required			The X-axis normalized coordinate. Required when value_type is 'TIME'
						y float / required			The Y-axis normalized coordinate. Required when value_type is 'TIME'
				confidence float / required					The confidence score between 0 and 1.
				items list / elements=dictionary					The array of values. Required when value_type is 'ARRAY'
					field_label dictionary
					field_name dictionary
					field_type string / required			Choices: LINE_ITEM_GROUP LINE_ITEM LINE_ITEM_FIELD KEY_VALUE	The field type.
					field_value dictionary / required
				text string					The detected text of a field.
				value string					The time field value as yyyy-mm-dd hh-mm-ss. Required when value_type is one of ['DATE', 'NUMBER', 'STRING', 'TIME', 'PHONE_NUMBER', 'INTEGER']
				value_type string / required				Choices: TIME INTEGER DATE NUMBER STRING PHONE_NUMBER ARRAY	The type of data detected.
				word_indexes list / elements=integer / required					The indexes of the words in the field value.
		lines list / elements=dictionary							The lines of text detected on the page.
			bounding_polygon dictionary / required
				normalized_vertices list / elements=dictionary / required					An array of normalized points defining the polygon's perimeter, with an implicit segment between subsequent points and between the first and last point. Rectangles are defined with four points. For example, `[{"x": 0, "y": 0}, {"x": 1, "y": 0}, {"x": 1, "y": 0.5}, {"x": 0, "y": 0.5}]` represents the top half of an image.
					x float / required				The X-axis normalized coordinate.
					y float / required				The Y-axis normalized coordinate.
			confidence float / required						The confidence score between 0 and 1.
			text string / required						The text recognized.
			word_indexes list / elements=integer / required						The array of words.
		page_number integer / required							The document page number.
		tables list / elements=dictionary							The tables detected on the page.
			body_rows list / elements=dictionary / required						The body rows.
				cells list / elements=dictionary / required					The cells in the row.
					bounding_polygon dictionary / required
						normalized_vertices list / elements=dictionary / required			An array of normalized points defining the polygon's perimeter, with an implicit segment between subsequent points and between the first and last point. Rectangles are defined with four points. For example, `[{"x": 0, "y": 0}, {"x": 1, "y": 0}, {"x": 1, "y": 0.5}, {"x": 0, "y": 0.5}]` represents the top half of an image.
							x float / required		The X-axis normalized coordinate.
							y float / required		The Y-axis normalized coordinate.
					column_index integer / required				The index of the cell inside the column.
					confidence float / required				The confidence score between 0 and 1.
					row_index integer / required				The index of the cell inside the row.
					text string / required				The text recognized in the cell.
					word_indexes list / elements=integer / required				The words detected in the cell.
			bounding_polygon dictionary / required
				normalized_vertices list / elements=dictionary / required					An array of normalized points defining the polygon's perimeter, with an implicit segment between subsequent points and between the first and last point. Rectangles are defined with four points. For example, `[{"x": 0, "y": 0}, {"x": 1, "y": 0}, {"x": 1, "y": 0.5}, {"x": 0, "y": 0.5}]` represents the top half of an image.
					x float / required				The X-axis normalized coordinate.
					y float / required				The Y-axis normalized coordinate.
			column_count integer / required						The number of columns.
			confidence float / required						The confidence score between 0 and 1.
			footer_rows list / elements=dictionary / required						The footer rows.
				cells list / elements=dictionary / required					The cells in the row.
					bounding_polygon dictionary / required
						normalized_vertices list / elements=dictionary / required			An array of normalized points defining the polygon's perimeter, with an implicit segment between subsequent points and between the first and last point. Rectangles are defined with four points. For example, `[{"x": 0, "y": 0}, {"x": 1, "y": 0}, {"x": 1, "y": 0.5}, {"x": 0, "y": 0.5}]` represents the top half of an image.
							x float / required		The X-axis normalized coordinate.
							y float / required		The Y-axis normalized coordinate.
					column_index integer / required				The index of the cell inside the column.
					confidence float / required				The confidence score between 0 and 1.
					row_index integer / required				The index of the cell inside the row.
					text string / required				The text recognized in the cell.
					word_indexes list / elements=integer / required				The words detected in the cell.
			header_rows list / elements=dictionary / required						The header rows.
				cells list / elements=dictionary / required					The cells in the row.
					bounding_polygon dictionary / required
						normalized_vertices list / elements=dictionary / required			An array of normalized points defining the polygon's perimeter, with an implicit segment between subsequent points and between the first and last point. Rectangles are defined with four points. For example, `[{"x": 0, "y": 0}, {"x": 1, "y": 0}, {"x": 1, "y": 0.5}, {"x": 0, "y": 0.5}]` represents the top half of an image.
							x float / required		The X-axis normalized coordinate.
							y float / required		The Y-axis normalized coordinate.
					column_index integer / required				The index of the cell inside the column.
					confidence float / required				The confidence score between 0 and 1.
					row_index integer / required				The index of the cell inside the row.
					text string / required				The text recognized in the cell.
					word_indexes list / elements=integer / required				The words detected in the cell.
			row_count integer / required						The number of rows.
		words list / elements=dictionary							The words detected on the page.
			bounding_polygon dictionary / required
				normalized_vertices list / elements=dictionary / required					An array of normalized points defining the polygon's perimeter, with an implicit segment between subsequent points and between the first and last point. Rectangles are defined with four points. For example, `[{"x": 0, "y": 0}, {"x": 1, "y": 0}, {"x": 1, "y": 0.5}, {"x": 0, "y": 0.5}]` represents the top half of an image.
					x float / required				The X-axis normalized coordinate.
					y float / required				The Y-axis normalized coordinate.
			confidence float / required						The confidence score between 0 and 1.
			text string / required						The string of text characters in the word.
	searchable_pdf string								The searchable PDF file that was generated.
	table_extraction_model_version string								The document table extraction model version.
	text_extraction_model_version string								The document text extraction model version.
output_location dictionary
	bucket_name string / required								The Object Storage bucket name.
	namespace_name string / required								The Object Storage namespace.
	prefix string / required								The Object Storage folder name.
realm_specific_endpoint_template_enabled boolean								Choices: no yes	Enable or disable the realm-specific endpoint template for the service client. By default, the realm-specific endpoint template is disabled. If not set, then the value of the OCI_REALM_SPECIFIC_SERVICE_ENDPOINT_TEMPLATE_ENABLED variable, if any, is used.
region string									The Oracle Cloud Infrastructure region to use for all OCI API requests. If not set, then the value of the OCI_REGION variable, if any, is used. This option is required if the region is not specified through a configuration file (See `config_file_location`). Please refer to https://docs.us-phoenix-1.oraclecloud.com/Content/General/Concepts/regions.htm for more information on OCI regions.
tenancy string									OCID of your tenancy. If not set, then the value of the OCI_TENANCY variable, if any, is used. This option is required if the tenancy OCID is not specified through a configuration file (See `config_file_location`). To get the tenancy OCID, refer to https://docs.us-phoenix-1.oraclecloud.com/Content/API/Concepts/apisigningkey.htm.
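The `document` parameter accepts either an Object Storage reference or inline Base64 data. The sketch below illustrates the `INLINE` source using `ansible.builtin.slurp`, which returns file content already Base64-encoded. The file path and the registered variable names are hypothetical, and slurp reads from the managed host, so this assumes the play targets the host that holds the file (for example localhost).

```yaml
- name: Read the document (slurp returns Base64-encoded content)
  ansible.builtin.slurp:
    src: /tmp/invoice.pdf          # hypothetical path
  register: doc_file

- name: Run text extraction on the inline document
  oracle.oci.oci_ai_document_analyze_document_result_actions:
    action: analyze_document
    compartment_id: "ocid1.compartment.oc1..xxxxxxEXAMPLExxxxxx"
    document:
      source: INLINE
      data: "{{ doc_file.content }}"
    features:
      - feature_type: TEXT_EXTRACTION
  register: analysis               # hypothetical variable name
```

With `source: INLINE`, the Object Storage fields (`namespace_name`, `bucket_name`, `object_name`) are omitted, since the table marks them as required only for `OBJECT_STORAGE`.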

Notes

Note

For OCI Python SDK configuration, refer to https://oracle-cloud-infrastructure-python-sdk.readthedocs.io/en/latest/configuration.html
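The authentication options in the parameter table can be combined in a task roughly as follows. This is an illustrative sketch: the profile name, OCIDs, and Object Storage names are placeholders, and the second task assumes the playbook runs on an OCI compute instance that has been granted the necessary permissions.

```yaml
# API-key authentication using a named profile from the SDK config file
- name: Analyze a document with a config-file profile
  oracle.oci.oci_ai_document_analyze_document_result_actions:
    action: analyze_document
    config_file_location: ~/.oci/config
    config_profile_name: DEFAULT
    document:
      source: OBJECT_STORAGE
      namespace_name: namespace_name_example
      bucket_name: bucket_name_example
      object_name: object_name_example
    features:
      - feature_type: LANGUAGE_CLASSIFICATION

# Instance principal authentication (no config file or API key needed)
- name: Analyze a document from within an OCI compute instance
  oracle.oci.oci_ai_document_analyze_document_result_actions:
    action: analyze_document
    auth_type: instance_principal
    region: us-phoenix-1           # placeholder region
    document:
      source: OBJECT_STORAGE
      namespace_name: namespace_name_example
      bucket_name: bucket_name_example
      object_name: object_name_example
    features:
      - feature_type: LANGUAGE_CLASSIFICATION
```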

Examples

- name: Perform action analyze_document on analyze_document_result
  oracle.oci.oci_ai_document_analyze_document_result_actions:
    # required
    features:
    - # required
      feature_type: DOCUMENT_CLASSIFICATION

      # optional
      model_id: "ocid1.model.oc1..xxxxxxEXAMPLExxxxxx"
      tenancy_id: "ocid1.tenancy.oc1..xxxxxxEXAMPLExxxxxx"
      max_results: 56
    document:
      # required
      namespace_name: namespace_name_example
      bucket_name: bucket_name_example
      object_name: object_name_example
      source: OBJECT_STORAGE
    action: analyze_document

    # optional
    compartment_id: "ocid1.compartment.oc1..xxxxxxEXAMPLExxxxxx"
    output_location:
      # required
      namespace_name: namespace_name_example
      bucket_name: bucket_name_example
      prefix: prefix_example
    language: language_example
    document_type: INVOICE
    ocr_data:
      # required
      document_metadata:
        # required
        page_count: 56
        mime_type: mime_type_example
      pages:
      - # required
        page_number: 56

        # optional
        dimensions:
          # required
          width: 3.4
          height: 3.4
          unit: PIXEL
        detected_document_types:
        - # required
          document_type: document_type_example
          confidence: 3.4

          # optional
          document_id: "ocid1.document.oc1..xxxxxxEXAMPLExxxxxx"
        detected_languages:
        - # required
          language: language_example
          confidence: 3.4
        words:
        - # required
          text: text_example
          confidence: 3.4
          bounding_polygon:
            # required
            normalized_vertices:
            - # required
              x: 3.4
              y: 3.4
        lines:
        - # required
          text: text_example
          confidence: 3.4
          bounding_polygon:
            # required
            normalized_vertices:
            - # required
              x: 3.4
              y: 3.4
          word_indexes: [ 56 ]
        tables:
        - # required
          row_count: 56
          column_count: 56
          header_rows:
          - # required
            cells:
            - # required
              text: text_example
              row_index: 56
              column_index: 56
              confidence: 3.4
              bounding_polygon:
                # required
                normalized_vertices:
                - # required
                  x: 3.4
                  y: 3.4
              word_indexes: [ 56 ]
          body_rows:
          - # required
            cells:
            - # required
              text: text_example
              row_index: 56
              column_index: 56
              confidence: 3.4
              bounding_polygon:
                # required
                normalized_vertices:
                - # required
                  x: 3.4
                  y: 3.4
              word_indexes: [ 56 ]
          footer_rows:
          - # required
            cells:
            - # required
              text: text_example
              row_index: 56
              column_index: 56
              confidence: 3.4
              bounding_polygon:
                # required
                normalized_vertices:
                - # required
                  x: 3.4
                  y: 3.4
              word_indexes: [ 56 ]
          confidence: 3.4
          bounding_polygon:
            # required
            normalized_vertices:
            - # required
              x: 3.4
              y: 3.4
        document_fields:
        - # required
          field_type: LINE_ITEM_GROUP
          field_value:
            # required
            value: value_example
            value_type: TIME
            confidence: 3.4
            bounding_polygon:
              # required
              normalized_vertices:
              - # required
                x: 3.4
                y: 3.4
            word_indexes: [ 56 ]

            # optional
            text: text_example

          # optional
          field_label:
            # required
            name: name_example

            # optional
            confidence: 3.4
          field_name:
            # required
            name: name_example

            # optional
            confidence: 3.4
            bounding_polygon:
              # required
              normalized_vertices:
              - # required
                x: 3.4
                y: 3.4
            word_indexes: [ 56 ]

      # optional
      detected_document_types:
      - # required
        document_type: document_type_example
        confidence: 3.4

        # optional
        document_id: "ocid1.document.oc1..xxxxxxEXAMPLExxxxxx"
      detected_languages:
      - # required
        language: language_example
        confidence: 3.4
      document_classification_model_version: document_classification_model_version_example
      language_classification_model_version: language_classification_model_version_example
      text_extraction_model_version: text_extraction_model_version_example
      key_value_extraction_model_version: key_value_extraction_model_version_example
      table_extraction_model_version: table_extraction_model_version_example
      errors:
      - # required
        code: code_example
        message: message_example
      searchable_pdf: searchable_pdf_example
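The example above exercises every option, including a complete `ocr_data` structure. For comparison, a shorter invocation that sets only the required parameters plus a few optional ones might look like the following sketch. All names and OCIDs are placeholders, and the registered variable name is hypothetical; this page does not document the return structure, so the result is simply printed.

```yaml
- name: Extract text and tables from a document in Object Storage
  oracle.oci.oci_ai_document_analyze_document_result_actions:
    action: analyze_document
    compartment_id: "ocid1.compartment.oc1..xxxxxxEXAMPLExxxxxx"
    document:
      source: OBJECT_STORAGE
      namespace_name: namespace_name_example
      bucket_name: bucket_name_example
      object_name: object_name_example
    features:
      - feature_type: TEXT_EXTRACTION
        generate_searchable_pdf: true   # applicable to TEXT_EXTRACTION only
      - feature_type: TABLE_EXTRACTION
    output_location:
      namespace_name: namespace_name_example
      bucket_name: bucket_name_example
      prefix: prefix_example
  register: analysis                    # hypothetical variable name

- name: Show whatever the module returned
  ansible.builtin.debug:
    var: analysis
```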

Authors

Oracle (@oracle)

