Document Understanding provides pretrained models that
allow you to organize and extract text and structure from business documents.
Pretrained models let you use AI with no data science experience. Simply provide a file to
the Document Understanding service and, without having to
create your own model, get back information about your document.
Use Cases
Pretrained models let you automate back-office tasks, and process documents more
accurately and quickly.
Intelligent search
Enrich document files with metadata, including document type and key fields, for
easier retrieval.
Invoice automation
Extract data from invoices, such as amounts, dates, and payment terms, to
automatically process the invoices for payment. Automating this step can save time and effort, and
can improve the accuracy and reliability of the invoicing process.
Document analysis
Extract data from documents such as invoices or insurance claims to identify trends or
opportunities for cost savings.
Human resources management
Extract employee record data, such as names, addresses, and job titles, to populate a
human resources management system. This data makes it easier to store and manage
employee information, and can reduce the risk of errors.
Supported Formats
Document Understanding supports several document
formats.
You can upload documents from a local file or Oracle Cloud Infrastructure Object Storage. They can be in the following formats:
JPEG
PDF
PNG
TIFF
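Before uploading, it can help to check that a file is in one of the formats listed above. The following is a minimal sketch, not part of the service: the extension mapping and the function name `detect_supported_format` are assumptions made for illustration, and a file extension is only a hint about the real file contents.

```python
from pathlib import Path
from typing import Optional

# Assumed mapping from common file extensions to the four listed formats.
SUPPORTED_EXTENSIONS = {
    ".jpg": "JPEG",
    ".jpeg": "JPEG",
    ".pdf": "PDF",
    ".png": "PNG",
    ".tif": "TIFF",
    ".tiff": "TIFF",
}


def detect_supported_format(filename: str) -> Optional[str]:
    """Return the format name if the extension is supported, else None."""
    suffix = Path(filename).suffix.lower()
    return SUPPORTED_EXTENSIONS.get(suffix)
```

A caller could skip or convert any file for which this helper returns `None` before sending it to the service.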
Pretrained Models
Document Understanding offers five types of pretrained model.
Document Understanding can detect and recognize text in
a document. OCR draws bounding boxes around the printed or handwritten text that it locates in
a document, and digitizes the text.
If you have a PDF with text, Document Understanding
locates the text in that document and extracts the text. It then provides bounding boxes for
the identified text. Text Detection can be used with Document AI or Image Analysis models.
Document Understanding provides a confidence score for each
text grouping. The confidence score is a decimal number from 0 to 1. Scores closer to 1
indicate higher confidence in the extracted text, while scores closer to 0 indicate lower
confidence.
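One common way to use the confidence score is to route low-scoring text to manual review. The sketch below assumes each text grouping has already been read into a dictionary with `text` and `confidence` keys; that shape, the function name, and the 0.8 threshold are illustrative assumptions, not the service's response format.

```python
from typing import Dict, List, Tuple


def split_by_confidence(
    groupings: List[Dict], threshold: float = 0.8
) -> Tuple[List[Dict], List[Dict]]:
    """Split text groupings into (accepted, needs_review) by confidence."""
    accepted, needs_review = [], []
    for grouping in groupings:
        score = grouping["confidence"]
        # Confidence scores are documented as decimals from 0 to 1.
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"confidence out of range: {score}")
        if score >= threshold:
            accepted.append(grouping)
        else:
            needs_review.append(grouping)
    return accepted, needs_review
```

The threshold is a policy choice: a higher value sends more text to review and lets fewer errors through.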
Document Classification can be used to classify a document.
Document Understanding provides a list of possible
document types for the analyzed document. Each document type has a confidence score, a
decimal number from 0 to 1. Scores closer to 1 indicate higher confidence in the predicted
document type, while scores closer to 0 indicate lower confidence. The list of possible document types is:
Table extraction can be used to identify tables in a document and extract their
contents. For example, if a PDF receipt contains a table that includes the taxes and total
amount, Document Understanding identifies the table and extracts
the table structure.
Document Understanding provides the number of rows and
columns for the table and the contents in each table cell. Each cell has a confidence score,
a decimal number from 0 to 1. Scores closer to 1 indicate higher confidence in
the extracted text, while scores closer to 0 indicate lower confidence.
Supported features are:
Table extraction for tables with and without borders
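Given a row count, a column count, and the contents of each cell, the table can be rebuilt as a two-dimensional grid. The sketch below assumes each cell is a dictionary with zero-based `row` and `column` indexes and a `text` value; those key names and the function name are hypothetical and do not reflect the actual response schema.

```python
from typing import Dict, List


def build_grid(row_count: int, column_count: int, cells: List[Dict]) -> List[List[str]]:
    """Place extracted cell text into a row_count x column_count grid."""
    # Start with empty strings so cells missing from the input stay blank.
    grid = [["" for _ in range(column_count)] for _ in range(row_count)]
    for cell in cells:
        grid[cell["row"]][cell["column"]] = cell["text"]
    return grid
```

For the receipt example above, a grid like this makes it simple to look up the row that contains the taxes or the total amount.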
Key value extraction can be used to identify values for predefined keys in a receipt.
For example, if a receipt includes a merchant name, merchant address, or merchant phone number,
Document Understanding can identify these values and return
them as a key value pair.
The supported features are:
Extract values for predefined key value pairs
Bounding polygons
Single request
Batch request
Limitations:
Supports receipts in English only.
The supported fields are:
Supported Fields
| Field | Description |
| --- | --- |
| MerchantName | The name of the merchant issuing the receipt. |
| MerchantPhoneNumber | The telephone number of the merchant. |
| MerchantAddress | The address of the merchant. |
| TransactionDate | The date the receipt was issued. |
| TransactionTime | The time the receipt was issued. |
| Total | The total amount of the receipt, after all charges and taxes have been applied. |
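Downstream code usually needs to know which of the supported receipt fields were actually found. The sketch below assumes the key-value pairs have been flattened into dictionaries with `key` and `value` entries; that shape and the function name are illustrative assumptions, while the field names come from the supported fields above.

```python
from typing import Dict, List, Optional, Tuple

# Field names taken from the supported receipt fields.
RECEIPT_FIELDS = (
    "MerchantName",
    "MerchantPhoneNumber",
    "MerchantAddress",
    "TransactionDate",
    "TransactionTime",
    "Total",
)


def collect_receipt_fields(
    pairs: List[Dict[str, str]],
) -> Tuple[Dict[str, Optional[str]], List[str]]:
    """Return (fields, missing): extracted values by name and absent field names."""
    fields: Dict[str, Optional[str]] = {name: None for name in RECEIPT_FIELDS}
    for pair in pairs:
        # Ignore keys that are not in the supported list.
        if pair["key"] in fields:
            fields[pair["key"]] = pair["value"]
    missing = [name for name, value in fields.items() if value is None]
    return fields, missing
```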
Key value extraction can be used to identify values for predefined keys in an
invoice. For example, if an invoice includes a vendor name, total, and invoice ID, Document Understanding can identify these values and return
them as a key value pair.
The supported features are:
Extract values for predefined key value pairs
Bounding polygons
Confidence score
The supported fields are:
Supported Fields
| Field | Description |
| --- | --- |
| CustomerName | Name of invoiced customer. |
| CustomerId | Customer reference identifier. |
| PurchaseOrder | Purchase order number. |
| InvoiceId | Identifier for the specific invoice. |
| InvoiceDate | Date of issue on the invoice. |
| DueDate | Date when payment is due on this invoice. |
| VendorName | Name of vendor. |
| VendorAddress | Vendor mailing address. |
| VendorAddressRecipient | Name referenced with the VendorAddress. |
| CustomerAddress | Mailing address for the Customer. |
| CustomerAddressRecipient | Name referenced with the CustomerAddress. |
| BillingAddress | Explicit billing address for the customer. |
| BillingAddressRecipient | Name referenced with the BillingAddress. |
| ShippingAddress | Explicit shipping address for the customer. |
| ShippingAddressRecipient | Name referenced with the ShippingAddress. |
| PaymentTerm | The terms of payment for the invoice. |
| Subtotal | Subtotal field identified on this invoice. |
| TotalTax | Total tax value identified on this invoice. |
| InvoiceTotal | Total charge amount associated with the invoice. |
| AmountDue | Total amount due to the vendor. |
| ServiceAddress | Explicit service address or property address for the customer. |
| ServiceAddressRecipient | Name referenced with the ServiceAddress. |
| RemittanceAddress | Explicit remittance or payment address for the customer. |
| RemittanceAddressRecipient | Name referenced with the RemittanceAddress. |
| ShippingCost | Total shipping or shipping and handling costs associated with an invoice. |
| ServiceStartDate | First date for the service period. |
| ServiceEndDate | End date for the service period. |
| PreviousUnpaidBalance | Explicit previously unpaid balance. |
The supported line items are:
Supported Line Items
| Line Item | Description |
| --- | --- |
| Items | Concatenation of all other line item values (that is, the entire line of the line item). |
| Name | The name listed for a product or service, for example, t-shirt. |
| Amount | The amount of the line item. |
| Description | The text description for the invoice line item, for example, men's rayon shirt, sizes small, medium, and large. |
| Quantity | The quantity for this invoice line item. |
| UnitPrice | The price per item identified on the invoice. |
| ProductCode | Product code, product number, or SKU referenced in the line item. For example, 123456. |
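Extracted line items can be sanity-checked before they reach a payment system. The sketch below assumes that Amount equals Quantity multiplied by UnitPrice, which this document does not state and which may not hold for invoices with per-line discounts or taxes. It also assumes the values arrive as plain numeric strings; the function name and the tolerance are illustrative.

```python
from decimal import Decimal
from typing import Dict


def line_item_is_consistent(
    item: Dict[str, str], tolerance: Decimal = Decimal("0.01")
) -> bool:
    """Check whether Quantity * UnitPrice matches Amount within a tolerance.

    Assumption (not from the service documentation): Amount is the product
    of Quantity and UnitPrice for a simple line item.
    """
    expected = Decimal(item["Quantity"]) * Decimal(item["UnitPrice"])
    return abs(expected - Decimal(item["Amount"])) <= tolerance
```

`Decimal` is used instead of `float` so that currency values such as 19.99 are compared exactly.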
Key value extraction can be used to identify values for predefined keys in a US or UK
driver's documentation. For example, if a Driver ID includes an issue date, region, and
expiry date, Document Understanding can identify these
values and return them as a key value pair.
The supported features are:
Extract values for predefined key value pairs
Bounding polygons
Confidence score
The supported fields are:
Supported Fields
| Field | Description | API Response Value |
| --- | --- | --- |
| FirstName | First name (given name) listed on the document. | Extracted Text |
| LastName | Last name (family name) listed on the document. | Extracted Text |
| Country | Country listed on the document. | Extracted ISO 3166-1 country code |
| BirthDate | Date of birth. | Date in YYYY/MM/DD format |
| ExpiryDate | Date of expiration listed on the document. | Date in YYYY/MM/DD format |
| IssueDate | Date of issue listed on the document. | Date in YYYY/MM/DD format |
| Gender | Gender listed on the document. | Extracted Text |
| DocumentNumber | Document identification number. | Extracted Text |
| Address | Address listed on the document. | Extracted Text |
| Region | Region listed on the document. For example, state or territory. | |
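Because the date fields are returned in YYYY/MM/DD format, they can be parsed with the Python standard library. The sketch below shows one way to parse such a value and check whether a document has expired; the function names are illustrative, and the comparison date is passed in explicitly so the check is repeatable.

```python
from datetime import date, datetime


def parse_document_date(value: str) -> date:
    """Parse a date string in YYYY/MM/DD format."""
    return datetime.strptime(value, "%Y/%m/%d").date()


def is_expired(expiry_date: str, today: date) -> bool:
    """Return True if the ExpiryDate value is earlier than the given day."""
    return parse_document_date(expiry_date) < today
```

A value that does not match the expected format raises `ValueError`, which a caller can treat as a signal to review the document manually.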
Key value extraction can be used to identify values for predefined keys in a
MRZ-supported passport. For example, if a passport includes nationality and date of issue,
Document Understanding can identify these values and
return them as a key value pair.
The supported features are:
Extract values for predefined key value pairs
Confidence score
The supported fields are:
Supported Fields
| Field | Description | API Response Value |
| --- | --- | --- |
| FirstName | First name (given name) listed on the document. | Extracted Text |
| LastName | Last name (family name) listed on the document. | Extracted Text |
| Country | Country listed on the document. | Extracted ISO 3166-1 country code |
| Nationality | Nationality of the document owner. | Extracted ISO 3166-1 country code |
| BirthDate | Date of birth. | Date in YYYY/MM/DD format |
| ExpiryDate | Date of expiration listed on the document. | Date in YYYY/MM/DD format |
| Gender | Gender listed on the document. | Extracted Text |
| DocumentType | Document type, often listed as a single character, such as "P" for passport or "V" for Visa. | |
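The single-character DocumentType value can be turned into a readable label. The sketch below covers only the two codes mentioned above, "P" and "V"; the function name, the fallback label, and the choice to look only at the first character are assumptions made for illustration.

```python
# Only the two codes named in the field description are mapped here.
DOCUMENT_TYPE_LABELS = {"P": "passport", "V": "visa"}


def describe_document_type(code: str) -> str:
    """Map a DocumentType code to a label, or 'unknown' if unrecognized."""
    # Assumption: the first non-space character carries the document type.
    first_character = code.strip().upper()[:1]
    return DOCUMENT_TYPE_LABELS.get(first_character, "unknown")
```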
OCR PDF generates a searchable PDF file in your Object Storage. For example, Document Understanding can take a PDF file with text and images,
and return a PDF file in which you can search for that text.
Document Understanding provides pretrained models for
customers to extract insights about their documents without needing data scientists.
You need the following before using a pretrained model:
A paid tenancy account in Oracle Cloud Infrastructure.
Familiarity with Oracle Cloud Infrastructure Object Storage.
You can call the pretrained Document AI models as a batch request using the REST API,
SDK, or CLI. You can call the pretrained Document AI models as a single request using
the Console, REST API, SDK, or CLI.
See the Limits section for information on what is allowed in batch
requests.
For more information about using the Document AI models with the REST API, see
Analyzing with the API.
For more information about using the Document AI models in the Console, see Using the Console.