Document Understanding Is Now Available

Document Understanding is an AI service that enables developers to extract text, tables, and other key data from document files through APIs and command-line interface (CLI) tools.

With Document Understanding, you can automate tedious business processing tasks with prebuilt AI models, and customize document extraction to fit your industry-specific needs. The following pretrained models are supported:

  • Optical Character Recognition (OCR): Document Understanding can detect and recognize text in a document.
  • Text extraction: Document Understanding provides word-level and line-level text, along with the bounding box coordinates that show where the text is located.
  • Key-value extraction: Document Understanding extracts a predefined set of key-value pairs from receipts, invoices, passports, and driver IDs.
  • Table extraction: Document Understanding extracts content in tabular format, maintaining the row and column relationships of cells.
  • Document classification: Document Understanding classifies documents into types, such as invoice, receipt, and resume, based on visual appearance, high-level features, and extracted keywords.
  • Optical Character Recognition (OCR) PDF: Document Understanding generates a searchable PDF file and saves it to your Object Storage.
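
To illustrate what "maintaining the row and column relationships of cells" can mean for downstream code, here is a minimal Python sketch. It does not call the service and does not reflect its actual response format: the `Cell` type, its `row`, `col`, and `text` fields, and the `cells_to_grid` helper are all hypothetical names chosen for this example. It only assumes that each extracted cell carries zero-based row and column indexes.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """Hypothetical extracted table cell (not the service's real schema)."""
    row: int   # zero-based row index (assumed)
    col: int   # zero-based column index (assumed)
    text: str  # recognized cell text


def cells_to_grid(cells):
    """Rebuild a row-major grid of strings from indexed cells.

    Positions with no matching cell are left as empty strings.
    """
    if not cells:
        return []
    n_rows = max(c.row for c in cells) + 1
    n_cols = max(c.col for c in cells) + 1
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for c in cells:
        grid[c.row][c.col] = c.text
    return grid


# Made-up sample data, in arbitrary order, to show that indexes drive layout.
sample_cells = [
    Cell(1, 1, "3.50"),
    Cell(0, 0, "Item"),
    Cell(1, 0, "Coffee"),
    Cell(0, 1, "Price"),
]

if __name__ == "__main__":
    for row in cells_to_grid(sample_cells):
        print(" | ".join(row))
```

Because the layout is driven by the indexes rather than by the order in which cells arrive, the same approach works even if cells are returned unsorted. Consult the service documentation for the real field names before adapting this.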

For more information, see the Document Understanding documentation.