Using Live Transcribe

Learn how to create and manage live transcribe jobs.

OCI Live Transcribe is a new feature that provides real-time transcription: you send an audio stream and receive text results as the audio is spoken. Real-time transcription is needed in many industries, including healthcare, call centers, and media. For example, physicians and nurses use medical dictation, which depends on real-time results to improve work efficiency. With OCI Live Transcribe, applications can receive accurate transcriptions within a few seconds. You can try Live Transcribe in the embedded text window, or refer to the API documentation to integrate programmatically with the OCI real-time transcription service.

Creating a Live Transcribe Job

Create and submit a Speech live transcribe job to transcribe a live audio stream to text.

  1. Open the navigation menu and click Analytics & AI. Under AI Services, click Speech.
  2. Under List Scope, select the compartment that you want to work in.
  3. In the left-side navigation menu, click Live transcribe.
  4. (Optional) To configure the transcription, select the following from the Configure transcription section:
    • Choose Domain: This is the domain of the speech model to be used. Select the domain from the provided options.
    • Choose Language: The language to transcribe in. Select the language from the dropdown list.
    • Punctuation: Configure punctuation in the generated transcriptions. Three options are available: None for no punctuation (the default value), Auto to insert punctuation automatically, and Spoken to insert punctuation marks when they're spoken aloud.
    • Partial silence threshold: How long, in milliseconds, the service waits for additional speech after it stops detecting speech activity before it ends speech recognition. Select a value from the dropdown list.
    • Final silence threshold: How long, in milliseconds, silence must last after a word is spoken before the end of the session is indicated. Select a value from the dropdown list.
    • Partial result stability: This is the amount of confidence required for the latest tokens before returning them as part of a new partial result. Select from the dropdown list.
    • Enable customizations: Select this option to customize the transcription response using previously trained customizations.
  5. To start a session, click Start session, and begin to speak.
  6. To stop a session, stop speaking and then click Stop session.
  7. (Optional) To view the JSON file, click View JSON.
  8. (Optional) To reset the session, click Reset.
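
The options in the Configure transcription section can be easier to reason about when written down as a single settings object. The following Python sketch is illustrative only: the class name, field names, and the sample values (en-US, 2000) are hypothetical and are not the identifiers used by the OCI Speech API or SDK. It assumes that any option left unset falls back to whatever the service uses by default, and that customizations are off unless selected. Only the three punctuation modes and the None default come from the steps above.

```python
from dataclasses import dataclass, asdict
from typing import Optional

# Punctuation modes named in the Configure transcription section.
PUNCTUATION_MODES = ("NONE", "AUTO", "SPOKEN")


@dataclass
class LiveTranscribeSettings:
    """Hypothetical container mirroring the console options (not an OCI SDK type)."""

    domain: Optional[str] = None                        # Choose Domain
    language: Optional[str] = None                      # Choose Language
    punctuation: str = "NONE"                           # None is the documented default
    partial_silence_threshold_ms: Optional[int] = None  # milliseconds
    final_silence_threshold_ms: Optional[int] = None    # milliseconds
    partial_result_stability: Optional[str] = None      # Partial result stability
    enable_customizations: bool = False                 # assumed off unless selected

    def __post_init__(self) -> None:
        # Reject punctuation modes other than the three listed in the console.
        if self.punctuation not in PUNCTUATION_MODES:
            raise ValueError(f"punctuation must be one of {PUNCTUATION_MODES}")
        # Thresholds are durations, so they must be positive when provided.
        for name in ("partial_silence_threshold_ms", "final_silence_threshold_ms"):
            value = getattr(self, name)
            if value is not None and value <= 0:
                raise ValueError(f"{name} must be a positive number of milliseconds")

    def chosen_options(self) -> dict:
        """Return only the options that were explicitly set (unset ones are omitted)."""
        return {key: value for key, value in asdict(self).items() if value is not None}


# Example: set a language, automatic punctuation, and a final silence threshold.
settings = LiveTranscribeSettings(
    language="en-US",
    punctuation="AUTO",
    final_silence_threshold_ms=2000,
)
print(settings.chosen_options())
```

Keeping unset options out of the result mirrors the fact that step 4 is optional: only the values that were deliberately changed are carried forward, and everything else is left to the service.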