Intents

Intents allow your skill to understand what the user wants it to do. An intent categorizes typical user requests by the tasks and actions that your skill performs. The PizzaBot's OrderPizza intent, for example, labels a direct request (I want to order a pizza) along with one that merely implies a request (I feel like eating a pizza).

Intents are composed of permutations of typical user requests and statements, which are also referred to as utterances. As described in Create an Intent, you create an intent by naming a compilation of utterances for a particular action. Because your skill’s cognition is derived from these intents, each intent should be created from a data set that’s robust (one to two dozen utterances) and varied, so that your skill can interpret ambiguous user input. A rich set of utterances enables a skill to understand what the user wants when it receives messages like “Forget this order!” or “Cancel delivery!”—messages that mean the same thing, but are expressed differently. To find out how sample user input allows your skill to learn, see Intent Training and Testing.

Create an Intent

To create an intent:
  1. Click Intents (This is an image of the Intent icon.) in the left navbar.
  2. If you already have defined your intents in a CSV file, click Import Intents. Import Intents from a CSV File describes this file's format. Otherwise, click Add Intent. Your skill needs at least two intents.
  3. Click Edit (This is an image of the Edit icon.) to enter a descriptive name or phrase for the intent in the Conversation Name field. For example, if the intent name is callAgent, the conversation name would be Talk to a customer representative. When the skill can't resolve a message to an intent, it outputs the user-friendly names and phrases that you enter in the Conversation Name field as the options listed in the Do you want to... disambiguation messages described in How Confidence Win Margin Works and Configure the Intent and Q&A Routing.
  4. Add the intent name in the Name field. If you don't enter a conversation name, then the Name field value is used instead. Keep in mind that a short name with no end punctuation might not contribute to the user experience. The intent name displays in the Conversation Name field for skills built with prior versions of Digital Assistant.
    Note

    In naming your intents, do not use system. as a prefix. system. is a namespace that's reserved for the intents that we provide. Because intents with this prefix are handled differently by Trainer Tm, using it may cause your intents to resolve in unexpected ways.
  5. Add a description of the intent. Your description should focus on what makes the intent unique and the task or actions it performs.
  6. If this is an answer intent, add a short answer to the Answer field.
  7. Optionally, in the Annotations field, add one or more tags for the intent to categorize it in a way that is useful for you. You can use any words of your choosing.

    Tip:

    On the Intents page, you can filter the display of intents by annotation.
  8. Start building the training corpus by adding example utterances that illustrate the meaning behind the intent. To ensure optimal intent resolution, use terms, wording, and phrasing specific to the individual intent. Ideally, you should base your training data on real-world phrases. You can save your utterances by pressing Enter or by clicking outside of the input field. To manage the training set, select a row to access the Edit (This is an image of the Edit button.) and Delete (This is an image of the Delete function.) functions.
    If your skill supports multiple native languages, augment the training set with phrases in the secondary languages to strengthen the model's accuracy in this and all other native languages supported by the skill.
    Description of select_additional_language.png follows

    See Build Your Training Corpus for tips on building an effective training corpus.
    To allow your skill to cleanly distinguish between intents, create an intent that resolves inappropriate user input or gibberish.
    While utterances can be added to an existing intent manually or by importing a CSV, they can also be assigned to intents through data manufacturing jobs and the Insights retrainer.
  9. In the Auto-Complete Suggestions field, enter a set of suggested phrases that help the user enter an appropriately worded request. Do not add the entire set of training data. Add a set of phrases that represent ideal user requests instead. Adding too broad a set of utterances may not only confuse users, but may also result in unexpected behavior.
    This is an optional step. This function is only supported by the Oracle Web Channel.
  10. Add an entity if the intent needs one to resolve the user input. To find out how, see Add Entities to Intents.
  11. To teach your skill how to comprehend user input using the set of utterances that you’ve provided so far, click Train, choose a model and then click Submit.
    As described in Which Training Model Should I Use?, we provide two models that learn from your corpus: Trainer Ht and Trainer Tm. Each uses a different algorithm to reconcile the user input against your intents. Trainer Ht uses pattern matching, while Trainer Tm uses a machine learning algorithm based on word vectors. Both skills that use Digital Assistant's native language support and skills with answer intents require Trainer Tm.
    You’d typically follow this process:
    1. Create the initial training corpus.

    2. Train with Trainer Ht. You should start with Trainer Ht because it doesn’t require a large set of utterances. As long as there are enough utterances to disambiguate the intents, your skill will be able to resolve user input.

      If you get a Something’s gone wrong message when you try to train your skill, then you may not have added a sufficient number of utterances to support training. First off, make sure that you have at least two intents with at least two (or preferably more) utterances each. If you haven’t added enough utterances, add a few more and then train your skill.

    3. Refine your corpus, retrain with Trainer Ht. Repeat as necessary—training is an iterative process.

    4. Train with Trainer Tm. Use this trainer when you’ve accumulated a robust set of intents.

    The Training Needed indicator displays whenever you add an intent or when you update an intent by adding, changing, or deleting its utterances. To bring the training up to date, choose a training model and then click Train. The model displays an exclamation point whenever it needs training. When its training is current, it displays a check mark.

  12. Click Test Utterances (located at the upper left) to open the Utterance Tester. Select the target language, then enter utterances similar to those in your training set. The Utterance Tester returns the confidence level for this utterance and enables you to assign the utterance to an intent, or add it as a test case.
    To log your intent testing results, enable conversation intent logging (Settings > General > Enable Insights).
  13. Click Validate and review the validation messages for errors such as too few utterances and for guidance on applying best practices like adding an unresolvedIntent intent.

Add Entities to Intents

Some intents require entities—both built-in and custom—to complete an action within the dialog flow or make a REST call to a backend API. The system uses only these entities, which are known as intent entities, to fulfill the intent that’s associated with them. You can associate an entity with an intent when you click Add New Entity and then select from the custom (This is an image of the Custom icon.) or built-in (This is an image of the System icon.) entities. If you're assigning a built-in entity, leave Value Agnostic enabled (the default) if specific entity values do not factor into intent classification (which is generally the case). If the intent requires a specific entity value, switch this feature off.
Note

Value Agnostic applies to built-in entities only. You cannot apply it to custom entities.

Description of choose_req_entity.png follows

Alternatively, you can click New Entity to add an intent-specific entity.
Description of create_intent_entity.png follows

Tip:

Only intent entities that are included in the JSON payloads are sent to, and returned by, the Component Service. The ones that aren’t associated with an intent won’t be included, even if they contribute to the intent resolution by recognizing user input. If your custom component accesses entities through entity matches, then be sure to add the entity to your intent.
Value Agnostic Intent Entities

The Value Agnostic feature allows you to adjust how entity values affect intent classification. When you enable this feature, the specific values for an associated built-in entity do not have bearing on the intent classification. However, when you disable this feature, you allow the entity value to play a key role in resolving the input.

In general, you can leave this feature in its default setting (enabled) because a specific entity value seldom factors into intent classification. The training utterances for an account balances intent, for example, may include specific dates (What was my balance on October 5?) but these values are not the deciding factor in resolving the input to the intent. Leaving Value Agnostic enabled will, in most cases, improve intent resolution because it prevents the values from affecting confidence scores or even signaling an unintended intent. However, whenever specific values play a key role in intent resolution, you should switch this feature off. For example, you would disable the feature if the value for a DATE is central to distinguishing an intent for checking past vacation balances from an intent that checks for future vacation balances. If these intents were date agnostic, then the model would ignore past and present and would not resolve input correctly.
The following examples show the associated entity, the training utterances, and whether to enable Value Agnostic:

Example intents: Account Balance
Associated entity: DATE
Training utterances:
  • Can you tell me my account balance yesterday?
  • How much money do I have in checking?
  • What was my balance on October 5th?
  • What was my credit card balance last week?
  • What is my bank balance today?
  • What was my savings account balance on 5/3?
Enable Value Agnostic? Yes – The specific date values do not signal the intent. The various date values in these utterances can be ignored because a user can ask for an account balance on any day.

Example intents: Holiday Store Hours
Associated entity: DATE
Training utterances:
  • Are you open on January 1st?
  • Are you open on Thanksgiving?
  • Hours for New Year's Day
  • What are the store hours for July 4th?
  • What are your holiday hours?
  • Will you be open on Christmas?
Enable Value Agnostic? No – The intent classification is based on a specific (and limited) set of values and users are inquiring about holidays.

Example intents: Check Past Vacation Balance, Check Future Vacation Balance
Associated entity: DATE
Training utterances:
  • Check Past Vacation Balance: Did I take any time off last month?
  • Check Future Vacation Balance: Any planned vacation in next month?
Enable Value Agnostic? No – Disable Value Agnostic for both intents. Agnostic DATE values in this case would mean that the model would not consider a value as past or future. A "last month" value, which should signal the Check Past Vacation Balance intent, would be ignored. As a result, similarly worded input like "Did I take any time off next month" may resolve incorrectly to this intent.

Import Intents from a CSV File

You can add your intents manually, or import them from a CSV file. You can base this file on a CSV of exported intents, or create it from scratch in a spreadsheet program or text editor.

The CSV file has six columns for skills that use the Natively-Supported language mode and five columns for those that don't. Here are the column names and what they represent:

  • query: An example utterance.
  • topIntent: The intent that the utterance should match to.
  • conversationName: The conversation name for the intent.
  • answer: For answer intents, the static answer for the intent.
  • enabled: If true, the intent is enabled in the skill.
  • nativeLanguageTag: (For skills with native-language support only) the language of the utterance. For values, use two-character language tags (fr, en, etc.).
    • For skills with Digital Assistant's native language support, this column is required.
    • For skills without the native language support, you can't import a CSV that has this column.

Here's an excerpt from a CSV file for a skill that does not have native language support and which doesn't use answer intents.

query,topIntent,conversationName,answer,enabled
I want to order a pizza,OrderPizza,Order a Pizza.,,true
I want a pizza,OrderPizza,Order a Pizza.,,true
I want a pizaa,OrderPizza,Order a Pizza.,,true
I want a pizzaz,OrderPizza,Order a Pizza.,,true
I'm hungry,OrderPizza,Order a Pizza.,,true
Make me a pizza,OrderPizza,Order a Pizza.,,true
I feel like eating a pizza,OrderPizza,Order a Pizza.,,true
Gimme a pie,OrderPizza,Order a Pizza.,,true
Give me a pizza,OrderPizza,Order a Pizza.,,true
pizza I want,OrderPizza,Order a Pizza.,,true
I do not want to order a pizza,CancelPizza,Cancel your order.,,true
I do not want this,CancelPizza,Cancel your order.,,true
I don't want to order this pizza,CancelPizza,Cancel your order.,,true
Cancel this order,CancelPizza,Cancel your order.,,true
Can I cancel this order?,CancelPizza,Cancel your order.,,true
Cancel my pizza,CancelPizza,Cancel your order.,,true
Cancel my pizaa,CancelPizza,Cancel your order.,,true
Cancel my pizzaz,CancelPizza,Cancel your order.,,true
I'm not hungry anymore,CancelPizza,Cancel your order.,,true
don't cancel my pizza,unresolvedIntent,unresolvedIntent,,true
Why is a cheese pizza called Margherita,unresolvedIntent,unresolvedIntent,,true

Here's an excerpt from a CSV file for a skill with native-language support that uses answer intents.

query,topIntent,conversationName,answer,enabled,nativeLanguageTag
Do you sell pasta,Products,Our Products,We sell only pizzas. No salads. No pasta. No burgers. Only pizza,true,en
Vendez-vous des salades,Products,Our Products,Nous ne vendons que des pizzas. Pas de salades. Pas de pâtes. Pas de hamburgers. Seulement pizza,true,fr
do you sell burgers,Products,Our Products,We sell only pizzas. No salads. No pasta. No burgers. Only pizza,true,en
Do you sell salads,Products,Our Products,We sell only pizzas. No salads. No pasta. No burgers. Only pizza,true,en
Vendez des hamburgers,Products,Our Products,Nous ne vendons que des pizzas. Pas de salades. Pas de pâtes. Pas de hamburgers. Seulement pizza,true,fr
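
Before you import a file like the ones above, it can help to sanity-check its structure. The following Python sketch is purely illustrative (the file name intents.csv and the checks themselves are assumptions, not part of Oracle Digital Assistant); it verifies that the expected columns are present, that each row has a query and topIntent, and that nativeLanguageTag is filled in when that column is used.

import csv

REQUIRED = ["query", "topIntent", "conversationName", "answer", "enabled"]

with open("intents.csv", newline="", encoding="utf-8") as f:
    reader = csv.DictReader(f)
    header = reader.fieldnames or []
    missing = [c for c in REQUIRED if c not in header]
    if missing:
        raise SystemExit(f"Missing columns: {missing}")
    has_lang = "nativeLanguageTag" in header  # only for skills with native language support
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        if not row["query"] or not row["topIntent"]:
            print(f"Line {line_no}: query and topIntent are required")
        if has_lang and not row["nativeLanguageTag"]:
            print(f"Line {line_no}: nativeLanguageTag must be set when the column is present")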

To import a CSV file:

  1. Click Intents (This is an image of the Intent icon.) in the left navbar.

  2. Click More, and then choose Import intents.
    Description of import_intents.png follows

  3. Select the .csv file and then click Open.

  4. Train your skill.

Export Intents to a CSV File

You can reuse your training corpus by exporting it to CSV. You can then import this file to another skill.

To export your intents and their utterances:
  1. Click Intents (This is an image of the Intent icon.) in the left navbar.

  2. Click More, and then choose Export intents.
    Description of export_corpus.png follows

  3. Save the file. This file has the following columns, which are described in Import Intents from a CSV File:
    query, topIntent, conversationName, answer, enabled, nativeLanguageTag

Which Training Model Should I Use?

We provide two training models that mold your skill’s cognition: Trainer Tm and Trainer Ht. You can use either of these models, each of which takes a different approach to machine learning. In general, you train your skill with Trainer Tm before you put it into production. Because of its shorter training time, you can use Trainer Ht for prototyping or for skills that are still in development.
Note

You can't use Trainer Ht for skills that use answer intents, use native language support, or have a large number of intents. Use Trainer Tm for these skills.
Trainer Ht is the default model, but you can change this by clicking Settings > General and then by choosing another model from the list. The default model displays in the tile in the skill catalog.
Trainer Tm
Trainer Tm (Tm) achieves highly accurate intent classification even when a skill has hundreds, or even thousands, of intents. Even though the intents in these large data sets are often closely related and are sometimes "unbalanced" in quantity of utterances, Tm can still differentiate between them. In general, you would apply Tm to any skill before you put it into production.
Note

When you train with Trainer Tm, you can also use the Similar Utterances Report.

You don't need to bulk up your training data with utterances that accommodate case sensitivity (Tm recognizes BlacK Friday as Black Friday, for example), punctuation, similar verbs and nouns, or misspellings. In the latter case, Trainer Tm uses context to resolve a phrase even when a user enters a key word incorrectly. Here are some general guidelines for building a training corpus when you're developing your skill with this model.

Trainer Tm enhances the skill's cognition by
  • Recognizing the irrelevant content. For I'm really excited about the coming Black Friday deals, and can't wait for the deals. Can you tell me what's going to be on sale for Black Friday?, Trainer Tm:
    • Discards the extraneous content (I'm really excited about the coming Black Friday deals...)
    • Resolves the relevant content (Can you tell me what's going to be on sale for Black Friday?) to an intent. In this case, an intent called Black Friday Deals.
    Trainer Tm can also distinguish between the relevant and irrelevant content in a message even when the irrelevant content can potentially be resolved to an intent. I bought the new 80 inch TV on Black Friday for $2200, but now I see that the same set is available online for $2100. Do you offer price match? for example, could be matched to the Black Friday Deals intent and to a Price Matching intent, which is appropriate for this message. In this case Trainer Tm:
    • Recognizes that I bought the new 80 inch TV on Black Friday for $2200, but now I see that the same set is available online for $2100 is extraneous content.
    • Resolves Do you offer price match?
  • Resolving intents when a single word or a name matches an entity. For example, Trainer Tm can resolve a message consisting of only Black Friday to an intent that's associated with an entity for Black Friday.
  • Distinguishing between similar utterances (Cancel my order vs. Why did you cancel my order?).
  • Recognizing out-of-scope utterances, such as Show me pizza recipes or How many calories in a Meat Feast for a skill for fulfilling a pizza order and nothing else.
  • Recognizing out-of-domain utterances, such as What's the weather like today for a pizza ordering skill.

    Tip:

    While Trainer Tm can easily distinguish when a user message is unclassifiable because it's clearly dissimilar from the training data, you might still want to define an unresolvedIntent with utterances that represent the phrases that you want to make sure do not resolve to any of your skill's intents. These phrases can be within the domain of your skill, but are still out of scope, even though they may share some of the same words as the training data. For example, I want to order a car for a pizza skill, which has also been trained with I want to order a pizza.
  • Distinguishing between similar entities – For example, Tm recognizes that mail is not the same as email in the context of an intent called Sign Up for Email Deals. Because it recognizes that an entity called regular mail would be out of scope, it would resolve the phrase I want to sign up for deals through regular mail at a lower confidence than it would I want to sign up for email deals.
Trainer Ht

Trainer Ht is the default training model. It needs only a small training corpus, so use it as you develop the entities, intents, and training corpus. When the training corpus has matured to the point where tests reveal highly accurate intent resolution, you’re ready to add a deeper dimension to your skill’s cognition by training with Trainer Tm.

You can get a general understanding of how Trainer Ht resolves intents just from the training corpus itself. It forms matching rules from the sample sentences by tagging parts of speech and entities (both custom and built-in) and by detecting words that have the same meaning within the context of the intent. If an intent called SendMoney has both Send $500 to Mom and Pay Cleo $500, for example, Trainer Ht interprets pay as equivalent to send. After training, Trainer Ht’s tagging reduces these sentences to templates (Send Currency to person, Pay person Currency) that it applies to the user input.

Because Trainer Ht draws on the sentences that you provide, you can predict its behavior: it will be highly accurate when tested with sentences similar to the ones that make up the training corpus (the user input that follows the rules, so to speak), but may fare less well when confronted with esoteric user input.

Build Your Training Corpus

When you define an intent, you first give it a name that illustrates some user action and then follow up by compiling a set of real-life user statements, or utterances. Collectively, your intents, and the utterances that belong to them, make up a training corpus. The term corpus is just a quick way of saying “all of the intents and sample phrases that I came up with to make this skill smart”. The corpus is the key to your skill’s intelligence. By training a model with your corpus, you essentially turn that model into a reference tool for resolving user input to a single intent. Because your training corpus ultimately plays the key role in deciding which route the skill-human conversation will take, you need to choose your words carefully when building it.

Generally speaking, a large and varied set of sample phrases increases a model’s ability to resolve intents accurately. But building a robust training corpus doesn’t just begin with well-crafted sample phrases; it actually begins with intents that are clearly delineated. Not only should they clearly reflect your use case, but their relationship to their sample sentences should be equally clear. If you’re not sure where a sample sentence belongs, then your intents aren’t distinct from one another.

You probably have sample utterances in mind when you create your intents, but you can expand upon them by using these guidelines.

Guidelines for Trainer Tm
  • Use a minimum confidence threshold of 0.7 for any skill that you plan to put into production.
  • Use good naming conventions for your intent names so it's easy to review related intents.
  • As a general rule, create at least 80 to 100 utterances for each intent. Per the corpus size and shape guidelines, the minimum (though not recommended) number of utterances for an intent is two. The total number of utterances in your training set should not exceed 25,000.
  • If possible, use unmodified, real-world phrases that include:
    • vernacular
    • standard abbreviations that a user might enter ("opty" for opportunity, for example)
    • non-standard names, such as product names
    • spelling variants ("check" and "cheque", for example)
    If you don't have any actual data, incorporate these in your own training data. Here are some pointers:
    • Create fully formed sentences that mention both the action and the entity on which the action is performed.
    • Try to keep the utterance length between 3 and 30 words. Utterances that are too short and lacking context can cause the model to generalize in unpredictable ways. Utterances that are too long may prevent the model from identifying the pertinent words and phrases. There can be exceptions, however, for one- or two-word utterances when they're commonly used phrases. If you expect two-word messages (like order status, price check, membership info, or ship internationally) that specify both the entity and action, add them to your training data. Be sure that your sample phrases have both an action and an entity.
    • Be specific. For example, What is your store phone number? is better than What is your phone number? because it enables Trainer Tm to associate a phone number with a store. As a result of this learning, it will resolve What's your mom's phone number? with a lower confidence score.
    • While Trainer Tm detects out-of-scope utterances, you can still improve confidence and accuracy by creating an unresolvedIntent for utterances that are in domain but still out of scope for the skill's intents. This enables Trainer Tm to learn the boundary of domain intents. You can define an unresolvedIntent for phrases that you do not want resolved to any of your skill's intents. You may only want to define an unresolvedIntent when user messages have been resolved to a skill's intents even when they don't apply to any of them.
    • Vary the words and phrases that surround the significant content as much as possible. For example, "I'd like a pizza, please", "Can you get me a pizza?", "A pizza, please".
    • Some practices to avoid:
      • Do not associate a single word or phrase with a specific intent unless that word or phrase indicates the intent. Repeated phrases can skew the intent resolution. For example, starting each OrderPizza utterance with "I want to …" and each ShowMenu utterance with "Can you help me to …" may increase the likelihood of the model resolving any user input that begins with "I want to" to OrderPizza and "Can you help me to" to ShowMenu.
      • A high occurrence of one-word utterances in your intents. While there can be exceptions for commonly used phrases, use one-word utterances sparingly, if at all.
      • Open-ended utterances that can easily apply to other domains or out-of-domain topics.
      • Repeating the same utterance with different casing or with different word forms that have the same lemma. For example, because Trainer Tm can distinguish between manage, manages, and manager, it not only differentiates between "Who does Sam manage?" and "Who manages Sam?", but also understands that these words are related to one another.
        Note

        You may be tempted to add misspellings of words. But before you do, use those misspellings in the utterance tester to see if the model recognizes them. You might be surprised at how well it handles them. Also, by not adding misspellings you run less risk of skewing your model in unexpected ways.
  • Create test cases to ensure the integrity of the intent resolution.
  • Run the Overview report for skill quality to maintain a balanced training set. Run the anomalies report as well to check for misclassified or unusual utterances.
  • When you deploy your skill, you can continuously improve the training data by:
    • Reviewing the Conversation Logs, summaries of all conversations that have occurred for a specified period. You enable the logging by switching Enable Insights on in Settings.
    • Running Skill Quality Reports and by assigning (or reassigning) actual user messages to your intents with the Insights Retrainer. If these reports indicate a lot of misclassification between unresolvedIntent and the domain intents:
      • Move the in-scope utterances from unresolvedIntent to the domain intents.
      • Move the out-of-scope utterances from the domain intents to unresolvedIntent.
Guidelines for Trainer Ht
Create 12 to 24 sample phrases per intent, if possible. Use unmodified, real-world phrases that include:
  • vernacular
  • common misspellings
  • standard abbreviations that a user might enter ("opty" for "opportunity", for example)
  • non-standard names, such as product names
  • spelling variants ("check" and "cheque", for example)
If you don't have any actual data, incorporate these in your own training data. Here are some pointers:
  • Vary the vocabulary and sentence structure in these starter phrases by one or two permutations using:
    • slang words (moolah, lucre, dough)

    • standard abbreviations that a user might enter ("opty" for opportunity, for example)
    • non-standard names, such as product names
    • common expressions (Am I broke? for an intent called AccountBalance)

    • alternate wording (Send cash to savings, Send funds to savings, Send money to savings, Transfer cash to savings.)

    • different categories of objects (I want to order a pizza, I want to order some food).

    • alternate spellings (check, cheque)

    • common misspellings ("buisness" for business)

    • unusual word order (To checking, $20 send)

  • Use different concepts to express the same intent, like I am hungry and Make me a pizza
  • Do not associate a single word or phrase with a specific intent unless that word or phrase indicates the intent. Repeated phrases can skew the intent resolution. For example, starting each OrderPizza utterance with "I want to …" and each ShowMenu intent with "Can you help me to …" may increase the likelihood of the model resolving any user input that begins with "I want to" with OrderPizza and "Can you help me to" with ShowMenu.
  • Avoid sentence fragments and single words. Instead, use complete sentences (which can be up to 255 characters) that include the action and the entity. If you must use single key word examples, choose them carefully.

  • Create test cases to ensure the integrity of the intent resolution. Because adding new intent examples can cause regressions, you might end up adding several test phrases to stabilize the intent resolution behavior.
Limits for Training Data Shape and Size

Regarding training data size and shape, here are the limits to the number of intents and utterances.

Intents:

  • Minimum number of intents per skill: 2
  • Maximum number of intents per skill: 2,500

Utterances:

  • Maximum number of utterances per skill: 25,000
  • Minimum number of utterances per intent: 2
  • Utterance word length: Between 3 and 30 words. Per the guidelines for Trainer Tm, there are exceptions where one- or two-word utterances can be appropriate if they are commonly used.
Note

These are technical limits, not recommendations. See Guidelines for Trainer Tm for practical recommendations for shaping your skills and providing robust training data.
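
As a rough illustration of how these limits combine with the word-length guideline, here is a minimal Python sketch (the check_corpus helper and the in-memory dictionary format are assumptions for illustration only):

def check_corpus(intents):
    """intents: dict mapping an intent name to its list of utterance strings."""
    problems = []
    if not 2 <= len(intents) <= 2500:
        problems.append(f"{len(intents)} intents; a skill needs between 2 and 2,500")
    total = sum(len(utts) for utts in intents.values())
    if total > 25000:
        problems.append(f"{total} utterances in total; the limit is 25,000")
    for name, utts in intents.items():
        if len(utts) < 2:
            problems.append(f"{name}: each intent needs at least 2 utterances")
        for u in utts:
            words = len(u.split())
            if not 3 <= words <= 30:
                # One- or two-word utterances can still be acceptable when they're common phrases.
                problems.append(f"{name}: '{u}' has {words} words (3-30 is the guideline)")
    return problems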

Export Intent Data

To log conversations, be sure to enable Enable Insights in Settings > General before you test your intents.

To export data for a skill:
  1. Click the icon to open the side menu and select Development > Skills.
  2. In the tile for the skill, click the icon to open the Options menu and select Export Conversations.
  3. Choose Intent Conversation Log, set the logging period, and then click Export.
  4. Review the user input by opening the CSV files in a spreadsheet program.

Intent Training and Testing

Training a model with your training corpus allows your bot to discern what users say (or in some cases, are trying to say).

You can improve the acuity of the cognition through rounds of intent testing and intent training. You control the training through the intent definitions alone; the skill can’t learn on its own from the user chat.

Testing Utterances

We recommend that you set aside 20 percent of your corpus for intent testing and use the remaining 80 percent to train your intents. Keep these two sets separate so that the test utterances, which you incorporate into test cases, remain "unknown" to your skill.

Apply the 80/20 split to each intent's data set. Randomize your utterances before making this split to allow the training models to weigh the terms and patterns in the utterances equally.
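
Here is a minimal Python sketch of that randomized, per-intent split (the corpus dictionary and the seed value are illustrative assumptions, not a feature of the product):

import random

def split_utterances(utterances, test_fraction=0.2, seed=42):
    """Shuffle one intent's utterances, then reserve roughly 20% of them for test cases."""
    shuffled = list(utterances)
    random.Random(seed).shuffle(shuffled)
    cut = max(1, round(len(shuffled) * test_fraction))
    return shuffled[cut:], shuffled[:cut]  # (training utterances, test utterances)

corpus = {
    "OrderPizza": ["I want to order a pizza", "I want a pizza", "Make me a pizza",
                   "I feel like eating a pizza", "Gimme a pie"],
    # ...one entry per intent
}
training, testing = {}, {}
for intent, utterances in corpus.items():
    training[intent], testing[intent] = split_utterances(utterances)  # apply the split per intent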

The Utterance Tester

The Utterance Tester is your window to your skill's cognition. By entering phrases that are not part of the training corpus, you can find out how well you've crafted your intents by reviewing the intent confidence ranking and the returned JSON. This ranking, which is the skill's estimate for the best candidate to resolve the user input, demonstrates its acuity at the current time.
Description of utterance_tester_quick_test.png follows

Using the Utterance Tester, you can perform quick tests for one-off testing, or you can incorporate an utterance as a test case to gauge intent resolution across different versions of training models.

Quick Tests
To find out how well your intents work:
  1. Click Test Utterances (located at the left side).
  2. If your skill supports multiple native languages, choose the testing language. Choosing this option ensures that the utterance will be added to the corresponding language version of the corpus. The skill's primary language is selected by default.
  3. Enter a string of text.
  4. Click Test and then take a look at the ranking and the entities detected in the utterance (if any).
  5. Review the Intent Confidence scores. (The progress bars for each intent listed are green if they meet or exceed the Confidence Level or red if they fall short).
    If your skill’s top-ranking candidate isn’t what you expect, you might need to retrain the intents after doing one or both of the following:
    • Update the better candidate’s corpus with the input text that you just entered—Select the appropriate intent and then click Add to Intent.

      Caution:

      Consider how adding a new test phrase might affect the training data. Adding a test phrase can change how utterances that are similar to it get classified after retraining. In addition, adding a test phrase invalidates the test, because incorporating a test phrase into the training set ensures that the test will succeed. Rather than adding a test phrase to the training data, you should instead save it as a test case.
    • On the Intents page, you can edit (This is an image of the Edit button.) or remove an utterance. An FAQ intent, for example, might receive a top rank because of the scope and phrasing of its constituent utterances. If you don’t want your users to get an FAQ whenever they ask typical questions, you’ll need to revise the corpus.

    You need to retrain an intent whenever you add, change, or delete an utterance. The Training Needed indicator (This is an image of the Training Needed indicator.) displays whenever you make a change to the training data.

  6. If your intents aren't resolving as intended, you can expand the JSON window to review the matched intents, scores, and detected entities in the returned JSON.
  7. Click Reset.
Test Cases

Each test has an utterance and the intent that it's expected to resolve to, which is known as a label match. A test case can also include matching entity values and the expected language for the utterance. You can run test cases when you’re developing a skill and later on, when the skill is in production, you can use the test cases for regression testing. In the latter case, you can run test cases to find out if a new release of the training model has negatively affected intent resolution.

Like the test cases that you create with the Conversation Tester, utterance test cases are part of the skill and are carried along with each version. If you extend a skill, then the extension inherits the test cases. Whereas conversation test cases are intended to test a scenario, utterance test cases are intended to test fragments of a conversation independently, ensuring that each utterance resolves to the correct intent.

Manage Test Cases
The Test Cases page, accessed by clicking Go to Test Cases in the Utterance Tester, lists the test suites and the test cases that belong to them. The test suites may be ones that you have created, or may have been inherited from a skill that you've extended or cloned. In addition to editing, adding and removing test cases, you use this page to compile test cases into test runs.

Description of test_suites_page.png follows

By default, All is selected, which displays every test case. If you want to narrow the display to only the test cases that belong to a single test suite, you can either select the test suite from the list of test suites or filter this list using a full or partial match of the test suite name. The test suite view enables you to manage the suite's member test cases from its Test Cases tab.

Description of test_suite_test_cases_view.png follows

From its General tab, you can, in addition to updating the name and description of the test suite, exclude the test suite from a test run by switching off Enable Test Suite. By switching off Include in Skill Export, you can prevent the test suite from getting included in the nluTestSuites folder that houses the skill's test suites when the skill is exported.
Description of test_suite_test_general_view.png follows

Create Test Suites
All test cases belong to a test suite. We provide one for you called Default Test Suite, but you might want to partition your testing by creating your own test suites. You can create test suites manually or by importing a CSV. To create a test suite manually:
  1. Click + Test Suite.
  2. In the General tab, replace the placeholder name (TestSuite0001, for example) with a more meaningful one by adding a value in the Display Name field.
  3. Optionally, add a description that explains the functionality that's covered by the test suite.
  4. Populate the test suite with test cases using any (or a combination) of the following methods:
    • Manually adding test cases (either by creating a test case or by saving an utterance as a test case from the Utterance Tester).
    • Importing test cases.
      Note

      To assign a test case to a test suite via import, the CSV's testSuite field can either be empty, or must contain a name that matches the test suite that's selected in the import dialog.
    • Editing a test case to reassign its test suite.
  5. If you want to exclude the test suite from test runs that are launched using the All and Run All options, switch off Enable Test Suite.
  6. If you don't want the test suite included with the skill export, switch off Include in Skill Export. When you switch off this option for a test suite, it won't be included in the nluTestSuites folder that houses the skill's test suites in the exported ZIP file.
Create Utterance Test Cases

You can add test cases one-by-one using either Utterance Tester or the New Test Case dialog (accessed by clicking + Test Case), or you can add them in bulk by uploading a CSV.

Each test case must belong to a test suite, so before you create a test case, you may want to first create a test suite that reflects a capability of the skill, or some aspect of intent testing, such as failure testing, in-domain testing, or out-of-domain testing.

We provide a suite called Default Test Suite. You can assign test cases to this test suite if you haven't yet created any others. Later on, you can edit the test case to reassign it to a new test suite.

Tip:

To provide adequate coverage in your testing, create test suite utterances that are not only varied conceptually, but also grammatically, since users will not make requests in a uniform fashion. You can add these dimensions by creating test suites from actual user messages that have been queried in the Insights Retrainer and also from crowd-sourced input gathered from Data Manufacturing.
Add Test Cases from the Utterance Tester
In addition to adding utterances to the training corpus, you can use the Quick Test page to create a test case:
  1. Click Test Utterances.
  2. If the skill is multi-lingual, select the native language.
  3. Enter the utterance then click Test.
  4. Click Save as Test Case then choose a test suite.
Create a Test Case
To create a single test case:
  1. Click Go to Test Cases in the Utterance Tester.
  2. Click + Test Case.
  3. Complete the New Test Case dialog:
    • If needed, disable the test case.
    • Enter the test utterance.
    • Select the test suite.
    • Select the expected intent. If you're creating a test case for failure testing, select unresolvedIntent.
    • For multi-lingual skills, select the language tag and the expected language.
  4. Click Add to Suite. From the Test Cases page, you can delete a test case, or edit a test case, which includes reassigning the test case to a different test suite.
    Description of create_new_test_case_with_entity.png follows

  5. To test for entity values:
    • Switch on Test Entities. Then click Continue.
    • Highlight the word (or words) and then apply an entity label to it by selecting an entity from the list. When you're done, click Add to Suite.
      Note

      Always select words or phrases from the test case utterance after you enable Test Entities. The test case will fail if you've enabled Test Entities but have not highlighted any words.

      Description of new_test_case_entity_test_page.png follows

Import Test Cases for Skill-Level Test Suites
From the Test Cases page (accessed by clicking Go to Test Cases in the Utterance Tester), you can add test suites and their cases in bulk by uploading a CSV file that has the following fields:
  • testSuite – The name of the test suite to which the test case belongs. The testSuite field in each row of the CSV can have a different test suite name or can be empty.
    • Test cases with empty testSuite fields get added to a test suite that you select when you import the CSV. If you don't select a test suite, they will be assigned to Default Test Suite.
    • Test cases with populated testSuite fields get assigned to the test suite that you select when you import the CSV only when the name of the selected test suite matches the name in the testSuite field.
    • If a test suite by the name of the one specified in testSuite field doesn't already exist, it will be created after you import the CSV.
  • utterance – An example utterance (required). Maps to query in pre-21.04 versions of Oracle Digital Assistant.
  • expectedIntent – The matching intent (required). This field maps to TopIntent in pre-21.04 versions of Oracle Digital Assistant.

    Tip:

    Importing Pre-21.04 Versions of the CSV tells you how to reformat Pre-21.04 CSVs so that you can use them for bulk testing.
  • enabled – TRUE includes the test case in the test run. FALSE excludes it.
  • languageTag – The language tag (en, for example). When there's no value, the language detected from the skill's language settings is used by default.
  • expectedLanguageTag (optional) – For multilingual skills, this is the language tag for the language that you want the model to use when resolving the test utterance to an intent. For the test case to pass, this tag must match the detected language.
  • expectedEntities – The matching entities in the test case utterance, represented as an array of entityName objects. Each entityName identifies the entity value's position in the utterance using the beginOffset and endOffset properties. This offset is determined by character, not by word, and is counted from the first character of the utterance, starting at 0. For example, the entityName object for the PizzaSize entity value of small in I want to order a small pizza is:
    [{"entityName":"PizzaSize","beginOffset":18,"endOffset":23,"originalString":"small"}, …]

Description of utterance_test_case_csv_example.png follows

To import this CSV:
  1. Click More, then select Import.
  2. Browse to, then select the CSV.
  3. Choose the test suite. The test case can only be assigned to the selected test suite if the testSuite field is empty or matches the name of the selected test suite.
  4. Click Upload.
Importing Pre-21.04 Versions of the CSV
Test cases imported via the pre-21.04 versions of CSVs, which have the query and TopIntent fields, get added to Default Test Suite only. You can reassign these test cases to other test suites individually by editing them after you import the CSV, or you can update the CSV to the current format and then edit before you import it as follows:
  1. Click More > Import.
  2. After the import completes, select Default Test Suite, then click More > Export Selected Suite. The exported file will be converted to the current format.
  3. Extract the ZIP file and edit the CSV. When you've finished, import the CSV again ( More > Import). You may need to delete duplicate test cases from the Default Test Suite.
    Note

    If you upload the same CSV multiple times with minor changes, any new or updated data will be merged with the old: new updates get applied and new rows are inserted. However, you can't delete any utterances by uploading a new CSV. If you need to delete utterances, then you need to delete them manually from the user interface.
Create Test Runs

Test runs are a compilation of test cases or test suites aimed at evaluating some aspect of the skill's cognition. The contents (and volume) of a test run depend on the capability that you want to test, so a test run might include a subset of test cases from a test suite, a complete test suite, or multiple test suites.

The test cases included in a test run are evaluated against the confidence threshold that's set for the skill. For a test case to pass in the overall test run, it must resolve to the expected intent at, or above, the confidence threshold. If specified, the test case must also satisfy the entity value and language-match criteria. By reviewing the test run results, you can find out if changes made to the platform, or to the skill itself, have compromised the accuracy of the intent resolution.

In addition to testing the model, you can also use the test run results to assess the reliability of your testing. For example, results showing that nearly all of the test cases have passed might, on the surface, indicate optimal functioning of the model. However, a review of the passing test cases may reveal that the test cases do not reflect the current training because their utterances are too simple or have significant overlap in terms of the concepts and verbiage that they're testing for. A high number of failed tests, on the other hand, might indicate deficiencies in the training data, but a review of these test cases might reveal that their utterances are paired with the wrong expected intents.

To create a test run:
  1. Click Run All to create a test run for all of the test cases in a selected test suite. (Or if you want to run all test suites, select All then click Run All).
    Description of test_cases_all_run_all.png follows

    • To create a test run for a selection of test cases within a suite (or a test run for subset of all test cases if you selected All), filter the test cases by adding a string that matches the utterance text and an expected intent. Select the utterance(s), then click Run.
      Description of test_cases_filtered.png follows

    • To exclude a test suite from the test run, first select the test suite, open the General tab, and then switch off Enable Test Suite.
      Description of test_runs_disable_test_suite.png follows

    • For multilingual skills, you can also filter by Language Tag and Expected Language options (accessed through Optional Attributes).
      Description of test_cases_optional_attributes.png follows

  2. Enter a test run name that reflects the subject of test. This is an optional step.
  3. Click Start
    Description of new_test_run_dialog.png follows

  4. Click Test Results, then select the test run.

    Tip:

    Test runs that contain a large number of test cases may take several minutes to complete. For these large test runs, you may need to click Refresh periodically until the testing completes. A percentage replaces the In Progress status for the Accuracy metric and the Intents report renders after all of the test cases have been evaluated.

    Description of test_cases_testing_in_progress.png follows

  5. Review the test run reports. For example, first review the high-level metrics for the test run provided by the Overview report. Next, validate the test results against the actual test cases by filtering the Test Cases report, which lists all of the test cases included in the test run, for passed and failed test cases. You can then examine the individual test case results. You might also compare the Accuracy score in the Overview report to the Accuracy score in the Intents report, which measures the model's ability to predict the correct intents. To review the test cases listed in this report, open the Test Cases report and filter by intents.
Test Run Summary Report

The Summary report provides you with an overall assessment of how successfully the model can handle the type of user input that's covered in the test run. For the test suites included in the test run, it shows you the total number of test cases that have been used to evaluate the model and, from that total, the number of test cases (both reliable and unreliable) that failed along with the number (both reliable and unreliable) that passed. The model's overall accuracy – its ability to predict expected intents at or above the skill's confidence level, recognize entity values, and resolve utterances in the skill's language – is gauged by the success rate of the passing tests in the test run.
Description of test_run_test_results_summary.png follows

Summary Report Metrics
The Summary report includes the following metrics:
  • Accuracy – The model's accuracy in terms of the success rate of the passing test cases (the number of passing test cases compared to the total number of test cases included in the test run).
    Note

    Disabled test cases are not factored into the Accuracy score. Neither are tests that failed because of errors; such tests are instead added to the Failed count.

    A low Accuracy score might indicate the test run is evaluating the model on concepts and language that are not adequately supported by the training data. To increase the Accuracy score, retrain the model with utterances that reflect the test cases in the test run.

    This Accuracy metric applies to the entire test run and provides a separate score from the Accuracy metric in the Intents report. This metric is the percentage of test cases where the model passed all of the test case criteria. The Accuracy score in the Intents report, on the other hand, is not end-to-end testing. It is the percentage of test cases where the model had only to predict the expected intent at, or above, the skill's confidence threshold. Other test case criteria (such as entity value or skill language) are not factored in. Given the differing criteria of what a passing test case means for these two reports, their respective Accuracy scores may not always be in step. The intent match Accuracy score may be higher than the overall test run score when the testing data is not aligned with the training data. Retraining the model with utterances that support the test cases will enable it to predict the expected intents with higher confidence, which will, in turn, increase the Accuracy score for the test run.

    Note

    The Accuracy metric is not available until the test run has completed and is not available for test runs that were completed when the skill ran on pre-22.12 versions of the Oracle Digital Assistant platform.
  • Test Cases – The total number of test cases (both reliable and unreliable test cases) included in the test run. Skipped test cases are included in this tally, but they are not considered when computing the Accuracy metric.
  • Passed – The number of test cases (both reliable and unreliable) that passed by resolving to the intent at the confidence threshold and by matching the selected entity values or language.
  • Failed – The number of test cases (both reliable and unreliable) that failed to resolve to the expected intent at the confidence threshold or failed to match the selected entity values or language.

    To review the actual test cases behind the Passed and Failed metrics in this report, open the Test Cases report and then apply its Passed or Failed filters.
    Description of test_runs_intent_report.png follows

Test Suite Breakdown

The Test Suite Breakdown table lists test suites included in the test run and their individual statistics. You can review the actual test cases belonging to a test suite by clicking the link in the Test Suite column.
Description of test_suite_breakdown.png follows

Intents Report

The metrics in this report track the model's label matches throughout the test run's test cases. This is where the model correctly predicts the expected intent for the test case utterance. Within the context of this report, accuracy, passing, and failing are measured in terms of the test cases where the model predicted the correct expected intent at, or above, the confidence threshold. Other criteria considered in the Summary report, such as entity value matches or skill language are not considered. As a result, this report provides you with a different view of model accuracy, one that helps you to verify if the current training enables the model to consistently predict the correct intents.

This report provides you with label-match (or intent-match) metrics for the test run at two levels: one that aggregates the results for the test run and one that separates these results by intent.
Note

This report is not available for test runs that were completed when the skill ran on a pre-22.12 version of the Oracle Digital Assistant platform.

Description of unfiltered_intents_report_all_tests.png follows

Intents Report Metrics
The overall intent-matching results include:
  • Test Cases – The number of test cases included in this test run. This total includes both reliable and unreliable test cases. Skipped test cases are not included in this tally.

    Tip:

    The unreliable test case links for the Test Cases, Passed and Failed metrics open the Test Cases report filtered by unreliable test cases. This navigation is not available when you filter the report by test suite.
  • Accuracy – The model's accuracy in matching the expected intent at, or above, the skill's confidence threshold across the test cases in this test run. The Label Match submetric represents the percentage of test cases in the test run where the model correctly predicted the expected intent, regardless of the confidence score. Because Label Match factors in failing test cases along with passing test cases, its score may be higher than the Accuracy score.
    You can compare this Accuracy metric with the Accuracy metric from the Summary report. When the Accuracy score in the Summary report is low, you can use this report to quickly find out if the model's failings can be attributed to its inability to predict the expected intent. When the Accuracy score in this report is high, however, you can rule out label matching as the root of the problem and, rather than having to heavily revise the training data to increase the test run's Accuracy score, you can instead focus on adding utterances that reflect the concepts and language in the test case utterances.

    Description of compare_accuracy_scores.png follows

  • Passed – The number of test cases (reliable and unreliable) where the model predicted the expected intent at the skill's confidence threshold.
  • Failed – The number of test cases (reliable and unreliable) where the model predicted the expected intent below the skill's confidence threshold.
  • Confidence Pass – An average of the confidence scores for all of the test cases that passed in this test run.
  • Confidence Fail – An average of the confidence scores for all of the test cases that failed in this test run.
Note

When you filter the Intents report by test suite, access to the Test Cases report from the unreliable test case links in the Test Cases, Passed, and Failed tiles is not available. These links become active again when you remove all entries from the Filter by Test Suite field.
Filter by Test Suite
The default results of the Intents report reflect all of the test suites included in the test run. Likewise, its metrics are based on all of the enabled test cases that belong to these test suites. If you want to break down individual test suite performance (and essentially create a comparison to the Summary report's Test Suite Breakdown table), you don't need to create additional test runs. Instead, you can isolate the results for the test suite (or test suites) in question using the Filter by Test Suite field. You can add one or more test suites to this field.
This is an image of the Filter by Test Suite field.

The report adjusts the metrics for each test suite that you add (or subsequently remove). It tabulates the intent matching results in terms of the number of enabled test cases that belong to the selected test suite.
Note

You can only filter on test suites that have been run on Version 23.06 of the Oracle Digital Assistant platform. Because test suites that were run on a prior version of the platform can't be filtered, you'll have to run them again when you upgrade to 23.06.

Description of filtered_intents_report_all_tests.png follows

Note

Filtering by test suite disables navigation to the Test Cases report from the unreliable test cases links in the Test Cases, Passed, and Failed tiles. The links in the Total column of the Intents Breakdown are also disabled. All of these links become active again after you remove all of the entries from the Filter by Test Suite field.
Intents Breakdown
The report's Intents Breakdown table provides the following top-level metrics for the expected intents named in the test run's test cases. You can narrow the focus by selecting the names of these intents from the Filter by Intents field.
Note

The Filter by Intent field changes the view of the Intents Breakdown table but does not change the report's overall metrics. These metrics reflect the entries (or lack of entries) in the Filter by Test Suite field.
  • Intent – The name of the expected intent.
  • Total – The number of test cases, represented as a link, for the expected intent. You can traverse to the Test Cases report by clicking this link.
    Note

    You can't navigate to the Test Cases report when you've applied a test suite filter to this report. This link becomes active again when you remove all entries from the Filter by Test Suite field.
  • Accuracy – The percentage of test cases that resulted in label matches for the expected intent at, or above, the skill's confidence threshold.
  • Passed – The number of test cases (including unreliable test cases) where the model predicted the expected intent at, or above, the skill's confidence threshold.
  • Passed - Unreliable – The number of test cases where the model predicted the expected intent at 5% or less above the skill's confidence threshold.
  • Failed – The number of test cases in the test run that failed because the model predicted the expected intent below the skill's confidence threshold.
  • Failed - Unreliable – The number of test cases that failed because the model's confidence in predicting the expected intent fell below the skill's confidence threshold by 5% or less. Because the expected intent was still predicted, these test cases can factor into the Label Match score.
  • Label Match – The number of test cases where the model successfully predicted the expected intent, regardless of confidence level. Because it factors in failed test cases, the Label Match and Accuracy scores may not always be in step with one another. For example, four passing test cases out of five result in an 80% Accuracy score for the intent. However, if the model predicted the intent correctly for the one failing test case, then Label Match would outscore Accuracy by 20%.
  • Confidence Pass – An average of the confidence scores for all of the test cases that successfully matched the expected intent.
  • Confidence Fail – An average of the confidence scores for all of the test cases that failed to match the expected intent.

    Tip:

    To review the actual test cases, open the Test Cases report and then filter by the intent.

Description of test_run_compare_intents_to_test_cases.png follows

Test Cases Report
This report lists all of the test cases included in the test run.
  1. You can filter the results by clicking All, Passed (green), or Failed (red). The test cases counted as skipped include both disabled test cases and test cases where the expected intent has been disabled.
    Description of filtered_test_run_results_passed.png follows

    You can filter the results to show only unreliable test cases either by clicking Show me the unreliable cases in the warning message or by selecting the Only Unreliable Cases filter.
  2. If needed, filter the results for a specific intent or entity or by reliable or unreliable test cases.
  3. For unreliable and failed test cases, click View Similar Utterances (located in the Test Info page) to find out if the test case utterance has any similarity to the utterances in the training set.
    This is an image of the View Similar Utterances button.

  4. Check the following results:
    • Test Info – Presents the test case overview, including the target confidence threshold, the expected intent, and the matched entity values.
    • Test Result – The ranking of intents by confidence level. When present, the report also identifies the entities contained in the utterance by entity name and value. You can also view the JSON object containing the full results.
    • Failure Analysis – Explains why the test case failed. For example, the actual intent is not the expected intent, the labeled entity value in the test case doesn't match the resolved entity, or the expected language is not the same as the detected language.
Unreliable Test Cases

Some test cases cannot provide consistent results because they resolve within 5% or less of the Confidence Threshold. This narrow margin makes these test cases unreliable. When the skill's Confidence Threshold is set at 0.7, for example, a test case that's passing at 74% may fail after you've made only minor modifications to your training data or because the skill has been upgraded to a new version of the model. The fragility of these test cases may indicate that the utterances they represent are too few in number in the training data and that you may need to balance the intent's training data with similar utterances.
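
The 5% rule can be expressed in a few lines. This is a minimal sketch under the assumption described above (the helper name and the values are illustrative, not part of the product):

  # A test case is flagged as unreliable when its top confidence score lands
  # within 5% of the skill's Confidence Threshold, on either side of it.
  def is_unreliable(confidence, threshold, margin=0.05):
      return abs(confidence - threshold) <= margin

  threshold = 0.7
  for confidence in (0.74, 0.86, 0.66):
      status = "passes" if confidence >= threshold else "fails"
      flag = " (unreliable)" if is_unreliable(confidence, threshold) else ""
      print(f"score {confidence:.2f} {status}{flag}")
  # score 0.74 passes (unreliable)
  # score 0.86 passes
  # score 0.66 fails (unreliable)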

To locate unreliable test cases:
  1. Run the test suite. Then click Test Results and select the test run. The unreliable test cases are sorted at the beginning of the test run results and are flagged with warnings (This is an image of the warning icon.).
    Description of unreliable_test_cases_test_run.png follows

  2. To isolate the unreliable test cases:
    • Click Show me the unreliable cases in the message.
      This is an image of the Show me the unreliable cases link.

    • Select Only Unreliable Cases from the Filter by Cases menu.
      This is an image of the Only Unreliable Cases filtering option.

  3. To find the proximity of the test case's top-ranking intent to the Confidence Threshold, open the Test Result window. For a comparison of the top-ranking confidence score to the Confidence Threshold, click the warning icon (This is an image of the warning icon.).
    Description of unreliable_test_case_click_icon.png follows

  4. If you need to supplement the training data for the top-ranking intent, click Go to top intent in the warning message.
    This is an image of the Go to top intent link in the warning message.

  5. To find out how many utterances the test case represents in the training data, click View Similar Utterances.
    Description of unreliable_test_case_view_similar_utterances.png follows

    You can also check if any of the utterances most similar to the test case utterance are also anomalies in the training set by running the Anomalies Report.
Exported Test Runs

Test runs are not persisted with the skill, but you can download them to your system for analysis by clicking Export Test Run. If the intents no longer resolve the user input as expected, or if platform changes have negatively impacted intent resolution, you can gather the details for an SR (service request) using the logs of exported test runs.

Failure Testing

Failure (or negative) testing enables you to bulk test utterances that should never be resolved, either because they result in unresolvedIntent or because they resolve to other intents only at confidence levels below the confidence threshold.

To conduct failure testing:
  • Specify unresolvedIntent as the Expected Intent for all of the test cases that you expect to be unresolved. Ideally, these "false" phrases will remain unresolved.
    Description of new_test_case_utterance_unresolved.png follows

  • If needed, adjust the confidence threshold when creating a test run to confirm that the false phrases (the ones with unresolvedIntent as their expected intent) can only resolve below the value that you set here. For example, increasing the threshold might result in the false phrases failing to resolve to any intent (including unresolvedIntent) at that confidence level, which means they pass because they're considered unresolved.
  • Review the test results, checking that each test case passed either by matching unresolvedIntent at the threshold or by failing to match any intent (unresolvedIntent or otherwise) at the threshold, as illustrated in the sketch after this list.
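
The pass rule for a false phrase can be summarized in code. Here is a minimal sketch, assuming that a test case with unresolvedIntent as its expected intent passes when it either matches unresolvedIntent at or above the threshold or matches nothing at the threshold at all (the result structure is hypothetical):

  def false_phrase_passes(ranked_intents, threshold):
      """ranked_intents: list of (intent_name, confidence) pairs, highest score first."""
      top_intent, top_score = ranked_intents[0]
      if top_score < threshold:
          return True  # no intent resolved at the threshold, so the phrase stays unresolved
      return top_intent == "unresolvedIntent"

  print(false_phrase_passes([("unresolvedIntent", 0.82), ("OrderPizza", 0.31)], 0.7))  # True
  print(false_phrase_passes([("OrderPizza", 0.75), ("unresolvedIntent", 0.40)], 0.7))  # False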

Similar Utterances

You can find out how similar your test phrase is to the utterances in the training corpus by clicking View Similar Utterances. This tool provides you with an added perspective on the skill's training data by showing you how similar its utterances are to the test phrase, and by extension, how similar the utterances are to one another across intents. Using this tool, you can find out if the similarity of the test phrase to utterances belonging to other intents is the reason why the test phrase is not resolving as expected. It might even point out where training data belongs to the wrong intent because of its similarity to the test phrase.
Description of similar_utterance_report_all_intents.png follows

The list generated by this tool ranks 20 utterances (along with their associated intents) that are closest to the test phrase. Ideally, the top-ranking utterance on this list – the one most like the test phrase – belongs to the intent that's targeted for the test phrase. If the closest utterance that belongs to the expected intent is further down, then a review of the list might provide a few hints as to why. For example, if you're testing a Transactions intent utterance, how much money did I transfer yesterday?, you'd expect the top-ranking utterance to likewise belong to a Transactions intent. However, if this test utterance is resolving to the wrong intent, or resolving below the confidence level, the list might reveal that it has more in common with highly ranked utterances with similar wording that belong to other intents. The Balances intent's How much money do I have in all of my accounts?, for example, might be closer to the test utterance than the Transactions intent's lower-ranked How much did I deposit in April? utterance.
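
If it helps to picture the ranking, here is an illustrative sketch that orders training utterances by similarity to a test phrase and keeps the top 20. The word-overlap (Jaccard) measure used here is only a stand-in for demonstration; it is not the similarity measure that the View Similar Utterances tool actually uses, and the training utterances are invented:

  def jaccard(a, b):
      # crude word-overlap similarity, used purely for illustration
      words_a, words_b = set(a.lower().split()), set(b.lower().split())
      return len(words_a & words_b) / len(words_a | words_b)

  training = {
      "how much money do I have in all of my accounts?": "Balances",
      "how much did I deposit in April?": "Transactions",
      "send money to my savings account": "SendMoney",
  }

  test_phrase = "how much money did I transfer yesterday?"
  ranked = sorted(training, key=lambda u: jaccard(u, test_phrase), reverse=True)[:20]
  for rank, utterance in enumerate(ranked, start=1):
      print(rank, training[utterance], "-", utterance)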

You can access the list, which is generated for skills trained on Trainer Tm, by clicking View Similar Utterances in the Utterance Tester or from the Test Cases report.
Description of similar_utterances_tester.png follows

Note

You can only use this tool for skills trained on Trainer Tm (it's not available for skills trained with Ht).
You can query utterances both from the Utterance Tester and through testing in the View Similar Utterances tool itself. When you click View Similar Utterances, the entire corpus is compared against the test phrase and a ranking is applied to each utterance. Although every utterance is ranked, no filters are applied by default, so the list includes only the 20 top-ranked utterances, numbered sequentially. To find out how the utterances ranked beyond the top 20 compare, you need to use the filters. By applying the following filters, you can see where similar utterances fall within the ranking in terms of language, the intents they belong to, or the words or phrases that they have in common.
  • Filter by Intent – Returns 20 utterances that are closest to the test utterance that belong to the selected intent (or intents).
    Description of similar_utterance_report_filter_by_intent.png follows

  • Filter by Utterance – Returns the 20 utterances closest to the test utterance that contain a given word or phrase.
    Description of similar_utterance_report_filter_by_utterance.png follows

  • Language – For multi-lingual skills, you can query and filter the report by selecting a language.
    Description of similar_utterance_report_filter_by_language.png follows

Note

Applying these filters does not change the rankings, just the view. An utterance ranked third, for example, will be noted as such regardless of the filter. The report's rankings and contents change only when you've updated the corpus and retrained the skill with Trainer Tm.

Tutorial: Best Practices for Building and Training Intents

Use this tutorial to find out about batch testing and other testing and training tips: Best Practices for Building and Training Intents.

Reference Intents in the Dialog Flow

You can reference intents directly from the dialog flow.

For dialog flows designed in Visual mode, you define intent events in the Main Flow. See Map an Intent to a Flow.

For dialog flows designed in YAML mode, you configure intents as action transitions for the System.Intent component to navigate to the appropriate state for the resolved intent. For example, here's what the System.Intent component might look like if you had intents called OrderPizza and CancelPizza:
  intent:
    component: "System.Intent"
    properties:
      variable: "iResult"
    transitions:
      actions:
        OrderPizza: "startOrder"
        CancelPizza: "cancelorder"
        unresolvedIntent: "unresolved"

Tune Intent Resolution Before Publishing

Before you publish a version of a skill (and thus freeze that version), you should thoroughly test it and, if necessary, adjust its settings to fine tune its intent resolution.

You can use these settings to tune intent resolution:

  • Confidence Threshold: Determines the minimum confidence level required for user input to match an intent. It's recommended to set this value to .70 or higher.

  • Confidence Win Margin: When a skill has multiple intents that exceed the value of the Confidence Threshold, it displays a list of possible intents and prompts the user to choose one. This property helps the skill determine which intents should be in the list. It sets the maximum allowed delta between the confidence levels of the top intents: the list includes the intents whose confidence scores fall within this delta of the top-ranked intent's score and that also exceed the value set for the Confidence Threshold.

To access these settings:

  • Click the icon to open the side menu, select Development > Skills, and open your bot.

  • In the left navigation for the skill, click the Settings icon and select the Configuration tab.

Note

Once you add a skill to a digital assistant, there is another range of settings that you may need to adjust to better handle intent resolution in the context of the digital assistant. See Tune Routing Behavior.

How Confidence Threshold Works

You use the Confidence Threshold property to adjust the likelihood that given user input will resolve to the skill's intents.

When you increase the confidence threshold, you increase the certainty that any matching intents are accurate (not false positives). However, this also increases the chance that intents that you want to match with certain input will not get high enough confidence scores for the matching to occur, thus resulting in matches to unresolvedIntent.

When you lower the value of the Confidence Threshold property, you reduce the chance that intents that you want to match will fail to match. However, the lower you set this threshold, the greater risk you have of generating false positives in your matches.

As a general rule the underlying language model works better with higher confidence thresholds, so you should set the confidence threshold to 70% (.70) or higher to get the best results.

How Confidence Win Margin Works

With the Confidence Win Margin property (accessed through Settings > Configuration), you can enable your skill to prompt users for an intent when the confidence scores for multiple intents are close. For example, if a user asks the FinancialBot, “I want to check balance or send money,” the skill responds with a select list naming the top intents, Check Balances and Send Money.

Description of win_margin_list.png follows

The skill offers these two intents in a select list because its confidence in them exceeds the value set for the Confidence Threshold property and the difference between their respective confidence levels (that is, the win margin) is within the value set for the Confidence Win Margin property.
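
To make the interaction between the two properties concrete, here is a minimal sketch of the selection logic, under the assumption that the win margin is measured from the top-ranked intent's score (the scores and the function are illustrative, not the product's implementation):

  def intents_to_offer(scores, threshold=0.7, win_margin=0.1):
      """scores: dict mapping intent name to confidence score."""
      # keep only the intents that clear the Confidence Threshold
      qualifying = {name: s for name, s in scores.items() if s >= threshold}
      if not qualifying:
          return []
      top_score = max(qualifying.values())
      # keep the intents whose scores fall within the win margin of the top score
      shortlist = [name for name, s in qualifying.items() if top_score - s <= win_margin]
      return sorted(shortlist, key=lambda name: -qualifying[name])

  scores = {"CheckBalances": 0.88, "SendMoney": 0.83, "Disputes": 0.45}
  print(intents_to_offer(scores))
  # ['CheckBalances', 'SendMoney'] -> more than one intent, so prompt the user to choose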

Answer Intents

In some cases, a user's question requires only a single answer and no further conversation. Answer intents enable your skill to output these types of replies without you having to update the dialog definition.

You can create answer intents in the following ways:

  • Use the Knowledge feature to generate answer intents from an existing resource, such as an FAQ that is hosted on a web page or in a PDF document.
  • On the skill's Intents page, define answer intents like you would any other intent but also include an answer in the Answer field.
  • Do bulk creation of answer intents by uploading a CSV file.

Here are a few more things you need to know about answer intents:

  • Skills with answer intents should be trained with Trainer Tm.
  • Unlike regular intents, you don't need to map answer intents to flows (in the Visual Flow Designer) or to states with System.Intent actions (in the YAML editor).
    • In the Visual Flow Designer, you can create a standard flow that handles all answer intents, map specific answer intents, or use a combination of the approaches.
    • In the YAML editor, you just need to have a System.Intent component to resolve the answer intents.
  • You can optionally store the answer intent in a resource bundle by clicking This is an image of the resource bundle icon. The resource bundle entries for answer intents are listed in the resource bundle's Q&A page.
    Description of qna_tab_rb.png follows

Generate Answer Intents from an Existing Knowledge Resource

If you already have a web page or PDF document with question and answer pairs, you can use the Knowledge feature to ingest those Q&A pairs from the document and generate answer intents automatically. (Other text in the document that is not organized as question/answer pairs is ignored.) When you create answer intents this way, example utterances are also generated for the intents.

To generate answer intents from a question and answer document:

  1. In the left navbar of the skill, click the Knowledge icon.
  2. Click + Knowledge Document.
  3. In the New Knowledge Document dialog:
    1. Specify a name and language for the document.

      For the language, you can select from the natively-supported languages that you have specified for your skill.

    2. Select PDF or HTML and upload the document, or select URL.

      If you are providing a URL, it must point to a static HTML page containing the FAQ.

    3. If the document is a PDF or HTML document for upload, select the checkbox acknowledging that it will be temporarily stored.
    4. Click Create.
    Note

    The URL option only works for HTML web pages. If you want to import an online PDF file, you need to first download it from the web page and then upload it into Digital Assistant.
  4. Wait for the generation of the answer intents to occur.

    (The status and progress of the job will be updated every 10 seconds.)

  5. Once the job is completed, click Review Intents to go over the generated intents and training utterances. Pay particular attention to each question and answer to make sure that each contains the right text.

    Tip:

    For PDF documents, you can click Open PDF to view a color-coded version of the document to see what text was used to generate the intents and how it was divided into questions and answers.
  6. To edit an intent's name, question, answer, or utterances, click its Edit icon.
    Note

    You can also later edit these values on the Intents page.
  7. For any intents that you don't want added to the skill, clear the Include checkbox.
  8. Click Add Intents to Skill to add the generated intents to the skill.
  9. In the left navbar, click Intents This is an image of the Intent icon. and make any further adjustments to the intents, such as changing the conversation name and adding further example utterances.

The answers are generated with HTML tags for any formatting that is included in the original document, such as bold text, italics, and hyperlinks. This markup is then automatically transformed into the appropriate markup or markdown for the channels through which the skill is exposed. (If the channel doesn't support formatting, the tags are removed when the message is sent through that channel.) See Rich Text Formatting in Channels.

Note

There is a limit of 100 answer intents that can be created at a time. If your knowledge document has more than 100 question/answer pairs, divide the document into smaller documents and create the answer intents from each of those documents.

Create a Single Answer Intent

If you need just a few answer intents, you can create them similarly to how you create regular intents.

  1. Click Intents This is an image of the Intent icon. in the left navbar.
  2. Click Add Intent.
  3. Click This is an image of the Edit icon to enter a descriptive name or phrase for the intent in the Conversation Name field.
  4. Add the intent name in the Name field. If you don't enter a conversation name, then the Name field value is used instead.
    Note

    In naming your intents, do not use system. as a prefix. system. is a namespace that's reserved for the intents that we provide. Because intents with this prefix are handled differently by Trainer Tm, using it may cause your intents to resolve in unexpected ways.
  5. Click This is an image of the Edit icon. and then add an answer to the Answer field. Apply formatting to the text as needed.
    Description of answer_intent_editing.png follows

  6. In the Examples section, add training utterances that reflect typical ways that users would express the question that the intent is answering.

Create Answer Intents from a CSV File

You can create answer intents in bulk by importing a CSV file. This file is similar to the standard intent CSV file, but in addition to the query, topIntent, and conversationName columns, it also has the answer column:
query,topIntent,conversationName,answer
What are your hours?,StoreHours,Our Store Hours,"We're open from 9-5, Mondays-Thursdays or by appointment."
When are you open?,StoreHours,Our Store Hours,"We're open from 9-5, Mondays-Thursdays or by appointment."
When do you close?,StoreHours,Our Store Hours,"We're open from 9-5, Mondays-Thursdays or by appointment."
What do you sell?,Products,Our Products,We sell only hammers. All types.
Do you sell brick hammers?,Products,Our Products,We sell only hammers. All types.
Do you sell claw hammers?,Products,Our Products,We sell only hammers. All types.
Do you deliver?,Delivery_and_Pickup,Pickup and Delivery options,"No delivery service, sorry. Purchases are in-store only"
Can I buy one of your hammers on the web?,Delivery_and_Pickup,Pickup and Delivery options,"No delivery service, sorry. Purchases are in-store only"
Can you mail me a hammer?,Delivery_and_Pickup,Pickup and Delivery options,"No delivery service, sorry. Purchases are in-store only"
Can I return a hammer?,Returns,Our Return Policy,You cannot return any items. All sales are final.
My hammer doesn't work,Returns,Our Return Policy,You cannot return any items. All sales are final.
Can I exchange my hammer,Returns,Our Return Policy,You cannot return any items. All sales are final.
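
If you assemble this file programmatically, a standard CSV writer takes care of quoting answers that contain commas. Here is a minimal sketch (the rows are taken from the example above; the file name is arbitrary):

  import csv

  rows = [
      ("What are your hours?", "StoreHours", "Our Store Hours",
       "We're open from 9-5, Mondays-Thursdays or by appointment."),
      ("Do you deliver?", "Delivery_and_Pickup", "Pickup and Delivery options",
       "No delivery service, sorry. Purchases are in-store only"),
  ]

  with open("answer_intents.csv", "w", newline="", encoding="utf-8") as f:
      writer = csv.writer(f)
      writer.writerow(["query", "topIntent", "conversationName", "answer"])
      writer.writerows(rows)  # answers with commas are quoted automatically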

DO's and DON'Ts for Conversational Design

Creating a robust set of intents for a successful skill requires a lot of attention. Here are some best practices to keep in mind.

Intent Design and Training

  • DO plan to add utterances until you get results you expect. Generally speaking, models perform well as you add more quality training utterances. The number of utterances you need depends on the model, the training data, and the level of accuracy that is realistic for your model.
    DON'T over-train individual intents. Don't add excessive training data to some intents to make them work "perfectly". If intent resolution is not behaving as expected, evaluate your intent structure for overlap between intents. Intent resolution will NEVER be 100% accurate.
  • DO use real world data. Using the actual language that your skill is most likely to encounter is critical. Fabricated utterances can only take you so far and will not prepare your skill for real-world engagement.
    DON'T use just keywords in training data. While it is acceptable to use single words/short phrases for training, the training data should have the same structure as the user's inputs. The fewer the words in utterances, the less successful classification will be.
  • DO use whole sentences to train intents. While it's OK to use short training utterances, be sure to match the conversational style of your users as closely as possible.
    DON'T inadvertently skew intents. Be careful of words which add no specific meaning (e.g. "please" and "thanks") or entity values within utterances as they can inadvertently skew intent resolution if they are heavily used in one intent but not in another.
  • DO use similar numbers of utterances per intent. Some intents (e.g., "hello", "goodbye") may have fewer utterances in their training sets. However, ensure that your main intents have a similar number of utterances to avoid biasing your model.
    DON'T rely ONLY on intent resolution. Use entities to disambiguate common intents. If there's linguistic overlap between intents, consider using entities to disambiguate the user's intentions (and corresponding unique conversational path).
  • DO handle small talk. Users will make requests that are not relevant to the skill's purpose, such as for jokes and weather reports. They may also do things like ask if the skill is human. Ensure that you have a small talk strategy and aggressively test how the skill responds at all steps of your conversational flow.
    DON'T overuse unresolvedIntent. Create "out-of-scope" intents for the things you know you don't know (that you may or may not enable the skill to do later).
  • DO consider multiple intents for a single use case. Customers may express the same need in multiple ways, e.g. in terms of the solution they desire OR the symptom of their problem. Use multiple intents that all resolve to the same "answer".
    DON'T ignore abusive interactions. Similar to small talk, have a plan for abuse. This plan may need to include measures to ensure any abusive input from the user is not reflected back by the skill, as well as provisions for immediate escalation.

Conversational User Experience

  • DO give indications of most likely responses (including help and exit). For example, "Hey, I'm Bob the Bot. Ask me about X, Y, or Z. If you run into any problems, just type 'help'."
    DON'T delay conversational design until "later in the project". For all but the simplest skills, conversational design must be given the same priority and urgency as other development work. It should start early and proceed in parallel with other tasks.
  • DO consider a personality for your bot. You should consider the personality and tone of your bot. However, be careful of overdoing human-like interaction (humor and sympathy often don't resonate well from a bot) and never try to fool your users into thinking that they are interacting with a human.
    DON'T say that the skill "is still learning". While well-intended, this bad practice signals to the user (consciously or subconsciously) that the skill is not up to the task.
  • DO guide the user on what is expected from them. The skill should try to guide the user toward an appropriate response and not leave questions open ended. Open-ended questions make the user more likely to fall off the happy path.
    DON'T use "cute" or "filler" responses. See "DO guide the user on what is expected from them".
  • DO break up long responses into individual chat bubbles and/or use line breaks. Large blobs of text without visual breaks are hard to read and can lead to confusion.
    DON'T say "I'm sorry, I don't understand. Would you please rephrase your question?" This lazy error-handling approach is, more often than not, inaccurate. No matter how many times a user rephrases an out-of-scope question, the skill will NEVER have anything intelligent to say.
  • DON'T overuse "confirmation" phrases. Confirmation phrases have their place. However, don't overuse them. Consider dialog flows that are able to take confidence levels into account before asking users to confirm.

Test Strategies

  • DO develop utterances cyclically. Developing a robust training corpus requires multiple iterations and testing cycles and ongoing monitoring and tuning. Use a cyclical "build, test, deploy, monitor, update" approach.
    DON'T neglect the need for a performance measurement and improvement plan. Lacking a plan for measuring and improving your skill, you'll have no way of knowing whether it's really working.
  • DO test utterances using the 80/20 rule. Always test the robustness of your intents against one another by conducting multiple 80/20 tests, where 80% of newly harvested utterances are used to train the model and 20% are added to your testing data (see the sketch after this list).
    DON'T test only the happy path. "Getting it working" is 20% of the work. The remaining 80% is testing and adjusting how the skill responds to incorrect input and user actions.
  • DO test skill failure. Aggressively try to break your skill to see what happens. Don't rely solely on positive testing.
    DON'T ignore processing out-of-order messages. Users will scroll back in conversation history and click on past buttons. Testing the results needs to be part of your 80% work (as noted in DON'T test only the happy path).
  • DON'T forget to re-test as you update your intents. If you add more training data (e.g., as your bot gets more real-world usage) and/or you add new intents for new use cases, don't forget to retest your model.
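
For the 80/20 rule mentioned above, a simple split of newly harvested utterances is often enough. Here is a minimal sketch (the utterances and the fixed random seed are illustrative):

  import random

  utterances = [f"sample utterance {i}" for i in range(50)]  # newly harvested phrases

  random.seed(7)               # fixed seed so the split is reproducible
  random.shuffle(utterances)
  cut = int(len(utterances) * 0.8)
  training, testing = utterances[:cut], utterances[cut:]

  print(len(training), "utterances for training,", len(testing), "for test cases")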

Project Considerations

  • DO select use cases that are enhanced by conversational UI (CUI). Enabling conversational UI (via skills and digital assistants) is work. Make sure that the use case will be truly enhanced by adding CUI.
    DON'T fail to have an escalation path. Even if you don't plan on allowing escalation to a human, you must have a strategy for those interactions where the skill can't help.
  • DO anticipate the first day being the worst day. Even the best-tested skills and digital assistants require tuning on day 1.
    DON'T disband the project team immediately after launch. When scheduling your skill project, ensure that you keep the skill's creators (Conversational Designer, Project Manager, Tech Lead, etc.) on the project long enough for adequate tuning and, ultimately, knowledge transfer.

Names You Can't Use for Intents

Intent names cannot start with system. ("system" followed by ".").

Note

The Automated Agent Assistant (which is a digital assistant template available in the Skill Store) has several such intents, but they are treated as a special case and should not be used elsewhere.