Intents
Intents allow your skill to understand what the user wants it to do. An intent categorizes typical user requests by the tasks and actions that your skill performs. The PizzaBot’s OrderPizza intent, for example, labels a direct request, I want to order a Pizza, along with another that implies a request, I feel like eating a pizza.
Intents are composed of permutations of typical user requests and statements, which are also referred to as utterances. As described in Create an Intent, you can create the intent by naming a compilation of utterances for a particular action. Because your skill’s cognition is derived from these intents, each intent should be created from a data set that’s robust (one to two dozen utterances) and varied, so that your skill can interpret ambiguous user input. A rich set of utterances enables a skill to understand what the user wants when it receives messages like “Forget this order!” or “Cancel delivery!”, which mean the same thing but are expressed differently. To find out how sample user input allows your skill to learn, see Intent Training and Testing.
Create an Intent
Add Entities to Intents


Value Agnostic applies to built-in entities only. You cannot apply it to custom entities.

Alternatively, you can click New Entity to add an intent-specific entity.

Tip:
Only intent entities are included in the JSON payloads that are sent to, and returned by, the Component Service. The ones that aren’t associated with an intent won’t be included, even if they contribute to the intent resolution by recognizing user input. If your custom component accesses entities through entity matches, then be sure to add the entity to your intent.
Value Agnostic Intent Entities
The Value Agnostic feature allows you to adjust how entity values affect intent classification. When you enable this feature, the specific values for an associated built-in entity do not have bearing on the intent classification. However, when you disable this feature, you allow the entity value to play a key role in resolving the input.
| Example Intents | Associated Entity | Training Utterances | Enable Value Agnostic? |
|---|---|---|---|
| Account Balance | DATE | | Yes – The specific date values do not signal the intent. The various date values in these utterances can be ignored because a user can ask for an account balance on any day. |
| Holiday Store Hours | DATE | | No – The intent classification is based on a specific (and limited) set of values and users are inquiring about holidays. |
| | DATE | | No – Disable Value Agnostic for both intents. Agnostic DATE values in this case would mean that the model would not consider a value as past or future. A "last month" value, which should signal the Check Past Vacation Balance intent, would be ignored. As a result, similarly worded input like "Did I take any time off next month" may resolve incorrectly to this intent. |
Import Intents from a CSV File
You can add your intents manually, or import them from a CSV file. You can create this file from a CSV of exported intents, or by creating it from scratch in a spreadsheet program or a text file.
The CSV file has six columns for skills that use the Natively-Supported language mode and five columns for those that don't. Here are the column names and what they represent:
- query: An example utterance.
- topIntent: The intent that the utterance should match to.
- conversationName: The conversation name for the intent.
- answer: For answer intents, the static answer for the intent.
- enabled: If true, the intent is enabled in the skill.
- nativeLanguageTag: (For skills with native-language support only) the language of the utterance. For values, use two-character language tags (fr, en, etc.).
  - For skills with Digital Assistant's native language support, this column is required.
  - For skills without the native language support, you can't import a CSV that has this column.
Here's an excerpt from a CSV file for a skill that does not have native language support and which doesn't use answer intents.
query,topIntent,conversationName,answer,enabled
I want to order a pizza,OrderPizza,Order a Pizza.,,true
I want a pizza,OrderPizza,Order a Pizza.,,true
I want a pizaa,OrderPizza,Order a Pizza.,,true
I want a pizzaz,OrderPizza,Order a Pizza.,,true
I'm hungry,OrderPizza,Order a Pizza.,,true
Make me a pizza,OrderPizza,Order a Pizza.,,true
I feel like eating a pizza,OrderPizza,Order a Pizza.,,true
Gimme a pie,OrderPizza,Order a Pizza.,,true
Give me a pizza,OrderPizza,Order a Pizza.,,true
pizza I want,OrderPizza,Order a Pizza.,,true
I do not want to order a pizza,CancelPizza,Cancel your order.,,true
I do not want this,CancelPizza,Cancel your order.,,true
I don't want to order this pizza,CancelPizza,Cancel your order.,,true
Cancel this order,CancelPizza,Cancel your order.,,true
Can I cancel this order?,CancelPizza,Cancel your order.,,true
Cancel my pizza,CancelPizza,Cancel your order.,,true
Cancel my pizaa,CancelPizza,Cancel your order.,,true
Cancel my pizzaz,CancelPizza,Cancel your order.,,true
I'm not hungry anymore,CancelPizza,Cancel your order.,,true
don't cancel my pizza,unresolvedIntent,unresolvedIntent,,true
Why is a cheese pizza called Margherita,unresolvedIntent,unresolvedIntent,,true
Here's an excerpt from a CSV file for a skill with native-language support that uses answer intents.
query,topIntent,conversationName,answer,enabled,nativeLanguageTag
Do you sell pasta,Products,Our Products,We sell only pizzas. No salads. No pasta. No burgers. Only pizza,true,en
Vendez-vous des salades,Products,Our Products,Nous ne vendons que des pizzas. Pas de salades. Pas de pâtes. Pas de hamburgers. Seulement pizza,true,fr
do you sell burgers,Products,Our Products,We sell only pizzas. No salads. No pasta. No burgers. Only pizza,true,en
Do you sell salads,Products,Our Products,We sell only pizzas. No salads. No pasta. No burgers. Only pizza,true,en
Vendez des hamburgers,Products,Our Products,Nous ne vendons que des pizzas. Pas de salades. Pas de pâtes. Pas de hamburgers. Seulement pizza,true,fr
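Before importing, it can help to confirm that every row has the same number of fields as the header, because a missing comma shifts the remaining values into the wrong columns. The following is a minimal sketch of such a check, not part of the product: `check_intent_csv` is a hypothetical helper, and it assumes that `enabled` holds either `true` or `false`.

```python
import csv
import io

# Column layouts described above; the sixth column applies only to
# skills that use native language support.
BASE_COLUMNS = ["query", "topIntent", "conversationName", "answer", "enabled"]
NATIVE_COLUMNS = BASE_COLUMNS + ["nativeLanguageTag"]


def check_intent_csv(text, native_language=False):
    """Return a list of problems found in an intent CSV (empty list = none found)."""
    expected = NATIVE_COLUMNS if native_language else BASE_COLUMNS
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty"]
    problems = []
    if rows[0] != expected:
        problems.append(f"header should be {','.join(expected)}")
    for line_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(expected):
            problems.append(
                f"line {line_no}: expected {len(expected)} fields, found {len(row)}"
            )
        elif row[4].lower() not in ("true", "false"):
            # Assumption: enabled is a true/false flag.
            problems.append(f"line {line_no}: enabled should be true or false")
    return problems


sample = (
    "query,topIntent,conversationName,answer,enabled\n"
    "I want a pizza,OrderPizza,Order a Pizza.,,true\n"
    "Cancel my pizza,CancelPizza,Cancel your order.,,\n"
    "Gimme a pie,OrderPizza,Order a Pizza.,true\n"
)
for problem in check_intent_csv(sample):
    print(problem)
```

For the sample above, the check reports the empty `enabled` value on line 3 and the missing field on line 4.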
To import a CSV file:
- Click Intents in the left navbar.
- Click More, and then choose Import intents.
- Select the .csv file and then click Open.
- Train your skill.
Export Intents to a CSV File
You can reuse your training corpus by exporting it to CSV. You can then import this file to another skill.
- Click Intents in the left navbar.
- Click More, and then choose Export intents.
- Save the file. This file has the following columns, which are described in Import Intents from a CSV File:
  query, topIntent, conversationName, answer, enabled, nativeLanguageTag
Which Training Model Should I Use?
You can't use Trainer Ht for skills that use answer intents, use native language support, or have a large number of intents. Use Trainer Tm for these skills.
Trainer Tm
You don't need to bulk up your training data with utterances that accommodate case sensitivity (Tm recognizes BlacK Friday as Black Friday, for example), punctuation, similar verbs and nouns, or misspellings. In the latter case, Trainer Tm uses context to resolve a phrase even when a user enters a key word incorrectly. Here are some general guidelines for building a training corpus when you're developing your skill with this model.
- Recognizing the irrelevant content. For "I'm really excited about the coming Black Friday deals, and can't wait for the deals. Can you tell me what's going to be on sale for Black Friday?", Trainer Tm:
  - Discards the extraneous content ("I'm really excited about the coming Black Friday deals...")
  - Resolves the relevant content ("Can you tell me what's going to be on sale for Black Friday?") to an intent. In this case, an intent called Black Friday Deals.
  Trainer Tm can also distinguish between the relevant and irrelevant content in a message even when the irrelevant content can potentially be resolved to an intent. For example, "I bought the new 80 inch TV on Black Friday for $2200, but now I see that the same set is available online for $2100. Do you offer price match?" could be matched to the Black Friday Deals intent and to a Price Matching intent, which is appropriate for this message. In this case, Trainer Tm:
  - Recognizes that "I bought the new 80 inch TV on Black Friday for $2200, but now I see that the same set is available online for $2100" is extraneous content.
  - Resolves "Do you offer price match?"
- Resolving intents when a single word or a name matches an entity. For example, Trainer Tm can resolve a message consisting of only "Black Friday" to an intent that's associated with an entity for Black Friday.
- Distinguishing between similar utterances ("Cancel my order" vs. "Why did you cancel my order?").
- Recognizing out-of-scope utterances, such as "Show me pizza recipes" or "How many calories in a Meat Feast" for a skill for fulfilling a pizza order and nothing else.
- Recognizing out-of-domain utterances, such as "What's the weather like today" for a pizza ordering skill.
  Tip:
  While Trainer Tm can easily distinguish when a user message is unclassifiable because it's clearly dissimilar from the training data, you might still want to define an unresolvedIntent with utterances that represent the phrases that you want to make sure do not resolve to any of your skill's intents. These phrases can be within the domain of your skill, but are still out of scope, even though they may share some of the same words as the training data. For example, "I want to order a car" for a pizza skill, which has also been trained with "I want to order a pizza".
- Distinguishing between similar entities – For example, Tm recognizes that mail is not the same as email in the context of an intent called Sign Up for Email Deals. Because it recognizes that an entity called regular mail would be out of scope, it would resolve the phrase "I want to sign up for deals through regular mail" at a lower confidence than it would for "I want to sign up for email deals".
Trainer Ht
Trainer Ht is the default training model. It needs only a small training corpus, so use it as you develop the entities, intents, and the training corpus. When the training corpus has matured to the point where tests reveal highly accurate intent resolution, you’re ready to add a deeper dimension to your skill’s cognition by training Trainer Tm.
You can get a general understanding of how Trainer Ht resolves intents just from the training corpus itself. It forms matching rules from the sample sentences by tagging parts of speech and entities (both custom and built-in) and by detecting words that have the same meaning within the context of the intent. If an intent called SendMoney has both Send $500 to Mom and Pay Cleo $500, for example, Trainer Ht interprets pay as the equivalent of send. After training, Trainer Ht’s tagging reduces these sentences to templates (Send Currency to person, Pay person Currency) that it applies to the user input.
Because Trainer Ht draws on the sentences that you provide, you can predict its behavior: it will be highly accurate when tested with sentences similar to the ones that make up the training corpus (the user input that follows the rules, so to speak), but may fare less well when confronted with esoteric user input.
Build Your Training Corpus
When you define an intent, you first give it a name that illustrates some user action and then follow up by compiling a set of real-life user statements, or utterances. Collectively, your intents, and the utterances that belong to them, make up a training corpus. The term corpus is just a quick way of saying “all of the intents and sample phrases that I came up with to make this skill smart”. The corpus is the key to your skill’s intelligence. By training a model with your corpus, you essentially turn that model into a reference tool for resolving user input to a single intent. Because your training corpus ultimately plays the key role in deciding which route the skill-human conversation will take, you need to choose your words carefully when building it.
Generally speaking, a large and varied set of sample phrases increases a model’s ability to resolve intents accurately. But building a robust training corpus doesn’t just begin with well-crafted sample phrases; it actually begins with intents that are clearly delineated. Not only should they clearly reflect your use case, but their relationship to their sample sentences should be equally clear. If you’re not sure where a sample sentence belongs, then your intents aren’t distinct from one another.
You probably have sample utterances in mind when you create your intents, but you can expand upon them by using these guidelines.
Guidelines for Trainer Tm
- Use a minimum confidence threshold of 0.7 for any skill that you plan to put into production.
- Use good naming conventions for your intent names so it's easy to review related intents.
- As a general rule, create at least 80 to 100 utterances for each intent. Per the corpus size and shape guidelines, the minimum (though not recommended) number of utterances for an intent is two. The total number of utterances in your training set should not exceed 25,000.
- If possible, use unmodified, real-world phrases that include:
  - vernacular
  - standard abbreviations that a user might enter ("opty" for opportunity, for example)
  - non-standard names, such as product names
  - spelling variants ("check" and "cheque", for example)
- Create fully formed sentences that mention both the action and the entity on which the action is performed.
- Try to keep the utterance length between 3 and 30 words. Utterances that are too short and lacking context can cause the model to generalize in unpredictable ways. Utterances that are too long may prevent the model from identifying the pertinent words and phrases. There can be exceptions, however, for one- or two-word utterances when they're commonly used phrases. If you expect two-word messages (like order status, price check, membership info, or ship internationally) that specify both the entity and action, add them to your training data. Be sure that your sample phrases have both an action and an entity.
- Be specific. For example, What is your store phone number? is better than What is your phone number? because it enables Trainer Tm to associate a phone number with a store. As a result of this learning, it will resolve What's your mom's phone number? with a lower confidence score.
- While Trainer Tm detects out-of-scope utterances, you can still improve confidence and accuracy by creating an unresolvedIntent for utterances that are in domain but still out of scope for the skill's intents. This enables Trainer Tm to learn the boundary of domain intents. You can define an unresolvedIntent for phrases that you do not want resolved to any of your skill's intents. You may only want to define an unresolvedIntent when user messages have been resolved to a skill's intents even when they don't apply to any of them.
- Vary the words and phrases that surround the significant content as much as possible. For example, "I'd like a pizza, please", "Can you get me a pizza?", "A pizza, please".
- Some practices to avoid:
  - Do not associate a single word or phrase with a specific intent unless that word or phrase indicates the intent. Repeated phrases can skew the intent resolution. For example, starting each OrderPizza utterance with "I want to …" and each ShowMenu utterance with "Can you help me to …" may increase the likelihood of the model resolving any user input that begins with "I want to" with OrderPizza and "Can you help me to" with ShowMenu.
  - A high occurrence of one-word utterances in your intents. One-word utterances are an exception. Use them sparingly, if at all.
  - Open-ended utterances that can easily apply to other domains or out-of-domain topics.
  - Your corpus doesn't need to repeat the same utterance with different casing or with different word forms that have the same lemma. For example, because Trainer Tm can distinguish between manage, manages, and manager, it not only differentiates between "Who does Sam manage?" and "Who manages Sam?", but also understands that these words are related to one another.
Note
You may be tempted to add misspellings of words. But before you do, use those misspellings in the utterance tester to see if the model recognizes them. You might be surprised at how well it handles them. Also, by not adding misspellings you run less risk of skewing your model in unexpected ways.
- Create test cases to ensure the integrity of the intent resolution.
- Run the Overview report for skill quality to maintain a balanced training set. Run the anomalies report as well to check for misclassified or unusual utterances.
- When you deploy your skill, you can continuously improve the training data by:
  - Reviewing the Conversation Logs, summaries of all conversations that have occurred for a specified period. You enable the logging by switching Enable Insights on in Settings.
  - Running Skill Quality Reports and by assigning (or reassigning) actual user messages to your intents with the Insights Retrainer. If these reports indicate that unresolvedIntent has a lot of misclassified utterances within the domain intents:
    - Move the in-scope utterances from unresolvedIntent to the domain intents.
    - Move the out-of-scope utterances from the domain intents to unresolvedIntent.
Guidelines for Trainer Ht
- vernacular
- common misspellings
- standard abbreviations that a user might enter ("opty" for "opportunity", for example)
- non-standard names, such as product names
- spelling variants ("check" and "cheque", for example)
- Vary the vocabulary and sentence structure in these starter phrases by one or two permutations using:
  - slang words (moolah, lucre, dough)
  - standard abbreviations that a user might enter ("opty" for opportunity, for example)
  - non-standard names, such as product names
  - common expressions (Am I broke? for an intent called AccountBalance)
  - alternate wording (Send cash to savings, Send funds to savings, Send money to savings, Transfer cash to savings.)
  - different categories of objects (I want to order a pizza, I want to order some food).
  - alternate spellings (check, cheque)
  - common misspellings ("buisness" for business)
  - unusual word order (To checking, $20 send)
- Use different concepts to express the same intent, like I am hungry and Make me a pizza
- Do not associate a single word or phrase with a specific intent unless that word or phrase indicates the intent. Repeated phrases can skew the intent resolution. For example, starting each OrderPizza utterance with "I want to …" and each ShowMenu intent with "Can you help me to …" may increase the likelihood of the model resolving any user input that begins with "I want to" with OrderPizza and "Can you help me to" with ShowMenu.
- Avoid sentence fragments and single words. Instead, use complete sentences (which can be up to 255 characters) that include the action and the entity. If you must use single key word examples, choose them carefully.
- Create test cases to ensure the integrity of the intent resolution. Because adding new intent examples can cause regressions, you might end up adding several test phrases to stabilize the intent resolution behavior.
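The repeated-phrase pitfall described in these guidelines can be spotted before training. The following is a rough sketch, assuming the corpus is available as (intent, utterance) pairs; `leading_phrase_share` is a hypothetical helper, not a product feature, and comparing only the first three words is an arbitrary choice.

```python
from collections import Counter, defaultdict


def leading_phrase_share(corpus, n_words=3):
    """For each intent, return its most common opening phrase and the
    share of that intent's utterances that start with it."""
    openings = defaultdict(Counter)
    for intent, utterance in corpus:
        prefix = " ".join(utterance.lower().split()[:n_words])
        openings[intent][prefix] += 1
    report = {}
    for intent, counts in openings.items():
        phrase, hits = counts.most_common(1)[0]
        report[intent] = (phrase, hits / sum(counts.values()))
    return report


corpus = [
    ("OrderPizza", "I want to order a pizza"),
    ("OrderPizza", "I want to get a large pie"),
    ("OrderPizza", "I want to have pizza tonight"),
    ("OrderPizza", "Make me a pizza"),
    ("ShowMenu", "Can you help me to see the menu"),
    ("ShowMenu", "What pizzas do you have"),
]
for intent, (phrase, share) in leading_phrase_share(corpus).items():
    print(f"{intent}: '{phrase}' opens {share:.0%} of utterances")
```

A high share for one intent (here, three of the four OrderPizza utterances open with "i want to") suggests varying the wording around the significant content.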
Limits for Training Data Shape and Size
Regarding training data shape and size, here are the limits to the number of intents and utterances.
Intents:
- Minimum number of intents per skill: 2
- Maximum number of intents per skill: 2,500
Utterances:
- Maximum number of utterances per skill: 25,000
- Minimum number of utterances per intent: 2
- Utterance word length: Between 3 and 30 words. Per the guidelines for Trainer Tm, there are exceptions where one- or two-word utterances can be appropriate if they are commonly used.
These are technical limits, not recommendations. See Guidelines for Trainer Tm for practical recommendations for shaping your skills and providing robust training data.
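As an illustration, the limits listed above can be checked against a corpus before you import it. This is a sketch under assumptions: the corpus is a list of (intent, utterance) pairs, `check_corpus_limits` is a hypothetical helper, and words are counted by splitting on whitespace, which may differ from how the platform counts them.

```python
from collections import Counter

# Limits quoted in this section.
MIN_INTENTS, MAX_INTENTS = 2, 2500
MAX_UTTERANCES = 25000
MIN_UTTERANCES_PER_INTENT = 2
MIN_WORDS, MAX_WORDS = 3, 30


def check_corpus_limits(corpus):
    """corpus: list of (intent, utterance) pairs. Returns a list of warnings."""
    warnings = []
    per_intent = Counter(intent for intent, _ in corpus)
    if not MIN_INTENTS <= len(per_intent) <= MAX_INTENTS:
        warnings.append(
            f"{len(per_intent)} intents (allowed: {MIN_INTENTS} to {MAX_INTENTS})"
        )
    if len(corpus) > MAX_UTTERANCES:
        warnings.append(f"{len(corpus)} utterances (maximum: {MAX_UTTERANCES})")
    for intent, count in per_intent.items():
        if count < MIN_UTTERANCES_PER_INTENT:
            warnings.append(f"{intent}: only {count} utterance(s)")
    for intent, utterance in corpus:
        words = len(utterance.split())  # whitespace word count (assumption)
        if not MIN_WORDS <= words <= MAX_WORDS:
            warnings.append(f"{intent}: '{utterance}' has {words} word(s)")
    return warnings


corpus = [
    ("OrderPizza", "I want to order a pizza"),
    ("OrderPizza", "Gimme a pie"),
    ("CancelPizza", "Cancel this order"),
    ("CancelPizza", "Cancel"),
]
print(check_corpus_limits(corpus))
```

Short utterances are flagged for review rather than rejected, since commonly used one- or two-word phrases can be legitimate exceptions.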
Export Intent Data
To log conversations, be sure to switch on Enable Insights in Settings > General before you test your intents.
To export data for a skill:
- Open the side menu and select Development > Skills.
- In the tile for the skill, open the menu and select Export Conversations.
- Choose Intent Conversation Log, set the logging period, and then click Export.
- Review the user input by opening the CSV files in a spreadsheet program.
Intent Training and Testing
Training a model with your training corpus allows your bot to discern what users say (or in some cases, are trying to say).
You can improve the acuity of the cognition through rounds of intent testing and intent training. You control the training through the intent definitions alone; the skill can’t learn on its own from the user chat.
Testing Utterances
We recommend that you set aside 20% of your corpus for intent testing and use the remaining 80% to train your intents. Keep these two sets separate so that the test utterances, which you incorporate into test cases, remain "unknown" to your skill.
Apply the 80/20 split to each intent's data set. Randomize your utterances before making this split to allow the training models to weigh the terms and patterns in the utterances equally.
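One possible way to produce the per-intent split is sketched below. It assumes the corpus is a list of (intent, utterance) pairs; `split_corpus` is a hypothetical helper, and holding out at least one test utterance per intent is a choice made for this sketch, not a product rule.

```python
import random
from collections import defaultdict


def split_corpus(corpus, test_fraction=0.2, seed=0):
    """Shuffle each intent's utterances, then hold out test_fraction of
    them for testing. Returns (train, test) lists of (intent, utterance)."""
    by_intent = defaultdict(list)
    for intent, utterance in corpus:
        by_intent[intent].append(utterance)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for intent, utterances in by_intent.items():
        shuffled = utterances[:]
        rng.shuffle(shuffled)
        # Assumption: keep at least one test utterance per intent.
        n_test = max(1, round(len(shuffled) * test_fraction))
        test += [(intent, u) for u in shuffled[:n_test]]
        train += [(intent, u) for u in shuffled[n_test:]]
    return train, test


corpus = [("OrderPizza", f"order utterance {i}") for i in range(10)]
corpus += [("CancelPizza", f"cancel utterance {i}") for i in range(5)]
train, test = split_corpus(corpus)
print(len(train), len(test))
```

Because the split is made per intent, each intent keeps roughly the same 80/20 proportion instead of the split being skewed toward the largest intents.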
The Utterance Tester
The Utterance Tester is your window to your skill's cognition. By entering phrases
that are not part of the training corpus, you can find out how well you've crafted your
intents by reviewing the intent confidence ranking and the returned JSON. This ranking,
which is the skill's estimate for the best candidate to resolve the user input,
demonstrates its acuity at the current time.
Using the Utterance Tester, you can perform quick tests for one-off testing, or you can incorporate an utterance as a test case to gauge intent resolution across different versions of training models.
Test Cases
Each test has an utterance and the intent that it's expected to resolve to, which is known as a label match. A test case can also include matching entity values and the expected language for the utterance. You can run test cases when you’re developing a skill and later on, when the skill is in production, you can use the test cases for regression testing. In the latter case, you can run test cases to find out if a new release of the training model has negatively affected intent resolution.
Like the test cases that you create with the Conversation Tester, utterance test cases are part of the skill and are carried along with each version. If you extend a skill, then the extension inherits the test cases. Whereas conversation test cases are intended to test a scenario, utterance test cases are intended to test fragments of a conversation independently, ensuring that each utterance resolves to the correct intent.
Manage Test Cases


nluTestSuites
folder that houses the skill's test suites when the
skill is exported.
Create Test Suites
- Click + Test Suite.
- In the General tab, replace the placeholder name (TestSuite0001, for example) with a more meaningful one by adding a value in the Display Name field.
- Optionally, add a description that explains the functionality that's covered by the test suite.
- Populate the test suite with test cases using any (or a combination) of the following methods:
  - Manually adding test cases (either by creating a test case or by saving an utterance as a test case from the Utterance Tester).
  - Importing test cases.
    Note
    To assign a test case to a test suite via import, the CSV's testSuite field can either be empty, or must contain a name that matches the test suite that's selected in the import dialog.
  - Editing a test case to reassign its test suite.
- If you want to exclude the test suite from test runs that are launched using the All and Run All options, switch off Enable Test Suite.
- If you don't want the test suite included with the skill export, switch off Include in Skill Export. When you switch off this option for a test suite, it won't be included in the nluTestSuites folder that houses the skill's test suites in the exported ZIP file.
Create Utterance Test Cases
You can add test cases one-by-one using either Utterance Tester or the New Test Case dialog (accessed by clicking + Test Case), or you can add them in bulk by uploading a CSV.
Each test case must belong to a test suite, so before you create a test case, you may want to first create a test suite that reflects a capability of the skill, or some aspect of intent testing, such as failure testing, in-domain testing, or out-of-domain testing.
Tip:
To provide adequate coverage in your testing, create test suite utterances that are not only varied conceptually, but also grammatically, since users will not make requests in a uniform fashion. You can add these dimensions by creating test suites from actual user messages that have been queried in the Insights Retrainer and also from crowd-sourced input gathered from Data Manufacturing.
Add Test Cases from the Utterance Tester
- Click Test Utterances.
- If the skill is multi-lingual, select the native language.
- Enter the utterance then click Test.
- Click Save as Test Case then choose a test suite.
Create a Test Case
- Click Go to Test Cases in the Utterance Tester.
- Click + Test Case.
- Complete the New Test Case dialog:
  - If needed, disable the test case.
  - Enter the test utterance.
  - Select the test suite.
  - Select the expected intent. If you're creating a test case for failure testing, select unresolvedIntent.
  - For multi-lingual skills, select the language tag and the expected language.
- Click Add to Suite. From the Test Cases page, you can delete a test case, or edit a test case, which includes reassigning the test case to a different test suite.
- To test for entity values:
  - Switch on Test Entities. Then click Continue.
  - Highlight the word (or words) and then apply an entity label to it by selecting an entity from the list. When you're done, click Add to Suite.
  Note
  Always select words or phrases from the test case utterance after you enable Test Entities. The test case will fail if you've enabled Test Entities but have not highlighted any words.
Import Test Cases for Skill-Level Test Suites
- testSuite – The name of the test suite to which the test case belongs. The testSuite field in each row of the CSV can have a different test suite name or can be empty.
  - Test cases with empty testSuite fields get added to a test suite that you select when you import the CSV. If you don't select a test suite, they will be assigned to Default Test Suite.
  - Test cases with populated testSuite fields get assigned to the test suite that you select when you import the CSV only when the name of the selected test suite matches the name in the testSuite field.
  - If a test suite by the name of the one specified in the testSuite field doesn't already exist, it will be created after you import the CSV.
- utterance – An example utterance (required). Maps to query in pre-21.04 versions of Oracle Digital Assistant.
- expectedIntent – The matching intent (required). This field maps to TopIntent in pre-21.04 versions of Oracle Digital Assistant.
  Tip:
  Importing Pre-21.04 Versions of the CSV tells you how to reformat Pre-21.04 CSVs so that you can use them for bulk testing.
- enabled – TRUE includes the test case in the test run. FALSE excludes it.
- languageTag – The language tag (en, for example). When there's no value, the language detected from the skill's language settings is used by default.
- expectedLanguageTag (optional) – For multilingual skills, this is the language tag for the language that you want the model to use when resolving the test utterance to an intent. For the test case to pass, this tag must match the detected language.
- expectedEntities – The matching entities in the test case utterance, represented as an array of entityName objects. Each entityName identifies the entity value's position in the utterance using the beginOffset and endOffset properties. This offset is determined by character, not by word, and is calculated from the first character of the utterance (0-1). For example, the entityName object for the PizzaSize entity value of small in I want to order a small pizza is: [{"entityName":"PizzaSize","beginOffset":18,"endOffset":23,"originalString":"small"}, …]
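Counting character offsets by hand is error-prone, so a small helper can compute them. The sketch below reproduces the PizzaSize example; it assumes, based on that example's values of 18 and 23, that beginOffset is the zero-based index of the first character and endOffset is the index just past the last one. `entity_annotation` is a hypothetical helper name.

```python
import json


def entity_annotation(utterance, entity_name, value):
    """Build one expectedEntities item for the first occurrence of value."""
    begin = utterance.find(value)
    if begin == -1:
        raise ValueError(f"{value!r} does not occur in {utterance!r}")
    return {
        "entityName": entity_name,
        "beginOffset": begin,             # zero-based index of the first character
        "endOffset": begin + len(value),  # index just past the last character
        "originalString": value,
    }


item = entity_annotation("I want to order a small pizza", "PizzaSize", "small")
print(json.dumps([item], separators=(",", ":")))
# [{"entityName":"PizzaSize","beginOffset":18,"endOffset":23,"originalString":"small"}]
```

If a value occurs more than once in an utterance, this sketch annotates only the first occurrence.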

- Click More, then select Import.
- Browse to, then select the CSV.
- Choose the test suite. The test case can only be assigned to the selected test suite if the testSuite field is empty or matches the name of the selected test suite.
- Click Upload.
Importing Pre-21.04 Versions of the CSV
Test cases from pre-21.04 CSVs, which have the query and TopIntent fields, get added to Default Test Suite only. You can reassign these test cases to other test suites individually by editing them after you import the CSV, or you can update the CSV to the current format and then edit it before you import it, as follows:
- Click More > Import.
- After the import completes, select Default Test Suite, then click More > Export Selected Suite. The exported file will be converted to the current format.
- Extract the ZIP file and edit the CSV. When you've finished, import the CSV again (More > Import). You may need to delete duplicate test cases from the Default Test Suite.
Note
If you upload the same CSV multiple times with minor changes, any new or updated data will be merged with the old: new updates get applied and new rows are inserted. However, you can't delete any utterances by uploading a new CSV. If you need to delete utterances, then you need to delete them manually from the user interface.
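The merge behavior in this note resembles an "upsert without delete". The sketch below is only a mental model, not the product's implementation: it assumes that rows are matched by their utterance text, and `merge_upload` is a hypothetical name.

```python
def merge_upload(existing, uploaded):
    """Apply uploaded rows on top of existing ones. Rows that are missing
    from the upload are kept, which is why a re-upload never deletes."""
    merged = dict(existing)
    merged.update(uploaded)
    return merged


# utterance -> expected intent (assumption: utterance text is the key)
existing = {"I want a pizza": "OrderPizza", "Forget it": "OrderPizza"}
uploaded = {"Forget it": "CancelPizza", "Gimme a pie": "OrderPizza"}
merged = merge_upload(existing, uploaded)
print(merged)
```

In this model, "Forget it" is updated, "Gimme a pie" is inserted, and "I want a pizza" stays even though the second upload no longer contains it.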
Create Test Runs
Test runs are a compilation of test cases or test suites aimed at evaluating some aspect of the skill's cognition. The contents (and volume) of a test run depend on the capability that you want to test, so a test run might include a subset of test cases from a test suite, a complete test suite, or multiple test suites.
The test cases included in a test run are evaluated against the confidence threshold that's set for the skill. For a test case to pass in the overall test run, it must resolve to the expected intent at, or above, the confidence threshold. If specified, the test case must also satisfy the entity value and language-match criteria. By reviewing the test run results, you can find out if changes made to the platform, or to the skill itself, have compromised the accuracy of the intent resolution.
In addition to testing the model, you can also use the test run results to assess the reliability of your testing. For example, results showing that nearly all of the test cases have passed might, on the surface, indicate optimal functioning of the model. However, a review of the passing test cases may reveal that the test cases do not reflect the current training because their utterances are too simple or have significant overlap in terms of the concepts and verbiage that they're testing for. A high number of failed tests, on the other hand, might indicate deficiencies in the training data, but a review of these test cases might reveal that their utterances are paired with the wrong expected intents.
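The pass criteria described above can be summarized as a small predicate. This is an illustrative sketch only: the `UtteranceCase` and `Resolution` classes and their fields are hypothetical, and the default threshold of 0.7 is the minimum suggested in Guidelines for Trainer Tm rather than a fixed product value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UtteranceCase:
    utterance: str
    expected_intent: str
    expected_language: Optional[str] = None  # checked only when set
    expected_entities: Optional[list] = None  # checked only when set


@dataclass
class Resolution:
    intent: str
    confidence: float
    language: Optional[str] = None
    entities: Optional[list] = None


def passes(case, result, threshold=0.7):
    """True when the result meets every criterion that the case specifies."""
    if result.intent != case.expected_intent or result.confidence < threshold:
        return False
    if case.expected_language is not None and result.language != case.expected_language:
        return False
    if case.expected_entities is not None and result.entities != case.expected_entities:
        return False
    return True


case = UtteranceCase("I want to order a small pizza", "OrderPizza", expected_language="en")
print(passes(case, Resolution("OrderPizza", 0.82, language="en")))  # True
print(passes(case, Resolution("OrderPizza", 0.55, language="en")))  # False: below threshold
```

A correct intent below the threshold still fails, which is why raising the skill's confidence threshold can turn previously passing test cases into failures.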
- Click Run All to create a test run for all
of the test cases in a selected test suite. (Or if you want to run all test
suites, select All then click Run
All).
- To create a test run for a selection of test cases within a
suite (or a test run for subset of all test cases if you selected
All), filter the test cases by adding a
string that matches the utterance text and an expected intent. Select
the utterance(s), then click Run.
- To exclude test suite from the test run, first select the test suite,
open the General tab, and then switch off Enable Test
Suite.
- For multilingual skills, you can also filter by
Language Tag and Expected
Language options (accessed through Optional
Attributes).
- To create a test run for a selection of test cases within a
suite (or a test run for subset of all test cases if you selected
All), filter the test cases by adding a
string that matches the utterance text and an expected intent. Select
the utterance(s), then click Run.
- Enter a test run name that reflects the subject of the test. This is an optional step.
- Click Start.
- Click Test Results, then select the test
run.
Tip:
Test runs that contain a large number of test cases may take several minutes to complete. For these large test runs, you may need to click Refresh periodically until the testing completes. A percentage replaces the In Progress status for the Accuracy metric and the Intents report renders after all of the test cases have been evaluated.
- Review the test run reports. For example, first review the high-level metrics for the test run provided by the Overview report. Next, validate the test results against the actual test cases by filtering the Test Cases report, which lists all of the test cases included in the test run, for passed and failed test cases. You can then examine the individual test case results. You might also compare the Accuracy score in the Overview report to the Accuracy score in the Intents report, which measures the model's ability to predict the correct intents. To review the test cases listed in this report, open the Test Cases report and filter by intents.
Test Run Summary Report
The Summary report provides you with an overall assessment of how successfully the
model can handle the type of user input that's covered in the test run. For the test
suites included in the test run, it shows you the total number of test cases that have
been used to evaluate the model and, from that total, the number of test cases
(both reliable and unreliable) that failed along with the number (both reliable and
unreliable) that passed. The model's overall accuracy – its ability to predict
expected intents at or above the skill's confidence level, recognize entity values, and
resolve utterances in the skill's language – is gauged by the success rate of the
passing tests in the test run.
Summary Report Metrics
- Accuracy – The model's accuracy in terms of
the success rate of the passing test cases (the number of passing test cases
compared to the total number of test cases included in the test run).
Note
Disabled test cases are not factored into the Accuracy score. Neither are the tests that failed because of errors. Any test that failed is instead added to the Failed count.
A low Accuracy score might indicate the test run is evaluating the model on concepts and language that are not adequately supported by the training data. To increase the Accuracy score, retrain the model with utterances that reflect the test cases in the test run.
This Accuracy metric applies to the entire test run and provides a separate score from the Accuracy metric in the Intents report. This metric is the percentage of test cases where the model passed all of the test case criteria. The Accuracy score in the Intents report, on the other hand, is not end-to-end testing. It is the percentage of test cases where the model had only to predict the expected intent at, or above, the skill's confidence threshold. Other test case criteria (such as entity value or skill language) are not factored in. Given the differing criteria of what a passing test case means for these two reports, their respective Accuracy scores may not always be in step. The intent match Accuracy score may be higher than the overall test run score when the testing data is not aligned with the training data. Retraining the model with utterances that support the test cases will enable it to predict the expected intents with higher confidence that will, in turn, increase the Accuracy score for the test run.
Note
The Accuracy metric is not available until the test run has completed and is not available for test runs that were completed when the skill ran on pre-22.12 versions of the Oracle Digital Assistant platform.
- Test Cases – The total number of test cases (both reliable and unreliable test cases) included in the test run. Skipped test cases are included in this tally, but they are not considered when computing the Accuracy metric.
- Passed – The number of test cases (both reliable and unreliable) that passed by resolving to the intent at the confidence threshold and by matching the selected entity values or language.
- Failed – The number of test cases (both
reliable and unreliable) that failed to resolve to the expected intent at the
confidence threshold or failed to match the selected entity values or
language.
To review the actual test cases behind the Passed and Failed metrics in this report, open the Test Cases report and then apply its Passed or Failed filters.
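The difference between the two Accuracy scores described above can be illustrated with a minimal Python sketch. The `CaseResult` class, its field names, and the sample results are hypothetical and are not part of the product; the sketch only assumes that the Summary score counts a test case as passing when every criterion is met, while the Intents score looks at the intent match alone.

```python
from dataclasses import dataclass


@dataclass
class CaseResult:
    """Outcome of one enabled test case (hypothetical structure)."""
    intent_ok: bool            # expected intent predicted at or above the threshold
    entity_ok: bool = True     # entity criterion met (True when none was specified)
    language_ok: bool = True   # language criterion met (True when none was specified)


def summary_accuracy(results):
    """Percentage of test cases that satisfied all of their criteria."""
    passed = sum(r.intent_ok and r.entity_ok and r.language_ok for r in results)
    return 100.0 * passed / len(results)


def intent_accuracy(results):
    """Percentage of test cases that satisfied the intent criterion only."""
    passed = sum(r.intent_ok for r in results)
    return 100.0 * passed / len(results)


results = [
    CaseResult(True),
    CaseResult(True),
    CaseResult(True, entity_ok=False),  # right intent, wrong entity value
    CaseResult(False),                  # intent below threshold or wrong intent
]

print(summary_accuracy(results))  # 50.0
print(intent_accuracy(results))   # 75.0
```

In this sample, the third test case lowers only the end-to-end score, which is why the intent match score can be the higher of the two.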
Intents Report
The metrics in this report track the model's label matches throughout the test run's test cases. This is where the model correctly predicts the expected intent for the test case utterance. Within the context of this report, accuracy, passing, and failing are measured in terms of the test cases where the model predicted the correct expected intent at, or above, the confidence threshold. Other criteria considered in the Summary report, such as entity value matches or skill language, are not considered. As a result, this report provides you with a different view of model accuracy, one that helps you to verify if the current training enables the model to consistently predict the correct intents.
This report is not available for test runs that were completed when the skill ran on a pre-22.12 version of the Oracle Digital Assistant platform.

Intents Report Metrics
- Test Cases – The number of test cases
included in this test run. This total includes both reliable and unreliable test
cases. Skipped test cases are not included in this tally.
Tip:
The unreliable test case links for the Test Cases, Passed and Failed metrics open the Test Cases report filtered by unreliable test cases. This navigation is not available when you filter the report by test suite.
- Accuracy – The model's accuracy in matching
the expected intent at, or above, the skill's confidence threshold across the
test cases in this test run. The Label Match submetric
represents the percentage of test cases in the test run where the model
correctly predicted the expected intent, regardless of the confidence score.
Because Label Match factors in failing test cases along with passing test cases,
its score may be higher than the Accuracy score.
You can compare this Accuracy metric with the Accuracy metric from the Summary report. When the Accuracy score in the Summary report is low, you can use this report to quickly find out if the model's failings can be attributed to its inability to predict the expected intent. When the Accuracy score in this report is high, however, you can rule out label matching as the root of the problem and, rather than having to heavily revise the training data to increase the test run's Accuracy score, you can instead focus on adding utterances that reflect the concepts and language in the test case utterances.
- Passed – The number of test cases (reliable and unreliable) where the model predicted the expected intent at, or above, the skill's confidence threshold.
- Failed – The number of test cases (reliable and unreliable) where the model predicted the expected intent below the skill's confidence threshold.
- Confidence Pass – An average of the confidence scores for all of the test cases that passed in this test run.
- Confidence Fail – An average of the confidence scores for all of the test cases that failed in this test run.
When you filter the Intents report by test suite, access to the Test Cases report from the unreliable test case links in the Test Cases, Passed, and Failed tiles is not available. These links become active again when you remove all entries from the Filter by Test Suite field.
Filter by Test Suite

The report adjusts the metrics for each test suite that you add (or subsequently remove). It tabulates the intent matching results in terms of the number of enabled test cases that belong to the selected test suite.
You can only filter on test suites that have been run on Version 23.06 of the Oracle Digital Assistant platform. Because test suites that were run on a prior version of the platform can't be filtered, you'll have to run them again when you upgrade to 23.06.

Filtering by test suite disables navigation to the Test Cases report from the unreliable test cases links in the Test Cases, Passed, and Failed tiles. The links in the Total column of the Intents Breakdown are also disabled. All of these links become active again after you remove all of the entries from the Filter by Test Suite field.
Intents Breakdown
The Filter by Intent field changes the view of the Intents Breakdown table but does not change the report's overall metrics. These metrics reflect the entries (or lack of entries) in the Filter by Test Suite field.
- Intent – The name of the expected intent.
- Total – The number of test cases,
represented as a link, for the expected intent. You can traverse to the Test
Cases report by clicking this link.
Note
You can't navigate to the Test Cases report when you've applied a test suite filter to this report. This link becomes active again when you remove all entries from the Filter by Test Suite field.
- Accuracy – The percentage of test cases that resulted in label matches for the expected intent at, or above, the skill's confidence threshold.
- Passed – The number of test cases (including unreliable test cases) where the model predicted the expected intent at, or above, the skill's confidence threshold.
- Passed - Unreliable – The number of test cases where the model predicted the expected intent at 5% or less above the skill's confidence threshold.
- Failed – The number of test cases in the test run that failed because the model predicted the expected intent below the skill's confidence threshold.
- Failed - Unreliable – The number of test cases that failed because the model's confidence in predicting the expected intent fell within 5% below the skill's confidence threshold. These test cases can factor into the
- Label Match – The number of test cases where the model successfully predicted the expected intent, regardless of confidence level. Because it factors in failed test cases, the Label Match and Accuracy scores may not always be in step with one another. For example, four passing test cases out of five results in an 80% Accuracy score for the intent. However, if the model predicted the intent correctly for the one failing test case, then Label Match would outscore Accuracy by 20%.
- Confidence Pass – An average of the confidence scores for all of the test cases that successfully matched the expected intent.
- Confidence Fail – An average of the
confidence scores for all of the test cases that failed to match the expected
intent.
Tip:
To review the actual test cases, open the Test Cases report and then filter by the intent.
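The Label Match example from the breakdown above (four passing test cases out of five, with the failing one still predicting the right label) can be worked through in a short Python sketch. The intent name, confidence scores, and threshold here are made-up sample values.

```python
from statistics import mean

THRESHOLD = 0.7          # skill's confidence threshold (sample value)
EXPECTED = "OrderPizza"  # expected intent for all five hypothetical test cases

# (top predicted intent, confidence score) for each test case
predictions = [
    ("OrderPizza", 0.92),
    ("OrderPizza", 0.88),
    ("OrderPizza", 0.81),
    ("OrderPizza", 0.76),
    ("OrderPizza", 0.55),  # correct label, but below the threshold: a failed case
]

passing = [conf for intent, conf in predictions
           if intent == EXPECTED and conf >= THRESHOLD]
failing = [conf for intent, conf in predictions
           if not (intent == EXPECTED and conf >= THRESHOLD)]
label_hits = sum(intent == EXPECTED for intent, _ in predictions)

accuracy = 100 * len(passing) / len(predictions)
label_match = 100 * label_hits / len(predictions)

print(accuracy)       # 80.0
print(label_match)    # 100.0
print(mean(failing))  # 0.55 (the Confidence Fail average for this sample)
```

Label Match exceeds Accuracy by 20 percentage points here because the one failing test case still carries the expected label.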

Test Cases Report
- You can filter the results by clicking All,
Passed (green), or Failed
(red). The test cases counted as skipped include both disabled test cases and
test cases where the expected intent has been disabled.
You can filter the results by unreliable test cases by either clicking Show me unreliable cases in the warning message, or by selecting the Only Unreliable Cases filter.
- If needed, filter the results for a specific intent or entity or by reliable or unreliable test cases.
- For unreliable and failed test cases, click View Similar
Utterances (located in the Test Info page) to find out if the
test case utterance has any similarity to the utterances in the training
set.
- Check the following results:
- Test Info – Presents the test case overview, including the target confidence threshold, the expected intent, and the matched entity values.
- Test Result – The ranking of intent by confidence level. When present, the report also identifies the entities contained in the utterance by entity name and value. You can also view the JSON object containing the full results.
- Failure Analysis – Explains why the test case failed. For example, the actual intent is not the expected intent, the labeled entity value in the test case doesn't match the resolved entity, or the expected language is not the same as the detected language.
Unreliable Test Cases
Some test cases cannot provide consistent results because they resolve within 5% or less of the Confidence Threshold. This narrow margin makes these test cases unreliable. When the skill's Confidence Threshold is set at 0.7, for example, a test case that's passing at 74% may fail after you've made only minor modifications to your training data or because the skill has been upgraded to a new version of the model. The fragility of these test cases may indicate that the utterances that they represent in the training data may be too few in number and that you may need to balance the intent's training data with similar utterances.
- Run the test suite. Then click Test Results
and select the test run. The unreliable test cases are sorted at the beginning
of the test run results and are flagged with warnings.
- To isolate the unreliable test cases:
- Click Show me the unreliable cases in
the message.
- Select Only Unreliable Cases from the
Filter by Cases menu.
- To find the proximity of the test case's top-ranking intent to the
Confidence Threshold, open the Test Result window. For a comparison of the
top-ranking confidence score to the Confidence Threshold, click
.
- If you need to supplement the training data for the top-ranking
intent, click Go to top intent in the warning
message.
- If you want to determine the quantity of utterances that are
represented by the test case in the training data, click View Similar
Utterances.
You can also check if any of the utterances most similar to the test case utterance are also anomalies in the training set by running the Anomalies Report.
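As a rough illustration of the 5% margin described in this section, the following Python sketch labels a confidence score as reliable or unreliable. It assumes the margin is an absolute 0.05 on either side of the threshold, which is consistent with the 0.7 and 74% example above; the function name and labels are hypothetical.

```python
def classify(score, threshold=0.7, margin=0.05):
    """Label a test case by its top confidence score (illustrative only)."""
    passed = score >= threshold
    # Small tolerance so that floating-point noise does not flip borderline cases.
    unreliable = abs(score - threshold) <= margin + 1e-9
    label = "Passed" if passed else "Failed"
    return label + (" - Unreliable" if unreliable else "")


print(classify(0.74))  # Passed - Unreliable
print(classify(0.90))  # Passed
print(classify(0.67))  # Failed - Unreliable
print(classify(0.40))  # Failed
```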
Exported Test Runs
Test runs are not persisted with the skill, but you can download them to your system for analysis by clicking Export Test Run. If the intents no longer resolve the user input as expected, or if platform changes have negatively impacted intent resolution, you can gather the details for an SR (service request) using the logs of exported test runs.
Failure Testing
Failure (or negative) testing enables you to bulk test utterances that should never be resolved, either because they result in unresolvedIntent, or because they only resolve to other intents below the confidence threshold for all of the intents.
- Specify unresolvedIntent as the Expected Intent for all of the test
cases that you expect to be unresolved. Ideally, these "false" phrases will
remain unresolved.
- If needed, adjust the confidence threshold when creating a test run to confirm that the false phrases (the ones with unresolvedIntent as their expected intent) can only resolve below the value that you set here. For example, increasing the threshold might result in the false phrases failing to resolve at the confidence level to any intent (including unresolvedIntent), which means they pass because they're considered unresolved.
- Review the test results, checking that the test cases passed by matching unresolvedIntent at the threshold, or failed to match any intent (unresolvedIntent or otherwise) at the threshold.
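The pass condition for a failure test described above can be sketched in Python as follows. This is a simplified reading of the steps, not the platform's implementation: the function name and the ranked-list input format are assumptions made for illustration.

```python
UNRESOLVED = "unresolvedIntent"


def negative_test_passes(ranked, threshold):
    """Return True when a 'false' phrase stays unresolved.

    ranked: list of (intent, confidence) pairs, highest confidence first.
    """
    top_intent, top_confidence = ranked[0]
    if top_confidence < threshold:
        # No intent reaches the threshold, so the phrase is considered unresolved.
        return True
    # Something resolved at the threshold: it must be unresolvedIntent itself.
    return top_intent == UNRESOLVED


ranking = [("OrderPizza", 0.62), (UNRESOLVED, 0.30)]
print(negative_test_passes(ranking, threshold=0.7))  # True
print(negative_test_passes(ranking, threshold=0.6))  # False
```

The second call shows why raising the threshold for the test run can turn a failing negative test into a passing one.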
Similar Utterances
You can find out how similar your test phrase is to the utterances in the training
corpus by clicking View Similar Utterances. This tool provides
you with an added perspective on the skill's training data by showing you how similar
its utterances are to the test phrase, and by extension, how similar the utterances are
to one another across intents. Using this tool, you can find out if the similarity of
the test phrase to utterances belonging to other intents is the reason why the test
phrase is not resolving as expected. It might even point out where training data belongs
to the wrong intent because of its similarity to the test phrase.
The list generated by this tool ranks 20 utterances (along with their associated intents) that are closest to the test phrase. Ideally, the top-ranking utterance on this list – the one most like the test phrase – belongs to the intent that's targeted for the test phrase. If the closest utterance that belongs to the expected intent is further down, then a review of the list might provide a few hints as to why. For example, if you're testing a Transactions intent utterance, how much money did I transfer yesterday?, you'd expect the top-ranking utterance to likewise belong to a Transactions intent. However, if this test utterance is resolving to the wrong intent, or resolving below the confidence level, the list might reveal that it has more in common with highly ranked utterances with similar wording that belong to other intents. The Balances intent's How much money do I have in all of my accounts?, for example, might be closer to the test utterance than the Transactions intent's lower-ranked How much did I deposit in April? utterance.

You can only use this tool for skills trained on Trainer Tm (it's not available for skills trained with Ht).
- Filter by Intent – Returns 20 utterances that
are closest to the test utterance that belong to the selected intent (or
intents).
- Filter by Utterance – Returns 20 of the
utterances closest to the test utterance that contain a word or phrase.
- Language – For multi-lingual skills, you can query and filter the
report by selecting a language.
Applying these filters does not change the rankings, just the view. An utterance ranked third, for example, will be noted as such regardless of the filter. The report's rankings and contents change only when you've updated the corpus and retrained the skill with Trainer Tm.
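The point that filtering changes the view but not the rankings can be shown with a small Python sketch. The ranked list is invented sample data (only two of the utterances come from the example above), and the helper function is hypothetical.

```python
# (rank, intent, utterance); the rank is fixed when the list is generated
ranked = [
    (1, "Balances", "How much money do I have in all of my accounts?"),
    (2, "Balances", "What's my current balance?"),  # made-up utterance
    (3, "Transactions", "How much did I deposit in April?"),
]


def filter_by_intent(rows, intent):
    """Keep only the rows for one intent without renumbering them."""
    return [row for row in rows if row[1] == intent]


for rank, intent, utterance in filter_by_intent(ranked, "Transactions"):
    print(rank, intent, utterance)  # 3 Transactions How much did I deposit in April?
```

The Transactions utterance is still reported as rank 3 after filtering, even though it is the only row in view.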
Tutorial: Best Practices for Building and Training Intents
Use this tutorial to find out about batch testing and other testing and training tips: Best Practices for Building and Training Intents.
Reference Intents in the Dialog Flow
You can reference intents directly from the dialog flow.
For dialog flows designed in Visual mode, you define intent events in the Main Flow. See Map an Intent to a Flow.
For dialog flows designed in YAML mode, you use the System.Intent component to navigate to the appropriate state for the resolved intent. For example, here's what the System.Intent component might look like if you had intents called OrderPizza and CancelPizza:
    intent:
      component: "System.Intent"
      properties:
        variable: "iResult"
      transitions:
        actions:
          OrderPizza: "startOrder"
          CancelPizza: "cancelorder"
          unresolvedIntent: "unresolved"
Tune Intent Resolution Before Publishing
Before you publish a version of a skill (and thus freeze that version), you should thoroughly test it and, if necessary, adjust its settings to fine-tune its intent resolution.
You can use these settings to tune intent resolution:
- Confidence Threshold: Determines the minimum confidence level required for user input to match an intent. It's recommended to set this value to .70 or higher.
- Confidence Win Margin: When a skill has multiple intents that exceed the value of the Confidence Threshold, it displays a list of possible intents and prompts the user to choose one. This property helps the skill determine what intents should be in the list. Set the maximum level to use for the delta between the respective confidence levels for the top intents. The list includes the intents that are greater than or equal to this delta and exceed the value set for the Confidence Threshold.
To access these settings:
- Click the menu icon to open the side menu, select Development > Skills, and open your bot.
- In the left navigation for the skill, click the settings icon and select the Configuration tab.
Once you add a skill to a digital assistant, there is another range of settings that you may need to adjust to better handle intent resolution in the context of the digital assistant. See Tune Routing Behavior.
How Confidence Threshold Works
You use the Confidence Threshold property to adjust the likelihood that given user input will resolve to the skill's intents.
When you increase the confidence threshold, you increase the certainty that any
matching intents are accurate (not false positives). However, this also increases the
chance that intents that you want to match with certain input will not get high enough
confidence scores for the matching to occur, thus resulting in matches to
unresolvedIntent.
When you lower the value of the Confidence Threshold property, you reduce the chance that intents that you want to match will fail to match. However, the lower you set this threshold, the greater risk you have of generating false positives in your matches.
As a general rule, the underlying language model works better with higher confidence
thresholds, so you should set the confidence threshold to 70% (.70)
or higher to get the best results.
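The trade-off described in this section can be seen in a minimal Python sketch. It is a simplification that looks only at the top-scoring intent; the `resolve` function and the sample scores are hypothetical.

```python
def resolve(scores, threshold=0.7):
    """Return the top intent if it meets the threshold, else unresolvedIntent."""
    intent, confidence = max(scores.items(), key=lambda item: item[1])
    return intent if confidence >= threshold else "unresolvedIntent"


scores = {"OrderPizza": 0.66, "CancelPizza": 0.21}

print(resolve(scores, threshold=0.7))  # unresolvedIntent
print(resolve(scores, threshold=0.6))  # OrderPizza
```

With the higher threshold, the same input falls through to unresolvedIntent; with the lower one it matches, at the cost of accepting a less certain prediction.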
How Confidence Win Margin Works
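As a rough sketch of the Confidence Win Margin description given earlier, the following Python function builds the list of intents that a user would be asked to choose from. It rests on two assumptions that the text does not spell out: that the delta is measured from the top-scoring intent, and that a score equal to the threshold counts as meeting it. The function name and sample scores are hypothetical.

```python
def candidate_intents(scores, threshold=0.7, win_margin=0.1):
    """List intents that meet the threshold and sit within the win margin of the top one."""
    qualified = sorted(
        ((intent, conf) for intent, conf in scores.items() if conf >= threshold),
        key=lambda item: item[1],
        reverse=True,
    )
    if not qualified:
        return []  # nothing meets the threshold, so nothing to choose from
    top_confidence = qualified[0][1]
    return [intent for intent, conf in qualified
            if top_confidence - conf <= win_margin]


scores = {"OrderPizza": 0.82, "CancelPizza": 0.78, "TrackOrder": 0.71, "Greeting": 0.30}

print(candidate_intents(scores, win_margin=0.1))   # ['OrderPizza', 'CancelPizza']
print(candidate_intents(scores, win_margin=0.15))  # ['OrderPizza', 'CancelPizza', 'TrackOrder']
```

Under these assumptions, a wider margin lets more of the qualifying intents into the list, while a narrower one leaves only the top intent.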

Answer Intents
In some cases, a user's question requires only a single answer and no further conversation. Answer intents enable your skill to output these types of replies without you having to update the dialog definition.
You can create answer intents in the following ways:
- Use the Knowledge feature to generate answer intents from an existing resource, such as an FAQ that is hosted on a web page or in a PDF document.
- On the skill's Intents page, define answer intents like you would any other intent but also include an answer in the Answer field.
- Do bulk creation of answer intents by uploading a CSV file.
Here are a few more things you need to know about answer intents:
- Skills with answer intents should be trained with Trainer Tm.
- Unlike regular intents, you don't need to map answer intents to flows (in the Visual Flow Designer) or to states with System.Intent actions (in the YAML editor).
- In the Visual Flow Designer, you can create a standard flow that handles all answer intents, map specific answer intents, or use a combination of the approaches.
- In the YAML editor, you just need to have a System.Intent component to resolve the answer intents.
- You can optionally store the answer intent in a resource bundle by clicking
The resource bundle entries for answer intents are listed in the resource bundle's Q&A page.
Generate Answer Intents from an Existing Knowledge Resource
If you already have a web page or PDF document with question and answer pairs, you can use the Knowledge feature to ingest those Q&A pairs from the document and generate answer intents automatically. (Other text in the document that is not organized as question/answer pairs is ignored.) When you create answer intents this way, example utterances are also generated for the intents.
To generate answer intents from a question and answer document:
- In the left navbar of the skill, click Knowledge.
- Click + Knowledge Document.
- In the New Knowledge Document dialog:
- Specify a name and language for the document.
For the language, you can select from the natively-supported languages that you have specified for your skill.
- Select PDF or
HTML and upload the document, or select
URL.
If you are providing a URL, it must point to a static HTML page containing the FAQ.
- If the document is a PDF or HTML document for upload, select the checkbox acknowledging that it will be temporarily stored.
- Click Create.
Note
The URL option only works for HTML web pages. If you want to import an online PDF file, you need to first download it from the web page and then upload it into Digital Assistant.
- Wait for the generation of the answer intents to occur.
(The status and progress of the job will be updated every 10 seconds.)
- Once the job is completed, click Review Intents
to go over the generated intents and training utterances. Pay particular attention
to each question and answer to make sure that each contains the right text.
Tip:
For PDF documents, you can click Open PDF to view a color-coded version of the document to see what text was used to generate the intents and how it was divided into questions and answers.
- To edit an intent's name, question, answer, or utterances, click its Edit icon.
Note
You can also later edit these values on the Intents page.
- For any intents that you don't want added to the skill, clear the Include checkbox.
- Click Add Intents to Skill to add the generated intents to the skill.
- In the left navbar, click Intents
and make any further adjustments to the intents, such as changing the conversation name and adding further example utterances.
The answers are generated with HTML tags for formatting that is included in the original for things like bold text, italics, and hyperlinks. This markup is then automatically transformed into the appropriate markup or markdown for the channels through which the skill is exposed. (If the channel doesn't support formatting, the tags are removed when the message is sent through that channel.) See Rich Text Formatting in Channels.
There is a limit of 100 answer intents that can be created at a time. If your knowledge document has more than 100 question/answer pairs, divide the document into smaller documents and create the answer intents from each of those documents.
Create a Single Answer Intent
If you need just a few answer intents, you can create them similarly to how you create regular intents.
- Click Intents
in the left navbar.
- Click Add Intent.
- Click
to enter a descriptive name or phrase for the intent in the Conversation Name field.
- Add the intent name in the Name field. If you don't
enter a conversation name, then the Name field value is used
instead.
Note
In naming your intents, do not use system. as a prefix. system. is a namespace that's reserved for the intents that we provide. Because intents with this prefix are handled differently by Trainer Tm, using it may cause your intents to resolve in unexpected ways.
- Click
and then add an answer to the Answer field. Apply formatting to the text as needed.
- In the Examples section, add training utterances that reflect typical ways that users would express the question that the intent is answering.
Create Answer Intents from a CSV File
You can create answer intents in bulk by uploading a CSV file. In addition to the query, topIntent, and conversationName columns, this file also has the answer column:
query,topIntent,conversationName,answer
What are your hours?,StoreHours,Our Store Hours,"We're open from 9-5, Mondays-Thursdays or by appointment."
When are you open?,StoreHours,Our Store Hours,"We're open from 9-5, Mondays-Thursdays or by appointment."
When do you close?,StoreHours,Our Store Hours,"We're open from 9-5, Mondays-Thursdays or by appointment."
What do you sell?,Products,Our Products,We sell only hammers. All types.
Do you sell brick hammers?,Products,Our Products,We sell only hammers. All types.
Do you sell claw hammers?,Products,Our Products,We sell only hammers. All types.
Do you deliver?,Delivery_and_Pickup,Pickup and Delivery options,"No delivery service, sorry. Purchases are in-store only"
Can I buy one of your hammers on the web?,Delivery_and_Pickup,Pickup and Delivery options,"No delivery service, sorry. Purchases are in-store only"
Can you mail me a hammer?,Delivery_and_Pickup,Pickup and Delivery options,"No delivery service, sorry. Purchases are in-store only"
Can I return a hammer?,Returns,Our Return Policy,You cannot return any items. All sales are final.
My hammer doesn't work,Returns,Our Return Policy,You cannot return any items. All sales are final.
Can I exchange my hammer,Returns,Our Return Policy,You cannot return any items. All sales are final.
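Before uploading a file like the one above, it can help to check it locally. The following Python sketch uses the standard csv module to group the queries by intent and to flag any intent whose rows carry more than one distinct answer. The consistency check is a suggested sanity check, not a documented platform requirement, and the function names are hypothetical; the sample text is a shortened copy of the rows shown above.

```python
import csv
import io
from collections import defaultdict

SAMPLE = '''query,topIntent,conversationName,answer
What are your hours?,StoreHours,Our Store Hours,"We're open from 9-5, Mondays-Thursdays or by appointment."
When are you open?,StoreHours,Our Store Hours,"We're open from 9-5, Mondays-Thursdays or by appointment."
What do you sell?,Products,Our Products,We sell only hammers. All types.
'''


def load_answer_intents(text):
    """Group queries and answers by the topIntent column."""
    intents = defaultdict(lambda: {"queries": [], "answers": set()})
    for row in csv.DictReader(io.StringIO(text)):
        entry = intents[row["topIntent"]]
        entry["queries"].append(row["query"])
        entry["answers"].add(row["answer"])
    return intents


def inconsistent_intents(intents):
    """Return the names of intents whose rows do not share a single answer."""
    return [name for name, entry in intents.items() if len(entry["answers"]) != 1]


intents = load_answer_intents(SAMPLE)
print(sorted(intents))                       # ['Products', 'StoreHours']
print(len(intents["StoreHours"]["queries"])) # 2
print(inconsistent_intents(intents))         # []
```

The quoted answer field shows why a CSV parser is preferable to splitting on commas: the store hours answer itself contains a comma.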
DO's and DON'Ts for Conversational Design
Creating a robust set of intents for a successful skill requires a lot of attention. Here are some best practices to keep in mind.
Intent Design and Training
DO | DON'T |
---|---|
DO plan to add utterances until you get results you expect. Generally speaking, models perform well as you add more quality training utterances. The number of utterances you need depends on the model, the training data, and the level of accuracy that is realistic for your model. | DON'T over-train individual intents. Don’t add excessive training data to some intents to make them work "perfectly". If intent resolution is not behaving as expected, evaluate your intent structure for overlap between intents. Intent resolution will NEVER be 100% accurate. |
DO use real world data. Using the actual language that your skill is most likely to encounter is critical. Fabricated utterances can only take you so far and will not prepare your skill for real-world engagement. | DON'T use just keywords in training data. While it is acceptable to use single words/short phrases for training, the training data should have the same structure as the user’s inputs. The fewer the words in utterances, the less successful classification will be. |
DO use whole sentences to train intents. While it’s OK to use short training utterances, be sure to match the conversational style of your users as closely as possible. | DON'T inadvertently skew intents. Be careful of words which add no specific meaning (e.g. "please" and "thanks") or entity values within utterances as they can inadvertently skew intent resolution if they are heavily used in one intent but not in another. |
DO use similar numbers of utterances per intent. Some intents (e.g., "hello", "goodbye") may have fewer utterances in their training sets. However, ensure that your main intents have a similar number of utterances to avoid biasing your model. | DON’T rely ONLY on intent resolution. Use entities to disambiguate common intents. If there’s linguistic overlap between intents, consider using entities to disambiguate the user’s intentions (and corresponding unique conversational path). |
DO handle small talk. Users will make requests that are not relevant to the skill's purpose, such as for jokes and weather reports. They may also do things like ask if the skill is human. Ensure that you have a small talk strategy and aggressively test how the skill responds at all steps of your conversational flow. | DON’T overuse unresolvedIntent. Create “out-of-scope" intents for the things you know you don't know (that you may or may not enable the skill to do later). |
DO consider multiple intents for a single use case. Customers may express the same need in multiple ways, e.g. in terms of the solution they desire OR the symptom of their problem. Use multiple intents that all resolve to the same "answer". | DON’T ignore abusive interactions. Similar to small talk, have a plan for abuse. This plan may need to include measures to ensure any abusive input from the user is not reflected back by the skill, as well as provisions for immediate escalation. |
Conversational User Experience
DO | DON'T |
---|---|
DO give indications of most likely responses (including help and exit). For example, "Hey, I'm Bob the Bot. Ask me about X, Y, or Z. If you run into any problems, just type 'help'." | DON'T delay conversational design until "later in the project". For all but the simplest skills, conversational design must be given the same priority and urgency as other development work. It should start early and proceed in parallel with other tasks. |
DO consider a personality for your bot. You should consider the personality and tone of your bot. However, be careful of overdoing human-like interaction (humor and sympathy often don't resonate well from a bot) and never try to fool your users into thinking that they are interacting with a human. | DON'T say that the skill "is still learning". While well-intended, this bad practice signals to the user (consciously or subconsciously) that the skill is not up to the task. |
DO guide the user on what is expected from them. The skill should try to guide the user toward an appropriate response and not leave questions open ended. Open-ended questions make the user more likely to fall off the happy path. | DON'T use "cute" or "filler" responses. See "DO guide the user on what is expected from them". |
DO break up long responses into individual chat bubbles and/or use line breaks. Large blobs of text without visual breaks are hard to read and can lead to confusion. | DON'T say "I’m sorry, I don’t understand. Would you please rephrase your question?" This lazy error-handling approach is, more often than not, inaccurate. No matter how many times a user rephrases an out-of-scope question, the skill will NEVER have anything intelligent to say. |
-- | DON'T overuse "confirmation" phrases. Confirmation phrases have their place. However, don’t overuse them. Consider dialog flows that are able to take confidence levels into account before asking users to confirm. |
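One way to picture a dialog flow that takes confidence levels into account is sketched below. This is a minimal illustration, not the product's API: the function name `next_step` and both threshold values are hypothetical and would need tuning against your own skill's test results.

```python
def next_step(confidence, act_threshold=0.8, confirm_threshold=0.5):
    """Pick a dialog action from an intent confidence score (0.0 to 1.0).

    Both thresholds are illustrative assumptions, not product defaults.
    """
    if confidence >= act_threshold:
        # Confident enough: act without asking the user to confirm.
        return "proceed"
    if confidence >= confirm_threshold:
        # Plausible match: ask the user to confirm before acting.
        return "confirm"
    # Too uncertain: guide the user toward supported options or escalate.
    return "fallback"


for score in (0.92, 0.65, 0.30):
    print(score, next_step(score))
```

With this shape, confirmation prompts appear only in the middle band of scores, so users are not asked to confirm requests the skill already resolved with high confidence.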
Test Strategies
DO | DON'T |
---|---|
DO develop utterances cyclically. Developing a robust training corpus requires multiple iterations and testing cycles, along with ongoing monitoring and tuning. Use a cyclical "build, test, deploy, monitor, update" approach. | DON'T neglect the need for a performance measurement and improvement plan. Lacking a plan for measuring and improving your skill, you'll have no way of knowing whether it’s really working. |
DO test utterances using the 80/20 rule. Always test the robustness of your intents against one another by conducting multiple 80/20 tests, where 80% of newly harvested utterances are used to train the model and 20% are added to your testing data. | DON'T test only the happy path. "Getting it working" is 20% of the work. The remaining 80% is testing and adjusting how the skill responds to incorrect input and user actions. |
DO test skill failure. Aggressively try to break your skill to see what happens. Don’t rely solely on positive testing. | DON'T ignore the processing of out-of-order messages. Users will scroll back in the conversation history and click on past buttons. Testing how the skill handles these actions needs to be part of your 80% work (as noted in DON'T test only the happy path). |
-- | DON’T forget to retest as you update your intents. If you add more training data (e.g., as your bot gets more real-world usage) and/or you add new intents for new use cases, don’t forget to retest your model. |
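The 80/20 tests described in this table can be sketched as a repeated random split of newly harvested utterances. This is only an illustration under stated assumptions: `split_utterances` is a hypothetical helper, the sample utterances are placeholders, and the actual training and testing would happen in your skill's own tooling.

```python
import random


def split_utterances(utterances, train_fraction=0.8, seed=None):
    """Shuffle a copy of the utterances and return (train, test) lists."""
    shuffled = list(utterances)
    # A dedicated Random instance keeps each split reproducible per seed.
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder data standing in for newly harvested utterances.
harvested = [f"sample utterance {i}" for i in range(10)]

# Run several 80/20 rounds, each with a different shuffle.
for round_number in range(3):
    train, test = split_utterances(harvested, seed=round_number)
    print(round_number, len(train), len(test))
```

Using a different seed for each round gives a different partition every time, which is what lets multiple 80/20 tests expose intents that only resolve well for particular training subsets.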
Project Considerations
DO | DON'T |
---|---|
DO select use cases that are enhanced by conversational UI (CUI). Enabling conversational UI (via skills and digital assistants) is work. Make sure that the use case will be truly enhanced by adding CUI. | DON'T fail to have an escalation path. Even if you don’t plan on allowing escalation to a human, you must have a strategy for those interactions where the skill can’t help. |
DO anticipate the first day being the worst day. Even the best-tested skills and digital assistants require tuning on day 1. | DON'T disband the project team immediately after launch. When scheduling your skill project, ensure that you keep the skill’s creators (Conversational Designer, Project Manager, Tech Lead, etc.) on the project long enough for adequate tuning and, ultimately, knowledge transfer. |
Names You Can't Use for Intents
Intent names cannot start with system. (that is, "system" followed by a period).
The Automated Agent Assistant (which is a digital assistant template available in the Skill Store) has several such intents, but they are treated as a special case and should not be used elsewhere.
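If you generate or import intent names in bulk, a pre-check along the following lines can catch reserved names before you create the intents. This is a minimal sketch: `is_allowed_intent_name` is a hypothetical helper, and it assumes a literal, case-sensitive match on the prefix, which the rule above does not specify.

```python
RESERVED_PREFIX = "system."  # "system" followed by a period


def is_allowed_intent_name(name):
    """Return False for names that start with the reserved prefix.

    Assumption: the comparison is case-sensitive.
    """
    return not name.startswith(RESERVED_PREFIX)


for candidate in ("OrderPizza", "system.Cancel", "systemStatus"):
    print(candidate, is_allowed_intent_name(candidate))
```

Note that a name such as systemStatus passes this check, because only the prefix that includes the period is reserved.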