Evaluations
Evaluating model performance with AI Quick Actions
With deployed models, you can create a model evaluation to measure model performance. You can select a dataset from Object Storage or upload one from the storage of the notebook session you're working in. To upload datasets from your notebook, you must first set up policies that let the notebook session write files to Object Storage.

You can label the model evaluation with an experiment name, either by selecting an existing experiment or by creating a new one. BERTScore, BLEU Score, Perplexity Score, Text Readability, and ROUGE are the evaluation metrics available for measuring model performance. You can save the model evaluation result in Object Storage.

You can set the model evaluation parameters. Under advanced options, you can select the compute instance shape for the evaluation and optionally enter the Stop sequence. You can also set up logging with the model evaluation to monitor it. Logging is optional, but we recommend it to help troubleshoot errors with the evaluation. You need the necessary policy to enable logging. For more information on logging, see the Logs section.

Review the configurations and parameters of your evaluation before creating it.
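If you want to stage an evaluation dataset in Object Storage yourself before creating the evaluation, a minimal sketch from the notebook session using the OCI CLI might look like the following. The bucket, file, and object names are placeholders, and the notebook session's dynamic group needs a policy (for example, allow dynamic-group <your-dynamic-group> to manage objects in compartment <your-compartment>) that permits the write.

# Illustrative only: bucket, file, and object names are placeholders.
# Uses the notebook session's resource principal for authentication.
oci os object put \
  --auth resource_principal \
  --bucket-name <your-evaluation-bucket> \
  --file ./eval_dataset.jsonl \
  --name eval_dataset.jsonl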
If you go back to the Evaluation tab, you see that the evaluation lifecycle state is Succeeded when the model evaluation is completed. You can view the evaluation result and download a copy of the model evaluation report to your local machine.
Evaluations can't be run on ARM-based shapes.
For a complete list of parameters and values for AI Quick Actions CLI commands, see AI Quick Actions CLI.
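As a hedged sketch of what creating an evaluation from the CLI looks like, the command below shows the general shape only; the option names and values are assumptions for illustration and can differ between versions, so verify them against AI Quick Actions CLI before running.

# Illustrative sketch only -- confirm option names in the AI Quick Actions CLI reference.
ads aqua evaluation create \
  --evaluation_source_id "ocid1.datasciencemodeldeployment.oc1..<unique_id>" \
  --evaluation_name "my_evaluation" \
  --experiment_name "my_experiment" \
  --dataset_path "oci://<bucket>@<namespace>/eval_dataset.jsonl" \
  --report_path "oci://<bucket>@<namespace>/evaluation_report/" \
  --model_parameters '{"max_tokens": 500, "temperature": 0.7}' \
  --shape_name "VM.Standard.E4.Flex"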
This task can't be performed using the API.