Choosing a Fine-Tuning Method in Generative AI
When you create a custom model, OCI Generative AI fine-tunes the pretrained base models using a method that matches the base model.
Important
Some OCI Generative AI foundational pretrained base models supported for the dedicated serving mode are now deprecated and will retire no sooner than six months after the release of the first replacement model. You can host a base model, or fine-tune a base model and host the fine-tuned model on a dedicated AI cluster (dedicated serving mode), until the base model is retired. For dedicated serving mode retirement dates, see Retiring the Models.
The following table lists the method that Generative AI uses to train each type of base model:
Pretrained Base Model | Training Method
---|---
cohere.command-r-08-2024 | T-Few and LoRA
cohere.command-r-16k | T-Few and LoRA
meta.llama-3.1-70b-instruct | LoRA
cohere.command (deprecated) | T-Few and Vanilla
cohere.command-light (deprecated) | T-Few and Vanilla
meta.llama-3-70b-instruct (deprecated) | LoRA
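As a quick reference, the base-model-to-method mapping above can be expressed as a small lookup. This is an illustrative sketch (plain Python, not an OCI SDK call), and the method lists reflect the documented pairings rather than anything queried from the service:

```python
# Illustrative lookup: training methods OCI Generative AI uses per base model.
# This mirrors the documentation table; it is not an OCI SDK structure.
TRAINING_METHODS = {
    "cohere.command-r-08-2024": ["T-Few", "LoRA"],
    "cohere.command-r-16k": ["T-Few", "LoRA"],
    "meta.llama-3.1-70b-instruct": ["LoRA"],
    "cohere.command": ["T-Few", "Vanilla"],        # deprecated
    "cohere.command-light": ["T-Few", "Vanilla"],  # deprecated
    "meta.llama-3-70b-instruct": ["LoRA"],         # deprecated
}

def methods_for(base_model: str) -> list[str]:
    """Return the training methods available for a given base model."""
    try:
        return TRAINING_METHODS[base_model]
    except KeyError:
        raise ValueError(f"No fine-tuning support listed for {base_model!r}")
```

For example, `methods_for("cohere.command")` returns both T-Few and Vanilla, which is why the next section explains how to choose between them.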
Note
For information about the hyperparameters used for each training method, see Hyperparameters for Fine-Tuning a Model in Generative AI.
Choosing Between T-Few and Vanilla
For the cohere.command and cohere.command-light models, OCI Generative AI offers two training methods: T-Few and Vanilla. Use the following guidelines to choose the best training method for your use cases.
Feature | Options and Recommendations
---|---
Training methods for cohere.command and cohere.command-light | T-Few and Vanilla
Dataset Size | Use the Vanilla method for large datasets (hundreds of thousands of samples or more) and the T-Few method for smaller datasets (a few thousand samples). Using small datasets for the Vanilla method might cause overfitting, where the model performs well on the training data but can't generalize to unseen data.
Complexity | Use the Vanilla method when the model must learn new or complex semantic understanding. Use the T-Few method for lighter adaptations, such as improving how the model follows a format or instruction style.
Hosting | Several T-Few fine-tuned models that share the same base model can serve from the same hosting dedicated AI cluster (stacked serving), which reduces hosting cost. Choose T-Few if you plan to host many fine-tuned variants of the same base model.
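The guidelines above can be sketched as a simple decision helper. This is an illustration of the recommendation logic only: the `small_dataset_threshold` cutoff and the parameter names are assumptions for the sketch, not OCI parameters or documented values:

```python
def choose_training_method(
    num_samples: int,
    stacked_serving: bool = False,
    small_dataset_threshold: int = 10_000,  # illustrative cutoff, not an OCI value
) -> str:
    """Suggest T-Few or Vanilla for cohere.command(-light) fine-tuning.

    Follows the guidelines: T-Few when hosting several fine-tuned models on
    one cluster or when the dataset is small (Vanilla risks overfitting on
    small datasets); Vanilla for large datasets.
    """
    if stacked_serving or num_samples < small_dataset_threshold:
        return "T-Few"
    return "Vanilla"
```

For example, a few thousand training samples would suggest T-Few, while a corpus of hundreds of thousands of samples (with each model on its own hosting cluster) would suggest Vanilla.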