Retiring the Models

OCI Generative AI retires its large language models (LLMs) based on each model's type and serving mode. The LLMs serve user requests in either an on-demand or a dedicated serving mode. Review the following sections to learn about deprecation and retirement timelines and to decide which serving mode works best for you.

Terminology

Retirement
When a model is retired, it's no longer available for use in the Generative AI service.
Deprecation
When a model is deprecated, it remains available in the Generative AI service, but it can be used only for a defined amount of time before it's retired.

About Serving Modes

On-Demand Serving Mode

The on-demand serving mode is available only for pretrained foundational models and has the following characteristics:

  • When OCI Generative AI releases a new version or family of a model, there might be an overlap period during which both the older and the newer version or family are supported, until the older version or family is retired.
  • Not all model families and versions are available in all supported OCI regions. See the key features in Pretrained Foundational Models in Generative AI for the models available in each region.
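Because an on-demand request names the pretrained model directly, a call that references a retired model name stops working once that model is removed. The following is a minimal sketch of an on-demand chat request, assuming the OCI Python SDK; the service endpoint, compartment OCID, and model name are placeholders to adapt to your own tenancy and region.

```python
# Minimal sketch: on-demand serving mode with the OCI Python SDK.
# The service endpoint, compartment OCID, and model name are placeholders.
import oci

config = oci.config.from_file()  # reads ~/.oci/config

client = oci.generative_ai_inference.GenerativeAiInferenceClient(
    config=config,
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
)

chat_details = oci.generative_ai_inference.models.ChatDetails(
    compartment_id="ocid1.compartment.oc1..exampleuniqueID",  # placeholder
    # On-demand serving mode references the pretrained model by name, so the
    # request depends on that model still being available in the service.
    serving_mode=oci.generative_ai_inference.models.OnDemandServingMode(
        model_id="cohere.command-r-plus"
    ),
    chat_request=oci.generative_ai_inference.models.CohereChatRequest(
        message="Summarize the model retirement policy in one sentence."
    ),
)

response = client.chat(chat_details)
print(response.data)  # full chat result, including the generated response
```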
Supported On-Demand Serving Mode Models

The following table shows the model retirement dates for the on-demand serving mode.

| Model | Release Date | Retirement Date | Suggested Replacement Options |
| --- | --- | --- | --- |
| meta.llama-3.1-405b-instruct | 2024-09-19 | At least one month after the release of the 1st replacement model (tentative) | |
| meta.llama-3.1-70b-instruct | 2024-09-19 | At least one month after the release of the 1st replacement model (tentative) | |
| cohere.command-r-plus | 2024-06-18 | At least one month after the release of the 1st replacement model (tentative) | |
| cohere.command-r-16k | 2024-06-04 | At least one month after the release of the 1st replacement model (tentative) | |
| meta.llama-3-70b-instruct | 2024-06-04 | 2024-10-22 | meta.llama-3.1-70b-instruct, meta.llama-3.1-405b-instruct |
| cohere.command | 2024-02-07 | 2024-10-02 | cohere.command-r-plus, cohere.command-r-16k |
| cohere.command-light | 2024-02-07 | 2024-10-02 | cohere.command-r-plus, cohere.command-r-16k |
| cohere.embed-english-v3.0 | 2024-02-07 | At least 6 months after the release of the 1st replacement model (tentative) | |
| cohere.embed-multilingual-v3.0 | 2024-02-07 | At least 6 months after the release of the 1st replacement model (tentative) | |
| meta.llama-2-70b-chat | 2024-01-22 | 2024-10-02 | meta.llama-3.1-70b-instruct, meta.llama-3.1-405b-instruct |

Dedicated Serving Mode

The dedicated serving mode is available for custom and pretrained base models and has the following characteristics:

  • Because a hosting dedicated AI cluster can host only one version of a model, if you decide to keep using the version that the cluster is already hosting rather than migrating during the overlap period, you can request long-term support for that version.
  • Existing endpoints will continue to run.
Important

If you need a dedicated serving mode model to remain available past its retirement date, create a support ticket.
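Unlike an on-demand request, a dedicated serving mode request targets an endpoint on your hosting dedicated AI cluster rather than a model name, which is why existing endpoints keep running through an overlap period. A minimal sketch, assuming the OCI Python SDK and placeholder OCIDs:

```python
# Minimal sketch: dedicated serving mode with the OCI Python SDK.
# The service endpoint, compartment OCID, and endpoint OCID are placeholders;
# the endpoint belongs to a hosting dedicated AI cluster in your tenancy.
import oci

config = oci.config.from_file()

client = oci.generative_ai_inference.GenerativeAiInferenceClient(
    config=config,
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
)

chat_details = oci.generative_ai_inference.models.ChatDetails(
    compartment_id="ocid1.compartment.oc1..exampleuniqueID",  # placeholder
    # Dedicated serving mode references your endpoint, not a model name, so
    # the request keeps working as long as the endpoint and the model version
    # it hosts remain in place.
    serving_mode=oci.generative_ai_inference.models.DedicatedServingMode(
        endpoint_id="ocid1.generativeaiendpoint.oc1..exampleuniqueID"  # placeholder
    ),
    chat_request=oci.generative_ai_inference.models.CohereChatRequest(
        message="Hello from a dedicated endpoint."
    ),
)

response = client.chat(chat_details)
print(response.data)
```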
Supported Dedicated Serving Mode Models
| Model | Release Date | Retirement Date | Suggested Replacement Options |
| --- | --- | --- | --- |
| meta.llama-3.1-405b-instruct | 2024-09-19 | At least 6 months after the release of the 1st replacement model (tentative) | |
| meta.llama-3.1-70b-instruct | 2024-09-19 | At least 6 months after the release of the 1st replacement model (tentative) | |
| cohere.command-r-plus | 2024-06-18 | At least 6 months after the release of the 1st replacement model (tentative) | |
| cohere.command-r-16k | 2024-06-04 | At least 6 months after the release of the 1st replacement model (tentative) | |
| meta.llama-3-70b-instruct | 2024-06-04 | No sooner than 2025-03-19 | meta.llama-3.1-70b-instruct, meta.llama-3.1-405b-instruct |
| cohere.command | 2024-02-07 | No sooner than 2025-01-18 | cohere.command-r-plus, cohere.command-r-16k |
| cohere.command-light | 2024-02-07 | No sooner than 2025-01-04 | cohere.command-r-plus, cohere.command-r-16k |
| cohere.embed-english-v3.0 | 2024-02-07 | At least 6 months after the release of the 1st replacement model (tentative) | |
| cohere.embed-multilingual-v3.0 | 2024-02-07 | At least 6 months after the release of the 1st replacement model (tentative) | |
| meta.llama-2-70b-chat | 2024-01-22 | No sooner than 2025-01-04 | meta.llama-3.1-70b-instruct, meta.llama-3.1-405b-instruct |

Note

Deprecation times might change in the future.
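Because these dates can change, it can help to query the service rather than rely on a static table. A minimal sketch, assuming the OCI Python SDK's Generative AI management client; the compartment OCID is a placeholder, and fields such as time_deprecated might not be populated for every model or SDK version:

```python
# Minimal sketch: list the base models visible to your tenancy and print
# lifecycle details so deprecation changes can be spotted programmatically.
# The compartment OCID is a placeholder.
import oci

config = oci.config.from_file()
client = oci.generative_ai.GenerativeAiClient(config)

models = client.list_models(
    compartment_id="ocid1.compartment.oc1..exampleuniqueID"  # placeholder
).data.items

for m in models:
    # time_deprecated may be absent or None depending on the model and SDK version.
    print(m.display_name, m.lifecycle_state, getattr(m, "time_deprecated", None))
```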
Security Vulnerabilities and Bug Fixes for Base Models

The Generative AI service strives to quickly mitigate any security vulnerabilities and apply bug fixes for the supported base models. Check the OCI release notes to learn whether you need to migrate to a different version.