Under Models, the Model Explorer shows all the foundation models supported by AI Quick Actions, together with your fine-tuned models. Each model card includes a tag indicating the family of shapes the model supports. Enter text in the Search and filter models box to search the list, or select the box and choose an option by which to filter the list of models. Under My models are service cached foundation models, ready-to-register models, and models you have registered. Service cached models are models whose configurations have been verified by the Data Science team and that are ready to use without downloading model artifacts. Ready-to-register models are models whose configurations have been verified by the Data Science team and that you can bring into AI Quick Actions through the model registration process. Under Fine-tuned models are the models you have
fine-tuned.
Service Cached Models
Service cached models have been tested by the Data Science team, and their model artifacts are already downloaded to a bucket in the service's Object Storage, so they're ready to use.
The available service cached models are:
codellama-34b-instruct-hf
codellama-13b-instruct-hf
codellama-7b-instruct-hf
mistralai/Mixtral-8x7b-v0.1
mistralai/Mistral-7b-Instruct-v0.3
mixtral-8x7b-instruct-v0.1
mistral-7b-instruct-v0.2
mistral-7b-v0.1
mistral-7b-instruct-v0.1
falcon-7b
phi-2
falcon-40b-instruct
microsoft/Phi-3-vision-128k-instruct
microsoft/Phi-3-mini-128k-instruct
microsoft/Phi-3-mini-4k-instruct
microsoft/Phi-3-mini-4k-instruct-gguf-fp16
microsoft/Phi-3-mini-4k-instruct-gguf-q4
meta-llama/Meta-Llama-3.1-8B
meta-llama/Meta-Llama-3.1-8B-Instruct
meta-llama/Meta-Llama-3.1-70B
meta-llama/Meta-Llama-3.1-70B-Instruct
meta-llama/Meta-Llama-3.1-405B-Instruct-FP8
meta-llama/Meta-Llama-3.1-405B-FP8
meta-llama/Llama-3.3-70B-Instruct
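Because a deployed service cached model exposes an OpenAI-style chat completions endpoint, you can call it over signed HTTP. The following is a minimal sketch, not taken from this document: the `MODEL_DEPLOYMENT_URL` environment variable is a hypothetical placeholder for your deployment's predict endpoint, and the exact payload fields a deployment accepts can vary.

```python
import json
import os

def build_chat_payload(model: str, prompt: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-style /v1/chat/completions request body.

    The field names follow the chat-completions convention that
    vLLM-backed deployments commonly accept; supported fields may
    vary by deployment (an assumption, not confirmed by this doc).
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_chat_payload(
    "mistral-7b-instruct-v0.2",
    "Summarize OCI Object Storage in one sentence.",
)
print(json.dumps(payload, indent=2))

# MODEL_DEPLOYMENT_URL is a hypothetical environment variable holding the
# deployment's predict endpoint; OCI model deployments require signed requests.
endpoint = os.environ.get("MODEL_DEPLOYMENT_URL")
if endpoint:
    import oci       # OCI Python SDK, used here only for request signing
    import requests  # third-party HTTP client

    config = oci.config.from_file()
    signer = oci.signer.Signer(
        tenancy=config["tenancy"],
        user=config["user"],
        fingerprint=config["fingerprint"],
        private_key_file_location=config["key_file"],
    )
    resp = requests.post(endpoint, json=payload, auth=signer)
    print(resp.json())
```

Deployments of these models can also be invoked through the Console or through the ADS SDK; this snippet only illustrates the general request shape.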
Ready-to-Register Models
Ready-to-register models have been tested by the Data Science team, and you can use them in AI Quick Actions through the model registration process.
The ready-to-register models are:
core42/jais-13b-chat
core42/jais-13b
llama-3-70b-instruct
llama-3-8b-instruct
meta-llama/Llama-3.2-1B
meta-llama/Llama-3.2-1B-Instruct
meta-llama/Llama-3.2-3B
meta-llama/Llama-3.2-3B-Instruct
meta-llama/Llama-3.2-11B-Vision
meta-llama/Llama-3.2-90B-Vision
meta-llama/Llama-3.2-11B-Vision-Instruct
meta-llama/Llama-3.2-90B-Vision-Instruct
meta-llama-3-8b
meta-llama-3-70b
elyza/ELYZA-japanese-Llama-2-13b-instruct
elyza/ELYZA-japanese-Llama-2-7b-instruct
elyza/ELYZA-japanese-Llama-2-13b
elyza/ELYZA-japanese-Llama-2-7b
google/gemma-1.1-7b-it
google/gemma-2b-it
google/gemma-2b
google/gemma-7b
google/codegemma-2b
google/codegemma-1.1-7b-it
google/codegemma-1.1-2b
google/codegemma-7b
intfloat/e5-mistral-7b-instruct
Note
The meta-llama/Meta-Llama-3.1 and meta-llama/Llama-3.2 models aren't available in
EU regions.
Working with Multimodal Models
AI Quick Actions supports deployment of multimodal models. For an example of deploying
and testing a multimodal model, see the AI Quick Actions samples in the Data Science section on GitHub.
To work with image payloads and multimodal models, when creating a Model Deployment, under Advanced options, select /v1/chat/completions as the Inference Mode.
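With the /v1/chat/completions inference mode selected, image input is typically passed using the OpenAI-style multimodal message format: a content list holding a text part and an `image_url` part whose URL is a base64 data URL. The sketch below assumes the deployment accepts that convention; the model name is one of the vision models listed above, and the placeholder bytes stand in for a real image file's contents.

```python
import base64
import json

def build_image_chat_payload(model: str, question: str, image_bytes: bytes,
                             mime_type: str = "image/png") -> dict:
    """Build a multimodal chat-completions body with one image plus text.

    Follows the OpenAI-style content-parts convention that vLLM-served
    vision models commonly accept (an assumption; check your deployment).
    """
    data_url = "data:%s;base64,%s" % (
        mime_type, base64.b64encode(image_bytes).decode("ascii"))
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url", "image_url": {"url": data_url}},
            ],
        }],
        "max_tokens": 128,
    }

# In real use, read the bytes of an actual image file; a short placeholder
# byte string keeps this sketch self-contained.
payload = build_image_chat_payload(
    "microsoft/Phi-3-vision-128k-instruct",
    "What is shown in this image?",
    b"\x89PNG placeholder bytes",
)
print(json.dumps(payload)[:120])
```

The payload is then sent to the deployment endpoint as a signed POST request, the same way as a text-only chat completions request.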