Bring Your Own Models

If you have models you want to use instead of the service cached models provided by Data Science, you can bring them into AI Quick Actions from Object Storage or from Hugging Face by registering the model.

Hugging Face is an open source model repository, and you can bring models from it into AI Quick Actions. Some Hugging Face models are gated and require you to accept a user agreement. To bring a gated model into AI Quick Actions, sign in to Hugging Face from a terminal inside the notebook, using the Hugging Face CLI and your Hugging Face token. Signing in verifies your access to the model. For sign-in steps, see the Hugging Face guides on the Hugging Face CLI. If you don't have a Hugging Face token, see the Hugging Face article on security tokens to generate one. Registration fails if you try to register a gated model that you haven't been granted access to in Hugging Face, or if you haven't signed in with the Hugging Face CLI.
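The sign-in step can also be scripted from the notebook. The following is a minimal sketch, not an official procedure: it assumes the `huggingface-cli` command is installed in the notebook environment and that the token is stored in an environment variable named `HF_TOKEN`. The function names are hypothetical, and newer releases of the Hugging Face CLI may use a different command name.

```python
import os
import subprocess


def build_login_command(token: str) -> list[str]:
    """Return the argument list for a non-interactive Hugging Face CLI sign-in."""
    if not token:
        raise ValueError("a Hugging Face token is required")
    # Assumption: the huggingface-cli executable is on PATH in the notebook.
    return ["huggingface-cli", "login", "--token", token]


def login_from_env(env_var: str = "HF_TOKEN") -> None:
    """Sign in using the token stored in env_var (an assumed variable name)."""
    token = os.environ.get(env_var, "")
    # check=True raises CalledProcessError if the CLI reports a failed sign-in.
    subprocess.run(build_login_command(token), check=True)
```

Calling `login_from_env()` before registering a gated model runs the sign-in once. Reading the token from an environment variable keeps it out of the notebook's saved cells.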

For more information, see a video on importing a Hugging Face model to AI Quick Actions.

There are two ways to register a model:
  • Register a service verified model.
  • Register any model.
A service verified model is one for which the Data Science service has tested the deployment and fine-tuning configurations.
Note

The difference between a service cached model and a verified model is that, for a verified model, you must register the model in AI Quick Actions before using it.

Service Managed Inference Containers

Four inference containers are available to use with Bring Your Own Model.

For cached and verified models, Data Science has already tested which inference container works best with each model, so you can't choose the inference container. For unverified models, you must decide which inference container is most suitable for each model. The following service managed inference containers are available:
  • for models compatible with inference engine vLLM 0.7.1
  • for models compatible with TGI 2.0.1
  • for models compatible with inference framework llama.cpp 0.3.2 (for models in GGUF format)
  • for models compatible with inference framework Text Embeddings Inference (TEI)
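For an unverified model, the container choice can be expressed as a small decision rule. The sketch below is illustrative only: the function name, the container labels, and the `is_embedding_model` flag are hypothetical, and defaulting to vLLM over TGI for other models is an assumption, since this page doesn't rank the two.

```python
# Hypothetical labels; the names shown in AI Quick Actions may differ.
VLLM = "vLLM"
TGI = "TGI"
LLAMA_CPP = "llama.cpp"
TEI = "Text Embeddings Inference"


def suggest_container(model_format: str, is_embedding_model: bool = False) -> str:
    """Suggest an inference container family for an unverified model."""
    fmt = model_format.strip().lower()
    if fmt == "gguf":
        # GGUF models are served by the llama.cpp container.
        return LLAMA_CPP
    if is_embedding_model:
        # Embedding models map to the TEI container.
        return TEI
    # Assumption: default to vLLM; TGI is the alternative for compatible models.
    return VLLM
```

For example, `suggest_container("GGUF")` returns `"llama.cpp"`. Always confirm the suggestion against the model's own documentation before deploying.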

Register Service Verified Models

Data Science provides tested models that you can select to use.

Register Any Model

Follow these steps to use models that haven't been tested by Data Science.