Model Deployment
Follow these steps to deploy models with AI Quick Actions.
Model Deployment Creation
You can create a Model Deployment from foundation models that carry the Ready to Deploy tag in the Model Explorer, or from fine-tuned models. When you create a Model Deployment in AI Quick Actions, you're creating an OCI Data Science Model Deployment, which is a managed resource in the OCI Data Science service. The model is deployed as an HTTP endpoint in OCI.
You need the necessary policies to use Data Science Model Deployment. When you create the deployment, you select the compute shape for it, and you can set up logging to monitor it. Logging is optional, but it's highly recommended for troubleshooting errors with your Model Deployment. You also need a policy to enable logging; see Model Deployment Logs for more information on logs. Under advanced options, you can select the number of instances to deploy and the Load Balancer bandwidth.
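As an illustration, a deployment with logging and advanced options might be created from Python along these lines. This is a minimal sketch: the AquaDeploymentApp interface and the parameter names shown here are assumptions based on the AI Quick Actions CLI flags, and all OCIDs are placeholders for your own resources. See AI Quick Actions CLI for the authoritative parameters and values.

from ads.aqua.modeldeployment import AquaDeploymentApp  # assumed ADS module path

deployment = AquaDeploymentApp().create(
    model_id="ocid1.datasciencemodel.oc1..<model_ocid>",  # Ready to Deploy or fine-tuned model
    display_name="my-llm-deployment",
    instance_shape="VM.GPU.A10.2",  # compute shape for the deployment
    instance_count=1,               # advanced option: number of instances
    bandwidth_mbps=10,              # advanced option: Load Balancer bandwidth
    log_group_id="ocid1.loggroup.oc1..<log_group_ocid>",  # logging is optional but recommended
    access_log_id="ocid1.log.oc1..<access_log_ocid>",
    predict_log_id="ocid1.log.oc1..<predict_log_ocid>",
)
print(deployment)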
See Model Deployment on GitHub for more information and tips on deploying models.
For a complete list of parameters and values for AI Quick Actions CLI commands, see AI Quick Actions CLI.
This task can't be performed using the API.
Invoke Model Deployment in AI Quick Actions
You can invoke a model deployment in AI Quick Actions from the CLI or the Python SDK.
For more information, see the section on model deployment tips in GitHub.
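For example, from Python you can call the deployment's /predict endpoint directly, signing the request with your OCI credentials. This is a minimal sketch: the region and OCID in the endpoint URL are placeholders, and the "odsc-llm" model name and payload fields assume a text-generation container. Check the model deployment tips in GitHub for the payload format your deployment expects.

import requests
import ads
from ads.common.auth import default_signer

ads.set_auth("resource_principal")  # or "api_key", depending on your environment
auth = default_signer()["signer"]   # OCI request signer usable as a requests auth handler

# Placeholder endpoint: replace the region and the model deployment OCID with your own.
endpoint = (
    "https://modeldeployment.us-ashburn-1.oci.customer-oci.com/"
    "ocid1.datasciencemodeldeployment.oc1..<md_ocid>/predict"
)

payload = {
    "model": "odsc-llm",  # assumed server-side model name for AI Quick Actions deployments
    "prompt": "What is machine learning?",
    "max_tokens": 200,
    "temperature": 0.7,
}

response = requests.post(endpoint, json=payload, auth=auth)
print(response.json())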
Model Artifacts
When a model is downloaded into a Model Deployment instance, it's downloaded to the /opt/ds/model/deployed_model/<object_storage_folder_name_and_path> folder. Artifacts such as the chat template can be found inside this folder.
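For example, to locate the chat template or other files that shipped with the model, you can list the contents of that folder from inside the instance. This is a minimal sketch; the exact subfolder layout depends on the Object Storage path of your model.

import os

ARTIFACT_DIR = "/opt/ds/model/deployed_model"

# Walk the download folder to find model artifacts such as a chat template.
for root, _dirs, files in os.walk(ARTIFACT_DIR):
    for name in files:
        print(os.path.join(root, name))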