Use Default Networking

Create a model deployment with the default networking option.

The workload is attached through a secondary VNIC to a preconfigured, service-managed VCN and subnet. The provided subnet gives access to other Oracle Cloud services through a service gateway, but not to the public internet.

If you need access only to OCI services, we recommend using this option. It doesn't require you to create networking resources or write policies for networking permissions.

You can create and run default networking model deployments using the Console, the OCI Python SDK, the OCI CLI, or the Data Science API.

    1. Use the Console to sign in to a tenancy with the necessary policies.
    2. Open the navigation menu and select Analytics & AI. Under Machine Learning, select Data Science.
    3. Select the compartment that contains the project that you want to create the model deployment in.

      All projects in the compartment are listed.

    4. Select the name of the project.

      The project details page opens and lists the notebook sessions.

    5. Under Resources, select Model deployments.

      A tabular list of model deployments in the project is displayed.

    6. Select Create model deployment.
    7. (Optional) Enter a unique name for the model deployment (limit of 255 characters). If you don't provide a name, one is generated automatically.

      For example, modeldeployment20200108222435.

    8. (Optional) Enter a description (limit of 400 characters) for the model deployment.
    9. (Optional) Under Default configuration, enter a custom environment variable key and corresponding value. Select + Additional custom environment key to add more environment variables.
    10. In the Models section, select Select to choose an active model from the model catalog to deploy.
      1. Find a model by using the default compartment and project, or by selecting Using OCID and searching for the model by entering its OCID.
      2. Select the model.
      3. Select Submit.
      Important

      Model artifacts that exceed 400 GB aren't supported for deployment. Select a smaller model artifact for deployment.
    11. (Optional) Change the Compute shape by selecting Change shape. Then, follow these steps in the Select compute panel.
      1. Select an instance type.
      2. Select a shape series.
      3. Select one of the supported Compute shapes in the series.
      4. Select the shape that best suits how you want to use the resource. For the AMD shape, you can use the default or set the number of OCPUs and memory.

        You can select up to 64 GB of memory for each OCPU, to a maximum total of 512 GB. The minimum amount of memory allowed is either 1 GB or a value matching the number of OCPUs, whichever is greater.

      5. Select Select shape.
    12. Enter the number of instances for the model deployment to replicate the model on.
    13. Select Default networking to configure the network type.
    14. Select one of the following options to configure the endpoint type:
      • Public endpoint: Data access in a managed instance from outside a VCN.
      • Private endpoint: The private endpoint that you want to use for the model deployment.
      If you selected Private endpoint, select the private endpoint to use from Private Endpoint in Data Science.

      Select Change compartment to select the compartment containing the private endpoint.

    15. (Optional) If you configured access or predict logging, in the Logging section, select Select and then follow these steps:
      1. For access logs, select a compartment, log group, and log name.
      2. For predict logs, select a compartment, log group, and log name.
      3. Select Submit.
    16. (Optional) Select Show Advanced Options to set the serving mode, load balancing bandwidth, custom container, and tags.
      1. (Optional) Select the serving mode for the model deployment, either as an HTTPS endpoint or using a Streaming service stream.
      2. (Optional) Select the load balancing bandwidth in Mbps or use the 10 Mbps default.

        Tips for load balancing:

        If you know the common payload size and the frequency of requests per second, you can use the following formula to estimate the bandwidth of the load balancer that you need. We recommend that you add an extra 20% to account for estimation errors and sporadic peak traffic.

        (Payload size in KB) * (Estimated requests per second) * 8 / 1024

        For example, if the payload is 1,024 KB and you estimate 120 requests per second, then the recommended load balancer bandwidth would be (1024 * 120 * 8 / 1024) * 1.2 = 1152 Mbps.

        Remember that the maximum supported payload size is 10 MB when dealing with image payloads.

        If the request payload size exceeds the bandwidth allocated to the load balancer, the request is rejected with a 429 status code.

      3. (Optional) Select Use a custom container image and enter the following:
        • Repository in <tenancy>: The repository that contains the custom image.

        • Image: The custom image to use in the model deployment at runtime.

        • CMD: More commands to run when the container starts. Add one instruction per text box. For example, if CMD is ["--host", "0.0.0.0"], enter --host in one text box and 0.0.0.0 in another. Don't use quotation marks at the end.

        • Entrypoint: One or more entry point files to run when the container starts. For example, /opt/script/entrypoint.sh. Don't use quotation marks at the end.

        • Server port: The port that the web server serving the inference is running on. The default is 8080. The port can be any value from 1024 through 65535. Don't use ports 24224, 8446, or 8447.

        • Health check port: The port that the container HEALTHCHECK listens on. Defaults to the server port. The port can be any value from 1024 through 65535. Don't use ports 24224, 8446, or 8447.

      4. (Optional) Select the Tags tab, and then enter the tag namespace (for a defined tag), key, and value to assign tags to the resource.

        To add more than one tag, select Add tag.

        Tagging describes the various tags that you can use to organize and find resources, including cost-tracking tags.

    17. Select Create.
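The numeric rules quoted in the Console steps (the flexible-shape memory limits and the load balancer bandwidth formula) can be checked before you fill in the form. The following is a minimal Python sketch under those stated rules only; the function names flex_memory_range_gb and recommended_bandwidth_mbps are hypothetical helpers and aren't part of any OCI SDK.

```python
from typing import Tuple


def flex_memory_range_gb(ocpus: int) -> Tuple[int, int]:
    """Return the (minimum, maximum) memory in GB for a flexible shape.

    Restates the rule from the shape step: up to 64 GB per OCPU,
    at most 512 GB in total, and a minimum of 1 GB or the OCPU count,
    whichever is greater.
    """
    if ocpus < 1:
        raise ValueError("ocpus must be at least 1")
    minimum = max(1, ocpus)
    maximum = min(64 * ocpus, 512)
    return minimum, maximum


def recommended_bandwidth_mbps(payload_kb: float,
                               requests_per_second: float,
                               headroom: float = 0.20) -> float:
    """Estimate load balancer bandwidth in Mbps, padded by `headroom`.

    Base formula: (payload size in KB) * (requests per second) * 8 / 1024.
    The default headroom is the extra 20% recommended for estimation
    errors and sporadic peak traffic.
    """
    base_mbps = payload_kb * requests_per_second * 8 / 1024
    return base_mbps * (1 + headroom)


if __name__ == "__main__":
    # Values taken from the worked example in the load balancing tips.
    print(flex_memory_range_gb(1))
    print(recommended_bandwidth_mbps(1024, 120))
```

With the values from the tips (a 1,024 KB payload at 120 requests per second), the second helper reproduces the 1152 Mbps estimate, up to floating-point rounding.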
  • You can use the OCI CLI to create a model deployment as in this example.

    1. Deploy the model with:
      oci data-science model-deployment create \
      --compartment-id <MODEL_DEPLOYMENT_COMPARTMENT_OCID> \
      --model-deployment-configuration-details file://<MODEL_DEPLOYMENT_CONFIGURATION_FILE> \
      --project-id <PROJECT_OCID> \
      --category-log-details file://<OPTIONAL_LOGGING_CONFIGURATION_FILE> \
      --display-name <MODEL_DEPLOYMENT_NAME>
    2. Use this model deployment JSON configuration file:
      {
        "deploymentType": "SINGLE_MODEL",
        "modelConfigurationDetails": {
          "bandwidthMbps": <YOUR_BANDWIDTH_SELECTION>,
          "instanceConfiguration": {
            "instanceShapeName": "<YOUR_VM_SHAPE>"
          },
          "modelId": "<YOUR_MODEL_OCID>",
          "scalingPolicy": {
            "instanceCount": <YOUR_INSTANCE_COUNT>,
            "policyType": "FIXED_SIZE"
          }
        }
      }

      If you specify an environment configuration, you must include the environmentConfigurationDetails object, as in this example:

      
      {
        "modelDeploymentConfigurationDetails": {
          "deploymentType": "SINGLE_MODEL",
          "modelConfigurationDetails": {
            "modelId": "ocid1.datasciencemodel.oc1.iad........",
            "instanceConfiguration": {
              "instanceShapeName": "VM.Standard.E4.Flex",
              "modelDeploymentInstanceShapeConfigDetails": {
                "ocpus": 1,
                "memoryInGBs": 16
              }
            },
            "scalingPolicy": {
              "policyType": "FIXED_SIZE",
              "instanceCount": 1
            },
            "bandwidthMbps": 10
          },
          "environmentConfigurationDetails" : {
            "environmentConfigurationType": "OCIR_CONTAINER",
            "image": "iad.ocir.io/testtenancy/image_name:1.0.0",
            "entrypoint": [
              "python",
              "/opt/entrypoint.py"
            ],
            "serverPort": "5000",
            "healthCheckPort": "5000"
          },
          "streamConfigurationDetails": {
            "inputStreamIds": null,
            "outputStreamIds": null
          }
        }
      }
    3. (Optional) Use this logging JSON configuration file:
      {
        "access": {
          "logGroupId": "<YOUR_LOG_GROUP_OCID>",
          "logId": "<YOUR_LOG_OCID>"
        },
        "predict": {
          "logGroupId": "<YOUR_LOG_GROUP_OCID>",
          "logId": "<YOUR_LOG_OCID>"
        }
      }
    4. (Optional) To deploy a custom container, run the same command with a configuration file that includes the environmentConfigurationDetails object:
      oci data-science model-deployment create \
      --compartment-id <MODEL_DEPLOYMENT_COMPARTMENT_OCID> \
      --model-deployment-configuration-details file://<MODEL_DEPLOYMENT_CONFIGURATION_FILE> \
      --project-id <PROJECT_OCID> \
      --category-log-details file://<OPTIONAL_LOGGING_CONFIGURATION_FILE> \
      --display-name <MODEL_DEPLOYMENT_NAME>
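The JSON configuration files shown in the CLI steps can also be generated rather than edited by hand. The sketch below uses only the Python standard library to assemble the same structure. The helper names, defaults, and sample values are illustrative assumptions, and the port check simply restates the Server port rule from the Console steps. Ports are written as integers here, matching the Python SDK example, although the JSON sample in the CLI steps quotes them as strings.

```python
import json
from typing import Optional

# Ports the Console steps say not to use.
RESERVED_PORTS = {24224, 8446, 8447}


def check_port(port: int) -> int:
    """Return the port if it satisfies the documented rule, else raise."""
    if not 1024 <= port <= 65535:
        raise ValueError(f"port {port} is outside 1024-65535")
    if port in RESERVED_PORTS:
        raise ValueError(f"port {port} is reserved")
    return port


def build_deployment_config(model_ocid: str,
                            shape: str,
                            instance_count: int = 1,
                            bandwidth_mbps: int = 10,
                            image: Optional[str] = None,
                            server_port: int = 8080) -> dict:
    """Assemble a single-model deployment configuration as a dict."""
    config = {
        "deploymentType": "SINGLE_MODEL",
        "modelConfigurationDetails": {
            "modelId": model_ocid,
            "instanceConfiguration": {"instanceShapeName": shape},
            "scalingPolicy": {
                "policyType": "FIXED_SIZE",
                "instanceCount": instance_count,
            },
            "bandwidthMbps": bandwidth_mbps,
        },
    }
    # Only custom-container deployments need the environment block.
    if image is not None:
        port = check_port(server_port)
        config["environmentConfigurationDetails"] = {
            "environmentConfigurationType": "OCIR_CONTAINER",
            "image": image,
            "serverPort": port,
            "healthCheckPort": port,  # defaults to the server port
        }
    return config


if __name__ == "__main__":
    # Placeholders, as in the CLI steps; replace them before use.
    config = build_deployment_config("<YOUR_MODEL_OCID>", "<YOUR_VM_SHAPE>")
    print(json.dumps(config, indent=2))
```

Saving the printed JSON to a file gives you something to pass to --model-deployment-configuration-details with the file:// prefix.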
  • Use the CreateModelDeployment operation to create a model deployment.

Using the OCI Python SDK

We've developed an OCI Python SDK model deployment example that includes authentication.

Important

Artifacts that exceed 400 GB aren't supported for deployment. Select a smaller model artifact for deployment.

Note

You must upgrade the OCI SDK to version 2.33.0 or later before creating a deployment with the Python SDK. Use the following command:

pip install --upgrade oci

Use this example to create a model deployment that uses a custom container:

# Import the model classes used below from the OCI Python SDK.
from oci.data_science.models import (
    CreateModelDeploymentDetails,
    FixedSizeScalingPolicy,
    InstanceConfiguration,
    ModelConfigurationDetails,
    OcirModelDeploymentEnvironmentConfigurationDetails,
    SingleModelDeploymentConfigurationDetails,
)

# Replace every "<...>" placeholder with a value from your tenancy.
# The bandwidth and instance count are example values only.

# Create a model configuration details object.
model_config_details = ModelConfigurationDetails(
    model_id="<model-id>",
    bandwidth_mbps=10,
    instance_configuration=InstanceConfiguration(
        instance_shape_name="<vm-shape>"
    ),
    scaling_policy=FixedSizeScalingPolicy(
        policy_type="FIXED_SIZE",
        instance_count=1
    )
)

# Create the container environment configuration.
environment_config_details = OcirModelDeploymentEnvironmentConfigurationDetails(
    environment_configuration_type="OCIR_CONTAINER",
    environment_variables={'key1': 'value1', 'key2': 'value2'},
    image="iad.ocir.io/testtenancy/ml_flask_app_demo:1.0.0",
    image_digest="sha256:243590ea099af4019b6afc104b8a70b9552f0b001b37d0442f8b5a399244681c",
    entrypoint=[
        "python",
        "/opt/ds/model/deployed_model/api.py"
    ],
    server_port=5000,
    health_check_port=5000
)

# Create a single-model deployment configuration.
single_model_deployment_config_details = SingleModelDeploymentConfigurationDetails(
    deployment_type="SINGLE_MODEL",
    model_configuration_details=model_config_details,
    environment_configuration_details=environment_config_details
)

# Set up the parameters required to create a new model deployment.
create_model_deployment_details = CreateModelDeploymentDetails(
    display_name="<deployment-name>",
    model_deployment_configuration_details=single_model_deployment_config_details,
    compartment_id="<compartment-id>",
    project_id="<project-id>"
)
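
The example above stops after building the request object. As a hedged sketch of the remaining step, the helper below passes that object to a Data Science client and returns the OCID of the new deployment. The name submit_model_deployment is hypothetical. The commented usage assumes a default ~/.oci/config profile, and authentication in your environment might differ (for example, resource principals in a notebook session).

```python
def submit_model_deployment(client, details):
    """Send a create request and return the new model deployment's OCID.

    `client` is expected to behave like oci.data_science.DataScienceClient,
    and `details` like the CreateModelDeploymentDetails object built above.
    """
    response = client.create_model_deployment(details)
    deployment = response.data
    return deployment.id


# Usage sketch (requires the oci package and a valid configuration file):
#
#   import oci
#   config = oci.config.from_file()  # reads ~/.oci/config, DEFAULT profile
#   client = oci.data_science.DataScienceClient(config)
#   ocid = submit_model_deployment(client, create_model_deployment_details)
```

Because the helper only depends on the client's create_model_deployment method, you can exercise it with a stub client before pointing it at a real tenancy. Creation is asynchronous, so the deployment typically spends some time in a creating state before it becomes active.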

Notebook Examples

We have provided various notebook examples that show you how to train, prepare, save, deploy, and invoke model deployments.