Performing an In-Place Managed Node Kubernetes Upgrade by Cycling Nodes in an Existing Node Pool

Find out how to upgrade the Kubernetes version on managed nodes in a node pool by changing properties of the existing node pool, and then cycling the nodes, using Container Engine for Kubernetes (OKE).

Note

You can only cycle nodes to perform an in-place managed node Kubernetes upgrade when using enhanced clusters. See Working with Enhanced Clusters and Basic Clusters.

You cannot cycle nodes with bare metal shapes. Instead, upgrade nodes with bare metal shapes by manually replacing existing nodes or the existing node pool. See Performing an In-Place Managed Node Kubernetes Upgrade by Manually Replacing Nodes in an Existing Node Pool and Performing an Out-of-Place Managed Node Kubernetes Upgrade by Replacing an Existing Node Pool with a New Node Pool.

This section applies to managed nodes only. For information about upgrading self-managed nodes, see Upgrading Self-Managed Nodes to a Newer Kubernetes Version by Replacing an Existing Self-Managed Node.

You can upgrade the version of Kubernetes running on managed nodes in a node pool by specifying a more recent Kubernetes version for the existing node pool, and then cycling the nodes. Before cycling the nodes, you can specify both a maximum allowed number of new nodes that can be created during the upgrade operation, and a maximum allowed number of nodes that can be unavailable.

When you cycle the nodes, Container Engine for Kubernetes automatically replaces all existing managed nodes with new nodes that run the more recent Kubernetes version you specified.

When cycling nodes, Container Engine for Kubernetes cordons, drains, and terminates nodes according to the node pool's Cordon and drain options.

Balancing service availability and cost when cycling managed nodes

Container Engine for Kubernetes uses two strategies when cycling nodes:

  • Create new (additional) nodes, and then remove existing nodes: Container Engine for Kubernetes adds an additional node (or nodes) to the node pool, running the more recent version of Kubernetes. When the additional node is active, Container Engine for Kubernetes cordons an existing node, drains the node, and removes the node from the node pool. This strategy maintains service availability, but costs more.
  • Remove existing nodes, and then create new nodes: Container Engine for Kubernetes cordons an existing node (or nodes) to make it unavailable, drains the node, and removes the node from the node pool. When the node has been removed, Container Engine for Kubernetes adds a new node to the node pool to replace the node that has been removed. This strategy costs less, but might compromise service availability.

To tailor Container Engine for Kubernetes behavior to meet your own requirements for service availability and cost, you can control and balance the two strategies by specifying:

  • The number of additional nodes to temporarily allow during the upgrade operation (referred to as maxSurge). The greater the number of additional nodes that you allow, the more nodes Container Engine for Kubernetes can upgrade in parallel without compromising service availability. However, the greater the number of additional nodes that you allow, the greater the cost.
  • The number of nodes to allow to be unavailable during the upgrade operation (referred to as maxUnavailable). The greater the number of nodes that you allow to be unavailable, the more nodes Container Engine for Kubernetes can upgrade in parallel without increasing costs. However, the greater the number of nodes that you allow to be unavailable, the more service availability might be compromised.

In both cases, you can specify the allowed number of nodes as an integer, or as a percentage of the number of nodes shown in the node pool's Node count property in the Console (the node pool's Size property in the API). If you don't explicitly specify allowed numbers for additional nodes (maxSurge) and unavailable nodes (maxUnavailable), then the following defaults apply:

  • If you don't specify a value for either maxSurge or maxUnavailable, then maxSurge defaults to 1, and maxUnavailable defaults to 0.
  • If you only specify a value for maxSurge, then maxUnavailable defaults to 0.
  • If you only specify a value for maxUnavailable, then maxSurge defaults to 1.

You cannot specify 0 as the allowed number for both additional nodes (maxSurge) and unavailable nodes (maxUnavailable).
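
For example, a minimal sketch of the node pool cycling configuration, using hypothetical values that allow two additional nodes and no unavailable nodes. In the API and CLI, maxSurge and maxUnavailable correspond to the maximumSurge and maximumUnavailable fields of the JSON payload passed to the --node-pool-cycling-details parameter (shown in full in the CLI section below):

{"isNodeCyclingEnabled": true, "maximumSurge": "2", "maximumUnavailable": "0"}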

Note the following:

  • At the end of the upgrade operation, the number of nodes in the node pool returns to the number specified by the node pool's Node count property shown in the Console (the node pool's Size property in the API).
  • If you specify a value for maxSurge during the upgrade operation, your tenancy must have sufficient quota for the number of additional nodes you specify.
  • If you specify a value for maxUnavailable during the upgrade operation, but the node pool cannot make that number of nodes unavailable (for example, due to a pod disruption budget), the upgrade operation fails.
  • If you enter a percentage as the value of either maxSurge or maxUnavailable, Container Engine for Kubernetes rounds the resulting number of nodes up to the nearest integer when calculating the allowed number of nodes (see the example after this list).
  • If you have used kubectl to update nodes directly (for example, to apply a custom tag to a node), such changes are lost when Container Engine for Kubernetes cycles the nodes.
  • When upgrading large node pools, be aware that the values you specify for maxSurge and maxUnavailable might result in unacceptably long cycle times. For example, if you specify 1 as the value for maxSurge when cycling the nodes of a node pool with 1000 nodes, Container Engine for Kubernetes might take several days to cycle all the nodes in the node pool. If the node cycling operation does not complete within 30 days, the status of the associated work request is set to Failed. Submit another node cycling request to resume the operation.
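
For example, assuming a hypothetical node pool with a Node count of 10, specifying 25% as the value of maxSurge allows up to three additional nodes at a time (10 x 0.25 = 2.5, rounded up to 3). Expressed as the cycling payload passed to the CLI:

{"isNodeCyclingEnabled": true, "maximumSurge": "25%", "maximumUnavailable": "0"}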

Using the Console

To perform an 'in-place' upgrade of a node pool in a cluster, by specifying a more recent Kubernetes version for the existing node pool and then cycling nodes:

  1. Open the navigation menu and click Developer Services. Under Containers & Artifacts, click Kubernetes Clusters (OKE).
  2. Choose a Compartment you have permission to work in.
  3. On the Cluster List page, click the name of the cluster where you want to change the Kubernetes version running on managed nodes.
  4. On the Cluster page, display the Node Pools tab, and click the name of the node pool where you want to upgrade the Kubernetes version running on the managed nodes.

  5. On the Node Pool page, click Edit and, in the Version field, specify the required Kubernetes version for the managed nodes.

    The Kubernetes version you specify must be compatible with the version that is running on the control plane nodes.

  6. Click Save changes to save the change.

    You now cycle nodes to automatically delete existing managed nodes, and start new managed nodes running the Kubernetes version you specified.

    Recommended: Leverage pod disruption budgets as appropriate for your application to ensure that there's a sufficient number of replica pods running throughout the upgrade operation (an example manifest follows these steps). For more information, see Specifying a Disruption Budget for your Application in the Kubernetes documentation.

  7. On the Node Pool page, click Cycle nodes.

  8. In the Cycle nodes dialog:
    1. Control the number of nodes to upgrade in parallel, and balance service availability and cost, by specifying:
      • Maximum number or percentage of additional nodes (maxSurge): The maximum number of additional nodes to temporarily allow in the node pool during the upgrade operation (expressed either as an integer or as a percentage). Additional nodes are nodes over and above the number specified in the node pool's Node count property. If you specify an integer for the number of additional nodes, do not specify a number greater than the value of Node count.
      • Maximum number or percentage of unavailable nodes (maxUnavailable): The maximum number of nodes to allow to be unavailable in the node pool during the upgrade operation (expressed either as an integer or as a percentage). If you specify an integer for the number of unavailable nodes, do not specify a number greater than the value of Node count.

      See Balancing service availability and cost when cycling managed nodes.

    2. Click Cycle nodes to start the upgrade operation.
  9. Monitor the progress of the upgrade operation by viewing the status of the associated work request (see Getting Work Request Details).
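
For reference, here's a minimal sketch of a pod disruption budget manifest, applied with kubectl. It assumes a hypothetical application whose pods carry the label app: my-app and that requires at least two replicas to remain available while nodes are cordoned and drained; adjust the selector and minAvailable (or use maxUnavailable) to suit your own application:

kubectl apply -f - <<EOF
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: my-app
EOF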

Using the CLI

For information about using the CLI, see Command Line Interface (CLI). For a complete list of flags and options available for CLI commands, see the Command Line Reference.

To perform an 'in-place' managed node upgrade by cycling nodes

Update the node pool's worker node Kubernetes version property, and specify the OCID of the corresponding image. Include the --node-pool-cycling-details parameter in the command to specify that you want to cycle the nodes in the node pool, optionally specifying a maximum allowed number of new nodes that can be created during the upgrade operation, and a maximum allowed number of nodes that can be unavailable:

oci ce node-pool update --node-pool-id <node-pool-ocid> --kubernetes-version <version> --node-source-details "{\"imageId\":\"<image-ocid>\",\"sourceType\":\"IMAGE\"}" --node-pool-cycling-details "{\"isNodeCyclingEnabled\":true,\"maximumUnavailable\":\"<value>\",\"maximumSurge\":\"<value>\"}"
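
The command requires the OCID of an image that includes the Kubernetes version you're upgrading to. If you need to look up a suitable image OCID first, one way (a sketch, assuming you have permission to view node pool options) is to list the available node sources, which include image names and OCIDs:

oci ce node-pool-options get --node-pool-option-id all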

Monitor the progress of the upgrade operation by viewing the status of the associated work request:

oci ce work-request list --compartment-id <compartment-ocid> --resource-id <node-pool-ocid>
oci ce work-request get --work-request-id <work-request-ocid>
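
When the work request has completed, you can optionally confirm the Kubernetes version now running on the managed nodes by listing them with kubectl (assuming your kubeconfig is already configured for the cluster); the VERSION column shows each node's kubelet version:

kubectl get nodes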