Deleting a Worker Node
Find out how to delete a worker node in a Kubernetes cluster you've created using Container Engine for Kubernetes.
You can delete specific worker nodes in node pools in clusters you've created with Container Engine for Kubernetes.
Note the following:
- By default, deleting a worker node both deletes that specific worker node from the node pool, and also scales down the node pool itself by subtracting 1 from the number of worker nodes specified for the node pool. To delete a worker node without scaling down the node pool, use the CLI or API.
- When deleting managed nodes, the Cordon and drain options you select determine when and how worker nodes are terminated. See Notes on cordoning and draining managed nodes before termination. The Cordon and drain options are not supported with virtual nodes.
- In addition to being deleted explicitly, worker nodes are also deleted when you scale down node pools and when you change placement configurations.
- Once you have marked a worker node for deletion (during a delete node operation, a scale down operation, or a change to placement configuration), you cannot recover the node. Even if the delete node operation is initially unsuccessful, the next update node pool operation (including a scale up operation) will attempt to terminate the node again.
- Container Engine for Kubernetes creates the worker nodes in a cluster with auto-generated names. Managed node names have the format oke-c<part-of-cluster-OCID>-n<part-of-node-pool-OCID>-s<part-of-subnet-OCID>-<slot>; virtual node names are the same as the node's private IP address. Do not change the auto-generated names of worker nodes. If you do change the auto-generated name of a worker node and then delete the cluster, the renamed worker node is not deleted. You would have to delete the renamed worker node manually.
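You can list the current worker nodes and their auto-generated names with kubectl, assuming your kubeconfig points at the cluster:
# List worker nodes; the NAME column shows the auto-generated names,
# which should not be changed.
kubectl get nodes -o wide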
To delete a worker node using Container Engine for Kubernetes:
- In the Console, open the navigation menu and click Developer Services. Under Containers, click Kubernetes Clusters (OKE).
- Choose a Compartment you have permission to work in.
- On the Cluster List page, click the name of the cluster containing the worker node you want to delete.
- Under Resources, click Node Pools and click the name of the node pool containing the worker node you want to delete.
- Under Resources, click Nodes.
- Select Delete Node from the Actions menu beside the node you want to delete.
Note that deleting a worker node permanently deletes the node. You cannot recover a deleted worker node.
- Either accept the defaults for advanced options, or click Show Advanced Options and specify alternatives as follows:
- Cordon and drain: Specify when and how to cordon and drain worker nodes before terminating them.
- Eviction grace period (mins): The length of time to allow to cordon and drain worker nodes before terminating them. Either accept the default (60 minutes), or specify an alternative. For example, when scaling down a node pool or changing its placement configuration, you might want to allow 30 minutes to cordon worker nodes and drain them of their workloads. To terminate worker nodes immediately, without cordoning and draining them, specify 0 minutes.
- Force terminate after grace period: Whether to terminate worker nodes at the end of the eviction grace period, even if they have not been successfully cordoned and drained. By default, this option is not selected.
Select this option if you always want worker nodes terminated at the end of the eviction grace period, even if they have not been successfully cordoned and drained.
Deselect this option if you do not want worker nodes that have not been successfully cordoned and drained to be terminated at the end of the eviction grace period. Node pools containing worker nodes that could not be terminated within the eviction grace period have the Needs attention status (see Monitoring Clusters). The status of the work request that initiated the termination operation is set to Failed and the termination operation is cancelled.
For more information, see Notes on cordoning and draining managed nodes before termination.
- Click Delete to delete the worker node. Deleting the worker node also scales down the node pool itself by subtracting 1 from the number of worker nodes specified for the node pool.
Using the CLI
Use the oci ce node-pool delete-node command and required parameters to delete a node.
To delete a worker node and scale down the node pool by one:
oci ce node-pool delete-node --node-pool-id <node-pool-ocid> --node-id <node-ocid> [OPTIONS]
For example:
oci ce node-pool delete-node --node-pool-id ocid1.nodepool.oc1.iad.aaaaaaa______eya --node-id ocid1.instance.oc1.iad.anu___4cq
To delete a worker node without scaling down the node pool:
oci ce node-pool delete-node --node-pool-id <node-pool-ocid> --node-id <node-ocid> --is-decrement-size false [OPTIONS]
For example:
oci ce node-pool delete-node --node-pool-id ocid1.nodepool.oc1.iad.aaaaaaa______eya --node-id ocid1.instance.oc1.iad.anu___4cq --is-decrement-size false
For a complete list of flags and variable options for CLI commands, see the Command Line Reference.
Using the API
Run the DeleteNode operation to delete a worker node.
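If you're calling the API directly, the DeleteNode operation corresponds to an HTTP DELETE on the node resource within the node pool. The following sketch uses the OCI CLI's raw-request helper; the service hostname and request path shown here are assumptions inferred from the CLI command structure, so confirm them against the DeleteNode entry in the API reference before relying on them:
# Hedged sketch: call DeleteNode directly via a signed raw request.
# The hostname and path are assumptions; verify them in the API reference.
oci raw-request --http-method DELETE \
  --target-uri "https://containerengine.us-ashburn-1.oci.oraclecloud.com/20180222/nodePools/<node-pool-ocid>/node/<node-ocid>"
In most cases the oci ce node-pool delete-node command shown above is the simpler option, because it applies the same operation without constructing the request yourself.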
Notes on cordoning and draining managed nodes before termination
Cordoning
Cordoning is the name given to marking a worker node in a Kubernetes cluster as unschedulable. Cordoning a worker node prevents the kube-scheduler from placing new pods onto that node, but does not affect existing pods on the node. Cordoning a worker node is a useful preparatory step before terminating the node to perform administrative tasks (such as node deletion, scaling down a node pool, and changing placement configuration). For more information, see Manual Node Administration in the Kubernetes documentation.
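Container Engine for Kubernetes cordons worker nodes for you as part of the termination workflow, but you can also cordon a node manually with kubectl, for example to observe the effect before deleting the node. A minimal sketch, assuming your kubeconfig points at the cluster and <node-name> is replaced with a real node name from kubectl get nodes:
# Mark the node unschedulable; existing pods keep running on it.
kubectl cordon <node-name>
# Verify: the node's STATUS column now includes SchedulingDisabled.
kubectl get nodes
# Reverse the operation if you change your mind.
kubectl uncordon <node-name>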
Draining
Draining is the name given to safely evicting pods from a worker node in a Kubernetes cluster. Safely evicting pods ensures the pod's containers terminate gracefully and perform any necessary cleanup. For more information, see Safely Drain a Node and Termination of Pods in the Kubernetes documentation.
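The Cordon and drain options perform draining for you, but the equivalent manual operation with kubectl looks like the following sketch. The flags are illustrative: --ignore-daemonsets is usually required because DaemonSet-managed pods cannot be evicted, and --delete-emptydir-data acknowledges that data in emptyDir volumes is lost when the pods are removed:
# Cordon the node and safely evict its pods, respecting pod disruption budgets.
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data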
Pod disruption budgets
Pod disruption budgets are a Kubernetes feature for limiting the number of concurrent disruptions that an application experiences. Using pod disruption budgets ensures high application availability while still enabling you to perform administrative tasks on worker nodes. Pod disruption budgets can prevent pods from being evicted when draining worker nodes. For more information, see Specifying a Disruption Budget for your Application in the Kubernetes documentation.
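For example, the following sketch creates a pod disruption budget that keeps at least two replicas of a hypothetical deployment labeled app=nginx available during voluntary disruptions such as node draining (the budget name and label selector are placeholders for illustration):
# Create a PodDisruptionBudget that permits eviction only while at least
# 2 pods matching app=nginx remain available.
kubectl apply -f - <<EOF
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: nginx-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: nginx
EOF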
Node pools with a "Needs attention" status
When deleting worker nodes from clusters you've created with Container Engine for Kubernetes, you can use the Cordon and drain options to specify when and how worker nodes are terminated:
- Eviction grace period (mins): The length of time to allow to cordon and drain worker nodes before terminating them. Either accept the default (60 minutes), or specify an alternative. For example, when scaling down a node pool or changing its placement configuration, you might want to allow 30 minutes to cordon worker nodes and drain them of their workloads. To terminate worker nodes immediately, without cordoning and draining them, specify 0 minutes.
- Force terminate after grace period: Whether to terminate worker nodes at the end of the eviction grace period, even if they have not been successfully cordoned and drained. By default, this option is not selected.
Select this option if you always want worker nodes terminated at the end of the eviction grace period, even if they have not been successfully cordoned and drained.
Deselect this option if you do not want worker nodes that have not been successfully cordoned and drained to be terminated at the end of the eviction grace period. Node pools containing worker nodes that could not be terminated within the eviction grace period have the Needs attention status (see Monitoring Clusters).
A node pool with the Needs attention status indicates that one or more of the worker nodes in the node pool failed to evict all the pods running on it within the eviction grace period. The status of the work request that initiated the termination operation is set to Failed. You can view the reason for the failure, including the specific pods that cannot be evicted, in the work request logs (see Viewing Work Requests). There are a number of possible reasons why a pod cannot be evicted, including restrictive pod disruption budgets. For more information, see Scheduling, Preemption and Eviction in the Kubernetes documentation.
To resolve a node pool's Needs attention status and terminate affected worker nodes, do either of the following:
- Re-issue the original command and select the Force terminate after grace period option. Nodes are terminated at the end of the eviction grace period, even if they have not been successfully cordoned and drained.
- Examine the work request log to determine the reason for the eviction failure, address the reason (for example, by creating a less restrictive pod disruption budget), and re-issue the original command.
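When investigating an eviction failure, it can help to list the pods still running on the affected node and the pod disruption budgets that might be blocking their eviction. A sketch using kubectl, where <node-name> is the Kubernetes name of the affected node as shown by kubectl get nodes:
# List the pods still scheduled on the affected worker node.
kubectl get pods --all-namespaces --field-selector spec.nodeName=<node-name>
# List pod disruption budgets in all namespaces; an ALLOWED DISRUPTIONS value
# of 0 indicates a budget that will block eviction.
kubectl get poddisruptionbudgets --all-namespaces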
Using the CLI to resolve a node pool's "Needs attention" status
To use the CLI to resolve a node pool's Needs attention status and terminate affected worker nodes, enter:
oci ce node-pool get --node-pool-id <nodepool-ocid> | jq '{ state: .data."lifecycle-state", nodes: (.data.nodes | .[] | {id, "node-error"} ) }'
where --node-pool-id <nodepool-ocid> is the OCID of the node pool with the Needs attention status.
For example:
oci ce node-pool get --node-pool-id ocid1.nodepool.oc1.iad.aaaaaaa______eya | jq '{ state: .data."lifecycle-state", nodes: (.data.nodes | .[] | {id, "node-error"} ) }'
The response to the command lists worker nodes currently in a node-error state, along with an explanation. For example:
{
  "state": "NEEDS_ATTENTION",
  "nodes": {
    "id": "ocid1.instance.oc1.iad.anu___4cq",
    "node-error": {
      "code": "PodEvictionFailureError",
      "message": "Pod(s) {sigterm-app-55c4f4f657-wccqn} of Node ocid1.instance.oc1.iad.anuwc______4cq could not be evicted.",
      "opc-request-id": null,
      "status": null
    }
  }
}
In this example, you can see that a pod could not be evicted from the worker node within the eviction grace period. As a result, the worker node could not be terminated. It is your responsibility to identify why the pod could not be evicted, and then to fix the underlying problem (for example, by creating a less restrictive pod disruption budget).
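For instance, if a pod disruption budget currently allows zero disruptions, you could relax it before retrying the deletion. A hedged sketch, where the budget name, namespace, and new value are placeholders to adapt to your own application:
# If the budget uses minAvailable, lower it (here to 1) so eviction can proceed;
# if it uses maxUnavailable instead, adjust that field in the same way.
kubectl patch pdb <pdb-name> -n <namespace> --type merge -p '{"spec":{"minAvailable":1}}'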
Having fixed the problem, you can go ahead and delete the worker node by entering:
oci ce node-pool delete-node --node-pool-id <nodepool-ocid> --node-id <node-ocid>
For example:
oci ce node-pool delete-node --node-pool-id ocid1.nodepool.oc1.iad.aaaaaaa______eya --node-id ocid1.instance.oc1.iad.anu___4cq
If you want to force the deletion of the worker node without cordoning and draining the worker node, and without rectifying the underlying problem, enter:
oci ce node-pool delete-node --node-pool-id <nodepool-ocid> --node-id <node-ocid> --override-eviction-grace-duration PT0M
where --override-eviction-grace-duration PT0M sets the eviction grace period to 0 minutes.
For example:
oci ce node-pool delete-node --node-pool-id ocid1.nodepool.oc1.iad.aaaaaaa______eya --node-id ocid1.instance.oc1.iad.anu___4cq --override-eviction-grace-duration PT0M
Node pools with quantityPerSubnet set to 1 or more
When creating and updating node pools in earlier Container Engine for Kubernetes releases, you specified how many worker nodes you wanted in a node pool by entering a value for the Quantity per subnet property (quantityPerSubnet in the API).
In more recent Container Engine for Kubernetes releases, you specify how many worker nodes you want in a node pool by entering a value for the Number of Nodes property (size in the API).
Note that you can only delete specific worker nodes (and select Cordon and drain options) when deleting from node pools that have Quantity per subnet (quantityPerSubnet) set to zero or null. To delete specific worker nodes (and select Cordon and drain options) from an older node pool that has Quantity per subnet set to 1 or more, you must first set Quantity per subnet to zero or null, and then specify the number of worker nodes by entering a value for Number of Nodes (size) instead. From that point onwards, you can delete specific worker nodes (and select Cordon and drain options).
To find out the value of Quantity per subnet (quantityPerSubnet) for a node pool, enter the following command:
oci ce node-pool get --node-pool-id <node-pool-ocid>
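To extract just the relevant field, you can pipe the output through jq, as the earlier examples in this topic do. This assumes the property appears as quantity-per-subnet under .data in the CLI's JSON output; check the full response for your node pool if the key differs:
# Show only the node pool's quantity-per-subnet value (null or 0 means the
# node pool already uses the Number of Nodes / size property).
oci ce node-pool get --node-pool-id <node-pool-ocid> | jq '.data."quantity-per-subnet"'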