Adding Node Pools to Scale Up Clusters

Find out how to scale up clusters by adding node pools using Kubernetes Engine (OKE).

You can scale up clusters by adding node pools using the Console, the CLI, or the API.

  • To scale up an existing cluster by increasing the number of node pools in the cluster using the Console:

    1. Open the navigation menu and select Developer Services. Under Containers & Artifacts, select Kubernetes Clusters (OKE).
    2. Choose a Compartment you have permission to work in.
    3. On the Cluster List page, click the name of the cluster you want to modify.
    4. Under Resources, click Node pools.
    5. Click the Add node pool button.
    6. Enter details for the new node pool:
      • Name: A name of your choice for the new node pool. Avoid entering confidential information.
      • Compartment: The compartment in which to create the new node pool.
      • Node type: If the cluster's network type is VCN-native pod networking, specify the type of worker nodes in this node pool (see Virtual Nodes and Managed Nodes). Select one of the following options:
        • Managed: Select this option when you want to have responsibility for managing the worker nodes in the node pool. Managed nodes run on compute instances (either bare metal or virtual machine) in your tenancy. Because you are responsible for managing managed nodes, you have the flexibility to configure them to meet your specific requirements. You are responsible for upgrading Kubernetes on managed nodes, and for managing cluster capacity.
        • Virtual: Select this option when you want to benefit from a 'serverless' Kubernetes experience. Virtual nodes enable you to run Kubernetes pods at scale without the operational overhead of upgrading the data plane infrastructure and managing the capacity of clusters.

        For more information, see Comparing Virtual Nodes with Managed Nodes.

      • Version: (Managed node pools only) The version of Kubernetes to run on each managed node in a managed node pool. By default, the version of Kubernetes specified for the control plane nodes is selected. The Kubernetes version on worker nodes must be either the same version as that on the control plane nodes, or an earlier version that is still compatible. See Kubernetes Versions and Kubernetes Engine (OKE).

        Note that if you specify an OKE image for worker nodes, the Kubernetes version you select here must be the same as the version of Kubernetes in the OKE image.

    7. If the cluster's network type is VCN-native pod networking and you selected Managed as the Node type, or if the cluster's network type is Flannel overlay:
      1. Specify configuration details for the managed node pool:

        • Node Placement Configuration:
          • Availability domain: An availability domain in which to place worker nodes.
          • Worker node subnet: A regional subnet (recommended) or AD-specific subnet configured to host worker nodes. If you specified load balancer subnets, the worker node subnets must be different. The subnets you specify can be private (recommended) or public. See Subnet Configuration.
          • Fault domains: (Optional) One or more fault domains in the availability domain in which to place worker nodes.

          Optionally click Show advanced options to specify a capacity type to use (see Managing Worker Node Capacity Types). If you specify a capacity reservation, make sure that the node shape, availability domain, and fault domain in the node pool's placement configuration match the capacity reservation's instance type, availability domain, and fault domain respectively. See Using Capacity Reservations to Provision Managed Nodes.

          Optionally click Another Row to select additional domains and subnets in which to place worker nodes.

          When the worker nodes are created, they are distributed as evenly as possible across the availability domains and fault domains you select. If you don't select any fault domains for a particular availability domain, the worker nodes are distributed as evenly as possible across all the fault domains in that availability domain.
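
          As a rough illustration of what "as evenly as possible" can mean when planning a node count, the following shell sketch splits a number of nodes across a number of placement domains. The function name spread_nodes is hypothetical, and the choice of which domains receive the extra nodes when the count does not divide evenly is an assumption, not documented behavior.

          ```shell
          # spread_nodes <node-count> <domain-count>
          # Prints one line per domain with the number of nodes it would receive.
          spread_nodes() {
            nodes=$1
            domains=$2
            base=$((nodes / domains))    # every domain gets at least this many
            extra=$((nodes % domains))   # this many domains get one more (assumed: the first ones)
            i=1
            while [ "$i" -le "$domains" ]; do
              if [ "$i" -le "$extra" ]; then
                echo "domain-$i: $((base + 1))"
              else
                echo "domain-$i: $base"
              fi
              i=$((i + 1))
            done
          }

          # Example: five worker nodes across three availability domains.
          spread_nodes 5 3
          ```

          In this example, two domains receive two nodes each and one domain receives one node.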

        • Node Shape: The shape to use for worker nodes in the node pool. The shape determines the number of CPUs and the amount of memory allocated to each node.

          Only those shapes available in your tenancy that are supported by Kubernetes Engine are shown.

          If you select a flexible shape, you can explicitly specify the number of CPUs and the amount of memory.

          See Supported Images (Including Custom Images) and Shapes for Worker Nodes.

        • Image: The image to use on worker nodes in the node pool. An image is a template of a virtual hard drive that determines the operating system and other software for the node.

          To change the default image, click Change image. In the Browse all images window, choose an Image source and select an image as follows:

          • OKE Worker Node Images: Recommended. Provided by Oracle and built on top of platform images. OKE images are optimized to serve as base images for worker nodes, with all the necessary configurations and required software. Select an OKE image if you want to minimize the time it takes to provision worker nodes at runtime when compared to platform images and custom images.

            OKE image names include the version number of the Kubernetes version they contain. Note that if you specify a Kubernetes version for the node pool, the OKE image you select here must have the same version number as the node pool's Kubernetes version.

          • Platform images: Provided by Oracle and only contain an Oracle Linux operating system. Select a platform image if you want Kubernetes Engine to download, install, and configure required software when the compute instance hosting a worker node boots up for the first time.

          See Supported Images (Including Custom Images) and Shapes for Worker Nodes.

        • Node count: The number of worker nodes to create in the node pool, placed in the availability domains you select, and in the regional subnet (recommended) or AD-specific subnet you specify for each availability domain.
        • Use security rules in Network Security Group (NSG): Control access to the node pool using security rules defined for one or more network security groups (NSGs) that you specify (up to a maximum of five). You can use security rules defined for NSGs instead of, or as well as, those defined for security lists (NSGs are recommended). For more information about the security rules to specify for the NSG, see Security Rules for Worker Nodes.
        • Boot volume: Configure the size and encryption options for the worker node's boot volume:

          • To specify a custom size for the boot volume, select the Specify a custom boot volume size check box. Then, enter a custom size from 50 GB to 32 TB. The specified size must be larger than the default boot volume size for the selected image. See Custom Boot Volume Sizes for more information.

            Note that if you increase the boot volume size, you also need to extend the partition for the boot volume (the root partition) to take advantage of the larger size. See Extending the Partition for a Boot Volume. Oracle Linux platform images include the oci-utils package. You can use the oci-growfs command from that package in a custom cloud-init script to extend the root partition and then grow the file system. For more information, see Extending the Root Partition of Worker Nodes.

          • For VM instances, you can optionally select the Use in-transit encryption check box. For bare metal instances that support in-transit encryption, it is enabled by default and is not configurable. See In-transit Encryption for more information about in-transit encryption. If you are using your own Vault service encryption key for the boot volume, then this key is also used for in-transit encryption. Otherwise, the Oracle-provided encryption key is used.
          • Boot volumes are encrypted by default, but you can optionally use your own Vault service encryption key to encrypt the data in this volume. To use the Vault service for your encryption needs, select the Encrypt this volume with a key that you manage check box. Select the vault compartment and vault that contains the master encryption key that you want to use, and then select the master encryption key compartment and master encryption key. If you enable this option, this key is used for both data at rest encryption and in-transit encryption.
            Important

            The Block Volume service does not support encrypting volumes with keys encrypted using the Rivest-Shamir-Adleman (RSA) algorithm. When using your own keys, you must use keys encrypted using the Advanced Encryption Standard (AES) algorithm. This applies to block volumes and boot volumes.

          Note that to use your own Vault service encryption key to encrypt data, an IAM policy must grant access to the service encryption key. See Create Policy to Access User-Managed Encryption Keys for Encrypting Boot Volumes, Block Volumes, and/or File Systems.
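
          The custom boot volume size rules described above (from 50 GB to 32 TB, and larger than the image's default boot volume size) can be sketched as a simple pre-check before you create the node pool. The function name check_boot_volume_size is hypothetical, the image default of 50 GB in the example is a made-up value, and 32 TB is assumed to equal 32768 GB.

          ```shell
          # check_boot_volume_size <requested-gb> <image-default-gb>
          # Succeeds (exit status 0) only if the requested size satisfies both rules.
          check_boot_volume_size() {
            requested=$1
            image_default=$2
            min_gb=50
            max_gb=32768   # 32 TB expressed in GB (assuming 1 TB = 1024 GB)
            [ "$requested" -ge "$min_gb" ] &&
              [ "$requested" -le "$max_gb" ] &&
              [ "$requested" -gt "$image_default" ]
          }

          # Example with a hypothetical image default of 50 GB.
          if check_boot_volume_size 100 50; then
            echo "100 GB is an acceptable custom boot volume size"
          fi
          ```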

        • Pod communication: When the cluster's Network type is VCN-native pod networking, specify how pods in the node pool communicate with each other using a pod subnet:
          • Subnet: A regional subnet configured to host pods. The pod subnet you specify must be private. In some situations, the worker node subnet and the pod subnet can be the same subnet (in which case, Oracle recommends defining security rules in network security groups rather than in security lists). See Subnet Configuration.
          • Use security rules in Network Security Group (NSG): Control access to the pod subnet using security rules defined for one or more network security groups (NSGs) that you specify (up to a maximum of five). You can use security rules defined for NSGs instead of, or as well as, those defined for security lists (NSGs are recommended). For more information about the security rules to specify for the NSG, see Security Rules for Worker Nodes and Pods.

          Optionally click Show advanced options to specify the maximum number of pods that you want to run on a single worker node in a node pool, up to a limit of 110. The limit of 110 is imposed by Kubernetes. If you want more than 31 pods on a single worker node, the shape you specify for the node pool must support three or more VNICs (one VNIC to connect to the worker node subnet, and at least two VNICs to connect to the pod subnet). See Maximum Number of VNICs and Pods Supported by Different Shapes.

          For more information about pod communication, see Pod Networking.
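
          The relationship between VNICs and the maximum number of pods described above can be approximated with a short calculation. This sketch assumes that one VNIC is used for the worker node subnet and that each remaining VNIC supports up to 31 pods, which is inferred from the text above rather than taken from a published formula. The function name max_pods_for_vnics is hypothetical. Check Maximum Number of VNICs and Pods Supported by Different Shapes for the authoritative values.

          ```shell
          # max_pods_for_vnics <vnic-count>
          # One VNIC connects to the worker node subnet; each remaining VNIC is
          # assumed to host up to 31 pods, capped at the Kubernetes limit of 110.
          max_pods_for_vnics() {
            vnics=$1
            pods=$(( (vnics - 1) * 31 ))
            if [ "$pods" -gt 110 ]; then
              pods=110
            fi
            echo "$pods"
          }

          max_pods_for_vnics 2   # prints 31
          max_pods_for_vnics 3   # prints 62
          ```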

      2. Either accept the defaults for advanced node pool options, or select Show advanced options and specify alternatives as follows:

        • Cordon and drain: Specify when and how to cordon and drain worker nodes before terminating them.

          • Eviction grace period (mins): The length of time to allow to cordon and drain worker nodes before terminating them. Either accept the default (60 minutes) or specify an alternative. For example, when scaling down a node pool or changing its placement configuration, you might want to allow 30 minutes to cordon worker nodes and drain them of their workloads. To terminate worker nodes immediately, without cordoning and draining them, specify 0 minutes.
          • Force terminate after grace period: Whether to terminate worker nodes at the end of the eviction grace period, even if they haven't been successfully cordoned and drained. By default, this option isn't selected.

            Select this option if you always want worker nodes terminated at the end of the eviction grace period, even if they haven't been successfully cordoned and drained.

            De-select this option if you don't want worker nodes that haven't been successfully cordoned and drained to be terminated at the end of the eviction grace period. Node pools containing worker nodes that can't be terminated within the eviction grace period have the Needs attention status. The status of the work request that initiated the termination operation is set to Failed, and the termination operation is cancelled. For more information, see Monitoring Clusters.

          For more information, see Notes on Cordoning and Draining Managed Nodes Before Termination.
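
          If you set the same options using the CLI or the API rather than the Console, the cordon and drain settings are typically expressed as a small JSON object. The following sketch builds such an object from a grace period in minutes. The field names evictionGraceDuration and isForceDeleteAfterGraceDuration, the ISO 8601 duration format, and the 0 to 60 minute range are assumptions to verify against the CLI Command Reference and the API documentation. The function name eviction_settings_json is hypothetical.

          ```shell
          # eviction_settings_json <grace-minutes> <force: true|false>
          # Prints a JSON object; returns non-zero if the grace period is out of range.
          eviction_settings_json() {
            minutes=$1
            force=$2
            # Assumed valid range: 0 (terminate immediately) to 60 (the default).
            if [ "$minutes" -lt 0 ] || [ "$minutes" -gt 60 ]; then
              echo "grace period must be between 0 and 60 minutes" >&2
              return 1
            fi
            # PT<n>M is the ISO 8601 duration form of "<n> minutes".
            printf '{"evictionGraceDuration":"PT%sM","isForceDeleteAfterGraceDuration":%s}\n' \
              "$minutes" "$force"
          }

          # Example: allow 30 minutes, and do not force termination afterwards.
          eviction_settings_json 30 false
          ```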

        • Initialization script: (Optional) A script for cloud-init to run on each instance hosting worker nodes when the instance boots up for the first time. The script you specify must be written in one of the formats supported by cloud-init (for example, cloud-config), and must be a supported file type (for example, .yaml). Specify the script as follows:
          • Choose Cloud-Init Script: Select a file containing the cloud-init script, or drag and drop the file into the box.
          • Paste Cloud-Init Script: Copy the contents of a cloud-init script, and paste it into the box.

          If you have not previously written cloud-init scripts for initializing worker nodes in clusters created by Kubernetes Engine, you might find it helpful to click Download Custom Cloud-Init Script Template. The downloaded file contains the default logic provided by Kubernetes Engine. You can add your own custom logic either before or after the default logic, but do not modify the default logic. For examples, see Example Usecases for Custom Cloud-init Scripts.
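
          As a minimal sketch of adding custom logic after the default logic, the following shell-format cloud-init script extends the root partition with oci-growfs once the default logic has run. The default logic itself is represented only by a placeholder comment, and must be copied unchanged from the downloaded template. The path /usr/libexec/oci-growfs and the -y option are assumptions about how the oci-utils package is installed on the image, and the grow_status variable is purely illustrative.

          ```shell
          #!/bin/bash
          # (Default logic from the downloaded template goes here, unmodified.)

          # Custom logic, added after the default logic:
          GROWFS=/usr/libexec/oci-growfs   # assumed install path of oci-growfs
          if [ -x "$GROWFS" ]; then
            # Extend the root partition and grow the file system without prompting.
            "$GROWFS" -y
            grow_status="attempted"
          else
            echo "oci-growfs not found; skipping root partition extension" >&2
            grow_status="skipped"
          fi
          ```

          Guarding the call with a check for the executable keeps the script from failing on images that do not include the oci-utils package.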

        • Kubernetes Labels: (Optional) One or more labels (in addition to a default label) to add to worker nodes in the node pool to enable the targeting of workloads at specific node pools. For example, to exclude all the nodes in a node pool from the list of backend servers in a load balancer backend set, specify node.kubernetes.io/exclude-from-external-load-balancers=true (see node.kubernetes.io/exclude-from-external-load-balancers).
        • Node pool tags and Node tags: (Optional) One or more tags to add to the node pool, and to compute instances hosting worker nodes in the node pool. Tagging enables you to group disparate resources across compartments, and also enables you to annotate resources with your own metadata. See Tagging Kubernetes Cluster-Related Resources.
        • Public SSH Key: (Optional) The public key portion of the key pair you want to use for SSH access to each node in the node pool. The public key is installed on all worker nodes in the cluster. If you don't specify a public SSH key, Kubernetes Engine provides one. However, because you won't have the corresponding private key, you won't have SSH access to the worker nodes. Also note that you cannot use SSH to directly access worker nodes in private subnets (see Connecting to Managed Nodes in Private Subnets Using SSH).
    8. If you selected Virtual as the Node type:
      1. Specify configuration details for the virtual node pool:
        • Node count: The number of virtual nodes to create in the virtual node pool, placed in the availability domains you select, and in the regional subnet (recommended) or AD-specific subnet you specify for each availability domain.
        • Pod shape: The shape to use for pods running on virtual nodes in the virtual node pool. The shape determines the processor type on which to run the pod.

          Only those shapes available in your tenancy that are supported by Kubernetes Engine are shown. See Supported Images (Including Custom Images) and Shapes for Worker Nodes.

          Note that you explicitly specify the CPU and memory resource requirements for virtual nodes in the pod spec (see Assign Memory Resources to Containers and Pods and Assign CPU Resources to Containers and Pods in the Kubernetes documentation).

        • Pod communication: Pods running on virtual nodes use VCN-native pod networking. Specify how pods in the node pool communicate with each other using a pod subnet:
          • Subnet: A regional subnet configured to host pods. The pod subnet you specify for virtual nodes must be private. We recommend that the pod subnet and the virtual node subnet are the same subnet (in which case, Oracle recommends defining security rules in network security groups rather than in security lists). See Subnet Configuration.
          • Use security rules in Network Security Group (NSG): Control access to the pod subnet using security rules defined for one or more network security groups (NSGs) that you specify (up to a maximum of five). You can use security rules defined for NSGs instead of, or as well as, those defined for security lists (NSGs are recommended). For more information about the security rules to specify for the NSG, see Security Rules for Worker Nodes and Pods.

          For more information about pod communication, see Pod Networking.

        • Virtual node communication:
          • Subnet: A regional subnet (recommended) or AD-specific subnet configured to host virtual nodes. If you specified load balancer subnets, the virtual node subnets must be different. The subnets you specify can be private (recommended) or public. We recommend that the pod subnet and the virtual node subnet are the same subnet (in which case, the virtual node subnet must be private). See Subnet Configuration.
        • Node Placement Configuration:
          • Availability domain: An availability domain in which to place virtual nodes.
          • Fault domains: (Optional) One or more fault domains in the availability domain in which to place virtual nodes.

          Optionally click Another Row to select additional domains and subnets in which to place virtual nodes.

          When the virtual nodes are created, they are distributed as evenly as possible across the availability domains and fault domains you select. If you don't select any fault domains for a particular availability domain, the virtual nodes are distributed as evenly as possible across all the fault domains in that availability domain.

      2. Either accept the defaults for advanced virtual node pool options, or click Show advanced options and specify alternatives as follows:

        • Node pool tags: (Optional) One or more tags to add to the virtual node pool. Tagging enables you to group disparate resources across compartments, and also enables you to annotate resources with your own metadata. See Tagging Kubernetes Cluster-Related Resources.
        • Kubernetes labels and taints: (Optional) Enable the targeting of workloads at specific node pools by adding labels and taints to virtual nodes:
          • Labels: One or more labels (in addition to a default label) to add to virtual nodes in the virtual node pool to enable the targeting of workloads at specific node pools.
          • Taints: One or more taints to add to virtual nodes in the virtual node pool. Taints enable virtual nodes to repel pods, thereby ensuring that pods do not run on virtual nodes in a particular virtual node pool. Note that you can only apply taints to virtual nodes.

          For more information, see Assigning Pods to Nodes in the Kubernetes documentation.

    9. Click Add to create the new node pool.
  • Use the oci ce node-pool create command and required parameters to scale up a cluster by adding a managed node pool:

    oci ce node-pool create \
    --cluster-id <cluster-ocid> \
    --compartment-id <compartment-ocid> \
    --name <node-pool-name> \
    --node-shape <shape>

    Use the oci ce virtual-node-pool create command and required parameters to scale up a cluster by adding a virtual node pool:

    oci ce virtual-node-pool create \
    --cluster-id <cluster-ocid> \
    --compartment-id <compartment-ocid> \
    --display-name <node-pool-name> \
    --kubernetes-version <kubernetes-version> \
    --placement-configurations "[{\"availabilityDomain\":\"<ad-name>\",\"faultDomain\":[\"FAULT-DOMAIN-<n>\"],\"subnetId\":\"<virtualnode-subnet-ocid>\"}]" \
    --nsg-ids "[\"<virtual-node-nsg-ocid>\"]" \
    --pod-configuration "{\"subnetId\":\"<pod-subnet-ocid>\",\"nsgIds\":[\"<pod-nsg-ocid>\"],\"shape\":\"<shape-name>\"}" \
    --size <number-of-nodes>

    where:
    • <ad-name> is the name of the availability domain in which to place virtual nodes. To find out the availability domain name to use, run:
      oci iam availability-domain list
    • <shape-name> is one of Pod.Standard.E3.Flex, Pod.Standard.E4.Flex.

    For a complete list of parameters and values for CLI commands, see the CLI Command Reference.
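
    The escaped JSON in the virtual node pool command can be hard to read and edit. As one possible alternative, the following sketch writes the placement configuration to a file and refers to it using the CLI's file:// input syntax for complex parameters. The file name placement.json and the variable names are hypothetical, and the placeholder values must be replaced with real values. The command itself is not run here; only the option is printed.

    ```shell
    # Placeholder values; replace with a real availability domain name and subnet OCID.
    AD_NAME="<ad-name>"
    SUBNET_OCID="<virtualnode-subnet-ocid>"

    # Write the placement configuration as plain JSON (no shell escaping needed).
    printf '[{"availabilityDomain":"%s","faultDomain":["FAULT-DOMAIN-1"],"subnetId":"%s"}]\n' \
      "$AD_NAME" "$SUBNET_OCID" > placement.json

    # The option as it would be passed to oci ce virtual-node-pool create.
    PLACEMENT_ARG="file://placement.json"
    echo "--placement-configurations $PLACEMENT_ARG"
    ```

    Keeping the JSON in a file also makes it easier to review the configuration or keep it under version control.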

  • Run the CreateNodePool operation to scale up a cluster by adding a managed node pool.

    Run the CreateVirtualNodePool operation to scale up a cluster by adding a virtual node pool.
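
    To give an idea of what a CreateNodePool request might contain, the following sketch writes a minimal request body to a file. The field names reflect an understanding of the CreateNodePoolDetails schema and are assumptions to verify against the API documentation. The file name create-node-pool.json, the node count of 3, and the <worker-subnet-ocid> placeholder are illustrative only, and a real request is likely to need further fields (for example, to specify the image).

    ```shell
    # Placeholder values; replace with real OCIDs, names, and versions.
    printf '%s\n' \
      '{' \
      '  "clusterId": "<cluster-ocid>",' \
      '  "compartmentId": "<compartment-ocid>",' \
      '  "name": "<node-pool-name>",' \
      '  "kubernetesVersion": "<kubernetes-version>",' \
      '  "nodeShape": "<shape>",' \
      '  "nodeConfigDetails": {' \
      '    "size": 3,' \
      '    "placementConfigs": [' \
      '      { "availabilityDomain": "<ad-name>", "subnetId": "<worker-subnet-ocid>" }' \
      '    ]' \
      '  }' \
      '}' > create-node-pool.json

    # Display the request body for review before sending it with your API client.
    cat create-node-pool.json
    ```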