Azure Replace a Node
Replace a worker node
Prerequisites
Before you begin, you must:
Replace a worker node
In certain situations, you may want to delete a worker node and have Cluster API replace it with a newly provisioned machine.
Identify the name of the node to delete.
List the nodes:
    kubectl --kubeconfig ${CLUSTER_NAME}.conf get nodes

The output from this command resembles the following:

    NAME                                STATUS   ROLES                  AGE   VERSION
    azure-example-control-plane-ckwm4   Ready    control-plane,master   35m   v1.28.7
    azure-example-control-plane-d4fdf   Ready    control-plane,master   31m   v1.28.7
    azure-example-control-plane-qrvm9   Ready    control-plane,master   33m   v1.28.7
    azure-example-md-0-4w7gq            Ready    <none>                 33m   v1.28.7
    azure-example-md-0-6gb9k            Ready    <none>                 33m   v1.28.7
    azure-example-md-0-p2n8c            Ready    <none>                 11m   v1.28.7
    azure-example-md-0-s5zbh            Ready    <none>                 33m   v1.28.7
Export a variable with the node name to use in the next steps. This example uses the name azure-example-control-plane-ckwm4:

    export NAME_NODE_TO_DELETE="azure-example-control-plane-ckwm4"
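Before deleting anything, it can help to confirm that the exported name matches a real Node. The following is a minimal sketch, not part of the documented procedure. The function names list_worker_nodes and node_exists are hypothetical, and the selector assumes the cluster applies the standard node-role.kubernetes.io/control-plane label to control plane nodes.

```shell
# Hypothetical helpers; they assume CLUSTER_NAME is already exported.

# List only worker nodes by excluding nodes that carry the
# control plane role label (assumed to be present on control plane nodes).
list_worker_nodes() {
  kubectl --kubeconfig "${CLUSTER_NAME}.conf" get nodes \
    --selector='!node-role.kubernetes.io/control-plane'
}

# Return success only if a Node with the given name exists.
node_exists() {
  kubectl --kubeconfig "${CLUSTER_NAME}.conf" get node "$1" -o name >/dev/null 2>&1
}

# Example usage:
#   list_worker_nodes
#   node_exists "$NAME_NODE_TO_DELETE" || echo "No such node: $NAME_NODE_TO_DELETE"
```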
Delete the Machine resource:

    export NAME_MACHINE_TO_DELETE=$(kubectl --kubeconfig ${CLUSTER_NAME}.conf get machine -ojsonpath="{.items[?(@.status.nodeRef.name==\"$NAME_NODE_TO_DELETE\")].metadata.name}")
    kubectl --kubeconfig ${CLUSTER_NAME}.conf delete machine "$NAME_MACHINE_TO_DELETE"

The output resembles the following:

    machine.cluster.x-k8s.io "azure-example-control-plane-slprd" deleted
The command does not return immediately; it returns only after the Machine resource has been deleted.
A few minutes after the Machine resource is deleted, the corresponding Node resource is also deleted.
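Rather than rechecking by hand, you can block until the Node resource is gone. This is a sketch under assumptions: the function name wait_for_node_deletion and the 10m default timeout are illustrative, not values from this procedure. It relies on kubectl wait with --for=delete; how kubectl behaves when the Node is already gone may vary between kubectl versions.

```shell
# Hypothetical helper; it assumes CLUSTER_NAME is already exported.
# $1: Node name, $2: optional timeout (illustrative default of 10m).
wait_for_node_deletion() {
  kubectl --kubeconfig "${CLUSTER_NAME}.conf" wait \
    --for=delete "node/$1" --timeout="${2:-10m}"
}

# Example usage:
#   wait_for_node_deletion "$NAME_NODE_TO_DELETE" 15m
```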
Observe that the Machine resource is being replaced using this command:
    kubectl --kubeconfig ${CLUSTER_NAME}.conf get machinedeployment

The output resembles the following:

    NAME                 CLUSTER         REPLICAS   READY   UPDATED   UNAVAILABLE   PHASE       AGE     VERSION
    azure-example-md-0   azure-example   4          3       4         1             ScalingUp   7m30s   v1.28.7
    long-running-md-0    long-running    4          4       4         0             Running     7m28s   v1.28.7

In this example, azure-example-md-0 has four replicas, but only three are ready. One replica is unavailable, and the ScalingUp phase means a new Machine is being created.

Identify the replacement Machine using this command:
    export NAME_NEW_MACHINE=$(kubectl --kubeconfig ${CLUSTER_NAME}.conf get machines \
      -l=cluster.x-k8s.io/deployment-name=${CLUSTER_NAME}-md-0 \
      -ojsonpath='{.items[?(@.status.phase=="Running")].metadata.name}{"\n"}')
    echo $NAME_NEW_MACHINE

The output resembles the following:

    azure-example-md-0-d67567c8b-2674r azure-example-md-0-d67567c8b-n276j azure-example-md-0-d67567c8b-pzg8k azure-example-md-0-d67567c8b-z8km9

The command lists only Machines in the Running phase. If the replacement Machine is not listed, it is probably still in the Provisioning phase. Wait a few minutes, then repeat the command.

Identify the replacement Node using this command:
    kubectl --kubeconfig ${CLUSTER_NAME}.conf get nodes

The output resembles the following:

    NAME                                STATUS   ROLES                  AGE     VERSION
    azure-example-control-plane-d4fdf   Ready    control-plane,master   43m     v1.28.7
    azure-example-control-plane-qrvm9   Ready    control-plane,master   45m     v1.28.7
    azure-example-control-plane-tz56m   Ready    control-plane,master   8m22s   v1.28.7
    azure-example-md-0-4w7gq            Ready    <none>                 45m     v1.28.7
    azure-example-md-0-6gb9k            Ready    <none>                 45m     v1.28.7
    azure-example-md-0-p2n8c            Ready    <none>                 22m     v1.28.7
    azure-example-md-0-s5zbh            Ready    <none>                 45m     v1.28.7

The replacement Node is the one with the lowest age. If it is not listed, the Node resource is not yet available. Wait a few minutes, then repeat the command.
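The "wait a few minutes, then repeat" pattern used throughout this procedure can be scripted. The sketch below polls the MachineDeployment until its phase returns to Running, which is the phase shown for a fully scaled deployment in the earlier output. The function name wait_for_machinedeployment_running, the attempt count, and the polling interval are all assumptions chosen for illustration.

```shell
# Hypothetical helper; it assumes CLUSTER_NAME is already exported.
# $1: MachineDeployment name
# $2: maximum number of polls (illustrative default of 60)
# $3: seconds between polls (illustrative default of 10)
wait_for_machinedeployment_running() {
  md_name="$1"
  attempts="${2:-60}"
  interval="${3:-10}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Read the current phase, for example ScalingUp or Running.
    phase=$(kubectl --kubeconfig "${CLUSTER_NAME}.conf" get machinedeployment "$md_name" \
      -o jsonpath='{.status.phase}')
    if [ "$phase" = "Running" ]; then
      echo "MachineDeployment ${md_name} is Running"
      return 0
    fi
    echo "MachineDeployment ${md_name} is in phase '${phase}', waiting"
    sleep "$interval"
    i=$((i + 1))
  done
  echo "Timed out waiting for MachineDeployment ${md_name}" >&2
  return 1
}

# Example usage:
#   wait_for_machinedeployment_running "${CLUSTER_NAME}-md-0"
```

A polling loop is used here instead of a single blocking call so that each intermediate phase is printed while you wait.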