Prerequisites
Before you begin, make sure you have created a Bootstrap cluster.
By default, the control-plane Nodes will be created in 3 different zones. However, the default worker Nodes will reside in a single Availability Zone. You may create additional node pools in other Availability Zones with the dkp create nodepool
command.
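As a hypothetical sketch of adding a worker node pool in another Availability Zone (the pool name, zone, and replica count below are placeholders; consult the dkp create nodepool aws reference for the exact flags available in your release):

```shell
# Placeholder values for illustration only.
export CLUSTER_NAME=aws-example

# Compose the command first so it can be reviewed before running it.
NODEPOOL_CMD="dkp create nodepool aws example-pool-2 \
  --cluster-name=${CLUSTER_NAME} \
  --availability-zone=us-west-2c \
  --replicas=2"

echo "${NODEPOOL_CMD}"
# Once reviewed, run it with:
# eval "${NODEPOOL_CMD}"
```

Composing the command into a variable first is only a review aid; you can equally run dkp create nodepool directly.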
Name Your Cluster
The cluster name may only contain the following characters: lowercase letters (a-z), digits (0-9), periods (.), and hyphens (-). Cluster creation will fail if the name contains capital letters. See the Kubernetes documentation for more naming information. Give your cluster a unique name suitable for your environment.
In AWS it is critical that the name is unique, as no two clusters in the same AWS account can have the same name. You will set this environment variable during cluster creation below.
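Because creation fails on an invalid name, it can be worth checking a candidate name locally first. A minimal sketch (the valid_cluster_name helper is illustrative, not part of DKP):

```shell
# Illustrative helper, not part of DKP: returns success (0) when the name
# contains only lowercase letters, digits, '.' and '-'.
valid_cluster_name() {
  case "$1" in
    ''|*[!a-z0-9.-]*) return 1 ;;  # empty, or contains a forbidden character
    *) return 0 ;;
  esac
}

valid_cluster_name "aws-example" && echo "aws-example is a valid cluster name"
```

Note this only checks the character set; uniqueness within your AWS account still has to be verified against your existing clusters.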
⚠️ IMPORTANT: Ensure your subnets do not overlap with your host subnet, because they cannot be changed after cluster creation. If you need to change the Kubernetes subnets, you must do so at cluster creation. The default subnets used in DKP are:
CODE
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
        - 192.168.0.0/16
    services:
      cidrBlocks:
        - 10.96.0.0/12
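For example, to avoid an overlap you could generate the cluster manifests with a dry run (as described later in this procedure) and adjust the clusterNetwork block before applying it. The CIDRs below are placeholders chosen only to illustrate the shape of the edit:

```yaml
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
        - 172.16.0.0/16    # placeholder; must not overlap your host subnet
    services:
      cidrBlocks:
        - 10.128.0.0/12    # placeholder; must not overlap your host subnet
```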
For multi-tenancy, place every tenant in a different AWS account to ensure tenants are truly independent of one another and to enforce security.
Create a New AWS Kubernetes Cluster
Ensure your AWS credentials are up to date. If you are using Static Credentials, use the following command to refresh the credentials. Otherwise, proceed to step 2:
CODE
dkp update bootstrap credentials aws
Set environment variables for the cluster name you selected and for the custom AMI to use:
CODE
export CLUSTER_NAME=<aws-example>
export AWS_AMI_ID=<ami-...>
In previous DKP releases, AMI images provided by the upstream CAPA project would be used if you did not specify an AMI. However, the upstream images are not recommended for production and may not always be available. Therefore, DKP now requires you to specify an AMI when creating a cluster. To create an AMI, use Konvoy Image Builder.
There are two approaches to supplying the ID of your AMI. Either provide the ID of the AMI directly, or provide a way for DKP to discover the AMI using owner, base OS, and name format information:
Option One - Provide the ID of your AMI:
Option Two - Provide a path for your AMI with the information required for image discovery:
Where the AMI is published using your AWS Account ID: --ami-owner AWS_ACCOUNT_ID
The base OS information: --ami-base-os ubuntu-20.04
The format or string used to search for matching AMIs; ensure it references the Kubernetes version plus the base OS name: --ami-format 'example-{{.BaseOS}}-?{{.K8sVersion}}-*'
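A hypothetical sketch combining the discovery flags into one variable for review (the account ID, OS name, and format string are placeholders):

```shell
# Placeholder values for illustration only.
export AWS_ACCOUNT_ID=111122223333

# These flags would take the place of a direct --ami=<ami-id> argument.
AMI_LOOKUP_FLAGS="--ami-owner ${AWS_ACCOUNT_ID} \
  --ami-base-os ubuntu-20.04 \
  --ami-format 'example-{{.BaseOS}}-?{{.K8sVersion}}-*'"

echo "${AMI_LOOKUP_FLAGS}"
```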
(Optional) Use a registry mirror. Configure your cluster to use an existing local registry as a mirror when attempting to pull images previously pushed to your registry.
Set an environment variable with your registry address for ECR:
CODE
export REGISTRY_URL=<ecr-registry-URI>
More registry export commands, for later use with flags during cluster creation. For other registries, set these environment variables:
CODE
export REGISTRY_URL=<registry-address>:<registry-port>
export REGISTRY_USERNAME=<username>
export REGISTRY_PASSWORD=<password>
export REGISTRY_CA=<path to the cacert file on the bastion>
Definitions:
Other local registries, such as JFrog, may use the options below:
REGISTRY_CA: (optional) the path on the bastion machine to the registry CA. This value is only needed if the registry uses a self-signed certificate and the AMIs are not already configured to trust this CA.
REGISTRY_USERNAME: (optional) set to a user that has pull access to this registry.
REGISTRY_PASSWORD: (optional) only needed if REGISTRY_USERNAME is set.
To increase Docker Hub's rate limit, use your Docker Hub credentials when creating the cluster by setting the flags --registry-mirror-url=https://registry-1.docker.io --registry-mirror-username=<your-username> --registry-mirror-password=<your-password> when running dkp create cluster.
EKS with AWS ECR - Adding the mirror flags to EKS enables new clusters to also use ECR as an image mirror. If you set the --registry-mirror flag, the kubelet sends requests to the dynamic-credential-provider with a different config. You can still pull your own images from ECR directly, or use ECR as a mirror.
Generate the Kubernetes cluster objects with a dry run. The following example shows a common configuration. See the dkp create cluster aws reference for the full list of cluster creation options:
CODE
dkp create cluster aws \
  --cluster-name=${CLUSTER_NAME} \
  --ami=${AWS_AMI_ID} \
  --dry-run \
  --output=yaml \
  > ${CLUSTER_NAME}.yaml
The following sections list more flags for use in cluster creation, such as registry, HTTP, FIPS, and other flags to apply in the step above.
Flatcar
If you are using Flatcar OS, use this flag to instruct the bootstrap cluster to make some changes related to the installation paths:
If you are using a registry mirror, add these flags to your create cluster command:
CODE
--registry-mirror-url=${REGISTRY_URL} \
--registry-mirror-cacert=${REGISTRY_CA} \
--registry-mirror-username=${REGISTRY_USERNAME} \
--registry-mirror-password=${REGISTRY_PASSWORD}
See also: Use a Registry Mirror
Applying the exported image variable
As explained in Custom AMI in Cluster Creation before you bootstrapped, remember to apply the flag for your image during cluster creation.
FIPS Requirements
To create a cluster in FIPS mode, inform the controllers of the appropriate image repository and version tags of the official D2iQ FIPS builds of Kubernetes by adding these flags to the dkp create cluster command:
CODE
--kubernetes-version=v1.28.7+fips.0 \
--etcd-version=3.5.10+fips.0 \
--kubernetes-image-repository=docker.io/mesosphere \
HTTP ONLY
If your environment uses HTTP/HTTPS proxies, you must include the flags --http-proxy, --https-proxy, and --no-proxy and their related values in this command for it to be successful. More information is available in Configuring an HTTP/HTTPS Proxy.
CODE
--http-proxy <<http proxy list>>
--https-proxy <<https proxy list>>
--no-proxy <<no proxy list>>
Individual manifests using the Output Directory flag:
You can create individual files with smaller manifests, for ease of editing, using the --output-directory flag. This creates multiple files in the specified directory, which must already exist:
CODE
--output-directory=<existing-directory>
Refer to the Cluster Creation Customization Choices section for more information on how to use optional flags such as --output-directory.
Inspect or edit the cluster objects, and familiarize yourself with Cluster API before editing them, as edits can prevent the cluster from deploying successfully. See AWS Customizing CAPI Clusters.
(Optional) Modify Control Plane Audit logs - You can modify the KubeadmControlPlane cluster-api object to configure different kubelet options. See the following guide if you wish to configure your control plane beyond the existing options that are available from flags.
Create the cluster from the objects generated by the dry run. A warning appears in the console if a resource already exists, requiring you to remove the resource or update your YAML.
CODE
kubectl create -f ${CLUSTER_NAME}.yaml
NOTE: If you used the --output-directory flag in your dkp create .. --dry-run step above, create the cluster from the objects you created by specifying the directory:
CODE
kubectl create -f <existing-directory>/
The output will be similar to the following:
CODE
cluster.cluster.x-k8s.io/aws-example created
awscluster.infrastructure.cluster.x-k8s.io/aws-example created
kubeadmcontrolplane.controlplane.cluster.x-k8s.io/aws-example-control-plane created
awsmachinetemplate.infrastructure.cluster.x-k8s.io/aws-example-control-plane created
secret/aws-example-etcd-encryption-config created
machinedeployment.cluster.x-k8s.io/aws-example-md-0 created
awsmachinetemplate.infrastructure.cluster.x-k8s.io/aws-example-md-0 created
kubeadmconfigtemplate.bootstrap.cluster.x-k8s.io/aws-example-md-0 created
clusterresourceset.addons.cluster.x-k8s.io/calico-cni-installation-aws-example created
configmap/calico-cni-installation-aws-example created
configmap/tigera-operator-aws-example created
clusterresourceset.addons.cluster.x-k8s.io/aws-ebs-csi-aws-example created
configmap/aws-ebs-csi-aws-example created
clusterresourceset.addons.cluster.x-k8s.io/cluster-autoscaler-aws-example created
configmap/cluster-autoscaler-aws-example created
clusterresourceset.addons.cluster.x-k8s.io/node-feature-discovery-aws-example created
configmap/node-feature-discovery-aws-example created
clusterresourceset.addons.cluster.x-k8s.io/nvidia-feature-discovery-aws-example created
configmap/nvidia-feature-discovery-aws-example created
Wait for the cluster control-plane to be ready:
CODE
kubectl wait --for=condition=ControlPlaneReady "clusters/${CLUSTER_NAME}" --timeout=20m
Output:
CODE
cluster.cluster.x-k8s.io/aws-example condition met
The READY status becomes True after the cluster control-plane becomes ready in one of the following steps.
After the objects are created on the API server, the Cluster API controllers reconcile them. They create infrastructure and machines. As they progress, they update the Status of each object. Konvoy provides a command to describe the current status of the cluster:
CODE
dkp describe cluster -c ${CLUSTER_NAME}
OUTPUT:
CODE
NAME                                                             READY  SEVERITY  REASON  SINCE  MESSAGE
Cluster/aws-example                                              True                     60s
├─ClusterInfrastructure - AWSCluster/aws-example                 True                     5m23s
├─ControlPlane - KubeadmControlPlane/aws-example-control-plane   True                     60s
│ ├─Machine/aws-example-control-plane-55jh4                      True                     4m59s
│ ├─Machine/aws-example-control-plane-6sn97                      True                     2m49s
│ └─Machine/aws-example-control-plane-nx9v5                      True                     66s
└─Workers
  └─MachineDeployment/aws-example-md-0                           True                     117s
    ├─Machine/aws-example-md-0-cb9c9bbf7-hcl8z                   True                     3m1s
    ├─Machine/aws-example-md-0-cb9c9bbf7-rtdqw                   True                     3m2s
    ├─Machine/aws-example-md-0-cb9c9bbf7-t894m                   True                     3m1s
    └─Machine/aws-example-md-0-cb9c9bbf7-td29r                   True
As they progress, the controllers also create Events. List the Events using this command:
CODE
kubectl get events | grep ${CLUSTER_NAME}
For brevity, the example uses grep. It is also possible to use separate commands to get Events for specific objects, for example kubectl get events --field-selector involvedObject.kind="AWSCluster" and kubectl get events --field-selector involvedObject.kind="AWSMachine".
The output will be similar to the following:
CODE
7m26s Normal SuccessfulSetNodeRef machine/aws-example-control-plane-2wb9q ip-10-0-182-218.us-west-2.compute.internal
11m Normal SuccessfulCreate awsmachine/aws-example-control-plane-vcjkr Created new control-plane instance with id "i-0dde024e80ae3de7a"
11m Normal SuccessfulAttachControlPlaneELB awsmachine/aws-example-control-plane-vcjkr Control plane instance "i-0dde024e80ae3de7a" is registered with load balancer
7m25s Normal SuccessfulDeleteEncryptedBootstrapDataSecrets awsmachine/aws-example-control-plane-vcjkr AWS Secret entries containing userdata deleted
7m6s Normal FailedDescribeInstances awsmachinepool/aws-example-mp-0 No Auto Scaling Groups with aws-example-mp-0 found
7m3s Warning FailedLaunchTemplateReconcile awsmachinepool/aws-example-mp-0 Failed to reconcile launch template: ValidationError: AutoScalingGroup name not found - AutoScalingGroup aws-example-mp-0 not found
74s Warning FailedLaunchTemplateReconcile awsmachinepool/aws-example-mp-0 (combined from similar events): Failed to reconcile launch template: ValidationError: AutoScalingGroup name not found - AutoScalingGroup aws-example-mp-0 not found
16m Normal SuccessfulCreateVPC awscluster/aws-example Created new managed VPC "vpc-032fff0fe06a85035"
16m Normal SuccessfulSetVPCAttributes awscluster/aws-example Set managed VPC attributes for "vpc-032fff0fe06a85035"
16m Normal SuccessfulCreateSubnet awscluster/aws-example Created new managed Subnet "subnet-0677a4fbd7d170dfe"
16m Normal SuccessfulModifySubnetAttributes awscluster/aws-example Modified managed Subnet "subnet-0677a4fbd7d170dfe" attributes
16m Normal SuccessfulCreateSubnet awscluster/aws-example Created new managed Subnet "subnet-04fc9deb4fa9f8333"
16m Normal SuccessfulCreateInternetGateway awscluster/aws-example Created new managed Internet Gateway "igw-07cd7ad3e6c7c1ca7"
16m Normal SuccessfulAttachInternetGateway awscluster/aws-example Internet Gateway "igw-07cd7ad3e6c7c1ca7" attached to VPC "vpc-032fff0fe06a85035"
16m Normal SuccessfulCreateNATGateway awscluster/aws-example Created new NAT Gateway "nat-0a0cf17d29150cf9a"
13m Normal SuccessfulCreateRouteTable awscluster/aws-example Created managed RouteTable "rtb-09f4e2eecb7462d22"
13m Normal SuccessfulCreateRoute awscluster/aws-example Created route {
13m Normal SuccessfulAssociateRouteTable awscluster/aws-example Associated managed RouteTable "rtb-09f4e2eecb7462d22" with subnet "subnet-0677a4fbd7d170dfe"
13m Normal SuccessfulCreateRouteTable awscluster/aws-example Created managed RouteTable "rtb-0007b98b36f37d1e4"
13m Normal SuccessfulCreateRoute awscluster/aws-example Created route {
13m Normal SuccessfulAssociateRouteTable awscluster/aws-example Associated managed RouteTable "rtb-0007b98b36f37d1e4" with subnet "subnet-04fc9deb4fa9f8333"
13m Normal SuccessfulCreateRouteTable awscluster/aws-example Created managed RouteTable "rtb-079a1d7d3667c2525"
13m Normal SuccessfulCreateRoute awscluster/aws-example Created route {
13m Normal SuccessfulAssociateRouteTable awscluster/aws-example Associated managed RouteTable "rtb-079a1d7d3667c2525" with subnet "subnet-0a266c15dd211ce6c"
13m Normal SuccessfulCreateRouteTable awscluster/aws-example Created managed RouteTable "rtb-0e5ebc8ec29848a17"
13m Normal SuccessfulCreateRoute awscluster/aws-example Created route {
13m Normal SuccessfulAssociateRouteTable awscluster/aws-example Associated managed RouteTable "rtb-05a05080bbb3cead9" with subnet "subnet-0725068cca16ad9f9"
13m Normal SuccessfulCreateSecurityGroup awscluster/aws-example Created managed SecurityGroup "sg-0379bf77211472854" for Role "bastion"
13m Normal SuccessfulCreateSecurityGroup awscluster/aws-example Created managed SecurityGroup "sg-0a4e0635f68a2f57d" for Role "apiserver-lb"
13m Normal SuccessfulAuthorizeSecurityGroupIngressRules awscluster/aws-example Authorized security group ingress rules [protocol=tcp/range=[5473-5473]/description=typha (calico) protocol=tcp/range=[179-179]/description=bgp (calico) protocol=4/range=[-1-65535]/description=IP-in-IP (calico) protocol=tcp/range=[22-22]/description=SSH protocol=tcp/range=[6443-6443]/description=Kubernetes API protocol=tcp/range=[2379-2379]/description=etcd protocol=tcp/range=[2380-2380]/description=etcd peer] for SecurityGroup "sg-00db2e847c0b49d6e"
13m Normal SuccessfulAuthorizeSecurityGroupIngressRules awscluster/aws-example Authorized security group ingress rules [protocol=tcp/range=[5473-5473]/description=typha (calico) protocol=tcp/range=[179-179]/description=bgp (calico) protocol=4/range=[-1-65535]/description=IP-in-IP (calico) protocol=tcp/range=[22-22]/description=SSH protocol=tcp/range=[30000-32767]/description=Node Port Services protocol=tcp/range=[10250-10250]/description=Kubelet API] for SecurityGroup "sg-01fe3426404f94708"
DKP uses AWS CSI as the default storage provider. You can use a Kubernetes CSI compatible storage solution that is suitable for production. See the Kubernetes documentation called Changing the Default Storage Class for more information.
If you are not using the default, you cannot deploy an alternative provider until after dkp create cluster finishes. However, you must choose the provider before installing Kommander.
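As a sketch of what the linked Kubernetes procedure involves: the default is controlled by an annotation on the StorageClass object. The class names below are placeholders for your actual classes:

```shell
# Placeholder StorageClass names; substitute your own.
OLD_DEFAULT=ebs-sc
NEW_DEFAULT=my-storage-class
ANNOTATION=storageclass.kubernetes.io/is-default-class

# Compose the commands for review: mark the old class as non-default,
# then mark the new one as default.
UNSET_CMD="kubectl patch storageclass ${OLD_DEFAULT} -p '{\"metadata\":{\"annotations\":{\"${ANNOTATION}\":\"false\"}}}'"
SET_CMD="kubectl patch storageclass ${NEW_DEFAULT} -p '{\"metadata\":{\"annotations\":{\"${ANNOTATION}\":\"true\"}}}'"

echo "${UNSET_CMD}"
echo "${SET_CMD}"
# Once reviewed, run with: eval "${UNSET_CMD}" && eval "${SET_CMD}"
```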
Known Limitations
Be aware of these limitations in the current release of Konvoy.
The Konvoy version used to create a bootstrap cluster must match the Konvoy version used to create a workload cluster.
Konvoy supports deploying one workload cluster.
Konvoy generates a set of objects for one Node Pool.
Konvoy does not validate edits to cluster objects.
Creating DKP Non-air-gapped Clusters from the UI
Next Step
Make the AWS Non-air-gapped Cluster Self-Managed