
AWS Create a New Customized Cluster

Prerequisites

Before you begin, make sure you have created a Bootstrap cluster.

By default, the control-plane Nodes will be created in 3 different zones. However, the default worker Nodes will reside in a single Availability Zone. You may create additional node pools in other Availability Zones with the dkp create nodepool command.
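For example, a node pool in another Availability Zone could be added after cluster creation with a command along these lines. This is a hedged sketch: the node pool name and zone are illustrative, and the exact flags may vary by DKP version, so check the dkp create nodepool reference.

CODE
# Add a worker node pool in a second Availability Zone (values illustrative)
dkp create nodepool aws \
--cluster-name=${CLUSTER_NAME} \
--availability-zone=us-west-2b \
example-nodepool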

Name Your Cluster

The cluster name may only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if the name has capital letters. See Kubernetes for more naming information. Give your cluster a unique name suitable for your environment.

In AWS it is critical that the name is unique, as no two clusters in the same AWS account can have the same name. You will set this environment variable during cluster creation below.

⚠️ IMPORTANT: Ensure your subnets do not overlap with your host subnet, because they cannot be changed after cluster creation. If you need to change the Kubernetes subnets, you must do so at cluster creation. The default subnets used in DKP are:

CODE
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
      - 192.168.0.0/16
    services:
      cidrBlocks:
      - 10.96.0.0/12
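Before creating the cluster, you may want to confirm that your existing VPC or host network does not overlap with these defaults. Assuming the AWS CLI is configured, one way to list your current VPC CIDRs is the following minimal sketch:

CODE
# List the CIDR blocks of existing VPCs in the current region so you can
# verify they do not overlap with 192.168.0.0/16 or 10.96.0.0/12
aws ec2 describe-vpcs --query 'Vpcs[].CidrBlock' --output table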

For multi-tenancy, every tenant should be in a different AWS account to ensure they are truly independent of other tenants in order to enforce security.

Create a New AWS Kubernetes Cluster

  1. Ensure your AWS credentials are up to date. If you are using Static Credentials, use the following command to refresh the credentials. Otherwise, proceed to step 2:

    CODE
    dkp update bootstrap credentials aws
  2. Set the environment variables for the cluster name you selected and for your custom AMI ID:

    CODE
    export CLUSTER_NAME=<aws-example>
    export AWS_AMI_ID=<ami-...>

In previous DKP releases, AMI images provided by the upstream CAPA project were used if you did not specify an AMI. However, the upstream images are not recommended for production and may not always be available. Therefore, DKP now requires you to specify an AMI when creating a cluster. To create an AMI, use Konvoy Image Builder.
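If you built your AMI with Konvoy Image Builder, one way to look up its ID is to query your account's images with the AWS CLI. This is a hedged sketch; the name filter below is illustrative and depends on how your image was named:

CODE
# Return the most recently created AMI owned by this account that matches the name filter
aws ec2 describe-images \
--owners self \
--filters "Name=name,Values=konvoy-*" \
--query 'sort_by(Images, &CreationDate)[-1].{ID:ImageId,Name:Name}' \
--output table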

There are two approaches to supplying the ID of your AMI: either provide the AMI ID directly, or provide a way for DKP to discover the AMI using location, format, and OS information:

  • Option One - Provide the ID of your AMI:

    • Use the example command below, keeping the existing flag that provides the AMI ID: --ami AMI_ID

  • Option Two - Provide the information required for DKP to discover the AMI:

    1. Where the AMI is published, using your AWS Account ID: --ami-owner AWS_ACCOUNT_ID

    2. The base OS information: --ami-base-os ubuntu-20.04

    3. The format or string used to search for matching AMIs; ensure it references the Kubernetes version plus the base OS name: --ami-format 'example-{{.BaseOS}}-?{{.K8sVersion}}-*'

      See the Custom AMI in Cluster Creation topic for more information.
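      Putting the discovery flags together, a dry-run invocation might look like the following (the account ID and format string are illustrative):

      CODE
      dkp create cluster aws \
      --cluster-name=${CLUSTER_NAME} \
      --ami-owner <AWS_ACCOUNT_ID> \
      --ami-base-os ubuntu-20.04 \
      --ami-format 'example-{{.BaseOS}}-?{{.K8sVersion}}-*' \
      --dry-run \
      --output=yaml \
      > ${CLUSTER_NAME}.yaml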

  • (Optional) Use a registry mirror. Configure your cluster to use an existing local registry as a mirror when attempting to pull images previously pushed to your registry.
    Set an environment variable with your registry address for ECR:

    CODE
    export REGISTRY_URL=<ecr-registry-URI>
More registry export commands, for later use with flags during cluster creation:

For other registries, set these additional environment variables:

CODE
export REGISTRY_URL=<registry-address>:<registry-port>
export REGISTRY_USERNAME=<username>
export REGISTRY_PASSWORD=<password>
export REGISTRY_CA=<path to the cacert file on the bastion>

Definitions:

  • REGISTRY_URL: the address of an existing local registry accessible in the VPC; the new cluster nodes will be configured to use it as a mirror registry when pulling images.

Other local registries may use the options below:

  • REGISTRY_CA: (optional) the path on the bastion machine to the registry CA. This value is only needed if the registry (for example, JFrog) uses a self-signed certificate and the AMIs are not already configured to trust this CA.

  • REGISTRY_USERNAME: (optional) set to a user that has pull access to this registry.

  • REGISTRY_PASSWORD: (optional) the password for that user; only needed if REGISTRY_USERNAME is set.

  • To increase Docker Hub's rate limit, use your Docker Hub credentials when creating the cluster by setting the flags --registry-mirror-url=https://registry-1.docker.io --registry-mirror-username=<your-username> --registry-mirror-password=<your-password> when running dkp create cluster.

EKS with AWS ECR - Adding the mirror flags to an EKS cluster enables new clusters to also use ECR as an image mirror. If you set the --registry-mirror flag, the kubelet sends requests to the dynamic-credential-provider with a different configuration. You can still pull your own images from ECR directly or use ECR as a mirror.
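If you plan to use ECR as the mirror, one way to derive the registry URI for your account is shown below (the region is illustrative, and the AWS CLI is assumed to be configured):

CODE
export AWS_REGION=us-west-2
# ECR registry URIs follow the pattern <account-id>.dkr.ecr.<region>.amazonaws.com
export REGISTRY_URL="$(aws sts get-caller-identity --query Account --output text).dkr.ecr.${AWS_REGION}.amazonaws.com"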

  3. Generate the Kubernetes cluster objects with a dry run. The following example shows a common configuration. See the dkp create cluster aws reference for the full list of cluster creation options:

    CODE
    dkp create cluster aws \
    --cluster-name=${CLUSTER_NAME} \
    --ami=${AWS_AMI_ID} \
    --dry-run \
    --output=yaml \
    > ${CLUSTER_NAME}.yaml
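    Before applying anything, you can sanity-check the generated manifest, for example by listing the object kinds it contains (a minimal illustration):

    CODE
    # Count each Kubernetes object kind in the generated multi-document manifest
    grep -E '^kind:' ${CLUSTER_NAME}.yaml | sort | uniq -c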

Expand the sections below for additional flags to apply in the step above, such as registry mirror, HTTP proxy, FIPS, and other flags.

Flatcar

If you are using Flatcar OS, use this flag to instruct the bootstrap cluster to make changes related to the installation paths:

CODE
--os-hint flatcar

Registry Mirror

If you are using a registry mirror, add these flags to your dkp create cluster command:

CODE
  --registry-mirror-url=${REGISTRY_URL} \
  --registry-mirror-cacert=${REGISTRY_CA} \
  --registry-mirror-username=${REGISTRY_USERNAME} \
  --registry-mirror-password=${REGISTRY_PASSWORD}

See also: Use a Registry Mirror

Applying the exported image variable

Don't forget: as explained in the Custom AMI in Cluster Creation topic before you bootstrapped, you need to apply the flag for your image during cluster creation.

CODE
--ami=${AWS_AMI_ID}

FIPS Requirements

To create a cluster in FIPS mode, inform the controllers of the appropriate image repository and version tags of the official D2iQ FIPS builds of Kubernetes by adding these flags to the dkp create cluster command:

CODE
--kubernetes-version=v1.27.6+fips.0 \
--etcd-version=3.5.7+fips.0 \
--kubernetes-image-repository=docker.io/mesosphere
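For example, a FIPS dry run that combines these flags with the base command from the step above might look like this:

CODE
dkp create cluster aws \
--cluster-name=${CLUSTER_NAME} \
--ami=${AWS_AMI_ID} \
--kubernetes-version=v1.27.6+fips.0 \
--etcd-version=3.5.7+fips.0 \
--kubernetes-image-repository=docker.io/mesosphere \
--dry-run \
--output=yaml \
> ${CLUSTER_NAME}.yaml
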
HTTP/HTTPS Proxy

If your environment uses HTTP/HTTPS proxies, you must include the flags --http-proxy, --https-proxy, and --no-proxy and their related values in this command for it to be successful. More information is available in Configuring an HTTP/HTTPS Proxy.

CODE
--http-proxy <<http proxy list>>
--https-proxy <<https proxy list>>
--no-proxy <<no proxy list>>

  • To configure the Control Plane and Worker nodes to use an HTTP proxy:

    CODE
    export CONTROL_PLANE_HTTP_PROXY=http://example.org:8080
    export CONTROL_PLANE_HTTPS_PROXY=http://example.org:8080
    export CONTROL_PLANE_NO_PROXY="example.org,example.com,example.net,localhost,127.0.0.1,10.96.0.0/12,192.168.0.0/16,kubernetes,kubernetes.default,kubernetes.default.svc,kubernetes.default.svc.cluster,kubernetes.default.svc.cluster.local,.svc,.svc.cluster,.svc.cluster.local,169.254.169.254,.elb.amazonaws.com"
    
    export WORKER_HTTP_PROXY=http://example.org:8080
    export WORKER_HTTPS_PROXY=http://example.org:8080
    export WORKER_NO_PROXY="example.org,example.com,example.net,localhost,127.0.0.1,10.96.0.0/12,192.168.0.0/16,kubernetes,kubernetes.default,kubernetes.default.svc,kubernetes.default.svc.cluster,kubernetes.default.svc.cluster.local,.svc,.svc.cluster,.svc.cluster.local,169.254.169.254,.elb.amazonaws.com"
  • Replace:

    • example.org,example.com,example.net with your internal addresses

    • localhost and 127.0.0.1 addresses should not use the proxy

    • 10.96.0.0/12 is the default Kubernetes service subnet

    • 192.168.0.0/16 is the default Kubernetes pod subnet

    • kubernetes,kubernetes.default,kubernetes.default.svc,kubernetes.default.svc.cluster,kubernetes.default.svc.cluster.local is the internal Kubernetes kube-apiserver service

    • .svc,.svc.cluster,.svc.cluster.local are the internal Kubernetes services

    • 169.254.169.254 is the AWS metadata server

    • .elb.amazonaws.com is for the worker nodes to allow them to communicate directly to the kube-apiserver ELB

  • Use the HTTP flags in the dkp create cluster command:

    CODE
    --control-plane-http-proxy=${CONTROL_PLANE_HTTP_PROXY} \
    --control-plane-https-proxy=${CONTROL_PLANE_HTTPS_PROXY} \
    --control-plane-no-proxy=${CONTROL_PLANE_NO_PROXY} \
    --worker-http-proxy=${WORKER_HTTP_PROXY} \
    --worker-https-proxy=${WORKER_HTTPS_PROXY} \
    --worker-no-proxy=${WORKER_NO_PROXY}

Individual manifests using the --output-directory flag:

You can use the --output-directory flag to split the output into smaller, individual manifest files for easier editing. This creates multiple files in the specified directory, which must already exist:

CODE
--output-directory=<existing-directory>

Refer to the Cluster Creation Customization Choices section for more information on how to use optional flags such as the --output-directory flag.
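For example, a dry run that writes individual manifests could look like the following sketch (the directory name is illustrative; see the section referenced above for exact usage):

CODE
# The target directory must exist before running the dry run
mkdir -p ${CLUSTER_NAME}-manifests

dkp create cluster aws \
--cluster-name=${CLUSTER_NAME} \
--ami=${AWS_AMI_ID} \
--dry-run \
--output-directory=${CLUSTER_NAME}-manifests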

  4. Inspect or edit the cluster objects. Familiarize yourself with Cluster API before editing them, as edits can prevent the cluster from deploying successfully. See AWS Customizing CAPI Clusters.

  5. (Optional) Modify the control plane audit logs. You can modify the KubeadmControlPlane Cluster API object to configure different kubelet options. See the following guide if you wish to configure your control plane beyond the options available from flags.

  6. Create the cluster from the objects generated with the dry run. A warning appears in the console if a resource already exists; you must then remove the resource or update your YAML.

    CODE
    kubectl create -f ${CLUSTER_NAME}.yaml

    NOTE: If you used the --output-directory flag in your dkp create .. --dry-run step above, create the cluster from the objects you created by specifying the directory:

    CODE
    kubectl create -f <existing-directory>/

The output will be similar to the following:
CODE
cluster.cluster.x-k8s.io/aws-example created
awscluster.infrastructure.cluster.x-k8s.io/aws-example created
kubeadmcontrolplane.controlplane.cluster.x-k8s.io/aws-example-control-plane created
awsmachinetemplate.infrastructure.cluster.x-k8s.io/aws-example-control-plane created
secret/aws-example-etcd-encryption-config created
machinedeployment.cluster.x-k8s.io/aws-example-md-0 created
awsmachinetemplate.infrastructure.cluster.x-k8s.io/aws-example-md-0 created
kubeadmconfigtemplate.bootstrap.cluster.x-k8s.io/aws-example-md-0 created
clusterresourceset.addons.cluster.x-k8s.io/calico-cni-installation-aws-example created
configmap/calico-cni-installation-aws-example created
configmap/tigera-operator-aws-example created
clusterresourceset.addons.cluster.x-k8s.io/aws-ebs-csi-aws-example created
configmap/aws-ebs-csi-aws-example created
clusterresourceset.addons.cluster.x-k8s.io/cluster-autoscaler-aws-example created
configmap/cluster-autoscaler-aws-example created
clusterresourceset.addons.cluster.x-k8s.io/node-feature-discovery-aws-example created
configmap/node-feature-discovery-aws-example created
clusterresourceset.addons.cluster.x-k8s.io/nvidia-feature-discovery-aws-example created
configmap/nvidia-feature-discovery-aws-example created

  7. Wait for the cluster control plane to be ready:

    CODE
    kubectl wait --for=condition=ControlPlaneReady "clusters/${CLUSTER_NAME}" --timeout=20m

    Output:

    CODE
    cluster.cluster.x-k8s.io/aws-example condition met

    The READY status becomes True after the cluster control plane is ready, as shown in the following steps.
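    If you want to inspect the condition directly rather than waiting, you can query it with a JSONPath expression (a minimal example):

    CODE
    # Print the status of the ControlPlaneReady condition on the Cluster object
    kubectl get cluster ${CLUSTER_NAME} \
      -o jsonpath='{.status.conditions[?(@.type=="ControlPlaneReady")].status}'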

  8. After the objects are created on the API server, the Cluster API controllers reconcile them. They create infrastructure and machines. As they progress, they update the Status of each object. Konvoy provides a command to describe the current status of the cluster:

    CODE
    dkp describe cluster -c ${CLUSTER_NAME}

Output:
CODE
NAME                                                              READY  SEVERITY  REASON  SINCE  MESSAGE
Cluster/aws-example                                             True                     60s
├─ClusterInfrastructure - AWSCluster/aws-example                True                     5m23s
├─ControlPlane - KubeadmControlPlane/aws-example-control-plane  True                     60s
│ ├─Machine/aws-example-control-plane-55jh4                     True                     4m59s
│ ├─Machine/aws-example-control-plane-6sn97                     True                     2m49s
│ └─Machine/aws-example-control-plane-nx9v5                     True                     66s
└─Workers
  └─MachineDeployment/aws-example-md-0                          True                     117s
	├─Machine/aws-example-md-0-cb9c9bbf7-hcl8z                  True                     3m1s
	├─Machine/aws-example-md-0-cb9c9bbf7-rtdqw                  True                     3m2s
	├─Machine/aws-example-md-0-cb9c9bbf7-t894m                  True                     3m1s
	└─Machine/aws-example-md-0-cb9c9bbf7-td29r                  True

  9. As they progress, the controllers also create Events. List the Events using this command:

    CODE
    kubectl get events | grep ${CLUSTER_NAME}

    For brevity, the example uses grep. It is also possible to use separate commands to get Events for specific objects. For example, kubectl get events --field-selector involvedObject.kind="AWSCluster" and kubectl get events --field-selector involvedObject.kind="AWSMachine".

The output will be similar to the following:
CODE
7m26s       Normal    SuccessfulSetNodeRef                            machine/aws-example-control-plane-2wb9q      ip-10-0-182-218.us-west-2.compute.internal
11m         Normal    SuccessfulCreate                                awsmachine/aws-example-control-plane-vcjkr   Created new control-plane instance with id "i-0dde024e80ae3de7a"
11m         Normal    SuccessfulAttachControlPlaneELB                 awsmachine/aws-example-control-plane-vcjkr   Control plane instance "i-0dde024e80ae3de7a" is registered with load balancer
7m25s       Normal    SuccessfulDeleteEncryptedBootstrapDataSecrets   awsmachine/aws-example-control-plane-vcjkr   AWS Secret entries containing userdata deleted
7m6s        Normal    FailedDescribeInstances                         awsmachinepool/aws-example-mp-0              No Auto Scaling Groups with aws-example-mp-0 found
7m3s        Warning   FailedLaunchTemplateReconcile                   awsmachinepool/aws-example-mp-0              Failed to reconcile launch template: ValidationError: AutoScalingGroup name not found - AutoScalingGroup aws-example-mp-0 not found
74s         Warning   FailedLaunchTemplateReconcile                   awsmachinepool/aws-example-mp-0              (combined from similar events): Failed to reconcile launch template: ValidationError: AutoScalingGroup name not found - AutoScalingGroup aws-example-mp-0 not found
16m         Normal    SuccessfulCreateVPC                             awscluster/aws-example                       Created new managed VPC "vpc-032fff0fe06a85035"
16m         Normal    SuccessfulSetVPCAttributes                      awscluster/aws-example                       Set managed VPC attributes for "vpc-032fff0fe06a85035"
16m         Normal    SuccessfulCreateSubnet                          awscluster/aws-example                       Created new managed Subnet "subnet-0677a4fbd7d170dfe"
16m         Normal    SuccessfulModifySubnetAttributes                awscluster/aws-example                       Modified managed Subnet "subnet-0677a4fbd7d170dfe" attributes
16m         Normal    SuccessfulCreateSubnet                          awscluster/aws-example                       Created new managed Subnet "subnet-04fc9deb4fa9f8333"
16m         Normal    SuccessfulCreateInternetGateway                 awscluster/aws-example                       Created new managed Internet Gateway "igw-07cd7ad3e6c7c1ca7"
16m         Normal    SuccessfulAttachInternetGateway                 awscluster/aws-example                       Internet Gateway "igw-07cd7ad3e6c7c1ca7" attached to VPC "vpc-032fff0fe06a85035"
16m         Normal    SuccessfulCreateNATGateway                      awscluster/aws-example                       Created new NAT Gateway "nat-0a0cf17d29150cf9a"
13m         Normal    SuccessfulCreateRouteTable                      awscluster/aws-example                       Created managed RouteTable "rtb-09f4e2eecb7462d22"
13m         Normal    SuccessfulCreateRoute                           awscluster/aws-example                       Created route {
13m         Normal    SuccessfulAssociateRouteTable                   awscluster/aws-example                       Associated managed RouteTable "rtb-09f4e2eecb7462d22" with subnet "subnet-0677a4fbd7d170dfe"
13m         Normal    SuccessfulCreateRouteTable                      awscluster/aws-example                       Created managed RouteTable "rtb-0007b98b36f37d1e4"
13m         Normal    SuccessfulCreateRoute                           awscluster/aws-example                       Created route {
13m         Normal    SuccessfulAssociateRouteTable                   awscluster/aws-example                       Associated managed RouteTable "rtb-0007b98b36f37d1e4" with subnet "subnet-04fc9deb4fa9f8333"
13m         Normal    SuccessfulCreateRouteTable                      awscluster/aws-example                       Created managed RouteTable "rtb-079a1d7d3667c2525"
13m         Normal    SuccessfulCreateRoute                           awscluster/aws-example                       Created route {
13m         Normal    SuccessfulAssociateRouteTable                   awscluster/aws-example                       Associated managed RouteTable "rtb-079a1d7d3667c2525" with subnet "subnet-0a266c15dd211ce6c"
13m         Normal    SuccessfulCreateRouteTable                      awscluster/aws-example                       Created managed RouteTable "rtb-0e5ebc8ec29848a17"
13m         Normal    SuccessfulCreateRoute                           awscluster/aws-example                       Created route {
13m         Normal    SuccessfulAssociateRouteTable                   awscluster/aws-example                       Associated managed RouteTable "rtb-05a05080bbb3cead9" with subnet "subnet-0725068cca16ad9f9"
13m         Normal    SuccessfulCreateSecurityGroup                   awscluster/aws-example                       Created managed SecurityGroup "sg-0379bf77211472854" for Role "bastion"
13m         Normal    SuccessfulCreateSecurityGroup                   awscluster/aws-example                       Created managed SecurityGroup "sg-0a4e0635f68a2f57d" for Role "apiserver-lb"
13m         Normal    SuccessfulAuthorizeSecurityGroupIngressRules    awscluster/aws-example                       Authorized security group ingress rules [protocol=tcp/range=[5473-5473]/description=typha (calico) protocol=tcp/range=[179-179]/description=bgp (calico) protocol=4/range=[-1-65535]/description=IP-in-IP (calico) protocol=tcp/range=[22-22]/description=SSH protocol=tcp/range=[6443-6443]/description=Kubernetes API protocol=tcp/range=[2379-2379]/description=etcd protocol=tcp/range=[2380-2380]/description=etcd peer] for SecurityGroup "sg-00db2e847c0b49d6e"
13m         Normal    SuccessfulAuthorizeSecurityGroupIngressRules    awscluster/aws-example                       Authorized security group ingress rules [protocol=tcp/range=[5473-5473]/description=typha (calico) protocol=tcp/range=[179-179]/description=bgp (calico) protocol=4/range=[-1-65535]/description=IP-in-IP (calico) protocol=tcp/range=[22-22]/description=SSH protocol=tcp/range=[30000-32767]/description=Node Port Services protocol=tcp/range=[10250-10250]/description=Kubelet API] for SecurityGroup "sg-01fe3426404f94708"

DKP uses the AWS EBS CSI driver as the default storage provider. You can use any Kubernetes CSI-compatible storage solution that is suitable for production. See the Kubernetes documentation topic Changing the Default Storage Class for more information.

If you are not using the default, you cannot deploy an alternate provider until after dkp create cluster has finished. However, you must decide on the provider before installing Kommander.
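If you do choose an alternate provider, the Kubernetes procedure for changing the default StorageClass is to flip the is-default-class annotation, roughly as follows. The storage class names are placeholders, and the commands run against the new cluster once it is accessible:

CODE
# Remove the default annotation from the current default storage class
kubectl patch storageclass <current-default-storage-class> \
  -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "false"}}}'

# Mark your alternate storage class as the default
kubectl patch storageclass <your-storage-class> \
  -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'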

Known Limitations

Be aware of these limitations in the current release of Konvoy.

  • The Konvoy version used to create a bootstrap cluster must match the Konvoy version used to create a workload cluster.

  • Konvoy supports deploying one workload cluster.

  • Konvoy generates a set of objects for one Node Pool.

  • Konvoy does not validate edits to cluster objects.

Related Topics

Creating DKP Non-air-gapped Clusters from the UI

Next Step

Make the AWS Non-air-gapped Cluster Self-Managed
