
AWS Air-gapped GPU: Create the Management Cluster

To create a cluster in an AWS air-gapped environment using GPUs, complete the following steps:

To raise your Docker Hub rate limit, supply your Docker Hub credentials when creating the cluster by setting the flags --registry-mirror-url=https://registry-1.docker.io --registry-mirror-username=<username> --registry-mirror-password=<password> on the dkp create cluster command.

  1. Give your cluster a unique name suitable for your environment.

    In AWS it is critical that the name is unique, as no two clusters in the same AWS account can have the same name.

  2. Set the environment variable to the name you assigned this cluster:

    CODE
    export CLUSTER_NAME=<aws-example>

    NOTE: The cluster name may only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if the name has capital letters. See the Kubernetes documentation for more naming information.
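The naming rule in the note above can be checked before you export the variable. The following is a minimal sketch, not part of the DKP tooling: validate_cluster_name is a hypothetical helper that tests only the character set stated in the note (a-z, 0-9, ., and -), and Kubernetes may impose further naming constraints that it does not cover.

```shell
# Hypothetical helper (not a DKP command): returns 0 when the name is
# non-empty and contains only a-z, 0-9, '.', and '-'.
validate_cluster_name() {
  case "$1" in
    "" | *[!abcdefghijklmnopqrstuvwxyz0123456789.-]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Check the name currently in CLUSTER_NAME, falling back to the
# placeholder "aws-example" when the variable is unset.
name="${CLUSTER_NAME:-aws-example}"
if validate_cluster_name "${name}"; then
  echo "cluster name OK: ${name}"
else
  echo "invalid cluster name: ${name}" >&2
fi
```

The allowed letters are listed explicitly rather than written as the range a-z, because in some locales that range can also match capital letters.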

  3. Export variables for the existing infrastructure details:

    CODE
    export AWS_VPC_ID=<vpc-...>
    export AWS_SUBNET_IDS=<subnet-...,subnet-...,subnet-...>
    export AWS_ADDITIONAL_SECURITY_GROUPS=<sg-...>
    export AWS_AMI_ID=<ami-...>
    • AWS_VPC_ID: the VPC ID where the cluster will be created. The VPC requires the ec2, elasticloadbalancing, secretsmanager and autoscaling VPC endpoints to be already present.

    • AWS_SUBNET_IDS: a comma-separated list of one or more private Subnet IDs with each one in a different Availability Zone. The cluster control-plane and worker nodes will automatically be spread across these Subnets.

    • AWS_ADDITIONAL_SECURITY_GROUPS: a comma-separated list of one or more Security Group IDs to use in addition to the ones automatically created by CAPA.

    • AWS_AMI_ID: the AMI ID to use for control-plane and worker nodes. The AMI must be created by the konvoy-image-builder.
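Because the VPC must already contain the ec2, elasticloadbalancing, secretsmanager, and autoscaling endpoints, it can help to confirm them before creating the cluster. The sketch below is an assumption-laden illustration: missing_vpc_endpoints is a hypothetical helper, the region us-west-2 in the sample service names is only an example, and it assumes endpoint service names end in .<service> (for example com.amazonaws.us-west-2.ec2).

```shell
# Hypothetical helper: given the service names of the endpoints present in a
# VPC, print each required service that has no matching endpoint.
missing_vpc_endpoints() {
  for required in ec2 elasticloadbalancing secretsmanager autoscaling; do
    found=no
    for service in "$@"; do
      case "$service" in
        # Match only names that end in ".<required>", so that
        # "...ec2messages" is not mistaken for the ec2 endpoint.
        *."$required") found=yes ;;
      esac
    done
    [ "$found" = yes ] || echo "$required"
  done
}

# Example with illustrative service names; two endpoints are missing here.
missing_vpc_endpoints \
  com.amazonaws.us-west-2.ec2 \
  com.amazonaws.us-west-2.secretsmanager

# With the AWS CLI available, the real list could be obtained like this:
#   services=$(aws ec2 describe-vpc-endpoints \
#     --filters "Name=vpc-id,Values=${AWS_VPC_ID}" \
#     --query 'VpcEndpoints[].ServiceName' --output text)
#   missing_vpc_endpoints $services
```

For the sample input, the function prints elasticloadbalancing and autoscaling, one per line. Empty output means all four required endpoints were found.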

    ⚠️ IMPORTANT: You must tag the subnets as described below so that Kubernetes can create ELBs for services of type LoadBalancer in those subnets. If the subnets are not tagged, no ELB is created in them and the following error displays: Error syncing load balancer, failed to ensure load balancer; could not find any suitable subnets for creating the ELB.

    Set the tags as follows, where <CLUSTER_NAME> corresponds to the name set in the CLUSTER_NAME environment variable:

    CODE
    kubernetes.io/cluster = <CLUSTER_NAME>
    kubernetes.io/cluster/<CLUSTER_NAME> = owned
    kubernetes.io/role/internal-elb = 1
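One way to apply these tags is with the AWS CLI command aws ec2 create-tags. The sketch below only prints the commands so that you can review them first. print_subnet_tag_commands is a hypothetical helper, and the subnet IDs subnet-aaa and subnet-bbb are placeholders, not real resources.

```shell
# Hypothetical helper: print one `aws ec2 create-tags` command per subnet in a
# comma-separated list. Nothing is executed; review the output before use.
print_subnet_tag_commands() {
  cluster_name="$1"
  subnet_ids="$2"
  old_ifs="$IFS"
  IFS=','   # split the subnet list on commas
  for subnet in $subnet_ids; do
    echo "aws ec2 create-tags --resources ${subnet} --tags" \
      "Key=kubernetes.io/cluster,Value=${cluster_name}" \
      "Key=kubernetes.io/cluster/${cluster_name},Value=owned" \
      "Key=kubernetes.io/role/internal-elb,Value=1"
  done
  IFS="$old_ifs"
}

# Example with placeholder values.
print_subnet_tag_commands "aws-example" "subnet-aaa,subnet-bbb"
```

In practice you would call it with "${CLUSTER_NAME}" and "${AWS_SUBNET_IDS}", then run the printed commands from a machine that has AWS credentials for the account.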
  4. Configure your cluster to use an existing registry as a mirror when attempting to pull images:
    ⚠️ If you do not already have a local registry set up, refer to the Local Registry Tools page for more information.

    ⚠️ IMPORTANT: The AMI must be created by the konvoy-image-builder project in order to use the registry mirror feature.

    CODE
    export REGISTRY_URL=<https/http>://<registry-address>:<registry-port>
    export REGISTRY_CA=<path to the CA on the bastion>
    export REGISTRY_USERNAME=<username>
    export REGISTRY_PASSWORD=<password>
    • REGISTRY_URL: the address of an existing registry, accessible in the VPC, that the new cluster nodes will be configured to use as a mirror registry when pulling images.

    • REGISTRY_CA: (optional) the path on the bastion machine to the registry CA. Konvoy will configure the cluster nodes to trust this CA. This value is only needed if the registry is using a self-signed certificate and the AMIs are not already configured to trust this CA.

    • REGISTRY_USERNAME: (optional) a user that has pull access to this registry.

    • REGISTRY_PASSWORD: the password for that user. It is optional if REGISTRY_USERNAME is not set.
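The rules for these variables can be checked before running the create command. This is a minimal sketch under stated assumptions: check_registry_env is a hypothetical helper, not part of DKP, and it encodes only what is described above (a URL with an http or https scheme, an optional CA path, and a password that accompanies the username).

```shell
# Hypothetical preflight check for the registry mirror variables.
# Returns 0 when the values look consistent, 1 otherwise.
check_registry_env() {
  # REGISTRY_URL must include a scheme, as in <https/http>://<address>:<port>.
  case "${REGISTRY_URL:-}" in
    http://?* | https://?*) ;;
    *) echo "REGISTRY_URL must start with http:// or https://" >&2; return 1 ;;
  esac
  # REGISTRY_CA is optional, but if set it should point to an existing file.
  if [ -n "${REGISTRY_CA:-}" ] && [ ! -f "${REGISTRY_CA}" ]; then
    echo "REGISTRY_CA file not found: ${REGISTRY_CA}" >&2
    return 1
  fi
  # A username without a password is treated as a mistake here.
  if [ -n "${REGISTRY_USERNAME:-}" ] && [ -z "${REGISTRY_PASSWORD:-}" ]; then
    echo "REGISTRY_PASSWORD must be set when REGISTRY_USERNAME is set" >&2
    return 1
  fi
  return 0
}
```

After exporting the variables, run check_registry_env on the bastion machine and continue only if it exits with status 0. The check never prints the password.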

  5. Create a Kubernetes cluster. The following example shows a common configuration. See dkp create cluster aws reference for the full list of cluster creation options:

    DKP uses AWS CSI as the default storage provider. You can use any Kubernetes CSI-compatible storage solution that is suitable for production. See Changing the Default Storage Class in the Kubernetes documentation for more information.

    CODE
    dkp create cluster aws \
    --cluster-name=${CLUSTER_NAME} \
    --additional-tags=owner=$(whoami) \
    --with-aws-bootstrap-credentials=true \
    --vpc-id=${AWS_VPC_ID} \
    --ami=${AWS_AMI_ID} \
    --subnet-ids=${AWS_SUBNET_IDS} \
    --internal-load-balancer=true \
    --additional-security-group-ids=${AWS_ADDITIONAL_SECURITY_GROUPS} \
    --registry-mirror-url=${REGISTRY_URL} \
    --registry-mirror-cacert=${REGISTRY_CA} \
    --registry-mirror-username=${REGISTRY_USERNAME} \
    --registry-mirror-password=${REGISTRY_PASSWORD} \
    --self-managed

    If your environment uses HTTP/HTTPS proxies, you must include the flags --http-proxy, --https-proxy, and --no-proxy and their related values in this command for it to be successful. More information is available in Configuring an HTTP/HTTPS Proxy.

  6. After the cluster is created, create the GPU node pool. The command expects NODEPOOL_NAME and AMI_ID_FROM_KIB to be set in your shell:

CODE
dkp create nodepool aws -c ${CLUSTER_NAME} \
--instance-type p2.xlarge \
--ami-id=${AMI_ID_FROM_KIB} \
--replicas=1 ${NODEPOOL_NAME} \
--kubeconfig=${CLUSTER_NAME}.conf

  • A self-managed cluster refers to one in which the CAPI resources and controllers that describe and manage it are running on the same cluster they are managing. As part of the underlying processing, the DKP CLI:

    • creates a bootstrap cluster

    • creates a workload cluster

    • moves CAPI controllers from the bootstrap cluster to the workload cluster, making it self-managed

    • deletes the bootstrap cluster

    To understand how this process works step by step, see the customizable Create a New AWS Cluster procedure under Additional Infrastructure Configuration.

Cluster Verification

If you want to monitor or verify the installation of your clusters, refer to:

Verify your Cluster and DKP Installation.

Next Step:

AWS Air-gapped GPU: Install Kommander
