AWS GPU: Create the Management Cluster

Use this procedure to create a self-managed AWS GPU Management cluster with DKP. In a self-managed cluster, the CAPI resources and controllers that describe and manage the cluster run on the same cluster they are managing. After you create the GPU-compatible image with KIB, you can generate a cluster that uses that custom AMI.

Name Your Cluster

The cluster name may only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if the name contains capital letters. See the Kubernetes documentation for more naming information.

By default, the control plane nodes are created in 3 different Availability Zones, while the default worker nodes reside in a single Availability Zone. You can create additional node pools in other Availability Zones with the dkp create nodepool command.

Follow these steps:

  1. Give your cluster a unique name suitable for your environment.

    In AWS it is critical that the name is unique, as no two clusters in the same AWS account can have the same name.

  2. Set the environment variable:

CODE
export CLUSTER_NAME=<aws-example>
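Because an invalid name only surfaces as a failed cluster creation, it can help to check the name before exporting it. The following is a minimal sketch, not part of DKP: the helper name is_valid_cluster_name and the variable candidate are hypothetical, and the check covers only the character set stated above (Kubernetes may impose further naming rules).

```shell
# Hypothetical helper (not part of DKP): succeeds only if the name consists
# solely of the characters listed above: a-z, 0-9, '.', and '-'.
is_valid_cluster_name() {
  # LC_ALL=C keeps the a-z range limited to ASCII lowercase letters.
  printf '%s\n' "$1" | LC_ALL=C grep -Eqx '[a-z0-9.-]+'
}

candidate="aws-example"   # placeholder; replace with your own name
if is_valid_cluster_name "$candidate"; then
  export CLUSTER_NAME="$candidate"
  echo "CLUSTER_NAME set to ${CLUSTER_NAME}"
else
  echo "invalid cluster name: ${candidate}" >&2
fi
```

This check does not verify that the name is unique within your AWS account; that still has to be confirmed separately.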

Create a New AWS Kubernetes Cluster

If you use these instructions to create a cluster on AWS with the DKP default settings, without any edits to configuration files or additional flags, your cluster is deployed on an Ubuntu 20.04 operating system image with 3 control plane nodes and 4 worker nodes.

The default AWS image is not recommended for use in production. D2iQ suggests using Konvoy Image Builder to create a custom AMI and take advantage of enhanced cluster operations.

  • (Optional) Configure your cluster to use an existing local registry as a mirror when attempting to pull images. Below is an AWS ECR example:

    ⚠️ IMPORTANT: The AMI must be created with Konvoy Image Builder in order to use the registry mirror feature.

    CODE
    export REGISTRY_MIRROR_URL=<your_registry_url>
    • REGISTRY_MIRROR_URL: the address of an existing local registry, accessible in the VPC, that the new cluster nodes will be configured to use as a mirror registry when pulling images.
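The placeholder above can be filled in for ECR roughly as follows. This is a sketch under assumptions: the account ID and region are made-up placeholder values, the helper name ecr_registry_host is hypothetical, and the result is the standard ECR registry host of the form account-id.dkr.ecr.region.amazonaws.com. Whether your DKP version expects a scheme prefix such as https:// is not stated here, so adjust the value if needed.

```shell
# Placeholder values (assumptions); replace with your own account and region.
AWS_ACCOUNT_ID="123456789012"
AWS_REGION="us-west-2"

# Hypothetical helper: builds the ECR registry host from account ID and region.
ecr_registry_host() {
  printf '%s.dkr.ecr.%s.amazonaws.com\n' "$1" "$2"
}

REGISTRY_MIRROR_URL="$(ecr_registry_host "$AWS_ACCOUNT_ID" "$AWS_REGION")"
export REGISTRY_MIRROR_URL
echo "REGISTRY_MIRROR_URL=${REGISTRY_MIRROR_URL}"
```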

DKP uses AWS CSI as the default storage provider. You can use any Kubernetes CSI-compatible storage solution that is suitable for production. See Changing the Default Storage Class in the Kubernetes documentation for more information.

  1. Execute this command to create a cluster with the --self-managed flag, so that the CAPI resources and controllers that describe and manage the cluster run on the cluster itself. The GPU AMI is supplied when you create the node pool in step 2:

    CODE
    dkp create cluster aws \
    --cluster-name=${CLUSTER_NAME} \
    --with-aws-bootstrap-credentials=true \
    --self-managed

    If your environment uses HTTP/HTTPS proxies, you must include the flags --http-proxy, --https-proxy, and --no-proxy and their related values in this command for it to be successful. More information is available in Configuring an HTTP/HTTPS Proxy.

  2. Create the node pool after cluster creation:

    CODE
    dkp create nodepool aws -c ${CLUSTER_NAME} \
    --instance-type p2.xlarge \
    --ami-id=${AMI_ID_FROM_KIB} \
    --replicas=1 ${NODEPOOL_NAME} \
    --kubeconfig=${CLUSTER_NAME}.conf
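The commands in steps 1 and 2 rely on CLUSTER_NAME, AMI_ID_FROM_KIB, and NODEPOOL_NAME, and only the first is set earlier in this procedure. A pre-flight check along these lines can catch an unset variable before dkp runs. It is a minimal sketch: require_vars is a hypothetical helper, not a DKP command.

```shell
# Hypothetical helper (not part of DKP): reports every named variable that is
# unset or empty, and returns non-zero if any are missing.
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Run the check before step 2; the dkp command itself is left to you.
if require_vars CLUSTER_NAME AMI_ID_FROM_KIB NODEPOOL_NAME; then
  echo "all variables set; ready to run dkp create nodepool"
fi
```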

A customizable version of this procedure, Create a New AWS Cluster, is available under Custom Installation and Additional Infrastructure Tools.
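One common customization is the proxy case described in step 1. The sketch below assembles the step 1 command and appends --http-proxy, --https-proxy, and --no-proxy when values are provided. The variable names HTTP_PROXY_URL, HTTPS_PROXY_URL, and NO_PROXY_LIST and the helper dkp_create_cluster_cmd are assumptions made for illustration. The helper only prints the command so you can review it before running it.

```shell
# Hypothetical helper: prints the step 1 command, appending each proxy flag
# only when its (assumed) variable is set and non-empty.
dkp_create_cluster_cmd() {
  set -- dkp create cluster aws \
    "--cluster-name=${CLUSTER_NAME}" \
    --with-aws-bootstrap-credentials=true \
    --self-managed
  if [ -n "${HTTP_PROXY_URL:-}" ]; then set -- "$@" "--http-proxy=${HTTP_PROXY_URL}"; fi
  if [ -n "${HTTPS_PROXY_URL:-}" ]; then set -- "$@" "--https-proxy=${HTTPS_PROXY_URL}"; fi
  if [ -n "${NO_PROXY_LIST:-}" ]; then set -- "$@" "--no-proxy=${NO_PROXY_LIST}"; fi
  printf '%s\n' "$*"
}

CLUSTER_NAME="${CLUSTER_NAME:-aws-example}"   # placeholder default
dkp_create_cluster_cmd
```

In a proxied environment all three flags are required, so set all three variables before using the printed command.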

Cluster Verification

To monitor or verify the installation of your clusters, refer to Verify your Cluster and DKP Installation.

Next Step:

AWS GPU: Install Kommander
