AWS GPU: Create the Management Cluster
Use this procedure to create a self-managed AWS GPU Management cluster with DKP. A self-managed cluster is one in which the CAPI resources and controllers that describe and manage it run on the same cluster they are managing. After the GPU-compatible image is created with KIB, a cluster can be generated using that custom AMI.
Name Your Cluster
The cluster name may only contain the following characters: a-z, 0-9, . (period), and - (hyphen). Cluster creation will fail if the name has capital letters. See Kubernetes for more naming information.
By default, the control-plane Nodes will be created in 3 different zones. However, the default worker Nodes will reside in a single Availability Zone. You may create additional node pools in other Availability Zones with the dkp create nodepool command.
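For example, after the cluster is created, a node pool could be added in a specific Availability Zone with a command along these lines. This is a sketch only: the zone, instance type, and node pool name are placeholders, and the --availability-zone flag is an assumption to verify against dkp create nodepool aws --help in your DKP version:

dkp create nodepool aws -c ${CLUSTER_NAME} \
  --availability-zone us-west-2b \
  --instance-type p2.xlarge \
  --ami-id=${AMI_ID_FROM_KIB} \
  --replicas=1 example-az-b-pool \
  --kubeconfig=${CLUSTER_NAME}.conf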
Follow these steps:
Give your cluster a unique name suitable for your environment.
In AWS it is critical that the name is unique, as no two clusters in the same AWS account can have the same name.
Set the environment variable:
export CLUSTER_NAME=<aws-example>
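As a sketch, you can derive a unique name from your username and the date and check it against the naming rules above; the pattern shown here is only an example, not a requirement:

# Build a lowercase cluster name from your username and today's date (example pattern only).
export CLUSTER_NAME="aws-gpu-$(whoami | tr '[:upper:]' '[:lower:]' | tr '_' '-')-$(date +%Y%m%d)"

# Optional check against the allowed characters (a-z, 0-9, ., -).
[[ "${CLUSTER_NAME}" =~ ^[a-z0-9.-]+$ ]] || echo "Invalid cluster name: ${CLUSTER_NAME}" >&2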
Create a New AWS Kubernetes Cluster
If you use these instructions to create a cluster on AWS with the DKP default settings, without any edits to configuration files or additional flags, your cluster is deployed on an Ubuntu 20.04 operating system image with 3 control plane nodes and 4 worker nodes.
In previous DKP releases, AMI images provided by the upstream CAPA project would be used if you did not specify an AMI. However, the upstream images are not recommended for production and may not always be available. Therefore, DKP now requires you to specify an AMI when creating a cluster. To create an AMI, use Konvoy Image Builder.
There are two approaches to supplying your AMI. Either provide the AMI ID directly, or provide a way for DKP to discover the AMI using location, format, and OS information:
Option One - Provide the ID of your AMI:
Use the example command below leaving the existing flag that provides the AMI ID:
--ami AMI_ID
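If you need to look up the ID of the AMI that KIB produced, a query such as the following can help; the Name filter is an assumption about how your KIB build named the image, so adjust it to match your own AMI:

# Return the most recently created matching AMI owned by this account.
export AMI_ID=$(aws ec2 describe-images \
  --owners self \
  --filters "Name=name,Values=*ubuntu-20.04*" \
  --query 'sort_by(Images, &CreationDate)[-1].ImageId' \
  --output text)
echo "Using AMI: ${AMI_ID}"

You can then pass this value to the create command below as --ami ${AMI_ID}.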
Option Two - Provide the information required for DKP to discover your AMI:
Where the AMI is published using your AWS Account ID:
--ami-owner AWS_ACCOUNT_ID
The base OS information:
--ami-base-os ubuntu-20.04
The format or string used to search for matching AMIs; ensure it references the Kubernetes version plus the base OS name:
--ami-format 'example-{{.BaseOS}}-?{{.K8sVersion}}-*'
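As an illustration only, and assuming the discovery works like the upstream CAPA AMI lookup, the flags above would resolve an AMI named along these lines:

# example-ubuntu-20.04-v1.26.6-1668113597   (hypothetical AMI name)
#   {{.BaseOS}}      -> ubuntu-20.04  (from --ami-base-os)
#   ?{{.K8sVersion}} -> v1.26.6       (the cluster's Kubernetes version; "?" matches the leading "v")
#   *                -> 1668113597    (any trailing build identifier)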
(Optional) Configure your cluster to use an existing local registry as a mirror when attempting to pull images. Below is an AWS ECR example:
⚠️ IMPORTANT: The AMI must be created with Konvoy Image Builder in order to use the registry mirror feature.
export REGISTRY_MIRROR_URL=<your_registry_url>
REGISTRY_MIRROR_URL: the address of an existing local registry, accessible in the VPC, that the new cluster nodes will be configured to use as a mirror registry when pulling images.
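For instance, an ECR registry endpoint follows the standard form <aws_account_id>.dkr.ecr.<region>.amazonaws.com; the account ID and region below are placeholders:

# Hypothetical ECR endpoint -- substitute your own account ID and region.
export REGISTRY_MIRROR_URL=123456789012.dkr.ecr.us-west-2.amazonaws.com

Depending on your DKP release, this value is typically passed to cluster creation with a registry mirror flag such as --registry-mirror-url; verify the exact flag name with dkp create cluster aws --help.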
DKP uses AWS CSI as the default storage provider. You can use any Kubernetes CSI-compatible storage solution that is suitable for production. See the Kubernetes documentation topic Changing the Default Storage Class for more information.
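If you later switch the default to another CSI-compatible storage class, the change follows the standard Kubernetes annotation mechanism; the class names below are placeholders:

# Unset the current default, then mark your preferred storage class as the default.
kubectl patch storageclass <current-default-class> \
  -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "false"}}}'
kubectl patch storageclass <your-storage-class> \
  -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'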
Run the Option One command as explained above to create a cluster with a GPU AMI and the --self-managed flag:

dkp create cluster aws \
  --cluster-name=${CLUSTER_NAME} \
  --additional-tags=owner=$(whoami) \
  --with-aws-bootstrap-credentials=true \
  --ami AMI_ID \
  --self-managed
OR
Run the Option Two command as explained above to create a cluster with a GPU AMI, providing the location, format, and base OS:

dkp create cluster aws \
  --cluster-name=${CLUSTER_NAME} \
  --additional-tags=owner=$(whoami) \
  --with-aws-bootstrap-credentials=true \
  --ami-owner AWS_ACCOUNT_ID \
  --ami-base-os ubuntu-20.04 \
  --ami-format 'example-{{.BaseOS}}-?{{.K8sVersion}}-*' \
  --self-managed
If your environment uses HTTP/HTTPS proxies, you must include the flags --http-proxy, --https-proxy, and --no-proxy and their related values in this command for it to be successful. More information is available in Configuring an HTTP/HTTPS Proxy.

Create the node pool after cluster creation:
dkp create nodepool aws -c ${CLUSTER_NAME} \
  --instance-type p2.xlarge \
  --ami-id=${AMI_ID_FROM_KIB} \
  --replicas=1 ${NODEPOOL_NAME} \
  --kubeconfig=${CLUSTER_NAME}.conf
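Once the node pool is created and you have retrieved the cluster's kubeconfig (for example with dkp get kubeconfig -c ${CLUSTER_NAME} > ${CLUSTER_NAME}.conf), a quick sanity check is to list the nodes and their instance types to confirm the GPU workers registered:

# The -L flag adds the instance-type label as a column.
kubectl get nodes --kubeconfig=${CLUSTER_NAME}.conf -L node.kubernetes.io/instance-type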
You can find a customizable Create a New AWS Cluster procedure under Custom Installation and Additional Infrastructure Tools.
Cluster Verification
If you want to monitor or verify the installation of your clusters, refer to:
Verify your Cluster and DKP Installation.