
Pre-provisioned GPU: Create a Management Cluster

Create a Cluster with GPU AMI

If you created a custom AMI using Konvoy Image Builder, the custom AMI ID is printed and written to ./manifest.json.

To use the built AMI with Konvoy, specify it with the --ami flag when calling cluster create.
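As a rough sketch, the AMI ID can be pulled out of the manifest with standard tools instead of being copied by hand. This assumes the ID appears in the file in the usual `ami-<hex>` form and that later entries correspond to more recent builds. The function name `ami_id_from_manifest` is hypothetical; `AMI_ID_FROM_KIB` is the variable used in the node pool step later on this page.

```shell
# Print the last AMI ID (ami-<hex>) found in a manifest file.
# Assumption: the ID appears literally in the file as "ami-" followed by
# hex digits, and the last match belongs to the most recent build.
ami_id_from_manifest() {
  manifest="${1:-./manifest.json}"
  grep -Eo 'ami-[0-9a-f]+' "$manifest" | tail -n 1
}

# Example usage (uncomment once ./manifest.json exists):
# export AMI_ID_FROM_KIB="$(ami_id_from_manifest ./manifest.json)"
```

If the function prints nothing, no AMI ID was found and the manifest should be inspected manually.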

Follow the GPU steps in the Pre-provisioned section of the documentation to use the overrides/nvidia.yaml file.

Additional helpful information can be found in the NVIDIA Device Plug-in for Kubernetes instructions and the Installation Guide of Supported Platforms.

Name Your Cluster

The cluster name may only contain the following characters: a-z, 0-9, ., and -. Cluster creation will fail if the name has capital letters. See Kubernetes for more naming information.
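The character rule above can be checked before the name is exported. The following is a minimal sketch that tests only the characters listed here (a-z, 0-9, ., and -); it does not enforce any additional Kubernetes naming rules, and the function name `is_valid_cluster_name` is hypothetical.

```shell
# Return 0 if the name is non-empty and uses only a-z, 0-9, '.', and '-'.
# LC_ALL=C keeps the a-z range from matching capital letters in some locales.
is_valid_cluster_name() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9.-]+$'
}

# Example usage:
# is_valid_cluster_name "preprovisioned-example" && echo "name is acceptable"
```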

When specifying the cluster-name, you must use the same cluster-name as used when defining your inventory objects.

By default, the control-plane Nodes will be created in 3 different zones. However, the default worker Nodes will reside in a single Availability Zone. You may create additional node pools in other Availability Zones with the dkp create nodepool command.

Follow these steps:

  1. Give your cluster a unique name suitable for your environment.

  2. Set the environment variable:

CODE
export CLUSTER_NAME=<preprovisioned-example>

Create a Kubernetes Cluster

Once you’ve defined the infrastructure and control plane endpoints, follow these steps to create a new pre-provisioned cluster.

Before you create a new DKP cluster below, choose an external load balancer (LB) or virtual IP and use the corresponding dkp create cluster command.

In a pre-provisioned environment, use the Kubernetes CSI and third party drivers for local volumes and other storage devices in your data center.

DKP uses the local static provisioner (localvolumeprovisioner) as the default storage provider for a pre-provisioned environment. However, localvolumeprovisioner is not suitable for production use. For production, use a Kubernetes CSI-compatible storage provider instead.

After disabling localvolumeprovisioner, you can choose from any of the storage options available for Kubernetes. To make that storage the default storage, use the commands shown in this section of the Kubernetes documentation: https://kubernetes.io/docs/tasks/administer-cluster/change-default-storage-class/
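The linked Kubernetes page changes the default by patching the `storageclass.kubernetes.io/is-default-class` annotation on each storage class. A minimal sketch of that procedure follows; the helper names are hypothetical, `my-csi-storage` is a placeholder, and the existing class name is assumed to be `localvolumeprovisioner` (confirm with `kubectl get storageclass`).

```shell
# Build the JSON patch that sets or clears the default-class annotation.
default_class_patch() {
  printf '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"%s"}}}' "$1"
}

# Mark the old class as non-default, then mark the new class as default.
switch_default_storage_class() {
  old_class="$1"
  new_class="$2"
  kubectl patch storageclass "$old_class" -p "$(default_class_patch false)" &&
    kubectl patch storageclass "$new_class" -p "$(default_class_patch true)"
}

# Example usage; "my-csi-storage" is a placeholder class name:
# switch_default_storage_class localvolumeprovisioner my-csi-storage
```

Only one storage class should carry the default annotation at a time, which is why the old class is cleared before the new one is set.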

The following command relies on the pre-provisioned cluster API infrastructure provider to initialize the Kubernetes control plane and worker nodes on the hosts defined in the inventory YAML previously created.

The create cluster command below includes the --self-managed flag. A self-managed cluster refers to one in which the CAPI resources and controllers that describe and manage it are running on the same cluster they are managing.

  1. Execute this command to create a cluster with a GPU AMI using the default external load balancer option:

    CODE
    dkp create cluster preprovisioned \
      --cluster-name=${CLUSTER_NAME} \
      --control-plane-endpoint-host <control plane endpoint host> \
      --control-plane-endpoint-port <control plane endpoint port, if different than 6443> \
      --pre-provisioned-inventory-file preprovisioned_inventory.yaml \
      --ssh-private-key-file <path-to-ssh-private-key> \
      --ami <ami> \
      --self-managed

If your environment uses HTTP/HTTPS proxies, you must include the flags --http-proxy, --https-proxy, and --no-proxy and their related values in this command for it to be successful. More information is available in Configuring an HTTP/HTTPS Proxy.
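One way to keep the proxy flags consistent across commands is to assemble them from variables. The sketch below is illustrative only: `DKP_HTTP_PROXY`, `DKP_HTTPS_PROXY`, and `DKP_NO_PROXY` are variable names invented for this example (dkp does not read them itself), and `proxy_flags` is a hypothetical helper.

```shell
# Print the proxy flags for "dkp create cluster" for each variable that is set.
# Assumption: the DKP_* variable names are invented for this sketch.
proxy_flags() {
  flags=""
  if [ -n "${DKP_HTTP_PROXY:-}" ]; then
    flags="$flags --http-proxy=${DKP_HTTP_PROXY}"
  fi
  if [ -n "${DKP_HTTPS_PROXY:-}" ]; then
    flags="$flags --https-proxy=${DKP_HTTPS_PROXY}"
  fi
  if [ -n "${DKP_NO_PROXY:-}" ]; then
    flags="$flags --no-proxy=${DKP_NO_PROXY}"
  fi
  # Strip the leading space left by the first appended flag.
  printf '%s\n' "${flags# }"
}

# Example usage (values must not contain spaces, because the output is word-split):
# dkp create cluster preprovisioned <other flags> $(proxy_flags)
```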

  1. Virtual IP alternative: if you don’t have an external LB and want to use a virtual IP provided by kube-vip, specify these flags as shown in the example below:

    CODE
    dkp create cluster preprovisioned \
        --cluster-name ${CLUSTER_NAME} \
        --control-plane-endpoint-host 196.168.1.10 \
        --virtual-ip-interface eth1

    The output from this command is shortened here for reading clarity, but should start like this:

    CODE
    Generating cluster resources
    cluster.cluster.x-k8s.io/preprovisioned-example created
    ...

  2. Create the node pool after cluster creation:

    CODE
    dkp create nodepool aws -c ${CLUSTER_NAME} \
    --instance-type p2.xlarge \
    --ami-id=${AMI_ID_FROM_KIB} \
    --replicas=1 ${NODEPOOL_NAME} \
    --kubeconfig=${CLUSTER_NAME}.conf

  3. Use the wait command to monitor the cluster control-plane readiness:

    CODE
    kubectl wait --for=condition=ControlPlaneReady "clusters/${CLUSTER_NAME}" --timeout=30m

    Output:

    CODE
    cluster.cluster.x-k8s.io/preprovisioned-example condition met

Depending on the cluster size, it will take a few minutes to create.

When the command completes, you will have a running Kubernetes cluster! Use this command to get the Kubernetes kubeconfig for the new cluster and proceed to installing the DKP Kommander UI:

CODE
dkp get kubeconfig -c ${CLUSTER_NAME} > ${CLUSTER_NAME}.conf
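With the kubeconfig in hand, a quick check can confirm that the nodes are registered and that GPUs are visible to Kubernetes. This sketch assumes the NVIDIA device plugin is running and advertises the `nvidia.com/gpu` resource; nodes without GPUs show `<none>` in the GPU column. The function name `list_gpu_nodes` is hypothetical.

```shell
# List nodes with their allocatable NVIDIA GPU count, using the given kubeconfig.
# Assumption: the NVIDIA device plugin advertises the "nvidia.com/gpu" resource.
list_gpu_nodes() {
  kubectl --kubeconfig="$1" get nodes \
    -o custom-columns='NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'
}

# Example usage:
# list_gpu_nodes "${CLUSTER_NAME}.conf"
```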

If changing the Calico encapsulation, D2iQ recommends changing it after cluster creation, but before production.

To increase Docker Hub's rate limit, use your Docker Hub credentials when creating the cluster by setting the following flags on the dkp create cluster command: --registry-mirror-url=https://registry-1.docker.io --registry-mirror-username= --registry-mirror-password=

Audit logs

To modify the Control Plane audit log settings, use the information on the Configure the Control Plane page.

Further Steps:

For more customized cluster creation, see the Pre-Provisioned Additional Configurations section. That section covers Pre-Provisioned Override Files, custom flags, and other options that specify the secret as part of the create cluster command. If these are not specified, the overrides for your nodes are not applied.

Cluster Verification

If you want to monitor or verify the installation of your clusters, refer to:

Verify your Cluster and DKP Installation.

Next Step:

Pre-provisioned GPU: Configure MetalLB
