GPU Nodepools in a Pre-provisioned Environment
For air-gapped pre-provisioned environments, DKP provides the nvidia-runfile flag. If you have not yet downloaded the NVIDIA runfile installer, retrieve it by running the following commands. The first command downloads the runfile and the second moves it into the artifacts directory (create the artifacts directory first if it doesn’t already exist).
curl -O https://download.nvidia.com/XFree86/Linux-x86_64/470.82.01/NVIDIA-Linux-x86_64-470.82.01.run
mv NVIDIA-Linux-x86_64-470.82.01.run artifacts
DKP supports NVIDIA driver version 470.x.
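The download step above can be made repeatable with a small helper. The following is a minimal sketch, not part of DKP: the function names nvidia_runfile_url and fetch_runfile are hypothetical, and the URL pattern is assumed to match the one in the curl command above for other 470.x releases. It creates the artifacts directory if needed and skips the download when the runfile is already present.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are illustrative and not provided by DKP.

# Build the download URL for a driver version, following the pattern
# of the curl command shown in this section.
nvidia_runfile_url() {
  local version="$1"
  printf 'https://download.nvidia.com/XFree86/Linux-x86_64/%s/NVIDIA-Linux-x86_64-%s.run\n' \
    "$version" "$version"
}

# Download the runfile into the artifacts directory unless it is already there.
fetch_runfile() {
  local version="$1"
  local dir="${2:-artifacts}"
  local file="NVIDIA-Linux-x86_64-${version}.run"

  # Create the artifacts directory if it does not exist yet.
  mkdir -p "$dir"

  if [ -f "${dir}/${file}" ]; then
    echo "${dir}/${file} already present, skipping download"
    return 0
  fi

  # -f fails on HTTP errors, -L follows redirects, -o writes to the target path.
  curl -fL -o "${dir}/${file}" "$(nvidia_runfile_url "$version")"
}

# Usage (requires network access):
#   fetch_runfile 470.82.01 artifacts
```

Writing directly into the artifacts directory with -o replaces the separate mv step; the end result is the same file in the same location.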
1. Create the secret that the GPU node pool will use. This secret is populated from the KIB overrides. In this example, the overrides file is named overrides/nvidia.yaml. It should resemble this:
gpu:
  types:
    - nvidia
build_name_extra: "-nvidia"
2. Create a secret on the bootstrap cluster that is populated from the above file. In this example it is named ${CLUSTER_NAME}-user-overrides:
kubectl create secret generic ${CLUSTER_NAME}-user-overrides --from-file=overrides.yaml=overrides/nvidia.yaml
3. Create an inventory and node pool using the instructions below, and use the ${CLUSTER_NAME}-user-overrides secret.
Follow these steps:
Create an inventory object that has the same name as the node pool you’re creating, and the details of the pre-provisioned machines that you want to add to it. For example, to create a node pool named gpu-nodepool, an inventory named gpu-nodepool must be present in the same namespace:
apiVersion: infrastructure.cluster.konvoy.d2iq.io/v1alpha1
kind: PreprovisionedInventory
metadata:
  name: ${MY_NODEPOOL_NAME}
spec:
  hosts:
    - address: ${IP_OF_NODE}
  sshConfig:
    port: 22
    user: ${SSH_USERNAME}
    privateKeyRef:
      name: ${NAME_OF_SSH_SECRET}
      namespace: ${NAMESPACE_OF_SSH_SECRET}
(Optional) If your pre-provisioned machines have overrides, you must create a secret that includes all of the overrides you want to provide in one file. Create an override secret using the instructions detailed on this page.
Once the PreprovisionedInventory object and overrides are created, create a node pool:
dkp create nodepool preprovisioned -c ${MY_CLUSTER_NAME} ${MY_NODEPOOL_NAME} --override-secret-name ${MY_OVERRIDE_SECRET}
Advanced users can use a combination of the --dry-run and --output=yaml or --output-directory=<existing-directory> flags to get a complete set of node pool objects to modify locally or store in version control.
For more information regarding this flag or others, please refer to the dkp create nodepool section of the documentation for either cluster or nodepool and select your provider.
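The steps in this section can be strung together roughly as follows. This is a sketch under stated assumptions rather than an official script: render_inventory and create_gpu_nodepool are hypothetical helper names, the inventory describes a single host, the overrides file is assumed to be at overrides/nvidia.yaml, and kubectl is assumed to point at the bootstrap cluster. Only the commands and flags already shown in this section are used.

```shell
#!/usr/bin/env bash
# Sketch only: function names and argument order are illustrative, not part of DKP.

# Print a PreprovisionedInventory manifest for a single host.
render_inventory() {
  local nodepool="$1" address="$2" ssh_user="$3" ssh_secret="$4" ssh_secret_ns="$5"
  cat <<EOF
apiVersion: infrastructure.cluster.konvoy.d2iq.io/v1alpha1
kind: PreprovisionedInventory
metadata:
  name: ${nodepool}
spec:
  hosts:
    - address: ${address}
  sshConfig:
    port: 22
    user: ${ssh_user}
    privateKeyRef:
      name: ${ssh_secret}
      namespace: ${ssh_secret_ns}
EOF
}

# Create the overrides secret, then the inventory, then the node pool.
# Stops at the first step that fails.
create_gpu_nodepool() {
  local cluster="$1" nodepool="$2" address="$3" ssh_user="$4" ssh_secret="$5" ssh_secret_ns="$6"

  # Assumption: overrides/nvidia.yaml exists relative to the current directory.
  kubectl create secret generic "${cluster}-user-overrides" \
    --from-file=overrides.yaml=overrides/nvidia.yaml || return 1

  # The inventory name must match the node pool name.
  render_inventory "$nodepool" "$address" "$ssh_user" "$ssh_secret" "$ssh_secret_ns" \
    | kubectl apply -f - || return 1

  dkp create nodepool preprovisioned -c "$cluster" "$nodepool" \
    --override-secret-name "${cluster}-user-overrides"
}

# Usage (all values are placeholders):
#   create_gpu_nodepool my-cluster gpu-nodepool 10.0.0.5 konvoy ssh-key default
```

Rendering the inventory through a function keeps the node pool name and inventory name in sync, which is the requirement the inventory step calls out.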
For more information, see: