NVIDIA Platform Application Management Cluster

Instructions on enabling the NVIDIA platform application on a Management cluster

Enable NVIDIA Platform Application on Kommander for Management Cluster

If you intend to run applications that use GPUs on your cluster, install the NVIDIA GPU operator. To enable NVIDIA GPU support when installing Kommander on a management cluster, perform the following steps:

  1. Create an installation configuration file:

    CODE
    dkp install kommander --init > install.yaml
  2. Append the following to the apps section in the install.yaml file to enable NVIDIA platform services:

    CODE
    apps:
      nvidia-gpu-operator:
        enabled: true
  3. Install Kommander using the configuration file you created:

    CODE
    dkp install kommander --installer-config ./install.yaml --kubeconfig=${CLUSTER_NAME}.conf

    In the previous command, the --kubeconfig=${CLUSTER_NAME}.conf flag ensures that you set the context to install Kommander on the right cluster. For alternatives and recommendations around setting your context, refer to Provide Context for Commands with a kubeconfig File.

  4. Proceed to the section Select the Correct Toolkit Version for your NVIDIA GPU Operator.

TIP: Sometimes, applications require a longer period of time to deploy, which causes the installation to time out. Add the --wait-timeout <time to wait> flag and specify a period of time (for example, 1h) to allocate more time to the deployment of applications.
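
As a sketch of how the flag fits into the install command from step 3, the helper below prints the full command instead of running it, so it can be reviewed first. The function name kommander_install_cmd and the cluster name my-cluster are hypothetical placeholders; the flags are only those shown in this section.

```shell
# Hypothetical helper: print the Kommander install command with a wait timeout.
# Usage: kommander_install_cmd <cluster-name> [wait-timeout]
kommander_install_cmd() {
  cluster_name="$1"
  wait_timeout="${2:-1h}"   # 1h is the example value from the tip above
  printf 'dkp install kommander --installer-config ./install.yaml --kubeconfig=%s.conf --wait-timeout %s\n' \
    "$cluster_name" "$wait_timeout"
}

# Print the command for a cluster named my-cluster (placeholder name).
kommander_install_cmd my-cluster 1h
```

Copy the printed line into your terminal once the cluster name and timeout look right.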

Select the Correct Toolkit Version for your NVIDIA GPU Operator

The NVIDIA Container Toolkit allows users to run GPU-accelerated containers. The toolkit includes a container runtime library and utilities that automatically configure containers to use NVIDIA GPUs. It must be configured to match your base operating system.
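
If you are unsure which base operating system a GPU node runs, one option is to read its os-release file. The sketch below assumes a Linux node where /etc/os-release exists; the function name print_base_os is hypothetical.

```shell
# Hypothetical helper: print the ID and VERSION_ID fields of an os-release file.
# The path defaults to /etc/os-release and can be overridden for testing.
print_base_os() {
  os_release_file="${1:-/etc/os-release}"
  # Source the file in a subshell so its variables do not leak into the caller.
  (
    . "$os_release_file"
    printf '%s %s\n' "$ID" "$VERSION_ID"
  )
}
```

Run print_base_os on the node itself, not on your workstation, because the toolkit version depends on the operating system of the GPU-enabled nodes.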

Kommander (Management Cluster) Customization

  1. Select the correct Toolkit version based on your OS:

    CentOS 7.9/RHEL 7.9:
    If you’re using CentOS 7.9 or RHEL 7.9 as the base operating system for your GPU-enabled nodes, set the toolkit.version parameter in your Kommander Installer Configuration file or <kommander.yaml> to the following:

    CODE
    kind: Installation
    apps:
      nvidia-gpu-operator:
        enabled: true
        values: |
          toolkit:
            version: v1.10.0-centos7

    RHEL 8.4/8.6 and SLES 15 SP3:
    If you’re using RHEL 8.4/8.6 or SLES 15 SP3 as the base operating system for your GPU-enabled nodes, set the toolkit.version parameter in your Kommander Installer Configuration file or <kommander.yaml> to the following:

    CODE
    kind: Installation
    apps:
      nvidia-gpu-operator:
        enabled: true
        values: |
          toolkit:
            version: v1.10.0-ubi8

    Ubuntu 18.04 and 20.04:
    If you’re using Ubuntu 18.04 or 20.04 as the base operating system for your GPU-enabled nodes, set the toolkit.version parameter in your Kommander Installer Configuration file or <kommander.yaml> to the following:

    CODE
    kind: Installation
    apps:
      nvidia-gpu-operator:
        enabled: true
        values: |
          toolkit:
            version: v1.11.0-ubuntu20.04

  2. Install Kommander, using the configuration file you created:

    CODE
    dkp install kommander --installer-config ./install.yaml --kubeconfig=${CLUSTER_NAME}.conf

    In the previous command, the --kubeconfig=${CLUSTER_NAME}.conf flag ensures that you set the context to install Kommander on the right cluster. For alternatives and recommendations around setting your context, refer to Provide Context for Commands with a kubeconfig File.

    TIP: Sometimes, applications require a longer period of time to deploy, which causes the installation to time out. Add the --wait-timeout <time to wait> flag and specify a period of time (for example, 1h) to allocate more time to the deployment of applications.
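
The mapping from base operating system to toolkit.version in step 1 can be summarized as a small lookup. This is only a sketch: the function name toolkit_version_for is hypothetical, and the OS identifiers (centos, rhel, sles, ubuntu, with 15.3 standing for SLES 15 SP3) are assumed to follow the ID and VERSION_ID conventions of /etc/os-release. The version strings are the ones listed in this section.

```shell
# Hypothetical helper: map an OS ID and version to the toolkit.version value
# listed in this section. Returns non-zero for combinations not covered here.
toolkit_version_for() {
  os_id="$1"
  os_version="$2"
  case "${os_id}:${os_version}" in
    centos:7.9|rhel:7.9)
      echo "v1.10.0-centos7" ;;
    rhel:8.4|rhel:8.6|sles:15.3)   # 15.3 assumed to denote SLES 15 SP3
      echo "v1.10.0-ubi8" ;;
    ubuntu:18.04|ubuntu:20.04)
      echo "v1.11.0-ubuntu20.04" ;;
    *)
      echo "no toolkit version listed for ${os_id} ${os_version}" >&2
      return 1 ;;
  esac
}

toolkit_version_for ubuntu 20.04   # prints v1.11.0-ubuntu20.04
```

Failing on unlisted combinations is deliberate here: it avoids silently picking a toolkit image for an operating system this page does not cover.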
