
GPUs

This section describes NVIDIA GPU support on Kommander. The NVIDIA driver version supported by DKP is 470.x. Other GPUs, such as those made by AMD, are not currently supported. This document assumes familiarity with Kubernetes GPU support. More information about GPUs in an AWS environment can be found in the Advanced AWS section.

Kommander GPU Overview

GPU support on Kommander is provided by the NVIDIA GPU operator. Through the NVIDIA GPU operator application, Kommander configures the container runtime to run GPU containers and installs the components required to use the NVIDIA GPU devices.

The following components provide NVIDIA GPU support on Kommander:

  • libnvidia-container and nvidia-container-runtime: GPU support in Kommander depends on the containerd runtime. libnvidia-container and nvidia-container-runtime sit between containerd and runc, simplifying the integration of the container runtime with the GPU.

  • NVIDIA Device Plugin: Kommander uses this Kubernetes device plugin to make NVIDIA GPUs available to workloads. It allows GPU-enabled containers to run on Kubernetes and tracks the number of available GPUs on each node, along with their health.

  • NVIDIA Data Center GPU Manager: Contains a Prometheus exporter that provides NVIDIA GPU metrics.
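As a rough illustration of what the Prometheus exporter in the last item provides, the sketch below parses a few lines of Prometheus text-format output and extracts per-GPU utilization. The metric names DCGM_FI_DEV_GPU_UTIL and DCGM_FI_DEV_GPU_TEMP and the gpu and Hostname labels are assumed to match what the exporter emits in its default configuration; the sample values and the helper names parse_metrics and gpu_utilization are made up for this example and do not come from Kommander.

```python
import re

# Illustrative sample only: the values and hostname are invented, and the metric
# and label names are assumed to match the exporter's default configuration.
SAMPLE = """\
# HELP DCGM_FI_DEV_GPU_UTIL GPU utilization (in %).
# TYPE DCGM_FI_DEV_GPU_UTIL gauge
DCGM_FI_DEV_GPU_UTIL{gpu="0",Hostname="node-a"} 37
DCGM_FI_DEV_GPU_UTIL{gpu="1",Hostname="node-a"} 92
DCGM_FI_DEV_GPU_TEMP{gpu="0",Hostname="node-a"} 54
"""

# metric_name{label="value",...} value
LINE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from Prometheus text output.

    Simplified: comment lines are skipped, and label values containing '}' or
    escaped quotes are not handled.
    """
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def gpu_utilization(samples):
    """Map each GPU index (as a string) to its reported utilization percentage."""
    return {
        labels.get("gpu", "?"): value
        for name, labels, value in samples
        if name == "DCGM_FI_DEV_GPU_UTIL"
    }


if __name__ == "__main__":
    print(gpu_utilization(parse_metrics(SAMPLE)))
```

In a real cluster these metrics would normally be scraped by Prometheus rather than parsed by hand; the parser here only shows the shape of the data.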

Kommander runs these components as DaemonSets, which makes them easier to manage and upgrade across all GPU nodes.
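Once the device plugin is running on a node, workloads typically request GPUs through the nvidia.com/gpu extended resource. The following minimal sketch builds such a Pod manifest in Python and prints it as JSON, which kubectl accepts in place of YAML. The function name gpu_pod_manifest, the Pod name, and the image example.com/cuda-sample:latest are hypothetical placeholders, not values defined by Kommander.

```python
import json


def gpu_pod_manifest(name: str, image: str, gpus: int = 1) -> dict:
    """Build a minimal Pod manifest that requests whole NVIDIA GPUs."""
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # GPUs are requested under limits, in whole units, using the
                    # extended resource name advertised by the NVIDIA device plugin.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical name and image, used only for illustration.
    manifest = gpu_pod_manifest("gpu-smoke-test", "example.com/cuda-sample:latest")
    print(json.dumps(manifest, indent=2))
```

Saving the output to a file and applying it with kubectl apply -f would schedule the Pod only on a node that still has an unallocated GPU, assuming the cluster has GPU nodes with the components above installed.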

For more information from NVIDIA, see the Getting Started page for NVIDIA.
