AWS Air-gapped GPU: Use Node Label Automatic Configuration
When using GPU nodes, it is important that they carry the proper label identifying them as NVIDIA GPU nodes. By default, Node Feature Discovery (NFD) labels PCI hardware as:
"feature.node.kubernetes.io/pci-<device label>.present": "true"
By default, <device label> has the form:
<class>_<vendor>
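As a minimal sketch of how the default label key is assembled, the snippet below joins a PCI class and a vendor ID into the key shown above. The function name nfd_pci_label is hypothetical, and the class value 0302 (3D controller) is only an example; the class your nodes actually report may differ.

```python
NFD_PREFIX = "feature.node.kubernetes.io"

def nfd_pci_label(pci_class: str, vendor: str) -> str:
    """Build the default NFD PCI label key from a class and a vendor ID."""
    # Default device label is "<class>_<vendor>"
    device_label = f"{pci_class}_{vendor}"
    return f"{NFD_PREFIX}/pci-{device_label}.present"

# Example values only: class 0302 (assumed), vendor 10de (NVIDIA)
label = nfd_pci_label("0302", "10de")
print(label)
```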
However, because devices and their assigned PCI classes vary widely, the labels assigned to your GPU nodes may not always identify them as containing an NVIDIA GPU.
If the default detection does not work, you can manually change the daemonset that the GPU operator creates by setting its nodeSelector field as follows:
nodeSelector:
  feature.node.kubernetes.io/pci-<class>_<vendor>.present: "true"
<class> is any four-digit code of the form 03xy (that is, starting with 03), and the vendor ID for NVIDIA is 10de. If the GPU operator is already deployed, you can edit the daemonset and change its nodeSelector field so that it deploys to the right nodes.
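To decide which label to put in the nodeSelector, it can help to check which matching label a GPU node actually carries. The following is a rough sketch, not part of the GPU operator: the helper nvidia_gpu_selector and the sample labels are hypothetical, and the pattern assumes the class is written as 03 followed by two hexadecimal digits. In practice the labels could be copied from the output of kubectl get node <node> --show-labels.

```python
import re

# Default NFD PCI label for a class of the form 03xy and the NVIDIA vendor ID 10de
GPU_LABEL = re.compile(
    r"^feature\.node\.kubernetes\.io/pci-03[0-9a-f]{2}_10de\.present$"
)

def nvidia_gpu_selector(node_labels: dict) -> dict:
    """Return the node labels usable as nodeSelector entries for NVIDIA GPUs."""
    return {
        key: value
        for key, value in node_labels.items()
        if GPU_LABEL.match(key) and value == "true"
    }

# Hypothetical labels from one GPU node
labels = {
    "kubernetes.io/os": "linux",
    "feature.node.kubernetes.io/pci-0302_10de.present": "true",
    "feature.node.kubernetes.io/pci-0300_1d0f.present": "true",
}
selector = nvidia_gpu_selector(labels)
print(selector)
```

An empty result suggests the node carries no label of the expected form, in which case the node's labels should be inspected by hand before changing the daemonset.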