Upgrade Platform Applications
Prerequisites
Before you begin, you must:
Set the WORKSPACE_NAMESPACE environment variable to the name of the workspace's namespace where the cluster is attached:

export WORKSPACE_NAMESPACE=<workspace_namespace>
Set the WORKSPACE_NAME environment variable to the name of the workspace where the cluster is attached:

export WORKSPACE_NAME=<workspace_name>
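Before continuing, it can help to confirm that both variables are set. The following is a minimal sketch (not part of the DKP CLI); the fallback values are placeholders for illustration only:

```shell
# Fail fast if either variable is unset or empty. The example fallback
# values below are placeholders for illustration only.
WORKSPACE_NAMESPACE="${WORKSPACE_NAMESPACE:-example-workspace-ns}"
WORKSPACE_NAME="${WORKSPACE_NAME:-example-workspace}"

[ -n "${WORKSPACE_NAMESPACE}" ] && [ -n "${WORKSPACE_NAME}" ] || {
  echo "WORKSPACE_NAMESPACE and WORKSPACE_NAME must be set" >&2
  exit 1
}
echo "Target workspace: ${WORKSPACE_NAME} (namespace: ${WORKSPACE_NAMESPACE})"
```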
Upgrade Platform Applications from the CLI
The DKP upgrade process deploys and upgrades Platform applications as a bundle for each cluster or workspace. For the Management Cluster or workspace, DKP upgrade handles all Platform applications; no other steps are necessary to upgrade the Platform application bundle. However, for managed or attached clusters or workspaces, you MUST manually upgrade the Platform applications bundle with the following command.
If you are upgrading your Platform applications as part of the DKP upgrade, upgrade your Platform applications on any additional Workspaces before proceeding with the Konvoy upgrade. Some applications in the previous release are not compatible with the Kubernetes version of this release, and upgrading Kubernetes is part of the DKP Konvoy upgrade process.
Prerequisites for upgrading clusters with NVIDIA GPU Nodes
NOTE: The information in this subsection is applicable only to clusters with GPU nodes. If your cluster is not running GPU nodes, skip these steps and proceed to the Execute the DKP Upgrade Command subsection.
If your attached or managed cluster is running GPU nodes and has the nvidia Platform Application deployed, you must complete additional steps before upgrading the workspace. In DKP 2.4, the nvidia application is replaced by nvidia-gpu-operator during the upgrade, and the configuration for the new application must be created before upgrading to ensure the new application is deployed with the proper configuration; otherwise, it may not deploy successfully.
NOTE: The following steps enable the application for all clusters in the workspace and add configuration overrides at the workspace level, which apply to all clusters in the workspace. If only a subset of clusters in the workspace are running GPU nodes, or some clusters require different configurations, refer to Customize an Application per Cluster on how to customize the application with different configuration overrides per cluster.
Create the ConfigMap with the necessary workspace-level configuration overrides to set the correct Toolkit version. For example, if you are using CentOS 7.9 or RHEL 7.9 as the base operating system for your GPU-enabled nodes, set the toolkit.version parameter:

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: ${WORKSPACE_NAMESPACE}
  name: nvidia-gpu-operator-ws-overrides
data:
  values.yaml: |
    toolkit:
      version: v1.10.0-centos7
EOF
Create an AppDeployment named nvidia-gpu-operator in the workspace namespace:

cat <<EOF | kubectl apply -f -
apiVersion: apps.kommander.d2iq.io/v1alpha3
kind: AppDeployment
metadata:
  name: nvidia-gpu-operator
  namespace: ${WORKSPACE_NAMESPACE}
spec:
  appRef:
    kind: ClusterApp
  configOverrides:
    name: nvidia-gpu-operator-ws-overrides
EOF

You may now proceed to the workspace upgrade steps.
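Before upgrading, you may want to confirm that both resources from the previous steps exist in the workspace namespace. The sketch below prints the verification commands to run against the live management cluster; the fallback namespace value is a placeholder:

```shell
# List the two resources created above and print a `kubectl get`
# verification command for each. Run the printed commands against the
# live management cluster.
WORKSPACE_NAMESPACE="${WORKSPACE_NAMESPACE:-example-workspace-ns}"
resources="configmap/nvidia-gpu-operator-ws-overrides appdeployment/nvidia-gpu-operator"

for r in ${resources}; do
  echo "kubectl get ${r} -n ${WORKSPACE_NAMESPACE}"
done
```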
Execute the DKP Upgrade Command
Use this command to upgrade all platform applications in the given workspace and its projects to the same version as the platform applications running on the management cluster:
dkp upgrade workspace ${WORKSPACE_NAME}
An output similar to this appears:
✓ Ensuring HelmReleases are upgraded on clusters in namespace "${WORKSPACE_NAME}"
...
If the upgrade fails or times out, retry the command with higher verbosity to get more information on the upgrade process:
dkp upgrade workspace ${WORKSPACE_NAME} -v 4
If you find any HelmReleases in a "broken" release state such as "exhausted" or "another rollback/release in progress", you can trigger a reconciliation of the HelmRelease using the following commands:
kubectl -n ${WORKSPACE_NAMESPACE} patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": true}]'
kubectl -n ${WORKSPACE_NAMESPACE} patch helmrelease <HELMRELEASE_NAME> --type='json' -p='[{"op": "replace", "path": "/spec/suspend", "value": false}]'
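Since the suspend and resume patches differ only in the boolean value, they can be wrapped in a small helper. The function below is hypothetical (not part of the DKP CLI) and simply builds the JSON patch payload used above:

```shell
# Build the JSON patch used to toggle a HelmRelease's suspend field.
# $1 must be "true" (suspend) or "false" (resume).
suspend_patch() {
  printf '[{"op": "replace", "path": "/spec/suspend", "value": %s}]' "$1"
}

# Usage against a live cluster (requires kubectl access):
# kubectl -n ${WORKSPACE_NAMESPACE} patch helmrelease <HELMRELEASE_NAME> \
#   --type='json' -p="$(suspend_patch true)"
# kubectl -n ${WORKSPACE_NAMESPACE} patch helmrelease <HELMRELEASE_NAME> \
#   --type='json' -p="$(suspend_patch false)"
echo "$(suspend_patch true)"
```

Suspending and then resuming the HelmRelease causes Flux to re-run its reconciliation loop, which clears the stuck release state.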