DKP 2.7.3 Known Issues and Limitations

The following items are known issues with this release.

Upstream Deprecations in DKP 2.7.3

If you are deploying DKP using CentOS 7.9, RHEL 7.9, or Oracle Linux RHCK 7.9 in a networked environment, you must maintain your own OS package mirrors so that machine images can still be built. We cannot depend on upstream artifacts because they have reached their End of Life or use dependencies that have reached their End of Life. NCN-103127

AWS Custom AMI Required for Kubernetes Version

Previous versions of DKP defaulted to using upstream AMIs published by the CAPA (Cluster API Provider AWS) project when building AWS clusters if you did not specify your own AMI. However, those images are not currently available for the Kubernetes version used in the 2.7.3 patch release.

As a result, starting with this release of DKP, the behavior of the dkp create cluster aws command has changed. It no longer defaults to the upstream AMIs; instead, you must either specify an AMI built using Konvoy Image Builder (KIB) or explicitly request that it use the upstream images.
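
For illustration, here is a minimal sketch of passing a KIB-built AMI at cluster creation; the cluster name and AMI ID are placeholders, and the availability and exact spelling of the --ami flag should be confirmed against your DKP CLI version:

CODE
# Create an AWS cluster using an AMI built with Konvoy Image Builder (placeholder values)
dkp create cluster aws \
  --cluster-name=my-aws-cluster \
  --ami=ami-0123456789abcdef0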

For more information on using a custom AMI during cluster creation or the upgrade process, refer to the related topics in the DKP documentation.

Azure Custom Image Required for Kubernetes Version

Previous versions of DKP defaulted to using Virtual Machine images published by the CAPZ (Cluster API Provider Azure) project when building Azure clusters if you did not specify your own image. However, those images are no longer available for the Kubernetes version used in the 2.7.3 patch release. Therefore, you must create a custom Virtual Machine image to use DKP 2.7 with Azure. Attempting to create an Azure cluster without specifying a custom image using the --compute-gallery-id parameter will fail to create a working cluster.
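
For illustration, here is a minimal sketch of passing a custom image at cluster creation; the cluster name and Azure Compute Gallery image ID are placeholders:

CODE
# Create an Azure cluster using a custom image from an Azure Compute Gallery (placeholder values)
dkp create cluster azure \
  --cluster-name=my-azure-cluster \
  --compute-gallery-id="/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Compute/galleries/<gallery-name>/images/<image-name>"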

For more information on using a custom image during cluster creation or the upgrade process, refer to the related topics in the DKP documentation.

Attempting to delete VCD Clusters can hang

Due to a bug in the upstream VCD CAPI provider, attempting to delete a VCD cluster never completes, because the cluster delete operation does not drain the nodes first. To delete a VCD cluster, first delete its node pools, which drains the nodes. After the node pools are deleted, the cluster can be deleted successfully, as sketched below.
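
For illustration, here is a minimal sketch of that deletion order using the DKP CLI; the node pool name, cluster name, and exact flags are placeholders and may differ in your DKP version:

CODE
# Delete each node pool first so that its nodes are drained
dkp delete nodepool my-nodepool --cluster-name=my-vcd-cluster
# Once all node pools are deleted, delete the cluster itself
dkp delete cluster --cluster-name=my-vcd-cluster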

Known CVEs

Since the release of DKP 2.6, DKP has been scanning Catalog applications for CVEs. Beginning with DKP 2.7, only the latest version of each Catalog application is scanned (and has its critical CVEs mitigated). For more information about the known CVEs for compatible catalog applications, refer to D2iQ Security Updates.

CruiseControl Image Required for KafkaCluster Customizations

If you use a custom version of KafkaCluster with Cruise Control, ensure you use the Cruise Control image version required by this version of DKP so that your environment is not vulnerable. See Kafka in a Workspace for more information, and the sketch below for where the image is set in the custom resource.
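
For orientation, here is a minimal sketch of where the Cruise Control image is typically set in a KafkaCluster custom resource; the field layout is assumed to follow the upstream koperator CRD, and the image value is a placeholder rather than the required version:

CODE
apiVersion: kafka.banzaicloud.io/v1beta1
kind: KafkaCluster
metadata:
  name: kafka
  namespace: kafka
spec:
  cruiseControlConfig:
    # Placeholder: use the Cruise Control image version required by this DKP release
    image: <cruise-control-image>:<required-tag>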

Flux v2.0 Upgrade Considerations

Due to API changes in the Flux v2.0 update, depending on your deployment, a manual upgrade must be run on the Project Continuous Delivery (CD) or Catalog Git repositories before the DKP workspace upgrade.

If the upgrade is needed per any of the following conditions, it must be performed after the DKP Kommander upgrade and before the DKP workspace upgrade (see the example check after the list below):

  • Project CD

    • If the Project CD Git repository has any K8s manifests that contain a Flux Kustomization (i.e., apiVersion: kustomize.toolkit.fluxcd.io/v1beta*) AND they use removed fields, then the upgrade procedure must be performed.

    • If the Project CD Git repository has any K8s manifests that include a Flux GitRepository (i.e., apiVersion: source.toolkit.fluxcd.io/v1beta*) AND they use removed fields, then the upgrade procedure must be performed.

    • No action is needed if the manifest in Git does not use removed fields.

  • Catalog GitRepository

    • If the Catalog GitRepository object uses removed fields, then the upgrade procedure must be performed.

    • If the Catalog Git repository contains any K8s manifests that include Flux Kustomizations (i.e., apiVersion: kustomize.toolkit.fluxcd.io/v1beta*) in their catalog app definition AND removed fields are used, then the upgrade procedure must be performed.

    • No action is needed if the manifest in Git does not use removed fields.
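
To help determine whether these conditions apply, a quick check such as the following can be run against a local clone of each repository; the repository path is a placeholder:

CODE
# List manifests that still reference deprecated Flux v1beta* APIs (Kustomization or GitRepository)
grep -rnE 'apiVersion: *(kustomize|source)\.toolkit\.fluxcd\.io/v1beta' /path/to/repository-clone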

Kubeconfig is not displayed correctly for the Kommander Workspace UI

Due to a bug in the DKP 2.7.0 dex-k8s-authenticator image, the management cluster’s kubeconfig is not presented correctly when accessing the “Generate Token” page.

This issue can be solved by applying the following workaround:

  • Export a KUBECONFIG variable that points to the DKP Management Cluster:

CODE
export KUBECONFIG=__MANAGEMENT_CLUSTER_KUBECONFIG_PATH__
  • Execute the following workaround:

CODE
#!/bin/bash
# Tag of the patched dex-k8s-authenticator image
DKA_IMAGE_TAG="v1.3.2-d2iq"
# Do not overwrite an existing config override on the AppDeployment
EXISTING_OVERRIDE=$(kubectl get appdeployments -n kommander dex-k8s-authenticator -o go-template='{{ .spec.configOverrides.name }}')
if [[ $EXISTING_OVERRIDE != "<no value>" ]]; then
  echo "Override already defined ( value=$EXISTING_OVERRIDE ). Not overwriting."
  exit 1
fi
# Create a ConfigMap that overrides the image tag used by dex-k8s-authenticator
cat <<EOF | kubectl apply -f -
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: dex-k8s-authenticator-270-fix
  namespace: kommander
data:
  values.yaml: |-
    ---
    image:
      tag: ${DKA_IMAGE_TAG}
---
EOF
# Point the AppDeployment at the override ConfigMap
kubectl -n kommander patch appdeployment dex-k8s-authenticator --type merge --patch-file <(cat <<EOF
spec:
  configOverrides:
    name: dex-k8s-authenticator-270-fix
EOF
)

This workaround creates an override ConfigMap for the dex-k8s-authenticator configuration, causing it to transition to a patched image.

If you are in an air-gapped environment, you must first pull the mesosphere/dex-k8s-authenticator:v1.3.2-d2iq image from the Docker registry and push it to your local registry before applying the workaround, for example as sketched below.
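
For illustration, here is a minimal sketch of mirroring the image; the local registry address is a placeholder:

CODE
# Pull the patched image, retag it for the local registry, and push it (placeholder registry address)
docker pull mesosphere/dex-k8s-authenticator:v1.3.2-d2iq
docker tag mesosphere/dex-k8s-authenticator:v1.3.2-d2iq <your-registry-address>/mesosphere/dex-k8s-authenticator:v1.3.2-d2iq
docker push <your-registry-address>/mesosphere/dex-k8s-authenticator:v1.3.2-d2iq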

Rook Ceph Install Error

An issue might emerge when installing rook-ceph on vSphere clusters using RHEL operating systems.

This issue occurs during the initial installation of rook-ceph, causing the object store used by Velero and Grafana Loki to be unavailable. If the installation of the Kommander component of DKP is unsuccessful because rook-ceph fails, you might need to apply the following workaround:

  1. Run this command to check if the cluster is affected by this issue.

    CODE
    kubectl describe CephObjectStores dkp-object-store -n kommander
  2. If output similar to the following appears, the workaround needs to be applied; continue with the next step. If you do not see this output, you can stop here.

    CODE
    Name:         dkp-object-store
    Namespace:    kommander
    ...
      Warning  ReconcileFailed     7m55s (x19 over 52m)
      rook-ceph-object-controller  failed to reconcile CephObjectStore
      "kommander/dkp-object-store". failed to create object store deployments: failed
      to configure multisite for object store: failed create ceph multisite for
      object-store ["dkp-object-store"]: failed to commit config changes after
      creating multisite config for CephObjectStore "kommander/dkp-object-store":
      failed to commit RGW configuration period changes%!(EXTRA []string=[]): signal: interrupt
  3. Use kubectl exec to open a shell in the rook-ceph-tools pod.

    CODE
    export WORKSPACE_NAMESPACE=<workspace namespace>
    CEPH_TOOLS_POD=$(kubectl get pods -l app=rook-ceph-tools -n ${WORKSPACE_NAMESPACE} -o name)
    kubectl exec -it -n ${WORKSPACE_NAMESPACE} ${CEPH_TOOLS_POD} -- bash
  4. Run these commands to set dkp-object-store as the default zonegroup.
    NOTE: The period update command may take a few minutes to complete.

    CODE
    radosgw-admin zonegroup default --rgw-zonegroup=dkp-object-store
    radosgw-admin period update --commit
  5. Next, restart the rook-ceph-operator deployment so that the CephObjectStore is reconciled.

    CODE
    kubectl rollout restart deploy -n ${WORKSPACE_NAMESPACE} rook-ceph-operator
  6. After running the commands above, the CephObjectStore should be Connected once the rook-ceph operator reconciles the object (this may take some time). You can wait for this with the following command:

    CODE
    kubectl wait CephObjectStore --for=jsonpath='{.status.phase}'=Connected dkp-object-store -n ${WORKSPACE_NAMESPACE} --timeout 10m