DKP 2.7.1 Known Issues and Limitations
The following items are known issues with this release.
AWS Custom AMI Required for Kubernetes Version
Previous versions of DKP defaulted to the upstream AMIs published by the CAPA (Cluster API Provider AWS) project when building AWS clusters if you did not specify your own AMI. However, those images are not currently available for the Kubernetes version used in the 2.7.1 patch release.
As a result, starting with this release of DKP, the behavior of the dkp create cluster aws command has changed. It no longer defaults to the upstream AMIs. Instead, you must either specify an AMI built using Konvoy Image Builder (KIB) or explicitly request the upstream images.
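As a rough sketch of what specifying an AMI explicitly might look like, the helper below checks that a value has the shape of an AWS AMI ID and then prints the create command instead of running it. The function names, cluster name, and AMI ID are placeholders, and the --cluster-name and --ami flags are assumptions to confirm against dkp create cluster aws --help for your DKP version.

```shell
# Sketch only: function names and all values passed in are placeholders.

# Return 0 if the argument looks like an AWS AMI ID
# ("ami-" followed by 8 or 17 lowercase hex digits).
is_ami_id() {
  printf '%s\n' "$1" | grep -Eq '^ami-([0-9a-f]{8}|[0-9a-f]{17})$'
}

# Print (do not run) a create command that names the AMI explicitly.
# Assumption: --cluster-name and --ami are the relevant flags; confirm
# with `dkp create cluster aws --help` before using the printed command.
print_create_cmd() {
  local cluster_name="$1" ami_id="$2"
  if ! is_ami_id "$ami_id"; then
    echo "not a valid AMI ID: $ami_id" >&2
    return 1
  fi
  echo "dkp create cluster aws --cluster-name=${cluster_name} --ami=${ami_id}"
}
```

For example, print_create_cmd my-cluster ami-0123456789abcdef0 prints the command for review, which lets you catch a mistyped AMI ID before any cluster resources are created.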
For more information on using a custom AMI in cluster creation or during the upgrade process, refer to the custom AMI topics in the DKP documentation.
vSphere and Ubuntu using Konvoy Image Builder Issue
Building a Konvoy Image Builder (KIB) image on vSphere using Ubuntu 20.04 with cloud-init 23.3.3-0ubuntu0~20.04.1 fails in all 2.7.x versions of DKP. The issue will be resolved in an upcoming KIB patch release.
Attempting to delete VCD Clusters can hang
Due to a bug in the upstream VCD CAPI provider, attempting to delete a VCD cluster will never complete, because the cluster delete operation does not attempt to drain the nodes first. To delete VCD clusters, first delete the nodepools, which will drain the nodes. After the nodepools are deleted, the cluster can be successfully deleted.
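The ordering described above can be sketched as a small wrapper that deletes each node pool before the cluster itself. The function name, cluster name, and node pool names are hypothetical, and the dkp delete nodepool and dkp delete cluster invocations are assumptions to verify against dkp delete --help.

```shell
# Sketch only: names are placeholders, and the dkp subcommands and flags
# used here are assumptions to check against `dkp delete --help`.

# Delete every named node pool first (which drains its nodes),
# then delete the cluster. Stops at the first node pool that fails.
delete_vcd_cluster() {
  local cluster_name="$1"
  shift
  local nodepool
  for nodepool in "$@"; do
    dkp delete nodepool "$nodepool" --cluster-name="$cluster_name" || return 1
  done
  dkp delete cluster --cluster-name="$cluster_name"
}
```

A call such as delete_vcd_cluster my-vcd-cluster md-0 md-1 would then remove the two node pools before attempting the cluster deletion.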
Known CVEs
Since the release of DKP 2.6, DKP has been scanning Catalog applications for CVEs. Beginning with DKP 2.7, only the latest version of each Catalog application is scanned (and has its critical CVEs mitigated). For more information about the known CVEs for compatible Catalog applications, refer to D2iQ Security Updates.
CruiseControl Image Required for KafkaCluster Customizations
If you use a custom version of KafkaCluster with cruise.control, use the custom resource image version required for this version of DKP so that your environment is not vulnerable. See Kafka in a Workspace for more information.
Flux v2.0 Upgrade Considerations
Due to API changes in the Flux v2.0 update, depending on your deployment, a manual upgrade must be run in the Project Continuous Delivery (CD) or Catalog Git repositories before the DKP workspace upgrade.
If the upgrade is needed per any of the following conditions, it must be performed after the DKP Kommander upgrade and before the DKP workspace upgrade.
Project CD
If the Project CD Git repository has any K8s manifests that contain a Flux Kustomization (i.e. apiVersion: source.toolkit.fluxcd.io/v1beta*) AND they use removed fields, then the upgrade procedure must be performed.
If the Project CD Git repository has any K8s manifests that include a Flux GitRepository AND they use removed fields, then the upgrade procedure must be performed.
No action is needed if the manifests in Git do not use removed fields.
Catalog GitRepository
If the Catalog GitRepository object uses removed fields, then the upgrade procedure must be performed.
If the Catalog Git repository contains any K8s manifests that include Flux Kustomizations (i.e. apiVersion: source.toolkit.fluxcd.io/v1beta*) in their catalog app definition AND removed fields are used, then the upgrade procedure must be performed.
No action is needed if the manifests in Git do not use removed fields.
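One possible first pass when deciding whether a repository is affected is to list the manifests that still reference a beta Flux API version. The sketch below assumes a local checkout of the Project CD or Catalog Git repository; the function name is hypothetical, and matching any *.toolkit.fluxcd.io/v1beta* apiVersion is an assumption made for illustration. A match only narrows down which files to inspect; it does not by itself show that removed fields are in use.

```shell
# List YAML files under a checkout that reference a beta Flux API version.
# Assumption: any "<group>.toolkit.fluxcd.io/v1beta<N>" apiVersion is worth
# reviewing. Whether removed fields are used must still be checked by hand.
find_flux_beta_manifests() {
  local repo_dir="${1:-.}"
  grep -rlE --include='*.yaml' --include='*.yml' \
    'apiVersion:[[:space:]]*[a-z]+\.toolkit\.fluxcd\.io/v1beta[0-9]+' \
    "$repo_dir"
}
```

Running find_flux_beta_manifests ./my-repo prints one file path per line and exits non-zero when nothing matches.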
KIB image build error for CentOS 7.9 in GCP
Due to a change in the Python version required by the Google Cloud SDK, CentOS 7.9 KIB image builds in GCP may fail during the Execute install-gcloud.sh task with the following error:
TASK [providers : Execute install-gcloud.sh] ***********************************
googlecompute.kib_image: fatal: [default]: FAILED! => {"changed": true, "cmd": "bash -o errexit -o pipefail /tmp/install-gcloud.sh --disable-prompts --install-dir=/", "delta": "0:00:05.720660", "end": "2023-12-06 20:04:36.240760", "msg": "non-zero return code", "rc": 1, "start": "2023-12-06 20:04:30.520100", "stderr": "\r###########
. . .
. . .
"mkdir -p /", "tar -C / -zxvf /tmp/tmp.3e2sumYWjQ/google-cloud-sdk.tar.gz", "//google-cloud-sdk/install.sh",
"WARNING: Python 3.6.x is no longer officially supported by the Google Cloud CLI", "and may not function correctly. Please use Python version 3.8 and up.", "", "If you have a compatible Python interpreter installed, you can use it by setting", "the CLOUDSDK_PYTHON environment variable to point to it.", "", "Traceback (most recent call last):",
" File \"//google-cloud-sdk/bin/bootstrapping/install.py\", line 30, in <module>",
" from googlecloudsdk import gcloud_main", " File \"/google-cloud-sdk/lib/googlecloudsdk/gcloud_main.py\", line 35, in <module>",
" from googlecloudsdk.calliope import cli", " File \"/google-cloud-sdk/lib/googlecloudsdk/calliope/cli.py\", line 32, in <module>",
" from googlecloudsdk.calliope import backend", " File \"/google-cloud-sdk/lib/googlecloudsdk/calliope/backend.py\", line 41, in <module>", " from googlecloudsdk.calliope.concepts import handlers",
" File \"/google-cloud-sdk/lib/googlecloudsdk/calliope/concepts/handlers.py\", line 22, in <module>",
" from googlecloudsdk.calliope.concepts import concepts", " File \"/google-cloud-sdk/lib/googlecloudsdk/calliope/concepts/concepts.py\", line 46, in <module>",
" from googlecloudsdk.command_lib.util.apis import registry", " File \"/google-cloud-sdk/lib/googlecloudsdk/command_lib/util/apis/registry.py\", line 26, in <module>",
" from googlecloudsdk.api_lib.util import apis", " File \"/google-cloud-sdk/lib/googlecloudsdk/api_lib/util/apis.py\", line 24, in <module>",
" from google.api_core import exceptions as api_core_exceptions", " File \"/google-cloud-sdk/lib/third_party/google/api_core/exceptions.py\", line 29, in <module>",
" from google.rpc import error_details_pb2", " File \"/google-cloud-sdk/lib/third_party/google/rpc/__init__.py\", line 18, in <module>", " import pkg_resources",
" File \"/google-cloud-sdk/lib/third_party/pkg_resources/__init__.py\", line 85, in <module>", " from pkg_resources.extern import platformdirs", " File \"<frozen importlib._bootstrap>\", line 971, in _find_and_load",
" File \"<frozen importlib._bootstrap>\", line 955, in _find_and_load_unlocked", " File \"<frozen importlib._bootstrap>\", line 658, in _load_unlocked",
" File \"<frozen importlib._bootstrap>\", line 571, in module_from_spec",
" File \"/google-cloud-sdk/lib/third_party/pkg_resources/extern/__init__.py\", line 52, in create_module",
" return self.load_module(spec.name)", " File \"/google-cloud-sdk/lib/third_party/pkg_resources/extern/__init__.py\", line 37, in load_module",
" __import__(extant)", " File \"/google-cloud-sdk/lib/third_party/pkg_resources/_vendor/platformdirs/__init__.py\", line 5",
" from __future__ import annotations", " ^", "SyntaxError: future feature annotations is not defined"]
We recommend choosing another supported OS variant when deploying DKP clusters to GCP.
Kubeconfig is not displayed correctly for the Kommander Workspace UI
Due to a bug in the DKP 2.7.0 dex-k8s-authenticator image, the management cluster’s kubeconfig is not presented correctly on the “Generate Token” page.
To resolve this issue, apply the following workaround:
Export a KUBECONFIG variable that points to the DKP Management Cluster:
export KUBECONFIG=__MANAGEMENT_CLUSTER_KUBECONFIG_PATH__
Run the following script:
#!/bin/bash
DKA_IMAGE_TAG="v1.3.2-d2iq"

# Abort if the AppDeployment already has a config override, to avoid overwriting it.
EXISTING_OVERRIDE=$(kubectl get appdeployments -n kommander dex-k8s-authenticator -o go-template='{{ .spec.configOverrides.name }}')
if [[ $EXISTING_OVERRIDE != "<no value>" ]]; then
  echo "Override already defined ( value=$EXISTING_OVERRIDE ). Not overwriting."
  exit 1
fi

# Create the override ConfigMap that selects the patched image tag.
cat <<EOF | kubectl apply -f -
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: dex-k8s-authenticator-270-fix
  namespace: kommander
data:
  values.yaml: |-
    ---
    image:
      tag: ${DKA_IMAGE_TAG}
---
EOF

# Point the AppDeployment at the override ConfigMap.
kubectl -n kommander patch appdeployment dex-k8s-authenticator --type merge --patch-file <(cat <<EOF
spec:
  configOverrides:
    name: dex-k8s-authenticator-270-fix
EOF
)
This workaround creates an override ConfigMap for the dex-k8s-authenticator configuration, which makes it switch to a patched image.
If you are in an air-gapped environment, you must first pull the mesosphere/dex-k8s-authenticator:v1.3.2-d2iq image from the Docker registry and push it to your local registry before applying the workaround.
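For the air-gapped case, the pull and push can be sketched with standard docker commands. The function name and the registry address are placeholders for your own environment, and the sketch assumes it runs on a machine that can reach both the public registry and your local one.

```shell
# Sketch only: the registry address passed in is a placeholder for your
# local registry. Pull the patched image, retag it, and push it.
mirror_dka_image() {
  local registry="$1"
  local image="mesosphere/dex-k8s-authenticator:v1.3.2-d2iq"
  docker pull "$image" &&
    docker tag "$image" "${registry}/${image}" &&
    docker push "${registry}/${image}"
}
```

For example, mirror_dka_image registry.example.internal:5000 would publish the image as registry.example.internal:5000/mesosphere/dex-k8s-authenticator:v1.3.2-d2iq.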
Pre-provisioned Ubuntu Non-air-gapped Konvoy Image Builder Provisioning Error
Due to an upstream repository change, Konvoy Image Builder (KIB) cannot install the OS packages required for DKP when provisioning each machine. This can result in failures when creating Pre-provisioned Ubuntu non-air-gapped clusters, or when modifying their configuration in a way that machines must be provisioned.
New DKP patch releases 2.6.2 and 2.7.1 contain corresponding Konvoy Image Builder releases that include the changes necessary to provision machines successfully.
Rook Ceph Install Error
An issue might emerge when installing rook-ceph on vSphere clusters using RHEL operating systems.
This issue occurs during the initial installation of rook-ceph and causes the object store used by Velero and Grafana Loki to be unavailable. If the installation of the Kommander component of DKP is unsuccessful because rook-ceph fails, you might need to apply the following workaround:
Run this command to check if the cluster is affected by this issue:
kubectl describe CephObjectStores dkp-object-store -n kommander
If the following output appears, apply the workaround by continuing with the next step. If you do not see this output, you can stop at this step.
Name: dkp-object-store
Namespace: kommander
...
Warning ReconcileFailed 7m55s (x19 over 52m) rook-ceph-object-controller failed to reconcile CephObjectStore "kommander/dkp-object-store". failed to create object store deployments: failed to configure multisite for object store: failed create ceph multisite for object-store ["dkp-object-store"]: failed to commit config changes after creating multisite config for CephObjectStore "kommander/dkp-object-store": failed to commit RGW configuration period changes%!(EXTRA []string=[]): signal: interrupt
Use kubectl exec to open a shell in the rook-ceph-tools pod:
export WORKSPACE_NAMESPACE=<workspace namespace>
CEPH_TOOLS_POD=$(kubectl get pods -l app=rook-ceph-tools -n ${WORKSPACE_NAMESPACE} -o name)
kubectl exec -it -n ${WORKSPACE_NAMESPACE} $CEPH_TOOLS_POD -- bash
Run these commands to set dkp-object-store as the default zonegroup.
NOTE: The period update command may take a few minutes to complete.
radosgw-admin zonegroup default --rgw-zonegroup=dkp-object-store
radosgw-admin period update --commit
Next, exit the rook-ceph-tools pod and restart the rook-ceph-operator deployment so that the CephObjectStore is reconciled:
kubectl rollout restart deploy -n ${WORKSPACE_NAMESPACE} rook-ceph-operator
After running the commands above, the CephObjectStore should be Connected once the rook-ceph operator reconciles the object (this may take some time):
kubectl wait CephObjectStore --for=jsonpath='{.status.phase}'=Connected dkp-object-store -n ${WORKSPACE_NAMESPACE} --timeout 10m
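The check in the first step of this workaround can also be scripted. The minimal sketch below reads kubectl describe output from standard input and looks for the failure message quoted earlier. The function name is hypothetical, and matching on that one message is an assumption about what identifies an affected cluster.

```shell
# Return 0 if the text on stdin contains the RGW period-commit failure
# message from the ReconcileFailed event shown in this section.
# Reading from stdin keeps the check usable without direct cluster access.
is_object_store_affected() {
  grep -q 'failed to commit RGW configuration period changes'
}
```

Typical usage would be to pipe the describe command into it, for example: kubectl describe CephObjectStores dkp-object-store -n kommander | is_object_store_affected && echo "workaround needed".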