DKP 2.8.2 Known Issues and Limitations
The following items are known issues with this release.
Upstream Deprecations in DKP 2.8.2
If you are deploying DKP using CentOS 7.9, RHEL 7.9, or Oracle Linux RHCK 7.9 in a networked environment, you must maintain your own OS package mirrors so that you can build images. DKP cannot depend on upstream artifacts because they have reached End of Life or use dependencies that have reached End of Life. NCN-103128
CAPV controller can deadlock
A known bug in the upstream Cluster API vSphere provider (CAPV) can cause the CAPV controller to deadlock. This can block machines from being provisioned; the observed behavior is that new machines are stuck in the Provisioning state. Restarting the capv-controller-manager allows provisioning to continue. NCN-101990
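One way to perform the restart is a rollout restart of the controller deployment, run against the cluster that hosts the Cluster API controllers. The sketch below is illustrative only: the capv-system namespace is the upstream CAPV default and is an assumption here, and restart_capv and CAPV_NAMESPACE are hypothetical names.

```shell
# Sketch: restart the CAPV controller and wait for the new pod to be ready.
# Assumption: the deployment runs in the upstream default namespace
# "capv-system"; set CAPV_NAMESPACE if your cluster uses a different one.
restart_capv() {
  ns="${CAPV_NAMESPACE:-capv-system}"
  kubectl rollout restart deployment capv-controller-manager -n "$ns" &&
    kubectl rollout status deployment capv-controller-manager -n "$ns" --timeout=5m
}
```

Call restart_capv after sourcing the snippet, then check that the stuck machines leave the Provisioning state.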
Custom Banner on Cluster is lost during an upgrade
If you used the Custom Banner UI functionality in a previous version of DKP and then upgrade to DKP 2.8, the banner is lost during the upgrade and must be recreated.
Rook Ceph Install Error
An issue might emerge when installing rook-ceph on vSphere clusters using RHEL operating systems.
This issue occurs during the initial installation of rook-ceph and causes the object store used by Velero and Grafana Loki to be unavailable. If the installation of the Kommander component of DKP is unsuccessful because rook-ceph fails, you might need to apply the following workaround:
Run this command to check if the cluster is affected by this issue.
    kubectl describe CephObjectStores dkp-object-store -n kommander
If the following output appears, the cluster is affected: continue with the next step to apply the workaround. If you do not see this output, you can stop at this step.
    Name: dkp-object-store
    Namespace: kommander
    ...
    Warning ReconcileFailed 7m55s (x19 over 52m) rook-ceph-object-controller failed to reconcile CephObjectStore "kommander/dkp-object-store". failed to create object store deployments: failed to configure multisite for object store: failed create ceph multisite for object-store ["dkp-object-store"]: failed to commit config changes after creating multisite config for CephObjectStore "kommander/dkp-object-store": failed to commit RGW configuration period changes%!(EXTRA []string=[]): signal: interrupt
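If you want to script the check rather than read the describe output by eye, a minimal sketch is to search that output for the characteristic failure message. The function names is_affected and check_object_store are hypothetical, and matching on this one substring is an assumption about what reliably identifies the issue.

```shell
# Sketch: detect the multisite commit failure in `kubectl describe` output.
# is_affected reads text on stdin, so it can be exercised without a cluster.
is_affected() {
  grep -q "failed to commit RGW configuration period changes"
}

check_object_store() {
  if kubectl describe CephObjectStores dkp-object-store -n kommander | is_affected; then
    echo "affected: apply the workaround"
  else
    echo "not affected: no action needed"
  fi
}
```

Call check_object_store after sourcing the snippet; it prints one of the two messages.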
Use kubectl exec to open a shell in the rook-ceph-tools pod.
    export WORKSPACE_NAMESPACE=<workspace namespace>
    CEPH_TOOLS_POD=$(kubectl get pods -l app=rook-ceph-tools -n ${WORKSPACE_NAMESPACE} -o name)
    kubectl exec -it -n ${WORKSPACE_NAMESPACE} $CEPH_TOOLS_POD -- bash
Run these commands to set dkp-object-store as the default zonegroup.
NOTE: The period update command may take a few minutes to complete.
    radosgw-admin zonegroup default --rgw-zonegroup=dkp-object-store
    radosgw-admin period update --commit
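If an interactive shell is not convenient (for example, in automation), the same two radosgw-admin commands can be passed to kubectl exec directly. This is a sketch under assumptions: WORKSPACE_NAMESPACE is already exported, exactly one rook-ceph-tools pod matches the label selector, and the function name set_default_zonegroup is hypothetical.

```shell
# Sketch: run the zonegroup commands in the tools pod without an interactive shell.
# Assumes WORKSPACE_NAMESPACE is exported and a single rook-ceph-tools pod exists.
set_default_zonegroup() {
  pod=$(kubectl get pods -l app=rook-ceph-tools -n "${WORKSPACE_NAMESPACE}" -o name) || return 1
  kubectl exec -n "${WORKSPACE_NAMESPACE}" "$pod" -- \
    radosgw-admin zonegroup default --rgw-zonegroup=dkp-object-store &&
  kubectl exec -n "${WORKSPACE_NAMESPACE}" "$pod" -- \
    radosgw-admin period update --commit
}
```

The second command runs only if the first succeeds, so a failure to set the default zonegroup does not commit a period update.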
Next, exit the rook-ceph-tools pod shell and restart the rook-ceph-operator deployment so that the CephObjectStore is reconciled.
    kubectl rollout restart deploy -n ${WORKSPACE_NAMESPACE} rook-ceph-operator
After running the commands above, the CephObjectStore should be Connected once the rook-ceph operator reconciles the object (this may take some time).
    kubectl wait CephObjectStore --for=jsonpath='{.status.phase}'=Connected dkp-object-store -n ${WORKSPACE_NAMESPACE} --timeout 10m
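Older kubectl clients may not support a jsonpath condition in kubectl wait. In that case, polling the same status field is a possible fallback. The sketch below assumes WORKSPACE_NAMESPACE is exported; wait_for_connected, MAX_TRIES, and SLEEP_SECS are hypothetical names, with defaults chosen to match the 10-minute timeout above.

```shell
# Sketch: poll .status.phase of the CephObjectStore until it is Connected.
# MAX_TRIES and SLEEP_SECS are illustrative defaults (60 x 10s = 10 minutes).
wait_for_connected() {
  tries="${MAX_TRIES:-60}"
  while [ "$tries" -gt 0 ]; do
    phase=$(kubectl get CephObjectStore dkp-object-store \
      -n "${WORKSPACE_NAMESPACE}" -o jsonpath='{.status.phase}')
    if [ "$phase" = "Connected" ]; then
      echo "CephObjectStore is Connected"
      return 0
    fi
    tries=$((tries - 1))
    sleep "${SLEEP_SECS:-10}"
  done
  echo "timed out; last phase: ${phase:-unknown}" >&2
  return 1
}
```

The function returns a non-zero status on timeout, so it can gate later steps in a script.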