DKP 2.6.0 Known Issues and Limitations

The following items are known issues with this release.

AWS `additionalTags` cannot contain spaces

Due to an upstream bug in the cluster-api-provider-aws component, it is not possible to specify tags with spaces in their name in the additionalTagssection of an AWSCluster. If you have any tags like this during an upgrade of the capi-components, you may receive a validation error, and will need to remove any such tags. This issue will be corrected in a future DKP release.

Increased Time to Attach Several Clusters

When attaching clusters in a batch, the wait time for applications to deploy and become ready might take longer compared to previous releases. This issue is actively being addressed and is scheduled to be resolved in an upcoming DKP patch release. Please do not interrupt the attachment process.

Use Static Credentials to Provision an Azure Cluster

Only static credentials can be used when provisioning an Azure cluster.

Containerd File Limit Issue

In this version of DKP, we introduced containerd 1.6.17. The systemd unit for containerd 1.6.17 provided upstream removes all file number limits (LimitNOFILE=infinity). In our testing, we found that removing these limits broke some IO-sensitive applications such as Rook Ceph and HAProxy. Because of this, the KIB version included in this release sets the LimitNOFILE value in the containerd systemd unit to the value (1048576), which was used in previous containerd 1.4.13 version releases.

Intermittent Error Status when Creating EKS Clusters in the UI

When provisioning an EKS cluster through the UI, you might receive a brief error state because the EKS cluster can sporadically lose connectivity with the management cluster, which results in the following symptoms:

The UI shows the cluster is in an error state.
The kubeconfig generated and retrieved from Kommander ceases to work.
Applications created on the management cluster may not be immediately federated to managed EKS clusters.

After a few moments, the error self-resolves, without any action on your part. A new kubeconfig generated and retrieved from Kommander then works properly, and the UI shows that it is working again. In the meantime, you can continue to use the UI to work on the cluster such as deploy applications, create projects, and add roles.

Limitations to Disk Resizing in vSphere

The DKP CLI flags --control-plane-disk-size and --worker-disk-size are unable to resize the root file system of VMs, which are created using OS images. The flags work by resizing the primary disk of the VM. When the VM boots, the root file system expands to fill the disk, but that expansion does not work for some file systems. For example, for file systems contained in an LVM Logical Volume. To ensure your root file system has the size you expect, see Create a vSphere Base OS Image | Disk-Size.

Error Status in Grafana Logging Dashboard with EKS Clusters

Currently, it is not possible to use FluentBit to collect Admin-level logs on a managed EKS cluster.

If you have these logs enabled, the following message appears when you access the Kubernetes Audit Dashboard in the Grafana Logging Dashboard:

CODE

Cannot read properties of undefined (reading '0')

Rook Ceph Install Error

An issue might emerge when installing rook-ceph on vSphere clusters using RHEL operating systems.

This issue occurs during initial installation of rook-ceph, causing the object store used by Velero and Grafana Loki, to be unavailable. If the installation of Kommander component of DKP is unsuccessful due to rook-ceph failing, you might need to apply the following workaround:

Run this command to check if the cluster is affected by this issue.
CODE
```
kubectl describe CephObjectStores dkp-object-store -n kommander
```

If this output appears, the workaround needs to be applied so continue with the next step. If you do not see this output, you can stop at this step.

CODE

Name:         dkp-object-store
Namespace:    kommander
...
  Warning  ReconcileFailed     7m55s (x19 over 52m)
  rook-ceph-object-controller  failed to reconcile CephObjectStore
  "kommander/dkp-object-store". failed to create object store deployments: failed
  to configure multisite for object store: failed create ceph multisite for
  object-store ["dkp-object-store"]: failed to commit config changes after
  creating multisite config for CephObjectStore "kommander/dkp-object-store":
  failed to commit RGW configuration period changes%!(EXTRA []string=[]): signal: interrupt

Kubectl exec into the rook-ceph-tools pod.

CODE

export WORKSPACE_NAMESPACE=<workspace namespace>
CEPH_TOOLS_POD=$(kubectl get pods -l app=rook-ceph-tools -n ${WORKSPACE_NAMESPACE} -o name)
kubectl exec -it -n ${WORKSPACE_NAMESPACE} $CEPH_TOOLS_POD bash

Run these commands to set dkp-object-store as the default zonegroup.
NOTE: The period update command may take a few minutes to complete
CODE
```
radosgw-admin zonegroup default --rgw-zonegroup=dkp-object-store
radosgw-admin period update --commit
```
Next, restart the rook-ceph-operator deployment for the CephobjectStore to be reconciled.
CODE
```
kubectl rollout restart deploy -n${WORKSPACE_NAMESPACE} rook-ceph-operator
```
After running the commands above, the CephObjectStore should be Connected once the rook-ceph operator reconciles the object (this may take some time).
CODE
```
kubectl wait CephObjectStore --for=jsonpath='{.status.phase}'=Connected dkp-object-store -n ${WORKSPACE_NAMESPACE} --timeout 10m
```

Kommander Installation Configuration File Changes

In this release, the kommander-ui, which provides the DKP Dashboard, is now deployed in the same manner as the other Platform Applications. Therefore, the contents of the default Kommander installation configuration file, which is the file produced by the dkp install kommander --init command, have changed. If you installed previous versions of DKP using a customized Kommander installation configuration file, we recommend:

Generate a new template for this release.
Reapply your customizations rather than reusing a file created by older DKP versions.

FluentD Logging Operator Error

An error may occur with the FluentD Operator not appearing when installing Kommander on a GCP Environment.

This is due to a limitation of the GCP API. As a result, you need to rename the Fluentd buffer name because the default name exceeds 63 characters and disable buffer volume metrics for FluentD in order for it to operate properly.

Follow these procedures to resolve this issue for the Management Cluster, and Managed or Attached Clusters.

Management Cluster on GCP with the Logging Stack enabled

Create the kommander namespace:
CODE
```
kubectl create namespace kommander
```

Create the logging-operator-logging-overrides ConfigMap

CODE

cat <<EOF | kubectl apply -n kommander -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: logging-operator-logging-overrides
data:
  values.yaml: |
    fluentd:
      # disable buffer metrics
      bufferVolumeMetrics: null
      bufferStorageVolume:
        pvc:
          source:
            # update this name to make the PV <= 63 chars long
            # logging-operator-logging-buf-logging-operator-logging-fluentd-0
            claimName: buf
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 10Gi
            volumeMode: Filesystem
EOF

Continue installing Kommander as normal.

Managed/Attached Cluster in GCP with Logging Stack enabled

Set the WORKSPACE_NAMESPACE environment variable to the name of your workspace’s namespace:
CODE
```
export WORKSPACE_NAMESPACE=<workspace namespace>
```

Create the logging-operator-logging-overrides ConfigMap on the managed or attached cluster prior to enabling the logging-operator application:

CODE

cat <<EOF | kubectl apply -n ${WORKSPACE_NAMESPACE} -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: logging-operator-logging-overrides
data:
  values.yaml: |
    fluentd:
      # disable buffer metrics
      bufferVolumeMetrics: null
      bufferStorageVolume:
        pvc:
          source:
            # update this name to make the PV <= 63 chars long
            # logging-operator-logging-buf-logging-operator-logging-fluentd-0
            claimName: buf
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 10Gi
            volumeMode: Filesystem
EOF

Proceed with installing the logging-operator as normal.

Generate a Token or Enable the Konvoy Credentials Plugin for Proxied Clusters

If you have attached a network-restricted cluster and enabled a proxied access to make its resources available through the Management cluster, the “Generate Token” and “Konvoy credentials plugin instructions” options required for user authentication in the DKP UI require further steps.

Expand the set of instructions depending on the authentication configuration you want to configure:

A. Generate a Token

Generate Token

I. Execute the commands established in the “Generate Token” page.

You can find the “Generate Token” page in the DKP UI by selecting your username in the top right corner of any page.

II. Prepare your Environment

Write down the name of your credentials. You can obtain this name by accessing the DKP UI > username in the top right corner > Generate token page. The credentials are displayed in the kubectl config set-credentials command, next to set-credentials.
In the following example, jane_doe-az189abcd7ab34a2abc0ab845a67a7ea-0123456789.us-west-2.elb.amazonaws.com would be the name of your credentials:
CODE
```
kubectl config set-credentials jane_doe-az189abcd7ab34a2abc0ab845a67a7ea-0123456789.us-west-2.elb.amazonaws.com \
    --token=[...] 
```
Set the CONFIG_CREDENTIALS_NAME environment variable to the credentials you obtained in the previous step:
CODE
```
CONFIG_CREDENTIALS_NAME=<CREDENTIALS_NAME>
```
Set WORKSPACE_NAMESPACE to the namespace of your network-restricted cluster:
CODE
```
export WORKSPACE_NAMESPACE=<workspace namespace>
```
Set TUNNEL_PROXY_NAME to the name of the TunnelProxy:
CODE
```
TUNNEL_PROXY_NAME=<...>
```
Set DEFAULT_CONFIG_PATH to the path file of the kubeconfig where you executed the “Generate Token” steps:
CODE
```
DEFAULT_CONFIG_PATH=<path_of_cluster's_kubeconfig>
```
With your kubeconfig set to the Management cluster, set the TUNNEL_PROXY_DOMAIN environment variable to the proxied access URL:
See Provide Context for Commands with a kubeconfig File for information on setting the kubeconfig to a cluster.
CODE
```
TUNNEL_PROXY_DOMAIN=$(kubectl get tunnelproxy $TUNNEL_PROXY_NAME -n $WORKSPACE_NAMESPACE -o template='{{ .status.clusterProxyDomain }}')
```

III. Reconfigure the Token

Select one of the following options according to the certificate you configured for the proxied access:

A. Clusters with ACME-supported Certificates for the `TunnelProxy`

CODE

kubectl config set-cluster ${TUNNEL_PROXY_NAME} --server=https://${TUNNEL_PROXY_DOMAIN}/dkp/api-server --kubeconfig ${DEFAULT_CONFIG_PATH}
kubectl config set-context ${TUNNEL_PROXY_NAME} --cluster=${TUNNEL_PROXY_NAME}  --user=${CONFIG_CREDENTIALS_NAME} --kubeconfig ${DEFAULT_CONFIG_PATH}
kubectl config use-context ${TUNNEL_PROXY_NAME} --kubeconfig ${DEFAULT_CONFIG_PATH}

B. Clusters with Self-Signed Certificates for the `TunnelProxy`

With your kubeconfig set to the Management cluster, retrieve the ca.crt (not needed in case of ACME certificates):
See Provide Context for Commands with a kubeconfig File for information on setting the kubeconfig to a cluster.

CODE

LABEL_KEY=$(kubectl get tunnelproxy $TUNNEL_PROXY_NAME -n $WORKSPACE_NAMESPACE -o template='{{ .status.reverseProxyReleaseName }}')
SECRET_NAME=$(kubectl get ingress -n $WORKSPACE_NAMESPACE -l app.kubernetes.io/instance=$LABEL_KEY -o jsonpath='{.items[0].spec.tls[0].secretName}')
mkdir -p ${HOME}/.kube/certs/${TUNNEL_PROXY_NAME}
kubectl get secrets $SECRET_NAME -n $WORKSPACE_NAMESPACE -o template='{{ index .data "ca.crt" | base64decode }}' > ${HOME}/.kube/certs/${TUNNEL_PROXY_NAME}/ca.crt

Configure the kubeconfig access via the proxied access:

CODE

kubectl config set-cluster ${TUNNEL_PROXY_NAME} --server=https://${TUNNEL_PROXY_DOMAIN}/dkp/api-server --certificate-authority=${HOME}/.kube/certs/${TUNNEL_PROXY_NAME}/ca.crt --kubeconfig ${DEFAULT_CONFIG_PATH}
kubectl config set-context ${TUNNEL_PROXY_NAME} --cluster=${TUNNEL_PROXY_NAME}  --user=${CONFIG_CREDENTIALS_NAME} --kubeconfig ${DEFAULT_CONFIG_PATH}
kubectl config use-context ${TUNNEL_PROXY_NAME} --kubeconfig ${DEFAULT_CONFIG_PATH}

C. Clusters with Custom Certificates for the `TunnelProxy`

Copy the ca.crt used to sign your certificates to the kubeconfig directory:

CODE

mkdir -p ${HOME}/.kube/certs/${TUNNEL_PROXY_NAME}
cp ca.crt ${HOME}/.kube/certs/${TUNNEL_PROXY_NAME}/ca.crt

Configure the kubeconfig access via the proxied access:

CODE

kubectl config set-cluster ${TUNNEL_PROXY_NAME} --server=https://${TUNNEL_PROXY_DOMAIN}/dkp/api-server --certificate-authority=${HOME}/.kube/certs/${TUNNEL_PROXY_NAME}/ca.crt --kubeconfig ${DEFAULT_CONFIG_PATH}
kubectl config set-context ${TUNNEL_PROXY_NAME} --cluster=${TUNNEL_PROXY_NAME}  --user=${CONFIG_CREDENTIALS_NAME} --kubeconfig ${DEFAULT_CONFIG_PATH}
kubectl config use-context ${TUNNEL_PROXY_NAME} --kubeconfig ${DEFAULT_CONFIG_PATH}

After the commands complete, you can perform kubectl actions on the network-restricted cluster.

B. Konvoy credentials plugin instructions

Konvoy credentials plugin instructions

I. Execute the commands provided in the “Konvoy credentials plugin instructions” page > “Method 1” section.

You can find a link to the “Konvoy credentials plugin instructions” page in the DKP UI when logging in to your environment, or by adding /token/plugin to the DKP UI URL of your environment.

II. Prepare your Environment

Set WORKSPACE_NAMESPACE to the namespace of your network-restricted cluster:
CODE
```
export WORKSPACE_NAMESPACE=<workspace namespace>
```
Set TUNNEL_PROXY_NAME to the name of the TunnelProxy:
CODE
```
TUNNEL_PROXY_NAME=<...>
```
Set DEFAULT_CONFIG_PATH to the path of the kubeconfig file you downloaded from the dashboard:
CODE
```
DEFAULT_CONFIG_PATH=<path_of_cluster's_kubeconfig>
```
With your kubeconfig set to the Management cluster, set the TUNNEL_PROXY_DOMAIN environment variable to the proxied access URL:
See Provide Context for Commands with a kubeconfig File for information on setting the kubeconfig to a cluster.
CODE
```
TUNNEL_PROXY_DOMAIN=$(kubectl get tunnelproxy $TUNNEL_PROXY_NAME -n $WORKSPACE_NAMESPACE -o template='{{ .status.clusterProxyDomain }}')
```

III. Reconfigure the Token

Select one of the following options according to the certificate you configured for the proxied access:

A. Clusters with ACME-supported Certificates for the `TunnelProxy`

CODE

kubectl config set-cluster ${TUNNEL_PROXY_NAME} --server=https://${TUNNEL_PROXY_DOMAIN}/dkp/api-server --kubeconfig ${DEFAULT_CONFIG_PATH}
kubectl config set-context ${TUNNEL_PROXY_NAME} --user default-profile --kubeconfig ${DEFAULT_CONFIG_PATH}
kubectl config use-context ${TUNNEL_PROXY_NAME} --kubeconfig ${DEFAULT_CONFIG_PATH}

B. Clusters with Self-Signed Certificates for the `TunnelProxy`

With your kubeconfig set to the Management cluster, retrieve the ca.crt (not needed in case of ACME certificates):

See Provide Context for Commands with a kubeconfig File for information on setting the kubeconfig to a cluster.

CODE

LABEL_KEY=$(kubectl get tunnelproxy $TUNNEL_PROXY_NAME -n $WORKSPACE_NAMESPACE -o template='{{ .status.reverseProxyReleaseName }}')
SECRET_NAME=$(kubectl get ingress -n $WORKSPACE_NAMESPACE -l app.kubernetes.io/instance=$LABEL_KEY -o jsonpath='{.items[0].spec.tls[0].secretName}')
mkdir -p ${HOME}/.kube/certs/${TUNNEL_PROXY_NAME}
kubectl get secrets $SECRET_NAME -n $WORKSPACE_NAMESPACE -o template='{{ index .data "ca.crt" | base64decode }}' > ${HOME}/.kube/certs/${TUNNEL_PROXY_NAME}/ca.crt

Configure the kubeconfig access via the proxied access:

CODE

kubectl config set-cluster ${TUNNEL_PROXY_NAME} --server=https://${TUNNEL_PROXY_DOMAIN}/dkp/api-server --certificate-authority=${HOME}/.kube/certs/${TUNNEL_PROXY_NAME}/ca.crt --kubeconfig ${DEFAULT_CONFIG_PATH}
kubectl config set-context ${TUNNEL_PROXY_NAME} --user default-profile --kubeconfig ${DEFAULT_CONFIG_PATH}
kubectl config use-context ${TUNNEL_PROXY_NAME} --kubeconfig ${DEFAULT_CONFIG_PATH}

C. Clusters with Custom Certificates for the `TunnelProxy`

Copy the ca.crt used to sign your certificates to the kubeconfig directory:

CODE

mkdir -p ${HOME}/.kube/certs/${TUNNEL_PROXY_NAME}
cp ca.crt ${HOME}/.kube/certs/${TUNNEL_PROXY_NAME}/ca.crt

Configure the kubeconfig access via the proxied access:

CODE

kubectl config set-cluster ${TUNNEL_PROXY_NAME} --server=https://${TUNNEL_PROXY_DOMAIN}/dkp/api-server --certificate-authority=${HOME}/.kube/certs/${TUNNEL_PROXY_NAME}/ca.crt --kubeconfig ${DEFAULT_CONFIG_PATH}
kubectl config set-context ${TUNNEL_PROXY_NAME} --user default-profile --kubeconfig ${DEFAULT_CONFIG_PATH}
kubectl config use-context ${TUNNEL_PROXY_NAME} --kubeconfig ${DEFAULT_CONFIG_PATH}

“Method 2” is not available for network-restricted clusters.

After the commands complete, you can perform kubectl actions on the network-restricted cluster.

Known Common Vulnerabilities and Exposures (CVE)

Starting with DKP 2.6, Catalog apps are scanned for CVEs. However, only the CVEs for app versions that are compatible with the default Kubernetes version, currently 1.26.6, are mitigated. For more information about the known CVEs for compatible catalog apps, see D2iQ Security Updates.

Potential Logging Interruption during Upgrade

The configuration setting enableRecreateWorkloadOnImmutableFieldChange is enabled by default on the logging-operator. This means the operator automatically triggers a recreation of the resource if any immutable fields are changed on the underlying fluent-bit/fluentd resources. Previously, any changes to immutable fields would fail silently and the failures would only be observable in the logging-operator logs. This was done to handle a breaking change introduced in the logging-operator, which added an additional selector label on the fluent-bit DaemonSet.

Because of this breaking change, during an upgrade to 2.6, fluent-bit is restarted, and some log data in the fluent-bit buffer might be lost if fluentd is unavailable while fluent-bit attempts to flush its buffer prior to the pods being terminated. To prevent issues related to data loss, we recommend you run >1 fluentd pod. Alternatively, Fluent Bit can also be configured to use a hostPath volume to store the buffer information, so it can be picked up again when Fluent Bit restarts. For more information about how to change the host path, see Fluent Bit log collector.

Gatekeeper Uninstallation Error

If you choose to disable Gatekeeper, you can run into an error where the app is still present on your clusters.

For pre-existing attached or managed clusters with Gatekeeper installed when they don’t want it to be, you need to manually cleanup Gatekeeper on those clusters, post upgrade.

Follow these steps to manually remove Gatekeeper:

For every Attached/Managed cluster that has Gatekeeper installed that you have disabled it on otherwise (via appdeployment), run the following commands to remove it:
NOTE: This needs to be run on the attached/managed cluster using the correct kubeconfig.
Set your namespace of your attached cluster's workspace on your attached cluster:
CODE
```
export WORKSPACE_NAMESPACE=<workspace-name>
```

Delete the kustomizations:

CODE

kubectl --kubeconfig <attached-cluster-kubeconfig> delete kustomizations -n ${WORKSPACE_NAMESPACE} gatekeeper-constraint-templates gatekeeper-constraints gatekeeper-release

AWS additionalTags cannot contain spaces

Increased Time to Attach Several Clusters

Use Static Credentials to Provision an Azure Cluster

Containerd File Limit Issue

Intermittent Error Status when Creating EKS Clusters in the UI

Limitations to Disk Resizing in vSphere

Error Status in Grafana Logging Dashboard with EKS Clusters

Rook Ceph Install Error

Kommander Installation Configuration File Changes

FluentD Logging Operator Error

Management Cluster on GCP with the Logging Stack enabled

Managed/Attached Cluster in GCP with Logging Stack enabled

Generate a Token or Enable the Konvoy Credentials Plugin for Proxied Clusters

A. Generate a Token

I. Execute the commands established in the “Generate Token” page.

II. Prepare your Environment

III. Reconfigure the Token

A. Clusters with ACME-supported Certificates for the TunnelProxy

B. Clusters with Self-Signed Certificates for the TunnelProxy

C. Clusters with Custom Certificates for the TunnelProxy

B. Konvoy credentials plugin instructions

I. Execute the commands provided in the “Konvoy credentials plugin instructions” page > “Method 1” section.

II. Prepare your Environment

III. Reconfigure the Token

A. Clusters with ACME-supported Certificates for the TunnelProxy

B. Clusters with Self-Signed Certificates for the TunnelProxy

C. Clusters with Custom Certificates for the TunnelProxy

“Method 2” is not available for network-restricted clusters.

Known Common Vulnerabilities and Exposures (CVE)

Potential Logging Interruption during Upgrade

Gatekeeper Uninstallation Error

AWS `additionalTags` cannot contain spaces

A. Clusters with ACME-supported Certificates for the `TunnelProxy`

B. Clusters with Self-Signed Certificates for the `TunnelProxy`

C. Clusters with Custom Certificates for the `TunnelProxy`

A. Clusters with ACME-supported Certificates for the `TunnelProxy`

B. Clusters with Self-Signed Certificates for the `TunnelProxy`

C. Clusters with Custom Certificates for the `TunnelProxy`