
DKP 2.6.2 Known Issues and Limitations

The following items are known issues with this release.

Known Issues

AWS additionalTags cannot contain spaces

Due to an upstream bug in the cluster-api-provider-aws component, it is not possible to specify tags with spaces in their names in the additionalTags section of an AWSCluster. If any such tags are present during an upgrade of the capi-components, you may receive a validation error and will need to remove them. This issue will be corrected in a future DKP release.
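
For example, you can check whether an existing cluster is affected before upgrading the capi-components by inspecting the additionalTags section of the AWSCluster object. This is a minimal sketch with placeholder names:

CODE
# Inspect the additionalTags section of your AWSCluster for tag keys or values that contain spaces.
# Replace the placeholders with the name and namespace of your cluster object.
kubectl get awscluster <cluster-name> -n <namespace> -o yaml | grep -A 10 additionalTags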

AWS Custom AMI Required for Kubernetes Version

Previous versions of DKP would default to using upstream AMIs published by the CAPA (Cluster API AWS) project when building AWS clusters if you did not specify your own AMI. However, those images are not currently available for the Kubernetes version used in the 2.6.2 patch release.

As a result, starting with this release of DKP, the behavior of the dkp create cluster aws command has changed. It no longer defaults to the upstream AMIs and instead requires that you either specify an AMI built using Konvoy Image Builder (KIB) or explicitly request that it use the upstream images.

For more information on using a custom AMI in cluster creation or during the upgrade process, refer to the custom AMI topics in the DKP documentation.
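
For example, a cluster creation command that passes an AMI built with KIB might look like the following sketch; the flag usage and AMI ID are illustrative only:

CODE
# Sketch: create an AWS cluster using an AMI previously built with Konvoy Image Builder (KIB).
# The AMI ID below is a placeholder; substitute the ID produced by your KIB build.
dkp create cluster aws --cluster-name=${CLUSTER_NAME} --ami=ami-0123456789abcdef0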

Use Static Credentials to Provision an Azure Cluster

Only static credentials can be used when provisioning an Azure cluster.
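
As a rough sketch of the static-credentials flow, assuming you use an Azure service principal and the environment variable names commonly used by Cluster API for Azure, the preparation might look like this:

CODE
# Sketch only: create a service principal and export its values as static credentials
# before provisioning the Azure cluster. Adjust names and scopes to your environment.
az ad sp create-for-rbac --role contributor --scopes="/subscriptions/${AZURE_SUBSCRIPTION_ID}"
export AZURE_CLIENT_ID="<appId from the output above>"
export AZURE_CLIENT_SECRET="<password from the output above>"
export AZURE_TENANT_ID="<tenant from the output above>"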

Containerd File Limit Issue

This version of DKP uses containerd 1.6.17. The upstream systemd unit for containerd 1.6.17 removes all file number limits (LimitNOFILE=infinity). In our testing, removing these limits broke some IO-sensitive applications such as Rook Ceph and HAProxy. Because of this, the KIB version included in this release sets LimitNOFILE in the containerd systemd unit to 1048576, the value used in previous releases that shipped containerd 1.4.13.
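
To confirm the limit in effect on a node, you can inspect the containerd systemd unit directly, for example:

CODE
# Run on a cluster node: prints the file descriptor limit configured for the containerd unit.
systemctl show containerd --property=LimitNOFILE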

Intermittent Error Status when Creating EKS Clusters in the UI

When provisioning an EKS cluster through the UI, you might receive a brief error state because the EKS cluster can sporadically lose connectivity with the management cluster, which results in the following symptoms:

  • The UI shows the cluster is in an error state.

  • The kubeconfig generated and retrieved from Kommander ceases to work.

  • Applications created on the management cluster may not be immediately federated to managed EKS clusters.

After a few moments, the error resolves itself without any action on your part. A new kubeconfig generated and retrieved from Kommander then works properly, and the UI shows that the cluster is working again. In the meantime, you can continue to use the UI to work on the cluster, such as deploying applications, creating projects, and adding roles.
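
Once the error clears, you can regenerate the kubeconfig from the management cluster and verify it. This is a sketch that assumes the dkp get kubeconfig command and a placeholder cluster name:

CODE
# Sketch: regenerate and test a kubeconfig for the managed EKS cluster after the error state clears.
dkp get kubeconfig -c ${CLUSTER_NAME} > ${CLUSTER_NAME}.conf
kubectl --kubeconfig=${CLUSTER_NAME}.conf get nodes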

Limitations to Disk Resizing in vSphere

The DKP CLI flags --control-plane-disk-size and --worker-disk-size cannot resize the root file system of VMs created from OS images. The flags work by resizing the primary disk of the VM. When the VM boots, the root file system expands to fill the disk, but that expansion does not work for some file systems, for example file systems contained in an LVM Logical Volume. To ensure your root file system has the size you expect, see Create a vSphere Base OS Image | Disk-Size.
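
To check whether the root file system actually grew to the requested disk size, you can compare the disk and file system sizes on a node, for example:

CODE
# Run on a cluster node: compare the size of the primary disk with the size of the root file system.
lsblk
df -h /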

Error Status in Grafana Logging Dashboard with EKS Clusters

Currently, it is not possible to use FluentBit to collect Admin-level logs on a managed EKS cluster.

If you have these logs enabled, the following message appears when you access the Kubernetes Audit Dashboard in the Grafana Logging Dashboard:

CODE
Cannot read properties of undefined (reading '0')

Spark Image Upstream Removal

Google has removed the upstream Spark images (gcr.io/spark-operator/spark-py:v3.1.1) from its registry. As a result, you are no longer able to pull these images in non-air-gapped environments. Refer to our Deprecations notice for more information.

Rook Ceph Install Error

An issue might emerge when installing rook-ceph on vSphere clusters using RHEL operating systems.

This issue occurs during the initial installation of rook-ceph and causes the object store used by Velero and Grafana Loki to be unavailable. If the installation of the Kommander component of DKP is unsuccessful due to rook-ceph failing, you might need to apply the following workaround:

  1. Run this command to check if the cluster is affected by this issue.

    CODE
    kubectl describe CephObjectStores dkp-object-store -n kommander
  2. If the following output appears, continue with the next step to apply the workaround. If you do not see this output, the cluster is not affected and you can stop here.

    CODE
    Name:         dkp-object-store
    Namespace:    kommander
    ...
      Warning  ReconcileFailed     7m55s (x19 over 52m)
      rook-ceph-object-controller  failed to reconcile CephObjectStore
      "kommander/dkp-object-store". failed to create object store deployments: failed
      to configure multisite for object store: failed create ceph multisite for
      object-store ["dkp-object-store"]: failed to commit config changes after
      creating multisite config for CephObjectStore "kommander/dkp-object-store":
      failed to commit RGW configuration period changes%!(EXTRA []string=[]): signal: interrupt
  3. Use kubectl exec to open an interactive shell in the rook-ceph-tools pod.

    CODE
    export WORKSPACE_NAMESPACE=<workspace namespace>
    CEPH_TOOLS_POD=$(kubectl get pods -l app=rook-ceph-tools -n ${WORKSPACE_NAMESPACE} -o name)
    kubectl exec -it -n ${WORKSPACE_NAMESPACE} $CEPH_TOOLS_POD -- bash
  4. Run these commands to set dkp-object-store as the default zonegroup.
    NOTE: The period update command may take a few minutes to complete.

    CODE
    radosgw-admin zonegroup default --rgw-zonegroup=dkp-object-store
    radosgw-admin period update --commit
  5. Next, restart the rook-ceph-operator deployment so that the CephObjectStore is reconciled.

    CODE
    kubectl rollout restart deploy -n ${WORKSPACE_NAMESPACE} rook-ceph-operator
  6. After running the commands above, the CephObjectStore should be Connected once the rook-ceph operator reconciles the object (this may take some time).

    CODE
    kubectl wait CephObjectStore --for=jsonpath='{.status.phase}'=Connected dkp-object-store -n ${WORKSPACE_NAMESPACE} --timeout 10m

Kommander Installation Configuration File Changes

In this release, the kommander-ui application, which provides the DKP Dashboard, and the ai-navigator-app, which provides the AI Navigator, are now deployed in the same manner as the other Platform Applications. Therefore, the contents of the default Kommander installation configuration file, which is the file produced by the dkp install kommander --init command, have changed. If you installed previous versions of DKP using a customized Kommander installation configuration file, we recommend the following (a command sketch follows the list):

  1. Generate a new configuration template for this release.

  2. Reapply your customizations to the new template rather than reusing a file created by older DKP versions.

Note that the ai-navigator-app is enabled by default in the new configuration file.
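
A minimal sketch of that workflow, assuming the default file names and the --installer-config flag, looks like this:

CODE
# Generate a fresh installation configuration template for this release of DKP.
dkp install kommander --init > kommander.yaml
# Reapply your previous customizations to kommander.yaml, then install using the new file.
dkp install kommander --installer-config kommander.yaml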

FluentD Logging Operator Error

An error may occur when installing Kommander in a GCP environment, in which the FluentD Operator does not appear.

This is due to a limitation of the GCP API. As a result, you need to rename the Fluentd buffer (the default name exceeds 63 characters) and disable buffer volume metrics for FluentD in order for it to operate properly.

Follow these procedures to resolve this issue on the Management Cluster and on Managed or Attached Clusters.

Management Cluster on GCP with the Logging Stack enabled

  1. Create the kommander namespace:

    CODE
    kubectl create namespace kommander
  2. Create the logging-operator-logging-overrides ConfigMap:

    CODE
    cat <<EOF | kubectl apply -n kommander -f -
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: logging-operator-logging-overrides
    data:
      values.yaml: |
        fluentd:
          # disable buffer metrics
          bufferVolumeMetrics: null
          bufferStorageVolume:
            pvc:
              source:
                # update this name to make the PV <= 63 chars long
                # logging-operator-logging-buf-logging-operator-logging-fluentd-0
                claimName: buf
              spec:
                accessModes:
                  - ReadWriteOnce
                resources:
                  requests:
                    storage: 10Gi
                volumeMode: Filesystem
    EOF
  3. Continue installing Kommander as normal.

Managed/Attached Cluster in GCP with Logging Stack enabled

  1. Set the WORKSPACE_NAMESPACE environment variable to the name of your workspace’s namespace:

    CODE
    export WORKSPACE_NAMESPACE=<workspace namespace>
  2. Create the logging-operator-logging-overrides ConfigMap on the managed or attached cluster prior to enabling the logging-operator application:

    CODE
    cat <<EOF | kubectl apply -n ${WORKSPACE_NAMESPACE} -f -
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: logging-operator-logging-overrides
    data:
      values.yaml: |
        fluentd:
          # disable buffer metrics
          bufferVolumeMetrics: null
          bufferStorageVolume:
            pvc:
              source:
                # update this name to make the PV <= 63 chars long
                # logging-operator-logging-buf-logging-operator-logging-fluentd-0
                claimName: buf
              spec:
                accessModes:
                  - ReadWriteOnce
                resources:
                  requests:
                    storage: 10Gi
                volumeMode: Filesystem
    EOF
  3. Proceed with installing the logging-operator as normal.

Generate a Token or Enable the Konvoy Credentials Plugin for Proxied Clusters

If you have attached a network-restricted cluster and enabled proxied access to make its resources available through the Management cluster, the “Generate Token” and “Konvoy credentials plugin instructions” options required for user authentication in the DKP UI require additional steps.

Expand the set of instructions that corresponds to the authentication method you want to configure:

A. Generate a Token

I. Execute the commands provided on the “Generate Token” page.

You can find the “Generate Token” page in the DKP UI by selecting your username in the top right corner of any page.

II. Prepare your Environment
  1. Write down the name of your credentials. You can obtain this name in the DKP UI by selecting your username in the top right corner and opening the Generate Token page. The credential name is the argument displayed immediately after set-credentials in the kubectl config set-credentials command shown there.
    In the following example, jane_doe-az189abcd7ab34a2abc0ab845a67a7ea-0123456789.us-west-2.elb.amazonaws.com would be the name of your credentials:

    CODE
    kubectl config set-credentials jane_doe-az189abcd7ab34a2abc0ab845a67a7ea-0123456789.us-west-2.elb.amazonaws.com \
        --token=[...] 
  2. Set the CONFIG_CREDENTIALS_NAME environment variable to the credentials you obtained in the previous step:

    CODE
    CONFIG_CREDENTIALS_NAME=<CREDENTIALS_NAME>
  3. Set WORKSPACE_NAMESPACE to the namespace of your network-restricted cluster:

    CODE
    export WORKSPACE_NAMESPACE=<workspace namespace>
  4. Set TUNNEL_PROXY_NAME to the name of the TunnelProxy:

    CODE
    TUNNEL_PROXY_NAME=<...>
  5. Set DEFAULT_CONFIG_PATH to the path of the kubeconfig file in which you executed the “Generate Token” steps:

    CODE
    DEFAULT_CONFIG_PATH=<path_of_cluster's_kubeconfig>
  6. With your kubeconfig set to the Management cluster, set the TUNNEL_PROXY_DOMAIN environment variable to the proxied access URL:

    NOTE: See Provide Context for Commands with a kubeconfig File for information on setting the kubeconfig to a cluster.

    CODE
    TUNNEL_PROXY_DOMAIN=$(kubectl get tunnelproxy $TUNNEL_PROXY_NAME -n $WORKSPACE_NAMESPACE -o template='{{ .status.clusterProxyDomain }}')
III. Reconfigure the Token

Select one of the following options according to the certificate you configured for the proxied access:

A. Clusters with ACME-supported Certificates for the TunnelProxy
CODE
kubectl config set-cluster ${TUNNEL_PROXY_NAME} --server=https://${TUNNEL_PROXY_DOMAIN}/dkp/api-server --kubeconfig ${DEFAULT_CONFIG_PATH}
kubectl config set-context ${TUNNEL_PROXY_NAME} --cluster=${TUNNEL_PROXY_NAME}  --user=${CONFIG_CREDENTIALS_NAME} --kubeconfig ${DEFAULT_CONFIG_PATH}
kubectl config use-context ${TUNNEL_PROXY_NAME} --kubeconfig ${DEFAULT_CONFIG_PATH}
B. Clusters with Self-Signed Certificates for the TunnelProxy
  1. With your kubeconfig set to the Management cluster, retrieve the ca.crt (not needed in case of ACME certificates):
    NOTE: See Provide Context for Commands with a kubeconfig File for information on setting the kubeconfig to a cluster.

    CODE
    LABEL_KEY=$(kubectl get tunnelproxy $TUNNEL_PROXY_NAME -n $WORKSPACE_NAMESPACE -o template='{{ .status.reverseProxyReleaseName }}')
    SECRET_NAME=$(kubectl get ingress -n $WORKSPACE_NAMESPACE -l app.kubernetes.io/instance=$LABEL_KEY -o jsonpath='{.items[0].spec.tls[0].secretName}')
    mkdir -p ${HOME}/.kube/certs/${TUNNEL_PROXY_NAME}
    kubectl get secrets $SECRET_NAME -n $WORKSPACE_NAMESPACE -o template='{{ index .data "ca.crt" | base64decode }}' > ${HOME}/.kube/certs/${TUNNEL_PROXY_NAME}/ca.crt
  2. Configure the kubeconfig access via the proxied access:

    CODE
    kubectl config set-cluster ${TUNNEL_PROXY_NAME} --server=https://${TUNNEL_PROXY_DOMAIN}/dkp/api-server --certificate-authority=${HOME}/.kube/certs/${TUNNEL_PROXY_NAME}/ca.crt --kubeconfig ${DEFAULT_CONFIG_PATH}
    kubectl config set-context ${TUNNEL_PROXY_NAME} --cluster=${TUNNEL_PROXY_NAME}  --user=${CONFIG_CREDENTIALS_NAME} --kubeconfig ${DEFAULT_CONFIG_PATH}
    kubectl config use-context ${TUNNEL_PROXY_NAME} --kubeconfig ${DEFAULT_CONFIG_PATH}
C. Clusters with Custom Certificates for the TunnelProxy
  1. Copy the ca.crt used to sign your certificates to the kubeconfig directory:

    CODE
    mkdir -p ${HOME}/.kube/certs/${TUNNEL_PROXY_NAME}
    cp ca.crt ${HOME}/.kube/certs/${TUNNEL_PROXY_NAME}/ca.crt
  2. Configure the kubeconfig access via the proxied access:

    CODE
    kubectl config set-cluster ${TUNNEL_PROXY_NAME} --server=https://${TUNNEL_PROXY_DOMAIN}/dkp/api-server --certificate-authority=${HOME}/.kube/certs/${TUNNEL_PROXY_NAME}/ca.crt --kubeconfig ${DEFAULT_CONFIG_PATH}
    kubectl config set-context ${TUNNEL_PROXY_NAME} --cluster=${TUNNEL_PROXY_NAME}  --user=${CONFIG_CREDENTIALS_NAME} --kubeconfig ${DEFAULT_CONFIG_PATH}
    kubectl config use-context ${TUNNEL_PROXY_NAME} --kubeconfig ${DEFAULT_CONFIG_PATH}

After the commands complete, you can perform kubectl actions on the network-restricted cluster.
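
For example, a quick read-only command against the reconfigured kubeconfig confirms that proxied access is working (any read operation your user is authorized for works):

CODE
# Verify that requests reach the network-restricted cluster through the proxied access.
kubectl --kubeconfig ${DEFAULT_CONFIG_PATH} get nodes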

B. Konvoy credentials plugin instructions

I. Execute the commands provided in the “Konvoy credentials plugin instructions” page > “Method 1” section.

You can find a link to the “Konvoy credentials plugin instructions” page in the DKP UI when logging in to your environment, or by adding /token/plugin to the DKP UI URL of your environment.

II. Prepare your Environment
  1. Set WORKSPACE_NAMESPACE to the namespace of your network-restricted cluster:

    CODE
    export WORKSPACE_NAMESPACE=<workspace namespace>
  2. Set TUNNEL_PROXY_NAME to the name of the TunnelProxy:

    CODE
    TUNNEL_PROXY_NAME=<...>
  3. Set DEFAULT_CONFIG_PATH to the path of the kubeconfig file you downloaded from the dashboard:

    CODE
    DEFAULT_CONFIG_PATH=<path_of_cluster's_kubeconfig>
  4. With your kubeconfig set to the Management cluster, set the TUNNEL_PROXY_DOMAIN environment variable to the proxied access URL:

    NOTE: See Provide Context for Commands with a kubeconfig File for information on setting the kubeconfig to a cluster.

    CODE
    TUNNEL_PROXY_DOMAIN=$(kubectl get tunnelproxy $TUNNEL_PROXY_NAME -n $WORKSPACE_NAMESPACE -o template='{{ .status.clusterProxyDomain }}')
III. Reconfigure the Token

Select one of the following options according to the certificate you configured for the proxied access:

A. Clusters with ACME-supported Certificates for the TunnelProxy
CODE
kubectl config set-cluster ${TUNNEL_PROXY_NAME} --server=https://${TUNNEL_PROXY_DOMAIN}/dkp/api-server --kubeconfig ${DEFAULT_CONFIG_PATH}
kubectl config set-context ${TUNNEL_PROXY_NAME} --user default-profile --kubeconfig ${DEFAULT_CONFIG_PATH}
kubectl config use-context ${TUNNEL_PROXY_NAME} --kubeconfig ${DEFAULT_CONFIG_PATH}
B. Clusters with Self-Signed Certificates for the TunnelProxy
  1. With your kubeconfig set to the Management cluster, retrieve the ca.crt (not needed in case of ACME certificates):

    NOTE: See Provide Context for Commands with a kubeconfig File for information on setting the kubeconfig to a cluster.

    CODE
    LABEL_KEY=$(kubectl get tunnelproxy $TUNNEL_PROXY_NAME -n $WORKSPACE_NAMESPACE -o template='{{ .status.reverseProxyReleaseName }}')
    SECRET_NAME=$(kubectl get ingress -n $WORKSPACE_NAMESPACE -l app.kubernetes.io/instance=$LABEL_KEY -o jsonpath='{.items[0].spec.tls[0].secretName}')
    mkdir -p ${HOME}/.kube/certs/${TUNNEL_PROXY_NAME}
    kubectl get secrets $SECRET_NAME -n $WORKSPACE_NAMESPACE -o template='{{ index .data "ca.crt" | base64decode }}' > ${HOME}/.kube/certs/${TUNNEL_PROXY_NAME}/ca.crt
  2. Configure the kubeconfig access via the proxied access:

    CODE
    kubectl config set-cluster ${TUNNEL_PROXY_NAME} --server=https://${TUNNEL_PROXY_DOMAIN}/dkp/api-server --certificate-authority=${HOME}/.kube/certs/${TUNNEL_PROXY_NAME}/ca.crt --kubeconfig ${DEFAULT_CONFIG_PATH}
    kubectl config set-context ${TUNNEL_PROXY_NAME} --user default-profile --kubeconfig ${DEFAULT_CONFIG_PATH}
    kubectl config use-context ${TUNNEL_PROXY_NAME} --kubeconfig ${DEFAULT_CONFIG_PATH}
C. Clusters with Custom Certificates for the TunnelProxy
  1. Copy the ca.crt used to sign your certificates to the kubeconfig directory:

    CODE
    mkdir -p ${HOME}/.kube/certs/${TUNNEL_PROXY_NAME}
    cp ca.crt ${HOME}/.kube/certs/${TUNNEL_PROXY_NAME}/ca.crt
  2. Configure the kubeconfig access via the proxied access:

    CODE
    kubectl config set-cluster ${TUNNEL_PROXY_NAME} --server=https://${TUNNEL_PROXY_DOMAIN}/dkp/api-server --certificate-authority=${HOME}/.kube/certs/${TUNNEL_PROXY_NAME}/ca.crt --kubeconfig ${DEFAULT_CONFIG_PATH}
    kubectl config set-context ${TUNNEL_PROXY_NAME} --user default-profile --kubeconfig ${DEFAULT_CONFIG_PATH}
    kubectl config use-context ${TUNNEL_PROXY_NAME} --kubeconfig ${DEFAULT_CONFIG_PATH}
“Method 2” is not available for network-restricted clusters.

After the commands complete, you can perform kubectl actions on the network-restricted cluster.

Known Common Vulnerabilities and Exposures (CVE)

Starting with DKP 2.6, Catalog apps are scanned for CVEs. However, only the CVEs for app versions that are compatible with the default Kubernetes version, currently 1.26.6, are mitigated. For more information about the known CVEs for compatible catalog apps, see D2iQ Security Updates.

Potential Logging Interruption during Upgrade

The configuration setting enableRecreateWorkloadOnImmutableFieldChange is enabled by default on the logging-operator. This means the operator automatically triggers a recreation of the resource if any immutable fields are changed on the underlying fluent-bit/fluentd resources. Previously, any changes to immutable fields would fail silently and the failures would only be observable in the logging-operator logs. This was done to handle a breaking change introduced in the logging-operator, which added an additional selector label on the fluent-bit DaemonSet.

Because of this breaking change, fluent-bit is restarted during an upgrade to 2.6, and some log data in the fluent-bit buffer might be lost if fluentd is unavailable while fluent-bit attempts to flush its buffer before the pods are terminated. To prevent data loss, we recommend running more than one fluentd pod. Alternatively, Fluent Bit can be configured to use a hostPath volume to store the buffer information so it can be picked up again when Fluent Bit restarts. For more information about how to change the host path, see Fluent Bit log collector.
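
As a sketch only, and assuming the logging-operator overrides ConfigMap accepts the same fluentd fields used in the GCP workaround above and that the fluentd.scaling.replicas field is honored by your logging-operator version, scaling fluentd might look like the following; merge these values with any existing overrides rather than replacing them:

CODE
# Sketch only: run more than one fluentd pod via the logging-operator overrides ConfigMap.
# Assumes the fluentd.scaling.replicas field of the logging-operator Logging spec is honored;
# merge these values with any existing logging-operator-logging-overrides content.
cat <<EOF | kubectl apply -n kommander -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: logging-operator-logging-overrides
data:
  values.yaml: |
    fluentd:
      scaling:
        replicas: 2
EOF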

Gatekeeper Uninstallation Error

If you choose to disable Gatekeeper, you might run into an error where the app is still present on your clusters.

For pre-existing attached or managed clusters that still have Gatekeeper installed when you do not want it to be, you need to manually clean up Gatekeeper on those clusters after the upgrade.

Follow these steps to manually remove Gatekeeper:

  1. For every attached or managed cluster that still has Gatekeeper installed even though you have disabled it (via its AppDeployment), run the following commands to remove it:
    NOTE: These commands must be run on the attached/managed cluster using the correct kubeconfig.
    Set the namespace of your attached cluster's workspace on your attached cluster:

    CODE
    export WORKSPACE_NAMESPACE=<workspace-name>
  2. Delete the kustomizations:

    CODE
    kubectl --kubeconfig <attached-cluster-kubeconfig> delete kustomizations -n ${WORKSPACE_NAMESPACE} gatekeeper-constraint-templates gatekeeper-constraints gatekeeper-release
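
You can then confirm that no Gatekeeper kustomizations remain in the workspace namespace, for example:

CODE
# Verify the Gatekeeper kustomizations are gone; the command prints nothing once they are removed.
kubectl --kubeconfig <attached-cluster-kubeconfig> get kustomizations -n ${WORKSPACE_NAMESPACE} | grep gatekeeper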
