D2iQ® Kaptain® version 2.0.0 was released on April 26, 2022.
D2iQ is announcing Kaptain 2.0 which includes several user experience features, additional platforms supported, and resolves reported issues. This continues our tradition of customer-led development making Kaptain even easier to deploy, supporting more platforms, and incorporating the latest innovations from the open source community in the hot AI/ML marketplace. Kaptain 2.0 supports Kubeflow 1.5 and a new version of the Kaptain SDK, 1.0.1. For more information on SDK 1.0.1, see SDK release notes.
New features and capabilities
Added Kaptain as DKP Catalog application
Starting with Kaptain 2.0, it can be easily installed via the DKP workspace catalog in both single and multi-cluster experiences. Kaptain 2.0 is supported on DKP 2.1.2 onwards.
Support for Amazon EKS, Azure AKS and DKP multi-cluster
Kaptain 2.0 can be deployed on attached clusters running on Amazon AWS EKS and Microsoft Azure AKS. This extends D2iQ’s openness to support Kubernetes platforms beyond DKP with Kaptain.
Helm based deployment
Kaptain 2.0 now uses helm based deployment which makes it easier to integrate into DKP and other platforms.
Support for Kubernetes 1.22 and Kubeflow 1.5
Kaptain 2.0 comes with Kubeflow 1.5, which includes expanded user control over mounted volumes, increased visibility into idle notebooks, UI improvements, and improvements to hyperparameter tuning. Kaptain 2.0 also supports Kubernetes 1.22 which contains upstream improvements such as security enhancements.
Bundle size reduction
Kaptain 2.0 drastically reduces the bundle sizes improving download times, especially for the CPU and GPU bundle:
- Base installation: 31.8 GB (down from 33.2 GB)
- CPU bundle: 12.4 GB (down from 34.3 GB)
- GPU bundle: 17.2 GB (down from 84.7 GB)
Fixes and Improvements
- Moved from KFServing to KServe
- Fixed the cudatoolkit version regression (COPS-7219)
- Bumped Pytorch to 1.11.0 and CUDA to 11.4 to support modern multi-instance GPUs, such as Nvidia A100 (COPS-7211)
- Kubeflow 1.5.0
- Argo Workflows 3.2.3
- Katib 0.13.0
- KServe 0.7.0
- Percona Kubernetes Operator 1.10.0
- Kubeflow Pipelines 1.8.1
- Training Operator 1.4.0
- Tensorflow 2.8.0
- Pytorch 1.11.0
- CUDA 11.4
- MXNet 1.9
- Horovod 0.24.2