Tutorials

End-to-end tutorials for model development, distributed training, pipelines, and metadata management

Kaptain offers several ways to train models (including distributed training), tune hyperparameters, and deploy optimized models that autoscale.

The Kaptain SDK offers the most data science-friendly user experience and is designed as the starting point for first-time Kaptain users.

If you prefer full control and are comfortable with the Kubeflow SDKs or Kubernetes YAML specifications, we suggest you consult the other tutorials.

Note that everything can be done from within notebooks, thanks to Kaptain’s notebooks-first approach to machine learning.

How to Navigate the User Interface

The Kubeflow central dashboard is the main entry point to Kaptain after logging in:

Central dashboard

The central area shows recent pipelines, pipeline runs, and notebooks as well as links to documentation.

The current namespace is shown at the top; in the image above, it is demo.

The menu on the left has the following entries:

  • Home, which is shown in the image
  • Pipelines
  • Notebook Servers
  • Katib

These are discussed in more detail below.

Pipelines

Pipelines and runs with their logged artifacts are available from the Pipelines menu. Details on how to create pipelines are in the pipelines tutorial.

Pipeline Runs

Pipeline runs are listed in the Experiments menu, along with their status, duration, and model performance metrics. In the image below, for example, the accuracy and loss are shown.

Pipeline runs

Pipeline Run Logs

After selecting a single run, logs for individual steps of a pipeline can be displayed:

Pipeline run logs

This is particularly helpful when debugging pipeline steps.

Each step logs its inputs and outputs, which can be accessed via the Input/Output tab.

Pipeline Artifacts

Inputs and outputs of steps, also known as artifacts, are stored in the Artifacts Store and are available from the Artifacts menu. The lineage of pipeline artifacts is displayed in the Lineage Explorer tab:

Pipeline artifacts
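To make the idea of artifact lineage concrete, here is a minimal sketch in plain Python. It does not use any Kaptain or Kubeflow API; the record format and the artifact and step names are hypothetical and chosen only for illustration. It treats lineage as a directed graph and walks it backwards to find everything a given artifact was derived from, which is roughly the question the Lineage Explorer answers visually.

```python
from collections import defaultdict

# Hypothetical lineage records: (input artifact, step, output artifact).
# These names are illustrative and do not come from a real pipeline.
LINEAGE = [
    ("raw-data", "preprocess", "features"),
    ("features", "train", "model"),
    ("features", "evaluate", "metrics"),
    ("model", "evaluate", "metrics"),
]


def upstream(artifact, records):
    """Return the set of artifacts that `artifact` was derived from."""
    # Map each output artifact to the artifacts that were used to produce it.
    parents = defaultdict(set)
    for source, _step, target in records:
        parents[target].add(source)

    # Walk the graph backwards, collecting every ancestor exactly once.
    seen = set()
    stack = [artifact]
    while stack:
        current = stack.pop()
        for parent in parents[current]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


print(sorted(upstream("metrics", LINEAGE)))
# prints ['features', 'model', 'raw-data']
```

An artifact with no recorded inputs, such as raw-data in this sketch, yields an empty set.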

Notebook Servers

Notebook servers can be set up from the Notebook Servers menu on the central dashboard. From there, users can choose a quick-start image for any of the supported deep learning frameworks: TensorFlow, PyTorch, and MXNet. Each quick-start image comes in two flavors: CPU and GPU. The GPU flavor includes all the drivers needed for training on GPUs. Custom images can also be provided.

Each notebook server allows secrets and volumes to be mounted.

Once a notebook server has been set up, a familiar JupyterLab environment is available:

Notebooks

The numbered sections are as follows:

  1. Directory tree on the notebook server
  2. Visual git module
  3. Table of contents for the currently visible notebook
  4. Notebook diff viewer
  5. Notebook cells with embedded output

Additional details on the JupyterLab environment can be found in the JupyterLab documentation.
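As noted above, secrets can be mounted into a notebook server. When a Kubernetes secret is mounted as a volume, each key of the secret appears as a file in the mount directory. The following is a minimal sketch of reading such a key from a notebook cell, assuming the secret was mounted as a volume. The helper name read_secret and the key name are hypothetical, and a temporary directory stands in for the real mount point so that the example runs anywhere.

```python
import tempfile
from pathlib import Path


def read_secret(mount_dir, key):
    """Read one key of a secret that is mounted as files under mount_dir."""
    path = Path(mount_dir) / key
    if not path.is_file():
        raise FileNotFoundError(f"No secret key {key!r} under {mount_dir}")
    # Strip surrounding whitespace such as a trailing newline.
    return path.read_text().strip()


# A temporary directory stands in for the mount point (an assumption for
# this demo); on a notebook server, pass the path chosen when mounting.
with tempfile.TemporaryDirectory() as mount_dir:
    (Path(mount_dir) / "password").write_text("example-value\n")
    print(read_secret(mount_dir, "password"))
```

Reading the value at the point of use, rather than pasting it into a cell, keeps credentials out of the saved notebook.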

Katib

Katib is the hyperparameter tuner and neural architecture search module in Kaptain. To learn how to create hyperparameter tuning experiments, read the hyperparameter tuning tutorial.

These experiments can be accessed through the HP → Monitor submenu:

Katib monitor

For each experiment, a chart of the main objective and the different hyperparameter values is shown:

Katib experiment

The View Experiment button shows the details of the experiment itself. The View Suggestion button shows the hyperparameters of the best trial in the experiment.

At the bottom of the chart is a list of all trials, their statuses, objective values, and hyperparameters.
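The notion of a "best trial" can be illustrated with a short sketch in plain Python. It does not call Katib; the trial records, field names, and values below are hypothetical and merely mirror the columns of the trial list (status, objective value, and hyperparameters). The helper keeps only trials that finished successfully and picks the one with the best objective, assuming the objective is either maximized (for example, accuracy) or minimized (for example, loss).

```python
# Hypothetical trial records that mirror the columns of the trial list.
trials = [
    {"name": "trial-a", "status": "Succeeded", "objective": 0.91,
     "params": {"learning_rate": 0.01, "batch_size": 64}},
    {"name": "trial-b", "status": "Succeeded", "objective": 0.94,
     "params": {"learning_rate": 0.005, "batch_size": 32}},
    {"name": "trial-c", "status": "Failed", "objective": None,
     "params": {"learning_rate": 0.5, "batch_size": 128}},
]


def best_trial(trials, maximize=True):
    """Return the successful trial with the best objective, or None."""
    # Failed or unfinished trials have no usable objective value.
    finished = [t for t in trials if t["status"] == "Succeeded"]
    if not finished:
        return None
    pick = max if maximize else min
    return pick(finished, key=lambda t: t["objective"])


best = best_trial(trials)
print(best["name"], best["params"])
```

Passing maximize=False selects the trial with the lowest objective instead, which fits objectives such as loss.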