Configure Alerts Using AlertManager

To keep your clusters and applications healthy, you need to stay informed of the events occurring in them. DKP keeps you informed of these events through the Alertmanager component of the kube-prometheus-stack.

Kommander is configured with pre-defined alerts that monitor four categories of events. You receive alerts related to:

  • State of your nodes

  • System services managing the Kubernetes cluster

  • Resource events from specific system services

  • Prometheus expressions exceeding some pre-defined thresholds

Some examples of the alerts currently available are:

  • CPUThrottlingHigh

  • TargetDown

  • KubeletNotReady

  • KubeAPIDown

  • CoreDNSDown

  • KubeVersionMismatch

A complete list of all the pre-defined alerts is available on GitHub.
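
If you want to see exactly which alerting and recording rules are installed on a cluster, you can list the PrometheusRule objects that kube-prometheus-stack creates. This is a quick check and assumes the default DKP deployment of kube-prometheus-stack:

    CODE
    # List the alerting and recording rules installed by kube-prometheus-stack
    # across all namespaces of the cluster.
    kubectl get prometheusrules --all-namespaces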

Prerequisites

  • Determine the name of the workspace where you wish to perform the actions. You can use the dkp get workspaces command to see the list of workspace names and their corresponding namespaces.

  • Set the WORKSPACE_NAMESPACE environment variable to the name of the workspace’s namespace where the cluster is attached:

    CODE
    export WORKSPACE_NAMESPACE=<workspace_namespace>
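
As a quick sanity check, you can confirm that the namespace actually exists on the management cluster before continuing:

    CODE
    # Verify that the workspace namespace exists.
    kubectl get namespace "${WORKSPACE_NAMESPACE}"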

Configure Alert Rules

Alert rules are configured through overrides ConfigMaps.

You can enable or disable the default alert rules by providing the desired configuration in an overrides ConfigMap. For example, to disable the default node alert rules, follow these steps:

  1. Create a file named kube-prometheus-stack-overrides.yaml and paste the following YAML code into it to create the overrides ConfigMap:

    CODE
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: kube-prometheus-stack-overrides
      namespace: ${WORKSPACE_NAMESPACE}
    data:
      values.yaml: |
        ---
        defaultRules:
          rules:
            node: false
  2. Use the following command to apply the YAML file:

    CODE
    kubectl apply -f kube-prometheus-stack-overrides.yaml
  3. Edit the kube-prometheus-stack AppDeployment to replace the spec.configOverrides.name value with kube-prometheus-stack-overrides. (You can use the steps in the Deploy an application with a custom configuration procedure as a guide.)

    CODE
    dkp edit appdeployment -n ${WORKSPACE_NAMESPACE} kube-prometheus-stack

    After your editing is complete, the AppDeployment resembles this example:

    CODE
    apiVersion: apps.kommander.d2iq.io/v1alpha2
    kind: AppDeployment
    metadata:
      name: kube-prometheus-stack
      namespace: ${WORKSPACE_NAMESPACE}
    spec:
      appRef:
        name: kube-prometheus-stack-34.9.3
        kind: ClusterApp
      configOverrides:
        name: kube-prometheus-stack-overrides
  4. To disable all rules, create an overrides ConfigMap with this YAML code:

    CODE
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: kube-prometheus-stack-overrides
      namespace: ${WORKSPACE_NAMESPACE}
    data:
      values.yaml: |
        ---
        defaultRules:
          create: false
  5. Alert rules for the Velero platform service are turned off by default. You can enable them with the following overrides ConfigMap. Enable these rules only if the Velero platform service is enabled; if it is not, leave them disabled to avoid misfiring alerts.

    CODE
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: kube-prometheus-stack-overrides
      namespace: ${WORKSPACE_NAMESPACE}
    data:
      values.yaml: |
        ---
        mesosphereResources:
          rules:
            velero: true
  6. To create a custom alert rule named my-rule-name, create the overrides ConfigMap with this YAML code (an expanded example that defines an alerting rule follows this procedure):

    CODE
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: kube-prometheus-stack-overrides
      namespace: ${WORKSPACE_NAMESPACE}
    data:
      values.yaml: |
        ---
        additionalPrometheusRulesMap:
          my-rule-name:
            groups:
            - name: my_group
              rules:
              - record: my_record
                expr: 100 * my_record
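
The custom rule in step 6 only defines a recording rule. A more typical custom rule defines an alert with an expression, a hold duration, and annotations. The following sketch adds a hypothetical alert that fires when a node's root filesystem is more than 90% full; the alert name, threshold, and severity label are illustrative, and the metrics assume node-exporter (included in kube-prometheus-stack) is running:

    CODE
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: kube-prometheus-stack-overrides
      namespace: ${WORKSPACE_NAMESPACE}
    data:
      values.yaml: |
        ---
        additionalPrometheusRulesMap:
          my-rule-name:
            groups:
            - name: my_group
              rules:
              # Fire when less than 10% of the root filesystem is available
              # for 10 consecutive minutes.
              - alert: NodeRootFilesystemAlmostFull
                expr: node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"} < 0.10
                for: 10m
                labels:
                  severity: warning
                annotations:
                  summary: Node root filesystem is more than 90% full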

After you set up your alerts, you can manage each alert using the Prometheus Alertmanager UI to mute or unmute firing alerts and perform other operations. For more information about configuring Alertmanager, see the Prometheus website.

To access the Prometheus Alertmanager UI, browse to the landing page and then search for the Prometheus Alertmanager dashboard, for example https://<CLUSTER_URL>/dkp/alertmanager.
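
If the dashboard URL is not reachable from your location, you can also port-forward to Alertmanager and open it locally. The service name below follows the kube-prometheus-stack naming convention and may differ in your installation:

    CODE
    # Forward the Alertmanager web port (9093) to localhost, then browse
    # to http://localhost:9093.
    kubectl port-forward -n ${WORKSPACE_NAMESPACE} \
      svc/kube-prometheus-stack-alertmanager 9093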

Notify Prometheus Alerts in Slack

To connect the Prometheus Alertmanager notification system to Slack, you need to overwrite the existing Alertmanager configuration.

  1. The following file, named alertmanager.yaml, configures Alertmanager to use the Incoming Webhooks feature of Slack (slack_api_url: https://hooks.slack.com/services/<HOOK_ID>) to send all alerts to a specific channel, #MY-SLACK-CHANNEL-NAME. You can test the webhook URL on its own, as shown after this procedure.

    CODE
    global:
      resolve_timeout: 5m
      slack_api_url: https://hooks.slack.com/services/<HOOK_ID>
    
    route:
      group_by: ['alertname']
      group_wait: 2m
      group_interval: 5m
      repeat_interval: 1h
    
      # If an alert isn't caught by a route, send it to slack.
      receiver: slack_general
      routes:
        - match:
            alertname: Watchdog
          receiver: "null"
    
    receivers:
      - name: "null"
      - name: slack_general
        slack_configs:
          - channel: '#MY-SLACK-CHANNEL-NAME'
            icon_url: https://avatars3.githubusercontent.com/u/3380462
            send_resolved: true
            color: '{{ if eq .Status "firing" }}danger{{ else }}good{{ end }}'
            title: '{{ template "slack.default.title" . }}'
            title_link: '{{ template "slack.default.titlelink" . }}'
            pretext: '{{ template "slack.default.pretext" . }}'
            text: '{{ template "slack.default.text" . }}'
            fallback: '{{ template "slack.default.fallback" . }}'
            icon_emoji: '{{ template "slack.default.iconemoji" . }}'
    
    templates:
      - '*.tmpl'
  2. The following file, named notification.tmpl, is a template that defines a pretty format for the fired notifications:

    CODE
    {{ define "__titlelink" }}
    {{ .ExternalURL }}/#/alerts?receiver={{ .Receiver }}
    {{ end }}
    
    {{ define "__title" }}
    [{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}] {{ .GroupLabels.SortedPairs.Values | join " " }}
    {{ end }}
    
    {{ define "__text" }}
    {{ range .Alerts }}
    {{ range .Labels.SortedPairs }}*{{ .Name }}*: `{{ .Value }}`
    {{ end }} {{ range .Annotations.SortedPairs }}*{{ .Name }}*: {{ .Value }}
    {{ end }} *source*: {{ .GeneratorURL }}
    {{ end }}
    {{ end }}
    
    {{ define "slack.default.title" }}{{ template "__title" . }}{{ end }}
    {{ define "slack.default.username" }}{{ template "__alertmanager" . }}{{ end }}
    {{ define "slack.default.fallback" }}{{ template "slack.default.title" . }} | {{ template "slack.default.titlelink" . }}{{ end }}
    {{ define "slack.default.pretext" }}{{ end }}
    {{ define "slack.default.titlelink" }}{{ template "__titlelink" . }}{{ end }}
    {{ define "slack.default.iconemoji" }}{{ end }}
    {{ define "slack.default.iconurl" }}{{ end }}
    {{ define "slack.default.text" }}{{ template "__text" . }}{{ end }}
  3. Finally, apply these changes to alertmanager as follows. Set ${WORKSPACE_NAMESPACE} to the workspace namespace that kube-prometheus-stack is deployed in:

    CODE
    kubectl create secret generic -n ${WORKSPACE_NAMESPACE} \
      alertmanager-kube-prometheus-stack-alertmanager \
      --from-file=alertmanager.yaml \
      --from-file=notification.tmpl \
      --dry-run=client --save-config -o yaml | kubectl apply -f -
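
Before or after applying this configuration, you can verify the Slack webhook itself with a plain curl call, independent of Alertmanager. Replace <HOOK_ID> with your actual hook ID; the message text is arbitrary:

    CODE
    # Post a test message to the Slack incoming webhook.
    curl -X POST -H 'Content-type: application/json' \
      --data '{"text": "Alertmanager webhook test"}' \
      https://hooks.slack.com/services/<HOOK_ID>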

Notify Prometheus Alerts in E-Mail

To configure the Prometheus Alertmanager notification system to send email for alerts, you need to overwrite the existing configuration. The following steps configure Alertmanager to send all configured alerts to a Gmail account, test@gmail.com.

  1. Create a file named alertmanager.yaml with the following contents:

    CODE
    global:
      resolve_timeout: 5m
    inhibit_rules: []
    receivers:
    - name: "null"
    - name: test_gmail
      email_configs:
      - to: test@gmail.com
        from: test@gmail.com
        auth_username: test@gmail.com
        auth_password: password
        send_resolved: true
        require_tls: true
        smarthost: smtp.gmail.com:587
    route:
      receiver: test_gmail
      group_by:
      - namespace
      group_interval: 5m
      group_wait: 30s
      repeat_interval: 12h
      routes:
      - matchers:
        - alertname =~ "InfoInhibitor|Watchdog"
        receiver: "null"
    templates:
    - /etc/alertmanager/config/*.tmpl
  2. Apply these changes to alertmanager as follows. Set ${WORKSPACE_NAMESPACE} to the workspace namespace that kube-prometheus-stack is deployed in (typically the kommander namespace):

    CODE
    kubectl create secret generic -n ${WORKSPACE_NAMESPACE} \
      alertmanager-kube-prometheus-stack-alertmanager \
      --from-file=alertmanager.yaml \
      --dry-run=client --save-config -o yaml | kubectl apply -f -
  3. Allow some time for the configuration to take effect, then use the following command to verify that it was applied. (To confirm end-to-end delivery, you can also send a test alert, as shown after this procedure.)

    CODE
    kubectl exec -it alertmanager-kube-prometheus-stack-alertmanager-0 -n kommander -- cat /etc/alertmanager/config_out/alertmanager.env.yaml
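
To confirm end-to-end delivery through the new route, you can inject a synthetic alert directly into Alertmanager's v2 API and check that it arrives in the configured mailbox. The alert name below is illustrative, and the pod name matches the one used in the verification step above:

    CODE
    # Forward the Alertmanager web port to localhost (run in a separate terminal).
    kubectl port-forward -n kommander \
      alertmanager-kube-prometheus-stack-alertmanager-0 9093

    # Post a synthetic alert; it is routed to the test_gmail receiver
    # after the configured group_wait interval.
    curl -X POST http://localhost:9093/api/v2/alerts \
      -H 'Content-Type: application/json' \
      -d '[{"labels": {"alertname": "EmailRouteTest", "severity": "info"}}]'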

For more information on configuring email alerting, refer to the Alertmanager documentation.
