Networking

Configure networking for a Konvoy cluster

This section describes the networking components that together form the Konvoy networking stack. It assumes familiarity with Kubernetes networking.

IPTables

Konvoy can be configured to automatically add the iptables rules outlined below.

Control Plane nodes:

iptables -A INPUT -p tcp -m tcp --dport 6443 -m comment --comment "Konvoy: kube-apiserver --secure-port" -j ACCEPT
iptables -A INPUT -p tcp -m tcp --dport 10250 -m comment --comment "Konvoy: kubelet --port" -j ACCEPT
iptables -A INPUT -p tcp -m tcp --dport 10248 -m comment --comment "Konvoy: kubelet --healthz-port" -j ACCEPT
iptables -A INPUT -p tcp -m tcp --dport 10249 -m comment --comment "Konvoy: kube-proxy --metrics-bind-address" -j ACCEPT
iptables -A INPUT -p tcp -m tcp --dport 10256 -m comment --comment "Konvoy: kube-proxy --healthz-port" -j ACCEPT
iptables -A INPUT -p tcp -m tcp --dport 10257 -m comment --comment "Konvoy: kube-controller-manager --secure-port" -j ACCEPT
iptables -A INPUT -p tcp -m tcp --dport 10252 -m comment --comment "Konvoy: kube-controller-manager --port (used for liveness)" -j ACCEPT
iptables -A INPUT -p tcp -m tcp --dport 10259 -m comment --comment "Konvoy: kube-scheduler --secure-port" -j ACCEPT
iptables -A INPUT -p tcp -m tcp --dport 10251 -m comment --comment "Konvoy: kube-scheduler --port (used for liveness)" -j ACCEPT
iptables -A INPUT -p tcp -m tcp --dport 2379 -m comment --comment "Konvoy: etcd client" -j ACCEPT
iptables -A INPUT -p tcp -m tcp --dport 2380 -m comment --comment "Konvoy: etcd peer" -j ACCEPT
iptables -A INPUT -p tcp -m tcp --dport 9091 -m comment --comment "Konvoy: calico-node felix (used for metrics)" -j ACCEPT
iptables -A INPUT -p tcp -m tcp --dport 9092 -m comment --comment "Konvoy: calico-node bird (used for metrics)" -j ACCEPT
iptables -A INPUT -p tcp -m tcp --dport 9099 -m comment --comment "Konvoy: calico-node felix (used for liveness)" -j ACCEPT
iptables -A INPUT -p tcp -m tcp --dport 179 -m comment --comment "Konvoy: calico-node BGP" -j ACCEPT
iptables -A INPUT -p tcp -m tcp --dport 30000:32767 -m comment --comment "Konvoy: NodePorts" -j ACCEPT
iptables -A INPUT -p icmp -m comment --comment "Konvoy: ICMP" -m icmp --icmp-type 8 -j ACCEPT

Worker nodes:

iptables -A INPUT -p tcp -m tcp --dport 10250 -m comment --comment "Konvoy: kubelet --port" -j ACCEPT
iptables -A INPUT -p tcp -m tcp --dport 10248 -m comment --comment "Konvoy: kubelet --healthz-port" -j ACCEPT
iptables -A INPUT -p tcp -m tcp --dport 10249 -m comment --comment "Konvoy: kube-proxy --metrics-bind-address" -j ACCEPT
iptables -A INPUT -p tcp -m tcp --dport 10256 -m comment --comment "Konvoy: kube-proxy --healthz-port" -j ACCEPT
iptables -A INPUT -p tcp -m tcp --dport 9091 -m comment --comment "Konvoy: calico-node felix (used for metrics)" -j ACCEPT
iptables -A INPUT -p tcp -m tcp --dport 9092 -m comment --comment "Konvoy: calico-node bird (used for metrics)" -j ACCEPT
iptables -A INPUT -p tcp -m tcp --dport 9099 -m comment --comment "Konvoy: calico-node felix (used for liveness)" -j ACCEPT
iptables -A INPUT -p tcp -m tcp --dport 5473 -m comment --comment "Konvoy: calico-typha (used for syncserver)" -j ACCEPT
iptables -A INPUT -p tcp -m tcp --dport 9093 -m comment --comment "Konvoy: calico-typha (used for metrics)" -j ACCEPT
iptables -A INPUT -p tcp -m tcp --dport 179 -m comment --comment "Konvoy: calico-node BGP" -j ACCEPT
iptables -A INPUT -p tcp -m tcp --dport 30000:32767 -m comment --comment "Konvoy: NodePorts" -j ACCEPT
iptables -A INPUT -p icmp -m comment --comment "Konvoy: ICMP" -m icmp --icmp-type 8 -j ACCEPT

The default value is false. You can enable this behavior by setting spec.kubernetes.networking.iptables.addDefaultRules to true:

kind: ClusterConfiguration
apiVersion: konvoy.mesosphere.io/v1beta2
spec:
  kubernetes:
    networking:
      iptables:
        addDefaultRules: true

Highly Available Control Plane

Konvoy ships with a highly available control plane for multi-master Kubernetes deployments.

AWS

High availability is provided through the cloud provider’s load balancer.

On-premises

In on-premises deployments, Konvoy ships with Keepalived. Keepalived provides two main functions: high availability and load balancing. It uses the Virtual Router Redundancy Protocol (VRRP) to provide high availability. VRRP allows you to assign a virtual IP (VIP) to a group of participating machines, where the VIP is active on only one of the machines at a time.

VRRP provides high availability by ensuring that the virtual IP remains active as long as at least one of the participating machines is active. Konvoy uses Keepalived to maintain high availability of the control plane.

To use Keepalived:

  1. Identify and reserve a virtual IP (VIP) address from the networking infrastructure.

  2. Configure the networking infrastructure so that the reserved virtual IP address is reachable:

  • from all hosts specified in the inventory file.
  • from the computer that is used to deploy Kubernetes.

If the reserved virtual IP address is in the same subnet as the rest of the cluster nodes, no further configuration is needed. However, if it is in a different subnet, you may need to configure appropriate routes to ensure connectivity with the virtual IP address. Further, the virtual IP address may share an interface with the primary IP address of that interface. In such cases, you must be able to disable any IP or MAC spoofing protection in the infrastructure firewall.

The following example illustrates the configuration if the reserved virtual IP address is 10.0.50.20:

kind: ClusterConfiguration
apiVersion: konvoy.mesosphere.io/v1beta2
spec:
  kubernetes:
    controlPlane:
      controlPlaneEndpointOverride: "10.0.50.20:6443"
      keepalived:
        interface: ens20f0 # optional
        vrid: 51           # optional

The IP address specified in spec.kubernetes.controlPlane.controlPlaneEndpointOverride is used for the Keepalived VIP. This value is optional if it is already specified in inventory.yaml as part of all.vars.control_plane_endpoint. You can set spec.kubernetes.controlPlane.keepalived.interface to specify the network interface for the Keepalived VIP. This field is optional; if not set, Konvoy automatically detects the network interface to use based on the route to the VIP.
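
For reference, a minimal sketch of the corresponding inventory.yaml entry (shown with the example VIP from above; the rest of the inventory is omitted) looks like this:

all:
  vars:
    # Sketch only: other inventory groups and variables are omitted.
    control_plane_endpoint: "10.0.50.20:6443"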

Further, you can set spec.kubernetes.controlPlane.keepalived.vrid to specify the Virtual Router ID used by Keepalived. This field is optional; if not set, Konvoy randomly picks a Virtual Router ID for you.

Keepalived is enabled by default for on-premises deployments. You can disable it by removing spec.kubernetes.controlPlane.keepalived from cluster.yaml. This is usually done when an on-premises load balancer is available to maintain high availability of the control plane.

If you are not setting any of the optional values, set spec.kubernetes.controlPlane.keepalived: {} to enable Keepalived with the default values, as shown below.
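
For example, the following enables Keepalived with all defaults (the endpoint override shown is the example VIP from above; it can be omitted if control_plane_endpoint is already set in inventory.yaml):

kind: ClusterConfiguration
apiVersion: konvoy.mesosphere.io/v1beta2
spec:
  kubernetes:
    controlPlane:
      controlPlaneEndpointOverride: "10.0.50.20:6443"
      keepalived: {}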

Pod-to-Pod connectivity

Konvoy ships with Calico as the default CNI plugin to provide pod-to-pod connectivity. The .yaml file for the default installation can be viewed here. Konvoy exposes two Calico configurations in cluster.yaml: the Calico version and the pod subnet. Both configurations are optional.

By default, Konvoy ships with the latest version of Calico available at the time of the Konvoy release. However, you can pin it to a specific version, as shown below:

kind: ClusterConfiguration
apiVersion: konvoy.mesosphere.io/v1beta2
spec:
  kubernetes:
    containerNetworking:
      calico:
        version: v3.13.4

Further, the Calico IPv4 pool CIDR can be set via spec.kubernetes.networking.podSubnet in cluster.yaml, as shown below:

kind: ClusterConfiguration
apiVersion: konvoy.mesosphere.io/v1beta2
spec:
  kubernetes:
    networking:
      podSubnet: 192.168.0.0/16
      serviceSubnet: 10.0.51.0/24

Konvoy ships with a default pod subnet CIDR of 192.168.0.0/16. Make sure that podSubnet does not overlap with serviceSubnet.

Encapsulation

Two ways of encapsulating pod network traffic are supported: IP-in-IP and VXLAN. By default, IP-in-IP encapsulation is enabled:

kind: ClusterConfiguration
apiVersion: konvoy.mesosphere.io/v1beta2
spec:
  kubernetes:
    containerNetworking:
      calico:
        encapsulation: ipip

The following configuration switches it to VXLAN:

kind: ClusterConfiguration
apiVersion: konvoy.mesosphere.io/v1beta2
spec:
  kubernetes:
    containerNetworking:
      calico:
        encapsulation: vxlan

Network Policy

A network policy specifies how groups of pods are allowed to communicate with each other and with other network endpoints.

In Konvoy, network policies are implemented by Calico.

Calico supports a wide range of network policies and is tightly integrated with Kubernetes network policy. You can use kubectl to configure Kubernetes network policy, which Calico then enforces. Further, Calico extends Kubernetes network policy through custom CRDs, which can be configured using calicoctl. More details about Calico network policy can be found here.
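
As an illustration, the following standard Kubernetes NetworkPolicy (the labels app: backend and app: frontend are hypothetical) allows ingress to backend pods only from frontend pods in the same namespace; once applied with kubectl, it is enforced by Calico:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
spec:
  podSelector:
    matchLabels:
      app: backend      # hypothetical label; adjust to your workloads
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend  # hypothetical label; adjust to your workloads
    ports:
    - protocol: TCP
      port: 8080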

In-cluster BGP Route Reflectors

Calico advertises routes using BGP, with a full node-to-node mesh configured by default. This means every node connects to every other node, which becomes a problem on large clusters. Konvoy supports in-cluster BGP Route Reflectors. Route Reflectors are essential on clusters with more than 200 nodes, and we recommend them on clusters with more than 100 nodes. When Route Reflector nodes are configured, full mesh mode is disabled and each node connects only to the in-cluster Route Reflectors. This reduces CPU and memory utilization on worker and control-plane nodes. More information about Calico BGP peers and Route Reflectors can be found here.

A Route Reflector node requires at least 2 CPU cores and 4 GB of memory. To enable in-cluster BGP Route Reflectors, add at least two nodes (three are recommended) to the route-reflector node pool and add the following cluster configuration:

kind: ClusterConfiguration
apiVersion: konvoy.mesosphere.io/v1beta2
spec:
  nodePools:
  - name: route-reflector
    labels:
      - key: dedicated
        value: route-reflector
    taints:
      - key: dedicated
        value: route-reflector
        effect: NoExecute
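
After the route-reflector nodes join the cluster, you can check the BGP peering from any node with calicoctl; with Route Reflectors enabled, worker and control-plane nodes should list only the Route Reflector nodes as BGP peers rather than a full mesh:

# Run on a cluster node; shows the node's BGP peers and their state.
sudo calicoctl node status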

Service Discovery

Konvoy ships with CoreDNS to provide DNS-based service discovery. The default CoreDNS configuration is shown below:

.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
       pods insecure
       upstream
       fallthrough in-addr.arpa ip6.arpa
    }
    prometheus :9153
    forward . /etc/resolv.conf
    loop
    reload
    loadbalance
}

As shown in the configuration above, CoreDNS ships by default with the errors, health, kubernetes, prometheus, forward, loop, reload, and loadbalance plugins enabled. A detailed explanation of these plugins can be found here.

You can modify the CoreDNS configuration by updating the configmap named coredns in the kube-system namespace.
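
For example, the following command opens the configmap for editing; because the reload plugin is enabled, CoreDNS picks up changes to the Corefile without a restart:

kubectl -n kube-system edit configmap coredns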

Load Balancing

Load Balancing can be addressed in two ways:

  • Load balancing for the traffic within a Kubernetes cluster
  • Load balancing for the traffic coming from outside the cluster

Load balancing for internal traffic

Load balancing within a Kubernetes cluster is exposed through a service of type ClusterIP. A ClusterIP is similar to a virtual IP (VIP): it presents a single IP address to clients and load balances traffic across the backend pods. The actual load balancing happens via iptables rules or IPVS configuration, which are programmed by a Kubernetes component called kube-proxy. By default, kube-proxy runs in iptables mode; it configures iptables to intercept any traffic destined for a ClusterIP and redirect it to the real backends based on probabilistic iptables rules. The kube-proxy configuration can be altered by updating the configmap named kube-proxy in the kube-system namespace.
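
As an illustration (the service name and selector below are hypothetical), a minimal ClusterIP service that load balances TCP traffic across the pods labeled app: my-app looks like this:

apiVersion: v1
kind: Service
metadata:
  name: my-app        # hypothetical name
spec:
  type: ClusterIP
  selector:
    app: my-app       # hypothetical selector
  ports:
  - protocol: TCP
    port: 80
    targetPort: 8080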

Load balancing for external traffic

A Kubernetes service of type LoadBalancer requires a load balancer to connect an external client to your internal service.
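
For example (again with hypothetical names), the following service requests an external load balancer, provided by the cloud provider or by MetalLB depending on the deployment, for the pods labeled app: my-app:

apiVersion: v1
kind: Service
metadata:
  name: my-app-external  # hypothetical name
spec:
  type: LoadBalancer
  selector:
    app: my-app          # hypothetical selector
  ports:
  - protocol: TCP
    port: 80
    targetPort: 8080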

AWS

In cloud deployments, the load balancer is provided by the cloud provider.

On-premises

For an on-premises deployment, Konvoy ships with MetalLB.

To use MetalLB for addon load balancing:

  1. Identify and reserve a virtual IP (VIP) address range from the networking infrastructure.

  2. Configure the networking infrastructure so that the reserved virtual IP addresses are reachable:

  • from all hosts specified in the inventory file.
  • from the computer that is used to deploy Kubernetes.

If the reserved virtual IP addresses are in the same subnet as the rest of the cluster nodes, then nothing more needs to be configured. However, if they are in a different subnet, you may need to configure appropriate routes to ensure connectivity with the virtual IP addresses. Further, the virtual IP addresses may share an interface with the primary IP address of that interface. In such cases, you must disable any IP or MAC spoofing protection in the infrastructure firewall.

MetalLB can be configured in two modes - Layer2 and BGP.

The following example illustrates the Layer2 configuration in the cluster.yaml configuration file:

kind: ClusterConfiguration
apiVersion: konvoy.mesosphere.io/v1beta2
spec:
  addons:
    addonsList:
    - name: metallb
      enabled: true
      values: |-
        configInline:
          address-pools:
          - name: default
            protocol: layer2
            addresses:
            - 10.0.50.25-10.0.50.50

The number of virtual IP addresses in the reserved range determines the maximum number of services with a type of LoadBalancer that you can create in the cluster.

MetalLB in BGP mode implements only minimal BGP functionality: it only advertises the virtual IPs to the peer BGP agent.

The following example illustrates the BGP configuration in the cluster.yaml configuration file:

kind: ClusterConfiguration
apiVersion: konvoy.mesosphere.io/v1beta2
spec:
  addons:
    addonsList:
    - name: metallb
      enabled: true
      values: |-
        configInline:
          peers:
          - my-asn: 64500
            peer-asn: 64500
            peer-address: 172.17.0.4
          address-pools:
          - name: my-ip-space
            protocol: bgp
            addresses:
            - 172.40.100.0/24

In the above configuration, peers defines the configuration of the BGP peer, such as the peer IP address and autonomous system number (ASN). The address-pools section is similar to the layer2 configuration, except for the protocol.

Further, MetalLB supports advanced BGP configuration which can be found here.

NOTE: Making a configuration change to the metallb addon in cluster.yaml and then running konvoy deploy addons may not result in the change being applied. This is intentional behavior: MetalLB refuses to adopt changes to its ConfigMap that would break existing Services. You can force MetalLB to load the changes by deleting the MetalLB controller pod:

kubectl -n kubeaddons delete pod -l app=metallb,component=controller

Make sure the MetalLB subnet does not overlap with podSubnet and serviceSubnet.

Ingress

Konvoy ships with Traefik as the default ingress controller. The default Traefik helm chart can be viewed here. Traefik creates a service of type LoadBalancer. In the cloud, the cloud provider creates the appropriate load balancer. In an on-premises deployment, Traefik uses MetalLB by default, which can be configured as discussed earlier.

Further, Traefik supports features such as name-based routing, path-based routing, and traffic splitting. Details of these features can be viewed here.
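
As an illustration of name-based and path-based routing (the hostname, service names, and ports are hypothetical, and newer Kubernetes versions use the networking.k8s.io/v1 Ingress schema instead), an Ingress resource such as the following is picked up by Traefik:

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: example-ingress
  annotations:
    kubernetes.io/ingress.class: traefik
spec:
  rules:
  - host: app.example.com          # hypothetical hostname
    http:
      paths:
      - path: /api
        backend:
          serviceName: api-service  # hypothetical backend service
          servicePort: 8080
      - path: /
        backend:
          serviceName: web-service  # hypothetical backend service
          servicePort: 80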