
Connecting non-Kubernetes nodes to Calico overlay network

Kubernetes networking has a few basic rules; in short, every pod must be able to communicate with every other pod. Selecting the right network plugin is a critical decision when planning and architecting a new cluster. Luckily, there are great presentations and blog posts about Kubernetes cluster networking on the internet, but the available sources on how to connect external resources that aren't part of the cluster into the mesh are very limited. It all depends on what we would like to achieve, so in the end we have to glue the solutions together ourselves.

In this post, I would like to tell our story @IBM about converting an existing node into a full member of our Kubernetes + Calico network.
First of all, we had to specify the main goals:
  • Make the node a full member of the overlay network
  • The external node needs a pod IP so it can be reached like any regular pod in the system
  • Services on the external node must be able to listen on the pod IP
  • Service discovery is mandatory in both directions: pods have to resolve the external node's hostname to the pod IP, and the node has to reach Kubernetes services as well
  • On node restart, the same pod IP must be reused for the node
As I mentioned, we are using the Calico overlay network for many reasons which I don't want to cover now. If you are interested, you can find more about the available network plugins here and here. Let's jump into the implementation details.

Make the node a full member of the overlay network

This part was the easiest one. In the case of a Calico network in Kubernetes, there is a Deployment (calico-kube-controllers) and a DaemonSet (calico-node). The only thing to do here is to run a well-configured calico-node service on the external node (both container and native service installations are supported), and all the magic happens behind the scenes.
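For illustration, here is a minimal sketch of running calico-node as a container on the external node, assuming an etcd-backed Calico datastore; the node name, etcd endpoint, and image tag are placeholders you would replace with your own values:

    # Sketch: run calico-node as a container on the external node
    # (assumes an etcd-backed datastore; all values below are placeholders)
    docker run -d --name=calico-node \
      --net=host --privileged \
      -e NODENAME=external-node-1 \
      -e IP=autodetect \
      -e CALICO_NETWORKING_BACKEND=bird \
      -e ETCD_ENDPOINTS=https://etcd.example.com:2379 \
      -v /lib/modules:/lib/modules \
      -v /var/run/calico:/var/run/calico \
      -v /var/lib/calico:/var/lib/calico \
      calico/node:v3.8.0

Once the container is up, the node registers itself in the Calico datastore and starts exchanging routes with the rest of the mesh.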

The external node needs a pod IP so it can be reached like any regular pod in the system

Kubernetes does not include a network layer solution itself. I can agree with this engineering decision: networking is hard to do and also hard to generalize. Every company/service/whatever has its own special requirements; one is latency-sensitive, another throughput-sensitive. Others have strict company policies or extra security regulations. And last but not least, some teams want to connect non-Kubernetes workloads to the mesh.

Kubernetes uses network plugins, and it is the plugin's responsibility to manage the network itself. There are two kinds of them, namely kubenet and CNI. In our case, Calico is configured as a CNI plugin.
After the node has been registered in the Calico datastore and has become a full member of the network, we can use the Calico CNI plugin to create a WorkloadEndpoint, which allocates a pod IP address from the IP pool assigned to the node by the calico-node service.
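A CNI plugin is just a binary that reads its network configuration from stdin and a handful of CNI_* environment variables, so it can be driven by hand. A rough sketch of such an invocation follows; the namespace name and etcd endpoint are placeholders, and the exact config depends on your datastore and Calico version:

    # Sketch: invoke the Calico CNI plugin manually to allocate a pod IP
    # (CNI_* variables follow the CNI spec; values are placeholders)
    ip netns add external-workload

    CNI_COMMAND=ADD \
    CNI_CONTAINERID=external-workload \
    CNI_NETNS=/var/run/netns/external-workload \
    CNI_IFNAME=eth0 \
    CNI_PATH=/opt/cni/bin \
    /opt/cni/bin/calico <<'EOF'
    {
      "cniVersion": "0.3.1",
      "name": "k8s-pod-network",
      "type": "calico",
      "etcd_endpoints": "https://etcd.example.com:2379",
      "ipam": { "type": "calico-ipam" }
    }
    EOF

The ADD command creates the WorkloadEndpoint, allocates an address from the node's IP pool, and prints the result as JSON on stdout.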

Services on the external node must be able to listen on the pod IP

The tricky part. By default, the CNI plugin only supports container execution (what a surprise in the Kubernetes world). This means the previously created network interface and IP address live in a separate network namespace, hidden from other processes. One option is to run the services inside this new network namespace. I suggest this only for network experts, and only for new installations where you want to make a non-containerized service available exclusively to Kubernetes pods. The other option is to move the network interface into the default namespace. This solution is a bit tricky, but it covers more common use cases and makes the interface available to regular services.
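A minimal sketch of the second option, continuing the placeholder names from the previous step; note that moving an interface between namespaces clears its addresses, so the pod IP has to be re-applied afterwards:

    # Sketch: move the CNI-created interface into the default namespace
    # (interface, namespace, and IP are placeholders from the earlier step)
    ip netns exec external-workload ip link set eth0 down
    ip netns exec external-workload ip link set eth0 name pod0  # avoid clashing with the host's eth0
    ip netns exec external-workload ip link set pod0 netns 1    # netns 1 = PID 1's (default) namespace
    ip addr add 10.233.64.5/32 dev pod0                         # re-apply the allocated pod IP
    ip link set pod0 up

Any routes the CNI plugin set up inside the namespace are lost as well; what needs to be re-created depends on your routing setup, so it is left out of this sketch.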

Service discovery is mandatory in both directions

Service discovery is one of the key components of the solution, because this is the piece that application developers actually use. There are two directions of service discovery:
  • Pods need to reach node by hostname:
    • During provisioning, the configuration tool creates a headless Service in Kubernetes which points to the node's pod IP (see the sketch below)
  • External node has to reach services in Kubernetes:
    • The node can reach the ClusterIPs of the Kubernetes cluster via the network interface created by CNI, so it can communicate with CoreDNS. During provisioning, the configuration tool sets CoreDNS as the node's name resolver, as shown below
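As an illustration of the first direction, a headless Service backed by a manually managed Endpoints object could look like the following; the names, port, and pod IP are placeholders:

    # Sketch: headless Service + Endpoints resolving the external node's
    # hostname to its pod IP (names and addresses are placeholders)
    kubectl apply -f - <<'EOF'
    apiVersion: v1
    kind: Service
    metadata:
      name: external-node-1
    spec:
      clusterIP: None        # headless: DNS returns the endpoint IP directly
      ports:
      - port: 8080
    ---
    apiVersion: v1
    kind: Endpoints
    metadata:
      name: external-node-1  # must match the Service name
    subsets:
    - addresses:
      - ip: 10.233.64.5      # the node's pod IP
      ports:
      - port: 8080
    EOF

    # For the other direction, point the node's resolver at CoreDNS,
    # e.g. (assuming the cluster DNS ClusterIP is 10.96.0.10):
    # echo "nameserver 10.96.0.10" > /etc/resolv.conf

With this in place, pods can resolve external-node-1.default.svc.cluster.local to the node's pod IP, and the node resolves cluster Service names through CoreDNS.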

On node restart, the same pod IP must be reused for the node

In the case of a node restart, our target was to restore the same state of the Calico network. But the node was already registered, the IP pool was already associated, and the pod IP was already allocated at start time. We chose the simplest possible solution: we changed the services to clean up the previously created Calico node and WorkloadEndpoint before starting.
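A minimal sketch of such a cleanup step, assuming calicoctl v3 and placeholder resource names; in our setup something like this runs before the calico-node service starts:

    # Sketch: remove stale Calico resources before re-registering the node
    # (resource names are placeholders; errors are ignored on first boot,
    # when there is nothing to clean up yet)
    calicoctl delete workloadendpoint <endpoint-name> || true
    calicoctl delete node external-node-1 || true
    # ...then start calico-node and re-run the CNI ADD as shown earlier,
    # which re-allocates the same pod IP from the node's pool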

This is the end of part one. I hope it helped you get a better understanding of how Kubernetes and Calico networking works and how to extend the Calico network with non-Kubernetes workers. If you have any comments, please feel free to discuss them.

Next time we will try this in practice. To be continued...
