
Connecting non-Kubernetes nodes to Calico overlay network

Kubernetes networking has a few basic rules. In short, every pod has to be able to communicate with every other pod. Selecting the right network plugin is a critical part of planning and architecting a new cluster. Luckily, there are great presentations and blog posts about Kubernetes cluster networking on the internet, but very few sources explain how to connect external resources that aren't part of the cluster into the mesh. It all depends on what we would like to achieve, so in the end we have to glue the solutions together ourselves.

In this post, I would like to tell our story at IBM about converting an existing node into a full member of our Kubernetes + Calico network.
First of all, we had to specify the main goals:
  • Make the node a full member of the overlay network
  • The external node needs a pod IP so that it can be reached like any regular pod in the system
  • Services on the external node must be able to listen on the pod IP
  • Service discovery is mandatory in both directions: pods have to resolve the external node's hostname to its pod IP, and the node has to reach Kubernetes services as well
  • On node restart, the same pod IP must be reused for the node
As I mentioned, we use the Calico overlay network for many reasons that I don't want to cover now. If you are interested, you can find more about the available network plugins here and here. Let's jump into the implementation details.

Make the node a full member of the overlay network

This part was the easiest one. In a Kubernetes cluster that uses Calico networking there is a Deployment (calico-kube-controllers) and a DaemonSet (calico-node). The only thing to do here is to run a well-configured calico-node service on the external node (it can run either as a container or as a native service), and all the magic happens behind the scenes.

The external node needs a pod IP so that it can be reached like any regular pod in the system

Kubernetes doesn't include a network layer implementation. I agree with this engineering decision: networking is hard to do and also hard to generalize. Every company, service, or team has its own special requirements; one is latency sensitive, another is throughput sensitive. Others have strict company policies or extra security regulations. And last but not least, some teams want to connect non-Kubernetes workloads to the mesh.

Kubernetes uses network plugins, and it is the plugin's responsibility to manage the network itself. There are two kinds of them, namely kubenet and CNI. In our case, Calico is configured as a CNI plugin.
After the node is registered in the Calico datastore and has become a full member of the network, we can use the Calico CNI plugin to create a workload endpoint, which allocates the pod IP address from the IP pool that the calico-node service assigned to the node.
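
To make the CNI step more concrete, here is a minimal Python sketch of what a CNI ADD request looks like. The environment variable names come from the CNI specification, and `calico` and `calico-ipam` are the plugin types that Calico ships. Everything else (the function name, the container ID, the namespace path, the network name, and the plugin directory) is an assumption for illustration, and a real Calico configuration needs additional datastore settings that are left out here.

```python
import json


def build_cni_add_request(container_id, netns_path, ifname="eth0",
                          cni_path="/opt/cni/bin"):
    """Build the environment and stdin payload for a CNI ADD call.

    Hypothetical helper: it only assembles the request and does not
    execute any plugin binary.
    """
    # Environment variables defined by the CNI specification.
    env = {
        "CNI_COMMAND": "ADD",
        "CNI_CONTAINERID": container_id,
        "CNI_NETNS": netns_path,
        "CNI_IFNAME": ifname,
        "CNI_PATH": cni_path,
    }
    # Minimal network configuration. The network name is illustrative, and
    # a working Calico setup needs extra fields (datastore access, node
    # name, and so on) that are intentionally omitted.
    config = {
        "cniVersion": "0.3.1",
        "name": "external-node-network",
        "type": "calico",
        "ipam": {"type": "calico-ipam"},
    }
    return env, json.dumps(config)


if __name__ == "__main__":
    env, stdin_payload = build_cni_add_request(
        "external-node-1", "/var/run/netns/calico-ext")
    print(env)
    print(stdin_payload)
```

A runtime would execute the plugin binary found under `CNI_PATH` with this environment and pass the JSON on standard input; the plugin then reports the allocated IP address in its JSON result.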

Services on the external node must be able to listen on the pod IP

This is the tricky part. By default, the CNI plugin supports only container execution (what a surprise in the Kubernetes world). This means the previously created network interface and IP address live in a separate network namespace, hidden from other processes. One option is to run the services in this new network namespace. I suggest this only for network experts and for new installations where you want to make a non-containerized service available only to Kubernetes pods. The other option is to copy the network interface to the default namespace. This solution is a bit tricky, but it covers more common use cases and makes the interface available to regular services.
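
For the first option, a service can be started inside a named network namespace with `ip netns exec` from iproute2. The sketch below only builds the command line and does not run it. The helper name, the namespace name `calico-ext`, and the example service are hypothetical, and it assumes the namespace is a named one (visible in `ip netns list`).

```python
import shlex


def in_netns(namespace, command):
    """Wrap a command so that it runs inside the given named network namespace."""
    return ["ip", "netns", "exec", namespace] + list(command)


if __name__ == "__main__":
    # Hypothetical service started inside the hypothetical namespace.
    argv = in_netns("calico-ext", ["python3", "-m", "http.server", "8080"])
    # shlex.join renders the argument list as a shell-safe string.
    print(shlex.join(argv))
```

The resulting argument list can be handed to `subprocess.run` (as root), or used as the start command of the service unit.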

Service discovery is mandatory in both directions

Service discovery is one of the key components of the solution because this is the piece which is used by application developers. Service discovery has to work in two directions:
  • Pods need to reach the node by hostname:
    • During provisioning, the configuration tool creates a headless service in Kubernetes which points to the node's pod IP
  • The external node has to reach services in Kubernetes:
    • The node can reach the ClusterIPs of the Kubernetes cluster via the network interface created by CNI, so it can communicate with CoreDNS. During provisioning, the configuration tool sets CoreDNS as the name resolver
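
Both directions can be sketched in a few lines of Python. This is only an illustration under assumptions, not the configuration tool itself: it assumes the headless service has no selector and is paired with a manually created Endpoints object that holds the node's pod IP, and that the node's resolver is configured through a resolv.conf similar to the one pods get. The function names, the service name, the IP addresses, and the port are made up for the example.

```python
def headless_service_manifests(name, namespace, pod_ip, port):
    """Return (Service, Endpoints) manifests pointing a DNS name at pod_ip."""
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name, "namespace": namespace},
        # clusterIP "None" makes the service headless; without a selector
        # Kubernetes does not manage the endpoints for us.
        "spec": {"clusterIP": "None", "ports": [{"port": port}]},
    }
    endpoints = {
        "apiVersion": "v1",
        "kind": "Endpoints",
        # The Endpoints object must share the name of the Service.
        "metadata": {"name": name, "namespace": namespace},
        "subsets": [{"addresses": [{"ip": pod_ip}],
                     "ports": [{"port": port}]}],
    }
    return service, endpoints


def resolv_conf(dns_cluster_ip, namespace="default",
                cluster_domain="cluster.local"):
    """Render a resolv.conf that sends lookups to the cluster DNS service."""
    search = " ".join([
        f"{namespace}.svc.{cluster_domain}",
        f"svc.{cluster_domain}",
        cluster_domain,
    ])
    return f"nameserver {dns_cluster_ip}\nsearch {search}\noptions ndots:5\n"


if __name__ == "__main__":
    # Example values only: the pod IP and the DNS ClusterIP depend on the cluster.
    svc, eps = headless_service_manifests("external-node", "default",
                                          "192.168.10.5", 8080)
    print(svc)
    print(eps)
    print(resolv_conf("10.96.0.10"))
```

With manifests like these applied, pods would resolve `external-node.default.svc.cluster.local` to the node's pod IP, and the external node would resolve Kubernetes service names through CoreDNS.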

On node restart, the same pod IP must be reused for the node

In the case of a node restart, our target was to restore the same state of the Calico network. The problem is that by the time the services start again, the node is already registered, the IP pool is already associated, and the pod IP is already allocated. We chose the simplest solution possible: we changed the services to clean up the previously created Calico node and workload endpoint before starting.

This is the end of part one. I hope it helped you get a better understanding of how Kubernetes and Calico networking work and how to extend a Calico network with non-Kubernetes workers. If you have any comments, please feel free to discuss them.

Next time we will try this in practice. To be continued...
