Kubernetes is a great tool to organize your workloads at small or large scale. It has many nice features in different areas, but it completely outsources the complexity of networking. The network is one of the key layers of any success story, and happily there are many solutions available on the market. Calico is one of them, and I think it is the most widely used network provider, including among the big players in the public cloud space, and it has a great community that works day by day to make Calico better.
Installing Kubernetes and Calico nowadays is a snap if you are happy with the default configuration. Otherwise, life becomes tricky very quickly: there are so many options, configurations, topologies, automation choices, etc. Surprising or not, networking is one of the hard parts at high scale, and it requires thorough design from the beginning. By default Calico uses IPIP encapsulation and full-mesh BGP to share routing information within the cluster. This means every single node in the cluster peers with every other node, which quickly becomes a bottleneck in large clusters.
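To get a feeling for why full mesh stops scaling, here is a rough back-of-the-envelope sketch. It only counts peerings: in a full mesh of n nodes there are n * (n - 1) / 2 BGP sessions. The function name is made up for this illustration and is not part of Calico.

```python
def full_mesh_sessions(nodes: int) -> int:
    """Count BGP sessions when every node peers with every other node."""
    if nodes < 0:
        raise ValueError("nodes must not be negative")
    # Each unordered pair of nodes shares exactly one session.
    return nodes * (nodes - 1) // 2


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(f"{n} nodes -> {full_mesh_sessions(n)} sessions")
```

The number of sessions grows quadratically with the number of nodes, so a cluster ten times larger needs roughly a hundred times more peerings.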
If you are interested in the details please watch my talk on the topic:
Long story short, Calico uses the same technique as regular network solutions, namely the Route Reflector concept. There are many types of Route Reflector topologies, but they have one thing in common: Route Reflectors are dedicated nodes that collect routing information and advertise it to the others.
So if you want to operate at high scale you have to design your own Route Reflector topology. I collected some of them here; please follow the link for more info.
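The effect of Route Reflectors on the number of peerings can be sketched with a little arithmetic. The sketch below assumes one simple topology, which is only one of many possible ones: every regular node peers with every Route Reflector, and the Route Reflectors peer with each other in a small full mesh. The function name is hypothetical and not part of Calico.

```python
def route_reflector_sessions(nodes: int, reflectors: int) -> int:
    """Count BGP sessions for a simple Route Reflector topology.

    Assumption for this sketch: each client node peers with every
    Route Reflector, and the Route Reflectors form a full mesh.
    """
    if not 0 < reflectors <= nodes:
        raise ValueError("reflectors must be between 1 and nodes")
    clients = nodes - reflectors
    client_sessions = clients * reflectors
    reflector_mesh = reflectors * (reflectors - 1) // 2
    return client_sessions + reflector_mesh


if __name__ == "__main__":
    # A full mesh of 1000 nodes would need 1000 * 999 / 2 = 499500 sessions.
    print(route_reflector_sessions(1000, 5))
```

With 5 Route Reflectors in a 1000-node cluster this gives 995 * 5 + 10 = 4985 sessions, about one percent of the full-mesh count.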
Well done! Really?
If you have some basic experience with distributed systems you may know that change is the only constant. There are many moving parts: for example, nodes come and go, or the entire network can die in a million ways. In this setup, each time the cluster changes an engineer has to check the current topology, calculate a new one, and apply various labels to define the best topology for the current cluster state. It is easy to admit that this is less than optimal :D.
My team at IBM and I started to work on a solution to save our customers from wasting time on re-designing the topology at every scale. Our plan was to open source this feature and merge it as a core Calico feature. So first we and some members of the Calico community wrote a proposal document, which you can find here. Then we implemented a POC, and now I have opened the official pull requests against the kube-controllers and libcalico-go projects.
The autoscaling feature is currently under review, but you can take it for a ride if you want.
So first bring your own cluster, or use my Kind template:
kind create cluster --config cluster.yaml
Then apply Calico manifests which I prepared (diff):
kubectl apply -f https://raw.githubusercontent.com/mhmxs/calico-manifests-dev/main/routereflector/calico-3.17-kdd.yaml
This feature is a technical preview, use it at your own risk! The calico-kube-controllers image has been built on my computer, so you have to trust me (or build your own).
If you get an error like: no matches for kind "BGPConfiguration" in version "crd.projectcalico.org/v1", please re-apply the manifest; your computer was not fast enough to create the CRDs.
You can follow the logs of the operator:
kubectl logs -n kube-system -f -l k8s-app=calico-kube-controllers
And check BGP configurations:
kubectl get bgppeers
Optionally you can change the configuration:
# kubectl edit kubecontrollersconfigurations default
spec:
  controllers:
    routereflector:
      min: 2
      ratio: 0.5
      zoneLabel: kubernetes.io/arch
More options are available here.
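To make the min and ratio options a bit more tangible, here is a minimal sketch of how a target number of Route Reflectors could be derived from them. It is not the controller's code. It assumes that ratio is the fraction of nodes that should act as Route Reflectors and that min is a lower bound; the real implementation may round differently or apply further limits. The function name is hypothetical, and zoneLabel is ignored here.

```python
import math


def desired_route_reflectors(nodes: int, minimum: int = 2, ratio: float = 0.5) -> int:
    """Estimate how many nodes should become Route Reflectors.

    Assumptions for this sketch: `ratio` is the fraction of nodes acting
    as Route Reflectors, `minimum` is a lower bound, and the result can
    never exceed the number of nodes in the cluster.
    """
    if nodes < 0:
        raise ValueError("nodes must not be negative")
    wanted = max(minimum, math.ceil(nodes * ratio))
    return min(wanted, nodes)


if __name__ == "__main__":
    for n in (1, 4, 5, 10):
        print(f"{n} nodes -> {desired_route_reflectors(n)} route reflectors")
```

Under these assumptions, lowering ratio on a small test cluster has the same visible effect as adding or removing nodes, which makes it a cheap way to watch the topology change.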
Once the topology becomes stable (give it some time), you can test autoscaling by scaling your cluster up or down. A less expensive option is to change the kubecontrollersconfigurations instead.
The autoscaler is far from perfect and supports only the multi-cluster topology at the moment, but it is ready for wider testing and opens opportunities for further discussion.
So please, please, please ...
- feel free to share your experiences and ideas!
- keep your eyes on the source code, reviews are welcome!
- give us more use cases!
- join our community Slack channel!