• Apr 26, 2019
Kubernetes Operators are all the rage. However, getting the concept right is difficult. Here we want to provide our view on the past, present, and future of Kubernetes Operators.
Before operators became a thing, we had no commonly established approach for implementing idempotency or reconciliation. At some point, CoreOS published their blog post introducing the idea of operators. We were all hooked. The idea pointed us in the right direction and got us into the topic.
In the early days, there was nothing to really guide us through it: no real example projects or libraries we could use as a blueprint. The times were transformative. We had to come up with something that wasn’t there. This is when our own operatorkit was born. ThirdPartyResources (TPRs) were what we reconciled upon at first, but they were quickly replaced by CustomResourceDefinitions (CRDs), which we are all used to nowadays.
An operator is a codified agent that takes care of a specific task. As with microservices, there is likely no single 100% correct way to cut it. The answer here is always “it depends”. What turns a microservice into an operator is continuous reconciliation and, in the scope of Kubernetes, its relation to the Kubernetes API. As we see it, an operator is thus an asynchronously reconciling microservice managing one or more controllers. Each controller executes resource implementations, and these can do whatever is necessary. One resource implementation might manage Ingress Controllers running within Kubernetes, or at least their registered backends. Another might manage cloud provider specific resources, like VPCs in AWS.
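The layering described above (operator, controllers, resource implementations) can be sketched in a few lines of Go. This is a minimal illustration, not operatorkit's actual API: the `Resource`, `Controller`, and `Operator` types, the `recorder` toy resource, and the names "vpc", "ingress", and "drain" are all assumptions made up for this example. For brevity, every controller here receives the same object key, whereas real controllers usually watch different kinds of objects.

```go
package main

import (
	"context"
	"fmt"
)

// Resource is one unit of reconciliation logic, for example "ingress
// backends" or "AWS VPC". Name and signature are illustrative only.
type Resource interface {
	Name() string
	Reconcile(ctx context.Context, obj string) error
}

// Controller executes its resources, in order, for an observed object.
type Controller struct {
	Name      string
	Resources []Resource
}

func (c *Controller) Reconcile(ctx context.Context, obj string) error {
	for _, r := range c.Resources {
		if err := r.Reconcile(ctx, obj); err != nil {
			return fmt.Errorf("controller %q, resource %q: %w", c.Name, r.Name(), err)
		}
	}
	return nil
}

// Operator manages one or more controllers.
type Operator struct {
	Controllers []*Controller
}

func (o *Operator) Reconcile(ctx context.Context, obj string) error {
	for _, c := range o.Controllers {
		if err := c.Reconcile(ctx, obj); err != nil {
			return err
		}
	}
	return nil
}

// recorder is a toy resource that only records that it was called.
type recorder struct {
	name  string
	calls *[]string
}

func (r recorder) Name() string { return r.name }

func (r recorder) Reconcile(ctx context.Context, obj string) error {
	*r.calls = append(*r.calls, r.name+":"+obj)
	return nil
}

// demo wires one operator with two controllers and returns the call order.
func demo() ([]string, error) {
	var calls []string
	op := &Operator{Controllers: []*Controller{
		{Name: "cluster", Resources: []Resource{recorder{"vpc", &calls}, recorder{"ingress", &calls}}},
		{Name: "node", Resources: []Resource{recorder{"drain", &calls}}},
	}}
	err := op.Reconcile(context.Background(), "cluster-a")
	return calls, err
}

func main() {
	calls, err := demo()
	fmt.Println(calls, err)
}
```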
Reconciliation is a tricky beast and anything but straightforward. The point of reconciliation is to drive the current system state continuously towards the defined desired state. The differentiation between current and desired state is key here. You need to understand what both of these things are and what they mean.
Let’s assume we want to reconcile the version of an Ingress Controller. The Ingress Controller, if it runs at all, runs in its current version. This current state can be read from the system. Then a request comes in from somewhere to change the version. The version it should change to is the desired state, and it is written to the system.
So remember this: the current state is always only read from the system, whereas the desired state is always only written to the system. Your implementation then reconciles periodically, for example once every 5 minutes. Creating, updating or deleting the resource in the Kubernetes API generates events as well. Down the road, there are more sophisticated techniques to implement something like level-based API operations and whatnot, but that goes too far for now.
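To make the read/write split concrete, here is a minimal sketch of a single reconciliation pass for the version example. The `System` type, the `Action` constants, and the version strings are invented for illustration, and an empty string stands in for "not installed". A real implementation would call `ReconcileOnce` from a periodic loop (for example a `time.Ticker`) and from event handlers.

```go
package main

import "fmt"

// System is a hypothetical stand-in for whatever is being managed.
// An empty installed version means the Ingress Controller is not running.
type System struct {
	installed string
}

// Action names the operation a reconciliation pass decided on.
type Action string

const (
	ActionNone   Action = "none"
	ActionCreate Action = "create"
	ActionUpdate Action = "update"
	ActionDelete Action = "delete"
)

// Decide compares the current and the desired version and picks an action.
func Decide(current, desired string) Action {
	switch {
	case current == desired:
		return ActionNone
	case current == "":
		return ActionCreate
	case desired == "":
		return ActionDelete
	default:
		return ActionUpdate
	}
}

// ReconcileOnce performs one pass: the current state is only read from the
// system, the desired state is only written to it.
func ReconcileOnce(sys *System, desired string) Action {
	current := sys.installed // read
	action := Decide(current, desired)
	if action != ActionNone {
		sys.installed = desired // write
	}
	return action
}

func main() {
	sys := &System{}
	// Made-up version numbers; the last entry requests removal.
	for _, desired := range []string{"1.0.0", "1.0.0", "1.1.0", ""} {
		fmt.Printf("desired=%q action=%s\n", desired, ReconcileOnce(sys, desired))
	}
}
```

Running the same desired version twice yields no second change, which is the idempotency that reconciliation relies on.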
We came up with our own operatorkit library because nothing else was available. It started out as an opinionated set of primitives and got tailored to reconcile the hundreds of Kubernetes clusters we manage in production right now.
Each framework has different features, or does the same things differently. What we focused on was a composable design backed by a reliable informer implementation, plus some sugar here and there. The informer implementations provided upstream were cumbersome and faulty, as we experienced over time. That is why we came up with our own deterministic implementation.
We have different resource interfaces suited for different use cases. The CRUD resource interface guides you through the separate steps of reconciliation and makes it quite easy to get the implementation right on a conceptual level. For instance, you implement separate methods for reading the current state and for computing the desired state as it should be applied. Further, there are separate methods for creating, updating or deleting the system resources your reconciliation aims to manage. All the abstract magic of deciding when and how to call which method of the interface implementation happens under the hood.
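A rough sketch of what such a CRUD-style interface and its driver might look like follows. The method names and signatures are simplified assumptions for this post, not operatorkit's real interface, and state is reduced to a version string keyed by an object name. The `Reconcile` function plays the role of the "under the hood" part that decides which method to call.

```go
package main

import (
	"context"
	"fmt"
)

// CRUDResource is a hypothetical, simplified CRUD-style resource interface.
type CRUDResource interface {
	GetCurrentState(ctx context.Context, obj string) (string, error)
	GetDesiredState(ctx context.Context, obj string) (string, error)
	Create(ctx context.Context, obj, desired string) error
	Update(ctx context.Context, obj, desired string) error
	Delete(ctx context.Context, obj string) error
}

// Reconcile decides which CRUD method to call. deleted is true when the
// observed object is being removed from the Kubernetes API.
func Reconcile(ctx context.Context, r CRUDResource, obj string, deleted bool) error {
	current, err := r.GetCurrentState(ctx, obj)
	if err != nil {
		return err
	}
	if deleted {
		if current == "" {
			return nil // nothing left to clean up
		}
		return r.Delete(ctx, obj)
	}
	desired, err := r.GetDesiredState(ctx, obj)
	if err != nil {
		return err
	}
	switch {
	case current == desired:
		return nil
	case current == "":
		return r.Create(ctx, obj, desired)
	default:
		return r.Update(ctx, obj, desired)
	}
}

// versionResource is an in-memory implementation used for the demo.
type versionResource struct {
	running map[string]string // current state
	wanted  map[string]string // desired state
	calls   []string
}

func (v *versionResource) GetCurrentState(_ context.Context, obj string) (string, error) {
	return v.running[obj], nil
}

func (v *versionResource) GetDesiredState(_ context.Context, obj string) (string, error) {
	return v.wanted[obj], nil
}

func (v *versionResource) Create(_ context.Context, obj, desired string) error {
	v.calls = append(v.calls, "create")
	v.running[obj] = desired
	return nil
}

func (v *versionResource) Update(_ context.Context, obj, desired string) error {
	v.calls = append(v.calls, "update")
	v.running[obj] = desired
	return nil
}

func (v *versionResource) Delete(_ context.Context, obj string) error {
	v.calls = append(v.calls, "delete")
	delete(v.running, obj)
	return nil
}

// demo runs four passes: create, no-op, update, delete.
func demo() ([]string, error) {
	ctx := context.Background()
	v := &versionResource{running: map[string]string{}, wanted: map[string]string{"ingress": "1.0.0"}}
	for _, step := range []struct {
		version string
		deleted bool
	}{{"1.0.0", false}, {"1.0.0", false}, {"1.1.0", false}, {"1.1.0", true}} {
		v.wanted["ingress"] = step.version
		if err := Reconcile(ctx, v, "ingress", step.deleted); err != nil {
			return v.calls, err
		}
	}
	return v.calls, nil
}

func main() {
	calls, err := demo()
	fmt.Println(calls, err)
}
```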
The resource interface makes implementations composable, so we wrap resources in resources, just like the middleware patterns you know from API Gateways or your favorite microservice framework. For all of our resources we use the retry and metrics resources. They wrap the actual resources, back off as configured on error, and emit metrics so we know how often we reconcile and how long reconciliation loops take.
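The wrapping idea can be sketched as ordinary middleware. The `WithRetry` and `WithMetrics` helpers below are assumptions written for this example, not the wrappers operatorkit ships: the retry uses a fixed wait instead of a configurable backoff policy, and the metrics are plain counters rather than a real metrics client.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// Resource is a hypothetical minimal resource interface.
type Resource interface {
	Reconcile(ctx context.Context, obj string) error
}

// ResourceFunc adapts a plain function to the Resource interface.
type ResourceFunc func(ctx context.Context, obj string) error

func (f ResourceFunc) Reconcile(ctx context.Context, obj string) error { return f(ctx, obj) }

// WithRetry wraps next and retries on error, waiting a fixed backoff
// between attempts. At least one attempt is always made.
func WithRetry(next Resource, attempts int, backoff time.Duration) Resource {
	if attempts < 1 {
		attempts = 1
	}
	return ResourceFunc(func(ctx context.Context, obj string) error {
		var err error
		for i := 0; i < attempts; i++ {
			if err = next.Reconcile(ctx, obj); err == nil {
				return nil
			}
			if i == attempts-1 {
				break // no point in waiting after the last attempt
			}
			select {
			case <-ctx.Done():
				return ctx.Err()
			case <-time.After(backoff):
			}
		}
		return err
	})
}

// Metrics holds simple counters. It is not safe for concurrent use.
type Metrics struct {
	Calls  int
	Errors int
	Total  time.Duration
}

// WithMetrics wraps next and records call count, error count and duration.
func WithMetrics(next Resource, m *Metrics) Resource {
	return ResourceFunc(func(ctx context.Context, obj string) error {
		start := time.Now()
		err := next.Reconcile(ctx, obj)
		m.Calls++
		m.Total += time.Since(start)
		if err != nil {
			m.Errors++
		}
		return err
	})
}

// demo wraps a resource that fails twice before succeeding.
func demo() (calls, errs int, err error) {
	failures := 2
	flaky := ResourceFunc(func(ctx context.Context, obj string) error {
		if failures > 0 {
			failures--
			return errors.New("temporary failure")
		}
		return nil
	})
	m := &Metrics{}
	r := WithRetry(WithMetrics(flaky, m), 3, time.Millisecond)
	err = r.Reconcile(context.Background(), "cluster-a")
	return m.Calls, m.Errors, err
}

func main() {
	calls, errs, err := demo()
	fmt.Println(calls, errs, err)
}
```

Putting the metrics wrapper inside the retry wrapper means every attempt is counted. Swapping the order would count one reconciliation per call regardless of retries.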
One of our favorite primitives is the automatic finalizer handling. You do not need to worry about it. Any observed runtime object that is reconciled gets a specific finalizer applied for its observing controller, all automatically under the hood. When an error is returned while deleting a system resource, operatorkit does not remove the finalizer registered for the initiating controller. Once the deletion succeeds, which means finishing without returning an error, the finalizer is removed from the reconciled runtime object. At this point, the system should be as clean as it was before. If not, the resource implementations are simply broken in that regard.
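The finalizer flow described above can be sketched as follows. This is a simplified model under stated assumptions, not operatorkit's code: `Object` is a stripped-down stand-in for a runtime object, the `Deleting` flag stands in for a set deletion timestamp, and the finalizer name is hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// finalizer is a hypothetical finalizer name for the observing controller.
const finalizer = "example.com/demo-controller"

// Object is a stripped-down stand-in for a Kubernetes runtime object.
type Object struct {
	Name       string
	Deleting   bool // stands in for a set deletion timestamp
	Finalizers []string
}

func hasFinalizer(o *Object, f string) bool {
	for _, x := range o.Finalizers {
		if x == f {
			return true
		}
	}
	return false
}

func removeFinalizer(o *Object, f string) {
	kept := o.Finalizers[:0]
	for _, x := range o.Finalizers {
		if x != f {
			kept = append(kept, x)
		}
	}
	o.Finalizers = kept
}

// errCleanup simulates a failing deletion of a system resource.
var errCleanup = errors.New("cleanup failed")

// Handle adds the finalizer while the object is alive and removes it only
// after ensureDeleted finished without returning an error.
func Handle(o *Object, ensureCreated, ensureDeleted func(*Object) error) error {
	if !o.Deleting {
		if !hasFinalizer(o, finalizer) {
			o.Finalizers = append(o.Finalizers, finalizer)
		}
		return ensureCreated(o)
	}
	if !hasFinalizer(o, finalizer) {
		return nil // this controller already finished its cleanup
	}
	if err := ensureDeleted(o); err != nil {
		return err // finalizer stays, deletion is retried on the next pass
	}
	removeFinalizer(o, finalizer)
	return nil
}

func main() {
	obj := &Object{Name: "cluster-a"}
	ok := func(*Object) error { return nil }
	failing := func(*Object) error { return errCleanup }

	fmt.Println(Handle(obj, ok, ok), obj.Finalizers)
	obj.Deleting = true
	fmt.Println(Handle(obj, ok, failing), obj.Finalizers)
	fmt.Println(Handle(obj, ok, ok), obj.Finalizers)
}
```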
In combination with finalizers and other features, operatorkit supports control flow primitives to keep finalizers on demand or to cancel resources on purpose. Our implementation here relies on the concept of the Go Context and is a bit ugly, to be honest. There are ideas on how to improve that in the future, where we can also align more with the community.
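One way such context-based control flow could work is sketched below. The function names (`KeepFinalizers`, `CancelResources`, `RunDeletion`) and the single shared flag struct are assumptions for this example and do not mirror operatorkit's packages. The flags are plain booleans, so the sketch assumes all resources of one reconciliation run on a single goroutine.

```go
package main

import (
	"context"
	"fmt"
)

type ctxKey struct{}

// controlFlow carries the flags resources can set during one reconciliation.
type controlFlow struct {
	finalizersKept    bool
	resourcesCanceled bool
}

// NewContext attaches a fresh set of control flow flags to ctx.
func NewContext(ctx context.Context) context.Context {
	return context.WithValue(ctx, ctxKey{}, &controlFlow{})
}

func fromContext(ctx context.Context) *controlFlow {
	cf, _ := ctx.Value(ctxKey{}).(*controlFlow)
	return cf
}

// KeepFinalizers asks the controller not to remove its finalizer yet.
func KeepFinalizers(ctx context.Context) {
	if cf := fromContext(ctx); cf != nil {
		cf.finalizersKept = true
	}
}

func FinalizersKept(ctx context.Context) bool {
	cf := fromContext(ctx)
	return cf != nil && cf.finalizersKept
}

// CancelResources asks the controller to skip all remaining resources.
func CancelResources(ctx context.Context) {
	if cf := fromContext(ctx); cf != nil {
		cf.resourcesCanceled = true
	}
}

func ResourcesCanceled(ctx context.Context) bool {
	cf := fromContext(ctx)
	return cf != nil && cf.resourcesCanceled
}

// ResourceFunc is a hypothetical delete handler of one resource.
type ResourceFunc func(ctx context.Context) error

// RunDeletion executes resources in order and reports how many ran and
// whether the finalizer may be removed afterwards.
func RunDeletion(ctx context.Context, resources []ResourceFunc) (ran int, removeFinalizer bool, err error) {
	ctx = NewContext(ctx)
	for _, r := range resources {
		if ResourcesCanceled(ctx) {
			break
		}
		if err := r(ctx); err != nil {
			return ran, false, err
		}
		ran++
	}
	return ran, !FinalizersKept(ctx), nil
}

// demo: the first resource is still waiting for cleanup to finish, so it
// keeps the finalizer and cancels the resources that come after it.
func demo() (int, bool, error) {
	resources := []ResourceFunc{
		func(ctx context.Context) error {
			KeepFinalizers(ctx)
			CancelResources(ctx)
			return nil
		},
		func(ctx context.Context) error { return nil }, // skipped
	}
	return RunDeletion(context.Background(), resources)
}

func main() {
	ran, removeFinalizer, err := demo()
	fmt.Println(ran, removeFinalizer, err)
}
```

Hiding mutable flags inside a context is exactly the kind of design that works but feels awkward, since a context is normally treated as read-only request scope.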
And now kubebuilder has come around as well, with yet another fundamentally different approach. A good one though.
Since all of this is pretty much new territory, we have to make some tough decisions. We would like to get closer to the community while keeping our pace. We think we have a couple of useful things we would like to share, or at least not give up. The question will be how we can get the community’s features while keeping ours. One possible solution could be to team up with the kubebuilder maintainers and get in what is important to us. That would mean a fair bit of work, and the success of our investment there would not be guaranteed. Another way would be to take what the community provides and make it work for us somehow. As we recently found when investigating this, it would be rather weird and ugly with kubebuilder as it is right now, because of the code generation it provides. Code generation is a double-edged sword: as it stands, it is not flexible enough to hook into, nor does it provide upgradeable code generation that maintains boilerplate across versions while keeping customizations.
What we want to keep are our control flow primitives and the automatic finalizer management. The resource interfaces and resource wrapping are also important to us as features. What we want from kubebuilder is the code generation and boilerplate management, but upgradeable and customizable. The level-based API implementation and the reconciliation result primitive, which asks for rescheduling at a specific future point in time, are also interesting for us.
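As a rough idea of what a reconciliation result primitive offers, the sketch below returns a result that asks to be scheduled again after a given duration. The local `Result` type is loosely modeled on the `RequeueAfter` field of controller-runtime's `reconcile.Result`, but it is defined here so the example stays self-contained. The `readyAt` parameter and the dates are made up for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// Result tells the caller when to reconcile the object again.
// A zero RequeueAfter means no explicit rescheduling is requested.
type Result struct {
	RequeueAfter time.Duration
}

// Reconcile checks a hypothetical resource that becomes ready at readyAt.
// While it is not ready, it asks to be called again exactly when it will be.
func Reconcile(now, readyAt time.Time) Result {
	if now.Before(readyAt) {
		return Result{RequeueAfter: readyAt.Sub(now)}
	}
	return Result{}
}

// demo reconciles once before and once at the ready time.
func demo() (Result, Result) {
	start := time.Date(2019, time.April, 26, 12, 0, 0, 0, time.UTC)
	readyAt := start.Add(90 * time.Second)
	return Reconcile(start, readyAt), Reconcile(readyAt, readyAt)
}

func main() {
	first, second := demo()
	fmt.Println(first.RequeueAfter, second.RequeueAfter)
}
```

Compared to reconciling on a fixed interval, this lets each pass pick the next point in time that actually matters for the object.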