Jun 24, 2022
Over recent years, since the advent of Docker and the rising popularity of containers, the concept of Infrastructure as Code (IaC) has steadily expanded. What started as APIs to concrete infrastructure like (virtual) machines, networking, and storage slowly grew to include the OS and Kubernetes, as well as their configuration and hardening. It all depends on your point of view: modern IaC tools like Terraform even go as far as supporting the deployment of workloads.
What has not changed so much is why people got excited about the “as Code” movement in the first place. It all boils down to taking the tools (editors, CI/CD, ...) and processes (code reviews, versioning, ...) we were used to from software development and applying them to the lower layers, while at the same time making those layers descriptive, repeatable, shareable, and, last but not least, automatable.
Now, the next step for most is expanding this concept and its benefits to the whole Developer Platform we want to offer our developers. The goal is to build Platform-as-a-Service-like systems that abstract away the infrastructure and enable engineers to focus on their code. Just like with a PaaS, we would ideally get benefits like self-service, standardization, and shared best practices, as well as some form of security and compliance enforcement, without having to bother developers.
However, there are some pitfalls of typical PaaS systems that we should avoid.
First, the abstractions of a PaaS often result in artificial limitations, and as software and developers grow and mature, they hit more of those limits. With traditional or closed PaaS systems, this leads to exceptions being modeled as (ugly) workarounds. Second, traditional PaaS offerings often came with the downside of high vendor lock-in. Third, we should ask the unpopular question: is a single platform actually enough? Do your Data Science Engineers need the same platform as your eCommerce team?
"Kubernetes is a platform for building platforms" — you've probably heard some version of this if you follow Kubernetes thought leaders like Kelsey Hightower or Joe Beda.
In line with this, I would propose that Kubernetes can actually be the platform of choice for more than just containers. In fact, it can be the one thing we need to finally get to the Platform as Code world we envision.
The benefits of Kubernetes, both as an orchestrator and as a unified interface, form the basis of my argument. As an orchestrator, Kubernetes brings us its famous reconciliation approach, which you could see as a stronger form of the declarative paradigm. It allows operational knowledge to be codified in custom controllers (aka operators), which is more resilient and future-proof than building that knowledge into scripts of any form. Furthermore, its state is a store of desired state, not a record of current status; the latter is a typical downside of how storage and state are handled in typical IaC tools.
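To make the reconciliation idea concrete, here is a minimal, framework-free sketch of the pattern behind Kubernetes controllers. All names (`reconcile`, `apply`, the toy state dictionaries) are illustrative only; real controllers are built on frameworks such as controller-runtime and talk to the API server instead of mutating dictionaries.

```python
# Minimal sketch of the reconciliation pattern behind Kubernetes controllers.
# All names are illustrative; a real controller watches the API server and
# issues create/update/delete calls against real resources.

def reconcile(desired: dict, actual: dict) -> list:
    """Compute the actions needed to converge actual state to desired state."""
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name, spec))
        elif actual[name] != spec:
            actions.append(("update", name, spec))
    for name in actual:
        if name not in desired:
            actions.append(("delete", name, None))
    return actions

def apply(actual: dict, actions: list) -> dict:
    """Apply the computed actions, returning the new actual state."""
    state = dict(actual)
    for verb, name, spec in actions:
        if verb == "delete":
            state.pop(name, None)
        else:
            state[name] = spec
    return state

# The loop runs until actual matches desired -- and, crucially, keeps
# running, so drift (manual changes, failures) is corrected on later passes.
desired = {"web": {"replicas": 3}, "db": {"replicas": 1}}
actual = {"web": {"replicas": 2}, "cache": {"replicas": 1}}
while (actions := reconcile(desired, actual)):
    actual = apply(actual, actions)
```

The key point of the pattern is that the stored truth is the *desired* state; the controller continuously derives the necessary actions from the difference, rather than replaying a script of past actions.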
Kubernetes, as a unified interface, brings us a common API with built-in features like authentication, rate limiting, and auditing. Even better, this API has become the standard for cloud native workload management, and thanks to its native extensibility, familiarity with the Kubernetes API translates directly to its extensions. Piggybacking further on Kubernetes’ success over recent years, there’s extensive tooling support, from traditional IaC through CI/CD to modern GitOps approaches.
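The extensibility mentioned above is mostly delivered through CustomResourceDefinitions (CRDs), which add new resource types to the same API, behind the same authentication and audit machinery. As a purely hypothetical illustration (the `platform.example.com` group and the `Database` kind are invented for this sketch, not a real project's API), a platform team might expose a self-service database abstraction like this:

```yaml
# Hypothetical CRD -- group, kind, and fields are illustrative only.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databases.platform.example.com
spec:
  group: platform.example.com
  scope: Namespaced
  names:
    kind: Database
    plural: databases
    singular: database
  versions:
    - name: v1alpha1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                engine:
                  type: string
                size:
                  type: string
```

Once such a CRD is installed, developers interact with `Database` objects through the exact same `kubectl`, RBAC, and GitOps workflows they already use for Deployments and Services.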
Last but not least, a lot of companies have already been extending the API for many use cases, including reaching a first consensus on common abstractions for defining clusters, apps, and infrastructure services from within Kubernetes.
First and foremost, we have the upstream project Cluster API, which not long ago announced production readiness with its 1.0 release. For the uninitiated, Cluster API is an upstream effort towards a consensus API to declaratively manage the lifecycle of Kubernetes clusters on any infrastructure. And if that sounds like just an API to you, be assured that it includes working implementations of said API that get clusters spawned on many infrastructure providers out there, including the big hyperscalers as well as common on-premises solutions.
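To give a feel for what "clusters as Kubernetes resources" looks like, here is an abbreviated Cluster API manifest. It follows the `cluster.x-k8s.io/v1beta1` API that shipped with Cluster API 1.0; the names (`my-cluster`, the AWS infrastructure reference) and the CIDR are placeholders, and a real deployment needs the referenced control plane and infrastructure objects as well:

```yaml
# Abbreviated Cluster API example -- names and provider are placeholders.
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: my-cluster
  namespace: default
spec:
  clusterNetwork:
    pods:
      cidrBlocks: ["192.168.0.0/16"]
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: KubeadmControlPlane
    name: my-cluster-control-plane
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: AWSCluster
    name: my-cluster
```

The provider-specific details live behind the `infrastructureRef`, which is exactly how the same declarative `Cluster` abstraction spans hyperscalers and on-premises solutions.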
Now that we have clusters checked off, next come the applications and workloads in those clusters. For a full-featured Cloud Native Platform, you’ll want a base set of observability tooling, connectivity tools, tools that compose your developer pipelines, and maybe some additional security tooling or a service mesh. For now, as a community, we can at least agree on Helm as a common packaging format. How to actually deploy those Helm charts into clusters is still a field with little consensus, however, especially in multi-cluster environments, which are becoming more and more common as managing clusters gets easier. If you’ve already jumped on the GitOps bandwagon, tools like Flux CD offer abstractions like the HelmRelease that can help. At Giant Swarm, we developed an open source Kubernetes extension called app-operator that extends Helm with multi-cluster functionality as well as multiple levels of configuration overrides, easing the pain of configuration management when you deploy fleets of apps into fleets of clusters. It also paves the way for including more metadata, such as test results and compatibility information, in the deployment process.
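As an illustration of Helm deployments expressed declaratively, here is a sketch of a Flux HelmRelease. It follows Flux's `helm.toolkit.fluxcd.io/v2beta1` API; the chart (`podinfo`), repository name, and values are placeholders, and the referenced HelmRepository source must exist separately:

```yaml
# Sketch of a Flux HelmRelease -- chart and values are placeholders.
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: podinfo
  namespace: default
spec:
  interval: 5m
  chart:
    spec:
      chart: podinfo
      version: "6.x"
      sourceRef:
        kind: HelmRepository
        name: podinfo
        namespace: flux-system
  values:
    replicaCount: 2
```

Instead of running `helm install` imperatively, the release itself becomes a Kubernetes resource that a controller reconciles, so it gets versioned, reviewed, and drift-corrected like everything else.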
One other type of resource that we can hardly ignore is Cloud Provider Services. Here, we see most of the hyperscalers developing their own native Kubernetes extensions, so that you can spawn what they call first-party resources, such as managed databases, directly from within Kubernetes and connect to them from your Cloud Native workloads. Another very interesting approach is that of Crossplane, an open source Kubernetes extension that lets users assemble services from multiple vendors through a single extension, offering a layer of abstraction that reduces lock-in to any particular provider, of which it already supports quite a few.
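A sketch of what this looks like with Crossplane: the exact API group and fields depend on the Crossplane provider and its version, but roughly following the AWS provider's `database.aws.crossplane.io/v1beta1` API, a managed database could be requested like this (instance name, region, and sizing are placeholders):

```yaml
# Sketch of a Crossplane managed resource -- fields vary by provider version.
apiVersion: database.aws.crossplane.io/v1beta1
kind: RDSInstance
metadata:
  name: example-db
spec:
  forProvider:
    region: eu-central-1
    dbInstanceClass: db.t3.small
    engine: postgres
    allocatedStorage: 20
    masterUsername: admin
  writeConnectionSecretToRef:
    name: example-db-conn
    namespace: default
```

The connection details land in an ordinary Kubernetes Secret, so workloads consume the managed service through the same mechanisms they would use for an in-cluster database.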
The above is just the base set of extensions; there’s quite a lot of growth in the sector, and more and more projects either use Kubernetes under the hood or extend it openly towards their use cases. In the context of building Platforms as Code, it is especially relevant to mention some of the more specific frameworks and extensions that cover specialized but common use cases like MLOps/AI with the Kubeflow project and Edge Computing with KubeEdge.
We are still in the early days of Kubernetes extensions, as well as of Platform as Code in general. Most standardization efforts are still young, but are moving rapidly towards consensus and, subsequently, production readiness.
The area that we need to address the most is the User Experience of such extensions. This is not limited to improving the validation and defaulting of our APIs; we also need to improve the discovery of extensions as well as their level of documentation. Furthermore, once we move some of these standards closer to production, we as a community need to be careful to keep the APIs composable and to foster interaction without tightly coupling systems. Last but not least, debuggability and traceability in complex systems with many Kubernetes extensions can still be improved upon.
One sure thing, however, is that Kubernetes is here to stay. It will further establish itself as the interface of choice for infrastructure and cloud native technology in general. In addition, more standards will be established, and more tools will support and integrate with these standards.
In short, my vision going forward is Kubernetes as the Cloud Native Management Interface. It isn’t one tool to rule them all but a consensus API that unifies communities. Of course, you can still have and give the freedom to use the tools of your choice, but the unified open source interface guarantees that you won’t get locked in.
With Docker and Containers, we created the mindset of treating Workloads as ephemeral. Using the same technologies, we can expand this notion not just to Kubernetes clusters, but to our whole developer platform. Or, if you like, the multitude of platforms we’ll offer our users.