When we started doing platform work more than ten years ago, nobody called it "platform engineering." We were just trying to make it easier for people to run software in production without losing their weekends.
A lot has changed. Kubernetes matured, the CNCF ecosystem grew to roughly 200 projects, and suddenly every enterprise has a slide about internal developer platforms. But one thing hasn't changed: the sheer amount of work it takes to keep a production platform running once you've assembled it.
We've started calling this the platform assembly tax. Not the cost of building your platform, but the ongoing cost of keeping it alive, healthy, and compliant. In our experience, it's the thing most teams massively underestimate. At KubeCon EU this year, we surveyed 143 platform and infrastructure professionals to see how widely that experience was shared. The results were not surprising. That's the problem.
The assembly tax is ultimately a time tax.
It's not really about money, at least not directly. It's about where your best people spend their hours. Every week a senior engineer spends chasing an upgrade conflict or debugging a Helm chart interaction is a week they didn't spend helping a product team design a better system, or thinking about how to make the next hundred deployments safer and faster.
I've had some version of the same conversation dozens of times with platform teams. They started with a reasonable stack, made sensible choices at each step, and somewhere around the two-year mark realized that half their engineering capacity had quietly migrated to keeping the platform running rather than improving it. Nobody made that decision. It just happened.
The survey put numbers to the pattern. When we asked what platform teams plan to focus on this year, security and reliability came out on top -- which makes sense. But cluster management and observability ranked third and fourth. These aren't innovation priorities. These are the things you have to keep doing just to stay in place. That's the assembly tax showing up directly in people's roadmaps.
You'd think this would be getting easier by now. In some ways it's getting harder.
Most respondents are running across multiple clouds simultaneously -- AWS, Azure, bare metal, Google, and VMware all featured heavily -- with platform teams of 6 to 15 people. That's a lot of integration surface for a small team to hold together.
And the surface keeps growing. Regulations like NIS2 and the EU AI Act are adding new compliance requirements that platform teams are expected to absorb. AI workloads are landing on infrastructure that was never designed for them -- GPU scheduling, model serving across clusters, cost tracking at a granularity most observability stacks weren't built for. All of this often arrives without any additional headcount on the platform side.
And every layer you add grows that maintenance surface, which ultimately eats more of the one resource you can't buy more of: the time and attention of the people who understand the system.
Here's the data point that stayed with me longest.
When we asked how respondents plan to make improvements this year, 99 out of 143 said they would rely on internal staff only. Just 26 (about 18%) said they'd use some external support. That's 69% planning to solve a compounding, multi-cloud, security-and-reliability-first challenge with internal effort alone.
I'm not sure that's always a deliberate choice. It often looks more like a default -- the assumption that managing your own platform is the right way to demonstrate technical maturity, that leaning on external support means giving something up.
The most capable platform organizations we work with aren't the ones who built everything themselves. They're the ones who were honest about where the value of their engineering time actually was, and deliberate about what they chose to own.
There's a related misconception worth addressing: the idea that open source means low total cost.
It's free to download. It is not free to operate. The time your team spends tracking upstream releases, monitoring vulnerability disclosures, testing upgrades, and managing dependency chains -- that time has a real cost. Many organizations don't account for it, because they never weigh the cost of their own team's time against the alternatives.
This isn't an argument against open source. We've built our entire platform on CNCF projects and we believe in the communities behind them. But believing in open source means being honest about what it actually takes to run it well in production, at scale, over years. The EU Cyber Resilience Act is making that honesty mandatory for a lot of companies -- and I think the effect on the ecosystem may be more interesting than most people expect.
Platform engineering is a way of working, not a thing you buy. There's no box you plug in that makes this go away.
But there are real, honest choices about how much undifferentiated heavy lifting you want to do yourself versus relying on curated, well-integrated components -- from the community, from partners, or from a combination of both. Most teams have settled on "do it yourself" as the default, without necessarily asking whether it's still the right answer for where they are now.
The goal isn't to eliminate platform teams. It's to free them from re-solving the same problems everyone else is also re-solving, so they can focus on the work that's actually unique to their organization.
The real measure of a platform isn't how many tools it includes. It's how much time it gives back to the people who use it.
If this resonates, or if your answers would have looked a lot like the survey data above, I'd genuinely like to hear how you're thinking about it. Feel free to reach out.