Giant Swarm Blog | Kubernetes Insights from the team at Giant Swarm

Infrastructure for AI is finally getting a standard

Written by Puja Abbassi | Nov 11, 2025

Over the past year, AI has exploded into production, from experimental models to customer-facing apps and internal copilots. But while the models evolved fast, infrastructure lagged behind, cobbled together from custom configs, opaque tooling, and vendor lock-in.

Now, that's changing. At KubeCon North America in Atlanta, Chris Aniszczyk, CTO of the CNCF, announced the launch of the Kubernetes AI Conformance Program, along with the first set of certified Kubernetes-based platforms.

We’re proud to share that Giant Swarm is among the first platforms to be certified, officially recognized as ready to run AI/ML workloads in a standardized, cloud native way.

Why this standard matters

The CNCF AI Conformance Program is the result of months of work by the AI Conformance Working Group. The goal:

To define a consistent set of capabilities, APIs, and configurations that a Kubernetes cluster must offer to reliably and efficiently run AI/ML workloads.

That matters because AI/ML workloads, particularly those involving generative AI, have rapidly become central to many organizations’ strategies. But until now, there’s been no shared baseline to assess whether a Kubernetes platform can support these workloads consistently and at scale. This initiative brings clarity. It helps remove uncertainty for teams building out AI infrastructure, and lays the groundwork for interoperability, portability, and ecosystem-wide growth.

According to research cited by the CNCF, 82% of organizations are already building custom AI solutions, and 58% use Kubernetes to support those workloads. As AI becomes more central to business strategy, teams need shared, open standards to reduce fragmentation and ensure consistent performance. This certification program addresses that need, offering a common foundation across vendors, clouds, and frameworks.

What we’ve seen in the field

At Giant Swarm, we’ve seen firsthand how AI/ML workloads have evolved from occasional use cases to critical paths in modern product development. This shift has only accelerated with the rise of generative models, which bring new infrastructure requirements, especially around GPU sharing, distributed scheduling, and governance.

Over the years, we’ve worked closely with our customers to integrate these capabilities into the Giant Swarm Kubernetes platform. That includes everything from GPU-aware scheduling and storage tuning to tailored security policies and observability for model pipelines. In many cases, we’ve extended existing platform engineering patterns to support these needs without reinventing the wheel.
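To make "GPU-aware scheduling" concrete: in Kubernetes, GPUs are exposed as extended resources by device plugins, and a workload requests them the same way it requests CPU or memory. The manifest below is a simplified, hypothetical sketch rather than Giant Swarm's actual configuration; the Pod name and container image are placeholders.

```yaml
# Hypothetical example: a Pod requesting one NVIDIA GPU.
# Assumes the NVIDIA device plugin is running on the cluster,
# which advertises GPUs as the extended resource nvidia.com/gpu.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-inference        # placeholder name
spec:
  containers:
    - name: model-server
      image: example.com/model-server:latest   # placeholder image
      resources:
        limits:
          nvidia.com/gpu: 1  # scheduler only places this Pod on nodes exposing a free GPU
```

Because GPUs are requested declaratively, the scheduler (not the application) decides node placement, which is what makes patterns like bin-packing and GPU sharing possible at the platform level.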

Betting on standards and helping build them

A core principle at Giant Swarm is to adopt community standards where they exist and to help shape them where they don't. That's why we've contributed to efforts like the CIS Benchmarks for Kubernetes, Cluster API, and now this AI conformance standard. We believe this kind of community collaboration is essential to building infrastructure that lasts.

“We believe in open standards. This certification validates our platform's readiness for AI, and it gives our customers the confidence to build the future.”

Timo Derstappen, Giant Swarm Co-Founder and CTO

A strategic choice with broad industry support

For teams investing in AI, one of the hardest decisions is where to place their infrastructure bets. What if you choose the wrong stack? What if it doesn’t scale? What if it can’t adapt?

This conformance program helps answer those questions. It shows that Kubernetes-based platforms aren’t just viable; they’re already in use at companies like Bloomberg, Zalando, OpenAI, NVIDIA, and Apple. With backing from major cloud providers and the CNCF, Kubernetes is increasingly the de facto foundation for modern AI infrastructure. That ecosystem support helps reduce risk and makes it easier for teams to move forward with confidence.

Where we go from here

AI/ML platforms are, at their core, an extension of developer platforms. They rest on the same principles: self-service, governance, scalability, and reliability. Many of the capabilities we’ve built, such as multi-cluster fleet management, central observability, GitOps workflows, and hardened security baselines, are directly applicable to this new class of workloads.

We see AI as the next level in platform engineering. And we’re excited to work with both customers and the broader community to define what that looks like. For us, the certification is just one milestone. The real work continues: supporting teams building AI platforms today and making sure they’re ready for what comes next.