• Oct 6, 2023
While the general trend of the last decade has been to move workloads to the cloud, along with major companies jumping on the bandwagon to offer their own managed services, the industry isn’t ready to relegate on-premises infrastructure to the closet just yet.
Many organizations keep running their own Software-Defined Data Center (SDDC) and will do so for the foreseeable future, for reasons such as sovereignty, low-latency requirements, or data privacy, or simply because they are cloud providers themselves, selling IaaS capacity on VMware Cloud Director. On top of that, companies that have already invested heavily in their on-premises SDDC, and have the right professionals to run it, aren’t going to start over for the sake of moving to the cloud.
With that said, running an on-premises infrastructure isn’t a walk in the park, and Kubernetes is a whole different beast introduced to the zoo. That shouldn’t stop these customers from adopting the cloud native ecosystem without losing their sanity, and this is where Giant Swarm jumps in by offering Kubernetes on VMware Cloud Director!
Let’s first take a look at today’s star of the show and how we got to writing this piece.
VMware Cloud Director (VCD) is a cloud provider solution (much like OpenStack) built on VMware’s SDDC stack, featuring components such as vSAN for storage, NSX-T for networking, NSX ALB for load balancing, and of course, vSphere ESXi for compute. VCD is essentially a management layer added on top of these bricks, allowing organizations to sell resources to internal and external customers under various models such as Pay-As-You-Go or Reserved…
On top of these, VMware offers Kubernetes on VMware Cloud Director with Tanzu and the Container Service Extension (CSE) plugin for VCD. However, we found that customers exploring this route face the same challenges as those looking at managed Kubernetes offerings from major hyperscalers:
Giant Swarm plugs these holes by offering the option for a full-featured end-to-end managed cloud native platform for Kubernetes on VMware Cloud Director.
Cluster API Provider VMware Cloud Director (CAPVCD) is a provider for Cluster API (CAPI) to support deploying Kubernetes on VMware Cloud Director. This upstream Open Source project was initiated by VMware in mid-2021 in an effort to upgrade their CSE plugin in the future. Back then, the GitHub repo had only a dozen commits but the core features were there.
Fast forward a few months to the early hours of 2022, and Giant Swarm started investigating this new provider. The objective was twofold: to identify how it could solve pain points for our valued VCD customers and to understand the specifics of their unique environment.
What makes working with on-premises so fun (depending on who you ask) is the variety of environments you come across. No two customers have the same infra and there are always a few exotic quirks that need to be accounted for.
During the exploration phase of our customers’ requirements, it became apparent that the project would require a few tweaks. Enter the shining attribute of CAPVCD’s Open Source nature, which allows us to add capabilities to the product that wouldn’t have been possible otherwise. We contributed quite a bit to the project so that we could run Kubernetes on VMware Cloud Director with our customers’ environments and requirements in mind.
A few months after starting the exploration phase and in collaboration with the maintainers, we made our first few additions to the product:
Along with the ability to connect the nodes to several networks, we use postKubeadmCommands to easily configure static routes in our cluster chart.
In this excerpt of values.yaml, we connect three additional networks to each node and specify static routes to go along with them. Note that we call them additional networks because there’s already an existing field that sets the first network card where the default gateway will be located (.providerSpecific.ovdcNetwork).
    connectivity:
      network:
        extraOvdcNetworks:
          - MY_network_x
          - MY_network_y
          - MY_network_z
        loadBalancers:
          vipSubnet: 10.205.9.254/24
        staticRoutes:
          # Routes out MY_network_x
          - destination: 10.30.0.0/16
            via: 10.80.80.1
          - destination: 10.90.0.0/16
            via: 10.80.80.1
          # Routes out MY_network_y
          - destination: 172.31.192.0/19
            via: 184.108.40.206
          - destination: 10.200.200.0/18
            via: 220.127.116.11
          # Routes out MY_network_z
          - destination: 10.0.0.0/8
            via: 18.104.22.168
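To make the link between these values and postKubeadmCommands more concrete, here is a minimal sketch of what the rendered bootstrap configuration could look like for the first network. It assumes the chart turns each staticRoutes entry into a plain `ip route add` command; that rendering is an assumption for illustration, and the actual chart may generate something different.

```yaml
# Sketch only: fragment of a kubeadm bootstrap config (KubeadmConfigSpec).
# Assumption: one `ip route add` command per staticRoutes entry.
postKubeadmCommands:
  # Routes out MY_network_x
  - ip route add 10.30.0.0/16 via 10.80.80.1
  - ip route add 10.90.0.0/16 via 10.80.80.1
```

Routes added this way only live in the kernel routing table, so a setup relying on this approach would also need to persist them if nodes are expected to survive a reboot.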
Many large organizations enforce naming conventions because of some obscure automation happening somewhere in the background that is based on those. If you worked with Cluster API before, you might know that the names of the provisioned infrastructure objects look something like this: mycluster-worker-64b89864f9xg6h5x-rkh8b. However, this doesn’t cut it with existing naming conventions (unless you are veeeery lucky).
In the example below, the virtual machines backing the Kubernetes nodes will be named giantswarm-xxxxx as you can see in the screenshot.
    providerSpecific:
      vmNamingTemplate: giantswarm-
Giant Swarm supports multiple Cluster API providers such as AWS, Azure, GCP, vSphere, and OpenStack. For us, deploying a Kubernetes cluster to VMware Cloud Director works the same way as with any of these providers, with the exception of provider-specific parameters of course.
A cluster is made up of two different Giant Swarm apps, which we will explore below. If you want to know more about Giant Swarm apps, check out our App Platform.
Based on our cluster-cloud-director Helm chart, this app defines what the cluster should look like. The configuration is stored in a ConfigMap and lets you configure many parameters. While Giant Swarm sets sensible defaults for many of them, everything remains fully configurable.
The cluster app will automatically install Cilium as the Container Network Interface (CNI), the VCD Cloud Provider Interface (CPI), and CoreDNS using HelmReleases.
Here’s an example of a minimalist cluster app and its configuration values file. In these values, you can see the various VCD-specific settings such as the sizing policy to use, the template and catalog to deploy from, the OS disk size (which must be greater than the template’s size), the CIDR of the load balancer gateway, the VCD endpoint, the OVDC (Organization Virtual Data Center), the ovdcNetwork for the NIC where the default gateway will live, and so on…
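For readers unfamiliar with HelmReleases, the sketch below shows roughly what a Flux HelmRelease that installs a chart into a workload cluster can look like. It is not the manifest shipped in the cluster-cloud-director chart: the resource names, the HelmRepository, and the version constraint are all hypothetical, and older Flux releases use the `v2beta1` or `v2beta2` API versions instead of `v2`.

```yaml
# Sketch only: a generic Flux HelmRelease, not Giant Swarm's actual manifest.
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: gs-test-cilium             # hypothetical name
  namespace: org-giantswarm
spec:
  interval: 5m
  releaseName: cilium
  targetNamespace: kube-system
  # Deploy to the workload cluster instead of the cluster running Flux.
  kubeConfig:
    secretRef:
      name: gs-test-kubeconfig     # Cluster API convention: <cluster-name>-kubeconfig
  chart:
    spec:
      chart: cilium
      version: "1.14.x"            # hypothetical version constraint
      sourceRef:
        kind: HelmRepository
        name: cilium               # hypothetical HelmRepository
```

The `kubeConfig.secretRef` field is what lets a controller in the management cluster reconcile a release in a remote cluster.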
    ---
    apiVersion: application.giantswarm.io/v1alpha1
    kind: App
    metadata:
      name: gs-test
      namespace: org-giantswarm
      labels:
        app-operator.giantswarm.io/version: 0.0.0
        app.kubernetes.io/name: cluster-cloud-director
    spec:
      catalog: cluster
      extraConfigs:
      kubeConfig:
        inCluster: true
      name: cluster-cloud-director
      namespace: org-giantswarm
      userConfig:
        configMap:
          name: gs-test-user-values
          namespace: org-giantswarm
      version: 0.13.0
    ---
    apiVersion: v1
    kind: ConfigMap
    metadata:
      labels:
        app-operator.giantswarm.io/watching: "true"
        cluster.x-k8s.io/cluster-name: gs-test
      name: gs-test-user-values
      namespace: org-giantswarm
    data:
      values: |-
        baseDomain: "test.gigantic.io"
        controlPlane:
          catalog: giantswarm
          replicas: 1
          sizingPolicy: m1.xlarge
          template: ubuntu-2004-kube-v1.24.10
          diskSizeGB: 30
          oidc:
            issuerUrl: https://dex.gs-test.test.gigantic.io
            clientId: "dex-k8s-authenticator"
            usernameClaim: "email"
            groupsClaim: "groups"
        connectivity:
          network:
            loadBalancers:
              vipSubnet: 10.205.9.254/24
          proxy:
            enabled: true
        nodePools:
          worker:
            class: default
            replicas: 1
        providerSpecific:
          site: "https://cd.neoedge.cloud"
          org: "GIANT_SWARM"
          ovdc: "Org-GIANT-SWARM"
          ovdcNetwork: "LS-GIANT-SWARM"
          nodeClasses:
            default:
              catalog: giantswarm
              sizingPolicy: m1.2xlarge-new
              template: ubuntu-2004-kube-v1.24.10
              diskSizeGB: 60
          userContext:
            secretRef:
              secretName: vcd-credentials
          vmNamingTemplate: giantswarm-
        metadata:
          description: "Testing Cluster"
          organization: giantswarm
        internal:
          kubernetesVersion: v1.24.10+vmware.1
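The values above reference a secret named vcd-credentials through userContext.secretRef, but the secret itself is not shown. A minimal sketch of such a Secret might look like the following. The key names (`username`, `password`, `refreshToken`) are an assumption based on how CAPVCD credential secrets are commonly laid out, so check the CAPVCD documentation for the version you run; all values are placeholders.

```yaml
# Sketch only: key names are assumptions, values are placeholders.
apiVersion: v1
kind: Secret
metadata:
  name: vcd-credentials
  namespace: org-giantswarm
type: Opaque
stringData:
  # Either a username/password pair or an API (refresh) token is
  # typically provided, not both.
  username: ""
  password: ""
  refreshToken: "REPLACE_WITH_VCD_API_TOKEN"
```

Using an API token rather than a password keeps user credentials out of the cluster and makes rotation easier.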
In order to enforce the installation of a set of apps in all the clusters of a specific provider, we leverage the concept of App-of-Apps to simplify lifecycle management. Essentially, we install an app in the management cluster, which in turn installs all the required apps in the workload cluster. We call it Default-Apps, and it is based on our default-apps-cloud-director Helm chart.
You can find the list of all the apps that will be installed in the cluster as part of the default apps in the values.yaml file. This is also where you would change the configuration of the apps themselves.
In the example below, we set the HTTP/HTTPS proxy variables for the default apps and add a secret used by cert-manager, which contains AWS Route53 credentials to solve DNS01 challenges. These two are typically used in combination in private clusters.
    apiVersion: application.giantswarm.io/v1alpha1
    kind: App
    metadata:
      labels:
        app-operator.giantswarm.io/version: 0.0.0
        app.kubernetes.io/name: default-apps-cloud-director
      name: gs-test-default-apps
      namespace: org-giantswarm
    spec:
      catalog: cluster
      kubeConfig:
        inCluster: true
      name: default-apps-cloud-director
      namespace: org-giantswarm
      userConfig:
        configMap:
          name: gs-test-default-apps-user-values
          namespace: org-giantswarm
      version: 0.6.0
    ---
    apiVersion: v1
    data:
      values: |
        clusterName: gs-test
        organization: giantswarm
        managementCluster: glasgow
        userConfig:
          certManager:
            configMap:
              values: |
                controller:
                  proxy:
                    noProxy: 10.80.0.0/13,10.90.0.0/11,test.gigantic.io,cd.neoedge.cloud,svc,127.0.0.1,localhost
                    http: http://10.100.100.254:3128
                    https: http://10.100.100.254:3128
        apps:
          certManager:
            extraConfigs:
              - kind: secret
                name: gs-test-cert-manager-user-secrets
    kind: ConfigMap
    metadata:
      labels:
        app-operator.giantswarm.io/watching: "true"
        cluster.x-k8s.io/cluster-name: gs-test
      name: gs-test-default-apps-user-values
      namespace: org-giantswarm
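To illustrate what solving DNS01 challenges with Route53 means on the cert-manager side, here is a generic upstream-style ClusterIssuer. This is not the content of the gs-test-cert-manager-user-secrets secret, whose layout is not shown in this post; the issuer name, email, region, access key ID, and secret names below are all hypothetical.

```yaml
# Sketch only: generic cert-manager ClusterIssuer, all names are hypothetical.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-dns01
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ops@example.com
    privateKeySecretRef:
      name: letsencrypt-dns01-account-key
    solvers:
      - dns01:
          route53:
            region: eu-central-1
            accessKeyID: EXAMPLEACCESSKEYID
            # The secret access key is read from a Kubernetes Secret.
            secretAccessKeySecretRef:
              name: route53-credentials
              key: secret-access-key
```

DNS01 fits private clusters well because the ACME server only needs to see a DNS record, not reach the cluster over HTTP. The cert-manager controller still needs outbound access to the ACME and Route53 APIs, which is where the proxy settings come in.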
Since we have a multitude of clusters to manage for our customers, Giant Swarm is committed to using and improving the GitOps framework, which fits just right with Cluster API. We rely on the Open Source project Flux CD to control the desired state of clusters across our customers' fleets, all orchestrated from the simplicity of a few Git repositories.
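As a rough illustration of how Flux ties a Git repository to a cluster's desired state, the sketch below declares a repository source and a Kustomization that applies one directory from it. The repository URL, resource names, and path are hypothetical and do not reflect Giant Swarm's actual layout.

```yaml
# Sketch only: URL, names, and path are hypothetical.
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: fleet-repo
  namespace: flux-system
spec:
  interval: 1m
  url: https://example.com/acme/fleet.git
  ref:
    branch: main
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: gs-test-clusters
  namespace: flux-system
spec:
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: fleet-repo
  path: ./management-clusters/glasgow
  # Remove objects from the cluster when they are deleted from Git.
  prune: true
```

With `prune: true`, deleting a cluster's files in a merged PR is enough for Flux to remove the corresponding objects.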
Gone are the days of clicking through the VCD UI or issuing imperative commands to deploy or manage clusters. Pull Requests (PRs) are the new cool kids on the block as the benefits are countless:
Our GitOps framework allows our customers to create Kubernetes clusters on VMware Cloud Director by committing a few files that can be copied from existing clusters. The structure offers a base layer, to which a number of overlays can be added to customize clusters and their respective apps independently of each other.
You can find out more about this by looking at our Gitops-template repository, which contains the structure we use for both our customers and our own clusters.
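The base-plus-overlay idea can be sketched with plain Kustomize. The example below is a hypothetical overlay that reuses a shared base and pins the cluster app version for one cluster; the directory path is invented for illustration and does not necessarily match the template repository.

```yaml
# Sketch only: kustomization.yaml for a per-cluster overlay, hypothetical paths.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../bases/cluster-cloud-director
patches:
  # JSON 6902 patch that changes only the App version for this cluster.
  - target:
      group: application.giantswarm.io
      version: v1alpha1
      kind: App
      name: gs-test
    patch: |-
      - op: replace
        path: /spec/version
        value: 0.13.0
```

Because each overlay only carries its differences from the base, upgrading one cluster becomes a small, reviewable diff.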
We advise all our customers, regardless of the infrastructure provider they use, to embrace GitOps for all these reasons, and so far, the feedback has been nothing but great. Through PR reviews, the customer gains peace of mind while simultaneously acquiring valuable insights when collaboration is needed for fixing or commenting.
We’ve been iterating on our CAPVCD integration for a while now, and it’s been a success, marked by fine-tuned deployment processes, upgrade procedures, and feature additions. Our customers are happy with the result and keep helping us improve the product through technical feedback and feature requests, which we enthusiastically welcome.
While we already have a solid production-ready way to manage Kubernetes on VMware Cloud Director, our roadmap is packed with cool enhancements that will keep our customers and ourselves on that upward development trend we’ve been riding for the past year and a half.
Giant Swarm’s managed microservices infrastructure enables enterprises to run agile, resilient, distributed systems at scale, while removing the tasks related to managing the complex underlying infrastructure.