Fleet Management

Managing Infrastructure at Scale

Multiple clusters, apps, and clouds

A big challenge for companies that are already running multiple Kubernetes clusters is managing these clusters, their associated namespaces, and the applications on top of them — all while ensuring robust governance and security. Establishing a cloud native platform successfully often presents a dilemma. On one hand, you want to foster further developer adoption, but on the other, managing and maintaining the infrastructure needed to support this growth can quickly become overwhelming.

Giant Swarm provides enterprise fleet management capabilities developed through years of experience. While managing hundreds of clusters 24x7 for organizations, both on-prem and across cloud accounts, we’ve come across many different use cases, ranging from complex networking and team setups to full governance and automation. Our fleet management capabilities go well beyond grouping clusters, providing visibility, and applying rules. We make sure you can manage efficiently at scale, which includes the lifecycle management of all elements involved, from clusters and apps to access rights and teams.

Automation for scalable management

Managing the lifecycle of clusters, namespaces, and apps includes the creation, configuration, update, replacement, and complete decommissioning of these objects. Ideally, everything happens automatically once a change has been initiated, and checks are performed to ensure that the changes have been properly put into effect and that the objects remain in the specified state. Any source of error or deviation needs to be handled through automation, so that governance is achieved without blocking developers or adding to the workload of the infrastructure team.
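The idea of checking that objects remain in their specified state can be illustrated with a small sketch. It is not Giant Swarm's implementation; the `Action` type, the `plan_reconciliation` function, and the cluster names and fields are hypothetical and chosen only to show how a declared state can be compared with an observed one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    kind: str  # "create", "update", or "delete"
    name: str  # name of the object the action applies to


def plan_reconciliation(desired: dict, actual: dict) -> list:
    """Compare desired and actual state and return the actions that close the gap."""
    actions = []
    for name in sorted(desired):
        if name not in actual:
            # Declared but missing: it has to be created.
            actions.append(Action("create", name))
        elif actual[name] != desired[name]:
            # Present but drifted from the declaration: it has to be updated.
            actions.append(Action("update", name))
    for name in sorted(actual):
        if name not in desired:
            # Present but no longer declared: it has to be decommissioned.
            actions.append(Action("delete", name))
    return actions


# Hypothetical example data, for illustration only.
desired = {
    "cluster-a": {"version": "1.29", "nodes": 3},
    "cluster-b": {"version": "1.29", "nodes": 5},
}
actual = {
    "cluster-a": {"version": "1.28", "nodes": 3},  # drifted: older version
    "cluster-c": {"version": "1.27", "nodes": 2},  # no longer declared
}

plan = plan_reconciliation(desired, actual)
for action in plan:
    print(action.kind, action.name)
```

Running the planner repeatedly until it returns an empty list is one simple way to express "remain in the specified state": any deviation shows up as a new action rather than as a manual task.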

The more experience you have in owning lifecycle management across different clouds, use cases, and organizations, the more you can improve your automation, coverage of edge cases, and support for every detail of the lifecycle. This is what Giant Swarm excels at. We have managed clusters for customers from the start and have built in many checks that run before we upgrade or decommission a cluster, to make sure there is no unexpected behavior and no residue is left behind.

Take full control

While involving developers and having them take ownership as part of the “shift left” approach is critical, it’s important to provide a secure, compliant, and reliable environment for teams to operate in.

Whether it’s rolling out policies and permissions, managing exceptions, keeping an overview, or standardizing the configuration and installation of apps that help secure, troubleshoot, connect, or optimize applications, it is essential to enable your platform team to manage centrally and use automation for governance across all clouds and clusters. The state can be changed through GitOps and is then proactively kept in sync with that definition.

While collaborating with development teams can amplify cost reduction and other initiatives, our solution empowers platform teams to govern centrally and drive value independently, minimizing reliance on internal developers. At Giant Swarm, we understand the importance of achieving efficiency and optimization without creating dependencies. For instance, adidas achieved a remarkable 40% reduction in cloud expenditure by actively optimizing usage and scaling workloads to zero during non-business hours. While communication with development teams is essential, all these activities are orchestrated centrally and automated by the platform team, allowing organizations to swiftly achieve cost-saving objectives without hurdles.
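As a rough illustration of scaling workloads to zero outside business hours, the following sketch computes a target replica count from the current time. It does not describe how adidas or Giant Swarm implemented this; the business hours, the weekday rule, and the `target_replicas` function are assumptions made for the example.

```python
from datetime import datetime, time

# Assumed business hours; a real setup would take these from configuration.
BUSINESS_START = time(8, 0)
BUSINESS_END = time(18, 0)


def target_replicas(now: datetime, normal_replicas: int) -> int:
    """Return the replica count a workload should have at the given moment."""
    is_weekday = now.weekday() < 5  # Monday is 0, Friday is 4
    in_hours = BUSINESS_START <= now.time() < BUSINESS_END
    # Outside business hours, or on weekends, scale the workload to zero.
    return normal_replicas if (is_weekday and in_hours) else 0


# Usage: a Wednesday morning versus the same evening.
print(target_replicas(datetime(2024, 1, 10, 10, 0), normal_replicas=4))
print(target_replicas(datetime(2024, 1, 10, 22, 0), normal_replicas=4))
```

Because the decision depends only on the clock and a per-workload default, a platform team could apply a rule like this centrally without changes from the development teams.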

Every technology requires services to provide the most value. Packaging these services and providing them to developers through a self-service portal makes capabilities easy to consume and manage, even when scaling. Even if capabilities are voluntary and only required for certain use cases, you will find that governance elements are easier to apply once those capabilities are widely adopted.

Addressing different team maturity levels

Platform teams tend to focus heavily on technology, but one of their biggest challenges arises when all of their users are treated the same.

Advanced platform users quickly feel restricted if they are limited in their choices. Newbies are inundated with too many options and can create havoc if not guided sufficiently. A fresh perspective is needed: the development of platforms designed to accommodate “fleets of developer teams”.

Not all teams start their cloud native journey at the same time, experienced and inexperienced people join the teams, and some developers believe they can always come up with a better solution if they are just given the freedom. Instead of forcing all developers into the same process and defending a strict setup, platform teams need to support the developers where they are and help them grow and mature. This is why advanced organizations have come to realize that they need to segment development teams and grant greater levels of freedom to more mature teams.

With great power comes great responsibility. Shared responsibility models need to be fluid to ensure you provide the right level of support and restrictions to every team – and they need to be managed efficiently across all clusters, namespaces, and workloads. By utilizing the dev team fleet methodology that Giant Swarm provides, you can give developers, teams, and areas the right level of freedom while feeling comfortable with the responsibility that comes with it.
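One way to picture different levels of freedom per team is a simple mapping from maturity tier to permitted capabilities. The sketch below is purely illustrative: the tier names, the capability names, and the `is_allowed` helper are hypothetical and are not part of any Giant Swarm product.

```python
from enum import Enum


class Maturity(Enum):
    STARTER = 1
    ESTABLISHED = 2
    ADVANCED = 3


# Hypothetical guardrails: which capabilities each tier may use on its own.
GUARDRAILS = {
    Maturity.STARTER: {"deploy_from_templates"},
    Maturity.ESTABLISHED: {"deploy_from_templates", "create_namespaces"},
    Maturity.ADVANCED: {
        "deploy_from_templates",
        "create_namespaces",
        "install_custom_apps",
    },
}


def is_allowed(maturity: Maturity, capability: str) -> bool:
    """Return True if a team at this maturity tier may use the capability."""
    return capability in GUARDRAILS[maturity]


print(is_allowed(Maturity.STARTER, "install_custom_apps"))
print(is_allowed(Maturity.ADVANCED, "install_custom_apps"))
```

Keeping the tiers in one central definition means a team can be moved to a higher tier as it matures, without renegotiating individual permissions.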


Need to See Smarter Platform Engineering in Action?

Schedule your Cloud Native Capability Assessment now and find out how you can boost the performance of your development teams instantly.
