Treat Kubernetes Clusters as Cattle, Not Pets

Elio Bischof

DevOps Engineer

TL;DR

Kubernetes clusters, along with all the tooling needed for operating a production-grade service, should be disposable, just like their nodes and pods are. This principle enables you to re-architect your whole infrastructure, solve many of your present problems and sleep well at night throughout your services’ full lifecycles.

Treating clusters as cattle enables you to run many small clusters, e.g. one per domain and environment. As a result, you are comfortable with all day-two operations, like keeping Kubernetes, tooling, operating system and package versions up to date, because you have perfect isolation and similar environments.

To achieve this similarity, you follow the GitOps pattern, and not only to make your service deployments reproducible: you also declare the desired states for the tooling, the Kubernetes installation itself, CPU, memory, networking and storage in Git, and let operator binaries do their work. With comprehensive GitOps, your day-one and day-two operations are automated. And you get disaster recovery for free.
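As an illustration, here is a minimal sketch of how an operator binary might read such a declared desired state; the schema shown is hypothetical and far simpler than a real one:

```go
package main

import (
	"fmt"
	"log"

	"gopkg.in/yaml.v3"
)

// DesiredState is a hypothetical, heavily simplified schema. A real
// declaration would also cover tooling, networking and storage.
type DesiredState struct {
	Kubernetes struct {
		Version string `yaml:"version"`
	} `yaml:"kubernetes"`
	Nodes struct {
		Workers   int `yaml:"workers"`
		CPU       int `yaml:"cpu"`
		MemoryGiB int `yaml:"memoryGiB"`
	} `yaml:"nodes"`
}

func main() {
	// In a GitOps setup, this document lives in a Git repository,
	// not in the binary.
	doc := []byte(`
kubernetes:
  version: v1.21.0
nodes:
  workers: 3
  cpu: 4
  memoryGiB: 16
`)
	var desired DesiredState
	if err := yaml.Unmarshal(doc, &desired); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("reconciling towards Kubernetes %s with %d workers\n",
		desired.Kubernetes.Version, desired.Nodes.Workers)
}
```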

By relying on vanilla Kubernetes instead of complex platform orchestration machinery like OpenShift or Rancher, and by having your infrastructure providers abstracted, you are free to choose and instantly change your vendors. If there is something wrong with the pasture grass, you just move your cattle somewhere else.

The reproducibility of GitOps, combined with the flexibility of abstracted infrastructure providers and vanilla Kubernetes, makes treating clusters as cattle really easy. This degree of automation enables you to run many small, isolated, secure and service-oriented platforms whenever and wherever you need them. Luckily, we already invented this wheel: it’s called ORBOS, it’s open-source software and free for everyone to use.

The Shaping of Our Philosophy

We at CAOS AG founded our startup roughly two years ago. Back then, most of us worked on a self-developed Identity and Access Management system (IAM) at an e-government company. As this IAM grew in users and traffic, we found that we could massively improve the quality of our highly critical service if we operated it ourselves on top of Kubernetes. As our company had neither a Kubernetes cluster available at that time nor any experience with Kubernetes, we created and maintained one ourselves using Tectonic, which back then belonged to CoreOS (now Red Hat). There is not that much to bootstrapping a new small cluster, and we got a quick performance gain with all the convenience Kubernetes has to offer.

Soon, developers from other domains of the company started deploying their applications on our cluster too. It was inevitable that at some point we ran into pod limits and other problems that came with the cluster’s scale. Experiencing these issues from a service provider's perspective made us wonder whether there wasn’t a better solution. We concluded that dedicating a Kubernetes cluster to a service, including the infrastructure and tooling, improves a provider's ability to guarantee its quality. We saw our chance to go to market on our own with a brand-new cloud-native IAM, ZITADEL, distinguishing it from other IAM vendors' products by including the automation for small, dedicated Kubernetes clusters.

If you need arguments for running many small clusters instead of one big one, and solutions for doing so without increasing your operational costs, read on.


From left to right: Florian Forster, Christian Jakob, Silvan Reusser, Fabienne Gerschwiler, Max Peintner, Stefan Benz, Elio Bischof, Michael Wäger, Jürg Rinaldi, Livio Amstutz, Maximilian Panne

Isolate Your Domains

Doing DevOps means that dev and ops engineers share a common goal: good service quality. The ops side traditionally needs to run many services from many domains with different quality requirements on the same infrastructure and platform, which makes it hard to serve developers the exact infrastructure they need. Now imagine each domain has its own production cluster and a set of test clusters: the clusters are guaranteed to never interfere with each other at the operating-system level through NFS volumes, resource consumption and the like.

If the tech stack is siloed along its services all the way down to the infrastructure, each stack can not only be precisely tweaked to the domain's and services' needs; whoever does the tweaking is also more confident in doing so, being absolutely sure the tweaks have no unforeseen negative impact on other domains. If something goes wrong, the whole company isn't at risk of the lights going out; the blast radius is limited to that one domain. This improved confidence leads to more security updates, feature updates and configuration optimization across the infrastructure, tooling and API gateway.

Fireball comparison (image found on wikipedia.org)

Manage Your Desired States Only

You might be thinking that operational costs grow almost linearly with the number of clusters. That’s true for all tasks you execute manually. Manual tasks are not only expensive and error-prone, they also lead to hard-to-reproduce states, which decreases your confidence and performance in day-two operations. So you are forced to automate each of them. Currently, the GitOps pattern is gaining adoption for deploying Kubernetes workloads, but it is very suitable for the platform and infrastructure as well. With GitOps, you focus on automating day-two operations and let a system reconcile its current state towards a declared desired state. If anybody changes something in the system, the reconciling operator reverts the change immediately and brings the system back to its desired state. So with GitOps you are guaranteed a self-healing and reproducible system over time.
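At its core, such a reconciling operator is an endless loop that compares the current state with the desired one and corrects any drift. A minimal sketch of the idea, with stubbed placeholder functions rather than a real operator API:

```go
package main

import (
	"log"
	"reflect"
	"time"
)

// State stands in for whatever the operator manages:
// Kubernetes resources, VMs, firewall rules, and so on.
type State struct {
	Workers           int
	KubernetesVersion string
}

// readDesired would pull the declared state from Git; readCurrent
// would query the actual system. Both are stubbed here.
func readDesired() State { return State{Workers: 3, KubernetesVersion: "v1.21.0"} }
func readCurrent() State { return State{Workers: 2, KubernetesVersion: "v1.21.0"} }

// apply would create, update or delete resources to close the gap.
func apply(desired State) { log.Printf("converging to %+v", desired) }

func main() {
	for {
		desired, current := readDesired(), readCurrent()
		// If anybody changes something in the system, this reverts it:
		// the desired state in Git is the only source of truth.
		if !reflect.DeepEqual(current, desired) {
			apply(desired)
		}
		time.Sleep(30 * time.Second)
	}
}
```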

In this blog post, David Dooling from Atomist advocates the GitOps pattern for solving the reproducibility problem too, but he focuses on storing the Kubernetes resource declarations in Git and relies on managed services and third-party CLI tools for bootstrapping new clusters. I think this doesn’t scale well if you want to run many small clusters instead of a few big ones. If you need to reproduce a cluster along with its infrastructure, the desired infrastructure state needs to be declared somewhere, preferably in a provider-independent way. The other point is that when you store a mess of hundreds of Kubernetes resources in Git for many clusters in a copy-and-own way, your clusters will start to behave differently from each other over time, until their differences are not manageable anymore.

ORBOS fills this gap. Our ORBOS operators interpret their self-defined desired states and translate them into cloud and Kubernetes resources. This abstraction layer not only means that just a fraction of the YAML is stored in Git; the differences between cloud provider infrastructures are also as minimal as possible and transparent in the YAML. With ORBOS, you can scale to thousands of clusters without having to employ tons of engineers. Here you find an example YAML for the infrastructure, and here is a dead-simple one for the platform tooling that omits all default values.
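One way to picture that abstraction layer: each supported provider implements one narrow interface, so the operator core never talks to a vendor API directly. The sketch below is illustrative, not the actual ORBOS code:

```go
package main

import "fmt"

// Machine is a provider-independent view of a node.
type Machine struct{ ID, IP string }

// Provider abstracts an infrastructure vendor. Swapping vendors means
// swapping the implementation, not the desired state or the operator core.
type Provider interface {
	EnsureMachines(count int) ([]Machine, error)
}

// gceProvider is an illustrative stand-in; an on-premises or other cloud
// implementation would satisfy the same interface.
type gceProvider struct{}

func (gceProvider) EnsureMachines(count int) ([]Machine, error) {
	// Would call the Google Compute Engine API here.
	return []Machine{{ID: "gce-0", IP: "10.0.0.1"}}, nil
}

func main() {
	var p Provider = gceProvider{}
	machines, err := p.EnsureMachines(3)
	if err != nil {
		panic(err)
	}
	fmt.Println(machines)
}
```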

ORBOS logo

Easily Recover from Disasters

When a cluster goes down, you need to be able to bring back your service’s last state from scratch. To prepare for this, you have to make your workload’s state declarable. This means, for example, that all database schema changes should be tracked, tied to your service's version and re-executable at any time in an idempotent way. To that end, we built a ZITADEL operator that uses Flyway; check it out on GitHub. When your whole workload is declared, you can easily replicate it to another cluster. But in a disaster recovery case where the whole cluster goes down, you might have trouble getting the declarations out of the broken cluster. This is where GitOps helps out again, because with GitOps, your desired state declarations are stored out-of-band. To learn more about our understanding of using GitOps for workload deployment, check out Christian Jakob's blog post in German.
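The core of the versioned-migration pattern that tools like Flyway implement looks roughly like this. The sketch is only an illustration of the idea, not the ZITADEL operator's actual code:

```go
package main

import "fmt"

// Migration ties a schema change to a version, like Flyway does.
type Migration struct {
	Version int
	SQL     string
}

var migrations = []Migration{
	{1, "CREATE TABLE users (id INT PRIMARY KEY)"},
	{2, "ALTER TABLE users ADD COLUMN email TEXT"},
}

// migrate applies only the migrations the database has not seen yet,
// so running it twice, or on a rebuilt cluster, is idempotent.
func migrate(appliedVersion int, exec func(sql string) error) (int, error) {
	for _, m := range migrations {
		if m.Version <= appliedVersion {
			continue // already applied, skip
		}
		if err := exec(m.SQL); err != nil {
			return appliedVersion, err
		}
		appliedVersion = m.Version
	}
	return appliedVersion, nil
}

func main() {
	// On a fresh cluster the applied version starts at 0 and all
	// tracked migrations replay in order.
	version, _ := migrate(0, func(sql string) error {
		fmt.Println("executing:", sql)
		return nil
	})
	fmt.Println("schema at version", version)
}
```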

Now, combine that with GitOps-managed tooling, platform and infrastructure as described in the previous chapter. If done right, a comprehensively “gitopsed” environment is bulletproof in almost all failure cases. And because GitOps comes with perfect reproducibility, you can test whether you’ve done it right as much as you want, just by replicating your systems. Congratulations: from here on, you can just kill your cattle instead of taking them to a doctor as if they were pets. To be bulletproof in not almost all but really all failure cases, read on.


Avoiding Lock-Ins

Cattle care less about their owner or territory than pets do. A cluster on Google Compute Engine should be seamlessly reproducible with other cloud providers, and also on-premises, if your requirements for prices, data policies, locality and so on change. If you use managed Kubernetes clusters like GKE or AKS, you get many fancy, integrated features like monitoring or networking. This is comfortable, but it comes at the cost of a vendor lock-in that causes a lot of headache when you try to move to another provider. Especially in cases of critical networking or hardware failures, you are glad to have multiple infrastructure providers abstracted, so they are at your disposal immediately.

Also, your workload and the Kubernetes API should not depend on the cluster and infrastructure management systems. The operator you use should utilize a generic tool like kubeadm to maintain the cluster's lifecycle and avoid depending on critical components that intermediate the API server, which also demand energy and time investments. Without these additional moving targets, you are much safer from getting locked in to the operator vendor. For example, if you would like to stop using ORBOS and move to another infrastructure and platform management, the only thing you have to do is delete its deployments. If the operator is stopped or throws errors, neither your service quality nor the Kubernetes API is affected.
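As a rough sketch of what “utilizing a generic tool” means: the operator can simply drive stock kubeadm commands, so removing the operator later leaves a perfectly ordinary cluster behind. The exact invocation below is illustrative:

```go
package main

import (
	"log"
	"os/exec"
)

// upgradeControlPlane drives a standard kubeadm upgrade. Because only
// stock kubeadm is used, no operator-specific component sits between
// the workload and the Kubernetes API.
func upgradeControlPlane(version string) error {
	cmd := exec.Command("kubeadm", "upgrade", "apply", version, "--yes")
	out, err := cmd.CombinedOutput()
	log.Printf("%s", out)
	return err
}

func main() {
	if err := upgradeControlPlane("v1.21.0"); err != nil {
		log.Fatal(err)
	}
}
```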

kubeadm logo (found on kubernetes.io)

Real World Experience

Implementing our GitOps operators definitely took some time and energy. But in return, it enabled us to scale massively, and we now maintain many clusters. Each cluster is perfectly monitored in a centralized Grafana Cloud instance. If something goes wrong, we fix the operators' code and roll it out to all clusters. Instead of accidentally fixing the same issues again and again, we improved the maturity of our operators incrementally. Now they are in a solid state where they just work and let our customers and us sleep well every night.

Treating clusters as cattle also enabled us to expand the business model for selling our IAM ZITADEL. We are not only a software-as-a-service provider, but we also deliver the whole automation needed to run instances dedicated to customers in isolated environments of their choice. It is crucial for many companies to have full control over all user data stored in our IAM solution, so they often need dedicated instances deployed in their own data centers. An IAM is a mission-critical central system with high quality and scalability requirements for the infrastructure. Yet with our automation, customers do not have to assign engineers to the task of operating our product. So the costs our product causes over time remain limited and calculable, thanks to our GitOps operators. This gives us a significant advantage over our competitors.

Apart from selling ZITADEL, ORBOS also gives us the opportunity to offer maintenance and support for dedicated Kubernetes clusters. One tech company runs about twenty Kubernetes clusters using ORBOS, split along its domains. Another software agency runs its services on multiple ORBOS-managed Kubernetes clusters dedicated to end customers.
