When NOT to use Kubernetes

TL;DR: Default to Kubernetes only when there’s not a better option for your workloads.

To Kubernetes, or not to Kubernetes, that is the question.

You may have heard that Kubernetes is not a silver bullet or that batteries are not included, but that doesn’t mean you shouldn’t use it. The real key to getting the most out of Kubernetes is to filter out the hype and understand what it actually does and doesn’t do, so that you can make a value judgement based on your use cases, instead of just following the crowd.

But instead of going over the pros and cons of Kubernetes itself, let’s look at what the alternatives are. Often it’s easier to rule things out than to make a judgement call based on a spec sheet.

Kubernetes Alternatives

  • FaaS (Serverless) — Lambda, Google Functions, OpenFaaS, etc.
  • PaaS — Heroku, Cloud Foundry, Google App Engine, etc.
  • CaaS — Azure Container Instances, Google Cloud Run, Elastic Container Service, Fargate, DC/OS, etc.
  • IaaS (VMs) — EC2, GCE, vSphere, etc.
  • Bare Metal — MaaS, Packet, DIY, etc.

Obviously this isn’t a complete list, but the categories make it relatively easy to reason about the competition more generally. For more about the differences see IaaS vs CaaS vs PaaS vs FaaS.

When is FaaS a better choice?

Building your system in the serverless style (on a Functions as a Service platform) requires a bit of a paradigm shift, so it’s best used when you’re already thinking about a rewrite, are building a new system from scratch, or are looking to glue systems together with some integration connectors.

Functions (another name for serverless that sounds less pretentious) are most useful when your workload is event-driven and has short-lived processes. While you can technically use functions to run a website, that’s not gonna be the most efficient use of a FaaS, and you’d probably be better served by something that optimizes for long-running processes (literally any alternative other than a batch platform), for better caching and consistently lower latency.

FaaS is great, but it’s probably one of the most limiting options on the list. It’s great for things that need to auto-scale horizontally, but it can get really expensive quickly once you actually scale out and stay there. To take the most advantage of FaaS, your workload needs to be non-continuous, so you can scale to zero and pay nothing when it’s not in use.
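To make the scale-to-zero argument concrete, here’s a rough back-of-the-envelope sketch comparing pay-per-use pricing with an always-on instance. Every price in it is a made-up placeholder (not any provider’s real rate), and the function names are hypothetical, so plug in your own numbers before drawing conclusions.

```python
# Back-of-the-envelope cost comparison: pay-per-use FaaS vs an always-on VM.
# ALL PRICES ARE HYPOTHETICAL PLACEHOLDERS, not real provider rates.
FAAS_PRICE_PER_GB_SECOND = 0.0000167      # assumed compute price
FAAS_PRICE_PER_MILLION_REQUESTS = 0.20    # assumed request price
VM_PRICE_PER_HOUR = 0.05                  # assumed always-on instance price
HOURS_PER_MONTH = 730


def faas_monthly_cost(requests_per_month, avg_duration_s, memory_gb):
    """Cost of a function that is billed only while it is running."""
    gb_seconds = requests_per_month * avg_duration_s * memory_gb
    compute = gb_seconds * FAAS_PRICE_PER_GB_SECOND
    requests = requests_per_month / 1_000_000 * FAAS_PRICE_PER_MILLION_REQUESTS
    return compute + requests


def vm_monthly_cost(instance_count=1):
    """Cost of instances that are billed around the clock, busy or idle."""
    return instance_count * VM_PRICE_PER_HOUR * HOURS_PER_MONTH


if __name__ == "__main__":
    # Same function (200 ms, 512 MB) at three very different traffic levels.
    for requests in (0, 100_000, 100_000_000):
        faas = faas_monthly_cost(requests, avg_duration_s=0.2, memory_gb=0.5)
        print(f"{requests:>11,} req/month: FaaS {faas:8.2f} vs VM {vm_monthly_cost():8.2f}")
```

With these placeholder rates, the idle and bursty cases cost next to nothing on FaaS, while the sustained case costs several times more than the single always-on instance. Where the crossover lands depends entirely on the real prices and on how many instances the sustained load would actually need.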

Probably the best reason to use a FaaS is that if it’s hosted by a cloud provider, it’s gonna be the most vertically integrated and the easiest to manage. You pay the cloud provider to do all the low-level operations and you just manage your functions. This means you may not even need an internal platform team, which is money in your pocket.

The downside, of course, is that FaaS is also the least portable option; most hosted FaaS are provider-specific. And if you host your own (OpenFaaS, OpenWhisk, Knative, etc.), you’re definitely going to need a platform team to manage it. Plus, these self-hosted options are all newer than Kubernetes and less mature. So you’ll spend more finding talent and building integrations to mature them yourself.

When is a PaaS a better choice?

The hallmark of a modern PaaS is optimization for 12-factor apps. The key limiting factor here is that a PaaS generally only supports “stateless” apps. Of course, you’re not gonna run stateful apps in a FaaS either, but you could on a CaaS or IaaS.

A PaaS probably makes the most sense for small companies that have few requirements and just want to run a bog-standard 3-tier app. It may also make sense for large enterprises that have enough scale to run a PaaS on the side in addition to other alternatives, but we’ll get to that later.

For 3-tier apps, it’s easiest to just use a SaaS database, a SaaS CDN, and a hosted PaaS where you run the app backends. This is why things like Heroku are valuable: they’re for when you don’t need all the flexibility that a CaaS like Kubernetes would offer.

When is a CaaS a better choice?

Kubernetes usually falls in this bucket, or sometimes it’s considered more of a proto-platform for building other platforms, but it’s not the only option here.

Elastic Container Service (ECS) is probably the most popular alternative CaaS, tho it’s a bit long in the tooth these days and effectively being replaced by EKS on AWS. I wouldn’t recommend starting to use ECS now, but it’s a decent reference point that people are familiar with, because it’s very similar to legacy Docker Swarm/Enterprise in usage, before Kubernetes was baked in.

If you’re looking at CaaS today, this is probably the most diverse market, where vendors are still trying to figure out what interfaces customers want when they don’t want to use Kubernetes, which is effectively the default option due to its popularity. So when you’re trying to decide whether to use Kubernetes vs another CaaS, you have to look at the individual options, not just the category as a whole.

Cloud Run, Azure Container Instances, and ECS are all pretty similar in that they’re hosted platforms that schedule single containers (not pods of containers) and hide a lot of the complexity of operating the nodes and control plane. The primary reason to use these options is generally that they provide a simpler interface than Kubernetes does, and they’re hosted services, so there’s less infrastructure for you to operate. The gotcha is that the use cases for these are pretty limited, so it’s easy to invest and then feel trapped later as your workload requirements change and grow.

Knative has a bit of a categorization problem, because it’s sort of halfway between a FaaS and a CaaS, but I’ll include it here because there aren’t really any hosted options, so you have to operate it yourself on top of Kubernetes. You can technically check a box and get it on top of GKE, but it’s still early days and you’re still effectively operating the cluster, even if the provisioning is automated. The reason to use Knative is really that you want something FaaS-like that still allows you to take advantage of your Kubernetes investments, so you can collocate your FaaS and CaaS workloads.

When is an IaaS a better choice?

Infrastructure as a Service is often talked about as the legacy architecture these days, but in reality it underpins much of the world’s compute power and even most Kubernetes deployments. So it’s less like legacy and more like a standard layer everyone builds on top of. It’s boring! It just works. It gets the job done. And sometimes that’s all you need.

Probably the best reason for using just an IaaS by itself is that it’s definitely the most mature category of platform, with an endless variety of integrations and flexibility. If you just want to get shit done, and not spend your hard-earned money on a platform team, but still need the flexibility to run multiple types of workloads, the IaaS is for you!

The trade-off with an IaaS is that if you’re already bought into microservices and have hundreds of instances of dozens of workloads, you’ll probably either be leaving money on the table or end up building or buying a platform layer to enable collocation and make workload management easier. But at smaller scales, an IaaS alone can give you a decent amount of bang for your buck without having to spend years investing in making a Kubernetes-based platform production ready.
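Here’s a minimal sketch of why collocation matters at that scale. It uses a first-fit-decreasing bin-packing heuristic, which is my stand-in for illustration and not how any particular scheduler works. The function name, the millicore units, and the node size are all assumptions, and it only considers CPU.

```python
# Sketch: how many nodes do you need if small instances share nodes,
# compared with giving every instance its own VM?
# First-fit decreasing is an illustrative heuristic, not a real scheduler.

def pack_first_fit_decreasing(demands, capacity):
    """Place each demand on the first node with room; open a new node if none fits.

    demands:  CPU requests in millicores (hypothetical unit choice)
    capacity: CPU available per node, in millicores
    Returns a list of nodes, each a list of the demands placed on it.
    """
    nodes = []
    for demand in sorted(demands, reverse=True):
        if demand > capacity:
            raise ValueError(f"demand {demand} exceeds node capacity {capacity}")
        for node in nodes:
            if sum(node) + demand <= capacity:
                node.append(demand)
                break
        else:
            # No existing node had room, so start a new one.
            nodes.append([demand])
    return nodes


if __name__ == "__main__":
    instances = [500] * 12          # twelve instances wanting half a core each
    node_capacity = 4000            # assumed 4-core nodes
    packed = pack_first_fit_decreasing(instances, node_capacity)
    print(f"one VM per instance: {len(instances)} VMs")
    print(f"collocated:          {len(packed)} nodes")
```

In practice you’d pick smaller VMs for smaller instances, so the gap is rarely this dramatic, but VM sizes come in steps and the leftover capacity adds up across hundreds of instances.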

In Kubernetes-land, there are lots of things that people want that just aren’t fully baked yet. But in IaaS-land, pretty much everything you can think of is already available and integrated by your provider of choice.

Another good reason to just use an IaaS is if your workload uses heavy network or disk IOPS. Kubernetes doesn’t isolate these resources very well (if at all), which means if you need more isolation you should really consider VMs or bare metal.

And last, but not least, and probably most contentious… If your workloads are heavily dependent on the JVM, you might just want to avoid containers altogether for now. The latest versions of Java have a “UseContainerSupport” flag which (finally) fixes most of the issues, but you’re still running a virtual machine inside a container inside another (different kind of) virtual machine, and they all have distinct ideas about resource management that can become a headache if the automatic sizing isn’t good enough for your workload. YMMV.

When is bare metal a better choice?

In my experience, there are a few good reasons to use bare metal:

  1. You absolutely must have your compute local (for latency or security or legal reasons)
  2. Your workloads need to be on the edge (for latency or security)
  3. You need truly HUGE amounts of compute and/or data storage and managing your own datacenters is more cost effective
  4. You have some fancy hardware needs that you can’t get (cheaper) from a cloud provider

The gotcha with many of these is that they don’t actually preclude using an IaaS, CaaS, or PaaS on top or in addition. Almost any situation where bare metal is cost effective is probably also going to require workload collocation in order to actually be cost effective.

Why not both?

If you have enough diversity of workloads, like say web apps AND machine learning AND big data AND databases AND monitoring AND logging (basically any enterprise), you’re probably going to want to have multiple of these platform alternatives available to you so you can use what makes sense for each workload.
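The rules of thumb from the earlier sections can be sketched as a per-workload chooser. This is a toy, not a decision framework: the `Workload` fields, the `suggest_platform` name, and especially the order of the checks are my own simplifications for illustration, and real decisions involve cost, team skills, and plenty of other things that don’t fit in a handful of booleans.

```python
# Toy per-workload chooser encoding the rules of thumb from this post.
# Field names and check ordering are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    event_driven: bool = False          # triggered by events, not always on
    short_lived: bool = False           # each run finishes quickly
    stateless: bool = True
    standard_three_tier: bool = False   # plain web app with SaaS DB and CDN
    io_heavy: bool = False              # heavy network or disk IOPS
    needs_local_or_edge: bool = False   # latency, security, or legal reasons
    needs_special_hardware: bool = False


def suggest_platform(w: Workload) -> str:
    """Return the category this post's heuristics point to first."""
    if w.needs_local_or_edge or w.needs_special_hardware:
        return "bare metal"
    if w.io_heavy:
        return "IaaS"       # more isolation than shared container nodes
    if w.event_driven and w.short_lived:
        return "FaaS"       # can scale to zero between events
    if w.stateless and w.standard_three_tier:
        return "PaaS"
    return "CaaS"           # the flexible default, Kubernetes included


if __name__ == "__main__":
    fleet = [
        Workload("thumbnailer", event_driven=True, short_lived=True),
        Workload("storefront", standard_three_tier=True),
        Workload("database", stateless=False, io_heavy=True),
        Workload("ml-training", stateless=False, needs_special_hardware=True),
        Workload("stream-processor", stateless=False),
    ]
    for w in fleet:
        print(f"{w.name}: {suggest_platform(w)}")
```

Run it over a typical enterprise mix and you get a different answer for almost every workload, which is exactly how people end up wanting more than one platform.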

And if you’re thinking about running multiple alternatives, it starts sounding like a good idea to run them on top of each other: FaaS & PaaS on CaaS on IaaS on Metal (or some variation). This onion approach is awesome because it lets you decide, for each workload, which layer to use.

The downside is that you’re now running five different platforms (probably multiple versions of each) and they all have different usage patterns and workflow requirements and integrations and security concerns and docs and support contracts and specialist operators and training requirements and optimization strategies… and you just bought yourself a decade of backlog to make it production ready, at which point you’re basically a cloud provider yourself. Good luck!

I shall be telling this with a sigh
Somewhere ages and ages hence:
Two (dozen) roads diverged in a wood, and I —
I took all of them.

Cloud Guy. Anthos Solutions Architect at Google (opinions my own). X-Cruise, X-Mesosphere, & X-Pivotal.