Building the largest known Kubernetes cluster, with 130k nodes

cloud.google.com

26 points by TangerineDream 3 days ago

hazz99 2 hours ago

I’m sure this work is very impressive, but these QPS numbers don’t seem particularly high to me, at least compared to existing horizontally scalable service patterns. Why is it hard for the kube control plane to hit these numbers?

For instance, postgres can hit this sort of QPS easily, afaik. It’s not distributed, but I’m sure Vitess could do something similar. The query patterns don’t seem particularly complex either.

Not trying to be reductive - I’m sure there’s some complexity here I’m missing!

phrotoma an hour ago

I am extremely Not A Database Person but I understand that the rationale for Kubernetes adopting etcd as its preferred data store was more about its distributed consistency features and less about query throughput. etcd is slower because it's doing Raft things and flushing stuff to disk.
Projects like kine allow K8s users to swap sqlite or postgres in place of etcd which (I assume, please correct me otherwise) would deliver better throughput since those backends don't need to perform consensus operations.
https://github.com/k3s-io/kine
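
To make the "Raft things" cost concrete, here is a toy sketch, not etcd's actual code path. It assumes a write commits once a majority of nodes have persisted it, and that the leader replicates in parallel with its own disk flush. The function name and all latency numbers are made up for illustration.

```python
def quorum_commit_latency_ms(leader_fsync_ms, follower_ack_ms):
    """Toy model of commit latency for a majority-quorum replicated log.

    leader_fsync_ms: time for the leader to flush the entry to disk.
    follower_ack_ms: per follower, network round trip plus its own flush.
    Both inputs are hypothetical; this is not measured etcd behaviour.
    """
    cluster_size = 1 + len(follower_ack_ms)
    quorum = cluster_size // 2 + 1
    # The entry is committed when the quorum-th fastest node has persisted it.
    persist_times = sorted([leader_fsync_ms] + list(follower_ack_ms))
    return persist_times[quorum - 1]


# A single node (e.g. one SQL server behind kine) only pays for its own flush.
single_node = quorum_commit_latency_ms(2.0, [])

# A three-node cluster also waits for the faster of its two followers.
three_nodes = quorum_commit_latency_ms(2.0, [3.0, 10.0])

print(single_node, three_nodes)
```

In this model the replicated write can never be faster than the single-node write under the same disk, which matches the intuition above. It says nothing about real throughput, which also depends on batching and pipelining.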
- dijit 18 minutes ago
  
  You might not be a database person, but you’re spot on.
  A well managed HA postgresql (active/passive) is going to run circles around etcd for kube controlplane operations.
  The caveat here is increased risk of downtime, and a much higher management overhead, which is why it's not the default.

blurrybird 2 hours ago

AWS and Anthropic did this back in July: https://aws.amazon.com/blogs/containers/amazon-eks-enables-u...

yanhangyhy an hour ago

there is a doc about how to do it with 1M nodes: https://bchess.github.io/k8s-1m/#_why

so i guess the title is not true?

xyse53 2 hours ago

They mention GCS fuse. We've had nothing but performance and stability problems with this.

We treat it as a best effort alternative when native GCS access isn't possible.

dijit 15 minutes ago

FUSE-based filesystems in general shouldn’t be treated as production ready in my experience.
They’re wonderful for low volume, low performance and low reliability operations (browsing, copying, integrating with legacy systems that do not permit native access), but beyond that they consume huge resources and do odd things when the backend is not in its most ideal state.

jakupovic 36 minutes ago

Doing this at anything > 1k nodes is a pain in the butt. We decided to run many <100 nodes clusters rather than a few big ones.

kvrty 5 minutes ago

Same here. Control plane components that don't originate from the Kubernetes project (your ingress controllers, service meshes, etc.) start failing beyond a certain limit. So I don't usually take node numbers from these benchmarks seriously for our kind of workloads. We run a bunch of sub-1k node clusters.

zoobab 2 hours ago

The new mainframe.

rvz 2 hours ago

> While we don’t yet officially support 130K nodes, we're very encouraged by these findings. If your workloads require this level of scale, reach out to us to discuss your specific needs

Obviously this is a typical experiment at Google on running a K8s cluster at 130K nodes, but if there is a company out there that "requires" this scale, I must question their architecture and their infrastructure costs.

But of course someone will always claim that they somehow need this sort of scale to run their enterprise app. But once again, let's remind the pre-revenue startups talking about scale before they hit PMF:

Unless you are ready to donate tens of billions of dollars yearly, you do not need this.

You are not Google.

mlnj 37 minutes ago

>You are not Google.
It's literally Google coming out with this capability, and how is the criticism still "You are not Google"?
- Rastonbury 16 minutes ago
  
  The criticism is at pre-PMF startups who believe they need something similar

belter an hour ago

130k nodes...cute...but can Google conquer the ultimate software engineering challenge they warn you about in CS school? A functional online signup flow?

jasonvorhe an hour ago

For what? Access to the control plane API?
- belter 42 minutes ago
  
  In general... Try to sign up for their AI services...