Kubernetes
DevOps
Infrastructure
Kubernetes Isn’t Your Load Balancer — It’s the Puppet Master Pulling the Strings
February 18, 2026
7 min read
There’s a moment every network engineer hits when diving into Kubernetes.
You’ve spent years tuning hardware load balancers. You know the rhythm: client → load balancer → servers. TLS termination? Easy. Sticky sessions? Done. 200,000 concurrent TCP sessions? Bring it on.
Then Kubernetes shows up and suddenly the vocabulary explodes.
Services. Ingress. API Gateways. Cloud load balancers. Service meshes. NodePort. LoadBalancer type. MetalLB. Cilium.
It feels like someone took a perfectly clean diagram and spilled abstractions all over it.
So what’s actually happening in production?
Let’s break the illusion.
---
## Kubernetes Doesn’t Replace Your Load Balancer
One of the most upvoted responses in the discussion cuts straight to it:
> “No, it just controls load balancers… Kubernetes does not come with a load balancer.”
That’s the punchline.
Kubernetes is not your F5. It’s not magically terminating TLS out of thin air. It’s not secretly managing 200k TCP sessions inside a black box.
It orchestrates.
It configures.
It tells something else to do the heavy lifting.
That “something else” depends entirely on your environment.
In cloud setups, a `Service` of type `LoadBalancer` triggers the cloud controller to provision an external load balancer automatically. In on-prem or bare metal environments, you might use **MetalLB** to advertise service IPs over BGP or ARP. Or maybe you let **Cilium** handle BGP and L2/L3 distribution.
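Here's roughly what that looks like from the cluster's side: a minimal sketch, with placeholder names and ports. The manifest only records intent; whatever controller is watching (cloud controller manager, MetalLB, Cilium) provisions or advertises the actual endpoint.

```yaml
# A Service of type LoadBalancer: Kubernetes records the intent,
# and the cloud controller (or MetalLB / Cilium on bare metal)
# provisions or advertises the load balancer that fronts it.
apiVersion: v1
kind: Service
metadata:
  name: web-frontend        # placeholder name
spec:
  type: LoadBalancer
  selector:
    app: web-frontend       # matches the pods that should receive traffic
  ports:
    - name: http
      port: 80              # port exposed by the external load balancer
      targetPort: 8080      # port the pods actually listen on
```

Nothing in that YAML balances a single packet. It's a request.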
But Kubernetes itself?
It’s the control plane. The traffic workhorse lives elsewhere.
That’s the key mental shift.
---
## Ingress Is Not a Load Balancer (Technically)
This tripped up a few people in the thread, and it’s worth slowing down for.
An Ingress resource isn’t a load balancer. It’s a rule set. A configuration object.
The real muscle sits in the Ingress controller — maybe **NGINX**, maybe **Traefik**, maybe **HAProxy** — running inside the cluster.
One engineer explained it plainly: an Ingress resource for ingress-nginx gets translated into an `nginx.conf` file. That config then handles routing.
Which means this:
Kubernetes defines the intent.
The controller translates it into a real proxy configuration.
The proxy does the actual balancing.
Abstraction layered on abstraction.
It sounds messy at first. But once you realize it’s just config automation, it clicks.
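To see the layering, here's a minimal Ingress sketch; the hostname and service name are placeholders. On its own it does nothing. A controller like ingress-nginx reads it and renders the equivalent proxy configuration.

```yaml
# Ingress = routing intent. The ingress controller (e.g. ingress-nginx)
# translates these rules into real proxy configuration.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-routes
spec:
  ingressClassName: nginx            # which controller should act on this
  rules:
    - host: app.example.com          # placeholder hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-frontend   # placeholder backend Service
                port:
                  number: 80
```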
---
## So Where Does TLS Actually Terminate?
This is where production setups start to diverge.
There isn’t one “correct” answer. There are patterns.
### Pattern 1: Terminate at the Edge (Most Common)
External hardware or cloud load balancer terminates TLS.
Why?
- You can run WAF checks at the edge
- You reduce internal SSL handshakes
- Certificate management is centralized
Several engineers in the thread prefer this. Let the big iron (or cloud LB) handle TLS. Keep the cluster focused on application routing.
Some setups even re-encrypt traffic into the cluster — edge TLS termination, then a second TLS session to the Ingress, followed by mTLS inside the cluster.
Layered security. Clean separation of responsibility.
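As a rough sketch of the edge-termination pattern, here's a LoadBalancer Service annotated so the cloud LB terminates TLS and forwards plain HTTP to the ingress controller. The annotation keys are provider-specific (these are AWS-style) and the certificate ARN is a placeholder; your cloud's keys will differ.

```yaml
# TLS terminates on the cloud load balancer; decrypted traffic is
# forwarded to the ingress controller pods inside the cluster.
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-edge
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-ssl-cert: "arn:aws:acm:REGION:ACCOUNT:certificate/CERT-ID"
    service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "443"
    service.beta.kubernetes.io/aws-load-balancer-backend-protocol: "http"
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: ingress-nginx   # fronts the ingress controller pods
  ports:
    - name: https
      port: 443          # TLS terminated here, at the edge
      targetPort: 80     # plain HTTP into the controller
```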
---
### Pattern 2: Terminate at the Ingress Controller
Instead of offloading to the cloud LB, you configure cert management directly in Kubernetes.
Annotations on an Ingress resource can trigger certificate provisioning via tools like cert-manager. The Ingress controller handles TLS termination itself.
This works well in:
- Cloud-native setups
- Smaller clusters
- Teams that want full control inside Kubernetes
It simplifies external infrastructure. But now your ingress layer needs to scale accordingly.
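A minimal sketch of that setup, assuming cert-manager is installed and a ClusterIssuer named letsencrypt-prod exists (the issuer, hostname, and Secret name are all placeholders):

```yaml
# TLS terminates at the ingress controller. The cert-manager annotation
# asks the referenced ClusterIssuer to provision a certificate into the
# Secret named under spec.tls.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-routes-tls
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod   # placeholder issuer
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - app.example.com
      secretName: app-example-com-tls    # cert-manager writes the cert here
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-frontend
                port:
                  number: 80
```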
---
### Pattern 3: Offload to a Service Mesh
This is where things get spicy.
With **Istio**, TLS can be handled at the gateway or even via sidecar injection into workloads.
The mesh can automatically enforce mTLS between services.
That’s powerful. It’s also operationally heavy.
You don’t adopt this casually. You adopt it when you need service-to-service encryption, traffic shaping, observability, and policy control at scale.
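For scale, the mTLS enforcement piece itself is small. A sketch of a mesh-wide Istio policy, assuming sidecars are already injected into your workloads:

```yaml
# Applied in Istio's root namespace, this becomes the mesh-wide default:
# sidecars reject plaintext service-to-service traffic.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT
```

The YAML is trivial. The operational weight is everything around it: sidecar lifecycle, certificate rotation, debugging encrypted hops.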
---
## What About 200k TCP Sessions?
Here’s the part that reassures the network engineers in the room:
If you’re using a hardware load balancer or cloud LB in front, the performance story is exactly the same as it’s always been.
Kubernetes doesn’t suddenly make TCP weaker.
It uses the same Linux networking stack under the hood that many “traditional” load balancers rely on anyway.
One commenter who was building HPC clusters long before Kubernetes existed put it bluntly:
It’s the same networking stack. Just automated differently.
If you’re running software-based ingress inside the cluster, then scaling becomes horizontal.
More ingress pods.
More nodes.
More endpoints.
Autoscalers kick in. L2/L3 balancing distributes traffic. At that point, it’s capacity planning, not magic.
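A sketch of what that looks like in practice, assuming the controller runs as a Deployment named ingress-nginx-controller (the name and thresholds are placeholders):

```yaml
# Horizontal scaling for a software ingress layer: add controller
# replicas as CPU load grows.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ingress-nginx-controller
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ingress-nginx-controller   # assumed Deployment name
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70     # scale out before the proxies saturate
```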
Kubernetes isn’t counting sessions itself. It’s delegating that responsibility to the infrastructure layer doing the balancing.
---
## Bare Metal Gets Interesting
Cloud makes things easy.
Bare metal? That’s where creativity shows up.
Some engineers run **MetalLB** with BGP, advertising service IPs directly to their routers. One mentioned pairing it with a MikroTik RB5009 handling BGP routes.
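A rough sketch of that setup with current MetalLB CRDs (older releases used a ConfigMap instead). Every address and AS number here is a placeholder:

```yaml
# The pool defines which addresses LoadBalancer Services may receive;
# the peer and advertisement tell MetalLB to announce them over BGP.
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: prod-pool
  namespace: metallb-system
spec:
  addresses:
    - 192.0.2.10-192.0.2.50          # documentation range, replace with yours
---
apiVersion: metallb.io/v1beta2
kind: BGPPeer
metadata:
  name: upstream-router
  namespace: metallb-system
spec:
  myASN: 64512                       # private ASN for the cluster
  peerASN: 64513                     # private ASN of the upstream router
  peerAddress: 192.0.2.1             # router address, placeholder
---
apiVersion: metallb.io/v1beta1
kind: BGPAdvertisement
metadata:
  name: prod-advertisement
  namespace: metallb-system
spec:
  ipAddressPools:
    - prod-pool
```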
Others run a traditional reverse proxy outside the cluster — maybe **Traefik** or **HAProxy** — and forward traffic to NodePort services inside Kubernetes.
And yes, some still integrate actual hardware load balancers, letting Kubernetes dynamically manage configuration while infra teams keep control of the physical devices.
That hybrid model is more common than people admit.
Infra manages hardware.
App teams manage routing rules.
Everyone stays in their lane.
---
## The Real Production Pattern
Strip away the terminology and most real-world architectures look like this:
**Client → External Load Balancer → Ingress Controller → Service → Pods**
That’s it.
Everything else is implementation detail.
Sometimes the external LB is cloud-managed.
Sometimes it’s F5 or Citrix.
Sometimes it’s MetalLB advertising VIPs via BGP.
But Kubernetes sits in the middle, coordinating how traffic should flow once it enters the cluster boundary.
It doesn’t eliminate load balancers.
It industrializes their configuration.
---
## Why This Feels So Different
In traditional environments, networking teams built and managed the entire traffic flow.
In Kubernetes environments, application teams can define routing rules declaratively.
That’s a cultural shift as much as a technical one.
You’re no longer hand-configuring virtual servers on a hardware device.
You’re writing YAML.
And that YAML triggers automation that configures something else.
That indirection can feel uncomfortable if you’re used to seeing every knob directly.
But it’s powerful.
Because now:
- Developers can ship services without ticketing the network team
- Infrastructure can enforce guardrails
- Scaling becomes API-driven
And most importantly, configurations (misconfigurations included) are reproducible and version-controlled.
---
## Does Kubernetes Simplify or Complicate Load Balancing?
Honestly?
Both.
It simplifies provisioning.
It complicates the mental model.
Instead of one box labeled “Load Balancer,” you now have:
- Service objects
- Ingress resources
- Ingress controllers
- Cloud controller managers
- Optional service meshes
- Optional in-cluster L2/L3 load balancers
But under all of it, the same principles apply:
- Layer 4 vs Layer 7
- TLS termination points
- Session handling
- Health checks
- Routing logic
The difference is who manages what — and how automated it is.
---
## The Brutal Truth About Production Kubernetes Networking
Kubernetes doesn’t magically replace traditional load balancers.
It standardizes how you talk to them.
If you need massive throughput and high TCP concurrency, you still rely on proven infrastructure — whether that’s cloud-native LBs, hardware appliances, or high-performance software proxies.
Kubernetes just makes sure:
- They’re configured consistently
- They’re provisioned automatically
- They follow declarative rules
- Developers can’t accidentally break global traffic patterns
It’s not reinventing networking.
It’s orchestrating it.
And once you stop expecting it to be the load balancer — and start seeing it as the automation brain behind the scenes — everything makes sense.
The old flow still exists.
It just has a control plane now.