
Kubernetes Architecture Explained: How It Actually Works

So you’ve heard about Kubernetes, maybe even used it a bit, but when you look at those architecture diagrams, your brain goes “wait, what’s happening here?” I’ve been there. Let me break down how Kubernetes actually works under the hood.

Kubernetes manages your containerized apps across multiple machines. To really get it, you need to understand how the pieces fit together.

The Big Picture

Kubernetes has two main parts:

  1. Control Plane — The brain that makes decisions
  2. Worker Nodes — The machines that run your apps

Everything talks through the API Server, the central switchboard. Almost nothing happens in the cluster without going through it first.

Control Plane Components

The Control Plane is mission control. It watches everything, makes decisions, and keeps your cluster running. Five key components:

API Server

The front door to your cluster. Every request goes through it:

  • Validates and processes requests
  • Updates cluster state
  • Handles authentication

Think of it like a reception desk — everyone goes through it first. This makes security and debugging way easier.
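
To make the reception-desk idea concrete, here is a minimal Python sketch of a request being checked before any state is stored. The token table, the permission set, and the `handle_request` function are all made up for illustration. The real API Server does this with pluggable authenticators, authorization such as RBAC, and admission controllers.

```python
# Toy request pipeline: authenticate -> authorize -> validate -> persist.
# Every name and value here is illustrative, not a Kubernetes API.

TOKENS = {"token-alice": "alice"}                # hypothetical bearer tokens
ALLOWED = {("alice", "create", "deployments")}   # hypothetical permissions
STORE = {}                                       # stands in for etcd


def handle_request(token, verb, resource, obj):
    """Return an (HTTP status, message) pair for one request."""
    user = TOKENS.get(token)
    if user is None:
        return 401, "unauthenticated"
    if (user, verb, resource) not in ALLOWED:
        return 403, "forbidden"
    if not obj.get("name") or obj.get("replicas", 1) < 0:
        return 422, "invalid object"
    STORE[(resource, obj["name"])] = obj         # state changes only at the end
    return 201, "created"


print(handle_request("bad-token", "create", "deployments", {"name": "web"}))
print(handle_request("token-alice", "create", "deployments",
                     {"name": "web", "replicas": 3}))
```

Because every request takes the same path, there is one place to log, audit, and reject it.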

etcd

The cluster’s memory bank. Stores everything:

  • Cluster configuration
  • Desired state (“I want 3 copies of my app”)
  • Current state
  • Secrets and configs

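Production clusters usually run 3 or 5 etcd members, an odd number so that a majority (quorum) survives if one or two members fail. By design, the API Server is the only component that talks to etcd directly.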
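
The reason for those cluster sizes is majority voting: etcd only accepts writes while more than half of its members can reach each other. A quick sketch of the arithmetic (the function names are mine, not part of etcd):

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with `members` voting members."""
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many members can be lost while a majority is still available."""
    return members - quorum(members)


for n in (1, 2, 3, 4, 5):
    print(f"members={n} quorum={quorum(n)} tolerates={tolerated_failures(n)}")
```

Going from 3 to 4 members raises the quorum without tolerating any extra failure, which is why odd sizes are the usual choice.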

Scheduler

The matchmaker. Decides which node should run each Pod by looking at:

  • Resource availability (CPU, memory)
  • Your rules and constraints
  • Data locality

It picks the node but doesn’t start the Pod — that’s the worker node’s job.
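
Roughly speaking, the Scheduler works in two passes: filter out nodes that can't run the Pod, then score the rest. Here is a small Python sketch of that idea. The `Node` class, the `pick_node` function, and the scoring rule (most CPU left over wins) are assumptions for illustration, not the real scheduler's plugins.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_cpu: float                      # cores
    free_mem: int                        # MiB
    labels: dict = field(default_factory=dict)


def pick_node(nodes, cpu, mem, node_selector=None):
    """Return the name of the chosen node, or None if nothing fits."""
    node_selector = node_selector or {}
    # Filter: drop nodes that cannot run the Pod at all.
    feasible = [
        n for n in nodes
        if n.free_cpu >= cpu
        and n.free_mem >= mem
        and all(n.labels.get(k) == v for k, v in node_selector.items())
    ]
    if not feasible:
        return None                      # the Pod would stay Pending
    # Score: prefer the node with the most CPU left after placement
    # (an arbitrary rule chosen for this sketch).
    best = max(feasible, key=lambda n: (n.free_cpu - cpu, n.free_mem - mem))
    return best.name


nodes = [
    Node("node-a", free_cpu=0.5, free_mem=2048),
    Node("node-b", free_cpu=2.0, free_mem=4096, labels={"disk": "ssd"}),
    Node("node-c", free_cpu=4.0, free_mem=1024),
]
print(pick_node(nodes, cpu=1.0, mem=2048))   # node-b: the only node with enough of both
print(pick_node(nodes, cpu=8.0, mem=512))    # None: no node has 8 free cores
```

Note that the function only returns a name. Just like the real Scheduler, it never starts anything itself.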

Controller Manager

The supervisor that keeps things running correctly. If you want 3 replicas, it makes sure you have exactly 3. If one crashes, it starts a new one. This is Kubernetes’ “self-healing” capability.

Different controllers handle different jobs: nodes, replicas, endpoints, permissions, and more.
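
Under the hood, each controller is a loop that compares what you asked for with what exists and closes the gap. A minimal sketch of one pass of such a loop, with a made-up `reconcile` function and Pod names:

```python
import itertools

_ids = itertools.count(1)   # source of unique Pod name suffixes for the sketch


def reconcile(desired_replicas, running):
    """Return a new list of Pod names that matches the desired count."""
    running = list(running)
    while len(running) < desired_replicas:
        running.append(f"web-{next(_ids)}")   # create a missing replica
    while len(running) > desired_replicas:
        running.pop()                          # remove a surplus replica
    return running


pods = reconcile(3, [])        # nothing exists yet, so three Pods are created
pods.remove(pods[0])           # simulate one Pod crashing
pods = reconcile(3, pods)      # the next pass notices and replaces it
print(len(pods))
```

Real controllers run this kind of comparison continuously, triggered by changes they observe through the API Server.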

Cloud Controller Manager

Only for cloud deployments (AWS, GCP, Azure). Translates between Kubernetes and your cloud provider’s API for things like load balancers, storage, and node management.

Worker Node Components

Where your apps actually run. Each node has three main components:

Kubelet

The foreman on each node. It:

  • Watches the API Server for Pods assigned to its node
  • Asks the container runtime to pull images
  • Starts and stops containers through the runtime
  • Reports Pod and node status back

The Control Plane says “run this Pod here,” and Kubelet makes it happen.

Kube Proxy

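The traffic director. Handles Service networking on each node:

  • Implements Services (stable addresses for Pods)
  • Load balances across Pod replicas
  • Updates the node's forwarding rules (such as iptables or IPVS) as Pods come and go

Pod-to-Pod connectivity itself comes from the cluster's network plugin (CNI), not from Kube Proxy.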
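
Here is a rough Python sketch of the Service idea: one stable address in front of a changing set of Pod addresses. The `ServiceTable` class and the IPs are invented for illustration, and random selection is just one possible balancing strategy. Kube Proxy itself programs kernel forwarding rules rather than handling each connection in code like this.

```python
import random


class ServiceTable:
    """Maps a stable Service address to the current set of Pod addresses."""

    def __init__(self, seed=None):
        self._backends = {}
        self._rng = random.Random(seed)

    def update(self, service, pod_addrs):
        # Called whenever the set of Pods behind the Service changes.
        self._backends[service] = list(pod_addrs)

    def route(self, service):
        backends = self._backends.get(service)
        if not backends:
            raise LookupError(f"no ready endpoints for {service}")
        return self._rng.choice(backends)   # pick one replica at random


table = ServiceTable(seed=0)
table.update("10.96.0.10:80", ["10.244.1.5:8080", "10.244.2.7:8080"])
print(table.route("10.96.0.10:80"))

# A Pod is replaced: clients keep using the same Service address.
table.update("10.96.0.10:80", ["10.244.2.7:8080", "10.244.3.2:8080"])
print(table.route("10.96.0.10:80"))
```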

CRI (Container Runtime Interface)

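The container runtime is what actually runs your containers, and the CRI is the interface Kubelet uses to talk to it. Works with containerd, CRI-O, or any CRI-compatible runtime (Docker Engine needs the cri-dockerd adapter, since Kubernetes 1.24 removed the built-in dockershim). Because it's an abstraction layer, you can swap runtimes without changing Kubernetes.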
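
To show what "abstraction layer" buys you, here is a small Python sketch. The `ContainerRuntime` protocol and its two methods are a simplified stand-in that I made up. The real CRI is a gRPC API with many more calls, and the two fake runtimes below only print and return strings.

```python
from typing import Protocol


class ContainerRuntime(Protocol):
    """The small surface this sketch's kubelet depends on (simplified)."""

    def pull_image(self, image: str) -> None: ...
    def start_container(self, name: str, image: str) -> str: ...


class FakeContainerd:
    def pull_image(self, image: str) -> None:
        print(f"containerd: pulling {image}")

    def start_container(self, name: str, image: str) -> str:
        return f"containerd://{name}"


class FakeCrio:
    def pull_image(self, image: str) -> None:
        print(f"cri-o: pulling {image}")

    def start_container(self, name: str, image: str) -> str:
        return f"cri-o://{name}"


def run_pod(runtime: ContainerRuntime, name: str, image: str) -> str:
    """Kubelet-side logic: identical no matter which runtime is plugged in."""
    runtime.pull_image(image)
    return runtime.start_container(name, image)


print(run_pod(FakeContainerd(), "nginx", "nginx"))
print(run_pod(FakeCrio(), "nginx", "nginx"))
```

`run_pod` never mentions a specific runtime, which is the whole point of the interface.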

Pods

Where your containers live. A Pod wraps one or more containers that share:

  • Network (same IP)
  • Storage volumes
  • Node and lifecycle (always scheduled together; CPU and memory limits are normally set per container)

Important: Pods are temporary. They can be destroyed and recreated anytime. Don’t rely on Pod IPs staying the same — use Services instead.
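
A Service stays stable because it finds its Pods by label, not by IP. The sketch below imitates that lookup in Python. The `endpoints` function, the Pod dictionaries, and the IPs are invented for illustration.

```python
def endpoints(selector, pods):
    """IPs of all Pods whose labels contain every selector key/value."""
    return sorted(
        p["ip"] for p in pods
        if all(p["labels"].get(k) == v for k, v in selector.items())
    )


selector = {"app": "web"}
pods = [
    {"name": "web-1", "ip": "10.244.1.5", "labels": {"app": "web"}},
    {"name": "db-1", "ip": "10.244.1.9", "labels": {"app": "db"}},
]
print(endpoints(selector, pods))   # ['10.244.1.5']

# web-1 is replaced by web-2 with a new IP; the selector is unchanged.
pods[0] = {"name": "web-2", "ip": "10.244.2.7", "labels": {"app": "web"}}
print(endpoints(selector, pods))   # ['10.244.2.7']
```

Anything that hard-coded `10.244.1.5` is now broken, while anything that asked the Service still works.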

External Components

kubectl

Your command-line tool for everything. Deploy apps, check status, debug, scale — it all goes through kubectl to the API Server.

Cloud Provider API

Your cloud provider’s services (AWS, GCP, Azure). The Cloud Controller Manager uses this to provision load balancers, storage, and nodes.

How Everything Works Together

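Here’s what happens after you run this command:

  kubectl create deployment nginx --image=nginx

  1. kubectl sends the request to the API Server
  2. API Server authenticates and validates it, then saves the Deployment (“I want nginx”) to etcd
  3. The Deployment controller creates a ReplicaSet, and the ReplicaSet controller creates a Pod object with no node assigned yet
  4. Scheduler picks the best node and records that choice on the Pod through the API Server
  5. Kubelet on that node, which watches the API Server, sees the Pod assigned to it
  6. Kubelet pulls the image and starts the container via CRI
  7. Kubelet reports back: “Pod is running,” and the API Server saves the new state to etcd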

All of this happens in seconds. The cluster constantly monitors itself — controllers watch for issues, Kubelet reports health, and Kube Proxy updates network rules as things change.
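
If it helps, here is a toy Python simulation of that flow. It assumes every component coordinates by reading and writing one shared store, which stands in for the API Server backed by etcd. The function names, node names, and slot counts are all invented, and real components react to watch events instead of being called in sequence.

```python
# Toy cluster: one shared store plays the role of API Server + etcd.
store = {"deployments": {}, "pods": {}}
nodes = {"node-a": 2, "node-b": 4}   # hypothetical free Pod slots per node


def kubectl_create_deployment(name, image, replicas=1):
    store["deployments"][name] = {"image": image, "replicas": replicas}


def controller_step():
    # Create Pod objects until each Deployment has enough of them.
    for name, spec in store["deployments"].items():
        owned = [p for p in store["pods"].values() if p["owner"] == name]
        for i in range(len(owned), spec["replicas"]):
            store["pods"][f"{name}-{i}"] = {
                "owner": name, "image": spec["image"],
                "node": None, "phase": "Pending",
            }


def scheduler_step():
    # Assign each unscheduled Pod to the node with the most free slots.
    for pod in store["pods"].values():
        if pod["node"] is None:
            best = max(nodes, key=nodes.get)
            if nodes[best] > 0:
                pod["node"] = best
                nodes[best] -= 1


def kubelet_step(node):
    # Start Pods assigned to this node and report status back.
    for pod in store["pods"].values():
        if pod["node"] == node and pod["phase"] == "Pending":
            pod["phase"] = "Running"   # image pull and container start go here


kubectl_create_deployment("nginx", "nginx")
controller_step()
scheduler_step()
for node in nodes:
    kubelet_step(node)
print(store["pods"])
```

Notice that no function calls another one directly. Each only looks at the shared state and updates its own part of it.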

Key point: Almost everything goes through the API Server. This might seem inefficient, but it makes everything easier to manage, secure, and debug.

Why This Architecture Works

1. Centralized communication — One place for security, debugging, and versioning

2. Declarative model — You say “I want 3 replicas,” Kubernetes makes it happen and keeps it that way

3. Separation of concerns — Control Plane decides, Workers execute. Scale them independently.

4. Self-healing — Controllers automatically fix problems: restart crashed Pods, create missing replicas, reschedule workloads when nodes fail

Quick Reference

| Component | Location | What It Does |
|---|---|---|
| API Server | Control Plane | Central hub for all requests |
| etcd | Control Plane | Stores all cluster data |
| Scheduler | Control Plane | Assigns Pods to nodes |
| Controller Manager | Control Plane | Maintains desired state |
| Cloud Controller Manager | Control Plane | Bridges with cloud services |
| Kubelet | Worker Node | Runs and monitors Pods |
| Kube Proxy | Worker Node | Handles Service networking |
| Container runtime (via CRI) | Worker Node | Executes containers |
| Pods | Worker Node | Contains your containers |

Practical Tips

For Developers:

  • Design apps to be stateless (Pods can disappear)
  • Use Services, not Pod IPs (they change)
  • Use Deployments, not bare Pods

For Operators:

  • Monitor API Server performance (all traffic goes through it, so it can become the bottleneck)
  • Back up etcd regularly (it has all your cluster state)
  • Make Control Plane highly available
  • Watch node resources (CPU, memory, disk)

Troubleshooting:

  • Pods won’t start? Check Kubelet logs and API Server connectivity
  • Networking broken? Verify Kube Proxy and the network (CNI) plugin are running on the affected node
  • Scheduling issues? Check Scheduler logs and node resources
  • State inconsistent? Check etcd health and Controller Manager logs

Wrapping Up

Kubernetes architecture: Control Plane decides, Worker Nodes execute. The API Server coordinates everything.

Once you understand this, things make sense:

  • Why Pods get recreated on different nodes
  • How Kubernetes knows when to restart something
  • Why everything goes through the API Server
  • How the cluster keeps itself running

Whether you’re just starting or managing production clusters, this mental model helps you deploy better, debug faster, and make smarter decisions.

The next time you see a Kubernetes architecture diagram, you’ll see the Control Plane making decisions, Worker Nodes doing the work, and the API Server coordinating everything in between.

Happy deploying!