## The scheduler needs numbers
When a pod is created, the Kubernetes scheduler must decide which node to place it on. It does not measure actual CPU usage — it compares the pod's declared requests against each node's allocatable capacity minus what existing pods have already requested. A node with 2 CPU allocatable and 1800m already requested has only 200m left to offer new pods.
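You can read both numbers straight off the node. Roughly what the relevant `kubectl describe node` output looks like (node name and figures are illustrative):

```
$ kubectl describe node sim-node-1
...
Allocatable:
  cpu:     2
  memory:  4Gi
...
Allocated resources:
  Resource  Requests      Limits
  --------  --------      ------
  cpu       1800m (90%)   2400m (120%)
  memory    1500Mi (37%)  3Gi (75%)
```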
### Requests vs. limits
```yaml
resources:
  requests:
    cpu: "500m"      # scheduler guarantee — "reserve this much"
    memory: "256Mi"  # also used for OOM kill priority
  limits:
    cpu: "1000m"     # hard ceiling — container is throttled above this
    memory: "512Mi"  # container is OOM-killed if it exceeds this
```
| Field | What it controls |
| --- | --- |
| `requests.cpu` | How much CPU the scheduler reserves on the node |
| `limits.cpu` | CPU throttle ceiling (container slows, not killed) |
| `requests.memory` | Used to rank pods for eviction under memory pressure |
| `limits.memory` | Container is terminated (OOMKilled) if exceeded |
### CPU units
`1` = 1 full core. `500m` = 500 millicores = half a core. You can also write `0.5`. CPU is compressible — exceeding the limit causes throttling, not termination.
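Both notations below request the same half core; a minimal fragment:

```yaml
resources:
  requests:
    cpu: "500m"   # millicores
    # cpu: "0.5"  # decimal form, equivalent to 500m
```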
### Why pods go Pending
When no node has enough unreserved CPU (or memory) to satisfy a pod's requests, the scheduler emits an event:
```
0/2 nodes are available: 2 Insufficient cpu.
```
This is a scheduling failure, not a node failure. Diagnose it with `kubectl describe pod <pending-pod>`.
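Roughly what that diagnosis looks like (pod and node names are illustrative):

```
$ kubectl describe pod web-7d4b9f-xkq2p
...
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  12s   default-scheduler  0/2 nodes are available: 2 Insufficient cpu.
```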
### Capacity math example
| Node | Allocatable CPU | Already requested | Available |
| --- | --- | --- | --- |
| sim-node-1 | 2000m | 1400m | 600m |
| sim-node-2 | 2000m | 1600m | 400m |
A pod requesting 800m cannot fit on either node. A pod requesting 500m fits on sim-node-1.
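For instance, a minimal pod that fits on sim-node-1 (name and image are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: fits-on-node-1
spec:
  containers:
  - name: app
    image: nginx:1.25    # illustrative image
    resources:
      requests:
        cpu: "500m"      # under the 600m still free on sim-node-1
```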
### Right-sizing requests
Setting requests too high causes scheduling failures and wastes money, since reserved capacity sits idle. Setting them too low means your pod may be evicted under memory pressure or starved of CPU under load. A good starting point:
- Run the app under realistic load
- Observe actual usage with `kubectl top pods`
- Set requests ≈ average usage, limits ≈ peak usage × 1.5 (see the sketch below)
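A sketch of that workflow; pod names and numbers are illustrative, and `kubectl top` requires metrics-server:

```
$ kubectl top pods
NAME               CPU(cores)   MEMORY(bytes)
web-7d4b9f-xkq2p   180m         140Mi
web-7d4b9f-z8klm   210m         155Mi

# Average ≈ 200m / 150Mi; observed peak ≈ 350m / 220Mi, so:
#   requests: cpu 200m, memory 150Mi
#   limits:   cpu 500m (≈ 350m × 1.5), memory 330Mi (≈ 220Mi × 1.5)
```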
### LimitRange and ResourceQuota
For multi-team clusters, use a `LimitRange` to enforce default requests/limits per namespace, and a `ResourceQuota` to cap total consumption:
```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
spec:
  limits:
  - type: Container
    default:           # applied as the limit when a container declares none
      cpu: "500m"
      memory: "256Mi"
    defaultRequest:    # applied as the request when a container declares none
      cpu: "100m"
      memory: "64Mi"
```
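And a companion `ResourceQuota` capping the namespace's total requests and limits (name and values are illustrative):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
spec:
  hard:
    requests.cpu: "4"       # sum of all pods' CPU requests in the namespace
    requests.memory: "8Gi"
    limits.cpu: "8"         # sum of all pods' CPU limits
    limits.memory: "16Gi"
```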