How the Scheduler Works
The Kubernetes scheduler watches for unbound Pods (pods with no .spec.nodeName) and runs them through a scheduling cycle to pick the best node, then a binding cycle to commit the choice.
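An unbound pod simply has an empty nodeName; a quick way to check (the pod name here is a placeholder):
kubectl get pod <pod-name> -o jsonpath='{.spec.nodeName}'   # empty output = not yet scheduled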
The main phases, in order:
- PreFilter — cheap pre-checks on the pod and pre-computation of state used by later phases.
- Filter — eliminate nodes that violate hard constraints (taints, nodeSelector, resource requests, PVC availability).
- Score — rank remaining nodes (least-allocated, image locality, etc.).
- Reserve — tentatively claim resources on the winning node.
- Bind — write .spec.nodeName to the API server (this step runs in the binding cycle; see the sketch below).
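The scheduler's only durable output is that one field: a Pod created with .spec.nodeName already set skips scheduling entirely and is picked up by the kubelet on that node. A minimal sketch (the pod and node names are placeholders, not from this article):
apiVersion: v1
kind: Pod
metadata:
  name: pinned-pod          # hypothetical name
spec:
  nodeName: worker-1        # set up front, so the scheduler never sees this pod
  containers:
  - name: app
    image: nginx:1.25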
If no node survives the Filter phase, the pod stays Pending and an event is emitted:
0/2 nodes are available: 2 node(s) didn't match Pod's node affinity/selector.
nodeSelector
The simplest scheduling constraint — a map of required node labels:
spec:
  nodeSelector:
    accelerator: nvidia-tesla-v100
If no ready node carries all of these labels, the pod stays Pending until one does. Use kubectl get nodes --show-labels to confirm which labels exist before writing a nodeSelector.
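If the pod is Pending only because the label is missing, labeling a suitable node is usually enough; the scheduler retries automatically. A sketch with a placeholder node name:
kubectl label nodes <node-name> accelerator=nvidia-tesla-v100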
Node Affinity (preferred over nodeSelector)
Node affinity gives you required (hard) and preferred (soft) rules, plus operators (In, NotIn, Exists):
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: accelerator
            operator: In
            values: ["nvidia-tesla-v100", "nvidia-a100"]
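The soft form is preferredDuringSchedulingIgnoredDuringExecution: each term carries a weight (1-100) that feeds into the Score phase, and nodes that don't match are still eligible. A sketch, assuming you want to favor one zone (the zone value is an example):
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 50              # 1-100; higher = stronger preference
        preference:
          matchExpressions:
          - key: topology.kubernetes.io/zone
            operator: In
            values: ["us-east-1a"]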
Debugging Pending Pods
kubectl describe pod <name> # look at Events section
kubectl get events --field-selector involvedObject.name=<pod>   # events for this pod only
kubectl get nodes --show-labels # verify label availability
Common Filter failures:
| Event Reason | Root Cause |
| --- | --- |
| FailedScheduling | No node matches nodeSelector / affinity |
| FailedScheduling | Insufficient CPU/memory |
| FailedScheduling | Taint not tolerated |
| FailedScheduling | PVC not yet bound |
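All four surface under the same FailedScheduling reason, so the detail you need is in the event message. One way to list them across the namespace (reason is a supported field selector for events):
kubectl get events --field-selector reason=FailedScheduling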
Further Reading
- Kubernetes Scheduler