CrashLoopBackOff is one of the most common statuses you will encounter in Kubernetes. It isn't an error itself, but a signal that a container is crashing repeatedly, and Kubernetes is waiting (backing off) before trying to start it again to avoid overloading the system.
The "BackOff" period is exponential: it starts at 10 seconds and doubles after each restart (10s, 20s, 40s, and so on) until it hits a maximum of 5 minutes.
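The schedule can be sketched in a few lines (a minimal illustration of the doubling-with-cap behavior, not the kubelet's actual implementation):

```python
def backoff_delay(restart_count: int) -> int:
    """CrashLoopBackOff delay in seconds: starts at 10s, doubles, caps at 5 minutes."""
    return min(10 * 2 ** restart_count, 300)

# First six restarts wait 10, 20, 40, 80, 160, then 300 seconds
print([backoff_delay(n) for n in range(6)])  # [10, 20, 40, 80, 160, 300]
```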
1. Common Root Causes
Most crashes fall into these four categories:
Application errors: bugs, unhandled exceptions, or missing environment variables and configuration.
Resource limits: the container exceeds its memory limit and is OOMKilled.
Probe failures: an overly aggressive liveness probe kills a slow-starting app.
Permission issues: the container cannot read or write its mounted volumes.
2. The Troubleshooting Workflow (The "Detective" Toolkit)
When a pod is stuck, run these commands in order:
Step A: Check the Exit Code
Run kubectl describe pod <pod-name> and look at the Containers -> Last State section. The exit code tells you how it died:
Exit Code 1: General error (application crash).
Exit Code 137: OOMKilled. Your container needs more memory.
Exit Code 127: Command not found (typo in your YAML command section).
Exit Code 139: Segmentation fault (memory corruption or library issues).
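Codes above 128 encode a fatal signal (exit code = 128 + signal number), which is how 137 and 139 map to SIGKILL and SIGSEGV. A quick sketch of the decoding:

```python
import signal

def describe_exit_code(code: int) -> str:
    """Map a container exit code to the signal that killed it, when code > 128."""
    if code > 128:
        sig = signal.Signals(code - 128)
        return f"killed by {sig.name}"
    return f"exited with status {code}"

print(describe_exit_code(137))  # killed by SIGKILL (the OOM killer uses SIGKILL)
print(describe_exit_code(139))  # killed by SIGSEGV (segmentation fault)
```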
Step B: View the "Dead" Logs
If the container is currently waiting to restart, kubectl logs might show nothing. You need to see the logs from the previous failed instance:
kubectl logs <pod-name> --previous
Step C: Check Events
At the bottom of the kubectl describe output, check the Events section for messages like "Liveness probe failed." If your Liveness Probe is too aggressive or the app takes too long to start, Kubernetes will kill the pod before it ever gets "Ready."
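A more forgiving probe configuration can look like the sketch below (the /healthz path, port 8080, and the timing values are placeholders to adapt to your app):

```yaml
livenessProbe:
  httpGet:
    path: /healthz          # placeholder health endpoint
    port: 8080
  initialDelaySeconds: 30   # give the app time to boot before the first check
  periodSeconds: 10
  failureThreshold: 3
startupProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 10
  failureThreshold: 30      # up to 30 * 10s = 5 minutes to finish starting
```

While a Startup Probe is failing, Kubernetes holds off on liveness checks, so a slow boot no longer triggers restarts.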
3. How to Fix It
For OOMKilled: Increase the resources.limits.memory in your Deployment YAML.
For Application Crashes: Fix the code or add missing environment variables.
For Probe Issues: Use a Startup Probe if your app is slow to boot, or increase the initialDelaySeconds on your Liveness Probe.
For Permission Issues: Ensure the securityContext allows the container to read/write to the mounted volumes.
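The memory-limit and permission fixes land in the pod spec; a sketch with placeholder names and values:

```yaml
spec:
  securityContext:
    fsGroup: 1000          # pod-level: mounted volumes become group-accessible to GID 1000
  containers:
  - name: app              # placeholder container name
    image: example/app:latest
    resources:
      requests:
        memory: "256Mi"
      limits:
        memory: "512Mi"    # raise this if you keep seeing exit code 137
    securityContext:
      runAsUser: 1000      # run as the UID that owns the volume data
```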
Quick Debugging Checklist
kubectl get pods (confirm status)
kubectl describe pod <pod-name> (check exit code and events)
kubectl logs <pod-name> --previous (see why it crashed)
kubectl edit deployment <name> (apply the fix)
