Probes & Lifecycle

Configure health checks and graceful shutdown for production-ready pods.

Time: ~15 minutes
Difficulty: Intermediate

What You Will Learn

Startup probes: protect slow-starting containers from being killed
Liveness probes: detect deadlocked or crashed containers
Readiness probes: control when a pod receives traffic
Pre-stop hooks: run cleanup before pod termination
Graceful shutdown with SIGTERM and terminationGracePeriodSeconds
Why misconfigured probes are a frequent cause of production incidents

Deploy

Navigate to the demo directory:

cd demos/probes-lifecycle

kubectl apply -f manifests/namespace.yaml
kubectl apply -f manifests/healthy-app.yaml
kubectl apply -f manifests/slow-start-app.yaml
kubectl apply -f manifests/failing-liveness.yaml
kubectl apply -f manifests/graceful-shutdown.yaml

Scenario 1: Healthy App (all three probes)

kubectl get pods -l app=healthy-app -n probes-demo
kubectl describe pod -l app=healthy-app -n probes-demo | grep -E -A 3 "Liveness|Readiness|Startup"

All probes pass. The startup probe runs first; the liveness and readiness probes only begin once it has succeeded.

Scenario 2: Slow-Starting App

The slow-start app takes 15 seconds to initialize:

kubectl get pods -l app=slow-start -n probes-demo -w

Watch the pod go through these phases:

Running but 0/1 Ready (startup probe still checking)
After ~15s, startup probe passes
Readiness probe passes, pod becomes 1/1 Ready

Without the startup probe, the liveness probe could start failing while the app is still initializing, and the kubelet would restart the container before it finished starting.
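The arithmetic behind this protection: a startup probe gives the container up to failureThreshold × periodSeconds to come up, and liveness and readiness checks are not run until it succeeds. The fragment below is only a sketch. slow-start-app.yaml is not reproduced here, so the image name, endpoint path, port, and every number are assumptions chosen to cover a 15-second init.

```yaml
# Hypothetical container fragment; values are illustrative,
# not taken from slow-start-app.yaml.
containers:
  - name: slow-start
    image: example/slow-start:latest   # placeholder image name
    startupProbe:
      httpGet:
        path: /healthz                 # assumed health endpoint
        port: 8080                     # assumed port
      periodSeconds: 2
      failureThreshold: 15             # budget: 15 x 2s = 30s, twice the 15s init
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
      periodSeconds: 5
      failureThreshold: 3              # only counted after the startup probe has passed
```

Sizing the startup budget generously is usually cheap, because the startup probe stops running as soon as it succeeds once.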

Scenario 3: Failing Liveness Probe

The liveness-fail pod removes its health file after 30 seconds:

kubectl get pods liveness-fail -n probes-demo -w

Watch it:

Starts healthy (RESTARTS: 0)
After ~45s, the liveness probe has failed 3 times in a row (the health file disappears at 30s)
Kubernetes restarts the container (RESTARTS: 1)
Cycle repeats

Check the events:

kubectl describe pod liveness-fail -n probes-demo | grep -A 5 "Events:"
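A liveness probe that behaves this way can be sketched with an exec check on a file. failing-liveness.yaml is not shown here, so the image, file path, and timings below are assumptions, picked so that three failed checks at a 5-second period account for the gap between the 30-second file removal and the ~45-second restart.

```yaml
# Hypothetical manifest; only the pod name and namespace come from the commands above.
apiVersion: v1
kind: Pod
metadata:
  name: liveness-fail
  namespace: probes-demo
spec:
  containers:
    - name: app
      image: busybox:1.36              # assumed image
      # Create the health file, remove it after 30s, then keep running.
      command: ["sh", "-c", "touch /tmp/healthy; sleep 30; rm -f /tmp/healthy; sleep 600"]
      livenessProbe:
        exec:
          command: ["cat", "/tmp/healthy"]   # non-zero exit once the file is gone
        periodSeconds: 5
        failureThreshold: 3            # 3 failures x 5s = ~15s before the restart
```

Because the container recreates the file on every start, each restart is healthy for another 30 seconds, which is what produces the repeating cycle.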

Scenario 4: Graceful Shutdown

Delete a pod and watch the graceful shutdown sequence:

# Stream logs in the background (or drop the trailing & and use a second terminal)
kubectl logs -f deploy/graceful-app -n probes-demo &

# Delete a pod
kubectl delete pod -l app=graceful-app -n probes-demo --wait=false

# Watch the shutdown sequence in the logs:
# 1. preStop hook runs (while the pod is being removed from the load balancer)
# 2. SIGTERM delivered to the process
# 3. App drains connections and saves state
# 4. Clean exit

What is Happening

manifests/
  namespace.yaml          # probes-demo namespace
  healthy-app.yaml        # All 3 probes configured correctly
  slow-start-app.yaml     # Startup probe protects 15s init time
  failing-liveness.yaml   # Liveness failure triggers restart
  graceful-shutdown.yaml  # preStop hook + SIGTERM handler

Probe execution order:

Pod starts ──> Startup probe (repeats until success; container restarts if it keeps failing)
                    |
                    v (startup passes)
              Liveness probe ──> fails 3x ──> container restart
              Readiness probe ──> fails ──> removed from Service endpoints

Shutdown sequence:

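kubectl delete pod
    |
    v
1. Pod marked Terminating; terminationGracePeriodSeconds countdown starts
   (the pod is removed from Service endpoints in parallel)
2. preStop hook runs
3. SIGTERM sent to PID 1 of each container
4. App handles SIGTERM (drain, save state)
5. SIGKILL if still running when the grace period expires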
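In a pod spec, the two settings involved look roughly like the sketch below. graceful-shutdown.yaml is not reproduced here, so the image name, the sleep duration, and the 30-second grace period (which is also the Kubernetes default) are assumptions.

```yaml
# Hypothetical pod template fragment; not the demo's actual manifest.
spec:
  terminationGracePeriodSeconds: 30    # total budget for preStop plus SIGTERM handling
  containers:
    - name: graceful-app
      image: example/graceful-app:latest   # placeholder image name
      lifecycle:
        preStop:
          exec:
            # Runs before SIGTERM is sent; here it only waits so routing
            # updates can settle (assumes the image ships `sh` and `sleep`).
            command: ["sh", "-c", "sleep 5"]
```

Time spent in the preStop hook counts against the grace period, so the hook's duration plus the app's own drain time has to fit within terminationGracePeriodSeconds.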

Probe types:

| Probe | Method | Best For |
|---|---|---|
| `httpGet` | HTTP GET to a path/port | Web apps with health endpoints |
| `exec` | Run a command, check exit code | File checks, CLI tools |
| `tcpSocket` | TCP connection to a port | Databases, non-HTTP services |
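Each method maps to one handler field under a probe. The three fragments below are sketches with assumed paths, ports, and commands; they are shown under livenessProbe, but the same handlers can be used in startup and readiness probes.

```yaml
# 1. httpGet: succeeds on a response status of 200-399
livenessProbe:
  httpGet:
    path: /healthz      # assumed endpoint
    port: 8080          # assumed port
---
# 2. exec: succeeds when the command exits with status 0
livenessProbe:
  exec:
    command: ["cat", "/tmp/healthy"]   # assumed file
---
# 3. tcpSocket: succeeds when the port accepts a connection
livenessProbe:
  tcpSocket:
    port: 5432          # assumed port (for example, PostgreSQL)
```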

Experiment

Remove the startup probe from slow-start-app.yaml and redeploy. Watch the liveness probe kill the container before it finishes starting.

Make the readiness probe fail by removing the file it checks:

kubectl exec -it deploy/healthy-app -n probes-demo -- rm /usr/share/nginx/html/index.html

The pod becomes 0/1 Ready and is removed from the Service's endpoints. Restore the file:

kubectl exec -it deploy/healthy-app -n probes-demo -- sh -c 'echo ok > /usr/share/nginx/html/index.html'

Check the Service endpoints while the readiness probe is failing (before restoring the file):
```
kubectl get endpoints healthy-app -n probes-demo
```
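Why deleting the file flips readiness: a readiness probe that requests the page starts getting an error response once the file is gone. The sketch below assumes an nginx container probed over HTTP on port 80. healthy-app.yaml is not shown here and may use a different handler, such as an exec check on the same file, so treat the handler type, image tag, and timings as assumptions.

```yaml
# Hypothetical fragment; handler type and timings are assumptions.
containers:
  - name: healthy-app
    image: nginx:1.27                  # assumed image tag
    readinessProbe:
      httpGet:
        path: /index.html              # 404 once the file is removed, so the probe fails
        port: 80
      periodSeconds: 5
      failureThreshold: 1              # mark unready after a single failed check
```

Unlike a liveness failure, a readiness failure does not restart the container; the pod only stops receiving Service traffic until the probe passes again.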

Cleanup

kubectl delete namespace probes-demo

Next Step

Move on to Network Policies to learn how to control pod-to-pod traffic.