Probes & Lifecycle: Deep Dive
This document explains how the kubelet executes probes, how the three probe types interact, the full pod shutdown sequence, and common misconfigurations that cause production incidents.
The Kubelet Probe Execution Model
The kubelet runs probes, not the API server. Each node's kubelet is responsible for probing every container on that node. The kubelet runs probes in a dedicated goroutine per container per probe type.
A pod with all three probes configured (startup, liveness, readiness) has three independent goroutines checking that container. These goroutines run on their own timers. They do not coordinate with each other.
The kubelet performs probes from inside the node. For HTTP probes, the request comes from the node’s IP, not from another pod. For exec probes, the kubelet invokes the command inside the container’s namespace using the container runtime API.
Probe Types in Detail
HTTP Probes
The kubelet sends an HTTP GET request to the specified path and port. Any status code between 200 and 399 counts as success. Anything else is failure.
```yaml
livenessProbe:
  httpGet:
    path: /
    port: 80
  initialDelaySeconds: 0
  periodSeconds: 10
  failureThreshold: 3
```

The kubelet does not follow redirects. A 301 or 302 response is a success (it falls within 200-399), but the kubelet will not follow the redirect to the new location. If your health endpoint redirects, that redirect counts as a passing probe.
You can set custom HTTP headers:
```yaml
livenessProbe:
  httpGet:
    path: /health
    port: 8080
    httpHeaders:
      - name: Authorization
        value: Bearer token123
```

The host defaults to the pod IP. You can override it with `host`, but this is rarely needed and can break things if set incorrectly.
Exec Probes
The kubelet runs a command inside the container. Exit code 0 means success. Any other exit code means failure.
```yaml
livenessProbe:
  exec:
    command: ["test", "-f", "/tmp/healthy"]
  initialDelaySeconds: 5
  periodSeconds: 5
  failureThreshold: 3
```

The demo's liveness-fail pod uses an exec probe to check for a file:
```yaml
containers:
  - name: app
    image: busybox:1.36
    command:
      - /bin/sh
      - -c
      - |
        touch /tmp/healthy
        sleep 30
        echo "Simulating crash - removing health file"
        rm /tmp/healthy
        sleep infinity
    livenessProbe:
      exec:
        command: ["test", "-f", "/tmp/healthy"]
      initialDelaySeconds: 5
      periodSeconds: 5
      failureThreshold: 3
```

After 30 seconds, the health file disappears. The liveness probe fails three consecutive times (at 5-second intervals), and the kubelet restarts the container.
Exec probes create a new process inside the container on every check. For high-frequency probes, this adds CPU overhead. The spawned process inherits the container’s resource limits.
One subtle issue: if the exec command hangs (for example, connecting to a database that is stalled), it will block until the timeoutSeconds expires. The default timeout is 1 second.
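If an exec command can legitimately take longer than a second, raise `timeoutSeconds` explicitly rather than relying on the default. A minimal sketch, assuming a `pg_isready`-style check against a local PostgreSQL (the command and values are illustrative):

```yaml
livenessProbe:
  exec:
    # This check can stall if the database is busy; -t caps its own wait.
    command: ["pg_isready", "-U", "postgres", "-t", "2"]
  periodSeconds: 10
  timeoutSeconds: 5   # default is 1s; a stalled command is cut off here
  failureThreshold: 3
```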
TCP Socket Probes
The kubelet attempts a TCP connection to the specified port. If the connection succeeds (TCP handshake completes), the probe passes. If the connection is refused or times out, the probe fails.
```yaml
livenessProbe:
  tcpSocket:
    port: 5432
  periodSeconds: 10
```

TCP probes are useful for non-HTTP services like databases, message brokers, or custom TCP servers. They confirm the port is open but say nothing about whether the service is actually processing requests.
gRPC Probes
Since Kubernetes v1.27 (stable), the kubelet can perform native gRPC health checks. The target service must implement the gRPC Health Checking Protocol.
```yaml
livenessProbe:
  grpc:
    port: 50051
    service: "my.package.MyService"  # Optional, defaults to ""
  periodSeconds: 10
```

If `service` is empty, the probe checks overall server health. If specified, it checks the health of that particular service. The gRPC response status SERVING means success. Everything else means failure.
Probe Timing Parameters
Every probe type accepts the same timing parameters:
| Parameter | Default | What It Controls |
|---|---|---|
| initialDelaySeconds | 0 | Wait this long after container start before first probe |
| periodSeconds | 10 | Time between consecutive probes |
| timeoutSeconds | 1 | Max time to wait for a probe response |
| successThreshold | 1 | Consecutive successes to consider healthy |
| failureThreshold | 3 | Consecutive failures to consider unhealthy |
The effective time from first failure to action is:
```
failureThreshold * periodSeconds
```

For the demo's healthy-app liveness probe:

```yaml
livenessProbe:
  httpGet:
    path: /
    port: 80
  initialDelaySeconds: 0
  periodSeconds: 10
  failureThreshold: 3
```

Time from first failure to container restart: 3 * 10 = 30 seconds.
For readiness probes, successThreshold matters. A pod removed from Service endpoints must pass successThreshold consecutive checks before it is added back. The default of 1 means a single success re-adds the pod.
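To make that concrete, a sketch of a readiness probe that demands several consecutive passes before re-admission (the endpoint path is illustrative):

```yaml
readinessProbe:
  httpGet:
    path: /health/ready
    port: 8080
  periodSeconds: 5
  failureThreshold: 1   # one failure removes the pod from endpoints
  successThreshold: 3   # 3 consecutive passes (~15s) before it is re-added
```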
Startup Probe Interaction with Liveness
The startup probe exists to solve one problem: slow-starting containers.
Before startup probes existed, developers set large initialDelaySeconds on liveness probes to give containers time to start. This created a blind spot where a truly stuck container would not be detected for a long time.
The startup probe works differently from liveness and readiness:
- While the startup probe is active, the kubelet disables both liveness and readiness probes.
- The startup probe runs until it succeeds once or exceeds its failure threshold.
- On success, the startup probe is permanently disabled and liveness/readiness take over.
- On failure (exceeding the threshold), the container is killed and restarted.
The demo’s slow-start app shows this clearly:
```yaml
containers:
  - name: app
    startupProbe:
      httpGet:
        path: /
        port: 8080
      failureThreshold: 20
      periodSeconds: 2
    livenessProbe:
      httpGet:
        path: /
        port: 8080
      periodSeconds: 10
    readinessProbe:
      exec:
        command: ["test", "-f", "/tmp/healthy"]
      periodSeconds: 3
```

The startup probe allows up to 20 * 2 = 40 seconds for the app to start. During this window, liveness and readiness are completely paused. Once the startup probe passes, the liveness probe kicks in with a tight 10-second interval.
This gives you the best of both worlds: tolerance for slow starts and fast detection of runtime failures.
Pod Lifecycle Phases
A pod moves through defined phases:
| Phase | Meaning |
|---|---|
| Pending | Accepted by API server, waiting to be scheduled or pull images |
| Running | At least one container is running or starting |
| Succeeded | All containers terminated with exit code 0 |
| Failed | All containers terminated, at least one with non-zero exit |
| Unknown | Node communication lost |
Within the Running phase, individual containers can be in states: Waiting, Running, or Terminated. A pod can be Running but have a container in Waiting (for example, during a restart after a liveness failure).
The Full Shutdown Sequence
When a pod is deleted, the following happens in order:
Step 1: Pod Marked for Deletion
The API server sets deletionTimestamp on the pod. The pod enters Terminating state. The endpoints controller observes the change and removes the pod's IP from all Service endpoints; this happens asynchronously, in parallel with the kubelet's shutdown steps (see the race condition below).
Step 2: preStop Hook Executes
If defined, the preStop hook runs. The demo uses this to simulate removing from a load balancer:
```yaml
lifecycle:
  preStop:
    exec:
      command:
        - /bin/sh
        - -c
        - |
          echo "[$(date)] preStop hook: removing from load balancer"
          sleep 3
```

The preStop hook runs before SIGTERM is sent. This is the place to drain connections, deregister from service discovery, or close resources.
The hook can be:
- An `exec` command (runs inside the container)
- An `httpGet` request (calls an endpoint on the container)
Step 3: SIGTERM Sent to PID 1
After the preStop hook completes, the kubelet sends SIGTERM to the container's PID 1. The application should handle this signal to perform graceful shutdown:
```yaml
command:
  - /bin/sh
  - -c
  - |
    cleanup() {
      echo "[$(date)] Received SIGTERM, starting graceful shutdown..."
      echo "[$(date)] Draining connections..."
      sleep 5
      echo "[$(date)] Saving state..."
      sleep 2
      echo "[$(date)] Shutdown complete."
      exit 0
    }
    trap cleanup TERM

    echo "[$(date)] Application started (PID $$)"
    while true; do
      sleep 1
    done
```

Step 4: Grace Period Countdown
A common assumption is that the terminationGracePeriodSeconds timer starts when SIGTERM is sent. In reality, the grace period starts when the kubelet begins the termination sequence: the preStop hook and the SIGTERM handling share the same window. If your preStop hook takes 20 seconds and your grace period is 30, the application only has 10 seconds to handle SIGTERM before SIGKILL.
```yaml
spec:
  terminationGracePeriodSeconds: 30
```

The default is 30 seconds. Set this based on how long your application needs to drain.
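To make the shared window concrete, here is a sketch of a pod whose preStop hook consumes most of the budget (durations are illustrative):

```yaml
spec:
  terminationGracePeriodSeconds: 30   # shared by preStop AND SIGTERM handling
  containers:
    - name: app
      image: busybox:1.36
      lifecycle:
        preStop:
          exec:
            # 20s of the 30s budget is spent here...
            command: ["/bin/sh", "-c", "sleep 20"]
      # ...leaving only ~10s for the app's SIGTERM handler before SIGKILL
```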
Step 5: SIGKILL
If the process is still running after the grace period, the kubelet sends SIGKILL. This cannot be caught or handled. The process is immediately terminated.
The Race Condition
There is a well-known race condition in Kubernetes shutdown. The endpoints controller and the kubelet operate independently. When a pod enters Terminating:
- The endpoints controller removes the pod from Service endpoints.
- The kubelet starts the preStop hook.
- kube-proxy on each node updates its iptables rules.
Steps 1 and 2 happen concurrently. Step 3 takes additional time. During this window, traffic may still be routed to the pod even though it is shutting down.
The fix: use a preStop hook with a short sleep (3-5 seconds) before beginning shutdown. This gives kube-proxy time to update its routing rules.
```yaml
lifecycle:
  preStop:
    exec:
      command: ["/bin/sh", "-c", "sleep 5"]
```

Container Hooks
Kubernetes supports two container lifecycle hooks:
postStart
Runs immediately after the container is created. It runs concurrently with the container's entrypoint. There is no guarantee that postStart finishes before the container starts accepting traffic.
```yaml
lifecycle:
  postStart:
    exec:
      command: ["/bin/sh", "-c", "echo started > /tmp/startup"]
```

If postStart fails, the container is killed. But because it runs asynchronously with the container entrypoint, the container might have already started doing work.
preStop
Runs before SIGTERM. Blocks SIGTERM until it completes (or the grace period expires). This is the correct place for graceful shutdown preparation.
preStop and SIGTERM are complementary. Use preStop for infrastructure work (deregister, drain). Use SIGTERM handlers for application work (close database connections, flush caches).
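A sketch of that division of labor in a single manifest (the sleep duration is illustrative):

```yaml
lifecycle:
  preStop:
    exec:
      # Infrastructure work: pause so kube-proxy stops routing traffic here.
      command: ["/bin/sh", "-c", "sleep 5"]
# Application work belongs in the container's own SIGTERM handler:
# close database connections, flush caches, finish in-flight requests.
```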
Common Misconfigurations
1. Liveness Probe Checks External Dependencies
The single most common probe misconfiguration. If your liveness probe checks a database or external API, and that dependency goes down, the kubelet restarts your container. This does not fix the external dependency. It creates a cascade of restarts across all pods.
Rule: Liveness probes should check only the container’s own internal state. Can the process respond? Is it deadlocked? That is all.
Use readiness probes for dependency checks. A failing readiness probe removes the pod from traffic without killing it.
2. Missing Startup Probe for Slow Starters
Without a startup probe, the liveness probe runs during startup. If the app takes 60 seconds to start and the liveness probe allows 30 seconds before restarting, the container never starts. It enters a restart loop.
Setting a high initialDelaySeconds on the liveness probe is the old workaround. The startup probe is better because it does not create a blind spot for runtime failures.
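The old workaround looked roughly like this (the 60-second delay is illustrative). During the entire delay, a container that hangs at startup, or deadlocks shortly after, goes undetected:

```yaml
# Old workaround: no startup probe, just a long initial delay.
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 60   # blind spot: failures in the first 60s go unnoticed
  periodSeconds: 10
```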
3. Identical Liveness and Readiness Probes
If your liveness and readiness probes check the same endpoint with the same thresholds, a failing probe both removes the pod from traffic AND restarts it simultaneously. The readiness removal is pointless because the container is about to be killed anyway.
Make them different. Readiness can be more sensitive (lower threshold). Liveness should have a higher tolerance for transient failures.
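One way to differentiate them, assuming separate health endpoints (paths and thresholds are illustrative):

```yaml
readinessProbe:
  httpGet:
    path: /health/ready   # may check dependencies
    port: 8080
  periodSeconds: 5
  failureThreshold: 1     # sensitive: drop traffic quickly
livenessProbe:
  httpGet:
    path: /health/live    # process-internal check only
    port: 8080
  periodSeconds: 10
  failureThreshold: 6     # tolerant: restart only after ~60s of failures
```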
4. Aggressive Probe Timing
Setting periodSeconds: 1 and failureThreshold: 1 means a single 1-second hiccup restarts the container. This causes unnecessary restarts during brief load spikes or GC pauses.
Reasonable defaults for production:
```yaml
livenessProbe:
  periodSeconds: 10
  failureThreshold: 3
  timeoutSeconds: 3

readinessProbe:
  periodSeconds: 5
  failureThreshold: 1
  timeoutSeconds: 3
```

5. Not Handling SIGTERM
Many applications do not handle SIGTERM by default. The process receives the signal and does nothing. After 30 seconds (grace period), SIGKILL forces an abrupt termination. In-flight requests are dropped. Database transactions are left open.
Most web frameworks require explicit SIGTERM handling. FastAPI, Flask, Express, Spring Boot all have their own shutdown hooks. Configure them.
6. preStop Hook Exceeds Grace Period
If the preStop hook takes longer than terminationGracePeriodSeconds, SIGTERM is never sent. The process goes straight from preStop hook timeout to SIGKILL. The application gets no chance to clean up.
Keep preStop hooks short. Increase the grace period if needed.
Production Probe Tuning Guide
Web Applications (HTTP APIs)
```yaml
startupProbe:
  httpGet:
    path: /health/startup
    port: 8080
  failureThreshold: 30
  periodSeconds: 2   # Allows up to 60s for startup

livenessProbe:
  httpGet:
    path: /health/live
    port: 8080
  periodSeconds: 15
  failureThreshold: 3
  timeoutSeconds: 5

readinessProbe:
  httpGet:
    path: /health/ready
    port: 8080
  periodSeconds: 5
  failureThreshold: 2
  successThreshold: 1
  timeoutSeconds: 3
```

Use three different health endpoints:
- /health/startup: Checks that initialization is complete (migrations run, caches warmed).
- /health/live: Checks the process is alive (not deadlocked, not OOM).
- /health/ready: Checks dependencies (database connected, downstream services reachable).
Databases and Stateful Services
```yaml
startupProbe:
  exec:
    command: ["pg_isready", "-U", "postgres"]
  failureThreshold: 60
  periodSeconds: 5   # Allows up to 5 minutes for recovery

livenessProbe:
  exec:
    command: ["pg_isready", "-U", "postgres"]
  periodSeconds: 30
  failureThreshold: 5
  timeoutSeconds: 10
```

Databases need more tolerance. A PostgreSQL vacuum or WAL replay can make the server temporarily unresponsive. Aggressive liveness probes would kill it during recovery.
Background Workers (No HTTP)
```yaml
livenessProbe:
  exec:
    command:
      - /bin/sh
      - -c
      - "kill -0 $(cat /var/run/worker.pid)"
  periodSeconds: 30
  failureThreshold: 3
```

For processes without HTTP endpoints, use exec probes that check the process is alive or that a heartbeat file was recently updated.
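A sketch of the heartbeat-file variant, assuming the worker touches /tmp/heartbeat on each loop iteration (the path and freshness window are illustrative):

```yaml
livenessProbe:
  exec:
    # Fails unless the heartbeat file was modified within the last 2 minutes.
    command:
      - /bin/sh
      - -c
      - 'test -n "$(find /tmp/heartbeat -mmin -2 2>/dev/null)"'
  periodSeconds: 30
  failureThreshold: 3
```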
Readiness Gates
Beyond probes, Kubernetes supports readiness gates. These are custom conditions that must be true for a pod to be considered ready. External controllers set these conditions via the pod status API.
```yaml
spec:
  readinessGates:
    - conditionType: "my-custom-condition"
```

Until an external controller sets my-custom-condition to True, the pod remains not-ready even if all readiness probes pass. This is used by service meshes and load balancer controllers to add their own readiness logic.
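The external controller patches the pod's status with a matching condition. The resulting status looks roughly like this (a sketch; the condition type mirrors the gate above):

```yaml
status:
  conditions:
    - type: "my-custom-condition"   # set by the external controller, not the kubelet
      status: "True"
    - type: Ready                   # True only once probes AND all gates pass
      status: "True"
```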
Summary
| Probe | When It Runs | Failure Action | Use For |
|---|---|---|---|
| Startup | Before liveness/readiness | Kill + restart | Slow-starting containers |
| Liveness | After startup passes | Kill + restart | Deadlock detection |
| Readiness | After startup passes | Remove from Service | Dependency health |
The probes and lifecycle hooks together form the contract between Kubernetes and your application. Get them right and your services handle failures gracefully. Get them wrong and a single dependency blip cascades into a cluster-wide outage.