
Probes & Lifecycle: Deep Dive

This document explains how the kubelet executes probes, how the three probe types interact, the full pod shutdown sequence, and common misconfigurations that cause production incidents.

The kubelet runs probes, not the API server. Each node’s kubelet is responsible for probing every container on that node. The kubelet runs probes in a dedicated goroutine per container per probe type.

A pod with all three probes configured (startup, liveness, readiness) has three independent goroutines checking that container. These goroutines run on their own timers. They do not coordinate with each other.

The kubelet performs probes from inside the node. For HTTP probes, the request comes from the node’s IP, not from another pod. For exec probes, the kubelet invokes the command inside the container’s namespace using the container runtime API.

The kubelet sends an HTTP GET request to the specified path and port. Any status code between 200 and 399 counts as success. Anything else is failure.

livenessProbe:
  httpGet:
    path: /
    port: 80
  initialDelaySeconds: 0
  periodSeconds: 10
  failureThreshold: 3
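The success rule can be captured in a one-line predicate (illustrative only, not kubelet source):

```python
def http_probe_ok(status_code: int) -> bool:
    # Mirrors the kubelet's rule: any 2xx or 3xx status passes.
    return 200 <= status_code < 400
```

A 503 from an overloaded health endpoint fails the probe; a 302 passes it.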

The kubelet handles redirects in a limited way. A 301 or 302 response counts as success (it falls within 200-399). The kubelet follows a redirect only when it points to the same host; a redirect to a different hostname is reported as a success (with a ProbeWarning event) and is not followed. If your health endpoint redirects off-host, that redirect counts as a passing probe even though the target was never checked.

You can set custom HTTP headers:

livenessProbe:
  httpGet:
    path: /health
    port: 8080
    httpHeaders:
    - name: Authorization
      value: Bearer token123

The host defaults to the pod IP. You can override it with host, but this is rarely needed and can break things if set incorrectly.

The kubelet runs a command inside the container. Exit code 0 means success. Any other exit code means failure.

livenessProbe:
  exec:
    command: ["test", "-f", "/tmp/healthy"]
  initialDelaySeconds: 5
  periodSeconds: 5
  failureThreshold: 3

The demo’s liveness-fail pod uses an exec probe to check for a file:

containers:
- name: app
  image: busybox:1.36
  command:
  - /bin/sh
  - -c
  - |
    touch /tmp/healthy
    sleep 30
    echo "Simulating crash - removing health file"
    rm /tmp/healthy
    sleep infinity
  livenessProbe:
    exec:
      command: ["test", "-f", "/tmp/healthy"]
    initialDelaySeconds: 5
    periodSeconds: 5
    failureThreshold: 3

After 30 seconds, the health file disappears. The liveness probe fails three consecutive times (at 5-second intervals), and the kubelet restarts the container.

Exec probes create a new process inside the container on every check. For high-frequency probes, this adds CPU overhead. The spawned process inherits the container’s resource limits.

One subtle issue: if the exec command hangs (for example, connecting to a database that is stalled), it will block until the timeoutSeconds expires. The default timeout is 1 second.
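The semantics can be approximated with subprocess (a sketch; the real kubelet execs through the container runtime API, not on the host):

```python
import subprocess

def exec_probe(command: list, timeout_seconds: float = 1.0) -> bool:
    # Exit code 0 passes; any other exit code, or a hang past
    # timeoutSeconds, fails the probe.
    try:
        completed = subprocess.run(command, timeout=timeout_seconds,
                                   capture_output=True)
        return completed.returncode == 0
    except subprocess.TimeoutExpired:
        return False
```

Note how a hung command consumes the full timeout before the check fails, which is why slow exec probes delay failure detection.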

The kubelet attempts a TCP connection to the specified port. If the connection succeeds (TCP handshake completes), the probe passes. If the connection is refused or times out, the probe fails.

livenessProbe:
  tcpSocket:
    port: 5432
  periodSeconds: 10

TCP probes are useful for non-HTTP services like databases, message brokers, or custom TCP servers. They confirm the port is open but say nothing about whether the service is actually processing requests.
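The entire check boils down to a connect attempt (a sketch of the semantics, not kubelet code):

```python
import socket

def tcp_probe(host: str, port: int, timeout_seconds: float = 1.0) -> bool:
    # Passes if the TCP handshake completes; the connection is
    # closed immediately, so nothing is sent to the service.
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        return False
```

This is exactly why a TCP probe can pass against a database that accepts connections but is too overloaded to answer queries.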

Since Kubernetes v1.27 (stable), the kubelet can perform native gRPC health checks. The target service must implement the gRPC Health Checking Protocol.

livenessProbe:
  grpc:
    port: 50051
    service: "my.package.MyService"  # Optional, defaults to ""
  periodSeconds: 10

If service is empty, the probe checks overall server health. If specified, it checks the health of that particular service. The gRPC response status SERVING means success. Everything else means failure.

Every probe type accepts the same timing parameters:

| Parameter | Default | What It Controls |
| --- | --- | --- |
| initialDelaySeconds | 0 | Wait this long after container start before the first probe |
| periodSeconds | 10 | Time between consecutive probes |
| timeoutSeconds | 1 | Max time to wait for a probe response |
| successThreshold | 1 | Consecutive successes to consider healthy (must be 1 for liveness and startup) |
| failureThreshold | 3 | Consecutive failures to consider unhealthy |

The effective time from first failure to action is:

failureThreshold * periodSeconds

For the demo’s healthy-app liveness probe:

livenessProbe:
  httpGet:
    path: /
    port: 80
  initialDelaySeconds: 0
  periodSeconds: 10
  failureThreshold: 3

Time from first failure to container restart: 3 * 10 = 30 seconds.

For readiness probes, successThreshold matters. A pod removed from Service endpoints must pass successThreshold consecutive checks before it is added back. The default of 1 means a single success re-adds the pod.
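Back-of-the-envelope helpers for these two windows (hypothetical functions; the parameter names mirror the probe fields):

```python
def detection_window(period_seconds: int, failure_threshold: int) -> int:
    # failureThreshold * periodSeconds: time from the first failed
    # probe until the kubelet acts (restart, or endpoint removal).
    return failure_threshold * period_seconds

def readd_window(period_seconds: int, success_threshold: int = 1) -> int:
    # Consecutive successes a readiness probe needs before the
    # pod is put back into Service endpoints.
    return success_threshold * period_seconds
```

For the healthy-app config above, `detection_window(10, 3)` gives the 30 seconds quoted earlier.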

The startup probe exists to solve one problem: slow-starting containers.

Before startup probes existed, developers set large initialDelaySeconds on liveness probes to give containers time to start. This created a blind spot where a truly stuck container would not be detected for a long time.

The startup probe works differently from liveness and readiness:

  1. While the startup probe is active, the kubelet disables both liveness and readiness probes.
  2. The startup probe runs until it succeeds once or exceeds its failure threshold.
  3. On success, the startup probe is permanently disabled and liveness/readiness take over.
  4. On failure (exceeding the threshold), the container is killed and restarted.
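The gating logic amounts to a tiny state machine. A sketch of the rules above (not kubelet code):

```python
def active_probes(startup_configured: bool, startup_succeeded: bool) -> set:
    # Until a configured startup probe succeeds once, it is the only
    # probe the kubelet runs; afterwards it is permanently disabled
    # and liveness/readiness take over.
    if startup_configured and not startup_succeeded:
        return {"startup"}
    return {"liveness", "readiness"}
```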

The demo’s slow-start app shows this clearly:

containers:
- name: app
  startupProbe:
    httpGet:
      path: /
      port: 8080
    failureThreshold: 20
    periodSeconds: 2
  livenessProbe:
    httpGet:
      path: /
      port: 8080
    periodSeconds: 10
  readinessProbe:
    exec:
      command: ["test", "-f", "/tmp/healthy"]
    periodSeconds: 3

The startup probe allows up to 20 * 2 = 40 seconds for the app to start. During this window, liveness and readiness are completely paused. Once the startup probe passes, the liveness probe kicks in with a tight 10-second interval.

This gives you the best of both worlds: tolerance for slow starts and fast detection of runtime failures.

A pod moves through defined phases:

| Phase | Meaning |
| --- | --- |
| Pending | Accepted by the API server; waiting to be scheduled or pulling images |
| Running | At least one container is running or starting |
| Succeeded | All containers terminated with exit code 0 |
| Failed | All containers terminated, at least one with a non-zero exit |
| Unknown | Node communication lost |

Within the Running phase, individual containers can be in states: Waiting, Running, or Terminated. A pod can be Running but have a container in Waiting (for example, during a restart after a liveness failure).

When a pod is deleted, the following happens in order:

The API server sets deletionTimestamp on the pod, and the pod enters the Terminating state. The endpoints controller begins removing the pod’s IP from all Service endpoints.

If defined, the preStop hook runs. The demo uses this to simulate removing from a load balancer:

lifecycle:
  preStop:
    exec:
      command:
      - /bin/sh
      - -c
      - |
        echo "[$(date)] preStop hook: removing from load balancer"
        sleep 3

The preStop hook runs before SIGTERM is sent. This is the place to drain connections, deregister from service discovery, or close resources.

The hook can be:

  • An exec command (runs inside the container)
  • An httpGet request (calls an endpoint on the container)

After the preStop hook completes, the kubelet sends SIGTERM to the container’s PID 1. The application should handle this signal to perform graceful shutdown:

command:
- /bin/sh
- -c
- |
  cleanup() {
    echo "[$(date)] Received SIGTERM, starting graceful shutdown..."
    echo "[$(date)] Draining connections..."
    sleep 5
    echo "[$(date)] Saving state..."
    sleep 2
    echo "[$(date)] Shutdown complete."
    exit 0
  }
  trap cleanup TERM
  echo "[$(date)] Application started (PID $$)"
  while true; do
    sleep 1
  done

A common misconception is that the terminationGracePeriodSeconds timer starts when SIGTERM is sent. It does not. The grace period starts when the kubelet begins the termination sequence, so the preStop hook and the SIGTERM handling share the same window. If your preStop hook takes 20 seconds and your grace period is 30, the application has only 10 seconds to handle SIGTERM before SIGKILL.

spec:
  terminationGracePeriodSeconds: 30

The default is 30 seconds. Set this based on how long your application needs to drain.
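The arithmetic is worth making explicit (hypothetical helper; the parameter names mirror the pod spec):

```python
def sigterm_budget(grace_period_seconds: int, prestop_seconds: int) -> int:
    # preStop and SIGTERM handling share one grace-period window;
    # whatever preStop consumes is no longer available to the app.
    return max(grace_period_seconds - prestop_seconds, 0)
```

A budget of 0 means the process goes straight from preStop timeout to SIGKILL with no chance to clean up.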

If the process is still running after the grace period, the kubelet sends SIGKILL. This cannot be caught or handled. The process is immediately terminated.

There is a well-known race condition in Kubernetes shutdown. The endpoints controller and the kubelet operate independently. When a pod enters Terminating:

  1. The endpoints controller removes the pod from Service endpoints.
  2. The kubelet starts the preStop hook.
  3. kube-proxy on each node updates its iptables rules.

Steps 1 and 2 happen concurrently. Step 3 takes additional time. During this window, traffic may still be routed to the pod even though it is shutting down.

The fix: use a preStop hook with a short sleep (3-5 seconds) before beginning shutdown. This gives kube-proxy time to update its routing rules.

lifecycle:
  preStop:
    exec:
      command: ["/bin/sh", "-c", "sleep 5"]

Kubernetes supports two container lifecycle hooks:

postStart runs immediately after the container is created, concurrently with the container’s entrypoint. There is no guarantee that postStart finishes before the container starts accepting traffic.

lifecycle:
  postStart:
    exec:
      command: ["/bin/sh", "-c", "echo started > /tmp/startup"]

If postStart fails, the container is killed. But because it runs asynchronously with the container entrypoint, the container might have already started doing work.

preStop runs before SIGTERM and blocks it until the hook completes (or the grace period expires). This is the correct place for graceful shutdown preparation.

preStop and SIGTERM are complementary. Use preStop for infrastructure work (deregister, drain). Use SIGTERM handlers for application work (close database connections, flush caches).

1. Liveness Probe Checks External Dependencies

The single most common probe misconfiguration. If your liveness probe checks a database or external API, and that dependency goes down, the kubelet restarts your container. This does not fix the external dependency. It creates a cascade of restarts across all pods.

Rule: Liveness probes should check only the container’s own internal state. Can the process respond? Is it deadlocked? That is all.

Use readiness probes for dependency checks. A failing readiness probe removes the pod from traffic without killing it.

2. Missing Startup Probe for Slow Starters

Without a startup probe, the liveness probe runs during startup. If the app takes 60 seconds to start and the liveness probe allows 30 seconds before restarting, the container never starts. It enters a restart loop.

Setting a high initialDelaySeconds on the liveness probe is the old workaround. The startup probe is better because it does not create a blind spot for runtime failures.

3. Identical Liveness and Readiness Probes

If your liveness and readiness probes check the same endpoint with the same thresholds, a failing probe both removes the pod from traffic AND restarts it simultaneously. The readiness removal is pointless because the container is about to be killed anyway.

Make them different. Readiness can be more sensitive (lower threshold). Liveness should have a higher tolerance for transient failures.

4. Overly Aggressive Probe Timing

Setting periodSeconds: 1 and failureThreshold: 1 means a single 1-second hiccup restarts the container. This causes unnecessary restarts during brief load spikes or GC pauses.

Reasonable defaults for production:

livenessProbe:
  periodSeconds: 10
  failureThreshold: 3
  timeoutSeconds: 3
readinessProbe:
  periodSeconds: 5
  failureThreshold: 1
  timeoutSeconds: 3

5. No SIGTERM Handling

Many applications do not handle SIGTERM by default. The process receives the signal and does nothing. After 30 seconds (the default grace period), SIGKILL forces an abrupt termination. In-flight requests are dropped. Database transactions are left open.

Most web frameworks require explicit SIGTERM handling. FastAPI, Flask, Express, Spring Boot all have their own shutdown hooks. Configure them.

6. preStop Hook Longer Than the Grace Period

If the preStop hook takes longer than terminationGracePeriodSeconds, SIGTERM is never sent. The process goes straight from preStop hook timeout to SIGKILL. The application gets no chance to clean up.

Keep preStop hooks short. Increase the grace period if needed.

A full probe stack for a typical web service:

startupProbe:
  httpGet:
    path: /health/startup
    port: 8080
  failureThreshold: 30
  periodSeconds: 2  # Allows up to 60s for startup
livenessProbe:
  httpGet:
    path: /health/live
    port: 8080
  periodSeconds: 15
  failureThreshold: 3
  timeoutSeconds: 5
readinessProbe:
  httpGet:
    path: /health/ready
    port: 8080
  periodSeconds: 5
  failureThreshold: 2
  successThreshold: 1
  timeoutSeconds: 3

Use three different health endpoints:

  • /health/startup: Checks that initialization is complete (migrations run, caches warmed).
  • /health/live: Checks the process is alive (not deadlocked, not OOM).
  • /health/ready: Checks dependencies (database connected, downstream services reachable).
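A minimal sketch of such a server, using only the standard library (the STATE dict is a hypothetical stand-in for real application state):

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical application state the handlers consult.
STATE = {"started": False, "db_connected": False}

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health/startup":
            ok = STATE["started"]
        elif self.path == "/health/live":
            # Liveness only asks: is the process responding at all?
            ok = True
        elif self.path == "/health/ready":
            # Readiness also checks dependencies.
            ok = STATE["started"] and STATE["db_connected"]
        else:
            ok = False
        self.send_response(200 if ok else 503)
        self.end_headers()

    def log_message(self, fmt, *args):
        # Keep probe traffic out of the logs.
        pass
```

Keeping the three checks separate is what lets a database outage fail readiness (pod leaves the Service) without failing liveness (pod is not restarted).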
For a database such as PostgreSQL:

startupProbe:
  exec:
    command: ["pg_isready", "-U", "postgres"]
  failureThreshold: 60
  periodSeconds: 5  # Allows up to 5 minutes for recovery
livenessProbe:
  exec:
    command: ["pg_isready", "-U", "postgres"]
  periodSeconds: 30
  failureThreshold: 5
  timeoutSeconds: 10

Databases need more tolerance. A PostgreSQL vacuum or WAL replay can make the server temporarily unresponsive. Aggressive liveness probes would kill it during recovery.

For a background worker:

livenessProbe:
  exec:
    command:
    - /bin/sh
    - -c
    - "kill -0 $(cat /var/run/worker.pid)"
  periodSeconds: 30
  failureThreshold: 3

For processes without HTTP endpoints, use exec probes that check the process is alive or that a heartbeat file was recently updated.
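The heartbeat-file variant can be sketched as (hypothetical helper; the worker is assumed to touch the file on every loop iteration):

```python
import os
import time

def heartbeat_fresh(path: str, max_age_seconds: float = 60.0) -> bool:
    # Passes if the heartbeat file exists and was modified recently;
    # a missing or stale file fails the probe.
    try:
        return (time.time() - os.path.getmtime(path)) <= max_age_seconds
    except OSError:
        return False
```

Unlike `kill -0`, this catches a worker that is alive but stuck, since a deadlocked loop stops touching the file.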

Beyond probes, Kubernetes supports readiness gates. These are custom conditions that must be true for a pod to be considered ready. External controllers set these conditions via the pod status API.

spec:
  readinessGates:
  - conditionType: "my-custom-condition"

Until an external controller sets my-custom-condition to True, the pod remains not-ready even if all readiness probes pass. This is used by service meshes and load balancer controllers to add their own readiness logic.

| Probe | When It Runs | Failure Action | Use For |
| --- | --- | --- | --- |
| Startup | Before liveness/readiness | Kill + restart | Slow-starting containers |
| Liveness | After startup passes | Kill + restart | Deadlock detection |
| Readiness | After startup passes | Remove from Service | Dependency health |

The probes and lifecycle hooks together form the contract between Kubernetes and your application. Get them right and your services handle failures gracefully. Get them wrong and a single dependency blip cascades into a cluster-wide outage.