Event-Driven Architecture (Kafka)

Build an asynchronous messaging pipeline with Apache Kafka where a producer and consumer are fully decoupled through a message broker.

Time: ~20 minutes • Difficulty: Intermediate

Resources: This demo needs ~2GB RAM. Clean up other demos first: task clean:all

What you'll learn:

  • Running Apache Kafka and Zookeeper as StatefulSets in Kubernetes
  • The producer/consumer messaging pattern
  • How services communicate asynchronously without knowing about each other
  • Using persistent volumes to survive pod restarts without losing messages
  • Kafka topics, partitions, and consumer groups
+------------------+        +------------------+
| Producer Service |        | Consumer Service |
| (sends events    |        | (reads events    |
|  every 5 sec)    |        |  from topic)     |
+--------+---------+        +--------+---------+
         |                           ^
         | produce                   | consume
         v                           |
+--------+---------------------------+--------+
|                Kafka Broker                 |
|               (topic: events)               |
+----------------------+----------------------+
                       |
              +--------+--------+
              |    Zookeeper    |
              |  (coordination) |
              +-----------------+

The Producer sends timestamped event messages to a Kafka topic every 5 seconds. The Consumer reads those messages in real time. Neither service knows about the other. Kafka handles the routing, buffering, and ordering. Zookeeper manages Kafka’s cluster metadata.
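The producer's send loop can be sketched as a container command in its Deployment. This is an illustrative fragment only — the image name and script path are assumptions, and the actual producer.yaml may differ:

```yaml
# Hypothetical sketch of the producer container spec (not the actual
# producer.yaml) — image and broker address are illustrative.
containers:
  - name: producer
    image: bitnami/kafka:latest
    command: ["/bin/sh", "-c"]
    args:
      - |
        i=0
        while true; do
          i=$((i+1))
          # Emit one timestamped event, then sleep 5 seconds
          echo "event-$i: order-placed at $(date '+%Y-%m-%d %H:%M:%S')" | \
            kafka-console-producer.sh \
              --bootstrap-server kafka-0.kafka.kafka-demo.svc.cluster.local:9092 \
              --topic events
          sleep 5
        done
```

The loop produces exactly the message shape shown in the logs later in this demo: a counter, an event name, and a timestamp.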

Create the namespace:

kubectl apply -f demos/event-driven-kafka/manifests/namespace.yaml

Deploy Zookeeper:

kubectl apply -f demos/event-driven-kafka/manifests/zookeeper.yaml
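The Zookeeper manifest combines a single-replica StatefulSet with a volume claim template, roughly along these lines (a sketch with assumed field values — consult the actual zookeeper.yaml for exact settings):

```yaml
# Illustrative sketch of zookeeper.yaml — image, ports, and env values
# are assumptions; only the 1-replica StatefulSet + 1Gi PVC shape is
# taken from this demo's description.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: zookeeper
  namespace: kafka-demo
spec:
  serviceName: zookeeper
  replicas: 1
  selector:
    matchLabels:
      app: zookeeper
  template:
    metadata:
      labels:
        app: zookeeper
    spec:
      containers:
        - name: zookeeper
          image: bitnami/zookeeper:latest
          ports:
            - containerPort: 2181   # client port
          env:
            - name: ALLOW_ANONYMOUS_LOGIN
              value: "yes"
          volumeMounts:
            - name: data
              mountPath: /bitnami/zookeeper
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi            # metadata survives pod restarts
```

The volumeClaimTemplates block is what gives the pod its persistent 1Gi volume, so Zookeeper's metadata outlives any individual pod.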

Wait for Zookeeper to be ready:

kubectl get pods -n kafka-demo -w

Wait until zookeeper-0 shows Running and 1/1 ready.

Deploy Kafka:

kubectl apply -f demos/event-driven-kafka/manifests/kafka.yaml

Wait for Kafka to be ready:

kubectl get pods -n kafka-demo -w

Wait until kafka-0 shows Running and 1/1 ready. This may take 1-2 minutes as Kafka connects to Zookeeper.

Deploy the producer:

kubectl apply -f demos/event-driven-kafka/manifests/producer.yaml

Deploy the consumer:

kubectl apply -f demos/event-driven-kafka/manifests/consumer.yaml
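The consumer side is even simpler: it just runs the console consumer against the broker's stable DNS name. A hypothetical sketch (the real consumer.yaml may use a different image or flags):

```yaml
# Illustrative sketch of the consumer container — image and broker
# address are assumptions.
containers:
  - name: consumer
    image: bitnami/kafka:latest
    command: ["/bin/sh", "-c"]
    args:
      - |
        # --from-beginning replays the full topic on startup, which is
        # why a restarted consumer catches up on missed messages.
        kafka-console-consumer.sh \
          --bootstrap-server kafka-0.kafka.kafka-demo.svc.cluster.local:9092 \
          --topic events \
          --from-beginning
```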

Watch all pods come up:

kubectl get pods -n kafka-demo -w

Wait until all four pods (zookeeper-0, kafka-0, producer, consumer) are Running.

Verify the pipeline:

# Check all pods are running
kubectl get pods -n kafka-demo
# Check persistent volume claims
kubectl get pvc -n kafka-demo
# Watch the producer sending messages
kubectl logs -f deploy/producer -n kafka-demo
# Watch the consumer receiving messages
kubectl logs -f deploy/consumer -n kafka-demo

You should see the producer logging messages like:

[producer] Sent: event-1: order-placed at 2026-04-11 14:30:05
[producer] Sent: event-2: order-placed at 2026-04-11 14:30:10

And the consumer printing those same messages as they arrive:

event-1: order-placed at 2026-04-11 14:30:05
event-2: order-placed at 2026-04-11 14:30:10

Inspect Kafka's topics from inside the broker pod:

# List Kafka topics
kubectl exec -it kafka-0 -n kafka-demo -- \
kafka-topics.sh --list --bootstrap-server localhost:9092
# Describe the events topic
kubectl exec -it kafka-0 -n kafka-demo -- \
kafka-topics.sh --describe --topic events --bootstrap-server localhost:9092
manifests/
  namespace.yaml   # kafka-demo namespace
  zookeeper.yaml   # Zookeeper StatefulSet (1 replica) with 1Gi PVC
  kafka.yaml       # Kafka StatefulSet (1 replica) with 2Gi PVC
  producer.yaml    # Deployment sending events to the topic every 5 seconds
  consumer.yaml    # Deployment reading events from the topic

Zookeeper runs as a StatefulSet with persistent storage so cluster metadata survives restarts. Kafka also runs as a StatefulSet with its own PVC and connects to Zookeeper for coordination. The KAFKA_CFG_ADVERTISED_LISTENERS variable is set to the StatefulSet's stable DNS name (kafka-0.kafka.kafka-demo.svc.cluster.local) so clients can always reach the broker, even after the pod is rescheduled.

The producer uses kafka-console-producer.sh to write messages to the events topic, and the consumer uses kafka-console-consumer.sh with --from-beginning to read every message in the topic.
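The listener wiring described above might look like this in kafka.yaml (an illustrative excerpt with assumed values — check the actual manifest for the exact configuration):

```yaml
# Illustrative excerpt of the Kafka container's env — ties the broker
# to its stable StatefulSet DNS name. Values are assumptions.
env:
  - name: KAFKA_CFG_ZOOKEEPER_CONNECT
    value: "zookeeper.kafka-demo.svc.cluster.local:2181"
  - name: KAFKA_CFG_LISTENERS
    value: "PLAINTEXT://:9092"
  - name: KAFKA_CFG_ADVERTISED_LISTENERS
    value: "PLAINTEXT://kafka-0.kafka.kafka-demo.svc.cluster.local:9092"
```

The per-pod name kafka-0.kafka.… only resolves because the StatefulSet is paired with a headless Service; that stable name is what the advertised listener hands back to clients.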

The key insight is decoupling: the producer does not know about the consumer, and the consumer does not know about the producer. Kafka acts as a buffer between them. You can stop the consumer, let messages pile up, restart it, and it will catch up from where it left off.

  1. Stop the consumer and watch messages accumulate, then restart it:

    kubectl scale deployment consumer --replicas=0 -n kafka-demo
    # Wait 30 seconds while the producer keeps sending
    kubectl scale deployment consumer --replicas=1 -n kafka-demo
    kubectl logs -f deploy/consumer -n kafka-demo
    # The consumer catches up on all missed messages
  2. Scale the producer to send from multiple instances:

    kubectl scale deployment producer --replicas=3 -n kafka-demo
    kubectl logs -f deploy/consumer -n kafka-demo
    # Messages from all three producers appear interleaved
  3. Interact with Kafka directly from the broker pod:

    kubectl exec -it kafka-0 -n kafka-demo -- \
kafka-console-producer.sh --bootstrap-server localhost:9092 --topic events
    # Type a message and press Enter, then Ctrl+C
  4. Create a new topic with multiple partitions:

    kubectl exec -it kafka-0 -n kafka-demo -- \
    kafka-topics.sh --create --topic orders \
    --partitions 3 --replication-factor 1 \
    --bootstrap-server localhost:9092
  5. Check consumer group offsets:

    kubectl exec -it kafka-0 -n kafka-demo -- \
    kafka-consumer-groups.sh --list --bootstrap-server localhost:9092

When you're done, clean up:

kubectl delete namespace kafka-demo

Note: Deleting the namespace also deletes the PVCs. If you want to keep the data, delete the Deployments and StatefulSets individually instead.

See docs/deep-dive.md for a detailed explanation of event-driven architecture patterns, Kafka’s internal log structure, consumer groups and offset management, how partitions enable parallelism, and when to choose Kafka over simpler message queues like Redis or RabbitMQ.

Move on to EFK Logging to learn how to collect and search logs from across your cluster.