Knative Serving

Run serverless workloads that scale to zero and back, with revision-based traffic splitting.

Time: ~20 minutes | Difficulty: Intermediate

Resources: Knative Serving installs a control plane (~500MB RAM). Clean up other demos first: task clean:all

What you'll learn:

  • How Knative Serving provides serverless capabilities on Kubernetes
  • Scale-to-zero and automatic scale-up based on traffic
  • Revision-based deployments (immutable snapshots of your service)
  • Traffic splitting for canary and blue-green deployments
  • Concurrency-based autoscaling with target metrics

Install Knative Serving with Kourier networking:

Terminal window
# Install Knative Serving CRDs and core
kubectl apply -f https://github.com/knative/serving/releases/latest/download/serving-crds.yaml
kubectl apply -f https://github.com/knative/serving/releases/latest/download/serving-core.yaml
# Install Kourier networking layer
kubectl apply -f https://github.com/knative/net-kourier/releases/latest/download/kourier.yaml
# Configure Knative to use Kourier
kubectl patch configmap/config-network -n knative-serving --type merge -p '{"data":{"ingress-class":"kourier.ingress.networking.knative.dev"}}'
# Wait for components
kubectl wait --for=condition=ready pod --all -n knative-serving --timeout=120s
kubectl wait --for=condition=ready pod --all -n kourier-system --timeout=120s

Configure DNS (use sslip.io for minikube):

Terminal window
kubectl apply -f https://github.com/knative/serving/releases/latest/download/serving-default-domain.yaml
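The serving-default-domain Job patches the config-domain ConfigMap so routes get a sslip.io wildcard domain derived from the cluster's ingress IP. The resulting data likely resembles this fragment (the IP shown is purely illustrative):

```yaml
# Sketch of config-domain in knative-serving after the magic-DNS Job runs.
# The data key becomes the default domain for all routes; IP is an example.
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-domain
  namespace: knative-serving
data:
  10.0.0.1.sslip.io: ""
```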

Navigate to the demo directory:

Terminal window
cd demos/knative-serving

Apply the namespace:

Terminal window
kubectl apply -f manifests/namespace.yaml

Deploy the initial hello service:

Terminal window
kubectl apply -f manifests/service-hello.yaml
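The repository ships the actual manifest; a minimal Knative Service for this step might look like the sketch below (the image and TARGET value are assumptions inferred from the "Hello World v1!" response later in the demo):

```yaml
# Hypothetical sketch of manifests/service-hello.yaml; the real file may differ.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello
  namespace: knative-demo
spec:
  template:
    spec:
      containers:
        - image: ghcr.io/knative/helloworld-go:latest  # assumed sample image
          env:
            - name: TARGET
              value: "World v1"   # the app replies "Hello World v1!"
```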

Wait for the Knative Service to become ready:

Terminal window
kubectl wait --for=condition=ready ksvc hello -n knative-demo --timeout=120s

Get the service URL:

Terminal window
kubectl get ksvc hello -n knative-demo -o jsonpath='{.status.url}'

Check that the service is ready:

Terminal window
kubectl get ksvc hello -n knative-demo

You should see output like:

NAME    URL                                           LATESTCREATED   LATESTREADY   READY   REASON
hello   http://hello.knative-demo.10.0.0.1.sslip.io   hello-00001     hello-00001   True

Curl the service URL:

Terminal window
SERVICE_URL=$(kubectl get ksvc hello -n knative-demo -o jsonpath='{.status.url}')
curl $SERVICE_URL

You should see:

Hello World v1!

Wait 60-90 seconds without sending any requests. Then check pods:

Terminal window
kubectl get pods -n knative-demo

You should see no pods running (or pods terminating). The service scaled to zero because there was no traffic.

Now curl the service again:

Terminal window
curl $SERVICE_URL

Check pods immediately:

Terminal window
kubectl get pods -n knative-demo

You will see a pod spinning up (cold start). The Knative Activator queued your request while the pod started.

Apply the v2 service configuration (80% latest, 20% v1):

Terminal window
kubectl apply -f manifests/service-hello-v2.yaml
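The key difference in the v2 manifest is its traffic block. It plausibly looks like this sketch (revision names follow Knative's sequential numbering):

```yaml
# Hypothetical traffic section of manifests/service-hello-v2.yaml.
spec:
  traffic:
    - latestRevision: true      # hello-00002, once it becomes ready
      percent: 80
    - revisionName: hello-00001 # pin 20% to the first revision
      percent: 20
```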

Wait for the new revision to be ready:

Terminal window
kubectl wait --for=condition=ready ksvc hello -n knative-demo --timeout=120s

Check revisions:

Terminal window
kubectl get revisions -n knative-demo

You should see two revisions:

NAME          CONFIG NAME   K8S SERVICE NAME   GENERATION   READY   REASON
hello-00001   hello                            1            True
hello-00002   hello                            2            True

Curl the service multiple times to see traffic splitting:

Terminal window
for i in {1..10}; do curl $SERVICE_URL; done

You should see mostly “Hello World v2!” with occasional “Hello World v1!” responses (80/20 split).

manifests/
  namespace.yaml          # knative-demo namespace
  service-hello.yaml      # Initial Knative Service (creates hello-00001 revision)
  service-hello-v2.yaml   # Updated service with traffic split (creates hello-00002, 80% v2, 20% v1)
  service-hello-v3.yaml   # Another update with 50/50 split between v2 and v3
  service-autoscale.yaml  # Demo service with concurrency-based autoscaling

Knative Serving brings serverless capabilities to Kubernetes. Unlike a standard Deployment, a Knative Service automatically creates a Configuration and Route. Each time you update the service, Knative creates a new immutable Revision (a snapshot of the container image, environment variables, and resource limits).

Key concepts:

Scale-to-zero: When a revision receives no traffic for the stable window (default 60 seconds), Knative waits a short grace period (default 30 seconds) and then terminates all of its pods. The Activator component sits in the request path and buffers incoming requests while pods spin up again. This saves cluster resources for services that are idle most of the time.
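Both of these timings are cluster-wide settings in the config-autoscaler ConfigMap; a fragment with the upstream defaults spelled out might look like:

```yaml
# Sketch of the config-autoscaler ConfigMap in knative-serving
# (values shown are the documented defaults).
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-autoscaler
  namespace: knative-serving
data:
  stable-window: "60s"               # zero traffic must persist this long
  scale-to-zero-grace-period: "30s"  # extra buffer before the last pod is removed
```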

Revisions: Every change to the service spec creates a new revision. Revisions are immutable and numbered sequentially (hello-00001, hello-00002). You can pin traffic to specific revisions for canary or blue-green deployments.

Traffic splitting: The traffic block defines how requests are distributed across revisions. You can route by percentage (80/20 canary) or by named tags. Traffic splitting happens at the Kourier ingress layer, before requests reach your pods.
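Percentages and tags can be combined in one traffic block. A sketch (revision names assumed from this demo) might look like this, with each tag receiving its own stable URL such as http://stable-hello.knative-demo.&lt;domain&gt;:

```yaml
# Hypothetical traffic block mixing percentages with named tags.
spec:
  traffic:
    - revisionName: hello-00001
      percent: 20
      tag: stable   # always reachable via its tag URL, regardless of percent
    - latestRevision: true
      percent: 80
      tag: latest
```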

Autoscaling: Knative watches concurrency (concurrent requests per pod) and RPS (requests per second). When concurrency exceeds the target, Knative spins up more pods. The autoscaler supports multiple modes: concurrency-based (default), RPS-based, and custom metrics. You can set min-scale (minimum replicas) and max-scale (cap on replicas).
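These knobs are set as annotations on the revision template. A sketch consistent with the 0-to-5 scaling seen later in this demo:

```yaml
# Hypothetical autoscaling annotations on a Knative Service's revision template.
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/metric: "concurrency"  # or "rps"
        autoscaling.knative.dev/target: "10"           # desired concurrency per pod
        autoscaling.knative.dev/min-scale: "0"         # 0 permits scale-to-zero
        autoscaling.knative.dev/max-scale: "5"         # hard cap on replicas
```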

  1. Deploy the autoscale demo and generate load to watch pods scale up:

    Terminal window
    kubectl apply -f manifests/service-autoscale.yaml
    kubectl wait --for=condition=ready ksvc autoscale-demo -n knative-demo --timeout=120s
    # Get the URL
    AUTOSCALE_URL=$(kubectl get ksvc autoscale-demo -n knative-demo -o jsonpath='{.status.url}')
    # Generate load (requires hey: go install github.com/rakyll/hey@latest)
    hey -z 30s -c 50 $AUTOSCALE_URL
    # Watch pods scale up
    kubectl get pods -n knative-demo -w

    You should see pods scale from 0 to 5 as concurrency increases, then back to 0 after traffic stops.

  2. Apply the v3 service configuration for a 50/50 blue-green split:

    Terminal window
    kubectl apply -f manifests/service-hello-v3.yaml
    kubectl wait --for=condition=ready ksvc hello -n knative-demo --timeout=120s
    # Curl multiple times to see 50/50 split
    for i in {1..10}; do curl $SERVICE_URL; done
  3. Pin traffic to a specific revision using tags:

    Terminal window
    kubectl patch ksvc hello -n knative-demo --type merge -p '
    {
      "spec": {
        "traffic": [
          {
            "revisionName": "hello-00001",
            "percent": 100,
            "tag": "stable"
          }
        ]
      }
    }'
    # Access the stable tag URL
    kubectl get ksvc hello -n knative-demo -o jsonpath='{.status.traffic[?(@.tag=="stable")].url}'
  4. Make the service scale to zero faster by shortening the autoscaling stable window to 30 seconds (the scale-to-zero grace period itself is cluster-wide, configured in the config-autoscaler ConfigMap, not per revision):

    Terminal window
    kubectl patch ksvc hello -n knative-demo --type merge -p '
    {
      "spec": {
        "template": {
          "metadata": {
            "annotations": {
              "autoscaling.knative.dev/window": "30s"
            }
          }
        }
      }
    }'
  5. Check Knative metrics and status:

    Terminal window
    kubectl get ksvc,revision,route -n knative-demo
    kubectl describe ksvc hello -n knative-demo
    kubectl describe revision hello-00002 -n knative-demo
  6. Check Activator logs to see request queuing during cold start:

    Terminal window
    kubectl logs -f deployment/activator -n knative-serving
  7. Try changing the autoscaling mode to RPS-based instead of concurrency:

    Terminal window
    kubectl patch ksvc autoscale-demo -n knative-demo --type merge -p '
    {
      "spec": {
        "template": {
          "metadata": {
            "annotations": {
              "autoscaling.knative.dev/metric": "rps",
              "autoscaling.knative.dev/target": "50"
            }
          }
        }
      }
    }'

Delete the demo namespace:

Terminal window
kubectl delete namespace knative-demo

Optionally, remove Knative Serving components:

Terminal window
kubectl delete -f https://github.com/knative/net-kourier/releases/latest/download/kourier.yaml
kubectl delete -f https://github.com/knative/serving/releases/latest/download/serving-core.yaml
kubectl delete -f https://github.com/knative/serving/releases/latest/download/serving-crds.yaml
kubectl delete -f https://github.com/knative/serving/releases/latest/download/serving-default-domain.yaml

See docs/deep-dive.md for a detailed explanation of Knative architecture, the relationship between Configuration, Route, and Revision, autoscaling algorithms, cold start optimization, and comparison with other serverless platforms.

Move on to Trivy Operator to scan your running containers for vulnerabilities and misconfigurations.