# Knative Serving
Run serverless workloads that scale to zero and back, with revision-based traffic splitting.
Time: ~20 minutes · Difficulty: Intermediate
Resources: Knative Serving installs a control plane (~500MB RAM). Clean up other demos first:

```sh
task clean:all
```
## What You Will Learn

- How Knative Serving provides serverless capabilities on Kubernetes
- Scale-to-zero and automatic scale-up based on traffic
- Revision-based deployments (immutable snapshots of your service)
- Traffic splitting for canary and blue-green deployments
- Concurrency-based autoscaling with target metrics
## Prerequisites

Install Knative Serving with Kourier networking:
```sh
# Install Knative Serving CRDs and core
kubectl apply -f https://github.com/knative/serving/releases/latest/download/serving-crds.yaml
kubectl apply -f https://github.com/knative/serving/releases/latest/download/serving-core.yaml
```
```sh
# Install Kourier networking layer
kubectl apply -f https://github.com/knative/net-kourier/releases/latest/download/kourier.yaml
```
```sh
# Configure Knative to use Kourier
kubectl patch configmap/config-network -n knative-serving --type merge -p '{"data":{"ingress-class":"kourier.ingress.networking.knative.dev"}}'
```
```sh
# Wait for components
kubectl wait --for=condition=ready pod --all -n knative-serving --timeout=120s
kubectl wait --for=condition=ready pod --all -n kourier-system --timeout=120s
```

Configure DNS (use sslip.io for minikube):
```sh
kubectl apply -f https://github.com/knative/serving/releases/latest/download/serving-default-domain.yaml
```

## Deploy
Navigate to the demo directory:
```sh
cd demos/knative-serving
```

Apply the namespace:
```sh
kubectl apply -f manifests/namespace.yaml
```

Deploy the initial hello service:
```sh
kubectl apply -f manifests/service-hello.yaml
```

Wait for the Knative Service to become ready:
```sh
kubectl wait --for=condition=ready ksvc hello -n knative-demo --timeout=120s
```

Get the service URL:
```sh
kubectl get ksvc hello -n knative-demo -o jsonpath='{.status.url}'
```

## Verify
Check that the service is ready:
```sh
kubectl get ksvc hello -n knative-demo
```

You should see output like:
```
NAME    URL                                           LATESTCREATED   LATESTREADY   READY   REASON
hello   http://hello.knative-demo.10.0.0.1.sslip.io   hello-00001     hello-00001   True
```

Curl the service URL:
```sh
SERVICE_URL=$(kubectl get ksvc hello -n knative-demo -o jsonpath='{.status.url}')
curl $SERVICE_URL
```

You should see:
```
Hello World v1!
```

## Test scale-to-zero
Wait 60-90 seconds without sending any requests. Then check pods:
```sh
kubectl get pods -n knative-demo
```

You should see no pods running (or pods terminating). The service scaled to zero because there was no traffic.
Now curl the service again:
```sh
curl $SERVICE_URL
```

Check pods immediately:
```sh
kubectl get pods -n knative-demo
```

You will see a pod spinning up (cold start). The Knative Activator queued your request while the pod started.
## Test traffic splitting

Apply the v2 service configuration (80% latest, 20% v1):
```sh
kubectl apply -f manifests/service-hello-v2.yaml
```

Wait for the new revision to be ready:
```sh
kubectl wait --for=condition=ready ksvc hello -n knative-demo --timeout=120s
```

Check revisions:
```sh
kubectl get revisions -n knative-demo
```

You should see two revisions:
```
NAME          CONFIG NAME   K8S SERVICE NAME   GENERATION   READY   REASON
hello-00001   hello                            1            True
hello-00002   hello                            2            True
```

Curl the service multiple times to see traffic splitting:
```sh
for i in {1..10}; do curl $SERVICE_URL; done
```

You should see mostly "Hello World v2!" with occasional "Hello World v1!" responses (80/20 split).
## What is Happening

```
manifests/
  namespace.yaml          # knative-demo namespace
  service-hello.yaml      # Initial Knative Service (creates hello-00001 revision)
  service-hello-v2.yaml   # Updated service with traffic split (creates hello-00002, 80% v2, 20% v1)
  service-hello-v3.yaml   # Another update with 50/50 split between v2 and v3
  service-autoscale.yaml  # Demo service with concurrency-based autoscaling
```

Knative Serving brings serverless capabilities to Kubernetes. Unlike a standard Deployment, a Knative Service automatically creates a Configuration and a Route. Each time you update the service, Knative creates a new immutable Revision (a snapshot of the container image, environment variables, and resource limits).
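A Knative Service manifest is deliberately small; a sketch of what `service-hello.yaml` might look like (the image and env values here are illustrative, check the actual file under `manifests/`):

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello
  namespace: knative-demo
spec:
  template:                  # every change to this template creates a new immutable Revision
    spec:
      containers:
        - image: ghcr.io/example/helloworld-go:latest  # illustrative image reference
          env:
            - name: TARGET
              value: "World v1"
```

From this single resource, Knative derives the Configuration, Route, and Revision objects for you.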
Key concepts:
**Scale-to-zero:** When there is no traffic, Knative terminates all pods after an idle period (roughly 60-90 seconds with default autoscaler settings). The Activator component sits in front of your service and queues incoming requests while pods spin up. This saves cluster resources for services that are idle most of the time.
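The idle period is controlled cluster-wide in the `config-autoscaler` ConfigMap rather than per service; a sketch of the two relevant keys (the values shown are, to the best of my knowledge, the upstream defaults):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-autoscaler
  namespace: knative-serving
data:
  stable-window: "60s"               # traffic is averaged over this window before scaling decisions
  scale-to-zero-grace-period: "30s"  # how long the last pod lingers after the scale-to-zero decision
```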
**Revisions:** Every change to the service spec creates a new revision. Revisions are immutable and numbered sequentially (hello-00001, hello-00002). You can pin traffic to specific revisions for canary or blue-green deployments.
**Traffic splitting:** The traffic block defines how requests are distributed across revisions. You can route by percentage (80/20 canary) or by named tags. Traffic splitting happens at the Kourier ingress layer, before requests reach your pods.
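The 80/20 split is expressed directly in the Service spec; the traffic block in `service-hello-v2.yaml` presumably looks something like this (the tag name is illustrative):

```yaml
spec:
  traffic:
    - latestRevision: true     # 80% to the newest revision (hello-00002)
      percent: 80
    - revisionName: hello-00001
      percent: 20
      tag: v1                  # optional: also exposes this revision at its own tagged URL
```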
**Autoscaling:** Knative watches concurrency (concurrent requests per pod) and RPS (requests per second). When concurrency exceeds the target, Knative spins up more pods. The autoscaler supports multiple modes: concurrency-based (default), RPS-based, and custom metrics. You can set min-scale (minimum replicas) and max-scale (cap on replicas).
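Autoscaling is tuned per revision through annotations on the pod template; a sketch of what `service-autoscale.yaml` likely sets (the target and bounds here are illustrative, chosen to match the 0-to-5 scaling behavior described in the experiments):

```yaml
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/metric: "concurrency"  # scale on in-flight requests per pod (the default)
        autoscaling.knative.dev/target: "10"           # aim for ~10 concurrent requests per pod
        autoscaling.knative.dev/min-scale: "0"         # allow scale-to-zero
        autoscaling.knative.dev/max-scale: "5"         # never run more than 5 pods
```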
## Experiment

1. Deploy the autoscale demo and generate load to watch pods scale up:
   ```sh
   kubectl apply -f manifests/service-autoscale.yaml
   kubectl wait --for=condition=ready ksvc autoscale-demo -n knative-demo --timeout=120s

   # Get the URL
   AUTOSCALE_URL=$(kubectl get ksvc autoscale-demo -n knative-demo -o jsonpath='{.status.url}')

   # Generate load (requires hey: go install github.com/rakyll/hey@latest)
   hey -z 30s -c 50 $AUTOSCALE_URL

   # Watch pods scale up
   kubectl get pods -n knative-demo -w
   ```

   You should see pods scale from 0 to 5 as concurrency increases, then back to 0 after traffic stops.
2. Apply the v3 service configuration for a 50/50 blue-green split:

   ```sh
   kubectl apply -f manifests/service-hello-v3.yaml
   kubectl wait --for=condition=ready ksvc hello -n knative-demo --timeout=120s

   # Curl multiple times to see 50/50 split
   for i in {1..10}; do curl $SERVICE_URL; done
   ```
3. Pin traffic to a specific revision using tags:

   ```sh
   kubectl patch ksvc hello -n knative-demo --type merge -p '{"spec": {"traffic": [{"revisionName": "hello-00001", "percent": 100, "tag": "stable"}]}}'

   # Access the stable tag URL
   kubectl get ksvc hello -n knative-demo -o jsonpath='{.status.traffic[?(@.tag=="stable")].url}'
   ```
4. Set the scale-to-zero grace period to 30 seconds. This is a cluster-wide setting in the `config-autoscaler` ConfigMap (it is not available as a per-service annotation):

   ```sh
   kubectl patch configmap/config-autoscaler -n knative-serving --type merge -p '{"data":{"scale-to-zero-grace-period":"30s"}}'
   ```
5. Check Knative metrics and status:

   ```sh
   kubectl get ksvc,revision,route -n knative-demo
   kubectl describe ksvc hello -n knative-demo
   kubectl describe revision hello-00002 -n knative-demo
   ```
6. Check Activator logs to see request queuing during cold start:

   ```sh
   kubectl logs -f deployment/activator -n knative-serving
   ```
7. Try changing the autoscaling mode to RPS-based instead of concurrency:

   ```sh
   kubectl patch ksvc autoscale-demo -n knative-demo --type merge -p '{"spec": {"template": {"metadata": {"annotations": {"autoscaling.knative.dev/metric": "rps", "autoscaling.knative.dev/target": "50"}}}}}'
   ```
## Cleanup

Delete the demo namespace:
```sh
kubectl delete namespace knative-demo
```

Optionally, remove Knative Serving components:
```sh
kubectl delete -f https://github.com/knative/net-kourier/releases/latest/download/kourier.yaml
kubectl delete -f https://github.com/knative/serving/releases/latest/download/serving-core.yaml
kubectl delete -f https://github.com/knative/serving/releases/latest/download/serving-crds.yaml
kubectl delete -f https://github.com/knative/serving/releases/latest/download/serving-default-domain.yaml
```

## Further Reading
See docs/deep-dive.md for a detailed explanation of Knative architecture, the relationship between Configuration, Route, and Revision, autoscaling algorithms, cold start optimization, and a comparison with other serverless platforms.
## Next Step

Move on to Trivy Operator to scan your running containers for vulnerabilities and misconfigurations.