Kyverno Policy Enforcement: Deep Dive
Why Policy Engines Exist
Kubernetes gives you the building blocks to deploy applications, but it does not enforce organizational standards. Without a policy engine, every deployment is a manual code review:
- Does every pod have resource limits?
- Are all images pulled from approved registries?
- Is anyone running containers as root?
- Do all services have owner labels?
You can write documentation telling teams to follow best practices. You can review every YAML file before it merges. But humans miss things. Teams forget. New engineers do not read the 40-page deployment guide.
A policy engine moves these checks from human review to automated enforcement. The API server blocks non-compliant resources before they enter the cluster. Teams get immediate feedback. Standards are enforced uniformly across all namespaces.
The Admission Control Flow
Kubernetes admission control is a chain of admission plugins and webhooks that runs after authentication and authorization but before the resource is stored in etcd.
```mermaid
graph LR
    A[kubectl apply] --> B[API Server]
    B --> C[Authentication]
    C --> D[Authorization]
    D --> E[Mutating Admission]
    E --> F[Validating Admission]
    F --> G[Persisted to etcd]

    E --> H[Kyverno Mutate]
    F --> I[Kyverno Validate]

    H --> E
    I --> F
```

Mutating webhooks run first. They modify the resource (add labels, inject sidecars, set defaults). Validating webhooks run second. They accept or reject the resource based on policy. If any validating webhook rejects the request, the resource is not created.
Kyverno registers both mutating and validating webhooks. When you create a ClusterPolicy, Kyverno updates its webhook configurations to intercept resources that match the policy’s rules.
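To make the registration step concrete, the sketch below shows roughly what one entry in a Kyverno-managed ValidatingWebhookConfiguration could look like once a Pod policy exists. Kyverno generates and maintains this object itself, so you would not write it by hand. The object name, service name, namespace, and path are assumptions for illustration; the webhook name is the one that appears in the denial message later in this guide.

```yaml
# Illustrative only: Kyverno creates and updates this object automatically.
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: kyverno-resource-validating-webhook-cfg   # assumed name
webhooks:
  - name: validate.kyverno.svc-fail
    admissionReviewVersions: ["v1"]
    sideEffects: NoneOnDryRun
    failurePolicy: Fail            # reject the request if Kyverno is unreachable
    timeoutSeconds: 10
    clientConfig:
      service:
        name: kyverno-svc          # assumed service name
        namespace: kyverno         # assumed install namespace
        path: /validate/fail       # assumed path
        port: 443
      # caBundle omitted for brevity
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["pods"]
        scope: "*"
```

The rules list is the part that changes as policies come and go: it tells the API server which resource kinds to send to Kyverno at all.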
How Kyverno Works Internally
Kyverno runs as a deployment in the cluster. It has three main components:
Admission Controller
This is the webhook server that receives admission requests from the API server. When you create a pod, the API server sends the pod spec to Kyverno. Kyverno evaluates all policies that match the resource and returns either:
- Allowed (with optional mutations applied)
- Denied (with a message explaining the violation)
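Both outcomes travel back to the API server as an AdmissionReview object from the admission.k8s.io/v1 API. The sketch below shows a denial, written as YAML for readability (on the wire it is JSON). The uid is a placeholder, the status code is an assumption, and the message text is borrowed from the demo's label policy.

```yaml
apiVersion: admission.k8s.io/v1
kind: AdmissionReview
response:
  uid: "<uid copied from the incoming request>"   # placeholder
  allowed: false
  status:
    code: 400                                     # assumed status code
    message: "Pod is missing required label 'team'. Add metadata.labels.team to your pod spec."
# An allowed response that carries mutations sets allowed: true and adds
# patchType: JSONPatch plus a base64-encoded JSON Patch in the patch field.
```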
Background Controller
Admission webhooks only run when resources are created or updated. What about resources that existed before you installed Kyverno? Or policy changes that apply retroactively?
The background controller continuously scans existing resources and evaluates them against policies with background: true. It does not block or mutate existing resources. Instead, it creates PolicyReport resources that document violations.
The demo policies all have background: true:
```yaml
# From manifests/policy-require-labels.yaml
spec:
  validationFailureAction: Enforce
  background: true
```

This means Kyverno checks both new pods (at admission time) and existing pods (in the background).
Generate Controller
The generate controller watches for resources and creates additional resources when rules match. For example, a policy can automatically create a NetworkPolicy whenever a namespace is created. The generate controller ensures these generated resources stay synchronized with the source resource.
Policy Structure
A Kyverno policy is either a ClusterPolicy (cluster-wide) or a Policy (namespaced). The demo uses ClusterPolicies because they apply across all matched namespaces.
Here is the label validation policy from the demo:
```yaml
# From manifests/policy-require-labels.yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-team-label
  annotations:
    policies.kyverno.io/title: Require Team Label
    policies.kyverno.io/category: Best Practices
    policies.kyverno.io/severity: medium
    policies.kyverno.io/description: >-
      All pods must have a 'team' label for ownership tracking.
spec:
  validationFailureAction: Enforce
  background: true
  rules:
    - name: check-team-label
      match:
        any:
          - resources:
              kinds:
                - Pod
              namespaces:
                - kyverno-demo
      validate:
        message: "Pod is missing required label 'team'. Add metadata.labels.team to your pod spec."
        pattern:
          metadata:
            labels:
              team: "?*"
```

Match Conditions
The match block determines when the rule applies. This policy matches:
- Resource kind: Pod
- Namespace: kyverno-demo
You can also match on resource names, label selectors, namespace selectors, and subjects (the user or ServiceAccount creating the resource). The any field means “apply if any condition matches.” The alternative is all, which requires all conditions to match.
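As a sketch of how these selectors combine, the match block below uses all, so a request is covered only when it is a Pod in kyverno-demo carrying a given label and it was submitted by a given ServiceAccount. The label, ServiceAccount name, and its namespace are hypothetical.

```yaml
match:
  all:                           # every entry below must match
    - resources:
        kinds:
          - Pod
        namespaces:
          - kyverno-demo
        selector:
          matchLabels:
            app: web             # hypothetical label
    - subjects:
        - kind: ServiceAccount
          name: ci-deployer      # hypothetical ServiceAccount
          namespace: ci          # hypothetical namespace
```

Subject information exists only on live admission requests, so a rule that matches on subjects is generally paired with background: false.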
To apply a policy cluster-wide except for system namespaces such as kube-system, use exclude:

```yaml
match:
  any:
    - resources:
        kinds:
          - Pod
exclude:
  any:
    - resources:
        namespaces:
          - kube-system
          - kyverno
```

Validation Patterns
The validate.pattern block specifies what the resource must look like. The pattern uses special wildcards:
| Pattern | Meaning |
|---|---|
| "?*" | Any value, but the field must be present |
| "!value" | Any value except “value” |
| "*" | Any value (including the field being absent) |
| "exact-string" | Must match exactly |
The demo’s label policy uses "?*" to require the team label to exist with any non-empty value:

```yaml
pattern:
  metadata:
    labels:
      team: "?*"
```

To restrict the label to specific team names, list the allowed values with anyPattern:

```yaml
validate:
  message: "Team must be platform, backend, or frontend"
  anyPattern:
    - metadata:
        labels:
          team: "platform"
    - metadata:
        labels:
          team: "backend"
    - metadata:
        labels:
          team: "frontend"
```

Validating vs Mutating vs Generating Policies
Kyverno supports three policy types. Each serves a different purpose.
Validating Policies
These check if a resource meets criteria. The demo has two validating policies.
The first requires all pods to have a team label (shown above). The second blocks the :latest tag:
```yaml
# From manifests/policy-disallow-latest.yaml
spec:
  validationFailureAction: Enforce
  background: true
  rules:
    - name: require-image-tag
      match:
        any:
          - resources:
              kinds:
                - Pod
              namespaces:
                - kyverno-demo
      validate:
        message: "Container images must have a specific tag. Do not use :latest or omit the tag."
        pattern:
          spec:
            containers:
              - name: "*"
                image: "!*:latest"
    - name: require-tag-present
      match:
        any:
          - resources:
              kinds:
                - Pod
              namespaces:
                - kyverno-demo
      validate:
        message: "Container images must specify a tag explicitly."
        pattern:
          spec:
            containers:
              - name: "*"
                image: "*:*"
```

This policy has two rules in one ClusterPolicy. The first rule rejects any image with :latest. The second rule requires the image to contain a colon (enforcing that a tag is present, since nginx with no tag is treated as nginx:latest by the runtime). Note that "*:*" matches any colon, so an untagged image pulled from a registry with a port, such as registry.example.com:5000/nginx, would still pass the second rule. Both rules inspect spec.containers only; init containers are not covered.
Mutating Policies
These modify resources before they are created. The demo has a mutating policy that adds default resource limits:

```yaml
# From manifests/policy-add-default-resources.yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-default-resources
  annotations:
    policies.kyverno.io/title: Add Default Resource Limits
    policies.kyverno.io/category: Resource Management
    policies.kyverno.io/severity: low
    policies.kyverno.io/description: >-
      Automatically adds default resource requests and limits to containers that do not specify them.
spec:
  background: true
  rules:
    - name: add-default-resources
      match:
        any:
          - resources:
              kinds:
                - Pod
              namespaces:
                - kyverno-demo
      mutate:
        patchStrategicMerge:
          spec:
            containers:
              - (name): "*"
                resources:
                  requests:
                    +(memory): "64Mi"
                    +(cpu): "100m"
                  limits:
                    +(memory): "128Mi"
                    +(cpu): "200m"
```

The +(field) syntax means “add this field only if it does not exist.” If a container already has a memory request, this policy does not overwrite it. If the container has no memory request, Kyverno adds 64Mi.
The (name): "*" syntax means “match all containers, regardless of name.”
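To see the add-if-absent behavior, consider a hypothetical container that sets only a memory request. The two fragments below show that container's resources as submitted and as they would be stored after the mutation, assuming no other policy touches the pod.

```yaml
# Submitted by the user (hypothetical container)
resources:
  requests:
    memory: "32Mi"
---
# After add-default-resources has run
resources:
  requests:
    memory: "32Mi"     # kept: the field already existed
    cpu: "100m"        # added
  limits:
    memory: "128Mi"    # added
    cpu: "200m"        # added
```

One consequence worth checking in your own defaults: a container that requests more than 128Mi of memory without setting a limit would end up with a limit below its request, which Kubernetes rejects.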
Mutations are powerful. Common use cases:
- Add default resource limits (as shown)
- Inject sidecar containers (logging agents, service mesh proxies)
- Add annotations (pod identity, monitoring labels)
- Set security contexts (runAsNonRoot, read-only root filesystem)
- Modify image pull policies
Generating Policies
These create additional resources when a trigger resource is created. A common pattern is to generate a default-deny NetworkPolicy whenever a namespace is created:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-networkpolicy
spec:
  rules:
    - name: default-deny-ingress
      match:
        any:
          - resources:
              kinds:
                - Namespace
      generate:
        apiVersion: networking.k8s.io/v1
        kind: NetworkPolicy
        name: default-deny-ingress
        namespace: "{{request.object.metadata.name}}"
        synchronize: true
        data:
          spec:
            podSelector: {}
            policyTypes:
              - Ingress
```

The synchronize: true field means Kyverno will recreate the generated resource if it is deleted. The {{request.object.metadata.name}} syntax is a variable substitution. When a namespace called backend is created, Kyverno generates a NetworkPolicy in the backend namespace.
Other generate use cases:
- Create default ResourceQuotas for new namespaces
- Generate ConfigMaps with standard settings
- Create RoleBindings to grant access to new ServiceAccounts
- Populate secrets with default certificates
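The first of these use cases follows the same shape as the NetworkPolicy rule above. A minimal sketch, in which the policy name, quota name, and quota values are all hypothetical:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-default-quota              # hypothetical policy name
spec:
  rules:
    - name: generate-resourcequota
      match:
        any:
          - resources:
              kinds:
                - Namespace
      generate:
        apiVersion: v1
        kind: ResourceQuota
        name: default-quota            # hypothetical quota name
        namespace: "{{request.object.metadata.name}}"
        synchronize: true
        data:
          spec:
            hard:                      # example values, tune per cluster
              requests.cpu: "4"
              requests.memory: 8Gi
              limits.cpu: "8"
              limits.memory: 16Gi
```

Kyverno can only generate kinds it is allowed to create, so its service account may need additional RBAC permissions for resources like ResourceQuota.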
Policy Modes: Enforce vs Audit
Every validating policy has a validationFailureAction field with two possible values:
Enforce
Violations are rejected. The resource is not created.

```yaml
spec:
  validationFailureAction: Enforce
```

When you try to create a pod without the required team label, the API server returns an error:

```
Error from server: error when creating "manifests/test-bad-pod-no-label.yaml": admission webhook "validate.kyverno.svc-fail" denied the request:
policy Pod/kyverno-demo/bad-pod-no-label for resource violation:
require-team-label:
  check-team-label: Pod is missing required label 'team'. Add metadata.labels.team to your pod spec.
```

Audit
The default when validationFailureAction is omitted. Violations are logged but the resource is still created.

```yaml
spec:
  validationFailureAction: Audit
```

In audit mode, the same pod creation succeeds. But Kyverno creates a PolicyReport that documents the violation:

```bash
kubectl get policyreport -n kyverno-demo
```

PolicyReports are Kubernetes resources. They contain:
- Which resource violated which policy
- The rule that failed
- The severity level
- The timestamp
This is useful for phased rollouts. Start with audit mode to understand what would break. Review the violations. Fix the resources. Then switch to enforce mode.
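A phased rollout does not have to flip the whole cluster at once. The sketch below assumes a Kyverno release that supports the spec-level validationFailureActionOverrides field (newer releases move the equivalent settings under each rule's validate block). It keeps the policy in audit mode everywhere except one namespace that has already been cleaned up.

```yaml
spec:
  validationFailureAction: Audit         # report-only by default
  validationFailureActionOverrides:
    - action: Enforce                    # block violations here
      namespaces:
        - kyverno-demo
  background: true
```

As more namespaces reach zero violations, they can be added to the override list until the policy-wide action itself is switched to Enforce.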
Policy Reports and Background Scanning
PolicyReports are automatically created when background: true is set. Kyverno rescans existing resources periodically (once an hour by default in recent releases; the interval is configurable) and updates the reports.
Each namespace gets a PolicyReport that aggregates violations in that namespace. Cluster-scoped resources (nodes, namespaces, etc.) get a ClusterPolicyReport.
To see all violations across the cluster:
```bash
kubectl get policyreport -A
kubectl get clusterpolicyreport
```

To see details of a specific report:

```bash
kubectl describe policyreport -n kyverno-demo
```

Reports show:
- Pass: How many resources comply
- Fail: How many resources violate policies
- Warn: How many resources trigger warnings
- Error: How many resources failed policy evaluation
- Skip: How many resources were excluded
Reports are read-only. Kyverno owns them. When the resource is fixed or deleted, the report entry is automatically updated.
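For orientation, here is a trimmed sketch of what a report might contain for a pod that predates the label policy. The report name, pod name, and timestamp are hypothetical, and the exact layout varies between Kyverno releases (newer ones tend to create one report per resource and identify it with a scope field instead of a resources list).

```yaml
apiVersion: wgpolicyk8s.io/v1alpha2
kind: PolicyReport
metadata:
  name: example-report                 # hypothetical; Kyverno chooses the real name
  namespace: kyverno-demo
summary:
  pass: 1
  fail: 1
  warn: 0
  error: 0
  skip: 0
results:
  - policy: require-team-label
    rule: check-team-label
    result: fail
    severity: medium
    category: Best Practices
    source: kyverno
    message: "Pod is missing required label 'team'. Add metadata.labels.team to your pod spec."
    resources:
      - apiVersion: v1
        kind: Pod
        name: legacy-pod               # hypothetical pod created before the policy
        namespace: kyverno-demo
    timestamp:
      seconds: 1700000000              # placeholder
      nanos: 0
  - policy: disallow-latest-tag
    rule: require-image-tag
    result: pass
    source: kyverno
    resources:
      - apiVersion: v1
        kind: Pod
        name: legacy-pod
        namespace: kyverno-demo
```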
Trade-offs: Kyverno vs OPA Gatekeeper
The Kubernetes ecosystem has two widely used policy engines, Kyverno and OPA Gatekeeper. The choice depends on team skills and use cases.
Kyverno
Pros:
- Policies are written in YAML. No new language to learn.
- Integrated with kubectl. You can validate policies locally before applying them.
- Generate policies can create related resources automatically.
- Built-in policy library with 200+ ready-to-use policies.
- Variable substitution and context lookups are simpler.
Cons:
- YAML patterns can be verbose for complex logic.
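- Testing is declarative rather than programmatic: the Kyverno CLI checks manifests against policies, but you cannot unit test individual expressions the way you can with Rego.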
- Custom functions are limited compared to Rego.
OPA Gatekeeper
Pros:
- Policies are written in Rego, a powerful declarative language.
- Better for complex logic (conditional rules, nested checks, arithmetic).
- Strong testing story (you can unit test Rego policies).
- OPA is a general-purpose policy engine (works beyond Kubernetes).
Cons:
- Rego has a learning curve. Developers need training.
- Policy structure is more abstract (ConstraintTemplates + Constraints).
- No automatic resource generation.
When to Choose Kyverno
- Teams prefer YAML over learning a new language.
- Most policies are simple pattern matching (require labels, block host paths).
- You want to use generate policies to auto-create related resources.
- You value the built-in policy library.
When to Choose OPA Gatekeeper
- You have complex policy logic (e.g., “memory limit must be <= 2x memory request”).
- You already use OPA for other systems (API gateways, microservices).
- You want strong unit testing for policies.
- You need fine-grained control over data queries and transformations.
You can also run both. Some organizations use Kyverno for common patterns and Gatekeeper for complex custom rules.
Production Considerations
High Availability
Kyverno runs as a deployment. In production, run at least three replicas with pod anti-affinity to spread them across nodes:

```yaml
replicas: 3
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
            - key: app.kubernetes.io/name
              operator: In
              values:
                - kyverno
        topologyKey: kubernetes.io/hostname
```

If all Kyverno pods are down, the webhook fails closed by default and blocks every request it is registered for. You can configure it to fail open, but this defeats the purpose of policy enforcement.
Webhook Timeout
Admission webhooks have a timeout. The default is 10 seconds. If Kyverno does not respond within 10 seconds, the webhook fails and the request is denied (or allowed, depending on failurePolicy).
Complex policies with external data lookups or API calls can exceed this timeout. Keep policies simple. If you need external data, cache it in ConfigMaps and reference them with context variables.
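Both knobs can be set per policy. The sketch below assumes a Kyverno release that exposes failurePolicy and webhookTimeoutSeconds directly under spec (later releases group the equivalent settings under a webhook configuration block, so check the CRD for your version). The values are examples only.

```yaml
spec:
  failurePolicy: Fail            # Fail = deny on timeout or outage, Ignore = fail open
  webhookTimeoutSeconds: 15      # Kubernetes accepts values from 1 to 30 seconds
  validationFailureAction: Enforce
  background: true
```

Raising the timeout buys slow policies more room, but every second added here is a second that a kubectl apply can hang, so simplifying the policy is usually the better first step.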
Policy Testing
Test policies before deploying them to production.
Local validation:
Kyverno CLI can validate resources against policies locally:

```bash
kyverno apply policy.yaml --resource pod.yaml
```

This runs the policy engine without applying to the cluster.
Dry-run mode:
Use kubectl apply --dry-run=server to test if a resource would be blocked:

```bash
kubectl apply -f test-pod.yaml --dry-run=server
```

The server evaluates the resource through all admission webhooks, including Kyverno, but does not persist it.
CI integration:
Run kyverno apply in CI pipelines. Fail the build if manifests violate policies. This catches violations before they reach the cluster.
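Besides kyverno apply, the CLI has a kyverno test command that compares actual results against expected ones, which suits CI well. The sketch below assumes the test-file schema used by recent CLI releases (older releases use a slightly different layout) and assumes the demo's test pods live under manifests/. The pod name comes from the denial message shown earlier.

```yaml
# kyverno-test.yaml, placed at the repository root (assumed location)
apiVersion: cli.kyverno.io/v1alpha1
kind: Test
metadata:
  name: require-team-label-tests       # hypothetical test name
policies:
  - manifests/policy-require-labels.yaml
resources:
  - manifests/test-bad-pod-no-label.yaml
results:
  - policy: require-team-label
    rule: check-team-label
    kind: Pod
    resources:
      - bad-pod-no-label
    result: fail                       # the test passes when the policy rejects this pod
```

Running kyverno test against the directory that contains this file exits non-zero when an expectation is not met, so the build fails if someone weakens the policy by accident.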
Namespace Exclusions
System namespaces should usually be excluded from policies. Kyverno excludes kube-system, kube-public, kube-node-lease, and its own namespace by default.
Add more exclusions if needed:
```yaml
spec:
  validationFailureAction: Enforce
  background: true
  rules:
    - name: check-team-label
      match:
        any:
          - resources:
              kinds:
                - Pod
      exclude:
        any:
          - resources:
              namespaces:
                - kube-system
                - monitoring
                - logging
```

Be careful not to over-exclude. If you exclude too many namespaces, policies lose effectiveness.
Policy Versioning
Store policies in Git. Version them alongside application manifests. Use GitOps (ArgoCD, Flux) to deploy policies. This gives you:
- Audit trail of policy changes
- Rollback capability
- Code review for policy changes
- Consistent deployment process
Treat policies as code. Do not kubectl apply policies manually in production.
Resource Consumption
Each policy evaluation consumes CPU and memory. The impact depends on:
- Number of policies
- Complexity of patterns
- Number of resources created per second
Monitor Kyverno pod metrics. Set resource limits:
```yaml
resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 1000m
    memory: 512Mi
```

For clusters with high churn (thousands of pods created per hour), you may need to set background: false on some policies to reduce background scan overhead.
Common Pitfalls
Wildcard Matches Can Be Too Broad
Using match.any.resources.kinds: ["*"] makes the policy apply to every resource type. This can cause unexpected behavior:
- Policy evaluates on CRDs you did not intend to cover
- Background scans become slow as every resource is checked
- Webhook latency increases
Be specific. List the exact kinds the policy should cover.
Policy Ordering Matters for Mutations
If multiple mutating policies apply to the same resource, they run in order of creation. Policy A runs first. Its output becomes the input to policy B.
This can cause conflicts. Policy A adds a label. Policy B expects that label to not exist (using a preconditions check). Policy B fails.
Solve this with explicit preconditions:
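```yaml
# preconditions is a rule-level field and sits next to mutate, not inside it
preconditions:
  any:
    - key: "{{ request.object.metadata.labels.team || '' }}"
      operator: NotEquals
      value: ""
mutate:
  patchStrategicMerge:
    metadata:
      labels:
        reviewed: "true"
```

This mutation only runs if the team label already exists. The || '' fallback gives the variable an empty value when the label is absent, so the precondition evaluates to false instead of failing to resolve.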
Background Scanning Can Create Report Noise
With background: true, Kyverno scans all existing resources. If you have 1,000 pods and install a new policy, you immediately get 1,000 PolicyReport entries.
This is by design. But it can overwhelm teams. Strategies to manage this:
- Start with validationFailureAction: Audit and background: false. Only new resources are checked.
- Fix existing violations gradually.
- Switch to background: true after violations are reduced.
- Finally switch to validationFailureAction: Enforce.
Forgetting to Exclude System Namespaces
A policy that requires all pods to have a team label will break kube-dns, metrics-server, and other system components.
Always use exclude for system namespaces or use an opt-in approach with namespace labels:
```yaml
match:
  any:
    - resources:
        kinds:
          - Pod
        namespaceSelector:
          matchLabels:
            policy-enforcement: "true"
```

Only namespaces with the policy-enforcement: "true" label are checked. This avoids breaking system namespaces.
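For the opt-in approach to take effect, each namespace you want covered needs that label. A minimal sketch, using the demo namespace as the example:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: kyverno-demo
  labels:
    policy-enforcement: "true"   # quoted so YAML treats it as a string, not a boolean
```

The trade-off is that a new namespace is unprotected until someone labels it, so this pattern works best when namespaces are created from a template or by automation.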
Policy Exceptions Are Not PolicyException Resources
Kyverno v1.10+ has a PolicyException resource for excluding specific resources from policies:

```yaml
apiVersion: kyverno.io/v2
kind: PolicyException
metadata:
  name: allow-latest-for-dev
  namespace: kyverno-demo
spec:
  exceptions:
    - policyName: disallow-latest-tag
      ruleNames:
        - require-image-tag
  match:
    any:
      - resources:
          kinds:
            - Pod
          namespaces:
            - kyverno-demo
          names:
            - dev-*
```

This allows pods with names starting with dev- to bypass the :latest tag check. Use this sparingly. Broad exceptions defeat the purpose of policies.
Variables Must Be Properly Escaped
Kyverno supports variable substitution with {{...}} syntax. Common variables:
| Variable | Description |
|---|---|
| {{request.object}} | The incoming resource |
| {{request.operation}} | CREATE, UPDATE, DELETE |
| {{request.userInfo}} | The user or ServiceAccount |
| {{request.namespace}} | The namespace of the resource |
Variables can reference nested fields:
```yaml
message: "Image {{request.object.spec.containers[0].image}} is not allowed"
```

But if the field does not exist, the variable cannot be resolved and the policy evaluation fails. Use the JMESPath || operator to supply a fallback value:

```yaml
message: "Image {{request.object.spec.containers[0].image || 'unknown'}} is not allowed"
```

This prevents errors when the containers array is empty or the image field is missing.
Advanced Patterns
Context Variables and External Data
Kyverno can fetch data from ConfigMaps, the API server, or external services:
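```yaml
context:
  - name: allowedregistries
    configMap:
      name: registry-whitelist
      namespace: kyverno
validate:
  message: "Image registry is not in the approved list"
  foreach:
    # Check each container on its own: split() takes a single string.
    - list: "request.object.spec.containers"
      deny:
        conditions:
          all:
            - key: "{{ element.image | split(@, '/') | [0] }}"
              operator: AnyNotIn
              # Assumes the ConfigMap stores the list as a JSON array string.
              value: "{{ parse_json(allowedregistries.data.registries) }}"
```

This loads a ConfigMap, parses the list of allowed registries, and blocks images from unapproved registries. The context name avoids hyphens so it can be referenced in a JMESPath expression without quoting.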
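The policy only works if the ConfigMap it reads exists in the expected shape. A minimal sketch of that ConfigMap, assuming the list is stored as a JSON array in a single string value; the registry names are placeholders.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: registry-whitelist
  namespace: kyverno
data:
  # Assumed format: a JSON array serialized as a string
  registries: '["ghcr.io", "registry.example.com"]'
```

Because the list lives in the cluster rather than in the policy, updating the approved registries is a ConfigMap change and does not require touching the policy itself.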
Image Verification with Cosign
Kyverno can verify image signatures using Cosign:

```yaml
spec:
  rules:
    - name: check-image-signature
      match:
        any:
          - resources:
              kinds:
                - Pod
      verifyImages:
        - imageReferences:
            - "ghcr.io/myorg/*"
          attestors:
            - entries:
                - keys:
                    publicKeys: |-
                      -----BEGIN PUBLIC KEY-----
                      ...
                      -----END PUBLIC KEY-----
```

This ensures only images signed with the specified key are allowed. Unsigned images that match ghcr.io/myorg/* are rejected; images outside imageReferences are not checked by this rule.
Auto-Generation of Network Policies
A common pattern for generate policies:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-networkpolicy
spec:
  rules:
    - name: default-deny
      match:
        any:
          - resources:
              kinds:
                - Namespace
      exclude:
        any:
          - resources:
              namespaces:
                - kube-system
                - kyverno
      generate:
        apiVersion: networking.k8s.io/v1
        kind: NetworkPolicy
        name: default-deny-ingress
        namespace: "{{request.object.metadata.name}}"
        synchronize: true
        data:
          spec:
            podSelector: {}
            policyTypes:
              - Ingress
```

Every new namespace gets a default-deny NetworkPolicy. Teams must explicitly create allow rules for their services.
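An allow rule that a team might add on top of the generated default-deny could look like the sketch below. The policy name, namespace, labels, and port are hypothetical.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-api        # hypothetical
  namespace: backend                 # hypothetical namespace
spec:
  podSelector:
    matchLabels:
      app: api                       # pods that receive traffic (hypothetical label)
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend          # pods allowed to connect (hypothetical label)
      ports:
        - protocol: TCP
          port: 8080
```

NetworkPolicies are additive, so this rule opens one path while the generated default-deny keeps blocking everything else.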
Connection to the Demo
The demo shows validating and mutating policies in action:
- Validation (require-team-label): Blocks pods without a team label.
- Validation (disallow-latest-tag): Blocks pods using :latest or untagged images.
- Mutation (add-default-resources): Automatically adds resource limits to pods that omit them.
Test pods demonstrate the policy enforcement:
- test-good-pod.yaml: Passes all policies. Has a team label, a specific image tag, and resource limits.
- test-bad-pod-no-label.yaml: Blocked by the label policy.
- test-bad-pod-latest.yaml: Blocked by the image tag policy.
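For reference, a pod that satisfies all three policies might look like the sketch below. This is not the actual content of test-good-pod.yaml; the pod name, team value, and image are assumptions.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: good-pod                 # hypothetical name
  namespace: kyverno-demo
  labels:
    team: platform               # satisfies require-team-label
spec:
  containers:
    - name: app
      image: nginx:1.27          # explicit tag, not :latest
      resources:                 # already set, so the mutation adds nothing
        requests:
          cpu: 100m
          memory: 64Mi
        limits:
          cpu: 200m
          memory: 128Mi
```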
The mutation is invisible at creation time but visible when you inspect the pod:
```bash
kubectl get pod mutated-pod -o yaml | grep -A 6 resources
```

The output shows Kyverno added default requests and limits.
Further Reading
- Kyverno Documentation
- Kyverno Policy Library
- Kubernetes Admission Controllers
- OPA Gatekeeper
- Policy Reports
- Cosign Image Signing