Resource Quotas & LimitRanges: Deep Dive
This document explains how Kubernetes enforces resource governance at the namespace level. It covers the ResourceQuota admission controller, LimitRange mechanics, quota scopes, count quotas for custom resources, and strategies for multi-team namespace management.
The ResourceQuota Admission Controller
ResourceQuota enforcement happens through an admission controller. When a pod creation or update request arrives at the API server, the ResourceQuota admission controller intercepts it.
The admission controller:
- Looks up all ResourceQuota objects in the target namespace.
- Calculates the total resource consumption after the request would be applied.
- If the new total exceeds any quota, the request is rejected with a 403 Forbidden error.
- If the request fits, the quota’s `status.used` field is updated.
This is a synchronous, blocking check. The pod is never created if it would exceed the quota. This differs from resource limits on containers, which allow the container to start and then enforce via the kernel (OOM kill, CPU throttling).
The demo defines a quota in compute-quota:
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: quota-demo
spec:
  hard:
    requests.cpu: "1"
    requests.memory: 1Gi
    limits.cpu: "2"
    limits.memory: 2Gi
    pods: "5"
    services: "3"
    persistentvolumeclaims: "2"
```
Quota Enforcement Requires Resource Specs
When a ResourceQuota exists in a namespace that tracks compute resources (`requests.cpu`, `limits.memory`, etc.), every pod in that namespace must specify resource requests and limits for those tracked resources. If a pod does not have resource specs, it is rejected.
This is why LimitRange exists: it provides default resource specs for pods that do not define their own.
Without a LimitRange, a bare `kubectl run nginx --image=nginx` in a quota-enabled namespace fails with:
```
Error: pods "nginx" is forbidden: failed quota: compute-quota: must specify requests.cpu, requests.memory, limits.cpu, limits.memory
```
What Gets Counted
The quota counts resources from all pods in the namespace, regardless of their status. A pod in Pending state still counts against the quota because its resource requests are reserved. Only Succeeded and Failed pods (terminal states) do not count.
This means: if you have 5 pods stuck in Pending (image pull error, scheduling failure), they still consume quota. New pods cannot be created until the stuck pods are cleaned up.
ResourceQuota Scopes
Quotas can be scoped to apply only to certain types of pods. Scopes narrow which pods count against the quota.
BestEffort Scope
Applies only to pods with no resource requests or limits on any container:
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: best-effort-quota
spec:
  hard:
    pods: "10"
  scopes:
    - BestEffort
```
This limits the number of BestEffort pods (pods without resource specs). Pods with resource specs are not affected.
NotBestEffort Scope
The inverse. Applies only to pods that have at least one container with resource requests or limits:
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: not-best-effort-quota
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    pods: "20"
  scopes:
    - NotBestEffort
```
Terminating and NotTerminating Scopes
These scope quotas based on whether pods have an `activeDeadlineSeconds` set:
- Terminating: Pods with `activeDeadlineSeconds` set (Jobs, batch workloads)
- NotTerminating: Pods without `activeDeadlineSeconds` (long-running services)
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: batch-quota
spec:
  hard:
    pods: "50"
    requests.cpu: "8"
  scopes:
    - Terminating
```
This allows up to 50 batch job pods consuming 8 CPU total, without affecting the quota for long-running services.
PriorityClass Scope
Quota can target pods of a specific priority class using `scopeSelector`:
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: high-priority-quota
spec:
  hard:
    pods: "10"
    requests.cpu: "4"
  scopeSelector:
    matchExpressions:
      - operator: In
        scopeName: PriorityClass
        values:
          - high-priority
```
This is powerful for multi-team clusters. You can give each team a quota for high-priority workloads and a separate, larger quota for low-priority workloads. High-priority pods are scarce and controlled. Low-priority pods can use more resources but get preempted first.
CrossNamespacePodAffinity Scope
Added in v1.24. Limits the number of pods that use cross-namespace affinity terms:
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: cross-ns-quota
spec:
  hard:
    pods: "5"
  scopes:
    - CrossNamespacePodAffinity
```
Cross-namespace affinity is a potential security concern because a pod in namespace A can influence scheduling near pods in namespace B. This scope limits that behavior.
Count Quotas
Beyond compute resources, quotas can count any API object type:
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: object-counts
spec:
  hard:
    pods: "5"
    services: "3"
    persistentvolumeclaims: "2"
    configmaps: "10"
    secrets: "10"
    services.loadbalancers: "1"
    services.nodeports: "2"
```
Count Quotas for CRDs
You can quota custom resources using the `count/` prefix:
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: crd-quota
spec:
  hard:
    count/certificates.cert-manager.io: "20"
    count/virtualservices.networking.istio.io: "10"
```
The format is `count/<resource>.<api-group>`. This prevents teams from creating unbounded numbers of custom resources.
LimitRange In Detail
LimitRange operates at the individual container or pod level, while ResourceQuota operates at the namespace level. They serve different purposes:
- ResourceQuota: Total budget for the namespace
- LimitRange: Per-container constraints and defaults
The demo’s LimitRange:
```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: quota-demo
spec:
  limits:
    - type: Container
      default:
        cpu: 200m
        memory: 128Mi
      defaultRequest:
        cpu: 50m
        memory: 64Mi
      min:
        cpu: 25m
        memory: 32Mi
      max:
        cpu: 500m
        memory: 512Mi
```
LimitRange Fields Explained
| Field | What It Does |
|---|---|
| `default` | Applied as the container’s limit if none specified |
| `defaultRequest` | Applied as the container’s request if none specified |
| `min` | Minimum allowed request/limit. Pod is rejected if below this |
| `max` | Maximum allowed request/limit. Pod is rejected if above this |
| `maxLimitRequestRatio` | Maximum allowed ratio of limit to request |
The maxLimitRequestRatio is interesting. If set to 2, a container requesting 100m CPU can have at most 200m CPU limit. This prevents overcommit where a container requests 50m but has a limit of 4000m, potentially starving other containers.
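A minimal sketch of a LimitRange enforcing that ratio (the name and values are hypothetical, not from the demo):

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: ratio-limits        # hypothetical example
spec:
  limits:
    - type: Container
      maxLimitRequestRatio:
        cpu: "2"            # limit may be at most 2x the request
      # e.g. a container requesting 100m may have a limit up to 200m;
      # a container requesting 50m with a 4000m limit (ratio 80)
      # would be rejected at admission.
```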
How Defaults Are Applied
LimitRange defaults are injected by a mutating admission controller. When a pod spec arrives without resource specs:
- The controller checks if any LimitRange exists in the namespace.
- For each container missing `resources.limits`, it injects the `default` values.
- For each container missing `resources.requests`, it injects the `defaultRequest` values.
- If `defaultRequest` is not set but `default` is, the request is set equal to the limit.
The injection happens before the ResourceQuota check. So even if a user submits a pod without resources, the LimitRange injects defaults, and then the ResourceQuota validates the totals.
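As a concrete sketch, assuming the demo’s `default-limits` LimitRange is in place, a pod submitted with no `resources` block is mutated before the quota check (the pod itself is a hypothetical example; the effective values are shown as comments):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: bare-pod            # hypothetical pod with no resource specs
  namespace: quota-demo
spec:
  containers:
    - name: app
      image: nginx
      # After LimitRange admission, the effective spec carries:
      #   resources.requests: cpu 50m,  memory 64Mi   (from defaultRequest)
      #   resources.limits:   cpu 200m, memory 128Mi  (from default)
```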
LimitRange Types
LimitRange supports three types:
Container (most common):
```yaml
limits:
  - type: Container
    default:
      cpu: 200m
    max:
      cpu: 500m
```
Applies to individual containers within a pod.
Pod:
```yaml
limits:
  - type: Pod
    max:
      cpu: "2"
      memory: 4Gi
```
Limits the total resources of all containers in a pod combined. This prevents a pod with 10 containers from consuming 10x the container max.
PersistentVolumeClaim:
```yaml
limits:
  - type: PersistentVolumeClaim
    min:
      storage: 1Gi
    max:
      storage: 100Gi
```
Controls PVC sizes. Prevents users from requesting 1 TiB PVCs when the storage class has limited capacity.
Interaction Between Quota and LimitRange
These two resources work together in a specific order:
- Pod spec arrives at the API server.
- LimitRange mutating admission injects defaults (if needed).
- LimitRange validating admission checks min/max/ratio constraints.
- ResourceQuota admission checks namespace totals.
If the LimitRange defaults cause the pod to exceed the ResourceQuota, the pod is rejected. The error message comes from ResourceQuota, not LimitRange, which can be confusing.
The Math in the Demo
The demo shows this interaction:
small-app (fits within quota):
```yaml
spec:
  replicas: 2
  template:
    spec:
      containers:
        - name: nginx
          resources:
            requests:
              cpu: 100m
              memory: 64Mi
            limits:
              cpu: 200m
              memory: 128Mi
```
Total for 2 replicas: 200m CPU request, 400m CPU limit, 128Mi memory request, 256Mi memory limit.
greedy-app (exceeds quota):
```yaml
spec:
  replicas: 3
  template:
    spec:
      containers:
        - name: nginx
          resources:
            requests:
              cpu: 400m
              memory: 256Mi
            limits:
              cpu: 800m
              memory: 512Mi
```
small-app already uses 200m CPU request. greedy-app wants 400m * 3 = 1200m CPU request. The total would be 1400m, but the quota allows 1000m. Only 2 of the 3 greedy-app pods can be created (200m existing + 400m * 2 = 1000m exactly). The third pod is rejected.
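The arithmetic can be checked in a few lines (values taken from the compute-quota manifest and the two deployments above; the per-pod admission behavior is modeled as integer division):

```python
# Quota math from the quota-demo namespace.
QUOTA_REQUESTS_CPU_M = 1000   # requests.cpu: "1" -> 1000 millicores

SMALL_APP_REQUEST_M = 100     # small-app per-replica CPU request
SMALL_APP_REPLICAS = 2
GREEDY_APP_REQUEST_M = 400    # greedy-app per-replica CPU request
GREEDY_APP_REPLICAS = 3

used = SMALL_APP_REPLICAS * SMALL_APP_REQUEST_M       # 200m already consumed
wanted = GREEDY_APP_REPLICAS * GREEDY_APP_REQUEST_M   # 1200m requested
remaining = QUOTA_REQUESTS_CPU_M - used               # 800m left in the budget

# The ResourceQuota check admits pods one at a time until the next
# pod would push usage over the hard limit.
admitted = remaining // GREEDY_APP_REQUEST_M

print(used + wanted)  # 1400 (total demand, millicores) vs a 1000m quota
print(admitted)       # 2 of the 3 greedy-app pods are admitted
```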
The Deployment controller keeps retrying. If small-app is scaled down or deleted, the quota frees up and the third greedy-app pod can be created.
Multiple ResourceQuotas
A namespace can have multiple ResourceQuota objects. Each one enforces independently. The pod must satisfy all of them.
```yaml
# Compute quota
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
spec:
  hard:
    requests.cpu: "4"
    limits.memory: 8Gi
---
# Object count quota
apiVersion: v1
kind: ResourceQuota
metadata:
  name: count-quota
spec:
  hard:
    pods: "20"
    services: "5"
```
Use separate quotas for different concerns: compute resources, object counts, storage. This makes it easier to adjust one without touching the others.
Multi-Team Namespace Strategies
In multi-team clusters, quotas are the primary mechanism for resource governance.
One Namespace Per Team
The simplest model. Each team gets a namespace with a ResourceQuota:
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-alpha-quota
  namespace: team-alpha
spec:
  hard:
    requests.cpu: "8"
    requests.memory: 16Gi
    limits.cpu: "16"
    limits.memory: 32Gi
    pods: "50"
```
Environment-Based Namespaces
Teams get separate namespaces for dev, staging, and production with different quotas. Dev gets a small quota (2 CPU, 10 pods). Production gets a larger one (16 CPU, 100 pods). The Hierarchical Namespace Controller (HNC) can propagate quotas from parent to child namespaces for more automated governance.
QoS Classes and Quotas
Kubernetes assigns QoS classes based on resource specs: Guaranteed (requests equal limits), Burstable (requests set but not equal to limits), and BestEffort (no requests or limits). The BestEffort and NotBestEffort quota scopes correspond to these classes. BestEffort pods are evicted first under memory pressure.
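For illustration, a container spec whose requests equal its limits lands in the Guaranteed class (a sketch, not part of the demo):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-pod      # hypothetical example
spec:
  containers:
    - name: app
      image: nginx
      resources:
        requests:           # requests == limits on every container
          cpu: 250m         # => QoS class: Guaranteed
          memory: 256Mi
        limits:
          cpu: 250m
          memory: 256Mi
```

Dropping the `limits` block would make it Burstable; dropping `resources` entirely would make it BestEffort.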
Storage and Monitoring Quotas
Quotas can limit storage with `requests.storage` and per-StorageClass quotas using the `<storageclass>.storageclass.storage.k8s.io/` prefix. Monitor quota usage with `kubectl describe resourcequota`, the `kube_resourcequota` Prometheus metric, or `kubectl get events --field-selector reason=FailedCreate`.
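A sketch of a storage quota combining both forms (the `fast-ssd` StorageClass name and the sizes are assumptions for illustration):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota       # hypothetical example
spec:
  hard:
    requests.storage: 500Gi               # total PVC storage in the namespace
    persistentvolumeclaims: "10"
    # Per-StorageClass limits via the <storageclass>.storageclass.storage.k8s.io/ prefix
    fast-ssd.storageclass.storage.k8s.io/requests.storage: 100Gi
    fast-ssd.storageclass.storage.k8s.io/persistentvolumeclaims: "5"
```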
Common Pitfalls
Section titled “Common Pitfalls”1. Forgetting LimitRange with Quota
If ResourceQuota tracks compute resources but no LimitRange provides defaults, pods without resource specs are rejected. Always pair ResourceQuota with LimitRange.
2. Quota Covers All Pod States
Pending and CrashLoopBackOff pods still count. A namespace full of broken pods prevents new pods from being created.
3. ReplicaSet Controller Retries Silently
When a Deployment cannot create pods due to quota, the ReplicaSet controller retries with exponential backoff. The Deployment appears to accept the request, but replicas stay at a lower count. Check events, not just `kubectl get deploy`.
4. LimitRange Min/Max vs Quota
A LimitRange max of 500m CPU with 3 pods could mean 1500m total. But if the quota limits requests to 1000m, only 2 pods at 500m can exist. The LimitRange max and the ResourceQuota hard limit interact in non-obvious ways.
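The interaction can be made concrete with the demo’s numbers (each check passes on its own; only their combination limits the pod count):

```python
# Per-container LimitRange max vs. namespace-wide quota.
LIMITRANGE_MAX_REQUEST_M = 500   # LimitRange max cpu per container (500m)
QUOTA_REQUESTS_CPU_M = 1000      # ResourceQuota hard requests.cpu (1000m)

# Three pods each requesting the per-container max pass the
# LimitRange check individually...
pods_submitted = 3
total_demand = pods_submitted * LIMITRANGE_MAX_REQUEST_M

# ...but the quota caps how many can coexist in the namespace.
pods_admitted = QUOTA_REQUESTS_CPU_M // LIMITRANGE_MAX_REQUEST_M

print(total_demand)   # 1500 (millicores) exceeds the 1000m quota
print(pods_admitted)  # 2 pods at the max fit; the third is rejected
```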
5. Ephemeral Storage Quotas
If you need to limit ephemeral storage (container logs, emptyDir volumes), add:
```yaml
spec:
  hard:
    requests.ephemeral-storage: 10Gi
    limits.ephemeral-storage: 20Gi
```
Without this, a single pod writing to an emptyDir can fill the node’s disk.