Multi-Tenant Platform: Deep Dive

Most organizations run multiple teams on shared Kubernetes clusters. Without governance, a single team can exhaust cluster resources, access other teams’ secrets, or send traffic to services they should not reach. Multi-tenancy combines several Kubernetes features to create isolated, governed workspaces within a single cluster.

The simplest multi-tenancy model assigns one namespace per team. Each namespace acts as a boundary for:

  • Resource quotas - total CPU, memory, and object counts
  • RBAC - who can do what inside the namespace
  • Network policies - which namespaces can communicate
  • LimitRanges - per-container defaults and maximums

This model works well for 5-20 teams. Beyond that, consider hierarchical namespaces (HNC) or virtual clusters (vcluster).

A ResourceQuota tracks the aggregate resource consumption across all pods in a namespace. When a pod creation would exceed the quota, the admission controller rejects it.

Important behaviors:

  • Once a ResourceQuota that tracks compute resources (e.g. requests.cpu, limits.memory) exists in a namespace, every new pod must specify requests/limits for those resources. Pods without them are rejected. A LimitRange solves this by injecting defaults.
  • Quotas are checked at admission time, not runtime. If a running pod’s actual usage exceeds its limits, the kubelet handles it (OOMKill for memory, throttling for CPU). Quotas only control what gets scheduled.
  • Quota usage updates are eventually consistent. There is a brief window where a burst of pod creations might slightly exceed the quota.
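The behaviors above can be illustrated with a minimal ResourceQuota (the namespace and resource values here are illustrative, not from the original):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-compute      # illustrative name
  namespace: team-a         # illustrative tenant namespace
spec:
  hard:
    requests.cpu: "10"      # sum of CPU requests across all pods
    requests.memory: 20Gi   # sum of memory requests
    limits.cpu: "20"        # sum of CPU limits
    limits.memory: 40Gi     # sum of memory limits
    pods: "50"              # object-count limit
```

A pod whose requests would push any of these aggregates over the hard limit is rejected at admission time.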

Quota scopes:

  • BestEffort - applies only to pods in the BestEffort QoS class (no requests or limits set)
  • NotBestEffort - applies only to pods that set requests or limits
  • Terminating - pods with an active deadline
  • NotTerminating - pods without an active deadline
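A scoped quota attaches one of these scopes to the hard limits. A minimal sketch capping only BestEffort pods (names are illustrative):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: besteffort-cap    # illustrative name
spec:
  hard:
    pods: "5"             # counts only pods matching the scope below
  scopes:
  - BestEffort
```

Pods that set requests or limits are not counted against this quota at all.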

Count quotas for CRDs:

spec:
  hard:
    count/certificates.cert-manager.io: "10"
    count/ingresses.networking.k8s.io: "5"

LimitRange operates at the container level (not pod level). It provides:

  • default - applied when a container omits limits
  • defaultRequest - applied when a container omits requests
  • min - minimum allowed request/limit
  • max - maximum allowed request/limit
  • maxLimitRequestRatio - enforces a ratio between limit and request

LimitRange also supports type: Pod (aggregate of all containers) and type: PersistentVolumeClaim (storage size limits).
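Putting those fields together, a LimitRange covering both container and PVC constraints might look like this (all specific values are illustrative):

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: tenant-defaults     # illustrative name
spec:
  limits:
  - type: Container
    default:                # injected when a container omits limits
      cpu: 500m
      memory: 256Mi
    defaultRequest:         # injected when a container omits requests
      cpu: 100m
      memory: 128Mi
    min:
      cpu: 50m
    max:
      cpu: "2"
      memory: 2Gi
    maxLimitRequestRatio:
      cpu: "4"              # limit may be at most 4x the request
  - type: PersistentVolumeClaim
    max:
      storage: 50Gi         # cap individual claim sizes
```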

| Team Role | Typical Permissions |
| --- | --- |
| Frontend developers | Deployments, Services, ConfigMaps (read-only) |
| Backend developers | Deployments, Services, ConfigMaps, Secrets |
| Data engineers | All above + PVCs, StatefulSets, Jobs, CronJobs |
| Platform admins | ClusterRole with namespace creation, quota management |
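The first row of the table could be expressed as a namespaced Role along these lines (role and namespace names are illustrative):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: frontend-dev          # illustrative name
  namespace: team-frontend    # illustrative tenant namespace
rules:
- apiGroups: ["apps"]
  resources: ["deployments"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: [""]
  resources: ["services"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get", "list", "watch"]   # read-only, per the table
```

Bind it to a team group with a RoleBinding in the same namespace.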

Kubernetes supports label-based role aggregation. Create small roles and aggregate them:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: team-base
  labels:
    rbac.myorg.io/aggregate-to-team: "true"
rules:
- apiGroups: [""]
  resources: ["pods", "pods/log"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: team-aggregate
aggregationRule:
  clusterRoleSelectors:
  - matchLabels:
      rbac.myorg.io/aggregate-to-team: "true"
rules: [] # auto-populated

Instead of using long-lived ServiceAccount tokens, use projected service account tokens with an audience and expiration:

volumes:
- name: token
  projected:
    sources:
    - serviceAccountToken:
        path: token
        expirationSeconds: 3600
        audience: my-api

Always start with a deny-all policy in every tenant namespace, then add explicit allow rules for specific flows. This follows the zero-trust principle: assume no traffic is legitimate until proven otherwise.

NetworkPolicy uses namespaceSelector to control cross-namespace traffic. The selector matches labels on the Namespace object itself, not on pods, so make sure tenant namespaces are labeled (ideally at creation time).

Common mistake: forgetting that NetworkPolicy is additive. If you have a deny-all and an allow rule, the allow rule creates an exception. You cannot use NetworkPolicy to deny traffic that another policy allows.

Every deny-all policy must be paired with a DNS allow rule. Without it, pods cannot resolve service names. DNS runs in kube-system on port 53 (UDP and TCP).
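A minimal sketch of the deny-all plus DNS-allow pairing (policy names are illustrative; the kubernetes.io/metadata.name label is set automatically on namespaces in Kubernetes 1.21+):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all      # illustrative name
spec:
  podSelector: {}             # selects every pod in the namespace
  policyTypes: [Ingress, Egress]
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns             # illustrative name
spec:
  podSelector: {}
  policyTypes: [Egress]
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
```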

  • namespaceSelector alone: matches all pods in matching namespaces
  • podSelector alone: matches pods in the same namespace
  • Both together (in the same from entry): matches pods with the label in namespaces with the label (AND logic)
  • Both separately (in different from entries): matches either (OR logic)
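The AND/OR distinction comes down to whether the selectors share one `from` entry or occupy two. A sketch with illustrative labels:

```yaml
ingress:
# AND: pods labeled app=api, in namespaces labeled team=backend
- from:
  - namespaceSelector:
      matchLabels:
        team: backend
    podSelector:
      matchLabels:
        app: api
# OR: any pod in team=backend namespaces, or any app=api pod
# in this namespace
- from:
  - namespaceSelector:
      matchLabels:
        team: backend
  - podSelector:
      matchLabels:
        app: api
```

The only difference is the `-` that starts a new item in the `from` list.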

Use OPA/Gatekeeper or Kyverno to enforce policies that RBAC and quotas cannot:

  • Require specific labels on all resources
  • Block privileged containers
  • Enforce image registries (only pull from approved registries)
  • Prevent hostPath mounts
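As one sketch of the first bullet, a Kyverno ClusterPolicy requiring a team label on pods might look like this (the policy name and label key are illustrative assumptions, and the exact `validationFailureAction` casing varies across Kyverno versions):

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-team-label    # illustrative name
spec:
  validationFailureAction: Enforce
  rules:
  - name: check-team-label
    match:
      any:
      - resources:
          kinds: [Pod]
    validate:
      message: "All pods must carry a team label."
      pattern:
        metadata:
          labels:
            team: "?*"        # any non-empty value
```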

Map ResourceQuota usage to cost centers:

  1. Label namespaces with cost-center identifiers
  2. Use tools like kubecost or OpenCost to track actual vs allocated resources
  3. Set quotas based on budget, not just technical capacity
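Step 1 amounts to nothing more than labels on the Namespace object (the label key and cost-center value here are illustrative):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: team-a
  labels:
    cost-center: cc-1234    # illustrative cost-center identifier
    team: team-a
```

Cost tools like OpenCost can then group spend by the cost-center label.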

For organizations with sub-teams, Kubernetes Hierarchical Namespace Controller (HNC) allows:

  • Parent namespaces that propagate policies to children
  • Delegated namespace creation within a subtree
  • Inherited RBAC, quotas, and network policies
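With HNC installed, delegated namespace creation is done by placing a SubnamespaceAnchor in the parent namespace (names are illustrative; the API group/version shown here matches recent HNC releases but may differ in yours):

```yaml
apiVersion: hnc.x-k8s.io/v1alpha2
kind: SubnamespaceAnchor
metadata:
  name: team-a-dev      # child namespace to create
  namespace: team-a     # parent namespace
```

The controller creates the team-a-dev namespace and propagates the parent's policy objects into it.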

For stronger isolation, vcluster creates virtual Kubernetes clusters inside namespaces. Each tenant gets their own control plane (API server, controller manager) while sharing the underlying nodes. This provides near-cluster-level isolation without the cost of separate clusters.

Key metrics to track per namespace:

  • kube_resourcequota - quota limits and usage
  • container_memory_working_set_bytes - actual memory usage
  • container_cpu_usage_seconds_total - actual CPU usage
  • kube_pod_status_phase - pod lifecycle states
  • kube_networkpolicy_* - policy counts and affected pods

Set alerts when any namespace exceeds 80% of its quota to give teams time to optimize before hitting hard limits.
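One way to express that 80% alert, assuming kube-state-metrics is scraped by Prometheus (rule names and thresholds are illustrative):

```yaml
groups:
- name: tenant-quota          # illustrative group name
  rules:
  - alert: NamespaceQuotaNearLimit
    expr: |
      kube_resourcequota{type="used"}
        / on (namespace, resource, resourcequota)
      kube_resourcequota{type="hard"} > 0.8
    for: 15m
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.namespace }} is above 80% of its quota for {{ $labels.resource }}"
```

The `on (...)` clause matches used against hard for the same namespace, resource, and quota object while ignoring the differing `type` label.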