10: Resources and Limits

Objective

Learn how to define CPU and memory requests and limits for containers, understand the three Quality of Service (QoS) classes in Kubernetes, and observe what happens when a container exceeds its memory limit.


Theory

Requests vs Limits

Every container in a Pod can specify two resource boundaries:

| Field | Description |
| --- | --- |
| resources.requests | The minimum amount of CPU/memory the container needs. The scheduler uses this value to find a node with enough capacity. |
| resources.limits | The maximum amount of CPU/memory the container is allowed to use. The kubelet enforces this at runtime. |
  • If a container tries to use more CPU than its limit, it gets throttled (not killed).
  • If a container tries to use more memory than its limit, it gets OOM killed (Out Of Memory).

CPU Units

CPU is measured in cores; fractional amounts are commonly expressed in millicores (also written millicpu), where 1000m equals one full core:

| Value | Meaning |
| --- | --- |
| 1 | 1 full CPU core |
| 500m | 0.5 CPU core (half a core) |
| 100m | 0.1 CPU core (10% of a core) |
| 250m | 0.25 CPU core (quarter of a core) |

Memory Units

Memory is measured in bytes, commonly expressed as:

| Unit | Meaning |
| --- | --- |
| Ki / Mi / Gi | Kibibytes / Mebibytes / Gibibytes (powers of 1024) |
| 64Mi | 64 mebibytes (67,108,864 bytes) |
| 128Mi | 128 mebibytes (134,217,728 bytes) |
| 1Gi | 1 gibibyte (1,073,741,824 bytes) |
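The conversions in the two tables above can be sketched with a pair of small helper functions. These are illustrative only — real quantity parsing is done by the Kubernetes API server, not by anything shown here — and they cover just the suffixes used in this exercise:

```python
# Sketch: convert the CPU and memory quantity strings used in this exercise
# into base units. Illustrative helpers only -- not part of kubectl or the
# Kubernetes API; the authoritative parsing lives in the API machinery.

def parse_cpu(quantity: str) -> float:
    """Return the number of CPU cores, e.g. '500m' -> 0.5, '1' -> 1.0."""
    if quantity.endswith("m"):            # millicores: 1000m == 1 core
        return int(quantity[:-1]) / 1000
    return float(quantity)

def parse_memory(quantity: str) -> int:
    """Return bytes for the binary (power-of-1024) suffixes Ki/Mi/Gi."""
    suffixes = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}
    for suffix, factor in suffixes.items():
        if quantity.endswith(suffix):
            return int(quantity[:-2]) * factor
    return int(quantity)                  # plain bytes, no suffix

print(parse_cpu("250m"))      # 0.25
print(parse_memory("64Mi"))   # 67108864
```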

QoS Classes

Kubernetes assigns a Quality of Service class to every Pod based on its resource configuration. The QoS class determines eviction priority when a node runs low on resources.

| QoS Class | Condition | Eviction Priority |
| --- | --- | --- |
| Guaranteed | Every container has requests equal to limits for both CPU and memory | Last to be evicted |
| Burstable | At least one container has requests or limits set, but the Guaranteed criteria are not met | Evicted after BestEffort |
| BestEffort | No container has any requests or limits set | First to be evicted |
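The classification rules in the table above can be sketched as a small function over a simplified container list. This is a simplified model for intuition (for example, it ignores that Kubernetes defaults requests to limits when only limits are set); the authoritative logic lives in Kubernetes itself:

```python
# Sketch of the QoS classification rules from the table above, applied to a
# simplified pod spec: a list of per-container dicts with optional
# "requests" and "limits" keys. Simplified model for intuition only.

def qos_class(containers):
    requests_equal_limits = True
    anything_set = False
    for c in containers:
        req = c.get("requests", {})
        lim = c.get("limits", {})
        if req or lim:
            anything_set = True
        for resource in ("cpu", "memory"):
            # Guaranteed requires every container to set both CPU and
            # memory, with requests exactly equal to limits.
            if req.get(resource) is None or req.get(resource) != lim.get(resource):
                requests_equal_limits = False
    if not anything_set:
        return "BestEffort"
    if requests_equal_limits:
        return "Guaranteed"
    return "Burstable"

print(qos_class([{"requests": {"cpu": "100m", "memory": "64Mi"},
                  "limits":   {"cpu": "100m", "memory": "64Mi"}}]))   # Guaranteed
print(qos_class([{"requests": {"cpu": "50m"},
                  "limits":   {"cpu": "200m"}}]))                     # Burstable
print(qos_class([{}]))                                                # BestEffort
```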

QoS Priority Pyramid

```mermaid
graph TB
    G["Guaranteed<br/>requests = limits<br/>Last evicted"]
    B["Burstable<br/>requests < limits<br/>Evicted second"]
    BE["BestEffort<br/>No resources set<br/>Evicted first"]

    G --- B --- BE

    style G fill:#c8e6c9,stroke:#2e7d32,stroke-width:2px
    style B fill:#fff9c4,stroke:#f9a825,stroke-width:2px
    style BE fill:#ffcdd2,stroke:#c62828,stroke-width:2px
```

Requests vs Limits Axis

```mermaid
graph LR
    subgraph ResourceAxis["CPU / Memory Allocation"]
        direction LR
        ZERO["0"] -->|"← requests (scheduler guarantee)"| REQ["requests"]
        REQ -->|"← container can burst into this range"| LIM["limits"]
        LIM -->|"← throttled (CPU) or OOMKilled (memory)"| OVER["exceeded"]
    end

    style ZERO fill:#e3f2fd,stroke:#1565c0
    style REQ fill:#c8e6c9,stroke:#2e7d32
    style LIM fill:#fff9c4,stroke:#f9a825
    style OVER fill:#ffcdd2,stroke:#c62828
```

Practical Tasks

Task 1: Pod with Guaranteed QoS

When requests equal limits for both CPU and memory, the Pod gets the Guaranteed QoS class.

Create a file called pod-guaranteed.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-pod
  namespace: student-XX
  labels:
    app: guaranteed-pod
spec:
  containers:
    - name: kuard
      image: <ACR_NAME>.azurecr.io/kuard:1
      ports:
        - containerPort: 8080
      resources:
        requests:
          cpu: "100m"
          memory: "64Mi"
        limits:
          cpu: "100m"
          memory: "64Mi"

Deploy and verify the QoS class:

kubectl apply -f pod-guaranteed.yaml
kubectl get pod guaranteed-pod -n student-XX -o yaml | grep qosClass

Expected output:

  qosClass: Guaranteed

Task 2: Pod with Burstable QoS

When requests are lower than limits, the Pod gets the Burstable QoS class.

Create a file called pod-burstable.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: burstable-pod
  namespace: student-XX
  labels:
    app: burstable-pod
spec:
  containers:
    - name: kuard
      image: <ACR_NAME>.azurecr.io/kuard:1
      ports:
        - containerPort: 8080
      resources:
        requests:
          cpu: "50m"
          memory: "32Mi"
        limits:
          cpu: "200m"
          memory: "128Mi"

Deploy and verify:

kubectl apply -f pod-burstable.yaml
kubectl get pod burstable-pod -n student-XX -o yaml | grep qosClass

Expected output:

  qosClass: Burstable

Task 3: Pod with BestEffort QoS

When no resource requests or limits are specified, the Pod gets the BestEffort QoS class.

Create a file called pod-besteffort.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: besteffort-pod
  namespace: student-XX
  labels:
    app: besteffort-pod
spec:
  containers:
    - name: kuard
      image: <ACR_NAME>.azurecr.io/kuard:1
      ports:
        - containerPort: 8080

Deploy and verify:

kubectl apply -f pod-besteffort.yaml
kubectl get pod besteffort-pod -n student-XX -o yaml | grep qosClass

Expected output:

  qosClass: BestEffort

Task 4: OOM Kill Test

Deploy a Pod with a tight memory limit of 100Mi, then intentionally exceed it using the kuard Memory tab.

Create a file called pod-oomkill.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: oomkill-pod
  namespace: student-XX
  labels:
    app: oomkill-pod
spec:
  containers:
    - name: kuard
      image: <ACR_NAME>.azurecr.io/kuard:1
      ports:
        - containerPort: 8080
      resources:
        requests:
          cpu: "100m"
          memory: "100Mi"
        limits:
          cpu: "100m"
          memory: "100Mi"

Deploy and port-forward:

kubectl apply -f pod-oomkill.yaml
kubectl port-forward pod/oomkill-pod 8080:8080 -n student-XX

Open http://localhost:8080 in your browser and navigate to the Memory tab. Gradually increase memory allocation until the container exceeds its 100Mi limit.

After the container is killed, stop port-forward and check the Pod status:

kubectl describe pod oomkill-pod -n student-XX

Look for OOMKilled in the Last State section:

    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
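Exit code 137 is not arbitrary: by Unix convention, a process killed by a signal exits with 128 plus the signal number, and the kernel's OOM killer sends SIGKILL (signal 9). A one-line check:

```python
import signal

# Exit code convention for processes terminated by a signal: 128 + signal
# number. The kernel OOM killer sends SIGKILL (signal 9), hence 137.
oom_exit_code = 128 + signal.SIGKILL
print(oom_exit_code)  # 137
```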

Also observe the restart count:

kubectl get pod oomkill-pod -n student-XX
NAME          READY   STATUS    RESTARTS   AGE
oomkill-pod   1/1     Running   1          2m

Clean Up

kubectl delete pod guaranteed-pod burstable-pod besteffort-pod oomkill-pod -n student-XX

Common Problems

| Problem | Possible Cause | Solution |
| --- | --- | --- |
| Pod stuck in Pending | No node has enough free capacity for the requested resources | Reduce requests or check node capacity with kubectl describe node |
| OOMKilled restarts | Container memory usage exceeds the limit | Increase the memory limit or optimize application memory usage |
| CPU throttling (slow app) | CPU limit is too low | Increase the CPU limit or remove it (keep only requests) |
| Pod evicted | Node under memory pressure; BestEffort Pods are evicted first | Set resource requests to avoid BestEffort QoS |

Best Practices

  1. Always set resource requests — Without requests, the scheduler cannot make informed placement decisions, and Pods get BestEffort QoS.
  2. Set memory limits equal to requests — This ensures Guaranteed QoS and prevents OOM kills from affecting other workloads on the node.
  3. Be cautious with CPU limits — CPU throttling can cause latency spikes. Many teams set CPU requests but omit CPU limits.
  4. Start small and tune — Begin with conservative requests and use monitoring data to right-size resources over time.
  5. Use LimitRange and ResourceQuota in shared clusters — These cluster-level policies enforce defaults and maximums, preventing any single team from consuming all resources.

AKS Note

In shared AKS clusters, administrators typically configure ResourceQuota and LimitRange objects per namespace to enforce fair usage. A LimitRange can set default requests and limits so that Pods without explicit resource definitions still get reasonable defaults instead of BestEffort QoS.
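The effect of LimitRange defaulting can be sketched as a simple merge. This is illustrative only — the real defaulting is performed by the API server's LimitRanger admission plugin at Pod creation time, and the default values below are hypothetical:

```python
# Illustrative only: how a LimitRange's defaultRequest/default values would
# be merged into a container that specifies no resources, lifting it out of
# BestEffort QoS. The actual defaulting happens server-side at admission.

LIMIT_RANGE = {                      # hypothetical namespace defaults
    "defaultRequest": {"cpu": "100m", "memory": "64Mi"},
    "default":        {"cpu": "500m", "memory": "256Mi"},
}

def apply_defaults(container):
    resources = container.setdefault("resources", {})
    resources.setdefault("requests", dict(LIMIT_RANGE["defaultRequest"]))
    resources.setdefault("limits", dict(LIMIT_RANGE["default"]))
    return container

pod_container = {"name": "kuard"}    # no resources -> would be BestEffort
apply_defaults(pod_container)
print(pod_container["resources"]["requests"]["cpu"])   # 100m
print(pod_container["resources"]["limits"]["memory"])  # 256Mi
```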


Summary

In this exercise you learned:

  • The difference between resource requests (scheduling guarantee) and limits (runtime enforcement)
  • How CPU is measured in millicores and memory in Mi/Gi
  • The three QoS classes: Guaranteed, Burstable, and BestEffort
  • How to check the QoS class of a Pod using kubectl get pod -o yaml | grep qosClass
  • What happens when a container exceeds its memory limit (OOMKilled)
  • Why setting appropriate resource requests and limits is critical in shared clusters
