Kubernetes Resource Management: A Practical Guide

One of the most common issues I see in Kubernetes deployments is improper resource management. Teams either over-provision (wasting cluster capacity) or under-provision (causing OOM kills and CPU throttling). Let’s dive into how to get this right.

Understanding Requests vs Limits

First, let’s clarify the difference:

  • Requests: The guaranteed amount of resources your pod will receive. The scheduler uses this to decide where to place your pod.
  • Limits: The maximum amount of resources your pod can use. Hitting the memory limit gets the container OOM-killed; hitting the CPU limit causes throttling.

For example:

resources:
  requests:
    memory: "256Mi"
    cpu: "250m"
  limits:
    memory: "512Mi"
    cpu: "500m"

Common Mistakes

1. Setting Limits Equal to Requests

While this seems safe, it can cause problems:

  • No headroom for absorbing traffic bursts
  • Increased CPU throttling under load
  • Higher chance of OOM kills

A better approach is to set limits at 1.5-2x your requests (as in the example above, where limits are 2x requests), allowing reasonable bursts while still containing runaway processes.

2. Not Setting Any Resources

If you set neither requests nor limits, your pods fall into the BestEffort QoS class and are the first to be evicted under node memory pressure. Always set at least requests.
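
A requests-only spec is enough to keep a pod out of BestEffort (the values here are illustrative):

resources:
  requests:
    memory: "256Mi"
    cpu: "100m"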

3. Copying Resources from Stack Overflow

Every application is different. What works for a sample nginx deployment won’t work for your Java application with specific heap requirements.
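
For instance, a containerized JVM usually needs its heap sized relative to the container’s memory limit. A hedged sketch (the image name and values are assumptions):

containers:
  - name: app
    image: my-java-app:latest   # hypothetical image
    resources:
      requests:
        memory: "1Gi"
      limits:
        memory: "1536Mi"
    env:
      - name: JAVA_TOOL_OPTIONS
        # Cap the heap at ~75% of the container limit, leaving headroom for
        # metaspace, threads, and off-heap buffers (flag available since
        # JDK 10 and 8u191)
        value: "-XX:MaxRAMPercentage=75.0"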

Measuring Actual Usage

Before setting resources, measure what your application actually needs:

# Check current resource usage (requires the metrics-server add-on)
kubectl top pods -n your-namespace

# Get historical metrics (if you have Prometheus)
sum(container_memory_working_set_bytes{namespace="your-namespace"}) by (pod)

Tools like Vertical Pod Autoscaler (VPA) in recommendation mode can help you understand what your applications actually need.
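
A minimal sketch of a VPA in recommendation-only mode (the name and target are illustrative, and this assumes the VPA components are installed in your cluster):

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa              # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app                # hypothetical workload
  updatePolicy:
    updateMode: "Off"           # recommend only; never evict or resize pods

The suggested values then appear under kubectl describe vpa my-app-vpa.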

My Recommendations

  1. Start with measurement: Run your application under realistic load and measure actual usage
  2. Set requests to P95 usage: Your requests should cover 95% of normal operation (see the query sketch after this list)
  3. Set memory limits carefully: Memory limits should be higher than requests, but not unlimited
  4. Consider not setting CPU limits: CPU throttling can cause latency spikes; some teams skip CPU limits entirely and rely on requests for scheduling
  5. Use LimitRanges: Set namespace-level defaults so nothing runs without resources (a minimal example follows below)
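
For recommendation 2, here is a hedged sketch of that measurement in PromQL (the namespace and the 7-day window are assumptions; adjust for your environment):

# P95 of per-pod memory working set over the last 7 days
quantile_over_time(0.95,
  container_memory_working_set_bytes{namespace="your-namespace"}[7d])

# P95 of per-pod CPU usage (in cores), using a subquery over 5m rates
quantile_over_time(0.95,
  rate(container_cpu_usage_seconds_total{namespace="your-namespace"}[5m])[7d:5m])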
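
And for recommendation 5, a minimal LimitRange sketch (names and values are illustrative):

apiVersion: v1
kind: LimitRange
metadata:
  name: default-resources      # hypothetical name
  namespace: your-namespace
spec:
  limits:
    - type: Container
      defaultRequest:          # injected when a container sets no requests
        cpu: "100m"
        memory: "256Mi"
      default:                 # injected when a container sets no limits
        memory: "512Mi"        # no CPU default, in line with recommendation 4

Any container deployed to the namespace without its own values picks these defaults up automatically.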

Conclusion

Resource management is crucial for both cost optimization and application reliability. Take the time to measure, set appropriate values, and monitor. Your cluster (and your wallet) will thank you.

In future posts, I’ll cover more advanced topics like Quality of Service classes, Priority Classes, and implementing Vertical Pod Autoscaler effectively.