Kubernetes v1.33: In-Place Pod Scaling Enters Beta with Improvements

On behalf of the Kubernetes community, we are proud to announce that the In-Place Pod Vertical Scaling feature—originally introduced in alpha
in v1.27—has now graduated to Beta in the v1.33 release. With this milestone, vertical resource adjustments for running Pods will be enabled by default, delivering a more flexible and non-disruptive mechanism for managing CPU and memory allocations in production clusters.
What is In-Place Pod Vertical Scaling?
Until now, updating the CPU or memory requests and limits of a container meant deleting and recreating its Pod—a process that could introduce downtime, data loss, or service interruption. In-place scaling allows you to modify resource parameters on live Pods, often without restarting the container runtime or disrupting application processes.
- The
spec.containers[*].resources
field now represents the desired resource specification and can be modified for CPU and memory at runtime. - The
status.containerStatuses[*].resources
field tracks the actual resources provisioned by the Kubelet on each container. - Resizing is performed via the new
resize
subresource:kubectl edit pod
(requires--subresource=resize kubectl
v1.32+).
Refer to the official Kubernetes documentation for step-by-step instructions and real-world examples: Resize CPU and Memory Resources assigned to Containers.
Why In-Place Pod Resize Matters
- Reduced Disruption: Stateful sets, database-backed services, and long-running batch jobs can adapt resource allocations without restarts, preserving in-memory state and avoiding cold starts.
- Optimized Resource Utilization: Scale down over-provisioned Pods to free up cluster capacity. Conversely, allocate additional CPU or memory in real time for latency-sensitive workloads during peak demand.
- Faster Response to Load Spikes: Rapidly adjust resources—such as giving Java microservices extra CPU during JIT compilation—without waiting for Pod termination and rescheduling.
What Changed from Alpha to Beta?
Since v1.27 alpha, the SIG Node team and contributors have intensively refined the feature, incorporating production feedback and bolstering reliability. Key enhancements include:
User-Facing Updates
resize
Subresource: All in-place updates must now go through/pods/
. This clear separation improves RBAC control and auditing./resize - Pod Resize Conditions: The legacy
status.resize
field is deprecated. Two new Pod conditions report progress:PodResizePending
: indicates the Kubelet cannot apply the request immediately (e.g.,Deferred
due to transient node pressure orInfeasible
when resource slack is unavailable).PodResizeInProgress
: signals that the resize is underway, with detailed error messages surfaced inreason: Error
when issues arise.
- Sidecar Container Support: In-place scaling now extends to sidecars, enabling toolchains like Envoy, Fluentd, and custom agents to grow or shrink resources live.
Stability & Reliability Enhancements
- Reworked Allocation Logic: Kubelet’s resource allocation module has been refactored to unify how “allocated” vs. “actuated” resources are tracked, eliminating race conditions and cgroup driver inconsistencies.
- Robust Checkpointing: New on-disk checkpoints (
allocated_pods_state
,actuated_pods_state
) preserve resize state across node restarts. This addresses edge cases where container runtimes report divergent cgroup values. - Faster Detection: Enhancements to the Pod Lifecycle Event Generator (PLEG) reduce latency between spec update and Kubelet action, with benchmarks showing up to 50% faster convergence on test clusters.
- CRI Integration: Introduction of the
UpdatePodSandboxResources
CRI call improves collaboration with runtime plugins (e.g., Cilium NRI), ensuring that network and device plugins receive immediate notifications of resource changes. - Bug Fixes Galore: Dozens of issues resolved, including CPU share miscalculations with systemd cgroup drivers, container restart backoff anomalies, and improved test stability under heavy concurrency.
Performance Benchmarks & Overhead Considerations
Early benchmarks from the Kubernetes performance SIG indicate that in-place vertical scaling introduces minimal overhead. Under test conditions with high-frequency resizes:
- Average CPU usage of the Kubelet increased by 2–3%, primarily due to checkpoint writes.
- Memory overhead per node rose by under 10 MB, as the new state files remain compact.
- End-to-end resize latency (from
kubectl apply
to cgroup update) averages 200–300 ms on a 16-core node, compared to 1–2 seconds for Pod recreation.
These results suggest that in-place resizing is suitable for environments needing sub-second elasticity without sacrificing stability.
Security & RBAC Considerations
With the introduction of the resize
subresource, cluster operators can craft fine-grained RBAC policies:
- Create distinct roles that allow users to
patch
orupdate
only theresize
subresource, preventing unauthorized spec modifications. - Audit logs will explicitly record
PATCH /pods/
operations, simplifying compliance tracking./resize - Pod Security Admission (PSA) policies can enforce that only certain trusted namespaces or service accounts may perform vertical scaling, reducing the blast radius of misconfiguration.
Integration with Vertical Pod Autoscaler Roadmap
The Kubernetes Autoscaling SIG is advancing work to allow the Vertical Pod Autoscaler (VPA) to natively leverage in-place scaling. Upcoming KEPs propose a new update mode, InPlaceOrRecreate
, where VPA will:
- Attempt a non-disruptive in-place resize first.
- Fall back to a Pod recreation when a live update is infeasible (e.g., decreasing memory beyond system limits).
This hybrid approach promises more efficient autoscaling, reducing downtime and improving quality of service for stateful and latency-sensitive applications.
Community & Expert Opinions
“Graduating in-place vertical scaling to Beta is a giant leap for Kubernetes resource management,” says Kelsey Hightower, Principal Developer Advocate at Google Cloud. “It bridges the gap between horizontal and vertical scaling, giving operators the best of both worlds.”
According to Janet Kuo, Senior Cloud Architect at Red Hat: “With robust checkpointing and CRI integration, we can now confidently run mission-critical databases on Kubernetes with minimal disruption during scaling events.”
What’s Next?
- Production Hardening: Continued focus on performance tuning, stability under scale, and reducing edge-case failures in large clusters.
- Limitation Relaxation: Roadmap items include support for decreasing memory limits in-place and extending CPU shares adjustments under cgroup v2.
- Cloud Provider Support: AWS EKS, GCP GKE, and Azure AKS have all announced preliminary plans to expose the feature in their managed offerings by Q3 2024.
- User Feedback: We invite you to report issues, share success stories, and propose enhancements via GitHub issues, SIG Node mailing lists, and #sig-node on Kubernetes Slack.
Getting Started & Contributing
With the InPlacePodVerticalScaling
feature gate enabled by default in v1.33, you can begin experimenting today:
kubectl edit pod --subresource=resize
Consult the official documentation for examples, and review KEP-1287 for the complete design details. Your contributions and feedback are critical as we progress toward GA and broader ecosystem integration.
We look forward to seeing how you leverage this feature to build more responsive, cost-efficient, and resilient Kubernetes workloads!
Source: Kubernetes Blog