I extracted a few points to get a rough idea of how Kubernetes manages resource QoS:
- Pods that need to stay up reliably can request guaranteed resources, while pods with less stringent requirements can use resources with weaker or no guarantee.
- If a pod is successfully scheduled, its containers are guaranteed the amount of resources they requested. Scheduling is based on requests, not limits. A pod and its containers are not allowed to exceed the specified limits.
- CPU is a compressible resource. Pods are throttled if they exceed their CPU limit. If no CPU limit is specified, pods can use excess CPU when it is available.
- Memory is an incompressible resource. Pods are guaranteed the amount of memory they request. A pod that exceeds its memory request may be killed if some other pod needs the memory, whereas a pod that consumes less memory than it requested will not be killed (except when system tasks or daemons need more memory). When a pod uses more memory than its limit, the kernel kills the process that is using the most memory inside one of the pod’s containers.
- There are three QoS classes: Guaranteed, Burstable, and Best-Effort, in decreasing order of priority. Under memory pressure, processes with higher OOM scores are killed first.
- If limits (and optionally requests) are set to non-zero values for all resources across all containers, and every request that is set equals its limit (request == limit), then the pod is classified as Guaranteed.
- If requests (and optionally limits) are set to non-zero values for one or more resources across one or more containers, and the pod does not meet the Guaranteed criteria (for example, request != limit), then the pod is classified as Burstable. When limits are not specified, they default to the node capacity.
- If no requests or limits are set for any resource in any container, then the pod is classified as Best-Effort.
- The current QoS policy assumes that swap is disabled. If swap is enabled, then resource guarantees (for pods that specify resource requirements) will not hold.
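
To make the three classification rules above concrete, here is a minimal sketch in Python. It is not the kubelet's actual code: the function name `classify_qos` and the dict layout (`requests` / `limits` keyed by `cpu` and `memory`) are assumptions made for illustration. It also simplifies by comparing quantities as raw strings, so `"500m"` and `"0.5"` would be treated as different values, and it does not special-case zero values.

```python
# Hypothetical helper, for illustration only (not a Kubernetes API).
# Only cpu and memory are considered when deciding the class.
QOS_RESOURCES = ("cpu", "memory")


def classify_qos(containers):
    """Return 'Guaranteed', 'Burstable', or 'BestEffort'.

    `containers` is a list of dicts that may contain 'requests' and
    'limits' dicts, e.g. {"requests": {"cpu": "500m"}, "limits": {...}}.
    """
    any_set = False         # at least one request or limit exists somewhere
    all_guaranteed = True   # every container satisfies the Guaranteed rule

    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        for resource in QOS_RESOURCES:
            req = requests.get(resource)
            lim = limits.get(resource)
            if req is not None or lim is not None:
                any_set = True
            # Guaranteed needs a limit for every resource; a request is
            # optional, but if present it must equal the limit.
            if lim is None or (req is not None and req != lim):
                all_guaranteed = False

    if not any_set:
        return "BestEffort"
    if all_guaranteed:
        return "Guaranteed"
    return "Burstable"


if __name__ == "__main__":
    same = {"cpu": "500m", "memory": "1Gi"}
    print(classify_qos([{"requests": same, "limits": same}]))  # Guaranteed
    print(classify_qos([{"requests": {"memory": "256Mi"}}]))   # Burstable
    print(classify_qos([{}]))                                  # BestEffort
```

Note that a single container without limits is enough to move an otherwise Guaranteed pod into the Burstable class, because the Guaranteed rule must hold for all containers.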