Linux Cgroups
Why Cgroups Exist
Linux Namespaces provide environmental isolation for containers and processes—similar to how chroot confines a user to a specific directory tree. However, as discussed in namespace documentation, certain resources remain globally accessible and largely unrestricted: memory, CPU cycles, disk I/O, and more. A process running within an isolated namespace can still consume system resources that affect other processes outside its boundary.
Cgroups solve this problem by implementing a resource isolation mechanism directly in the Linux kernel, exposed through the filesystem at /sys/fs/cgroup. This subsystem limits, controls, and accounts for resource usage by groups of processes.
The mechanism works by first defining resource constraints on the system—for example, limiting a group to 20% CPU utilization. Then, specific processes are assigned to the control group, and the constraints immediately apply to all members.
Cgroups enable fine-grained control over system resources at the group level, making them essential for container runtimes and system isolation.
Cgroup Filesystem Structure
On a cgroup v1 system, which the examples in this article assume, the cgroup filesystem exposes each resource controller as its own directory:
$ ls -la /sys/fs/cgroup/
drwxr-xr-x 15 root root 380 Oct 24 14:35 ./
drwxr-xr-x 8 root root 0 Oct 24 19:33 ..
drwxr-xr-x 4 root root 0 Oct 24 19:33 blkio/
lrwxrwxrwx 1 root root 11 Oct 24 14:35 cpu -> cpu,cpuacct/
lrwxrwxrwx 1 root root 11 Oct 24 14:35 cpuacct -> cpu,cpuacct/
drwxr-xr-x 0 root root 0 Oct 24 19:39 cpu,cpuacct/
drwxr-xr-x 2 root root 0 Oct 24 19:33 cpuset/
drwxr-xr-x root root 0 Oct 24 19:33 devices/
drwxr-xr-x 2 root root 0 Oct 24 19:33 freezer/
drwxr-xr-x 2 root root 0 Oct 24 19:33 hugetlb/
drwxr-xr-x root root 0 Oct 24 19:33 memory/
lrwxrwxrwx 1 root root 16 Oct 24 14:35 net_cls -> net_cls,net_prio/
drwxr-xr-x 2 root root 0 Oct 24 19:33 net_cls,net_prio/
lrwxrwxrwx 1 root root 16 Oct 24 14:35 net_prio -> net_cls,net_prio/
drwxr-xr-x 2 root root 0 Oct 24 19:33 perf_event/
drwxr-xr-x 0 root root 0 Oct 24 19:33 pids/
drwxr-xr-x 2 root root 0 Oct 24 19:33 rdma/
drwxr-xr-x 5 root root 0 Oct 24 19:33 systemd/
drwxr-xr-x 5 root root 0 Oct 24 19:33 unified/
Key controllers include:
- cpu: CPU time allocation
- cpuacct: CPU accounting reports
- memory: Memory limits and reporting
- blkio: Block I/O throttling
- cpuset: CPU and memory node assignment
- pids: Process count limits
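Which of these controllers a given kernel actually provides can be read from /proc/cgroups. The following is a minimal sketch rather than a definitive tool: the helper name enabled_controllers is made up for illustration, and the sample rows in the demo are invented, not taken from a real machine.

```shell
# enabled_controllers (hypothetical helper)
# Reads text in the format of /proc/cgroups on stdin:
#   #subsys_name  hierarchy  num_cgroups  enabled
# and prints the name of every controller whose "enabled" column is 1.
enabled_controllers() {
    awk '!/^#/ && $4 == 1 { print $1 }'
}

# Demo on invented sample rows (tab-separated, like the real file).
printf '#subsys_name\thierarchy\tnum_cgroups\tenabled\ncpu\t3\t1\t1\nmemory\t5\t1\t1\nrdma\t0\t1\t0\n' \
    | enabled_controllers
```

On a live system the same helper can be run read-only, without root, as `enabled_controllers < /proc/cgroups`.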
Practical Example: CPU Throttling
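Consider a CPU-bound program that spins in an endless loop:
/* Busy loop: keeps one CPU core fully occupied until the process is killed. */
int main(void) {
    /* volatile stops the compiler from optimizing the loop away */
    volatile unsigned long counter = 0;
    while (1) {
        counter++;
    }
    return 0; /* never reached */
}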
Running this program drives one CPU core to nearly 100% utilization, as top shows. To constrain this behavior:
Step 1: Create a control group with CPU limits
As root, create a new directory under /sys/fs/cgroup/cpu to define a custom group:
mkdir -p /sys/fs/cgroup/cpu/limited_group
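Set the CPU quota to 50% of one CPU. The quota is the CPU time the group may consume in each enforcement period; with the default period of 100000 microseconds (100 ms), a quota of 50000 microseconds equals 50%: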
echo 50000 > /sys/fs/cgroup/cpu/limited_group/cpu.cfs_quota_us
echo 100000 > /sys/fs/cgroup/cpu/limited_group/cpu.cfs_period_us
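The relationship between a target percentage, the period, and the quota is plain arithmetic: quota = percent × period / 100. A small sketch of that calculation follows; quota_for_percent is a hypothetical helper name, and the 100000 microsecond default is assumed to match the period written above.

```shell
# quota_for_percent PERCENT [PERIOD_US]   (hypothetical helper)
# Prints the value to write to cpu.cfs_quota_us so that the group gets
# PERCENT of one CPU per period. PERCENT may exceed 100 to allow the
# group more than one CPU's worth of time.
quota_for_percent() {
    echo $(( $1 * ${2:-100000} / 100 ))
}

quota_for_percent 50           # prints 50000
quota_for_percent 20           # prints 20000
quota_for_percent 200 100000   # prints 200000
```

The first call reproduces the 50000 used in this step; the second corresponds to a 20% limit.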
Step 2: Assign processes to the control group
Add a shell session to the group:
echo $$ > /sys/fs/cgroup/cpu/limited_group/tasks
Every command started from that shell afterwards inherits the group membership, so the shell and its children together are held to the 50% CPU limit.
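One way to confirm that the limit is being enforced is to read the group's cpu.stat file, which in cgroup v1 reports the number of elapsed periods and how many of them ended with the group throttled. The sketch below assumes that file layout; throttled_percent is a hypothetical helper, and the numbers in the demo are invented.

```shell
# throttled_percent FILE   (hypothetical helper)
# FILE has the layout of a cgroup v1 cpu.stat file:
#   nr_periods      enforcement periods that have elapsed
#   nr_throttled    periods in which the group used up its quota
#   throttled_time  total time spent throttled, in nanoseconds
# Prints the share of periods in which the group was throttled, in percent.
throttled_percent() {
    awk '$1 == "nr_periods"   { periods = $2 }
         $1 == "nr_throttled" { throttled = $2 }
         END {
             if (periods > 0) { printf "%d\n", throttled * 100 / periods } else { print 0 }
         }' "$1"
}

# Demo with invented numbers; on a live system pass
# /sys/fs/cgroup/cpu/limited_group/cpu.stat instead.
sample=$(mktemp)
printf 'nr_periods 200\nnr_throttled 150\nthrottled_time 7500000000\n' > "$sample"
throttled_percent "$sample"   # prints 75
rm -f "$sample"
```

A value well above zero while the busy loop runs suggests the quota is actively holding the group back.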
Memory Limitation
The memory controller provides similar functionality for memory constraints. Create a group and set limits:
mkdir -p /sys/fs/cgroup/memory/mem_limited_group
echo 256M > /sys/fs/cgroup/memory/mem_limited_group/memory.limit_in_bytes
echo $$ > /sys/fs/cgroup/memory/mem_limited_group/tasks
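The combined memory charged to the processes in this group is capped at 256 MiB (268435456 bytes). When the group reaches the limit, the kernel first tries to reclaim memory from it and, failing that, invokes the OOM killer on a process inside the group.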
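The kernel accepts the K, M, and G suffixes as powers of 1024, but reading memory.limit_in_bytes back returns a plain byte count. A small conversion sketch makes it easier to compare a configured limit with values such as memory.usage_in_bytes; to_bytes is a hypothetical helper name, not part of any cgroup tooling.

```shell
# to_bytes SIZE   (hypothetical helper)
# Converts a size such as 512K, 256M or 1G into bytes, using the same
# power-of-1024 multipliers that the memory controller applies.
# A value without a suffix is returned unchanged.
to_bytes() {
    case $1 in
        *[Kk]) echo $(( ${1%?} * 1024 )) ;;
        *[Mm]) echo $(( ${1%?} * 1024 * 1024 )) ;;
        *[Gg]) echo $(( ${1%?} * 1024 * 1024 * 1024 )) ;;
        *)     echo "$1" ;;
    esac
}

to_bytes 256M   # prints 268435456
to_bytes 1G     # prints 1073741824
```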
Hierarchical Design
Cgroups maintain a hierarchical structure: parent groups can contain child groups, and the limits set on a parent also bound all of its descendants. This enables the nested isolation patterns common in container orchestration systems.
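In practice this means the tightest limit among a group and its ancestors is the one that matters. The sketch below illustrates the idea for the memory controller by walking up a directory tree and taking the smallest memory.limit_in_bytes it finds. It assumes hierarchical accounting is in effect (memory.use_hierarchy in cgroup v1); effective_mem_limit is a hypothetical helper, and the demo runs on a throwaway directory tree rather than a real cgroup mount.

```shell
# effective_mem_limit GROUP_DIR ROOT_DIR   (hypothetical helper)
# Walks from GROUP_DIR up to ROOT_DIR and prints the smallest
# memory.limit_in_bytes value found along the way.
# The body runs in a subshell, so its variables stay local.
effective_mem_limit() (
    dir=$1
    root=$2
    min=
    while :; do
        f=$dir/memory.limit_in_bytes
        if [ -r "$f" ]; then
            v=$(cat "$f")
            if [ -z "$min" ] || [ "$v" -lt "$min" ]; then
                min=$v
            fi
        fi
        # Stop at the hierarchy root (or at / or . as a safety net).
        case $dir in
            "$root"|/|.) break ;;
        esac
        dir=$(dirname "$dir")
    done
    echo "$min"
)

# Demo on a temporary tree with invented limits:
# the parent allows 512 MiB, the child asks for 1 GiB.
demo=$(mktemp -d)
mkdir -p "$demo/parent/child"
echo 536870912  > "$demo/parent/memory.limit_in_bytes"
echo 1073741824 > "$demo/parent/child/memory.limit_in_bytes"
effective_mem_limit "$demo/parent/child" "$demo"   # prints 536870912
rm -rf "$demo"
```

The child's own setting is larger, but the parent's 512 MiB limit is the one that applies.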
Control Group Lifecycle
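When a control group directory is created, the kernel populates it with the controller's interface files, and the new group stays subject to its parent's limits. To remove a group, make sure it contains no processes and no child groups (rmdir fails with "Device or resource busy" otherwise), then remove the directory: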
rmdir /sys/fs/cgroup/cpu/limited_group
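Before calling rmdir it can help to check whether the group is really empty. The following sketch tests the two conditions that keep a group busy: member processes and child groups. cgroup_is_removable is a hypothetical helper, and the demo uses a temporary directory with an invented PID instead of a real cgroup mount.

```shell
# cgroup_is_removable GROUP_DIR   (hypothetical helper)
# Succeeds (exit status 0) when GROUP_DIR lists no processes in
# cgroup.procs and has no child directories, fails otherwise.
# The body runs in a subshell, so "exit" only ends the function.
cgroup_is_removable() (
    group=$1
    # cgroupfs files report a size of 0, so read the content
    # instead of testing the file size.
    if [ -n "$(cat "$group/cgroup.procs" 2>/dev/null)" ]; then
        exit 1
    fi
    for child in "$group"/*/; do
        if [ -d "$child" ]; then
            exit 1
        fi
    done
    exit 0
)

# Demo on a temporary directory; 4242 is an invented PID.
demo=$(mktemp -d)
mkdir "$demo/limited_group"
echo 4242 > "$demo/limited_group/cgroup.procs"
if cgroup_is_removable "$demo/limited_group"; then echo removable; else echo busy; fi   # prints busy
: > "$demo/limited_group/cgroup.procs"
if cgroup_is_removable "$demo/limited_group"; then echo removable; else echo busy; fi   # prints removable
rm -rf "$demo"
```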
Key Files in Each Controller
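Besides its controller-specific files, every group directory exposes a common set of files for membership and cleanup:
- tasks: Thread IDs (TIDs) of the tasks in the group
- cgroup.procs: Thread group IDs (PIDs) of the processes in the group
- notify_on_release: Flag (0 or 1); when set, the kernel runs the release agent once the group becomes empty
- release_agent: Path of the executable to run for cleanup (present only in the root group of a hierarchy)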
Because these interfaces are ordinary files, resource constraints can be managed with plain reads and writes from any shell or language, without a dedicated system call API.
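The reverse lookup, finding which group a process currently belongs to, works through a file as well: each line of /proc/<pid>/cgroup has the form hierarchy-ID:controller-list:path. The sketch below extracts the path for one controller; cgroup_path_for is a hypothetical helper, and the sample lines are invented to match the group names used in this article.

```shell
# cgroup_path_for CONTROLLER   (hypothetical helper)
# Reads text in the format of /proc/<pid>/cgroup on stdin:
#   hierarchy-ID:controller-list:path
# and prints the path of the hierarchy that CONTROLLER is attached to.
cgroup_path_for() {
    awk -F: -v want="$1" '{
        n = split($2, names, ",")
        for (i = 1; i <= n; i++) {
            if (names[i] == want) {
                # Print everything after the second colon, so a path
                # that itself contains a colon is not cut short.
                print substr($0, length($1) + length($2) + 3)
                exit
            }
        }
    }'
}

# Demo on invented sample lines.
printf '4:cpu,cpuacct:/limited_group\n9:memory:/mem_limited_group\n' \
    | cgroup_path_for cpu   # prints /limited_group
```

For the current shell on a live system, the equivalent call is `cgroup_path_for cpu < /proc/self/cgroup`.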