Paging Mechanism in Protected Mode

Paging Mechanism

Programs are compiled and linked against a contiguous range of addresses. Segmentation turns each of these into a linear address, and in a segmentation-only model the CPU uses that linear address directly as the physical address. This traditional approach has significant limitations:

  • Segmentation requires each segment's memory to be contiguous. When a large block is allocated, fragmentation may prevent the system from finding a sufficiently large contiguous region. As programs allocate and free memory repeatedly during execution, memory becomes increasingly fragmented.
  • Segmentation cannot effectively achieve complete isolation between processes. Different processes share the same memory model, and misconfigured segment descriptors may lead to memory conflicts or data leaks between processes.

Paging manages memory by dividing it into fixed-size pages, where the pages of a program may sit at non-contiguous positions in physical memory. The mechanism is implemented directly in CPU hardware. Once paging is enabled, the CPU automatically translates every linear address issued by the running code into a physical address using page tables.

Single-Level Page Table

With the standard page size of 4KB, the 4GB linear address space divides into 2^20 (about 1 million) pages. A 32-bit linear address splits into two parts: the upper 20 bits serve as the page table index, while the lower 12 bits represent the offset within a page.
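The 20/12 split described above can be sketched in C. This is only an illustration of the arithmetic; the helper names `page_number` and `page_offset` are made up for this example and do not come from any real kernel.

```c
#include <stdint.h>

#define PAGE_SHIFT       12u      /* 4KB pages: 2^12 bytes per page */
#define PAGE_OFFSET_MASK 0xFFFu   /* low 12 bits of a linear address */

/* Upper 20 bits: index into the (hypothetical) single-level page table. */
static inline uint32_t page_number(uint32_t linear) {
    return linear >> PAGE_SHIFT;
}

/* Lower 12 bits: byte offset inside the 4KB page. */
static inline uint32_t page_offset(uint32_t linear) {
    return linear & PAGE_OFFSET_MASK;
}
```

For the linear address 0xC0001234, `page_number` yields 0xC0001 and `page_offset` yields 0x234.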

Two-Level Page Table

Each page table entry occupies 4 bytes, so a single-level page table covering the complete 4GB linear address space needs 4MB (2^20 entries of 4 bytes each), and it must be allocated in full even if only a few pages are in use. Since each process needs its own independent address space, the memory consumption of single-level page tables becomes impractical.

Classic 32-bit x86 paging (without PAE) uses a two-level scheme. The 2^20 pages are distributed evenly across 1024 page tables, each containing 1024 entries of 4 bytes. Every second-level page table therefore occupies exactly 4KB, a single page. A single-level table must be allocated in full up front, whereas the two-level scheme only requires the page directory to exist initially. Second-level page tables are created on demand, which significantly reduces memory overhead.
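To make the saving concrete, here is a small C sketch of the arithmetic. The function names are hypothetical, and the model assumes the two-level scheme allocates one 4KB page directory plus one 4KB page table per 4MB region actually in use.

```c
#include <stdint.h>

#define ENTRY_SIZE        4u          /* bytes per PDE or PTE */
#define ENTRIES_PER_TABLE 1024u       /* entries per directory or table */
#define TOTAL_PAGES       (1u << 20)  /* 4GB / 4KB */

/* Single-level: one flat table covering every page, always fully allocated. */
static inline uint32_t single_level_bytes(void) {
    return TOTAL_PAGES * ENTRY_SIZE;
}

/* Two-level: the page directory plus only the page tables that are in use. */
static inline uint32_t two_level_bytes(uint32_t tables_in_use) {
    return (1u + tables_in_use) * ENTRIES_PER_TABLE * ENTRY_SIZE;
}
```

A process touching a single 4MB region needs 8KB of paging structures instead of 4MB. Note that a fully populated two-level layout (1024 tables) costs 4MB plus 4KB, slightly more than the flat table, so the benefit comes entirely from sparse address spaces.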

Traditional x86 two-level paging divides a 32-bit linear address as follows:

  • Upper 10 bits: Index into the page directory to locate a page directory entry (PDE). The PDE contains the physical address of a page table.
  • Middle 10 bits: Index into a specific page table to locate a page table entry (PTE).
  • Lower 12 bits: Offset within the target page.
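The three fields listed above can be extracted with shifts and masks. The following C helpers are an illustrative sketch; their names are invented for this example.

```c
#include <stdint.h>

/* Upper 10 bits (31..22): index into the page directory. */
static inline uint32_t pde_index(uint32_t linear) {
    return linear >> 22;
}

/* Middle 10 bits (21..12): index into the selected page table. */
static inline uint32_t pte_index(uint32_t linear) {
    return (linear >> 12) & 0x3FFu;
}

/* Lower 12 bits (11..0): offset within the 4KB page. */
static inline uint32_t page_off(uint32_t linear) {
    return linear & 0xFFFu;
}
```

For example, 0xC0000000 selects page directory entry 768, which is why a kernel mapped at 3GB starts at that entry.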

Since both PDEs and PTEs are 4 bytes in length, accessing a linear address involves these steps:

  1. Multiply the upper 10 bits of the linear address by 4 and add the page directory's physical address (held in CR3) to obtain the PDE's physical address. Read the PDE to retrieve the page table's physical address.
  2. Multiply the middle 10 bits of the linear address by 4 and add the page table's physical address to obtain the PTE's physical address. Read the PTE; its upper 20 bits give the physical base address of the target page.
  3. Add the page's physical base address to the lower 12 bits of the linear address to get the final physical address for the memory access.
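The three steps above can be modelled in software. The sketch below is not how the hardware is implemented; it simply mirrors the lookup over a byte array that stands in for physical memory. `phys_read32` and `translate` are hypothetical names, 4MB pages are ignored, and entries are read in host byte order (little-endian on x86).

```c
#include <stdint.h>
#include <string.h>

#define PG_PRESENT 0x1u
#define FRAME_MASK 0xFFFFF000u   /* bits 12-31: physical base address */

/* Read a 32-bit entry from the simulated physical memory. */
static uint32_t phys_read32(const uint8_t *ram, uint32_t paddr) {
    uint32_t value;
    memcpy(&value, ram + paddr, sizeof value);
    return value;
}

/* Returns 0 and stores the physical address in *out on success,
 * or -1 if the PDE or PTE is not present (a page fault on real hardware). */
static int translate(const uint8_t *ram, uint32_t cr3,
                     uint32_t linear, uint32_t *out) {
    /* Step 1: locate and read the PDE. */
    uint32_t pde = phys_read32(ram, (cr3 & FRAME_MASK) + (linear >> 22) * 4u);
    if (!(pde & PG_PRESENT))
        return -1;

    /* Step 2: locate and read the PTE. */
    uint32_t pte = phys_read32(ram, (pde & FRAME_MASK)
                                    + ((linear >> 12) & 0x3FFu) * 4u);
    if (!(pte & PG_PRESENT))
        return -1;

    /* Step 3: page base plus offset. */
    *out = (pte & FRAME_MASK) + (linear & 0xFFFu);
    return 0;
}
```

The real CPU additionally checks permissions, sets the Accessed and Dirty bits, and caches results in the TLB; none of that is modelled here.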

PTEs and PDEs share almost the same 32-bit layout:

| Bit | Name | Description | Values |
|---|---|---|---|
| 0 | P (Present) | Whether the page or page table exists in physical memory | 0: Not present, 1: Present |
| 1 | R/W (Read/Write) | Read/write permissions for the page | 0: Read-only, 1: Read/Write |
| 2 | U/S (User/Supervisor) | Access privileges for user and kernel modes | 0: Kernel mode only, 1: User mode allowed |
| 3 | PWT (Page Write-Through) | Write policy for caching | 0: Write-back, 1: Write-through |
| 4 | PCD (Page Cache Disable) | Cache enable/disable policy | 0: Cache enabled, 1: Cache disabled |
| 5 | A (Accessed) | Whether the page has been accessed | 0: Not accessed, 1: Accessed |
| 6 | D (Dirty) | Whether the page has been modified (PTE only) | 0: Clean, 1: Dirty |
| 7 | PS (Page Size) | Page size selector in a PDE; in a PTE this bit position is the PAT bit instead | 0: 4KB page, 1: 4MB page |
| 8 | G (Global) | Global page flag (entry is not flushed from the TLB when CR3 is reloaded; requires CR4.PGE) | 0: Non-global, 1: Global |
| 9-11 | AVL (Available) | Bits ignored by the CPU and free for OS use | OS-defined |
| 12-31 | Base Address | Physical address of the page table (PDE) or page (PTE), aligned to 4KB so the lower 12 bits are zero | Base address of page table or physical page |
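The bit layout in the table above translates directly into C constants. The names below (`PG_*`, `make_entry`, and so on) are chosen for this sketch and are not taken from any particular kernel.

```c
#include <stdint.h>

#define PG_PRESENT  (1u << 0)
#define PG_WRITE    (1u << 1)
#define PG_USER     (1u << 2)
#define PG_PWT      (1u << 3)
#define PG_PCD      (1u << 4)
#define PG_ACCESSED (1u << 5)
#define PG_DIRTY    (1u << 6)
#define PG_GLOBAL   (1u << 8)
#define PG_FRAME    0xFFFFF000u   /* bits 12-31: 4KB-aligned base address */

/* Combine a 4KB-aligned physical address with attribute flags. */
static inline uint32_t make_entry(uint32_t frame_phys, uint32_t flags) {
    return (frame_phys & PG_FRAME) | (flags & ~PG_FRAME);
}

/* Physical base address stored in an entry. */
static inline uint32_t entry_frame(uint32_t entry) {
    return entry & PG_FRAME;
}

/* Non-zero if the Present bit is set. */
static inline int entry_present(uint32_t entry) {
    return (entry & PG_PRESENT) != 0;
}
```

An entry for a table at physical 0x101000 that is present, writable, and user-accessible comes out as 0x101007.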

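The page directory's physical base address must be stored in the CR3 register, also known as the Page Directory Base Register (PDBR). CR3 holds the 4KB-aligned address in bits 12-31. Its PWT and PCD bits (bits 3 and 4) are typically left at 0, so the lower 12 bits are all zero and CR3 simply contains the directory's physical address.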

Multi-Process and Paging

  • Page tables form the foundation for implementing virtual memory in multi-process operating systems. Each process maintains its own page table, ensuring memory isolation between processes.
  • The operating system dynamically manages each process's virtual address space through on-demand memory allocation, shared memory regions, and page swapping.
  • Virtual address space typically divides into user space and kernel space. Processes map virtual addresses to physical addresses through their page tables.

User processes rely on the kernel through system calls, so kernel code and data must be reachable from every process's address space. When laying out the address space, the kernel therefore reserves the high address range for itself. All processes share the same kernel space, which maps to the same physical memory region.
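One common way to share kernel space is to copy the kernel's page directory entries into every new process's directory. The C sketch below assumes the 3GB/1GB split used in the example that follows, so kernel entries start at index 768; `init_process_directory` is a hypothetical helper, not an API of any real kernel.

```c
#include <stdint.h>
#include <string.h>

#define PDE_COUNT        1024u
#define KERNEL_PDE_START 768u    /* 0xC0000000 >> 22 */

/* Prepare a fresh page directory: an empty user half plus the shared
 * kernel half copied from the kernel's own directory. */
static void init_process_directory(uint32_t *new_dir,
                                   const uint32_t *kernel_dir) {
    /* User space (entries 0..767) starts out unmapped. */
    memset(new_dir, 0, KERNEL_PDE_START * sizeof(uint32_t));

    /* Kernel space (entries 768..1023) refers to the same page tables
     * in every process, so all processes see the same kernel memory. */
    memcpy(new_dir + KERNEL_PDE_START, kernel_dir + KERNEL_PDE_START,
           (PDE_COUNT - KERNEL_PDE_START) * sizeof(uint32_t));
}
```

If the last entry is used as a self-reference to the directory, as in the example below, a real kernel would presumably overwrite entry 1023 with the new directory's own physical address after this copy.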

Example: Linux-Like Address Space Mapping

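; Assumed definitions (not shown in the original listing). PAGE_DIR_BASE matches the
; physical address 0x100000 seen in the Bochs output; flags are the P, R/W and U/S bits.
PAGE_DIR_BASE equ 0x100000
PAGE_PRESENT  equ 1 << 0
PAGE_WRITE    equ 1 << 1
PAGE_USER     equ 1 << 2

init_paging:
    ; Zero the 4KB page directory, one byte per iteration
    mov ecx, 4096
    mov edi, 0
.clear_dir_entries:
    mov byte [PAGE_DIR_BASE + edi], 0
    inc edi
    loop .clear_dir_entries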

.setup_directory:
    ; The first page table sits directly after the page directory
    mov eax, PAGE_DIR_BASE
    add eax, 0x1000
    mov ebx, eax                ; ebx = physical address of page table 0
    or eax, PAGE_USER | PAGE_WRITE | PAGE_PRESENT

    ; PDE[0] and PDE[768] both point to the same page table (page table 0)
    ; PDE[0] maps virtual 0x00000000-0x003FFFFF to physical 0x00000000-0x003FFFFF
    ; PDE[768] maps virtual 0xC0000000-0xC03FFFFF to physical 0x00000000-0x003FFFFF
    ; The kernel and loader reside in the low 4MB of physical memory; the kernel is
    ; mapped into the top 1GB of the virtual address space (0xC0000000-0xFFFFFFFF)
    ; PDE[0] ensures linear addresses equal physical addresses for loader code (0-0xFFFFF)
    mov dword [PAGE_DIR_BASE + 0x0], eax
    mov dword [PAGE_DIR_BASE + 0xc00], eax

    ; Last PDE points to the page directory physical address for dynamic page table manipulation
    sub eax, 0x1000
    mov dword [PAGE_DIR_BASE + 4092], eax

    ; Initialize Page Table 0 (maps first 4MB)
    mov ecx, 1024
    mov edi, 0
    mov edx, PAGE_USER | PAGE_WRITE | PAGE_PRESENT
.populate_entries:
    mov dword [ebx + edi * 4], edx
    add edx, 4096
    inc edi
    loop .populate_entries

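    ; Set up the remaining kernel PDEs (769-1022): each points to the next consecutive
    ; 4KB page table, starting at PAGE_DIR_BASE + 0x2000 (the tables are not filled here)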
    mov eax, PAGE_DIR_BASE
    add eax, 0x2000
    or eax, PAGE_WRITE | PAGE_USER | PAGE_PRESENT
    mov ebx, PAGE_DIR_BASE
    mov ecx, 254
    mov edi, 769
.kernel_pde_loop:
    mov [ebx + edi * 4], eax
    inc edi
    add eax, 0x1000
    loop .kernel_pde_loop
    ret
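As a cross-check, the routine above can be modelled in C on ordinary arrays. This is only a sketch for reasoning about the resulting entries: `model_init_paging` is a made-up name, the arrays stand in for the page directory and page table 0, and the flag values assume the standard P, R/W and U/S bit positions.

```c
#include <stdint.h>
#include <string.h>

#define PAGE_PRESENT 0x1u
#define PAGE_WRITE   0x2u
#define PAGE_USER    0x4u
#define PAGE_FLAGS   (PAGE_USER | PAGE_WRITE | PAGE_PRESENT)

/* dir and pt0 are 1024-entry arrays; dir_phys is the physical address
 * the page directory would have (PAGE_DIR_BASE in the assembly). */
static void model_init_paging(uint32_t *dir, uint32_t *pt0,
                              uint32_t dir_phys) {
    /* Clear the page directory. */
    memset(dir, 0, 1024 * sizeof(uint32_t));

    /* PDE[0] and PDE[768] share page table 0, right after the directory. */
    dir[0]   = (dir_phys + 0x1000u) | PAGE_FLAGS;
    dir[768] = (dir_phys + 0x1000u) | PAGE_FLAGS;

    /* The last PDE refers back to the page directory itself. */
    dir[1023] = dir_phys | PAGE_FLAGS;

    /* Page table 0 identity-maps the first 4MB, one 4KB frame per entry. */
    for (uint32_t i = 0; i < 1024u; i++)
        pt0[i] = (i * 4096u) | PAGE_FLAGS;

    /* PDEs 769..1022 point at consecutive tables from dir_phys + 0x2000. */
    for (uint32_t i = 769u; i <= 1022u; i++)
        dir[i] = (dir_phys + 0x2000u + (i - 769u) * 0x1000u) | PAGE_FLAGS;
}
```

With `dir_phys` set to 0x100000, entry 1022 ends up pointing at 0x1FF000, which is consistent with the upper bound 0x1fffff in the emulator output.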

Mapping results in Bochs simulator:

0x00000000-0x003fffff -> 0x000000000000-0x0000003fffff
0xc0000000-0xc03fffff -> 0x000000000000-0x0000003fffff
# The following entries result from the last PDE pointing to the page directory itself:
# the directory is walked a second time as if it were a page table, so its
# present PDEs (0 and 768-1023) are interpreted as PTEs.
0xffc00000-0xffc00fff -> 0x000000101000-0x000000101fff
0xfff00000-0xffffefff -> 0x000000101000-0x0000001fffff
0xfffff000-0xffffffff -> 0x000000100000-0x000000100fff

Page directory entries are accessible at address 0xfffffxxx, where xxx equals the directory entry index multiplied by 4.
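The same self-referencing trick also exposes every page table entry in the 4MB window starting at 0xFFC00000. The C helpers below sketch both address calculations; the function names are invented for this example, and the formulas assume the last PDE (index 1023) points to the page directory as in the listing above.

```c
#include <stdint.h>

/* Virtual address of page directory entry `index` (0..1023). */
static inline uint32_t pde_vaddr(uint32_t index) {
    return 0xFFFFF000u + index * 4u;
}

/* Virtual address of the page table entry that maps `linear`:
 * the PDE index selects a 4KB slice of the 0xFFC00000 window,
 * and the PTE index selects the entry inside that slice. */
static inline uint32_t pte_vaddr(uint32_t linear) {
    return 0xFFC00000u
         + ((linear >> 22) << 12)
         + ((linear >> 12) & 0x3FFu) * 4u;
}
```

For instance, the PDE for the kernel base (index 768) is reachable at 0xFFFFFC00, and the PTE that maps 0xC0000000 is reachable at 0xFFF00000, the start of one of the ranges in the output above.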

Tags: operating-systems x86 paging protected-mode page-table

Posted on Tue, 19 May 2026 22:39:43 +0000 by NTM