Compilation Pipeline Overview
Transforming source code into an executable involves several stages:
- Preprocessing: Expands macros and includes, producing a .i file still in C syntax.
- Compilation: Translates the preprocessed file into assembly language (.s).
- Assembly: Converts assembly into relocatable binary objects (.o).
- Linking: Combines multiple object files into a single executable program.
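To make the stages concrete, here is a minimal sketch of a source file and what each stage would do to it. The file name stages.c, the macro SQUARE, and the function square_plus_one are hypothetical names chosen for illustration, and the commands in the comment assume GCC.

```c
/* stages.c: hypothetical file used only to illustrate the pipeline.
 *
 *   gcc -E stages.c -o stages.i   preprocessing only
 *   gcc -S stages.i -o stages.s   compilation to assembly
 *   gcc -c stages.s -o stages.o   assembly to a relocatable object
 */
#define SQUARE(x) ((x) * (x))

/* After preprocessing, the body reads: return ((n) * (n)) + 1; */
int square_plus_one(int n)
{
    return SQUARE(n) + 1;
}
```

The file has no main, so it can be taken as far as stages.o but needs another object file before the linking stage can produce an executable.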
A library is essentially an archive of object files (excluding main) that encapsulate reusable functions.
Static vs. Dynamic Libraries
Identification Commands and Naming
- Use ldd to list dynamic dependencies of an executable. It does not show static libraries because their contents are embedded during linking.
- On Linux:
  - Shared object: .so extension → dynamic library
  - Archive: .a extension → static library
- On Windows:
  - Dynamic link library: .dll
  - Static library: .lib
Static Library (.a): Code is copied into the executable at link time; no external dependency at runtime.
Dynamic Library (.so): Code is linked at runtime; multiple executables can share a single loaded instance, reducing disk and memory footprint via virtual memory mapping.
Executables linked with a dynamic library store only function entry references; actual machine code is loaded from the library file into memory when execution starts—a process called dynamic linking.
Creating and Using Static Libraries
Build Process
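Object files (add.obj, sub.obj, etc.; GCC names object files .o by default, so these names assume -o was used when compiling) are archived using:
ar -rc libcalc.a add.obj sub.obj
# 'ar' = archive tool; 'r' = insert or replace members; 'c' = create the archive if it does not exist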
Linking with Compiler
Specify include path, library path, and library name explicitly:
gcc main.c -o app -I./calc_pkg/headers -L./calc_pkg/bin -lcalc
# -I : additional header search directory
# -L : directory containing the library
# -l : library base name (libcalc.a → calc)
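For orientation, the sources behind libcalc.a might look like the sketch below. The file names calc.h, add.c, and sub.c and the int-based signatures are assumptions; the original does not show the library's code. The three files are shown in one listing for brevity.

```c
/* calc.h (hypothetical): declarations that main.c sees through -I */
#ifndef CALC_H
#define CALC_H
int add(int a, int b);
int sub(int a, int b);
#endif

/* add.c (hypothetical): compiled to add.obj and archived into libcalc.a */
int add(int a, int b)
{
    return a + b;
}

/* sub.c (hypothetical): compiled to sub.obj and archived into libcalc.a */
int sub(int a, int b)
{
    return a - b;
}
```

A main.c that includes calc.h and calls add(2, 3) would then have that call resolved from libcalc.a through -L and -lcalc.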
Packaging for Distribution
Organize headers and the archive:
mkdir -p calc_pkg/headers calc_pkg/bin
cp *.h calc_pkg/headers
cp *.a calc_pkg/bin
Sample Makefile
STATIC_LIB = libcalc.a
$(STATIC_LIB): add.obj sub.obj mul.obj div.obj
	ar -rc $@ $^
# gcc -c writes %.o by default, so name the output explicitly
%.obj: %.c
	gcc -c $< -o $@
.PHONY: package
package: $(STATIC_LIB)
	mkdir -p calc_pkg/headers calc_pkg/bin
	cp *.h calc_pkg/headers
	cp *.a calc_pkg/bin
.PHONY: clean
clean:
	rm -rf *.obj *.a calc_pkg app
Creating and Using Dynamic Libraries
Build Process
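Generate position-independent object files:
for f in add sub mul div; do gcc -fPIC -c "$f.c" -o "$f.obj"; done
# -o is needed because gcc -c would otherwise name the outputs add.o, sub.o, ...
Combine them into a shared object:
gcc -shared add.obj sub.obj mul.obj div.obj -o libcalc.so
# Equivalent form with the output flag first:
gcc -shared -o libcalc.so add.obj sub.obj mul.obj div.obj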
-fPIC ensures code can execute correctly regardless of load address, critical for shared usage across processes.
Linking Executable
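The compiler flags are the same as in the static case. If both libcalc.a and libcalc.so are present in the -L directory, the GNU linker picks the shared library by default: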
gcc main.c -o app -I./calc_pkg/headers -L./calc_pkg/bin -lcalc
At runtime, ensure the loader finds libcalc.so:
- Install system-wide: Copy the library and headers into standard locations (/usr/lib, /usr/include).
- Symlink: Create a symlink named exactly as the library (libcalc.so) in a directory the loader already searches, such as /usr/lib, pointing at the real file. A symlink next to the executable is found only if the executable was linked with a matching rpath (for example $ORIGIN).
- Environment variable: Temporarily extend LD_LIBRARY_PATH to include the library directory.
- System config: Add a new file under /etc/ld.so.conf.d/ containing the custom path, then run ldconfig to refresh the linker cache permanently.
Packaging for Distribution
mkdir -p calc_pkg/headers calc_pkg/bin
cp *.h calc_pkg/headers
cp *.so calc_pkg/bin
Sample Makefile
SHARED_LIB = libcalc.so
$(SHARED_LIB): add.obj sub.obj mul.obj div.obj
	gcc -shared -o $@ $^
# gcc -c writes %.o by default, so name the output explicitly
%.obj: %.c
	gcc -fPIC -c $< -o $@
.PHONY: package
package: $(SHARED_LIB)
	mkdir -p calc_pkg/headers calc_pkg/bin
	cp *.h calc_pkg/headers
	cp *.so calc_pkg/bin
.PHONY: clean
clean:
	rm -rf *.obj *.so calc_pkg app
Key Differences
- Static: Entire library code is duplicated inside the final executable; no runtime library needed.
- Dynamic: Only references are embedded; actual code is mapped at launch. Distribution must include the .so/.dll files.
Loading Strategies for Dynamic Libraries
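- Static load (load-time linking): The loader maps the whole library into the process address space during startup; its symbols are always available, but startup work and address-space use increase.
- Dynamic load (run-time loading): The library is mapped on demand, for example through dlopen on Linux; this saves memory when the code is never needed but incurs latency on first use.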
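One common way to load a library on demand on Linux is the POSIX dlopen/dlsym interface. The sketch below is an illustration under assumptions, not the only approach: the helper name call_unary is hypothetical, and it only handles functions that take and return a double.

```c
#include <dlfcn.h>   /* dlopen, dlsym, dlclose; older glibc needs -ldl */
#include <stddef.h>

/* Loads `lib` at run time, looks up a double(double) function named `sym`,
 * calls it with `x`, and stores the result in `*out`.
 * Returns 0 on success, -1 if the library or symbol cannot be found. */
int call_unary(const char *lib, const char *sym, double x, double *out)
{
    void *handle = dlopen(lib, RTLD_LAZY);
    if (handle == NULL)
        return -1;

    /* POSIX permits converting the void* from dlsym to a function pointer. */
    double (*fn)(double) = (double (*)(double))dlsym(handle, sym);
    if (fn == NULL) {
        dlclose(handle);
        return -1;
    }

    *out = fn(x);
    dlclose(handle);
    return 0;
}
```

For example, on a glibc system call_unary("libm.so.6", "cos", 0.0, &result) would look up cos in the math library; the library file name is an assumption about the target system.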
Address Mapping in Dynamic Libraries
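Executable formats such as ELF record the program's entry point and the addresses or offsets of its functions. For a conventional, non-position-independent executable, these virtual addresses are fixed when the program is linked.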
Shared libraries use relative addressing: each exported symbol’s location is stored as an offset from the library’s base load address in memory. This enables identical code to work regardless of where the library is loaded.
When execution begins, the program’s virtual address space is populated via paging. Calls into a dynamic routine use the recorded offset plus the library’s base virtual address, resolved through the page table to physical RAM.
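The base-plus-offset rule can be sketched in a few lines of C. This is a toy model, not how the real loader stores its tables: the struct, the function resolve, and the offsets in calc_symbols are all made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One exported symbol: its name and its offset from the library base. */
struct symbol {
    const char *name;
    uintptr_t   offset;
};

/* Hypothetical symbol table for libcalc.so; the offsets are made up. */
static const struct symbol calc_symbols[] = {
    { "add", 0x1130 },
    { "sub", 0x1150 },
};

/* Returns base + offset for `name`, or 0 if the symbol is not listed. */
uintptr_t resolve(uintptr_t base, const struct symbol *table, size_t count,
                  const char *name)
{
    for (size_t i = 0; i < count; i++) {
        if (strcmp(table[i].name, name) == 0)
            return base + table[i].offset;
    }
    return 0;
}
```

The same table works for any load address: only the base argument changes between processes, which is the property that lets one copy of the library code be shared.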
Runtime Call Sequence for Dynamic Functions
Execution starts in main, mapped from disk via page tables. A call to a function at virtual address 0x2222 loads that address into the program counter (instruction pointer). The CPU translates it through the active page table to locate the real code in physical memory, then proceeds with execution.
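The translation step can be illustrated with a toy single-level page table. Real hardware typically uses multi-level tables and a TLB, so treat this as a simplified sketch; the 4 KiB page size, the function translate, and the NO_FRAME marker are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12u                       /* 4 KiB pages (assumed) */
#define PAGE_MASK  ((1u << PAGE_SHIFT) - 1u)

/* Toy single-level page table: index = virtual page number,
 * value = physical frame number, or NO_FRAME if the page is unmapped. */
#define NO_FRAME UINT32_MAX

/* Translates `vaddr`; returns 0 and fills `*paddr`, or -1 on a page fault. */
int translate(const uint32_t *page_table, size_t pages,
              uint32_t vaddr, uint32_t *paddr)
{
    uint32_t vpn    = vaddr >> PAGE_SHIFT;   /* which virtual page */
    uint32_t offset = vaddr & PAGE_MASK;     /* position inside the page */

    if (vpn >= pages || page_table[vpn] == NO_FRAME)
        return -1;

    *paddr = (page_table[vpn] << PAGE_SHIFT) | offset;
    return 0;
}
```

With virtual page 2 mapped to physical frame 5, the address 0x2222 from the text splits into page 2 and offset 0x222, and translates to physical address 0x5222.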