A process represents an independent executable program unit that contains one or more threads. A thread serves as the fundamental execusion unit within a process and represents the smallest schedulable unit by the operating system.
Multithreading
Multithreading enables concurrent execution of multiple threads within a single process.
Advantages of Multithreading
- Background execution: Time-consuming operations like database access and I/O can run in parallel with other tasks
- Performance improvement: Better utilization of system resources
- Responsive interfaces: User input and file operations can proceed without blocking main execution
Disadvantages of Multithreading
- Resource consumption: Each thread requires memory allocation
- Performance overhead: Context switching between threads consumes CPU cycles
- Complexity: Thread management introduces synchronization challenges and potential deadlocks
- Unexpected behavior: Improper thread handling can cause program instability
Note: While multithreading creates the illusion of simultaneous execution, the operating system actually rapidly switches between threads using time-slicing algorithms.
Multiprocessing
Multiprocessing involves running multiple independent processes simultaneously, typically corresponding to separate program executions.
# Process creation example
import multiprocessing
def worker_task():
print("Process executing task")
if __name__ == "__main__":
process = multiprocessing.Process(target=worker_task)
process.start()
process.join()
Choosing Between Approaches
Analogy
- Single process, single thread: One person eating at a table
- Single process, multiple threads: Multiple people sharing dishes at one table
- Multiple processes, single thread: Multiple people each eating at their own table
Platform Considerations
- Windows: Process creation carries significant overhead, favoring multithreading approaches
- Linux: Process creation is lightweight, making multiprocessing more practical
# Performance comparison concept
import time
def measure_overhead():
start_time = time.time()
# Repeated process creation and destruction
elapsed = time.time() - start_time
return elapsed
Modern Considerations
With multi-core processors, context switching overhead becomes significant. Threads sharing resources require data transfer between cores, while processes maintain independent resources. Server applications often prefer process-based architectures for better multi-core utilization.
Concurrency Terminology
- Parallelism: True simultaneous execution using multiple CPU cores or machines
- Concurrency: The appearance of simultaneous execution through rapid task switching
- Thread safety: Ensuring correct behavior when multiple threads access shared resources
- High concurrency: Systems handling numerous simultaneous requests, requiring optimization across hardware, architecture, and software layers
Multithreading represents one approach to address high concurrency scenarios by maximizing resource utilization and preventing computational idle time.