0. TCP/IP Network Model Layers
The TCP/IP model consists of four layers: Application, Transport, Internet, and Network Access.
1. Linux Commands for Process Status, Memory Usage, and tar Extraction
Process Status: Use the ps command. For example, ps aux | grep program_name filters the listing by program name, and ps -p PID shows the status of a specific process by PID.
Memory Usage: Use free command. For example, free -m displays memory usage in megabytes.
tar Extraction Parameters:
- -x: Extract files from an archive.
- -z: Filter the archive through gzip.
- -v: Verbosely list files processed.
- -f: Use the given archive file (e.g., tar -xzvf archive.tar.gz).
2. Modifying File Permissions
Linux file permissions consist of three sets: owner, group, and others, each with read (r), write (w), and execute (x) rights. The chmod command modifies permissions. Permissions can be represented numerically: r=4, w=2, x=1. For example, rwxrwx--- equals 770.
Usage: chmod [-R] xyz file_or_directory
- xyz: Numeric permission value.
- -R: Apply changes recursively.
Example:
chmod 770 test.c
3. Common Linux Commands
- cd: Change directory.
- ls: List files and directories.
- grep: Search for patterns in output, often used with pipes.
- cp: Copy files or directories.
- mv: Move or rename files.
- rm: Delete files or directories.
- ps: Display running processes.
- kill: Send a signal to terminate a process.
- tar: Archive files, optionally compress with gzip/bzip2.
- cat: View file contents (similar to less, more).
- top: Monitor system processes, CPU, and memory usage.
- pwd: Print working directory.
4. Running a Program as Root
Use sudo to run a program with root privileges: sudo ./program_name
5. Soft Links vs Hard Links
Soft Link (Symbolic Link): A file that contains a path to another file or directory. Can link across filesystems, can link to non-existent files.
Hard Link: Multiple filenames pointing to the same inode on the same filesystem. Cannot link directories or cross filesystems.
Key Differences:
- Deleting a hard link does not affect other links. Deleting the original file makes soft links "dangling" (broken).
- Soft links can be created for directories; hard links cannot.
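The differences above can be observed directly with stat(): hard links share an inode, while a soft link is its own file that merely stores a path. A minimal sketch (the file names orig.txt, hard.txt, soft.txt are made up for the demo):

```c
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

// Returns 1 if the two paths refer to the same inode on the same device.
int same_inode(const char *a, const char *b) {
    struct stat sa, sb;
    if (stat(a, &sa) != 0 || stat(b, &sb) != 0) return 0;
    return sa.st_ino == sb.st_ino && sa.st_dev == sb.st_dev;
}

// Create a file, a hard link, and a soft link, then delete the original.
int link_demo(void) {
    FILE *f = fopen("orig.txt", "w");
    if (!f) return 0;
    fputs("hello\n", f);
    fclose(f);

    link("orig.txt", "hard.txt");     // hard link: second name, same inode
    symlink("orig.txt", "soft.txt");  // soft link: a file storing the path

    int hard_shares_inode = same_inode("orig.txt", "hard.txt");

    unlink("orig.txt");               // the hard link keeps the data alive...
    int hard_alive = access("hard.txt", R_OK) == 0;
    int soft_dangling = access("soft.txt", R_OK) != 0;  // ...the soft link dangles

    unlink("hard.txt");
    unlink("soft.txt");
    return hard_shares_inode && hard_alive && soft_dangling;
}
```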
6. Static and Dynamic Libraries
Static Library Creation:
gcc -c hello.c # produces hello.o
ar rcs libhello.a hello.o
Usage:
gcc main.c -lhello -L. -o staticBinary
Dynamic Library Creation:
gcc -shared -fpic hello.c -o libhello.so
Usage:
gcc main.c -lhello -L. -o dynamicBinary
Differences:
- Static libraries are linked at compile time; dynamic at runtime.
- Static binaries are larger; dynamic saves memory and disk space.
- Statically linked binaries start faster; dynamic libraries allow updates without recompiling dependent programs.
7. GDB Debugging, Conditional Breakpoints, Multi-process Debugging
Compile with -g flag: gcc -g test.c -o test
Common GDB Commands:
- quit: Exit debugger.
- list: Show source code.
- break 7: Set breakpoint at line 7.
- break func_name: Set breakpoint at a function.
- break line if condition: Conditional breakpoint.
- run: Start execution.
- next: Step over function calls.
- step: Step into functions.
- watch expr: Break when expression changes.
Multi-process Debugging:
Use set follow-fork-mode child to debug child process, or parent for parent.
8. Process Scheduling Algorithms
- First-Come, First-Served (FCFS): Non-preemptive, simple but poor for short jobs.
- Shortest Job First (SJF): Minimizes average waiting time, but can starve long jobs.
- Highest Response Ratio Next (HRRN): Balances short and long jobs, but impractical due to unknown service time.
- Round Robin (RR): Preemptive, each process gets a time slice. Good responsiveness but context-switch overhead.
- Multilevel Feedback Queue: Multiple queues with different priorities and time slices. New processes enter highest queue; if not completed, move to lower queue. Balances all job types.
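The Round Robin behavior described above can be sketched as a small simulation: each runnable process gets at most one time slice per pass, and the completion times show how short jobs can finish ahead of earlier arrivals. This is a simplified model (all processes arrive at time 0, at most 16 processes):

```c
#include <string.h>

// Simulate Round Robin scheduling: given per-process burst times and a
// time slice q, fill finish[] with each process's completion time.
// Assumes all processes arrive at time 0 and n <= 16.
void rr_finish(int n, const int burst[], int q, int finish[]) {
    int rem[16];
    memcpy(rem, burst, n * sizeof(int));
    int t = 0, done = 0;
    while (done < n) {
        for (int i = 0; i < n; i++) {
            if (rem[i] == 0) continue;           // already finished
            int run = rem[i] < q ? rem[i] : q;   // run one slice (or less)
            t += run;
            rem[i] -= run;
            if (rem[i] == 0) { finish[i] = t; done++; }
        }
    }
}
```

For bursts {3, 5, 2} with q = 2, the shortest job (P2) completes first at t = 6, even though it was last in the queue.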
9. Memory Management: Virtual and Physical Memory
Physical Memory: Managed by the memory manager, which allocates and deallocates memory for processes. The storage hierarchy spans registers, cache, main memory, and disk.
Virtual Memory: Provides address isolation, each process has its own virtual address space mapped to physical memory via MMU. Benefits: larger address space, memory protection, sharing. Drawbacks: overhead of mapping tables, page swaps.
10. Virtual-to-Physical Address Mapping
Segmentation: Divides memory into variable-sized segments, leading to external fragmentation.
Paging: Divides memory into fixed-size pages (e.g., 4KB). Virtual address split into page number and offset. Page tables map virtual pages to physical frames. Disadvantages: internal fragmentation, page table overhead. Mitigated by multi-level page tables and TLBs.
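The page-number/offset split above is pure bit manipulation: with 4 KB pages the low 12 bits are the offset and the rest is the page number. A minimal sketch for a 32-bit address:

```c
#include <stdint.h>

#define PAGE_SHIFT 12                     // 4 KB pages: 2^12 bytes
#define PAGE_SIZE  (1u << PAGE_SHIFT)

// Split a 32-bit virtual address into page number and in-page offset.
uint32_t page_number(uint32_t vaddr) { return vaddr >> PAGE_SHIFT; }
uint32_t page_offset(uint32_t vaddr) { return vaddr & (PAGE_SIZE - 1); }

// Rebuild a physical address from a frame number (looked up in the page
// table) and the unchanged offset.
uint32_t phys_addr(uint32_t frame, uint32_t offset) {
    return (frame << PAGE_SHIFT) | offset;
}
```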
11. Kernel Mode vs User Mode and Virtual Memory Layout
Kernel Mode: Has full access to hardware and all instructions. Used by OS kernel. User Mode: Restricted access, applications run here. Transition via system calls or interrupts.
Virtual Memory Layout (32-bit):
- Kernel space: 1 GB (top)
- User space: 3 GB (bottom) containing:
- Code segment
- Data segment (initialized globals/statics)
- BSS segment (uninitialized globals/statics)
- Heap (grows upward)
- Memory mapping segment (shared libs, etc.)
- Stack (grows downward, typically 8 MB)
- Reserved area (low addresses)
12. Page Fault Interrupt
When a process accesses a page not in physical memory, a page fault occurs. The OS loads the required page from disk. Steps:
- CPU tries to access page, finds invalid page table entry.
- Page fault triggers interrupt.
- OS checks if page is valid; if so, finds disk location.
- If physical memory is full, use page replacement algorithm (e.g., LRU) to evict a page.
- Load required page into freed frame, update page table.
- Resume process from the faulting instruction.
Page Replacement Algorithms: LRU (Least Recently Used) selects the page not accessed for the longest time.
13. LRU Page Replacement Implementation
LRU can be implemented using a hash map and a doubly linked list. On access: if present, move node to head; if not, add new node at head. On eviction, remove tail node. This ensures the tail is the least recently used. Disadvantage: updating list on every memory access is expensive.
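A minimal sketch of that hash-map-plus-doubly-linked-list design, with int keys and values and a fixed bucket count (all names here are invented for the demo):

```c
#include <stdlib.h>

#define NBUCKET 64

typedef struct Node {
    int key, val;
    struct Node *prev, *next;   // LRU list links (head = most recent)
    struct Node *hnext;         // hash chain link
} Node;

typedef struct {
    int cap, size;
    Node *head, *tail;          // tail = least recently used
    Node *bucket[NBUCKET];
} LRU;

static unsigned hash(int key) { return (unsigned)key % NBUCKET; }

LRU *lru_new(int cap) {
    LRU *c = calloc(1, sizeof(LRU));
    c->cap = cap;
    return c;
}

static void list_unlink(LRU *c, Node *n) {
    if (n->prev) n->prev->next = n->next; else c->head = n->next;
    if (n->next) n->next->prev = n->prev; else c->tail = n->prev;
}

static void list_push_front(LRU *c, Node *n) {
    n->prev = NULL; n->next = c->head;
    if (c->head) c->head->prev = n; else c->tail = n;
    c->head = n;
}

static Node *find(LRU *c, int key) {
    for (Node *n = c->bucket[hash(key)]; n; n = n->hnext)
        if (n->key == key) return n;
    return NULL;
}

static void hash_remove(LRU *c, Node *n) {
    Node **p = &c->bucket[hash(n->key)];
    while (*p != n) p = &(*p)->hnext;
    *p = n->hnext;
}

int lru_get(LRU *c, int key) {          // returns -1 if absent
    Node *n = find(c, key);
    if (!n) return -1;
    list_unlink(c, n);                  // move to head: recently used
    list_push_front(c, n);
    return n->val;
}

void lru_put(LRU *c, int key, int val) {
    Node *n = find(c, key);
    if (n) { n->val = val; list_unlink(c, n); list_push_front(c, n); return; }
    if (c->size == c->cap) {            // evict least recently used (tail)
        Node *old = c->tail;
        list_unlink(c, old);
        hash_remove(c, old);
        free(old);
        c->size--;
    }
    n = calloc(1, sizeof(Node));
    n->key = key; n->val = val;
    n->hnext = c->bucket[hash(key)];
    c->bucket[hash(key)] = n;
    list_push_front(c, n);
    c->size++;
}
```

With capacity 2: put(1), put(2), get(1), put(3) evicts key 2, since key 1 was touched more recently.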
14. Stack and Heap Overflow
Stack Overflow: Occurs when recursion is too deep or local variables exceed the stack limit (typically 8 MB on Linux), usually causing a segmentation fault.
Heap Exhaustion: Happens when allocated memory (e.g., via malloc or new in C++) is never freed, exhausting memory; further allocations fail, with malloc returning NULL or new throwing std::bad_alloc.
15. malloc Memory Allocation
- For allocations < 128 KB, brk() is used to adjust the heap boundary.
- For allocations >= 128 KB, mmap() maps memory in the file mapping region.
brk() does not return memory to the OS immediately (it is cached in the malloc pool), reducing system calls. mmap() returns memory to the OS on free(), but incurs more overhead.
16. Can a 32-bit System Access >4GB RAM?
Normally no, due to 32-bit address space limit (2^32 = 4GB). However, PAE (Physical Address Extension) allows access to up to 64GB by using a 36-bit physical address.
17. Concurrency vs Parallelism
Concurrency: Multiple tasks make progress within the same time period (interleaved execution on single core).
Parallelism: Multiple tasks run simultaneously on multiple cores.
18. Process States
- Created: Process being initialized, PCB allocated.
- Ready: Process is ready to run, waiting for CPU.
- Running: Process is executing.
- Blocked: Process waiting for I/O or event.
- Terminated: Process finished execution.
Additional: Suspended states (ready-suspend, blocked-suspend) when swapped out to disk.
19. Process Control (Overview)
Process control involves creating, terminating, and switching processes using the Process Control Block (PCB). Context switching saves and restores state.
20. Process, Thread, and Coroutine Differences
- Process: Resource allocation unit, has own address space.
- Thread: Execution unit within a process, shares address space, faster creation/termination/context switch.
- Coroutine: Lightweight thread, user-space managed, extremely low overhead, no kernel involvement.
Context Switch Overhead: Thread < Process (since threads share virtual memory). Coroutines have even lower overhead.
21. Why Threads?
Threads reduce overhead of process creation and context switching. They share memory, simplifying communication.
22. Inter-Process Communication (IPC) Mechanisms
- Pipes: Unnamed (parent-child) or named (FIFO), half-duplex, limited size.
- Message Queues: Kernel-managed linked list of messages, supports random access, but incurs copy overhead.
- Semaphores: Counter for access control, used for synchronization.
- Signals: Asynchronous notifications (e.g., SIGINT, SIGKILL, SIGTERM).
- Shared Memory: Fastest IPC, multiple processes map same physical memory.
- Sockets: For network communication between different hosts.
23. Process Synchronization
- Semaphore: P (wait) and V (signal) operations to control access.
- Monitor: High-level synchronization construct ensuring mutual exclusion.
- Message Queues: Can also be used for synchronization.
24. Thread Synchronization Methods
- Mutex: Ensures exclusive access to a resource.
- Semaphore: Controls access to a shared resource with a count.
- Condition Variable: Allows threads to wait for a condition.
- Read-Write Lock: Multiple readers allowed, exclusive writer.
- Spinlock: Busy-waits until lock is available, used in multi-core systems.
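The mutex case above can be sketched with pthreads: two threads hammer a shared counter, and the lock makes each read-modify-write exclusive so no increment is lost (the function names are invented for the demo; compile with -pthread):

```c
#include <pthread.h>

static long counter = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

// Each worker increments the shared counter 100000 times.
static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);
        counter++;                    // critical section
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

// Runs two workers and returns the final count.
long mutex_demo(void) {
    pthread_t a, b;
    counter = 0;
    pthread_create(&a, NULL, worker, NULL);
    pthread_create(&b, NULL, worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return counter;                   // always 200000 with the mutex held
}
```

Without the lock, the unsynchronized `counter++` would intermittently lose updates.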
25. Deadlock: Conditions and Solutions
Conditions: Mutual Exclusion, Hold and Wait, No Preemption, Circular Wait.
Solutions:
- Break hold and wait (allocate all resources at once).
- Allow preemption.
- Impose resource ordering (circular wait prevention).
26. fork() System Call
fork() creates a new process by duplicating the parent. Returns child's PID to parent, 0 to child, -1 on error. Child inherits parent's address space.
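The two return values described above can be seen in a short sketch: fork() returns 0 in the child and the child's PID in the parent, which then reaps the child's exit status (fork_demo is an invented name):

```c
#include <sys/wait.h>
#include <unistd.h>

// fork() returns twice: the child's PID in the parent, 0 in the child.
// The parent collects the child's exit status with waitpid().
int fork_demo(void) {
    pid_t pid = fork();
    if (pid < 0) return -1;           // fork failed
    if (pid == 0) _exit(42);          // child: exit with a known status
    int status = 0;
    waitpid(pid, &status, 0);         // parent: reap the child
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```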
27. Orphan and Zombie Processes
Orphan: Parent terminates before child; child is adopted by init (PID 1).
Zombie: Child terminates but parent does not call wait(); process table entry remains.
Prevention: Parent should call wait() or handle SIGCHLD signal to clean up child processes.
Clean up a zombie: Send SIGCHLD to the parent (kill -s SIGCHLD <parent_pid>) to prompt it to reap; if the parent never calls wait(), kill the parent so init adopts and reaps the zombie.
28. Daemon Processes
A daemon is a long-running background process, detached from the terminal.
Implementation Steps:
- Fork child, exit parent.
- Call setsid() to create a new session.
- Change working directory to /.
- Reset file mode mask with umask(0).
- Close all open file descriptors.
29. Are Locks Needed for Multi-threading on a Single-core Machine?
Yes, locks are still needed because threads can be preempted at any time, leading to race conditions on shared data.
30. Context Switch Process
Process Context Switch: Save CPU registers and PCB of current process, load PCB of next, flush TLB, switch page tables.
Thread Context Switch: Save thread-specific registers and stack, load next thread's state; no need to switch address space if same process.
31. Multi-thread vs Multi-process
- Robustness: One thread crash can crash the whole process; processes are isolated.
- Communication: Threads share memory easily; processes need IPC.
- Overhead: Thread creation/context switch is lighter.
- Synchronization: Threads require locks to avoid data races.
32. sleep() vs wait()
sleep() delays execution of the calling thread/process for a specified time.
wait() is a system call used by a parent process to collect the exit status of a child process; it blocks until a child terminates.
33. Thread Pool Design
Design:
- Use a task queue (producer-consumer) protected by mutex and condition variable.
- Initialize a fixed number of worker threads.
- Threads block when queue is empty.
- When a task arrives, signal a waiting thread.
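The design above can be sketched with pthreads: a ring-buffer task queue guarded by a mutex and condition variable, workers that block while the queue is empty, and a drain-then-join shutdown. The struct and function names are invented for the sketch (compile with -pthread):

```c
#include <pthread.h>

#define QCAP 128

typedef struct { void (*fn)(void *); void *arg; } Task;

typedef struct {
    Task q[QCAP];
    int head, tail, count, shutdown;
    pthread_mutex_t mu;
    pthread_cond_t nonempty;
    pthread_t threads[8];
    int nthreads;
} Pool;

static void *pool_worker(void *p) {
    Pool *pool = p;
    for (;;) {
        pthread_mutex_lock(&pool->mu);
        while (pool->count == 0 && !pool->shutdown)
            pthread_cond_wait(&pool->nonempty, &pool->mu);  // block when idle
        if (pool->count == 0 && pool->shutdown) {           // drained: exit
            pthread_mutex_unlock(&pool->mu);
            return NULL;
        }
        Task t = pool->q[pool->head];
        pool->head = (pool->head + 1) % QCAP;
        pool->count--;
        pthread_mutex_unlock(&pool->mu);
        t.fn(t.arg);                                        // run outside the lock
    }
}

void pool_init(Pool *pool, int n) {
    pool->head = pool->tail = pool->count = pool->shutdown = 0;
    pool->nthreads = n;
    pthread_mutex_init(&pool->mu, NULL);
    pthread_cond_init(&pool->nonempty, NULL);
    for (int i = 0; i < n; i++)
        pthread_create(&pool->threads[i], NULL, pool_worker, pool);
}

void pool_submit(Pool *pool, void (*fn)(void *), void *arg) {
    pthread_mutex_lock(&pool->mu);
    pool->q[pool->tail] = (Task){fn, arg};
    pool->tail = (pool->tail + 1) % QCAP;
    pool->count++;
    pthread_cond_signal(&pool->nonempty);   // wake one waiting worker
    pthread_mutex_unlock(&pool->mu);
}

void pool_shutdown(Pool *pool) {            // drain queue, then join workers
    pthread_mutex_lock(&pool->mu);
    pool->shutdown = 1;
    pthread_cond_broadcast(&pool->nonempty);
    pthread_mutex_unlock(&pool->mu);
    for (int i = 0; i < pool->nthreads; i++)
        pthread_join(pool->threads[i], NULL);
}

static int done_count = 0;
static void count_task(void *arg) { (void)arg; __sync_fetch_and_add(&done_count, 1); }

// Usage: 4 workers execute 100 queued tasks, then the pool drains.
int pool_demo(void) {
    Pool p;
    pool_init(&p, 4);
    for (int i = 0; i < 100; i++) pool_submit(&p, count_task, NULL);
    pool_shutdown(&p);
    return done_count;
}
```

A production pool would also handle queue-full backpressure; the sketch assumes fewer than QCAP pending tasks.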
Thread Count Determination:
- CPU-bound: CPU cores + 1
- I/O-bound: 2 * CPU cores + 1
- General formula: (wait_time / cpu_time + 1) * CPU cores
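The general formula above reduces to plain arithmetic; a tiny sketch (integer division, so the CPU-bound and I/O-bound rules of thumb round it up by one):

```c
// (wait_time / cpu_time + 1) * cores: more waiting per unit of CPU work
// justifies more threads, since blocked threads cost no CPU.
int suggested_threads(int wait_ms, int cpu_ms, int cores) {
    return (wait_ms / cpu_ms + 1) * cores;
}
```

For a task that waits 90 ms per 10 ms of CPU on 4 cores, this suggests 40 threads; for a pure CPU-bound task (wait 0), it suggests one thread per core.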
Core vs Non-core Threads: Core threads are always alive; non-core threads are created when queue is full.
34. Zero-Copy in Linux
Traditional I/O involves multiple data copies and context switches. Zero-copy (e.g., sendfile()) allows data to be transferred directly from one file descriptor to another without copying to user space, reducing CPU overhead. It leverages DMA and PageCache. Not suitable for large files (may evict hot data).
35. select, poll, epoll
- select: Monitors multiple FDs, linear scan, limited to 1024 FDs, inefficient.
- poll: Similar to select but no FD limit, still linear scan.
- epoll: Event-driven, uses callbacks, O(1) event delivery, scalable to many FDs.
36. select vs epoll Usage and Edge/Level Trigger
- epoll is better for high-concurrency servers.
- Level-triggered (LT): Default; events are reported as long as the condition holds.
- Edge-triggered (ET): Events reported only when state changes; requires non-blocking I/O.
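A minimal epoll sketch on a pipe (Linux-only, as epoll itself is): register the read end, make it readable, and collect the event. The default mask shown is level-triggered; adding EPOLLET would make it edge-triggered, which then requires draining the fd with non-blocking reads:

```c
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

// Returns 1 if epoll reported the pipe readable and we read back "ping".
int epoll_demo(void) {
    int fds[2];
    if (pipe(fds) != 0) return -1;

    int ep = epoll_create1(0);
    struct epoll_event ev = {0};
    ev.events = EPOLLIN;              // level-triggered; EPOLLIN | EPOLLET for ET
    ev.data.fd = fds[0];
    epoll_ctl(ep, EPOLL_CTL_ADD, fds[0], &ev);

    write(fds[1], "ping", 4);         // make the read end readable

    struct epoll_event out;
    int n = epoll_wait(ep, &out, 1, 1000);   // block up to 1 s for events

    char buf[8] = {0};
    if (n == 1 && (out.events & EPOLLIN))
        read(out.data.fd, buf, sizeof buf);

    close(fds[0]); close(fds[1]); close(ep);
    return n == 1 && strcmp(buf, "ping") == 0;
}
```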
37. Reactor and Proactor Patterns
- Reactor: Synchronous event demultiplexer, dispatches events to handlers.
- Proactor: Asynchronous, initiates operations and handles completions via callbacks.
38. Synchronous vs Asynchronous, Blocking vs Non-blocking
- Synchronous: Caller waits for operation to complete.
- Asynchronous: Caller returns immediately, later notified of completion.
- Blocking: Call blocks until operation finishes.
- Non-blocking: Call returns immediately, may not complete operation.
39. BIO vs NIO
- BIO (Blocking I/O): One thread per connection, simple but resource-intensive.
- NIO (Non-blocking I/O): Uses selectors to multiplex I/O, supports non-blocking operations, better scalability.
40. Five I/O Models
- Blocking I/O
- Non-blocking I/O
- I/O Multiplexing (select/poll/epoll)
- Signal-driven I/O
- Asynchronous I/O (POSIX AIO)
41. Socket Programming Functions
Server: socket(), bind(), listen(), accept(), recv()/send(), close().
Client: socket(), connect(), send()/recv(), close().
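The two call sequences above can be walked through in a single process over loopback: the server side binds an ephemeral port and listens, the client side connects and sends, and the server accepts and receives (socket_demo is an invented name; error checks are elided for brevity):

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

// server: socket -> bind -> listen -> accept -> recv -> close
// client: socket -> connect -> send -> close
int socket_demo(void) {
    int srv = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                         // let the kernel pick a port
    bind(srv, (struct sockaddr *)&addr, sizeof addr);
    listen(srv, 1);

    socklen_t len = sizeof addr;
    getsockname(srv, (struct sockaddr *)&addr, &len);  // learn the chosen port

    int cli = socket(AF_INET, SOCK_STREAM, 0);
    connect(cli, (struct sockaddr *)&addr, sizeof addr);
    send(cli, "hi", 2, 0);

    int conn = accept(srv, NULL, NULL);        // completes the handshake
    char buf[4] = {0};
    recv(conn, buf, sizeof buf - 1, 0);

    close(conn); close(cli); close(srv);
    return strcmp(buf, "hi") == 0;
}
```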
42. Classic Synchronization Problems
- Dining Philosophers: Avoid deadlock using resource hierarchy or a waiter.
- Readers-Writers: Multiple readers allowed, exclusive writers. Use read-write locks.