Database Recovery and Consistency
RPO (Recovery Point Objective)
Recovery Point Objective (RPO) defines the maximum acceptable amount of data loss, measured in time. For example, an RPO of 4 hours means that after a failure, the system must be restorable to a state no more than 4 hours older than the moment of failure, so at most 4 hours of changes are lost. A database that achieves an RPO of 0 loses no committed data during recovery.
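The RPO check comes down to comparing two timestamps. The sketch below is a minimal illustration; the function name `meets_rpo` and the sample timestamps are hypothetical and not tied to any particular database product.

```python
from datetime import datetime, timedelta


def meets_rpo(last_recovery_point: datetime,
              failure_time: datetime,
              rpo: timedelta) -> bool:
    """Return True if the data-loss window fits within the RPO."""
    # Everything written after the last recovery point is lost on failure.
    data_loss_window = failure_time - last_recovery_point
    return data_loss_window <= rpo


# Hypothetical failure at noon, checked against an RPO of 4 hours.
failure = datetime(2024, 1, 1, 12, 0)
four_hours = timedelta(hours=4)

print(meets_rpo(datetime(2024, 1, 1, 9, 0), failure, four_hours))  # True
print(meets_rpo(datetime(2024, 1, 1, 7, 0), failure, four_hours))  # False
```

A recovery point taken 3 hours before the failure satisfies the 4-hour objective, while one taken 5 hours before does not.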
PITR (Point-in-Time Recovery)
Point-in-Time Recovery restores a database to a specific moment in the past. It typically combines a base backup with the transaction logs recorded since that backup: the logs are replayed up to the chosen timestamp, which can be any point covered by the retained logs.
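The replay idea can be sketched in a few lines. This is a toy model, assuming a base snapshot held in a dictionary and a log of `(timestamp, key, value)` tuples sorted by time; the name `restore_to` and the sample data are hypothetical.

```python
from datetime import datetime


def restore_to(base_snapshot: dict, log: list, target: datetime) -> dict:
    """Rebuild state by replaying log records up to and including `target`."""
    state = dict(base_snapshot)  # start from a copy of the base backup
    for timestamp, key, value in log:  # log is assumed sorted by timestamp
        if timestamp > target:
            break  # stop before changes made after the target time
        state[key] = value
    return state


log = [
    (datetime(2024, 1, 1, 10, 0), "balance", 100),
    (datetime(2024, 1, 1, 11, 0), "balance", 80),
    (datetime(2024, 1, 1, 12, 0), "balance", 0),  # the change to undo
]

restored = restore_to({"balance": 50}, log, datetime(2024, 1, 1, 11, 30))
print(restored)  # {'balance': 80}
```

Choosing 11:30 as the target keeps the 11:00 change and discards the 12:00 one.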
WAL (Write-Ahead Logging)
Write-Ahead Logging is a technique in which database modifications are written to a durable log before they are applied to the actual data files. After a failure, the database can be brought back to a consistent state by replaying the log.
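A minimal sketch of the log-first ordering follows. It assumes a JSON-lines log file and uses an in-memory dictionary as a stand-in for the data files; the class name `TinyWAL` is hypothetical, and real engines add checkpoints, checksums, and log truncation that are omitted here.

```python
import json
import os
import tempfile


class TinyWAL:
    """Toy key-value store that logs every change before applying it."""

    def __init__(self, log_path: str):
        self.log_path = log_path
        self.data = {}  # stand-in for the real data files
        self._replay()

    def _replay(self) -> None:
        # Recovery: re-apply every logged change in order.
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, "r", encoding="utf-8") as log_file:
            for line in log_file:
                record = json.loads(line)
                self.data[record["key"]] = record["value"]

    def put(self, key: str, value) -> None:
        # 1. Append the change to the log and force it to disk.
        with open(self.log_path, "a", encoding="utf-8") as log_file:
            log_file.write(json.dumps({"key": key, "value": value}) + "\n")
            log_file.flush()
            os.fsync(log_file.fileno())
        # 2. Only then apply the change to the data.
        self.data[key] = value


log_path = os.path.join(tempfile.mkdtemp(), "wal.log")
store = TinyWAL(log_path)
store.put("a", 1)
store.put("a", 2)

recovered = TinyWAL(log_path)  # simulates a restart after a crash
print(recovered.data)  # {'a': 2}
```

Because the log is written and synced first, a crash between the two steps of `put` loses nothing: the next start replays the record.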
Database Transaction Models
ACID Properties
ACID represents four key properties of reliable database transactions:
- Atomicity: Transactions are all-or-nothing operations
- Consistency: Transactions bring the database from one valid state to another
- Isolation: Concurrent transactions don't interfere with each other
- Durability: Once committed, a transaction's changes survive subsequent failures
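Atomicity is the easiest of these properties to demonstrate. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the `accounts` table, the `transfer` helper, and the balances are hypothetical. A `CHECK` constraint makes the second update fail, and the whole transfer is rolled back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; return False if the transfer fails."""
    try:
        # The connection context manager commits on success and
        # rolls back if an exception is raised inside the block.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst))
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "alice", "bob", 150)  # alice only has 100
balances = dict(conn.execute(
    "SELECT name, balance FROM accounts ORDER BY name"))
print(ok, balances)  # False {'alice': 100, 'bob': 0}
```

The credit to `bob` succeeded before the debit failed, yet it does not appear in the final balances: the transaction was all-or-nothing.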
MVCC (Multi-Version Concurrency Control)
MVCC allows multiple versions of data to exist simultaneously, enabling readers to access consistent data snapshots without blocking writers. This concurrency control mechanism improves performance in read-heavy workloads.
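The snapshot idea can be illustrated with a toy version store. This sketch assumes a single logical clock and no write conflicts, garbage collection, or aborted transactions; the class name `MVCCStore` and its methods are hypothetical.

```python
class MVCCStore:
    """Toy multi-version store: each write appends a new version."""

    def __init__(self):
        self.versions = {}  # key -> list of (commit_ts, value), oldest first
        self.clock = 0      # logical commit timestamp

    def write(self, key, value):
        self.clock += 1
        self.versions.setdefault(key, []).append((self.clock, value))
        return self.clock

    def snapshot(self):
        # A reader remembers the clock value at the time it starts.
        return self.clock

    def read(self, key, snapshot_ts):
        # Return the newest version committed at or before the snapshot.
        for commit_ts, value in reversed(self.versions.get(key, [])):
            if commit_ts <= snapshot_ts:
                return value
        return None


store = MVCCStore()
store.write("x", "v1")
reader_snapshot = store.snapshot()  # reader starts here
store.write("x", "v2")              # writer proceeds without waiting

print(store.read("x", reader_snapshot))   # v1
print(store.read("x", store.snapshot()))  # v2
```

The earlier reader keeps seeing `v1` even after the writer installs `v2`, and neither had to wait for the other.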
Database Architectures
HTAP (Hybrid Transactional/Analytical Processing)
HTAP systems combine transactional processing and analytical capabilities in a single platform. This architecture reduces or eliminates the need for separate ETL processes that move data between transactional and analytical systems.
MPP (Massively Parallel Processing)
MPP architectures distribute database operations across multiple servers, with each server handling a portion of the data. This approach enables horizontal scaling and improved performance for large-scale data processing tasks.
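The partition-then-combine pattern can be sketched on one machine. In this illustration, worker threads stand in for servers, rows are assigned to "nodes" by hashing their key, and each node computes a partial sum that a coordinator adds up. All names and the sample rows are hypothetical.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def partition(rows, num_nodes):
    """Assign each (key, amount) row to a node by hashing its key."""
    shards = [[] for _ in range(num_nodes)]
    for key, amount in rows:
        node = zlib.crc32(key.encode("utf-8")) % num_nodes
        shards[node].append((key, amount))
    return shards


def local_sum(shard):
    """Work done independently on one node: sum its own rows."""
    return sum(amount for _, amount in shard)


def parallel_total(rows, num_nodes=4):
    shards = partition(rows, num_nodes)
    # Threads stand in for separate servers in this sketch.
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(local_sum, shards))
    return sum(partials)  # coordinator combines the partial results


rows = [(f"order-{i}", i) for i in range(1, 101)]
print(parallel_total(rows))  # 5050
```

Adding nodes shrinks each shard, which is the horizontal scaling described above; the final combine step stays cheap because only one number per node is exchanged.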
Authentication Systems
SSO (Single Sign-On)
Single Sign-On is an authentication scheme that allows users to log in once and gain access to multiple systems without re-authenticating. Key benefits include:
- Improved user experience through reduced credential management
- Enhanced security through centralized authentication
- Simplified administration of access controls
- Reduced password-related help desk requests
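The "log in once" idea can be illustrated with a signed token that one identity provider issues and several services verify. This is a deliberately simplified sketch that assumes a shared HMAC key; production SSO usually relies on standards such as SAML or OpenID Connect, and the key, function names, and token layout here are hypothetical.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical key shared by the identity provider and the services.
SHARED_KEY = b"demo-key-not-for-production"


def issue_token(user: str) -> bytes:
    """Identity provider: sign a statement about who logged in."""
    payload = base64.urlsafe_b64encode(json.dumps({"sub": user}).encode())
    signature = hmac.new(SHARED_KEY, payload, hashlib.sha256).hexdigest()
    return payload + b"." + signature.encode()


def verify_token(token: bytes):
    """Any service: accept the token only if the signature checks out."""
    payload, _, signature = token.partition(b".")
    expected = hmac.new(SHARED_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected.encode()):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))["sub"]


token = issue_token("alice")          # one login at the identity provider
print(verify_token(token))            # alice (accepted by service A)
print(verify_token(token))            # alice (accepted by service B)
print(verify_token(b"x" + token))     # None (tampered token rejected)
```

The user authenticates once, and each service trusts the token instead of asking for credentials again.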
Business Models
B2B (Business-to-Business)
B2B models involve transactions between companies, typically characterized by larger order volumes, longer sales cycles, and more complex decision-making processes. Examples include wholesale distribution, enterprise software, and component manufacturing.
B2C (Business-to-Consumer)
B2C models focus on transactions between businesses and end consumers, generally characterized by higher transaction volumes, lower individual transaction values, and shorter, marketing-driven sales cycles. Examples include retail e-commerce, streaming services, and consumer applications.
Network Protocols
TCP (Transmission Control Protocol)
TCP provides reliable, connection-oriented communication with these characteristics:
- Establishes a connection before data transfer
- Guarantees reliable, in-order delivery of the byte stream, retransmitting lost segments
- Implements flow control and congestion control
- Carries more header overhead than UDP (20 bytes minimum, up to 60 bytes with options)
- Suitable for applications requiring data integrity (web, email, file transfer)
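The 20-byte minimum header can be made concrete by packing and parsing its fixed fields with Python's `struct` module. The sketch below is illustrative only: the port numbers and sequence number are made up, and the checksum is left at zero rather than computed.

```python
import struct

# Fixed part of a TCP header: source port, destination port, sequence
# number, acknowledgment number, data offset byte, flags byte, window,
# checksum, urgent pointer. Network byte order.
TCP_HEADER = struct.Struct("!HHIIBBHHH")


def parse_tcp_header(raw: bytes) -> dict:
    (src_port, dst_port, seq, ack, offset_byte,
     flags, window, checksum, urgent) = TCP_HEADER.unpack(raw[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        # Upper 4 bits give the header length in 32-bit words.
        "header_len": (offset_byte >> 4) * 4,
        "syn": bool(flags & 0x02),
        "ack_flag": bool(flags & 0x10),
        "window": window,
    }


# A made-up SYN segment header with no options (data offset = 5 words).
raw = TCP_HEADER.pack(54321, 443, 1000, 0, 5 << 4, 0x02, 65535, 0, 0)
header = parse_tcp_header(raw)

print(TCP_HEADER.size)       # 20
print(header["header_len"])  # 20
print(header["syn"])         # True
```

The sequence and acknowledgment numbers are what allow in-order delivery and retransmission, and the window field supports flow control.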
UDP (User Datagram Protocol)
UDP offers lightweight, connectionless communication with these features:
- No connection establishment required
- No guaranteed delivery or ordering
- No flow or congestion control
- Minimal header overhead (8 bytes)
- Ideal for real-time applications (streaming, gaming, VoIP)
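The connectionless behaviour is visible in the socket API: there is no `connect`/`accept` handshake, and each datagram is addressed individually. The sketch below sends one datagram over the loopback interface; the payload is made up, and the timeout is there because UDP itself gives no delivery guarantee.

```python
import socket

# Receiver: bind to an ephemeral port on the loopback interface.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)  # do not wait forever for a datagram
address = receiver.getsockname()

# Sender: no connection setup, just address the datagram and send it.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"frame-1", address)

payload, source = receiver.recvfrom(1024)
print(payload)  # b'frame-1'

sender.close()
receiver.close()
```

If the datagram were dropped, `recvfrom` would raise `socket.timeout`; any retry or ordering logic would have to live in the application.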
Database Performance Metrics
QPS (Queries Per Second)
QPS measures the number of queries a database can process in one second. This metric is particularly relevant for read-heavy applications and helps evaluate query performance under load.
TPS (Transactions Per Second)
TPS quantifies the number of database transactions completed per second. Transactions typically involve multiple operations (like reads and writes), making TPS a key indicator of write-intensive system performance.
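Both metrics are simple rates: completed operations divided by elapsed time. The sketch below uses made-up counts from a hypothetical 60-second test run; the function name `throughput` is an assumption for illustration.

```python
def throughput(completed: int, elapsed_seconds: float) -> float:
    """Operations completed per second over a measurement window."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return completed / elapsed_seconds


# Hypothetical 60-second run: 12,000 queries across 3,000 transactions.
qps = throughput(12_000, 60)
tps = throughput(3_000, 60)

print(qps)        # 200.0
print(tps)        # 50.0
print(qps / tps)  # 4.0 queries per transaction on average
```

The same workload yields different numbers for the two metrics, so a reported figure is only comparable when it is clear which unit of work was counted.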
TPC-C Benchmark
TPC-C is an industry-standard benchmark for evaluating online transaction processing (OLTP) systems. It simulates a complex order-processing environment with five transaction types: new orders, payments, order-status queries, deliveries, and stock-level checks. Results are reported in tpmC, the number of new-order transactions completed per minute.