Overview of Data Management
Evolution of Data Management
Manual Handling Era
File Systems Era
Database Systems Era
Core Concepts
Data
- Symbolic records describing entities; forms include text, graphics, images, and audio
- Data alone lacks meaning without semantic interpretation; data and semantics are inseparable
Database (DB)
- A large collection of structured, shared data stored permanently in a computer system
Database Management System (DBMS)
- Software layer between users and operating system
- Manages database creation, usage, and maintenance
Database System (DBS)
- Consists of database, DBMS, application systems, database administrators, and users
Data Redundancy
- Degree to which identical data is stored multiple times
Data Security
- Protection against unauthorized access leading to data leakage or damage
- Ensures users can only access specific data according to defined permissions
Data Integrity
- Ensures correctness, validity, and consistency of data
- Enforces constraints to keep data within acceptable ranges or relationships
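As a minimal sketch of how a DBMS can enforce such constraints, the example below uses Python's built-in sqlite3 module. The student table, its columns, and the allowed ranges are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical table and ranges, used only to illustrate integrity constraints.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE student (
        id     INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        gender TEXT CHECK (gender IN ('male', 'female')),
        age    INTEGER CHECK (age BETWEEN 0 AND 150)
    )
    """
)

# A row inside the allowed ranges is accepted.
conn.execute("INSERT INTO student VALUES (1, 'Alice', 'female', 20)")

# A row outside the allowed range is rejected by the DBMS itself.
try:
    conn.execute("INSERT INTO student VALUES (2, 'Bob', 'male', -5)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(rejected, row_count)
```

Declaring the rule in the schema means every application that writes to the table is checked the same way.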
Concurrency Control
- Manages simultaneous access by multiple users to prevent interference and ensure data consistency
Database Recovery
- Restores database from errors or failures to a consistent state
- Handles hardware failures, software issues, human errors, and malicious actions
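One narrow case of recovery, undoing a half-finished operation, can be sketched with sqlite3 transactions. The account table and the transfer function are hypothetical, and the failure is simulated. The sketch covers only rollback of an incomplete transaction, not hardware failures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` from src to dst as one all-or-nothing unit."""
    # "with conn" commits on success and rolls back if an exception escapes.
    with conn:
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
        )
        if fail_midway:
            raise RuntimeError("simulated failure between the two updates")
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
        )


try:
    transfer(conn, 1, 2, 30, fail_midway=True)
except RuntimeError:
    pass  # the partial debit has been rolled back

balances = conn.execute("SELECT balance FROM account ORDER BY id").fetchall()
print(balances)
```

After the simulated failure, both balances are back at their committed values, so the database is again in a consistent state.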
Essential Components of a Database
Data (the actual content stored)
Storage Medium (typically disk drives)
Database Management System (the software that manages the data)
Data Models
Model
- Abstract representation of real-world features
Data Model
- Abstract representation of data characteristics, including entities and their relationships
- Conceptual toolset for describing data, relationships, semantics, and constraints
Two-Level Abstraction
Conceptual Model
- Also known as information model, used for modeling in the information world
- Describes data and information from the user's perspective
- Facilitates database design, emphasizing semantic expressiveness
- Should be simple, clear, and easily understood by users
Logical Model
- Used in the machine world; models data from the computer system's viewpoint
- Basis for DBMS implementation
- Requires formal definitions and often includes restrictions for easier implementation
- Includes formally defined syntax and semantics for data manipulation
Three Elements of Data Models
Data Structure
- Describes the object types that make up the database
- Objects fall into two categories: those describing the data itself (types, content, properties) and those describing relationships between data
Data Operations
- Set of operations allowed on data objects
- Mainly retrieval and update operations (insert, delete, modify)
- Data structure describes the static characteristics of a system; data operations describe its dynamic behavior
Constraints
- Integrity rules defining valid data states and transitions
- Ensures data correctness, validity, and compatibility
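The three elements can be seen side by side in a toy in-memory store. This is only an illustrative sketch: the Student and StudentStore names, the fields, and the age rule are assumptions, not part of any real DBMS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:
    """Data structure: the object type being stored (static description)."""
    student_id: int
    name: str
    age: int


class StudentStore:
    """Data operations plus constraints over a collection of Student objects."""

    def __init__(self):
        self._rows = {}

    def _check(self, s):
        # Constraint: a rule every valid data state must satisfy.
        if s.age < 0:
            raise ValueError("age must be non-negative")

    def insert(self, s):
        self._check(s)
        if s.student_id in self._rows:
            raise ValueError("duplicate student_id")
        self._rows[s.student_id] = s

    def modify(self, s):
        self._check(s)
        if s.student_id not in self._rows:
            raise KeyError(s.student_id)
        self._rows[s.student_id] = s

    def delete(self, student_id):
        del self._rows[student_id]

    def retrieve(self, student_id):
        return self._rows.get(student_id)


store = StudentStore()
store.insert(Student(1, "Alice", 20))
store.modify(Student(1, "Alice", 21))
print(store.retrieve(1))
```

The class definition is the static part, the four methods are the dynamic part, and `_check` plays the role of the integrity rules.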
Conceptual Models
Entity
- Any distinguishable object in reality, whether concrete or abstract
- Examples: students, departments, courses, bank accounts, enrollments, orders
Attribute
- Characteristics possessed by an entity
- Example: a student entity is described by the attributes ID, name, gender, birth date, department, enrollment date
Key
- Minimal set of attributes uniquely identifying an entity
- Example: student ID as key for student entity
Domain
- Range of values an attribute can take
- Example: ID domain is 8-digit integers, gender domain is {male, female}
Entity Type
- General description of entities with common properties
- Example: Student(ID, name, gender, birth date, department, enrollment date)
Entity Set
- Collection of entities of the same type
- Example: all students constitute an entity set
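The terms above can be mapped onto a small Python sketch. The class and helper names are hypothetical, and reading "8-digit integers" as the range 10000000 to 99999999 (no leading zeros) is an assumption.

```python
from dataclasses import dataclass
from datetime import date

# Domains: the values an attribute is allowed to take.
GENDER_DOMAIN = {"male", "female"}


def in_id_domain(value):
    # Assumption: "8-digit integer" means 10000000..99999999.
    return isinstance(value, int) and 10_000_000 <= value <= 99_999_999


@dataclass(frozen=True)
class Student:
    """Entity type: Student(ID, name, gender, birth date, department, enrollment date)."""
    id: int  # key attribute
    name: str
    gender: str
    birth_date: date
    department: str
    enrollment_date: date

    def __post_init__(self):
        if not in_id_domain(self.id):
            raise ValueError("id outside its domain")
        if self.gender not in GENDER_DOMAIN:
            raise ValueError("gender outside its domain")


def build_entity_set(entities):
    """Entity set: entities of one type, each uniquely identified by its key."""
    entity_set = {}
    for e in entities:
        if e.id in entity_set:
            raise ValueError(f"duplicate key {e.id}")
        entity_set[e.id] = e
    return entity_set


alice = Student(20240001, "Alice", "female", date(2005, 3, 1), "CS", date(2024, 9, 1))
bob = Student(20240002, "Bob", "male", date(2005, 7, 9), "Math", date(2024, 9, 1))
students = build_entity_set([alice, bob])
print(sorted(students))
```

Each Student instance is one entity, its fields are attributes, and the dictionary keyed by `id` shows why the key must identify an entity uniquely.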
Relationship
- Connections between entities or within an entity
- Inter-entity relationships link different entity sets; intra-entity relationships link the attributes that make up an entity
Types of Relationships
One-to-One (1:1)
- Each entity in set A relates to at most one entity in set B, and vice versa
One-to-Many (1:n)
- Each entity in set A relates to n entities in set B (n ≥ 0); each entity in set B relates to at most one entity in set A
Many-to-Many (m:n)
- Each entity in set A relates to n entities in set B (n ≥ 0); each entity in set B relates to m entities in set A (m ≥ 0)
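The three definitions can be checked mechanically against a concrete set of links. The function below is a hypothetical helper: it reports the tightest relationship type that a given list of (a, b) pairs is consistent with. The real type is a design decision about the schema, so one sample of data can only suggest it.

```python
from collections import defaultdict


def classify(pairs):
    """Return '1:1', '1:n', 'n:1', or 'm:n' for links between set A and set B."""
    a_to_b = defaultdict(set)
    b_to_a = defaultdict(set)
    for a, b in pairs:
        a_to_b[a].add(b)
        b_to_a[b].add(a)

    # Does any A entity link to several B entities, and vice versa?
    a_has_many = any(len(bs) > 1 for bs in a_to_b.values())
    b_has_many = any(len(as_) > 1 for as_ in b_to_a.values())

    if not a_has_many and not b_has_many:
        return "1:1"
    if a_has_many and not b_has_many:
        return "1:n"
    if b_has_many and not a_has_many:
        return "n:1"
    return "m:n"


print(classify([("dept1", "head1"), ("dept2", "head2")]))
print(classify([("class1", "s1"), ("class1", "s2")]))
print(classify([("s1", "c1"), ("s1", "c2"), ("s2", "c1")]))
```

The example data (departments and heads, classes and students, students and courses) is illustrative only.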
Conceptual Model Representation
E-R Diagram
- Rectangles represent entity types
- Ellipses represent attributes
- Diamonds represent relationships
- Lines connect entities to attributes and relationships
- Relationship types indicated near connecting lines
- Attributes of relationships connected to diamonds
Fundamental Data Models
Hierarchical Model
- Early model using a directed tree structure
- Features:
- Single root node with no parent
- All other nodes have exactly one parent
Network Model
- Uses a directed graph structure
- Features:
- More than one node may have no parent
- Nodes may have multiple parents
- Represents complex entity relationships
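The structural difference between the hierarchical and network models comes down to parent counts, which the sketch below tests. The function name and the parent-map representation are assumptions made for illustration. The check does not verify connectivity or the absence of cycles.

```python
def classify_structure(parents):
    """Classify a node -> set-of-parents map as 'hierarchical' or 'network'.

    Hierarchical: exactly one node has no parent (the root) and every
    other node has exactly one parent. Anything else is treated as a
    network structure. Connectivity and cycles are not checked.
    """
    roots = [node for node, ps in parents.items() if len(ps) == 0]
    multi_parent = [node for node, ps in parents.items() if len(ps) > 1]
    if len(roots) == 1 and not multi_parent:
        return "hierarchical"
    return "network"


# One root (department), every other node has exactly one parent.
tree = {
    "department": set(),
    "teacher": {"department"},
    "student": {"department"},
}

# "enrollment" has two parents, which a tree cannot express.
graph = {
    "student": set(),
    "course": set(),
    "enrollment": {"student", "course"},
}

print(classify_structure(tree), classify_structure(graph))
```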
Relational Model
- Uses two-dimensional table structures
- Features:
- Data organized as table collections
- Simple, clear, and user-friendly
- Based on rigorous mathematical foundations
- Most widely used in modern databases
Key Terms
- Relation: corresponds to a table
- Tuple: row in a table
- Attribute: column in a table
- Primary Key: unique identifier for a tuple
- Domain: range of attribute values
- Component: value of one attribute within a tuple
- Relation Schema: description of a relation, written as relation_name(attribute1, attribute2, ..., attributeN)
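To connect these terms to something concrete, the sketch below reads them off a small sqlite3 table. The student table and its contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Relation: the table "student"; primary key: id.
conn.execute(
    "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, department TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Alice", "CS"), (2, "Bob", "Math")],
)

cursor = conn.execute("SELECT * FROM student ORDER BY id")
attributes = [col[0] for col in cursor.description]  # attributes = columns
tuples = cursor.fetchall()                            # tuples = rows
component = tuples[0][attributes.index("name")]       # one value in one tuple
schema = f"student({', '.join(attributes)})"          # relation schema

print(schema)
print(tuples)
print(component)
```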
Characteristics
Unified Concept
- Entities and relationships represented as relations
- User view treats all data as tables
Normalization Requirements
- Each relation must meet normalization criteria
- Atomic data items only; no nested tables
Set-Based Operations
- Operations work on sets of tuples
- Access paths are transparent to users, which improves data independence
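A short sqlite3 sketch can illustrate atomic data items and set-based operations together. The enrollment table and the grade adjustment rule are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Atomic data items: one (student, course, grade) per row,
# rather than a nested list of courses inside a student row.
conn.execute(
    """
    CREATE TABLE enrollment (
        student_id INTEGER,
        course     TEXT,
        grade      INTEGER,
        PRIMARY KEY (student_id, course)
    )
    """
)
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?, ?)",
    [(1, "DB", 58), (2, "DB", 72), (3, "DB", 55), (1, "OS", 90)],
)

# Set-based operation: one statement acts on every tuple that matches,
# and it says nothing about how those tuples are located on disk.
cursor = conn.execute(
    "UPDATE enrollment SET grade = grade + 5 WHERE course = 'DB' AND grade < 60"
)
updated = cursor.rowcount

db_grades = conn.execute(
    "SELECT student_id, grade FROM enrollment WHERE course = 'DB' ORDER BY student_id"
).fetchall()
print(updated, db_grades)
```

The statement names what to change, not how to find it, which is the sense in which access paths stay hidden from the user.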
Database System Architecture
Three-Level Schema: External, Conceptual, Internal
External Schema
- Also called user schema
- User interface to database system
- Describes partial logical structure visible to users
- Multiple external schemas per database
- Each application uses exactly one external schema (one external schema may be shared by several applications)
Conceptual Schema
- Also called logical schema
- Common view for all database users
- Describes complete logical structure and features
- One conceptual schema per database
- Defines data relationships, integrity, and security
Internal Schema
- Also called storage schema
- Describes physical structure and storage methods
- Internal representation of data
- One internal schema per database
- Does not deal with device-specific details or physical records (blocks, pages)
Two Mapping Functions: External/Conceptual and Conceptual/Internal
Schema Relationships
- Conceptual schema is core
- External schemas are subsets of conceptual schema
- Data accessed via external schemas, stored via internal schemas
- Conceptual schema provides isolation
- Internal schema depends on logical structure but not physical devices
Mapping
- Correspondence rule for conversion between representations
External/Conceptual Mapping
- Links external schema to conceptual schema
- Enables logical independence
- When the conceptual schema changes, updating the mapping keeps the external schemas unchanged, so applications written against them keep working
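Logical independence can be approximated in sqlite3 by treating a view as an external schema and the view definition as the external/conceptual mapping. This is an analogy under stated assumptions, not a full three-level implementation, and all table, view, and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual schema, version 1.
conn.execute(
    "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, department TEXT)"
)
conn.execute("INSERT INTO student VALUES (1, 'Alice', 'CS')")

# External schema: the only thing the application sees.
conn.execute("CREATE VIEW student_directory AS SELECT id, name FROM student")

APP_QUERY = "SELECT id, name FROM student_directory ORDER BY id"
before = conn.execute(APP_QUERY).fetchall()

# Conceptual schema, version 2: the table is split and a column renamed.
# Only the mapping (the view definition) is updated to match.
conn.executescript(
    """
    DROP VIEW student_directory;
    CREATE TABLE person (id INTEGER PRIMARY KEY, full_name TEXT);
    CREATE TABLE affiliation (person_id INTEGER, department TEXT);
    INSERT INTO person SELECT id, name FROM student;
    INSERT INTO affiliation SELECT id, department FROM student;
    DROP TABLE student;
    CREATE VIEW student_directory AS SELECT id, full_name AS name FROM person;
    """
)

after = conn.execute(APP_QUERY).fetchall()
print(before, after)
```

The application query string never changes; only the view definition absorbs the schema change.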
Conceptual/Internal Mapping
- Links conceptual schema to internal schema
- Enables physical independence
- When the internal schema changes, updating the mapping keeps the conceptual schema unchanged, so applications keep working
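Physical independence can be sketched the same way: a storage-level change, here adding an index, leaves the application's query and its results untouched. An index is only one kind of storage change, and in sqlite3 the conceptual/internal mapping is maintained by the engine itself. Table and index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, department TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Alice", "CS"), (2, "Bob", "Math"), (3, "Carol", "CS")],
)

APP_QUERY = "SELECT name FROM student WHERE department = ? ORDER BY name"
before = conn.execute(APP_QUERY, ("CS",)).fetchall()

# Storage-level change: a new access path. The logical structure of the
# table, and therefore the application query, is unaffected.
conn.execute("CREATE INDEX idx_student_department ON student (department)")

after = conn.execute(APP_QUERY, ("CS",)).fetchall()
print(before, after)
```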