Implementing Full-Text Search in Databases: Architecture and Techniques

Full-text search implementation revolves around initializing the search environment, creating indexes on table fields, processing search terms through tokenization, matching terms against indexed records, and returning relevant results.

Implementation Workflow

  1. Initialize Full-Text Search: Execute FullText.init() to set up necessary database schemas and tables if not already present.
  2. Create Full-Text Index: Use FT_CREATE_INDEX function to build indexes for specific tables and columns, triggering index construction for existing data.
  3. Configure Data Triggers: Implement triggers on indexed tables for insert/update/delete operations to maintain index-data synchronization.
  4. Execute Search Queries: Users submit searches via FT_SEARCH function with parameters including search text, result limit, and offset.
  5. Process Search Requests: The system performs:
    • Tokenization: Breaks search text into individual terms
    • Term Lookup: Finds term IDs in the WORDS table
    • Record Identification: Locates associated record IDs in the MAP table
  6. Construct Results: Using collected record IDs to:
    • Retrieve primary key conditions from ROWS table
    • Fetch actual data rows from source tables
  7. Return Results: The database returns formatted results, which may include raw data or SQL for further processing.
  8. Display Results: Applications present the search results to end users.

Index Schema Design

  1. Index Metadata Table (INDEXES): Stores index definitions with schema, table, and column information.
CREATE TABLE FT.INDEXES(
    ID INT AUTO_INCREMENT PRIMARY KEY,
    SCHEMA_NAME VARCHAR(255),
    TABLE_NAME VARCHAR(255),
    COLUMN_NAMES VARCHAR(255),
    UNIQUE(SCHEMA_NAME, TABLE_NAME)
);
  1. Term Dictionary (TERMS): Maintains unique terms with identifiers.
CREATE TABLE FT.TERMS(
    TERM_ID INT AUTO_INCREMENT PRIMARY KEY,
    TERM_VALUE VARCHAR(255) UNIQUE
);
  1. Record Mapping (RECORDS): Tracks indexed items with hash values and primary keys.
CREATE TABLE FT.RECORDS(
    RECORD_ID INT AUTO_INCREMENT,
    HASH_VALUE INT,
    INDEX_ID INT,
    PRIMARY_KEY VARCHAR(255),
    UNIQUE(HASH_VALUE, INDEX_ID, PRIMARY_KEY)
);
  1. Term-Record Association (TERM_MAPPING): Links terms to containing records.
CREATE TABLE FT.TERM_MAPPING(
    RECORD_ID INT,
    TERM_ID INT,
    PRIMARY KEY(TERM_ID, RECORD_ID)
);
  1. Exclusion List (STOP_WORDS): Contains terms to ignore during indexing.
CREATE TABLE FT.STOP_WORDS(
    TERM_VALUE VARCHAR(255)
);

E-Commerce Example

For a Products table with name and description columns:

  1. Create the index:
FT_CREATE_INDEX('PUBLIC', 'Products', 'Name, Description');
  1. Sample INDEXES entry:
ID | SCHEMA | TABLE    | COLUMNS
1  | PUBLIC | Products | Name, Description
  1. Sample TERMS entries:
TERM_ID | TERM_VALUE
1       | smartphone
2       | protective
3       | cover
  1. Sample RECORDS entries:
RECORD_ID | HASH_VALUE | INDEX_ID | PRIMARY_KEY
1         | 12345      | 1        | ProductID = 101
  1. Sample TERM_MAPPING:
RECORD_ID | TERM_ID
1         | 1
1         | 3
  1. Execute search:
FT_SEARCH('protective cover', 10, 0);

This query would return products containing both "protective" and "cover" terms.

Tags: database full-text-search sql indexing Architecture

Posted on Fri, 22 May 2026 16:58:03 +0000 by kaveman50