Add and Search Word Data Structure

Trie (Prefix Tree) Fundamentals

Binary trees consist of nodes where each node holds a value and pointers to left and right children:

struct Node {
    int value;
    Node* left;
    Node* right;
};

A binary tree node has at most two children. When a tree node can have multiple children, it becomes a multi-way tree. Since the number of children varies, a dynamic container typically stores all child pointers:

struct Node {
    int value;
    std::vector<Node*> descendants;
};

In a standard multi-way tree, child nodes have no specific organization. However, a Trie imposes a particular structure on the children array.

Trie Structure

A Trie is a specialized multi-way tree designed for string operations. For lowercase English letters, each node maintains a fixed-size array of 26 pointers, one for each letter 'a' through 'z', essentially forming a 26-ary tree.

class TrieNode {
public:
    std::vector<TrieNode*> children;
    bool terminatesWord;
    
    TrieNode() : terminatesWord(false), children(26, nullptr) {}
    
    ~TrieNode() {
        for (auto& node : children)
            delete node;
    }
};

Each TrieNode contains two key fields:

  • children: An array storing references to all child nodes
  • terminatesWord: A flag indicating whether the path from root to this node forms a valid word

Construction

When inserting words into a Trie:

  1. The root node remains empty
  2. For each character, place it in the corresponding index (character - 'a') as a child node
  3. The next character becomes the child of the current character
  4. Upon completing insertion, mark the final node's terminatesWord as true

Consider a Trie containing words: {"am", "an", "as", "b", "c", "cv"}. Key observations:

  • Words sharing the same prefix cluster on the same subtree
  • A word doesn't need to reach a leaf node to be valid—terminatesWord determines validity
  • Characters aren't physically stored in nodes; their position in the childern array determines the path

Search Operations

Searching traverses the Trie according to the input pattern. Three scenarios occur:

  1. Path breaks: The required node doesn't exist (e.g., searching "d" or "ar" in the example)
  2. Path exists but word incomplete: Final node's terminatesWord is false (e.g., searching "a")
  3. Path exists and complete: Final node has terminatesWord as true (e.g., "am", "cv")

Wildcard Matching

This problem introduces '.' as a wildcrad matching any single letter. During search, when encountering '.', all 26 branches must be explored. If any branch successfully matches the remaining pattern, the search succeeds.

Implementation

class TrieNode {
public:
    std::array<TrieNode*, 26> next;
    bool isEnd;
    
    TrieNode() : isEnd(false) {
        next.fill(nullptr);
    }
    
    ~TrieNode() {
        for (int i = 0; i < 26; ++i) {
            delete next[i];
        }
    }
};

class WordDictionary {
private:
    TrieNode* head;
    
    bool findPattern(const std::string& pattern, TrieNode* current, int idx) {
        if (!current) return false;
        if (idx == pattern.length()) return current->isEnd;
        
        char ch = pattern[idx];
        
        if (ch == '.') {
            for (int i = 0; i < 26; ++i) {
                if (findPattern(pattern, current->next[i], idx + 1))
                    return true;
            }
            return false;
        }
        
        int pos = ch - 'a';
        return findPattern(pattern, current->next[pos], idx + 1);
    }

public:
    WordDictionary() {
        head = new TrieNode();
    }
    
    ~WordDictionary() {
        delete head;
    }
    
    void addWord(const std::string& word) {
        TrieNode* node = head;
        for (char c : word) {
            int index = c - 'a';
            if (!node->next[index]) {
                node->next[index] = new TrieNode();
            }
            node = node->next[index];
        }
        node->isEnd = true;
    }
    
    bool search(const std::string& word) {
        return findPattern(word, head, 0);
    }
};

Complexity Analysis

Time Complexity: Adding a word costs O(L) where L is the word length. Searching costs O(26^L) in the worst case when all characters are wildcards, as the algorithm must explore all branches.

Space Complexity: O(26 Ă— total characters added). Each node allocates space for 26 pointers regardless of how many children it actually has.

Tags: Trie data-structure string-search wildcard-matching algorithm

Posted on Sun, 05 Jul 2026 16:36:08 +0000 by Joeddox