Trie (Prefix Tree) Fundamentals
Binary trees consist of nodes where each node holds a value and pointers to left and right children:
struct Node {
int value;
Node* left;
Node* right;
};
A binary tree node has at most two children. When a tree node can have multiple children, it becomes a multi-way tree. Since the number of children varies, a dynamic container typically stores all child pointers:
struct Node {
int value;
std::vector<Node*> descendants;
};
In a standard multi-way tree, child nodes have no specific organization. However, a Trie imposes a particular structure on the children array.
Trie Structure
A Trie is a specialized multi-way tree designed for string operations. For lowercase English letters, each node maintains a fixed-size array of 26 pointers, one for each letter 'a' through 'z', essentially forming a 26-ary tree.
class TrieNode {
public:
std::vector<TrieNode*> children;
bool terminatesWord;
TrieNode() : terminatesWord(false), children(26, nullptr) {}
~TrieNode() {
for (auto& node : children)
delete node;
}
};
Each TrieNode contains two key fields:
children: An array storing references to all child nodesterminatesWord: A flag indicating whether the path from root to this node forms a valid word
Construction
When inserting words into a Trie:
- The root node remains empty
- For each character, place it in the corresponding index (character - 'a') as a child node
- The next character becomes the child of the current character
- Upon completing insertion, mark the final node's
terminatesWordas true
Consider a Trie containing words: {"am", "an", "as", "b", "c", "cv"}. Key observations:
- Words sharing the same prefix cluster on the same subtree
- A word doesn't need to reach a leaf node to be valid—
terminatesWorddetermines validity - Characters aren't physically stored in nodes; their position in the childern array determines the path
Search Operations
Searching traverses the Trie according to the input pattern. Three scenarios occur:
- Path breaks: The required node doesn't exist (e.g., searching "d" or "ar" in the example)
- Path exists but word incomplete: Final node's
terminatesWordis false (e.g., searching "a") - Path exists and complete: Final node has
terminatesWordas true (e.g., "am", "cv")
Wildcard Matching
This problem introduces '.' as a wildcrad matching any single letter. During search, when encountering '.', all 26 branches must be explored. If any branch successfully matches the remaining pattern, the search succeeds.
Implementation
class TrieNode {
public:
std::array<TrieNode*, 26> next;
bool isEnd;
TrieNode() : isEnd(false) {
next.fill(nullptr);
}
~TrieNode() {
for (int i = 0; i < 26; ++i) {
delete next[i];
}
}
};
class WordDictionary {
private:
TrieNode* head;
bool findPattern(const std::string& pattern, TrieNode* current, int idx) {
if (!current) return false;
if (idx == pattern.length()) return current->isEnd;
char ch = pattern[idx];
if (ch == '.') {
for (int i = 0; i < 26; ++i) {
if (findPattern(pattern, current->next[i], idx + 1))
return true;
}
return false;
}
int pos = ch - 'a';
return findPattern(pattern, current->next[pos], idx + 1);
}
public:
WordDictionary() {
head = new TrieNode();
}
~WordDictionary() {
delete head;
}
void addWord(const std::string& word) {
TrieNode* node = head;
for (char c : word) {
int index = c - 'a';
if (!node->next[index]) {
node->next[index] = new TrieNode();
}
node = node->next[index];
}
node->isEnd = true;
}
bool search(const std::string& word) {
return findPattern(word, head, 0);
}
};
Complexity Analysis
Time Complexity: Adding a word costs O(L) where L is the word length. Searching costs O(26^L) in the worst case when all characters are wildcards, as the algorithm must explore all branches.
Space Complexity: O(26 Ă— total characters added). Each node allocates space for 26 pointers regardless of how many children it actually has.