Sets and Common Operations
A set is an unordered, duplicate-free collection of elements, mathematically denoted as (S = {x, y, z, \dots}). For example, two sets tracking event participants could be:
- (Photography = {\text{Liam}, \text{Mia}, \text{Noah}, \text{Emma}, \text{Ethan}} | \text{Event photography team volunteers})
- (Videography = {\text{Olivia}, \text{Mia}, \text{Ethan}, \text{Ava}, \text{Logan}} | \text{Event videography team volunteers})
Key set operations include:
- Intersection ((A \cap B)): Returns elements present in both sets. For volunteers, (Photography \cap Videography = {\text{Mia}, \text{Ethan}}).
- Union ((A \cup B)): Returns all unique elements from both sets. For volunteers, (Photography \cup Videography = {\text{Liam}, \text{Mia}, \text{Noah}, \text{Emma}, \text{Ethan}, \text{Olivia}, \text{Ava}, \text{Logan}}).
- Difference ((A \setminus B)): Returns elements in (A) but not in (B). For volunteers, (Photography \setminus Videography = {\text{Liam}, \text{Noah}, \text{Emma}}).
DSU focuses on efficient union operations for disjoint sets (sets with empty intersection), along with fast membership queries.
Disjoint Set Union (DSU) Algorithm
DSU is a forest-based data structure that manages disjoint sets and supports two core optimized operations:
- Find (getRoot): Locate the root (unique identifier) of an element’s set via path compression, flattening the tree to reduce future lookup steps.
- Union (mergeSets): Combine two sets by linking one root to the other, using union by rank to maintain a shallow tree height.
DSU excels at dynamic connectivity problems like graph connected components, network connectivity, and Kruskal’s minimum spanning tree algorithm.
DSU Implementation
Each set is represented as a tree, with all nodes pointing to a single root. A forest contains all independent sets.
Step 1: Variable Definition and Setup
We use two arrays/vectors: ancestor to track each node’s parent, and rank to store tree depth for union by rank optimization.
#include <vector>
using namespace std;
vector<int> ancestor;
vector<int> rank;
void setup(int size) {
ancestor.resize(size);
rank.resize(size, 0);
for (int i = 0; i < size; ++i) {
ancestor[i] = i;
}
}
Step 2: Find Operation with Path Compression
int getRoot(int node) {
if (ancestor[node] != node) {
ancestor[node] = getRoot(ancestor[node]); // Path compression
}
return ancestor[node];
}
Step 3: Union Operation with Union by Rank
void mergeSets(int nodeA, int nodeB) {
int rootA = getRoot(nodeA);
int rootB = getRoot(nodeB);
if (rootA == rootB) {
return; // Already in the same set
}
// Attach smaller rank tree to larger rank tree
if (rank[rootA] > rank[rootB]) {
ancestor[rootB] = rootA;
} else {
ancestor[rootA] = rootB;
if (rank[rootA] == rank[rootB]) {
rank[rootB] += 1;
}
}
}
Time Complexiyt
- Find (getRoot): Without path compression, worst-case (O(n)); with compression, amortized (O(\alpha(n))), where (\alpha(n)) is the inverse Ackermann function (effectively constant for practical inputs, (\alpha(10^{100}) \approx 5)).
- Union (mergeSets): With union by rank and path compression, amortized (O(\alpha(n))).
- Overall: DSU handles large-scale dynamic connectivity tasks (millions of operations) in near-constant time.
Visualization
DSU algorithm logic can be visualized interactively at Visualgo.net.
References
- GeeksforGeeks. "Introduction to Disjoint Set Data Structure or Union-Find Algorithm." https://www.geeksforgeeks.org/introduction-to-disjoint-set-data-structure-or-union-find-algorithm/
- CP-Algorithms. "Disjoint Set Union." https://cp-algorithms.com/data_structures/disjoint_set_union.html
- USA Computing Olympiad. "DSU." https://usaco.guide/gold/dsu?lang=cpp
- Halim, Steven, et al. "Union-Find Disjoint Sets (UFDS) - VisuAlgo." VisuAlgo.net