Merging Account Lists via Disjoint Set Union

Problem Definition

Given a list of accounts accounts, where each accounts[i] is a list of strings. The first string accounts[i][0] is the user's name. The remaining strings are email addresses belonging to that account.

The objective is to merge accounts that belong to the same user. Two accounts belong to the same person if they share at least one email address. While accounts with the same name might belong to different people (names are not unique identifiers), all accounts belonging to the same person must share the same name. A person can have multiple disitnct accounts.

Requirements

Return a list of merged accounts. Each account in the result must be a list of strings where:

  1. The first element is the name.
  2. Subsequent elements are the email addresses associated with that account, sorted in lexicographical (ASCII) order.
  3. The accounts themselves can be returned in any order.

Algorithmic Analysis

This problem is efficiently solved using the Disjoint Set Union (DSU) data structure, also known as Union-Find.

  1. Initialization: Create a DSU structure capable of managing N elements, where N is the number of input accounts. Initialize each account as its own parent.
  2. Email Mapping: Maintain a hash map that records the index of the first account encountered for every unique email address.
  3. Union Operations: Iterate through every account and its associated emails.
    • If an email is found in the map, it implies a connection. Use the union operation to merge the current account's index with the index stored in the map.
    • If the email is new, store the current account's index in the map.
  4. Aggregation: After processing all emails, iterate through the map again. For each email, find the ultimate root representative of its account index using the find operation. Group the emails by this root index.
  5. Result Construction: For each group of emails (identified by the root account index), construct the final list. Prepend the name from the representative account to the list of emails, sort the emails, and add to the final result.

Implemantation

The following C++ implemantation utilizes a dedicated struct for the DSU logic to encapsulate state and operations.

#include <vector>
#include <string>
#include <unordered_map>
#include <algorithm>
#include <numeric>

using namespace std;

// Structure to manage Disjoint Set Union operations
struct DisjointSet {
    vector<int> parent;

    // Initialize DSU with n elements, each being its own parent
    DisjointSet(int n) {
        parent.resize(n);
        iota(parent.begin(), parent.end(), 0);
    }

    // Find the representative of the set containing x with path compression
    int find(int x) {
        if (parent[x] != x) {
            parent[x] = find(parent[x]);
        }
        return parent[x];
    }

    // Union the sets containing x and y
    void unite(int x, int y) {
        int rootX = find(x);
        int rootY = find(y);
        if (rootX != rootY) {
            parent[rootY] = rootX;
        }
    }
};

class Solution {
public:
    vector<vector<string>> accountsMerge(vector<vector<string>>& accounts) {
        int n = accounts.size();
        DisjointSet dsu(n);
        unordered_map<string, int> email_map;

        // Iterate through all accounts and emails to build connections
        for (int i = 0; i < n; ++i) {
            for (size_t j = 1; j < accounts[i].size(); ++j) {
                const string& email = accounts[i][j];
                if (email_map.find(email) != email_map.end()) {
                    // If email exists, union current account with the one that has this email
                    dsu.unite(i, email_map[email]);
                } else {
                    // If new email, map it to the current account index
                    email_map[email] = i;
                }
            }
        }

        // Group emails by their root account index
        unordered_map<int, vector<string>> grouped_emails;
        for (const auto& [email, original_index] : email_map) {
            int root_index = dsu.find(original_index);
            grouped_emails[root_index].push_back(email);
        }

        vector<vector<string>> result;
        // Format the output
        for (auto& [root_idx, emails] : grouped_emails) {
            vector<string> merged_account;
            // Add the name associated with the root account
            merged_account.push_back(accounts[root_idx][0]);
            // Sort emails lexicographically
            sort(emails.begin(), emails.end());
            // Append sorted emails
            merged_account.insert(merged_account.end(), emails.begin(), emails.end());
            result.push_back(std::move(merged_account));
        }

        return result;
    }
};

Tags: algorithm graph-theory Union-Find disjoint-set C++

Posted on Mon, 11 May 2026 13:18:36 +0000 by rishiraj