Querying Path Sums with Value Constraints in Trees Using Persistent Segment Trees

Problem Overview

Given a tree with weighted nodes and multiple queries, each query asks for the sum of node weights along the path between two nodes x and y, where the weights fall within a specified range [l, r].

Solution Approach

This problem can be efficiently solved using persistent segment trees (also known as chairman trees) combined with Lowest Common Ancestor (LCA) computation. The core idea involves building a persistent segment tree for each node that stores the cumulative sum of weights along the path from the root to that node.

Key Implementation Details

Weight Discretization

Node weight are first discretized to reduce the value range and optimize segment tree operations:

vector<int> distinct_vals = original_vals;
sort(distinct_vals.begin(), distinct_vals.end());
distinct_vals.erase(unique(distinct_vals.begin(), distinct_vals.end()), distinct_vals.end());
for (int i = 1; i <= n; i++) {
    compressed[i] = lower_bound(distinct_vals.begin(), distinct_vals.end(), original_vals[i]) - distinct_vals.begin() + 1;
}

Persistent Segment Tree Structure

The segment tree maintains versioned states where each version corresponds to a node's path from the root:

struct PersistentSegmentTree {
    struct Node {
        int left_child, right_child;
        long long sum;
    } nodes[MAX_NODES * 20];
    
    int root_versions[MAX_NODES];
    int node_count = 0;
    
    int create_new_version(int prev_version, int start, int end, int position, int value) {
        int current = ++node_count;
        nodes[current] = nodes[prev_version];
        nodes[current].sum += value;
        
        if (start == end) return current;
        
        int mid = (start + end) / 2;
        if (position <= mid) {
            nodes[current].left_child = create_new_version(nodes[prev_version].left_child, start, mid, position, value);
        } else {
            nodes[current].right_child = create_new_version(nodes[prev_version].right_child, mid + 1, end, position, value);
        }
        return current;
    }
    
    long long query_range(int version, int start, int end, int query_l, int query_r) {
        if (query_r < start || query_l > end) return 0;
        if (query_l <= start && end <= query_r) return nodes[version].sum;
        
        int mid = (start + end) / 2;
        long long left_sum = query_range(nodes[version].left_child, start, mid, query_l, query_r);
        long long right_sum = query_range(nodes[version].right_child, mid + 1, end, query_l, query_r);
        return left_sum + right_sum;
    }
};

Tree Traversal and Version Creation

During a DFS traversal, create new segment tree versions for each node:

void build_tree_versions(int current, int parent) {
    segment_tree.root_versions[current] = segment_tree.create_new_version(
        segment_tree.root_versions[parent], 
        1, value_count, 
        compressed_weight[current], 
        original_weight[current]
    );
    
    for (int neighbor : adjacency_list[current]) {
        if (neighbor == parent) continue;
        build_tree_versions(neighbor, current);
    }
}

Query Processing

For each query (x, y, l, r), compute the result using the formula: sum(x) + sum(y) - sum(lca) - sum(parent[lca])

long long process_query(int node_x, int node_y, int lca_node, int low_bound, int high_bound) {
    long long sum_x = segment_tree.query_range(segment_tree.root_versions[node_x], 1, value_count, low_bound, high_bound);
    long long sum_y = segment_tree.query_range(segment_tree.root_versions[node_y], 1, value_count, low_bound, high_bound);
    long long sum_lca = segment_tree.query_range(segment_tree.root_versions[lca_node], 1, value_count, low_bound, high_bound);
    long long sum_parent_lca = segment_tree.query_range(segment_tree.root_versions[parent[lca_node]], 1, value_count, low_bound, high_bound);
    
    return sum_x + sum_y - sum_lca - sum_parent_lca;
}

LCA Computation

Tarjan's offline algorithm is recommended for LCA computation too avoid meemory issues:

void tarjan_lca(int current) {
    visited[current] = true;
    parent_dsu[current] = current;
    
    for (int neighbor : adjacency_list[current]) {
        if (neighbor == parent_tree[current]) continue;
        parent_tree[neighbor] = current;
        tarjan_lca(neighbor);
        parent_dsu[neighbor] = current;
    }
    
    for (auto& query : queries_related_to[current]) {
        int other_node = query.other_node;
        if (visited[other_node]) {
            query.lca_result = find_dsu(other_node);
        }
    }
}

Important Considerations

  • Use 64-bit integers (long long) to prevent integer overflow
  • Properly handle edge cases where the query range [l, r] contains no values
  • For large trees, Tarjan's LCA algorithm is preferred over binary lifting due to memory constraints
  • The discretization step is crucial for efficient segment tree operations

Alternative Approaches

An offline approach can split each query into two parts: [0, r] minus [0, l-1], which can be processed using a simpler segment tree structure, though this requires different query organization.

Tags: persistent-segment-tree lowest-common-ancestor tree-algorithms range-queries algorithm-optimization

Posted on Fri, 15 May 2026 17:00:54 +0000 by bigbob