Understanding Graph Color Algorithm in Houdini Vellum Solver

Vellum 18 employs XPBD (eXtended Position Based Dynamics) as its underlying physics framework, which distinguishes itself from traditional PBD through second-order integration rather than first-order. This architectural choice enables more accurate constraint satisfaction but demands efficient parallel execution during the iterative constraint solving phase.

Parallel Constraint Solving Through Graph Coloring

The Vellum Solver performs repeated constraint calculations across multiple iterations, making parallelization essential for performance. Graph Color technology addresses this challenge by ensuring that constraints sharing geometric connections receive different color values. Constraints with identical colors form a workset, allowing the OpenCL compute kernel to process all members of a single workset simultaneously without race conditions.

Consider a grid geometry processed through Vellum Constraints to generate distance constraints. The Graph Color node assigns integer color values to primitive attributes such that adjacent primitives never share the same color. When visualized, each edge (primitive) displays a distinct color from its neighbors, demonstrating the successful partitioning of constraints into parallel-executable worksets.

The Vellum Solver performs this coloring internally during simulation, though the concept can be demonstrated using standalone nodes for educational purposes.

Constraint Propagation and Color Assignment

When a grid has opposing edge points pinned through Pin Constraints and distance constraints applied along edges with maximum stiffness, an interesting artifact emerges. With Rest Length Scale set to 0.5 and Smooth Iterations disabled, the solver compresses some edges to half their original length while simultaneously stretching others.

Examining the color assignment reveals that stretched edges share identical colors, indicating they belong to the same workset and execute concurrently in OpenCL. The yellow-colored edges process first, accumulating errors from the constraint solving iteration before red and blue edges execute. This sequential error accumulation causes compression artifacts that violate the intended material behavior.

Algorithm Implementation Details

The Graph Color node processes primitives through several distinct stages, with the connectivity mode set to primitives since Vellum constraints operate at the primitive level.

Primitive Point Count Analysis

The algorithm begins by creating an integer attribute to store the neighbor count for each primitive:

int neighbor_count = len(primpoints(0, @primnum));
setprimattrib(geoself(), "neighbor_count", @primnum, neighbor_count);

High Valence Primitive Preprocessing

Primitives with excessive neighbors require special handling to prevent memory overflow and computational overhead during OpenCL execution. The algorithm filters primitives exceeding the maximum valence threshold (default: 20) and processes them sequentially before parallel coloring.

int threshold = chi("max_valence");
string filter = sprintf("neighbor_count>=%d", threshold);
int high_valence_prims[] = expandprimgroup(0, filter);
int prim_count = len(high_valence_prims);

if (prim_count == 0)
    return;

int point_count = npoints(0);
bool color_mapping[];
int assigned_colors[];
resize(assigned_colors, prim_count);
int color_count = 0;

int vertex_counts[];
resize(vertex_counts, prim_count);

foreach(int index; int pid; high_valence_prims) {
    vertex_counts[index] = -len(primpoints(0, pid));
}

high_valence_prims = reorder(high_valence_prims, argsort(vertex_counts));

The primitives undergo reverse sorting by point count, ensuring those with more vertices process first and reserve colors for more constrained geometry.

Color Assignment for High Valence Primitives

For each high-valence primitive, the algorithm attempts to assign an existing color beefore creating new colors:

foreach(int index; int pid; high_valence_prims) {
    bool requires_new = true;
    int vertices[] = primpoints(0, pid);
    
    for(int c = 0; c < color_count; c++) {
        bool available = true;
        foreach(int v; vertices) {
            if (color_mapping[c * point_count + v] != 0) {
                available = false;
                break;
            }
        }
        
        if (available) {
            assigned_colors[index] = c;
            foreach(int v; vertices) {
                color_mapping[c * point_count + v] = 1;
            }
            requires_new = false;
            break;
        }
    }
    
    if (requires_new) {
        assigned_colors[index] = color_count;
        color_count++;
        resize(color_mapping, color_count * point_count);
        foreach(int v; vertices) {
            color_mapping[assigned_colors[index] * point_count + v] = 1;
        }
    }
}

foreach(int index; int pid; high_valence_prims) {
    setprimattrib(geoself(), "result_color", pid, assigned_colors[index]);
}

The color mapping array tracks which colors are unavailable at each point, preventing adjacent primitives from receiving identical colors. When no existing color satisfies the constraints, a new color value increments the palette.

Neighbor Discovery

The algorithm identifies adjacent primitives by examining shared vertices:

int adjacent[];
int shared[] = primpoints(0, @primnum);
foreach(int v; shared) {
    int connected[] = pointprims(0, v);
    foreach(int neighbor; connected) {
        if (neighbor != @primnum) {
            append(adjacent, neighbor);
        }
    }
}
i[]@adjacent_prims = adjacent;

Workset Configuration

OpenCL requires workset boundaries for parallel execution. The algorithm configures a single workset containing all primitives, with the ability to dynamically shrink during iteration if constraints converge:

i[]@workset_offsets = {0};
i[]@workset_sizes = array(nprimitives(0));

The sizing mechanism allows early termination. When constraint iterations converge sufficiently, setting the workset size to zero prevents unnecessary computation cycles.

Gebremedhin-Manne Speculative Coloring Algorithm

The OpenCL kernel employs the Gebremedhin-Manne algorithm for conflict-aware parallel coloring. Each iteration consists of speculative color assignment followed by conflict resolution.

Bitwise Color Assignment

The algorithm uses 64-bit integers to efficiently track forbidden colors:

#define FORBID_BITS 64
#define MAX_FORBID 18446744073709551615ull

kernel void assign_colors(
    int workset_start,
    int workset_count,
    int neighbor_count,
    global int* neighbor_index,
    global int* neighbors,
    int vertex_count,
    global int* vertex_counts,
    int color_buffer_size,
    global int* colors,
    int speculative_size,
    global int* speculative
) {
    int id = get_global_id(0);
    if (id >= workset_count)
        return;
    
    if (id == 0)
        *speculative = 0;
    
    if (colors[id] >= 0)
        return;
    
    int range_start = neighbor_index[id];
    int range_end = neighbor_index[id + 1];
    int candidate = -1;
    int bit_offset = 0;
    
    while (candidate < 0) {
        ulong forbidden = 0;
        for (int i = range_start; i < range_end; i++) {
            int neighbor = neighbors[i];
            int neighbor_color = colors[neighbor] - bit_offset;
            if (neighbor_color >= 0 && neighbor_color < FORBID_BITS) {
                forbidden |= (1ull << neighbor_color);
            }
        }
        
        if ((forbidden ^ MAX_FORBID) != 0) {
            ulong mask = forbidden;
            candidate = bit_offset;
            mask = (~mask) & (mask + 1);
            while (mask > 1) {
                mask >>= 1;
                candidate++;
            }
        } else {
            bit_offset += FORBID_BITS;
        }
    }
    speculative[id] = candidate;
}

The bitwise operations efficiently identify available colors. The forbidden mask tracks colors used by adjacent primitives, and bitwise manipulation locates the first unset bit representing an available color.

Conflict Resolution

Conflict resolution ensures that adjacent primitives receive different colors through a deterministic tie-breaking mechanism:

kernel void resolve_conflicts(
    int workset_start,
    int workset_count,
    int neighbor_count,
    global int* neighbor_index,
    global int* neighbors,
    int vertex_count,
    global int* vertex_counts,
    int color_buffer_size,
    global int* colors,
    int speculative_size,
    global int* speculative,
    int workset_size_buffer,
    global int* size_index,
    global int* workset_sizes
) {
    int id = get_global_id(0);
    if (id >= workset_count)
        return;
    
    if (colors[id] >= 0)
        return;
    
    int range_start = neighbor_index[id];
    int range_end = neighbor_index[id + 1];
    int assigned = speculative[id];
    bool collision = false;
    int my_vertices = vertex_counts[id];
    
    for (int i = range_start; i < range_end; i++) {
        int neighbor = neighbors[i];
        int neighbor_speculative = speculative[neighbor];
        
        if (neighbor_speculative == assigned) {
            int neighbor_vertices = vertex_counts[neighbor];
            if (neighbor_vertices > my_vertices || 
               (neighbor_vertices == my_vertices && neighbor < id)) {
                collision = true;
                break;
            }
        }
    }
    
    if (collision)
        *workset_sizes = workset_count;
    else
        colors[id] = assigned;
}

The conflict resolution prioritizes primitives with more vertices, ensuring that the most constrained geometry retains its speculative color asignment.

Workset Finalization

The final stages organize primitives by color for optimal OpenCL memory access patterns:

int workset_offsets[];
int workset_sizes[];
int current_offset = 0;
string color_attr = chs("color_attribute");
int distinct_colors = nuniqueval(0, "primitive", color_attr);
resize(workset_offsets, distinct_colors);
resize(workset_sizes, distinct_colors);

for (int i = 0; i < distinct_colors; i++) {
    int color_value = uniqueval(0, "primitive", color_attr, i);
    int primitives_with_color = findattribvalcount(0, "primitive", color_attr, color_value);
    workset_offsets[i] = current_offset;
    workset_sizes[i] = primitives_with_color;
    current_offset += primitives_with_color;
}

The generated offset and size arrays define workset boundaries, with each color value corresponding to one independent workset executable in parallel.

Tags: Houdini Vellum XPBD Graph Coloring OpenCL

Posted on Sun, 14 Jun 2026 16:27:04 +0000 by imnsi