Heaps: Core Concepts, Implementations, and Practical Applications

A heap is a specialized complete binary tree that adheres to strict ordering rules, with two primary variants:

  • Max Heap: Every node’s value is greater than or equal to the values of its child nodes.
  • Min Heap: Every node’s value is less than or equal to the values of its child nodes.

As a type of complete binary tree, heaps exhibit key characteristics:

  • Nodes are filled level by level, with the bottom level populated from left to right.
  • The root node is called the "heap top," while the rightmost node in the bottom level is the "heap bottom."
  • For max heaps, the heap top holds the largest value; for min heaps, it holds the smallest.

Common Heap Operations

Many programming languages provide a priority queue—an abstract data structure where elements are dequeued based on priority. Since heaps are the standard implementation for priority queues (max heap for descending priority, min heap for ascending), the two terms are often used interchangeably in practice.

Core heap operations and their time complexities are:

  • push(val): Add an element to the heap (O(log n))
  • pop(): Remove and return the heap top element (O(log n))
  • peek(): Access the heap top element without removal (O(1))
  • get_size(): Return the number of elements in the heap (O(1))
  • is_empty(): Check if the heap is empty (O(1))

Heap Implementation in C

We implemnet a max heap here; modifying it to a min heap only requires reversing all value comparisons.

Storage Representation

Complete binary trees are efficiently stored in arrays. For a node at index i, its left child is at 2*i + 1, right child at 2*i + 2, and parent at (i - 1) // 2. We encapsulate these calculations in helper functions:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>

#define MAX_CAPACITY 1000

typedef struct {
    int elements[MAX_CAPACITY];
    int count;
} MaxHeap_t;

// Helper function to get left child index
int left_child(MaxHeap_t* heap, int idx) {
    return 2 * idx + 1;
}

// Helper function to get right child index
int right_child(MaxHeap_t* heap, int idx) {
    return 2 * idx + 2;
}

// Helper function to get parent node index
int parent_node(MaxHeap_t* heap, int idx) {
    return (idx - 1) / 2;
}

// Helper function to swap two elements in the heap
void swap_heap_elements(MaxHeap_t* heap, int a, int b) {
    int temp = heap->elements[a];
    heap->elements[a] = heap->elements[b];
    heap->elements[b] = temp;
}

Access Heap Top

The heap top is the first element in the array:

// Retrieve the heap top element
int heap_peek(MaxHeap_t* heap) {
    return heap->elements[0];
}

Insert Element into Heap

Add the element to the heap bottom, then sift up (heapify) to restore heap properties:

// Sift up operation to maintain max heap property
void sift_up(MaxHeap_t* heap, int idx) {
    for (int parent = parent_node(heap, idx); 
         parent >= 0 && heap->elements[idx] > heap->elements[parent]; 
         idx = parent, parent = parent_node(heap, idx)) {
        swap_heap_elements(heap, idx, parent);
    }
}

// Insert an element into the heap
void heap_push(MaxHeap_t* heap, int value) {
    if (heap->count >= MAX_CAPACITY) {
        printf("Heap overflow: cannot insert new element\n");
        return;
    }
    heap->elements[heap->count] = value;
    heap->count++;
    sift_up(heap, heap->count - 1);
}

Remove Heap Top Element

Swap heap top with heap bottom, remove the bottom element, then sift down to restore heap properties:

// Check if heap is empty
bool heap_is_empty(MaxHeap_t* heap) {
    return heap->count == 0;
}

// Get current size of the heap
int heap_size(MaxHeap_t* heap) {
    return heap->count;
}

// Sift down operation to maintain max heap property
void sift_down(MaxHeap_t* heap, int idx) {
    while (true) {
        int left = left_child(heap, idx);
        int right = right_child(heap, idx);
        int largest = idx;
        
        if (left < heap->count && heap->elements[left] > heap->elements[largest]) {
            largest = left;
        }
        if (right < heap->count && heap->elements[right] > heap->elements[largest]) {
            largest = right;
        }
        
        if (largest == idx) break;
        swap_heap_elements(heap, idx, largest);
        idx = largest;
    }
}

// Remove and return the heap top element
int heap_pop(MaxHeap_t* heap) {
    if (heap_is_empty(heap)) {
        printf("Heap underflow: cannot pop from empty heap\n");
        return INT_MAX;
    }
    int top_val = heap->elements[0];
    swap_heap_elements(heap, 0, heap->count - 1);
    heap->count--;
    sift_down(heap, 0);
    return top_val;
}

Efficient Heap Construction

There are two primary methods to build a heap from an existing array:

  1. Naive Approach: Insert each element individually using heap_push, resulting in O(n log n) time complexity.
  2. Optimal Bottom-Up Approach: Start from the last non-leaf node and perform sift down operations. This method achieves O(n) time complexity, significantly faster for large datasets.
// Create a max heap from an existing array
MaxHeap_t* create_max_heap(int arr[], int arr_size) {
    MaxHeap_t* heap = (MaxHeap_t*)malloc(sizeof(MaxHeap_t));
    heap->count = arr_size;
    memcpy(heap->elements, arr, arr_size * sizeof(int));
    
    // Start heapifying from the last non-leaf node
    for (int i = parent_node(heap, arr_size - 1); i >= 0; i--) {
        sift_down(heap, i);
    }
    return heap;
}

// Free the heap memory
void destroy_max_heap(MaxHeap_t* heap) {
    free(heap);
}

Complexity Analysis

For a complete binary tree with n nodes, leaf nodes constitute (n+1)/2 of the total, leaving (n-1)/2 nodes to heapify. Each node's sift down cost is proportional to its depth. Summing these costs across all nodes shows the total time complexity is O(n), as the higher number of shalow nodes offsets the lower number of deep nodes.

Top-K Problem: Finding the K Largest Elements

Given an unsorted array, finding the top k largest elements can be solved in three ways:

  • Iterative Selection: Perform k passes to find the largest, second largest, etc., with O(nk) time complexity (inefficient for large k).
  • Sorting: Sort the array and take the last k elements, with O(n log n) time complexity (overkill as it sorts all elements).
  • Heap-Based Approach: Use a min heap of size k to track the largest k elements, achieving O(n log k) time complexity—optimal for most cases.

The heap-based approach works as follows:

  1. Initialize a min heap (simulated using a max heap with negated values).
  2. Insert the first k elements into the heap.
  3. For each remaining element, if it exceeds the heap's smallest element (top), replace the top with this element.
  4. After processing all elements, the heap contains the k largest elements.
// Helper function to push to min heap (using max heap with negation)
void min_heap_push(MaxHeap_t* heap, int value) {
    heap_push(heap, -value);
}

// Helper function to pop from min heap
int min_heap_pop(MaxHeap_t* heap) {
    return -heap_pop(heap);
}

// Helper function to peek min heap top
int min_heap_peek(MaxHeap_t* heap) {
    return -heap_peek(heap);
}

// Extract elements from min heap (convert back from negated values)
int* extract_min_heap_elements(MaxHeap_t* heap, int* result_size) {
    *result_size = heap->count;
    int* res = (int*)malloc(*result_size * sizeof(int));
    for (int i = 0; i < *result_size; i++) {
        res[i] = -heap->elements[i];
    }
    return res;
}

// Find top k largest elements using heap approach
int* find_top_k(int* nums, int nums_size, int k, int* return_size) {
    *return_size = k;
    MaxHeap_t* min_heap = (MaxHeap_t*)malloc(sizeof(MaxHeap_t));
    min_heap->count = 0;
    
    // Insert first k elements
    for (int i = 0; i < k; i++) {
        min_heap_push(min_heap, nums[i]);
    }
    
    // Process remaining elements
    for (int i = k; i < nums_size; i++) {
        if (nums[i] > min_heap_peek(min_heap)) {
            min_heap_pop(min_heap);
            min_heap_push(min_heap, nums[i]);
        }
    }
    
    int* result = extract_min_heap_elements(min_heap, return_size);
    destroy_max_heap(min_heap);
    return result;
}

Tags: Heap Data Structure priority queue Top-K Algorithm c programming algorithm implementation

Posted on Sat, 13 Jun 2026 17:27:42 +0000 by scorphus