Greenplum Automatic Partition Management: A Comprehensive Guide


Greenplum 6 provides robust partitioning capabilities that allow for efficient data management. This guide presents an enhanced automatic partitioning solution that supports daily, monthly, and yearly partitioning strategies with dynamic future partition creation.

Partition Management Function

The following function automates partition creation for Greenplum tables, supporting multiple granularity levels and including comprehensive error handling:

CREATE OR REPLACE FUNCTION manage_table_partitions(
    p_table_name text,                       -- Table name (with schema if needed)
    p_partition_type text DEFAULT 'monthly', -- Partition granularity: daily/monthly/yearly
    p_future_partitions int DEFAULT 3       -- Number of future partitions to create
)
RETURNS text AS $$
DECLARE
    v_schema_name text;                      -- Schema name
    v_table_name text;                      -- Table name without schema
    v_partition_type text;                  -- Normalized partition type
    v_base_date date;                       -- Base date aligned to partition type
    v_start_date date;                      -- Partition start date (inclusive)
    v_end_date date;                        -- Partition end date (exclusive)
    v_partition_name text;                  -- Generated partition name
    v_partition_exists int;                 -- Partition existence flag
    v_sql_command text;                     -- Dynamic ALTER statement
    v_execution_summary text := '';         -- Execution results summary
    v_created_count int := 0;               -- Count of successfully created partitions
    v_loop_counter int;                     -- Loop counter
BEGIN
    -- 1. Parse table name into schema and table components
    IF position('.' in p_table_name) > 0 THEN
        v_schema_name := split_part(p_table_name, '.', 1);
        v_table_name := split_part(p_table_name, '.', 2);
    ELSE
        v_schema_name := 'public';
        v_table_name := p_table_name;
    END IF;

    -- 2. Normalize and validate partition type parameter
    v_partition_type := lower(p_partition_type);
    IF v_partition_type NOT IN ('daily', 'monthly', 'yearly') THEN
        RAISE EXCEPTION 'Invalid partition type "%". Allowed values: daily, monthly, yearly', p_partition_type;
    END IF;

    -- 3. Loop through and create future partitions
    FOR v_loop_counter IN 0..(p_future_partitions - 1) LOOP
        -- Calculate base date aligned to partition type
        CASE v_partition_type
            WHEN 'daily' THEN
                v_base_date := date_trunc('day', current_date + (v_loop_counter || ' days')::interval)::date;
                v_start_date := v_base_date;
                v_end_date := v_base_date + interval '1 day';
                v_partition_name := 'p_' || to_char(v_start_date, 'YYYY_MM_DD');
            WHEN 'monthly' THEN
                v_base_date := date_trunc('month', current_date + (v_loop_counter || ' months')::interval)::date;
                v_start_date := v_base_date;
                v_end_date := v_base_date + interval '1 month';
                v_partition_name := 'p_' || to_char(v_start_date, 'YYYY_MM');
            WHEN 'yearly' THEN
                v_base_date := date_trunc('year', current_date + (v_loop_counter || ' years')::interval)::date;
                v_start_date := v_base_date;
                v_end_date := v_base_date + interval '1 year';
                v_partition_name := 'p_' || to_char(v_start_date, 'YYYY');
        END CASE;

        -- 4. Check if partition already exists
        EXECUTE format('
            SELECT 1 FROM pg_partitions
            WHERE schemaname = %L
              AND tablename = %L
              AND partitiontablename = %L',
            v_schema_name, v_table_name, v_partition_name
        ) INTO v_partition_exists;

        -- 5. Create partition if it doesn't exist
        IF v_partition_exists IS NULL THEN
            BEGIN
                v_sql_command := format('
                    ALTER TABLE %I.%I
                    SPLIT DEFAULT PARTITION
                    START (%L::date) INCLUSIVE
                    END (%L::date) EXCLUSIVE
                    INTO (PARTITION %I, DEFAULT PARTITION)',
                    v_schema_name, v_table_name,
                    v_start_date::text, v_end_date::text,
                    v_partition_name
                );

                RAISE NOTICE 'Executing: %', v_sql_command;

                EXECUTE v_sql_command;
                v_created_count := v_created_count + 1;
                v_execution_summary := v_execution_summary || format('Created %s [%s to %s]; ', v_partition_name, v_start_date, v_end_date);
            EXCEPTION WHEN OTHERS THEN
                RAISE WARNING 'Failed to create %: %', v_partition_name, SQLERRM;
                v_execution_summary := v_execution_summary || format('Failed to create %s: %s; ', v_partition_name, SQLERRM);
            END;
        END IF;
    END LOOP;

    -- 6. Return summary of execution
    RETURN format('Created %s new partition(s). Details: %s', v_created_count, v_execution_summary);
END;
$$ LANGUAGE plpgsql;

Partition Table Structure Examples

Here are examples of partitioned table structures in Greenplum:

CREATE TABLE public.sales_data (
    sale_id integer NOT NULL,
    product_id varchar(50),
    sale_date date NOT NULL,
    amount numeric(10,2),
    region varchar(50)
)
WITH (
    appendonly=true,
    orientation=column,
    compresstype=zstd,
    compresslevel=5
)
DISTRIBUTED BY (sale_date)
PARTITION BY RANGE(sale_date) 
          (
          PARTITION p_2023_01 START ('2023-01-01'::date) END ('2023-02-01'::date), 
          PARTITION p_2023_02 START ('2023-02-01'::date) END ('2023-03-01'::date), 
          DEFAULT PARTITION default_part  
          );

Viewing Table Partitions

To view existing partitions in a table:

SELECT SCHEMANAME, partitionname, partitiontablename, partitiontype, partitionrangestart, partitionrangeend
FROM pg_partitions
WHERE tablename = 'sales_data' AND schemaname = 'public';

Usage Examples

Monthly Partitions (Next 6 Months)

Create monthly partitions for the next 6 months:

SELECT manage_table_partitions('public.sales_data', 'monthly', 6);

Daily Partitions (Next 7 Days)

Create daily partitions for the next week:

SELECT manage_table_partitions('public.sales_data', 'daily', 7);

Yearly Partitions (Next 2 Years)

Create yearly partittions for the next two years:

SELECT manage_table_partitions('public.sales_data', 'yearly', 2);

Linux Automation Script

The following shell script automates partition management for multiple tables:

#!/bin/bash
# =============================================================================
# Script Name: auto_partition_manager.sh
# Description: Automates partition creation for multiple Greenplum tables
# Environment: Requires psql client with configured database connection
# Usage: 
#   1. Execute directly: ./auto_partition_manager.sh
#   2. Schedule via cron: Add to crontab for daily execution
# =============================================================================

# ------------------------------ Configuration ------------------------------
# Database connection parameters (can be overridden by environment variables)
DB_HOST="${DB_HOST:-127.0.0.1}"          # Database host
DB_PORT="${DB_PORT:-5432}"              # Port
DB_NAME="datawarehouse"                # Database name
DB_USER="${DB_USER:-gpadmin}"          # Username
# Password should be set in ~/.pgpass file for security

# Log file path
LOG_FILE="/var/log/greenplum_partition.log"

# Tables to process and their parameters
# Format: "schema.table:partition_type:future_count"
# partition_type: daily / monthly / yearly
# future_count: Number of future partitions to create (default: 3)
TABLE_CONFIGS=(
    "public.sales_data:monthly:6"
    "public.customer_activity:monthly:6"
    "public.product_inventory:daily:7"
    "public.financial_records:yearly:2"
)

# Optional: Load configurations from external file
# CONFIG_FILE="/etc/greenplum/partition_config.conf"
# if [ -f "$CONFIG_FILE" ]; then
#     mapfile -t TABLE_CONFIGS < "$CONFIG_FILE"
# fi
# ----------------------------------------------------------------------------

# Check if psql is available
if ! command -v psql &> /dev/null; then
    echo "$(date '+%Y-%m-%d %H:%M:%S') - ERROR: psql command not found" >> "$LOG_FILE"
    exit 1
fi

# Record start time
echo "$(date '+%Y-%m-%d %H:%M:%S') - ========== Starting automatic partition creation task ==========" >> "$LOG_FILE"

# Process each table configuration
for config in "${TABLE_CONFIGS[@]}"; do
    # Skip empty lines and comments
    [[ -z "$config" || "$config" =~ ^#.*$ ]] && continue

    # Parse configuration
    IFS=':' read -r table partition_type future <<< "$config"
    future=${future:-3}

    echo "$(date '+%Y-%m-%d %H:%M:%S') - Processing table: $table (type=$partition_type, future=$future)" >> "$LOG_FILE"

    # Build SQL to call the partition management function
    SQL="SELECT manage_table_partitions('$table', '$partition_type', $future);"

    # Execute SQL and capture output
    result=$(psql -h "$DB_HOST" -p "$DB_PORT" -d "$DB_NAME" -U "$DB_USER" -t -c "$SQL" 2>&1)
    exit_code=$?

    if [ $exit_code -eq 0 ]; then
        echo "$(date '+%Y-%m-%d %H:%M:%S') - Success: $result" >> "$LOG_FILE"
    else
        echo "$(date '+%Y-%m-%d %H:%M:%S') - Failure: $result" >> "$LOG_FILE"
    fi
done

echo "$(date '+%Y-%m-%d %H:%M:%S') - ========== Automatic partition creation task completed ==========" >> "$LOG_FILE"
exit 0

Cron Job Configuration

Schedule the script to run daily at 2 AM:

# Daily partition management at 2 AM
0 2 * * * /home/gpadmin/scripts/auto_partition_manager.sh

Important Considerations

Prerequisites

  • The target table must already be a partitioned table with a DEFAULT PARTITION
  • The partition column must be of date type (or implicitly convertible to date)

Partition Naming Convention

  • Daily: p_YYYY_MM_DD (e.g., p_2026_03_19)
  • Monthly: p_YYYY_MM (e.g., p_2026_03)
  • Yearly: p_YYYY (e.g., p_2026)

Boundary Handling

Uses left-closed, right-open interval principle (INCLUSIVE start, EXCLUSIVE end) consistent with Greenplum's default behavior.

Idempotency

The function checks for existing partitions before creation, ensuring safe repeated execuiton.

Error Handling

Individual partition creation failures won't interrupt the entire process, with detailed error reporting.

Performance Considerations

For large-scale partitioning (e.g., hundreds of daily partitions), limit the number of future partitions or execute in batches to avoid long-running operations.

Tags: greenplum PostgreSQL partitioning Database Administration PL/pgSQL

Posted on Wed, 10 Jun 2026 16:14:27 +0000 by leetee