Optimizing SQL Performance in OceanBase Database

OceanBase's architectural foundation differs significantly from traditional relational databases, which directly impacts SQL performance tuning strategies.

Architectural Distinctions

LSM-Tree Storage Engine Data is organized into static components (SSTables) and dynamic components (MemTables). Performance for many query improves after a major compaction merges these components. Buffer tables, often used for temporary data storage, can lead to performance degradation. Operations involving rapid writes, modifications, and deletions create "tombstone" markers in memory. Querying a range with many such markers can result in slower execution.

Distributed Share-Nothing Architecture Unlike traditional share-disk systems where execution plans assume local data access, OceanBase stores data across multiple physical nodes. A join operation between tables located on different nodes requires a distributed execution plan. Performance can be impacted if a data partition changes its primary node (leader), causing a previously local plan to become a remote or distributed plan.

Sources of SQL Performance Issues

SQL Statement Construction: Adherence to development best practices is critical.
Cost Model Imperfections: May require plan binding.
Inaccurate Statistics: Statistics are stored locally and updated during major compaction.
Physical Database Design: Fundamentally determines query performance.
System Load: Affects overall throughput and individual query response time.
Cleint-Side Routing: Can lead to remote execution.

SQL Tuning Approaches

Optimizing Single Query Execution

Single-Table Access: Focus on indexing, sort/aggregate operations, partitioning, and distributed parallelism.
Multi-Table Joins: Optimize join order, join algorithms, distributed parallelism, and consider query rewriting.

Optimizing System Throughput

Identify and tune slow SQL statements.
Balance SQL request traffic and resources via settings like ob_read_consistency, primary zone, client routing policies, and ensuring even distribution of hot query partitions.
Balance RPC request traffic for sub-plans by configuring internal OBServer routing and distributing hot query partitions evenly.

Basic Tuning Workflow

Use the global SQL audit table (g)v$sql_audit, SQL Trace, or plan cache views to identify SQL statements with high execution time or resource consumption (memory, disk I/O).
Analyze the execution plan using the EXPLAIN command. The logical plan generated by the optimizer is the primary tool for identifying tuning directions.
Apply optimizations to meet performance expectations:
- Rewrite the SQL statement equivalently to produce a better execution plan.
- For multi-table joins, optimize access paths, join order, and join algorithms.

Partitioned Tables Overview

OceanBase divides table data into distinct blocks based on defined rules, with each block (partition) stored together physically. This partitioning is a core distributed feature that addresses large table capacity and high-concurrency performance. A non-partitioned table is effectively a single-partition table. Each partition resides on one node, but different partitions of a table can be distributed across nodes. Partitioning aims to reduce the volume of data read/written for specific SQL operations, thereby improving response time. Benefits include scalability, manageability, and performance gains.

OceanBase Partitioning Characteristics

Horizontal scaling across machines.
Enhanced manageability and performance.
Automatic load balancing and fault tolerance.
Transparent to applications, potentially replacing manual sharding solutions.
Supports inter-partition parallelism.
Maximum of 8192 partitions per table.
Recommended limit of 30,000 partitions per server (supports up to 80,000).

Partition Types

OceanBase supports one-level and two-level (composite) partitioning. Primary Partition Types: HASH, KEY, LIST, RANGE, RANGE COLUMNS, and partitions using generated columns. Two-Level Partitioning: Further splits data within a primary partition based on a second dimension.

MySQL Mode Support:

RANGE, RANGE COLUMNS, LIST, LIST COLUMNS, HASH, KEY, and composite partitions.

Oracle Mode Support:

RANGE, LIST, HASH, and composite partitions.

Primary Partition Details

HASH Partitioning Specify a partition key and count. Suitable where RANGE or LIST partitioning isn't applicable. Data is distributed via a hash function on the partition key, ideal for point queries (e.g., by user_id) to eliminate hotspots.

CREATE TABLE sample_hash (
    col_a INT,
    col_b INT
) PARTITION BY HASH(col_a + 1) PARTITIONS 5;

Restriction (MySQL mode): The partition expression must evaluate to an INT type. Vector keys (e.g., (col_a, col_b)) are not supported.

KEY Partitioning Similar to HASH but uses an internal hash function on the key before modulo operation, making the target partition less predictable for users.

CREATE TABLE sample_key (
    col_a VARCHAR(16),
    col_b INT
) PARTITION BY KEY(col_a) PARTITIONS 5;

Differences from HASH: The partition key can be any data type, does not support expressions, but does support vector keys. A special syntax PARTITION BY KEY() uses the primary key columns as the partition key.

LIST Partitioning Divides data based on enumerated values.

CREATE TABLE sample_list (
    col_a INT,
    col_b INT
) PARTITION BY LIST(col_a) (
    PARTITION part_0 VALUES IN (1, 2, 3),
    PARTITION part_1 VALUES IN (5, 6),
    PARTITION part_default VALUES IN (DEFAULT)
);

Restriction: The expression must be INT type. Vector keys are not supported. LIST COLUMNS differs by allowing any data type, disallowing expressions, and supporting vector keys.

RANGE Partitioning Assigns rows to partitions based on specified expression ranges. Common for time-series data. Supports ADD/DROP partition operations.

CREATE TABLE time_series (
    id INT,
    event_time TIMESTAMP,
    details VARCHAR(20),
    PRIMARY KEY (event_time)
) PARTITION BY RANGE(UNIX_TIMESTAMP(event_time)) (
    PARTITION p_2014 VALUES LESS THAN (UNIX_TIMESTAMP('2015-01-01 00:00:00')),
    PARTITION p_2015 VALUES LESS THAN (UNIX_TIMESTAMP('2016-01-01 00:00:00')),
    PARTITION p_2016 VALUES LESS THAN (UNIX_TIMESTAMP('2017-01-01 00:00:00'))
);

RANGE COLUMNS differs by allowing any data type for the partition key, disallowing expressions, and supporting vector keys.

CREATE TABLE time_series_col (
    id INT,
    event_time DATETIME,
    details VARCHAR(20),
    PRIMARY KEY (event_time)
) PARTITION BY RANGE COLUMNS(event_time) (
    PARTITION p_2014 VALUES LESS THAN ('2015-01-01 00:00:00'),
    PARTITION p_2015 VALUES LESS THAN ('2016-01-01 00:00:00'),
    PARTITION p_2016 VALUES LESS THAN ('2017-01-01 00:00:00'),
    PARTITION p_future VALUES LESS THAN (MAXVALUE)
);

Generated Column Partitioning Uses a generated (computed) column as the partition key, useful for partitioning on a transformed field.

CREATE TABLE gen_col_part (
    record_key VARCHAR(10) PRIMARY KEY,
    gen_id VARCHAR(4) GENERATED ALWAYS AS (SUBSTRING(record_key, 1, 4)) VIRTUAL,
    value INT
) PARTITION BY KEY(gen_id) PARTITIONS 10;

Two-Level (Composite) Partitioning

Splits data along two dimensions. A common pattern is HASH by user_id and RANGE by creation time.

CREATE TABLE user_history (
    user_id INT,
    created DATETIME,
    info VARCHAR(20),
    PRIMARY KEY(user_id, created)
) PARTITION BY RANGE COLUMNS (created)
SUBPARTITION BY HASH(user_id) SUBPARTITIONS 3 (
    PARTITION p0 VALUES LESS THAN ('2014-11-11'),
    PARTITION p1 VALUES LESS THAN ('2015-11-11'),
    PARTITION p2 VALUES LESS THAN ('2016-11-11'),
    PARTITION p3 VALUES LESS THAN ('2017-11-11')
);

Supported Composite Partition Combinations:

HASH/KEY + RANGE/RANGE_COLUMNS
RANGE/RANGE_COLUMNS + HASH/KEY
LIST/LIST_COLUMNS + RANGE/RANGE_COLUMNS
RANGE/RANGE_COLUMNS + LIST/LIST_COLUMNS
HASH/KEY + LIST/LIST_COLUMNS
LIST/LIST_COLUMNS + HASH/KEY

Note: ADD/DROP partition operations are only supported when RANGE or RANGE_COLUMNS is the first-level partition. Therefore, RANGE + HASH is generally preferred over HASH + RANGE.

Summary of Compatible Combinations The following pairings are equivalent and can be used to create partitioned tables (order is not significant): HASH/KEY == RANGE; LIST == RANGE; LIST == HASH.

Tags: OceanBase SQL Tuning Database Performance partitioning Distributed Databases

Posted on Wed, 20 May 2026 08:13:05 +0000 by bhavesh

Freaks City