Root Cause Analysis of Performance Variations
A colleague recently shared a comparative analysis between MogDB and KingBase conducted by another technical blogger. The benchmark utilized sysbench for testing, and the results indicated that MogDB underperformed compared to KingBase in most scenarios, with only one test case showing significantly higher results for MogDB.
Upon initial review, this comparison raised some questions. The test environment utilized high-end hardware specifications: Kunpeng 920 processor with 128 cores and 500GB SSD storage. One would expect substantially higher performance metrics from both databases under such conditions. The blogger noted that no specific optimization configurations were applied during testing.
Why KingBase Shows Higher Results
The key factor explaining why certain query scenarios demonstrate significantly higher numbers for KingBase is parallel execution. KingBase enables parallel query processing by default, which can dramatically improve performance for certain workload types. This is a critical distinction that often leads to misleading benchmark comparisons when not properly documented.
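Before comparing numbers, it helps to record the parallel-execution settings on both systems. The following is a minimal sketch, not the blogger's procedure. The parameter names (`max_parallel_workers_per_gather` for KingBase, following PostgreSQL naming, and `query_dop` for MogDB, following openGauss naming), the `ksql` and `gsql` clients, the database names, and the ports are all assumptions that should be checked against each product's documentation.

```shell
#!/bin/sh
# Assumed parameter names; verify them against each product's documentation.
KINGBASE_PARAM="max_parallel_workers_per_gather"  # PostgreSQL-style setting (assumption)
MOGDB_PARAM="query_dop"                           # openGauss-style SMP degree (assumption)

# Build the SQL statement that reports one configuration parameter.
show_sql() {
  printf 'SHOW %s;\n' "$1"
}

# Only contact a database when explicitly requested with RUN_CHECK=1.
# Client names, database names, and ports below are placeholders.
if [ "${RUN_CHECK:-0}" = "1" ]; then
  ksql -d "${KB_DB:-test}" -p "${KB_PORT:-54321}" -c "$(show_sql "$KINGBASE_PARAM")"
  gsql -d "${MOG_DB:-postgres}" -p "${MOG_PORT:-26000}" -c "$(show_sql "$MOGDB_PARAM")"
fi
```

Recording these values alongside the TPS figures makes it clear whether two runs were executed with comparable parallelism.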
Understanding Performance Degradation
Another observation from the benchmark was that both MogDB and KingBase exhibited decreased performance after multiple test runs. This behavior is well-documented in PostgreSQL-based databases and stems from table bloat, a common issue in which the dead row versions left behind by deletes and updates accumulate in the table without being physically removed.
Regular VACUUM operations are essential to maintain optimal performance. The upcoming Ustore storage engine addresses this limitation by implementing a design similar to Oracle's UNDO mechanism, effectively eliminating table bloat concerns.
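A maintenance step between benchmark runs could look like the sketch below. It assumes the `gsql` client, a database named `sbtest`, port 26000, and sysbench tables named `sbtest1` through `sbtestN`; none of these details come from the original benchmark. The statistics view `pg_stat_user_tables` and its `n_dead_tup` column follow PostgreSQL conventions.

```shell
#!/bin/sh
# Client, database name, and port are assumptions; adjust to the environment.
DB_CLIENT="${DB_CLIENT:-gsql}"
DB_NAME="${DB_NAME:-sbtest}"
DB_PORT="${DB_PORT:-26000}"

# Query that lists the tables holding the most dead row versions.
DEAD_TUPLE_SQL="SELECT relname, n_live_tup, n_dead_tup FROM pg_stat_user_tables ORDER BY n_dead_tup DESC LIMIT 10;"

# Print one VACUUM ANALYZE statement per sysbench table (sbtest1..sbtestN).
vacuum_statements() {
  i=1
  while [ "$i" -le "$1" ]; do
    printf 'VACUUM ANALYZE sbtest%d;\n' "$i"
    i=$((i + 1))
  done
}

# Only contact the database when explicitly requested with RUN_MAINTENANCE=1.
if [ "${RUN_MAINTENANCE:-0}" = "1" ]; then
  "$DB_CLIENT" -d "$DB_NAME" -p "$DB_PORT" -c "$DEAD_TUPLE_SQL"
  # Each VACUUM is sent as its own command, since VACUUM cannot run
  # inside a multi-statement transaction block.
  vacuum_statements "${TABLES:-10}" | while read -r stmt; do
    "$DB_CLIENT" -d "$DB_NAME" -p "$DB_PORT" -c "$stmt" </dev/null
  done
fi
```

Running this between test rounds keeps each round starting from a comparable table state, so a drop in TPS is less likely to be an artifact of accumulated dead rows.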
Optimization Considerations
For MogDB deployments, the installation tool (ptk) automatically applies standard parameter optimizations suitable for most production environments. This baseline configuration typically provides adequate performance without manual tuning.
Benchmark Results from Standard Development Environment
Because high-specification hardware was not available, the tests were conducted on internal development infrastructure with more modest specifications:
$ cat /proc/cpuinfo | grep 'model name' |uniq
model name : Intel(R) Xeon(R) Gold 5318Y CPU @ 2.10GHz
$ cat /proc/cpuinfo | grep "physical id" | uniq | wc -l
16
$ cat /proc/meminfo | head -5
MemTotal: 64651288 kB
MemFree: 11864704 kB
MemAvailable: 56137824 kB
Buffers: 1768 kB
Cached: 50118012 kB
System configuration: 16 cores, 64GB RAM, SATA SSD storage.
Original Benchmark Data (80 concurrent threads)
| Test Scenario | TPS |
|---|---|
| oltp_point_select | 96251 |
| oltp_read_only | 6164 |
| select_random_points | 64906 |
| select_random_ranges | 34030 |
| oltp_insert | 66850 |
| oltp_write_only | 12138 |
| oltp_delete | 53598 |
| oltp_update_non_index | 58508 |
| oltp_update_index | 55031 |
| oltp_read_write | 2689 |
Internal Test Results (Same Configuration)
| Test Scenario | TPS |
|---|---|
| oltp_point_select | 107679.39 |
| oltp_read_only | 5512.35 |
| select_random_points | 77225 |
| select_random_ranges | 34232 |
| oltp_insert | 33988.50 |
| oltp_write_only | 12476.46 |
| oltp_delete | 91002.39 |
| oltp_update_non_index | 44770.62 |
| oltp_update_index | 39473.81 |
| oltp_read_write | 3183.41 |
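A run over the ten scenarios tabulated above could be scripted roughly as follows. Only the thread count (80) comes from the benchmark description. The connection details, table count, table size, and run time are placeholders chosen for illustration, and the script prints the commands by default instead of executing them.

```shell
#!/bin/sh
# Everything except --threads=80 is an assumption made for illustration.
DB_HOST="${DB_HOST:-127.0.0.1}"
DB_PORT="${DB_PORT:-26000}"
DB_USER="${DB_USER:-sbtest}"
DB_PASSWORD="${DB_PASSWORD:-}"
DB_NAME="${DB_NAME:-sbtest}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands instead of running them

run_scenario() {
  # $1 = sysbench scenario, $2 = sysbench command (prepare, run, or cleanup)
  set -- sysbench "$1" \
    --db-driver=pgsql \
    --pgsql-host="$DB_HOST" --pgsql-port="$DB_PORT" \
    --pgsql-user="$DB_USER" --pgsql-password="$DB_PASSWORD" \
    --pgsql-db="$DB_NAME" \
    --tables=10 --table-size=1000000 \
    --threads=80 --time=60 --report-interval=10 \
    "$2"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$@"
  else
    "$@"
  fi
}

for scenario in oltp_point_select oltp_read_only select_random_points \
    select_random_ranges oltp_insert oltp_write_only oltp_delete \
    oltp_update_non_index oltp_update_index oltp_read_write; do
  run_scenario "$scenario" run
done
```

Setting `DRY_RUN=0` executes the commands. Keeping the table count, table size, and run time identical across databases is what makes the resulting TPS figures comparable.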
Analysis
Several test scenarios demonstrate improved performance on the lower-specification environment compared to the original benchmark, particularly:
- Point select operations
- Random point queries
- Random range queries
- Write-only operations
- Delete operations
- Mixed read/write workloads
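The remaining four scenarios (oltp_read_only, oltp_insert, oltp_update_non_index, and oltp_update_index) came in below the original figures, and the gains for random range queries and write-only operations are small. Taken together, these results suggest that default configuration differences and parallel execution settings have a substantial impact on benchmark outcomes. For typical business applications, proper configuration tuning can yield meaningful performance improvements.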
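The per-scenario differences between the two tables can be checked with a short script. This sketch copies the TPS values from the tables above and prints the relative change of the internal result against the original one; the helper name `pct_change` is made up for this example.

```shell
#!/bin/sh
# Percentage change of the internal TPS relative to the original TPS.
pct_change() {
  # $1 = original TPS, $2 = internal TPS
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%+.1f\n", (b - a) / a * 100 }'
}

# Columns: scenario, original TPS, internal TPS (copied from the tables).
while read -r name orig internal; do
  printf '%-24s %s%%\n' "$name" "$(pct_change "$orig" "$internal")"
done <<'EOF'
oltp_point_select 96251 107679.39
oltp_read_only 6164 5512.35
select_random_points 64906 77225
select_random_ranges 34030 34232
oltp_insert 66850 33988.50
oltp_write_only 12138 12476.46
oltp_delete 53598 91002.39
oltp_update_non_index 58508 44770.62
oltp_update_index 55031 39473.81
oltp_read_write 2689 3183.41
EOF
```

With these figures, oltp_delete rises by roughly 70% while oltp_insert drops to about half of the original value, which shows how widely individual scenarios can diverge between two environments.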