Understanding Greenplum EXPLAIN ANALYZE Output and Detailed Analysis Parameters
Greenplum's EXPLAIN ANALYZE provides valuable insights into query execution performance. This article explores how to interpret EXPLAIN ANALYZE output and introduces two parameters that offer more detailed information about query execution.
Background
When analyzing Greenplum's EXPLAIN ANALYZE output, it's important to understand that without detailed parameters enabled, the statistics following "rows out" actually display data from the segment returning the most rows, not necessarily the slowest segment. This can lead to situations where some offsets appear smaller than others despite potential performance issues.
The start offset represents the time from when the Query Dispatcher (QD) initiates the query until the first valid tuple is returned for that node. This metric helps identify several potential issues:
- The time difference between when the QD dispatches the execution plan to segments and when segment worker processes are initiated, reflected in the start offset.
- Large offsets may indicate slow worker process initialization or delayed processing of the first valid record.
- Data skew might cause delays in retrieving the first valid record if extensive filtering is required.
Understanding Greenplum Query Plan Terminology
Greenplum query plans use specific terminology that's important to understand:
- Tree Structure: Query plans are read and executed from bottom to top.
- Node: Each node represents a database operation such as table scan, join, aggregation, or sort.
- Slice: To maximize parallelism, Greenplum divides query plans into slices. A slice is a portion of the plan that segments can work on independently. Plans are sliced where motion operations occur.
- Motion: This operation moves tuples between segments during query processing.
- Redistribute Motion: Moves tuples between segments to complete joins when data distribution keys differ.
- Gather Motion: When segments send results back to the master for presentation to the client.
- Query Dispatcher (QD): The master's query worker process responsible for creating and dispatching the query plan and presenting final results.
- Query Executor (QE): Segment worker processes that complete their portion of the query and communicate intermediate results.
- Worker Process: At least one process assigned to each slice of the query plan.
- Gangs: Related processes working on the same slice but different segments.
- Interconnect: The communication component between segments.
Detailed Analysis Parameters
Greenplum provides two parameters that add additional information to EXPLAIN ANALYZE output:
gp_enable_explain_allstat
This parameter (true/false) prints execution time details for each segment, helping identify skew, excessive processing time, or memory consumption. When enabled, it adds lines showing individual segment performance:
allstat: seg_firststart_total_ntuples/seg0_19 ms_1.028 ms_1334/seg1_19 ms_2.283 ms_3333//end
The format is: segment number_time to start in ms_time to complete in ms_number of tuples. This helps identify segments with unusual behavior.
explain_memory_verbosity
This parameter (SUPPRESS, SUMMARY, DETAIL) controls memory usage details in the output. It uses Greenplum's memory accounting framework introduced in GPDB 4.3.2.0:
- SUPPRESS: Default, no additional memory information.
- SUMMARY: Adds one row per node showing average and maximum memory usage.
- DETAIL: Provides comprehensive memory usage breakdown per operator.
Parameter Usage Example
When both parameters are enabled, the EXPLAIN ANALYZE output becomes significantly more detailed:
SET explain_memory_verbosity = detail;
SET gp_enable_explain_allstat = on;
EXPLAIN ANALYZE SELECT COUNT(*),c1 FROM a GROUP BY c1 LIMIT 1;
The enhanced output includes:
- Memory usage breakdown for each operator node
- Segment-specific performance metrics
- Peak memory usage statistics
- Detailed timing information for each segment
This detailed output helps identify performance bottlenecks, data skew, and memory-intensive operations that might not be apparent in standard EXPLAIN ANALYZE output.
Code Implementation Details
The underlying implemantation for displaying execution statistics can be found in the cdbexplain_showExecStats function. This function handles different plan node types and formats the output accordingly:
switch (planstate->type)
{
case T_BitmapAndState:
case T_BitmapOrState:
case T_BitmapIndexScanState:
s_row = "";
if (ns->ntuples.vcnt > 1)
appendStringInfo(str,
"Bitmaps out: Avg %.1f x %d workers. Max %.0f%s",
ntuples_avg,
ns->ntuples.vcnt,
ns->ntuples.vmax,
segbuf);
else
appendStringInfo(str,
"Bitmaps out: %s%.0f%s",
noRowRequested,
ns->ntuples.vmax,
segbuf);
break;
case T_HashState:
if (ns->ntuples.vcnt > 1)
appendStringInfo(str,
"Rows in: Avg %.1f rows x %d workers. Max %.0f rows%s",
ntuples_avg,
ns->ntuples.vcnt,
ns->ntuples.vmax,
segbuf);
else
appendStringInfo(str,
"Rows in: %s%.0f rows%s",
noRowRequested,
ns->ntuples.vmax,
segbuf);
break;
case T_MotionState:
if (ns->ntuples.vcnt > 1)
appendStringInfo(str,
"Rows out: Avg %.1f rows x %d workers at destination. Max %.0f rows%s",
ntuples_avg,
ns->ntuples.vcnt,
ns->ntuples.vmax,
segbuf);
else
appendStringInfo(str,
"Rows out: %s%.0f rows at destination%s",
noRowRequested,
ns->ntuples.vmax,
segbuf);
break;
default:
if (ns->ntuples.vcnt > 1)
appendStringInfo(str,
"Rows out: Avg %.1f rows x %d workers. Max %.0f rows%s",
ntuples_avg,
ns->ntuples.vcnt,
ns->ntuples.vmax,
segbuf);
else
appendStringInfo(str,
"Rows out: %s%.0f rows%s",
noRowRequested,
ns->ntuples.vmax,
segbuf);
}
/* Show elapsed time once if they are the same or if we don't have any valid elapsed time for first tuple */
if ((instr->ntuples > 0) && (strcmp(firstbuf, totalbuf) != 0))
appendStringInfo(str,
" with %s to first%s, %s to end",
firstbuf,
s_row,
totalbuf);
else
appendStringInfo(str,
" with %s to end",
totalbuf);
/* Number of rescans */
if (instr->nloops > 1)
appendStringInfo(str, " of %.0f scans", instr->nloops);
/* Time from start of query on qDisp to this worker's first result row */
if (!(INSTR_TIME_IS_ZERO(instr->firststart)))
{
INSTR_TIME_SET_ZERO(timediff);
INSTR_TIME_ACCUM_DIFF(timediff, instr->firststart, ctx->querystarttime);
cdbexplain_formatSeconds(startbuf, sizeof(startbuf), INSTR_TIME_GET_DOUBLE(timediff));
appendStringInfo(str, ", start offset by %s", startbuf);
}
This implementation handles various node types and formats the output to include execution times, row counts, and offset information, providing a comprehensive view of query execution performance.