Deep Dive into MySQL Query Analysis with pt-query-digest

Introduction

pt-query-digest is a powerful tool from the Percona Toolkit designed to analyze MySQL slow queries. It can process various input sources such as the slow query log, the binary log (binlog), the general log, output from SHOW PROCESSLIST, or even TCP traffic captured via tcpdump. The tool works by parameterizing query conditions, grouping them by fingerprint, and generating aggregated statistics—including execution time, frequency, and contribution percentages. These insights help DBAs and developers pinpoint performance bottlenecks and optimize problematic queries.

Installation of pt-query-digest

Prerequisites

Ensure that Perl modules CPAN and Time::HiRes are installed:

yum install -y perl-CPAN perl-Time-HiRes

Method 1: RPM Installation (CentOS/RHEL)

cd /usr/local/src
wget https://www.percona.com/downloads/percona-toolkit/3.3.1/binary/redhat/7/x86_64/percona-toolkit-3.3.1-1.el7.x86_64.rpm -O percona-toolkit.rpm
yum install -y percona-toolkit.rpm
# Tools are placed in /usr/bin

Method 2: Source Installation

cd /usr/local/src
wget https://www.percona.com/downloads/percona-toolkit/3.3.1/tarball/percona-toolkit-3.3.1.tar.gz -O percona-toolkit.tar.gz
tar zxf percona-toolkit.tar.gz
cd percona-toolkit-3.3.1
perl Makefile.PL PREFIX=/usr/local/percona-toolkit
make && make install
# Tools are placed in /usr/local/percona-toolkit/bin

Additional Useful Tools from the Suite

  • pt-summary – System summary
  • pt-diskstats – Disk I/O monitoring
  • pt-mysql-summary – MySQL server status overview

Syntax and Key Options

pt-query-digest [OPTIONS] [FILES] [DSN]

Important options:

  • --create-review-table – Automatically create the review table when using --review.
  • --create-history-table – Automatically create the history table when using --history.
  • --filter – Apply a Perl expression to filter queries before analysis.
  • --limit – Restrict output to a percentage or number of queries (default: 20).
  • --host, --user, --password – MySQL connection credentials.
  • --history – Save detailed analysis to a table; subsequent runs compare against historical data.
  • --review – Save simple parameterized results to a table; avoids duplicates.
  • --output – Output format: report (default, human-readable), slowlog, json, json-anon.
  • --since – Start time for analysis (e.g., 12h, 2017-01-07 09:30:00).
  • --until – End time.

Understanding the Output

1. Overall Statistics

This section provides a high-level summary:

Attribute Description
Overall Total number of queries
Time range Time window of the log
unique Number of unique parameterized queries
total, min, max, avg Aggregate statistics to each metric
95% 95th percentile value
median Median value

Example output:

# 340ms user time, 140ms system time, 23.99M rss, 203.11M vsz
# Current date: Fri Nov 25 02:37:18 2016
# Hostname: myserver.local
# Files: slow.log
# Overall: 2 total, 2 unique, 0.01 QPS, 0.01x concurrency
# Time range: 2016-11-22 06:06:18 to 06:11:40
# Attribute         total     min     max     avg     95%  stddev  median
# ============   ======= ======= ======= ======= ======= ======= =======
# Exec time           3s   640ms      2s      1s      2s   999ms      1s
# Lock time          1ms       0     1ms   723us     1ms     1ms   723us
# Rows sent            5       1       4    2.50       4    2.12    2.50
# Rows examine   186.17k       0 186.17k  93.09k 186.17k 131.64k  93.09k
# Query size         455      15     440  227.50     440  300.52  227.50

2. Query Profile (Grouped Results)

Each distinct query fingerprint is ranked by total response time:

# Profile
# Rank Query ID           Response time Calls R/Call V/M   Item
# ==== ================== ============= ===== ====== ===== ===============
#    1 0xF9A57DD5A41825CA  2.0529 76.2%     1 2.0529  0.00 SELECT
#    2 0x4194D8F83F4F9365  0.6401 23.8%     1 0.6401  0.00 SELECT wx_member_base

Columns: Rank, Query ID (hash), Response time (total and percentage), Calls (execution count), R/Call (average response per call), V/M (variance-to-mean ratio), Item (query pattern).

3. Detailed Per-Query Report

For each query ID, a detailed breakdown includes:

  • Count, execution time distribution, lock time, rows sent/examined, query size
  • Databases, hosts, users involved
  • Query time histogram (bars representing time ranges)
  • EXPLAIN output if available

Example:

# Query 1: 0 QPS, 0x concurrency, ID 0xF9A57DD5A41825CA at byte 802
# This item is included in the report because it matches --limit.
# Scores: V/M = 0.00
# Time range: all events occurred at 2016-11-22 06:11:40
# Attribute    pct   total     min     max     avg     95%  stddev  median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count         50       1
# Exec time     76      2s      2s      2s      2s      2s       0      2s
# Lock time      0       0       0       0       0       0       0       0
# Rows sent     20       1       1       1       1       1       0       1
# Rows examine   0       0       0       0       0       0       0       0
# Query size     3      15      15      15      15      15       0      15
# String:
# Databases    test
# Hosts        192.168.8.1
# Users        mysql
# Query_time distribution
#   1us
#  10us
# 100us
#   1ms
#  10ms
# 100ms
#    1s  ################################################################
#  10s+
# EXPLAIN /*!50100 PARTITIONS*/
select sleep(2)

Parctical Usage Examples

1. Basic Slow Log Analysis

pt-query-digest /var/log/mysql/slow.log > report.txt

2. Analyze Queries from the Last 12 Hours

pt-query-digest --since=12h /var/log/mysql/slow.log > report_12h.log

3. Filter by Custom Time Window

pt-query-digest slow.log --since '2023-01-01 10:00:00' --until '2023-01-01 12:00:00' > window_report.log

4. Focus Only on SELECT Statements

pt-query-digest --filter '$event->{fingerprint} =~ m/^select/i' slow.log > select_only.log

5. Filter for a Specific User

pt-query-digest --filter '($event->{user} || "") =~ m/^appuser/' slow.log > user_report.log

6. Detect Full Table Scans or Full Joins

pt-query-digest --filter '(($event->{Full_scan} || "") eq "yes") || (($event->{Full_join} || "") eq "yes")' slow.log > full_scan_report.log

7. Save Results to a Review Table

pt-query-digest --user=root --password=secret --review h=localhost,D=mydb,t=query_review --create-review-table slow.log

8. Save Results to a History Table (Multiple Files)

pt-query-digest --user=root --password=secret --review h=localhost,D=mydb,t=query_history --create-review-table slow.log_0001
pt-query-digest --user=root --password=secret --review h=localhost,D=mydb,t=query_history --create-review-table slow.log_0002

9. Capture and Analyze TCP Traffic

tcpdump -s 65535 -x -nn -q -tttt -i eth0 -c 5000 port 3306 > mysql_traffic.txt
pt-query-digest --type tcpdump mysql_traffic.txt > tcp_analysis.log

10. Analyze Binary Logs

mysqlbinlog mysql-bin.000001 > binlog.sql
pt-query-digest --type=binlog binlog.sql > binlog_analysis.log

11. Analyze General Log

pt-query-digest --type=genlog /var/log/mysql/general.log > general_report.log

Tags: pt-query-digest MySQL query analysis Performance Tuning percona-toolkit

Posted on Wed, 20 May 2026 21:03:56 +0000 by Pig