Introduction
pt-query-digest is a powerful tool from the Percona Toolkit designed to analyze MySQL slow queries. It can process various input sources such as the slow query log, the binary log (binlog), the general log, output from SHOW PROCESSLIST, or even TCP traffic captured via tcpdump. The tool works by parameterizing query conditions, grouping them by fingerprint, and generating aggregated statistics—including execution time, frequency, and contribution percentages. These insights help DBAs and developers pinpoint performance bottlenecks and optimize problematic queries.
Installation of pt-query-digest
Prerequisites
Ensure that Perl modules CPAN and Time::HiRes are installed:
yum install -y perl-CPAN perl-Time-HiRes
Method 1: RPM Installation (CentOS/RHEL)
cd /usr/local/src
wget https://www.percona.com/downloads/percona-toolkit/3.3.1/binary/redhat/7/x86_64/percona-toolkit-3.3.1-1.el7.x86_64.rpm -O percona-toolkit.rpm
yum install -y percona-toolkit.rpm
# Tools are placed in /usr/bin
Method 2: Source Installation
cd /usr/local/src
wget https://www.percona.com/downloads/percona-toolkit/3.3.1/tarball/percona-toolkit-3.3.1.tar.gz -O percona-toolkit.tar.gz
tar zxf percona-toolkit.tar.gz
cd percona-toolkit-3.3.1
perl Makefile.PL PREFIX=/usr/local/percona-toolkit
make && make install
# Tools are placed in /usr/local/percona-toolkit/bin
Additional Useful Tools from the Suite
- pt-summary – System summary
- pt-diskstats – Disk I/O monitoring
- pt-mysql-summary – MySQL server status overview
Syntax and Key Options
pt-query-digest [OPTIONS] [FILES] [DSN]
Important options:
--create-review-table– Automatically create the review table when using--review.--create-history-table– Automatically create the history table when using--history.--filter– Apply a Perl expression to filter queries before analysis.--limit– Restrict output to a percentage or number of queries (default: 20).--host,--user,--password– MySQL connection credentials.--history– Save detailed analysis to a table; subsequent runs compare against historical data.--review– Save simple parameterized results to a table; avoids duplicates.--output– Output format:report(default, human-readable),slowlog,json,json-anon.--since– Start time for analysis (e.g.,12h,2017-01-07 09:30:00).--until– End time.
Understanding the Output
1. Overall Statistics
This section provides a high-level summary:
| Attribute | Description |
|---|---|
| Overall | Total number of queries |
| Time range | Time window of the log |
| unique | Number of unique parameterized queries |
| total, min, max, avg | Aggregate statistics to each metric |
| 95% | 95th percentile value |
| median | Median value |
Example output:
# 340ms user time, 140ms system time, 23.99M rss, 203.11M vsz
# Current date: Fri Nov 25 02:37:18 2016
# Hostname: myserver.local
# Files: slow.log
# Overall: 2 total, 2 unique, 0.01 QPS, 0.01x concurrency
# Time range: 2016-11-22 06:06:18 to 06:11:40
# Attribute total min max avg 95% stddev median
# ============ ======= ======= ======= ======= ======= ======= =======
# Exec time 3s 640ms 2s 1s 2s 999ms 1s
# Lock time 1ms 0 1ms 723us 1ms 1ms 723us
# Rows sent 5 1 4 2.50 4 2.12 2.50
# Rows examine 186.17k 0 186.17k 93.09k 186.17k 131.64k 93.09k
# Query size 455 15 440 227.50 440 300.52 227.50
2. Query Profile (Grouped Results)
Each distinct query fingerprint is ranked by total response time:
# Profile
# Rank Query ID Response time Calls R/Call V/M Item
# ==== ================== ============= ===== ====== ===== ===============
# 1 0xF9A57DD5A41825CA 2.0529 76.2% 1 2.0529 0.00 SELECT
# 2 0x4194D8F83F4F9365 0.6401 23.8% 1 0.6401 0.00 SELECT wx_member_base
Columns: Rank, Query ID (hash), Response time (total and percentage), Calls (execution count), R/Call (average response per call), V/M (variance-to-mean ratio), Item (query pattern).
3. Detailed Per-Query Report
For each query ID, a detailed breakdown includes:
- Count, execution time distribution, lock time, rows sent/examined, query size
- Databases, hosts, users involved
- Query time histogram (bars representing time ranges)
EXPLAINoutput if available
Example:
# Query 1: 0 QPS, 0x concurrency, ID 0xF9A57DD5A41825CA at byte 802
# This item is included in the report because it matches --limit.
# Scores: V/M = 0.00
# Time range: all events occurred at 2016-11-22 06:11:40
# Attribute pct total min max avg 95% stddev median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count 50 1
# Exec time 76 2s 2s 2s 2s 2s 0 2s
# Lock time 0 0 0 0 0 0 0 0
# Rows sent 20 1 1 1 1 1 0 1
# Rows examine 0 0 0 0 0 0 0 0
# Query size 3 15 15 15 15 15 0 15
# String:
# Databases test
# Hosts 192.168.8.1
# Users mysql
# Query_time distribution
# 1us
# 10us
# 100us
# 1ms
# 10ms
# 100ms
# 1s ################################################################
# 10s+
# EXPLAIN /*!50100 PARTITIONS*/
select sleep(2)
Parctical Usage Examples
1. Basic Slow Log Analysis
pt-query-digest /var/log/mysql/slow.log > report.txt
2. Analyze Queries from the Last 12 Hours
pt-query-digest --since=12h /var/log/mysql/slow.log > report_12h.log
3. Filter by Custom Time Window
pt-query-digest slow.log --since '2023-01-01 10:00:00' --until '2023-01-01 12:00:00' > window_report.log
4. Focus Only on SELECT Statements
pt-query-digest --filter '$event->{fingerprint} =~ m/^select/i' slow.log > select_only.log
5. Filter for a Specific User
pt-query-digest --filter '($event->{user} || "") =~ m/^appuser/' slow.log > user_report.log
6. Detect Full Table Scans or Full Joins
pt-query-digest --filter '(($event->{Full_scan} || "") eq "yes") || (($event->{Full_join} || "") eq "yes")' slow.log > full_scan_report.log
7. Save Results to a Review Table
pt-query-digest --user=root --password=secret --review h=localhost,D=mydb,t=query_review --create-review-table slow.log
8. Save Results to a History Table (Multiple Files)
pt-query-digest --user=root --password=secret --review h=localhost,D=mydb,t=query_history --create-review-table slow.log_0001
pt-query-digest --user=root --password=secret --review h=localhost,D=mydb,t=query_history --create-review-table slow.log_0002
9. Capture and Analyze TCP Traffic
tcpdump -s 65535 -x -nn -q -tttt -i eth0 -c 5000 port 3306 > mysql_traffic.txt
pt-query-digest --type tcpdump mysql_traffic.txt > tcp_analysis.log
10. Analyze Binary Logs
mysqlbinlog mysql-bin.000001 > binlog.sql
pt-query-digest --type=binlog binlog.sql > binlog_analysis.log
11. Analyze General Log
pt-query-digest --type=genlog /var/log/mysql/general.log > general_report.log