在非生产服务器测试工具 Test the tool on a non-production server
备份你的生产环节并且分析核查备份 Backup your production server and verify the backups
描述:DESCRIPTION
This tool connects to a MySQL database server, reads through a query log, and uses EXPLAIN to ask MySQL how it will use each query. When it is finished, it
prints out a report on indexes that the queries didn’t use.
The query log needs to be in MySQL’s slow query log format. If you need to input a different
format, you can use pt-query-digest to translate the formats. If you don’t specify a filename, the tool reads from STDIN.
The
tool runs two stages. In the first stage, the tool takes inventory of all the tables and indexes in your database, so it can compare the existing indexes to those that were actually used by the queries in the log. In the second stage, it runs EXPLAIN on each
query in the query log. It uses separate database connections to inventory the tables and run EXPLAIN, so it opens two connections to the database.
工具运行2个阶段,在第一个阶段,工具缓存库里面所有的表以及索引,因此它能已存在的索引与在日志里面真正被使用的查询进行比对。第二个阶段,他对查询日志的每一条查询执行EXPLAIN命令,它使用隔离的库连接去存储表并且执行explain,因此它开启了2个数据库连接。
If a query is not a SELECT, it tries to transform it to a roughly equivalent SELECT query
so it can be EXPLAINed. This is not a perfect process, but it is good enough to be useful.
The tool skips the
EXPLAIN step for queries that are exact duplicates of those seen before. It assumes that the same query will generate the same EXPLAIN plan as it did previously (usually a safe assumption, and generally good for performance), and simply increments the count
of times that the indexes were used. However, queries that have the same fingerprint but different checksums will be re-EXPLAINed. Queries that have different literal constants can have different execution plans, and this is important to measure.
After EXPLAIN-ing the query, it is necessary
to try to map aliases in the query back to the original table names. For example, consider the EXPLAIN plan for the following query:
The
EXPLAIN output will show access to tablefoo, and that must be translated back
totbl1. This process involves complex parsing. It is generally very accurate,
but there is some chance that it might not work right. If you find cases where it fails, submit a bug report and a reproducible test case.
Queries that cannot be EXPLAINed will cause all subsequent queries with the same fingerprint to be blacklisted. This is to reduce the work they cause, and prevent
them from continuing to print error messages. However, at least in this stage of the tool’s development, it is my opinion that it’s not a good idea to preemptively silence these, or prevent them from being EXPLAINed at all. I am looking for lots of feedback
on how to improve things like the query parsing. So please submit your test cases based on the errors the tool prints!
After it reads all the events in the log, the tool prints out DROP statements for every index that was not used. It skips indexes for tables that were never accessed by any queries in the log, to avoid false-positive results.