MySQL分表优化试验 - MySQL运维 - 运维网 - Powered by Discuz! Archiver

论坛 › MySQL运维 › MySQL分表优化试验

漂亮蓝影 发表于 2018-9-27 06:03:50

MySQL分表优化试验

我们的项目中有好多不等于的情况。今天写这篇文章简单的分析一下怎么个优化法。　　这里的分表逻辑是根据t_group表的user_name组的个数来分的。
　　因为这种情况单独user_name字段上的索引就属于烂索引。起不了啥名明显的效果。
　　1、试验PROCEDURE.
　　DELIMITER $$
　　DROP PROCEDURE `t_girl`.`sp_split_table`$$
　　CREATEPROCEDURE `t_girl`.`sp_split_table`()
　　BEGIN
　　declare done int default 0;
　　declare v_user_name varchar(20) default '';
　　declare v_table_name varchar(64) default '';
　　-- Get all users' name.
　　declare cur1 cursor for select user_name from t_group group by user_name;
　　-- Deal with error or warnings.
　　declare continue handler for 1329 set done = 1;
　　-- Open cursor.
　　open cur1;
　　while done1
　　do
　　fetch cur1 into v_user_name;
　　if not done then
　　-- Get table name.
　　set v_table_name = concat('t_group_',v_user_name);
　　-- Create new extra table.
　　set @stmt = concat('create table ',v_table_name,' like t_group');
　　prepare s1 from @stmt;
　　execute s1;
　　drop prepare s1;
　　-- Load data into it.
　　set @stmt = concat('insert into ',v_table_name,' select * from t_group where user_name = ''',v_user_name,'''');
　　prepare s1 from @stmt;
　　execute s1;
　　drop prepare s1;
　　end if;
　　end while;
　　-- Close cursor.
　　close cur1;
　　-- Free variable from memory.
　　set @stmt = NULL;
　　END$$
　　DELIMITER ;
　　2、试验表。
　　我们用一个有一千万条记录的表来做测试。
　　mysql> select count(*) from t_group;
　　+----------+
　　| count(*) |
　　+----------+
　　| 10388608 |
　　+----------+
　　1 row in set (0.00 sec)
　　表结构。
　　mysql> desc t_group;
　　+-------------+------------------+------+-----+-------------------+----------------+
　　| Field    | Type          | Null | Key | Default       | Extra       |
　　+-------------+------------------+------+-----+-------------------+----------------+

　　|>　　| money    | decimal(10,2) | NO | |                |             |
　　| user_name | varchar(20)    | NO | MUL |                |             |
　　| create_time | timestamp    | NO | | CURRENT_TIMESTAMP |             |
　　+-------------+------------------+------+-----+-------------------+----------------+
　　4 rows in set (0.00 sec)
　　索引情况。
　　mysql> show index from t_group;
　　+---------+------------+------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
　　|Table | Non_unique | Key_name       | Seq_in_index | Column_name |Collation | Cardinality | Sub_part | Packed | Null | Index_type |Comment |
　　+---------+------------+------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

　　|t_group |       0 | PRIMARY       |          1 |>
　　| t_group |       1 |>
　　| t_group |       1 |>　　| t_group |       1 |idx_combination1 |          2 | money    | A       |    3776| NULL | NULL |    | BTREE    |       |
　　+---------+------------+------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
　　4 rows in set (0.00 sec)
　　PS:
　　idx_combination1 这个索引是必须的，因为要对user_name来GROUP BY。此时属于松散索引扫描！当然完了后你可以干掉她。
　　idx_user_name 这个索引是为了加快单独执行constant这种类型的查询。
　　我们要根据用户名来分表。
　　mysql> select user_name from t_group where 1 group by user_name;
　　+-----------+
　　| user_name |
　　+-----------+
　　| david |
　　| leo    |
　　| livia |
　　| lucy    |
　　| sarah |
　　| simon |
　　| sony    |
　　| sunny |
　　+-----------+
　　8 rows in set (0.00 sec)
　　所以结果表应该是这样的。
　　mysql> show tables like 't_group_%';
　　+------------------------------+
　　| Tables_in_t_girl (t_group_%) |
　　+------------------------------+
　　| t_group_david             |
　　| t_group_leo                |
　　| t_group_livia             |
　　| t_group_lucy             |
　　| t_group_sarah             |
　　| t_group_simon             |
　　| t_group_sony             |
　　| t_group_sunny             |
　　+------------------------------+
　　8 rows in set (0.00 sec)
　　3、对比结果。
　　mysql> select count(*) from t_group where user_name = 'david';
　　+----------+
　　| count(*) |
　　+----------+
　　|1298576 |
　　+----------+
　　1 row in set (1.71 sec)
　　执行了将近2秒。
　　mysql> select count(*) from t_group_david;
　　+----------+
　　| count(*) |
　　+----------+
　　|1298576 |
　　+----------+
　　1 row in set (0.00 sec)
　　几乎是瞬间的。
　　mysql> select count(*) from t_group where user_name'david';
　　+----------+
　　| count(*) |
　　+----------+
　　|9090032 |
　　+----------+
　　1 row in set (9.26 sec)
　　执行了将近10秒，可以想象，这个是实际的项目中是不能忍受的。
　　mysql> select (select count(*) from t_group) - (select count(*) from t_group_david) as total;
　　+---------+
　　| total |
　　+---------+
　　| 9090032 |
　　+---------+
　　1 row in set (0.00 sec)
　　几乎是瞬间的。
　　我们来看看聚集函数。
　　对于原表的操作。
　　mysql> select min(money),max(money) from t_group where user_name = 'david';
　　+------------+------------+
　　| min(money) | max(money) |
　　+------------+------------+
　　|    -6.41 | 500.59 |
　　+------------+------------+
　　1 row in set (0.00 sec)
　　最小，最大值都是FULL INDEX SCAN。所以是瞬间的。
　　mysql> select sum(money),avg(money) from t_group where user_name = 'david';
　　+--------------+------------+
　　| sum(money) | avg(money) |
　　+--------------+------------+
　　| 319992383.84 | 246.417910 |
　　+--------------+------------+
　　1 row in set (2.15 sec)
　　其他聚集函数的结果就不是FULL INDEX SCAN了。耗时2.15秒。
　　对于小表的操作。
　　mysql> select min(money),max(money) from t_group_david;
　　+------------+------------+
　　| min(money) | max(money) |
　　+------------+------------+
　　|    -6.41 | 500.59 |
　　+------------+------------+
　　1 row in set (1.50 sec)
　　最大最小值完全是FULL TABLE SCAN,耗时1.50秒，不划算。以此看来。
　　mysql> select sum(money),avg(money) from t_group_david;
　　+--------------+------------+
　　| sum(money) | avg(money) |
　　+--------------+------------+
　　| 319992383.84 | 246.417910 |
　　+--------------+------------+
　　1 row in set (1.68 sec)
　　取得这两个结果也是花了快2秒，快了一点。
　　我们来看看这个小表的结构。
　　mysql> desc t_group_david;
　　+-------------+------------------+------+-----+-------------------+----------------+
　　| Field    | Type          | Null | Key | Default       | Extra       |
　　+-------------+------------------+------+-----+-------------------+----------------+

　　|>　　| money    | decimal(10,2) | NO | |                |             |
　　| user_name | varchar(20)    | NO | MUL |                |             |
　　| create_time | timestamp    | NO | | CURRENT_TIMESTAMP |             |
　　+-------------+------------------+------+-----+-------------------+----------------+
　　4 rows in set (0.00 sec)
　　明显的user_name属性是多余的。那么就干掉它。

　　mysql>>　　Query OK, 1298576 rows affected (7.58 sec)
　　Records: 1298576Duplicates: 0Warnings: 0
　　现在来重新对小表运行查询
　　mysql> select min(money),max(money) from t_group_david;
　　+------------+------------+
　　| min(money) | max(money) |
　　+------------+------------+
　　|    -6.41 | 500.59 |
　　+------------+------------+
　　1 row in set (0.00 sec)
　　此时是瞬间的。
　　mysql> select sum(money),avg(money) from t_group_david;
　　+--------------+------------+
　　| sum(money) | avg(money) |
　　+--------------+------------+
　　| 319992383.84 | 246.417910 |
　　+--------------+------------+
　　1 row in set (0.94 sec)
　　这次算是控制在一秒以内了。
　　mysql> Aborted
　　小总结一下：分出的小表的属性尽量越少越好。大胆的去干吧。

页: [1]

查看完整版本: MySQL分表优化试验