msyql sql优化一列

websyhg 发表于 2018-10-21 11:00:01

　　一个在SSD硬盘64核CPU服务器上跑出来的慢sql，执行时间在2s左右，写出来做个记录：
　　select
　　a.sub_id subid,
　　a.sub_user_type subtype,
　　a.longitude lng,
　　a.latitude lat,
　　a.sub_content subcontent,
　　case
　　when b.is_top = 1 and b.top_start_time= now() then 1
　　else 0
　　end istop,
　　a.perfect_sign isperf,
　　a.vote_sign isvote,
　　a.publisher,
　　UNIX_TIMESTAMP(a.sub_time) subtime,
　　a.show_persons vnum,
　　(select count(1) from app_find_praise where object_id = a.sub_id and object_class =0) znum,
　　(select count(1) from app_find_comment where object_id = a.sub_id and object_class= 0 and comment_status = 0) rnum,
　　if((select count(1) from app_find_praise where object_id = a.sub_id and object_class = 0 and user_id = 'd7b4e485d9fe11e5a9c95fa622b527a1') > 0, 1, 0) iszan,
　　if((select count(1) from app_find_mycollect where object_id = a.sub_id and    object_type = 0 and user_id = 'd7b4e485d9fe11e5a9c95fa622b527a1') > 0, 1, 0) iscoll,
　　if((select count(1) from app_find_subject_hot where sub_id = a.sub_id) > 0, 1, 0) isHot
　　from app_find_subject a
　　inner join app_find_subject_hot b
　　on b.sub_id = a.sub_id
　　where (a.sub_status = 0 or (a.sub_status in (0, 3) and a.publisher='d7b4e485d9fe11e5a9c95fa622b527a1'))
　　and a.is_public = 1
　　order by b.is_top = 1 and b.top_start_time = now() desc,
　　a.sub_time desc
　　limit 10
　　第一眼看有limit 10做限制，应该不会执行太慢，查看执行计划等逐一排查导致慢的原因在于order by的使用，这个order by的写法在于按扫描顺序对满足条件的进行排序，其余的都是乱序的（PS:我也是第一次见这种写法），结果和开发沟通是否业务上需求的能不能改掉这order by，答案是不可以改...........
　　首先说下为什么使用limit效率会相当慢，首先sql语句的执行顺序为查询数据->where条件->order/group by->limit ，这样的一个顺序limit要等数据查询完成在order by时才能起到优化作用，而前面又有几个嵌套子查询和join,不需要的数据也执行了嵌套查询，这导致很多执行时间是对最终结果无用的
　　优化思路：观察order by使用的是a、b两个表的字段，经过查询结构数据，发现a、b两个表连接条件都为主键，这可以确定数据唯一，不会在嵌套时产生一对多的情况，那就利用延迟关联的方式先把满足order by条件的10条数据先取出来，再利用这10条数据进行对其他数据表的关联查找
　　下面是修改过后的sql:
　　select a.sub_id subid,
　　a.sub_user_type subtype,
　　a.longitude lng,
　　a.latitude lat,
　　a.sub_content subcontent,
　　b.istop,
　　a.perfect_sign isperf,
　　a.vote_sign isvote,
　　a.publisher,
　　UNIX_TIMESTAMP(a.sub_time) subtime,
　　a.show_persons vnum,
　　(select count(1) from app_find_praise where object_id = a.sub_id and object_class =0) znum,
　　(select count(1) from app_find_comment where object_id = a.sub_id and object_class = 0 and comment_status = 0) rnum,
　　if((select count(1) from app_find_praise where object_id = a.sub_id and object_class = 0 and user_id = 'd7b4e485d9fe11e5a9c95fa622b527a1') > 0, 1, 0) iszan,
　　if((select count(1) from app_find_mycollect where object_id = a.sub_id and object_type = 0 and user_id = 'd7b4e485d9fe11e5a9c95fa622b527a1') > 0, 1, 0) iscoll,
　　if((select count(1) from app_find_subject_hot where sub_id = a.sub_id) > 0, 1, 0) isHot
　　from
　　(select bb.sub_id as sub_id,case
　　when bb.is_top = 1 and bb.top_start_time = now() then 1
　　else 0
　　end istop from app_find_subject_hot bb join app_find_subject aa on aa.sub_id = bb.sub_id
　　order bybb.is_top = 1 and bb.top_start_time = now() desc,
　　aa.sub_time desc limit 10) b
　　join app_find_subject a on (a.sub_id = b.sub_id);
　　现在来进行效率对比（采用profiles查看分别都执行两次的时间）：
　　优化前：
　　|    1 | 2.29620150 |
　　|    2 | 2.33743150 |
　　优化后：
　　|    3 | 0.08940750 |
　　|    4 | 0.09856800 |
　　经过执行时间来看优化提升是很多的，该优化增加了一个子查询作为整个嵌套循环的驱动表，删除了a表查询的条件语句，通过业务结构判断从b表查询出来的10条数据基本不会不满足a表这些条件，看到这可能有人已经发现对a表做了两次嵌套，会觉得这里写成一次不会更好么，下面看改为一次查询效率：
　　select
　　a.sub_id subid,
　　a.sub_user_type subtype,
　　a.longitude lng,
　　a.latitude lat,
　　a.sub_content subcontent,
　　case
　　when b.is_top = 1 and b.top_start_time= now() then 1
　　else 0
　　end istop,
　　a.perfect_sign isperf,
　　a.vote_sign isvote,
　　a.publisher,
　　UNIX_TIMESTAMP(a.sub_time) subtime,
　　a.show_persons vnum
　　from
　　app_find_subject_hot b
　　JOIN app_find_subject a
　　on (b.sub_id = a.sub_id)
　　order by b.is_top = 1 and b.top_start_time = now() desc,a.sub_time DESC
　　limit 10;
　　改成这样把a表所需字段数据一起查询出来，看看执行效果：
　　|    1 | 0.24470025 |
　　|    2 | 0.20582050 |
　　这还是没有对其他子查询进行查找的情况下，所需时间比多一次嵌套的优化执行时间更长，这是因为这样对a表字段进行查找时，在limit之前需要返回足够多数据，而每次对a表进行嵌套时都进行了数据扫描，如果不查询其他字段只需要sub_id，a表就只进行了主键索引查找，不会去对数据进行扫描，对CPU时间消耗更低，再次嵌套时就只对表数据进行了10次扫描，当然开销更小，这告诉我们sql优化是门复杂课程，需要结合具体场景进行，书上或者网上别人说的知识都是死的，怎样的方式更优需结合具体的场景

页: [1]

运维网's Archiver

msyql sql优化一列