|
如何创建高性能索引
创建测试表:
drop table t_index ;
create table t_index(
tid int not null PRIMARY key auto_increment ,
tname varchar(100) not null ,
tage TINYINT default 0 ,
tadd varchar(100) default '' ,
tel int default 0,
tmob varchar(20) DEFAULT '' ,
tsfz varchar(100) default ''
)
ENGINE=InnoDB DEFAULT CHARSET=utf8;
插入测试数据:
insert into t_index(tname,tage,tadd,tel,tmob,tsfz)
VALUES('张三风',110,'恒山' ,18099001122,'012-46319976','') ;
insert into t_index(tname,tage,tadd,tel,tmob,tsfz)
VALUES('朱元璋',56,'北京' ,18112401122,'012-40119976','') ;
insert into t_index(tname,tage,tadd,tel,tmob,tsfz)
VALUES('杨过',25,'武汉' ,18099112122,'012-46340116','') ;
insert into t_index(tname,tage,tadd,tel,tmob,tsfz)
VALUES('郭靖',45,'长沙' ,13149001122,'012-46900176','') ;
insert into t_index(tname,tage,tadd,tel,tmob,tsfz)
VALUES('黄老邪',100,'河北' ,13129001122,'012-49001976','') ;
insert into t_index(tname,tage,tadd,tel,tmob,tsfz)
VALUES('周伯通',102,'河南' ,15679001122,'012-46319001','') ;
insert into t_index(tname,tage,tadd,tel,tmob,tsfz)
VALUES('洪七公',78,'合肥' ,11243001122,'012-46319976','') ;
insert into t_index(tname,tage,tadd,tel,tmob,tsfz)
VALUES('欧阳峰',67,'广西' ,13214001122,'012-14009976','') ;
insert into t_index(tname,tage,tadd,tel,tmob,tsfz)
VALUES('欧阳可',27,'深圳' ,15123001122,'012-46314006','') ;
insert into t_index(tname,tage,tadd,tel,tmob,tsfz)
VALUES('尼玛',10,'上海' ,13125001122,'012-41400976','') ;
insert into t_index(tname,tage,tadd,tel,tmob,tsfz)
VALUES('杨康',30,'西藏' ,15798001122,'012-46311400','') ;
1.前缀索引:
如果列值很长,那么索引就有问题了,列值很长,索引的键值就很大,这样不仅浪费空间而且对查询
起不了什么作用,这个时候就只需要索引列的前半部分就行了,也就是前缀索引
那么问题就出来了 ,因该取列的前多少位才是最好的呢.
大致上遵从一个规则就是不同重复值占总数据的比值,比值越大,选择性越高,
看下面这个例子:
mysql> select count(DISTINCT left(submitcontent,10))/count(*) ct1,
-> count(DISTINCT left(submitcontent,30))/count(*) ct2,
-> count(DISTINCT left(submitcontent,40))/count(*) ct3,
-> count(DISTINCT left(submitcontent,50))/count(*) ct4,
-> count(DISTINCT left(submitcontent,60))/count(*) ct5,
-> count(DISTINCT left(submitcontent,100))/count(*) ct6
-> from GetDetail10dayBefore
-> ;
+--------+--------+--------+--------+--------+--------+
| ct1 | ct2 | ct3 | ct4 | ct5 | ct6 |
+--------+--------+--------+--------+--------+--------+
| 0.6149 | 0.8388 | 0.9463 | 0.9761 | 0.9791 | 0.9821 |
+--------+--------+--------+--------+--------+--------+
ct1,ct2,ct3,ct4,ct5,ct6分别取列值的前10,30,40,50,60,100
这里取前100位的选择性最高,但前100并不是最佳选择,这里我们还要考虑到
字符数,纵观以上6个值中,取前40是最好的。
取前40作为索引值
mysql> alter table GetDetail10dayBefore add key(submitcontent(40)) ;
Query OK, 335 rows affected (0.01 sec)
Records: 335 Duplicates: 0 Warnings: 0
mysql> EXPLAIN
-> select * from GetDetail10dayBefore
-> where submitcontent ='asdf';
+----+-------------+----------------------+------+-----------------------------------------------+---------------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------------+------+-----------------------------------------------+---------------+---------+-------+------+-------------+
| 1 | SIMPLE | GetDetail10dayBefore | ref | submitcontent,submitcontent_2,submitcontent_3 | submitcontent | 82 | const | 1 | Using where |
+----+-------------+----------------------+------+-----------------------------------------------+---------------+---------+-------+------+-------------+
从执行计划看,该查询使用到了索引的
2.多列索引:
有的时候单列索引无法满足需求 ,这个时候就需要创建多列索引了。多列索引就真的能使用查询
变快吗 ,这个是不确定的。这个里面还取决与一个非常重要的因素-列的顺序。
这里我们还是应该遵循一个原则,列的选择性越高,就越靠前
比如:
现在我要创建一个多列包含tname,tage,tadd的多列索引
我们先计算下tname,tage,tadd的选择性
mysql> select count(distinct tname)/count(*) tname,count(distinct tage)/count(*) tage,count(distinct tadd)/count(*) tadd from t_index ;
+--------+--------+--------+
| tname | tage | tadd |
+--------+--------+--------+
| 0.8824 | 0.7647 | 0.8235 |
+--------+--------+--------+
1 row in set (0.00 sec)
从以上结果中可以看出tname的选择性最高,其次是tadd,最后tage 的选择性最低。
那么索引中德列的顺序应该(tname,tadd,tage)
alter table t_index add key (tname,tadd,tage)
|
|