[DATABASE] MySQL 最左匹配原则
* 本文基于 MySQL 5.7 + InnoDB 引擎
最左匹配原则(leftmost perfix,也可以翻译为最左前缀),一般现在网上通俗化的解释是:
在 MySQL 建立联合索引时会遵守最左匹配原则,即最左优先,在检索数据时从联合索引的最左边开始匹配。
其实这个解释会让人搞不太清楚具体是什么意思,再看看官方文档是怎么描述的(在 MySQL 5.7 Reference Manual - 8.3.6 Multiple-Column Indexes 一节中):
MySQL can use multiple-column indexes for queries that test all the columns in the index, or queries that test just the first column, the first two columns, the first three columns, and so on. If you specify the columns in the right order in the index definition, a single composite index can speed up several kinds of queries on the same table.
简单翻译一下:
MySQL 可以使用联合索引来处理涉及到索引中的所有列或仅涉及索引中前几列的查询。如果你按正确的顺序在索引定义中指定这些列,那么联合索引就可以加速同一张表上多种类型的查询。
不论是网上一般的解释,还是官方描述其实都比较简单,但在实际查询过程中情况就比较复杂了,那么下面通过几个 SQL 查询例子试着搞清楚。
来点 🌰
先创建一个测试表,然后添加一点测试数据:
-- 建表
CREATE TABLE `test_tb` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`col1` int(11) DEFAULT NULL,
`col2` varchar(32) CHARACTER SET utf8 DEFAULT NULL,
`col3` int(11) DEFAULT NULL,
`col4` int(11) DEFAULT NULL,
PRIMARY KEY (`id`), -- 主键索引
KEY `idx1` (`col1`,`col2`,`col3`) -- 联合索引
) ENGINE=InnoDB AUTO_INCREMENT=11 DEFAULT CHARSET=utf8mb4;
-- 测试数据
insert into test_tb(col1,col2,col3,col4) VALUE(101,"102",103,104);
insert into test_tb(col1,col2,col3,col4) VALUE(202,"202",203,204);
insert into test_tb(col1,col2,col3,col4) VALUE(301,"302",303,304);
insert into test_tb(col1,col2,col3,col4) VALUE(401,"402",403,404);
insert into test_tb(col1,col2,col3,col4) VALUE(501,"502",503,504);
insert into test_tb(col1,col2,col3,col4) VALUE(601,"602",603,604);
insert into test_tb(col1,col2,col3,col4) VALUE(701,"702",703,704);
insert into test_tb(col1,col2,col3,col4) VALUE(801,"802",803,804);
insert into test_tb(col1,col2,col3,col4) VALUE(901,"902",903,904);
insert into test_tb(col1,col2,col3,col4) VALUE(1001,"1002",1003,1004);
先看下面几条 SQL 的执行分析结果:
mysql> explain select * from test_tb where col1 = 1; --索引生效
+----+-------------+---------+------------+------+---------------+------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+-------+------+----------+-------+
| 1 | SIMPLE | test_tb | NULL | ref | idx1 | idx1 | 5 | const | 1 | 100.00 | NULL |
+----+-------------+---------+------------+------+---------------+------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
mysql> explain select * from test_tb where col1 = 1 and col2 = '2'; --索引生效
+----+-------------+---------+------------+------+---------------+------+---------+-------------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+-------------+------+----------+-------+
| 1 | SIMPLE | test_tb | NULL | ref | idx1 | idx1 | 104 | const,const | 1 | 100.00 | NULL |
+----+-------------+---------+------------+------+---------------+------+---------+-------------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
mysql> explain select * from test_tb where col1 = 1 and col2 = '2' and col3 = 3; --索引生效
+----+-------------+---------+------------+------+---------------+------+---------+-------------------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+-------------------+------+----------+-------+
| 1 | SIMPLE | test_tb | NULL | ref | idx1 | idx1 | 109 | const,const,const | 1 | 100.00 | NULL |
+----+-------------+---------+------------+------+---------------+------+---------+-------------------+------+----------+-------+
1 row in set, 1 warning (0.01 sec)
mysql> explain select * from test_tb where col2 = '2' and col1 = 1 and col3 = 3; --索引生效
+----+-------------+---------+------------+------+---------------+------+---------+-------------------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+-------------------+------+----------+-------+
| 1 | SIMPLE | test_tb | NULL | ref | idx1 | idx1 | 109 | const,const,const | 1 | 100.00 | NULL |
+----+-------------+---------+------------+------+---------------+------+---------+-------------------+------+----------+-------+
1 row in set, 1 warning (0.01 sec)
mysql> explain select * from test_tb where col2 = '2' and col3 = 3; --索引不生效
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | test_tb | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 10.00 | Using where |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
mysql> explain select * from test_tb where col3 = 3 and col1 = 1; --只有 col1 列索引生效
+----+-------------+---------+------------+------+---------------+------+---------+-------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+-------+------+----------+-----------------------+
| 1 | SIMPLE | test_tb | NULL | ref | idx1 | idx1 | 5 | const | 1 | 10.00 | Using index condition |
+----+-------------+---------+------------+------+---------------+------+---------+-------+------+----------+-----------------------+
1 row in set, 1 warning (0.01 sec)
根据分析结果的 type 列和 ref 列(Explain 的详解可以参考官方文档 8.8.2 EXPLAIN Output Format)可以发现,只要查询条件中的 N 列包含于联合索引的最左连续的 N 列,联合索引就会生效。也就是说我们创建联合索引 idx1(col1,col2,col3)
,相当于创建了 (col1)、(col1,col2)、(col1,col2,col3)
这三个索引。 也可以发现查询条件中列顺序不会影响索引生效,这是因为 MySQL 的查询优化器,这点顺序问题还是能解决的。
但是下面这句 SQL 的分析结果:
mysql> explain select col1,col2,col3 from test_tb where col2 ='2' and col3 = 3;
+----+-------------+---------+------------+-------+---------------+------+---------+------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+-------+---------------+------+---------+------+------+----------+--------------------------+
| 1 | SIMPLE | test_tb | NULL | index | NULL | idx1 | 109 | NULL | 10 | 10.00 | Using where; Using index |
+----+-------------+---------+------------+-------+---------------+------+---------+------+------+----------+--------------------------+
1 row in set, 1 warning (0.01 sec)
可以看到 type 不是 NULL,说明是走了索引的,但是这不符合联合索引需要最左连续列的定义,为什么呢?
覆盖索引
我们知道在 InnoDB 中,表都是根据主键顺序以索引的形式存放的,这种存储方式的表称为索引组织表。而 InnoDB 使用了 B+ 树(本节对 B+ 树不做详解)索引模型,所以数据都是存储在 B+ 树中的。根据叶子节点的内容,索引类型分为主键索引和非主键索引。
- 主键索引(clustered index)也就是聚簇索引,在叶子节点存的是整行数据。(注:聚簇索引默认是主键,如果表中没有定义主键,InnoDB 会选择一个唯一的非空索引代替。如果没有这样的索引,InnoDB 会隐式定义一个主键来作为聚簇索引。)
- 非主键索引(secondary index)也就是二级索引,在叶子节点存的是主键的值。
所以使用 select * from test_tb where id = 1;
这种主键查询方式,就只需要查询主键索引对应的那棵树。而使用 select * from test_tb where col1 = 1;
这样的非主键索引查询方式,就需要先查询 idx1 这个索引对应的那棵树获得 id 值之后,再到主键索引树查询,这个过程称之为回表。
使用非主键查询需要多扫描一棵树,会降低查询效率,怎么尽量不回表呢?除了主键查询,就是覆盖索引了。根据上面的概述,如果二级索引查询的时候只查询查询索引里包含的字段,也就是索引中覆盖了你查询的字段,那就不需要回表了。这就解释了上面的示例为什么是走了索引的,因为查询的字段包含在联合索引 idx1(col1,col2,col3)
中,不需要回表,也就是覆盖索引。
中断匹配索引
之前的示例中,最后一条 SQL 条件列不连续会中断索引,还有哪些情况会中断联合索引,最常见的说法是遇到范围查询如(<,>,between,like
)就会中断匹配索引。
示例:
mysql> explain select * from test_tb where col1 > 1 and col2 = '2' and col3 = 3;
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | test_tb | NULL | ALL | idx1 | NULL | NULL | NULL | 10 | 10.00 | Using where |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.01 sec)
根据测试,确实如 <,>,between,!=,not in
这样的范围查询是会中断匹配的(因为篇幅没有把所有查询示例列出,感兴趣的可以自己执行查看),虽然 in 也是范围查询,但是不会中断匹配:
mysql> explain select * from test_tb where col1 in (1,2) and col2 = '2' and col3 = 3; --(col1 = 1 or col1 =2)结果是一样的
+----+-------------+---------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+
| 1 | SIMPLE | test_tb | NULL | range | idx1 | idx1 | 109 | NULL | 2 | 100.00 | Using index condition |
+----+-------------+---------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+
1 row in set, 1 warning (0.00 sec)
但是 like
的情况有些复杂:
* 注:like
默认是左匹配查询,也就是 like 'x'
等价于 like 'x%'
。rlike
就是右匹配查询。
mysql> explain select * from test_tb where col1 = 1 and col2 like '1' and col3 = 3;
+----+-------------+---------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+
| 1 | SIMPLE | test_tb | NULL | range | idx1 | idx1 | 109 | NULL | 1 | 100.00 | Using index condition |
+----+-------------+---------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+
1 row in set, 1 warning (0.01 sec)
mysql> explain select * from test_tb where col1 = 1 and col2 like '%1' and col3 = 3; -- like '%1%' 执行分析结果相同
+----+-------------+---------+------------+------+---------------+------+---------+-------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+-------+------+----------+-----------------------+
| 1 | SIMPLE | test_tb | NULL | ref | idx1 | idx1 | 5 | const | 1 | 10.00 | Using index condition |
+----+-------------+---------+------------+------+---------------+------+---------+-------+------+----------+-----------------------+
1 row in set, 1 warning (0.01 sec)
可以发现,使用左匹配 like '1'
不会中断匹配索引。like '%1'
和 like '%1%'
才会中断匹配。
也就是说最左匹配原则不光指联合索引中的列,也是字符串索引中的字符。
其他情况:
mysql> explain select * from test_tb where col1 = 1 or col2 = '2' and col3 = 3; -- 列间使用 or
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | test_tb | NULL | ALL | idx1 | NULL | NULL | NULL | 10 | 10.90 | Using where |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
mysql> explain select * from test_tb where col1 - 1 = 1 and col2 = '2' and col3 = 3; -- 列参与计算
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | test_tb | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 10.00 | Using where |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
mysql> explain select * from test_tb where date(col1) = 1 and col2 = '2' and col3 = 3; -- 列包含在函数式里
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | test_tb | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 10.00 | Using where |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.01 sec)