mysql子查询的缺陷以及5.6的优化

 

分类

子查询分为from clausewhere clause,在执行计划的select_type分别显示为derived[dependent] subquery;

根据和外部查询的依赖关系又可分为两种,

相关子查询:子查询依赖外层连接的返回值

非相关子查询:子查询不依赖外层连接的返回值


缺陷

pre-5.6通常会将非相关子查询改为相关子查询,即先遍历outer table,对于其返回的每一条记录都执行一次subquery
注:mysql目前只实现nested-loop join,所以dependent subquery才会如此消耗资源,如果是oracle则可进行semi/anti hash join
http://blog.itpub.net/15480802/viewspace-703260

SELECT * FROM t1 WHERE t1.a IN (SELECT t2.b FROM t2 WHERE where_condition);

有可能被重写为

SELECT * FROM t1 WHERE EXISTS (SELECT t2.b FROM t2 WHERE where_condition AND t1.a=t2.b);


Pre-5.6优化

Mysql可以将from clause的子查询进行物化,此时先执行inner query并将结果存于临时表

以下是一个5.0.3版本的优化案例  http://www.xaprb.com/blog/2006/04/30/how-to-optimize-subqueries-and-joins-in-mysql/   

select * from subcategory

where id in (

    select c.id

    from subcategory as c

        inner join item as i on i.subcategory = c.id

    where c.category = 14

    group by c.id

    having count(*) > 2000

);

此时mysql optimizer自作聪明的将非相关子查询改写为相关子查询,执行计划如下:

因为subquery被转化为相关子查询,即先遍历subcategory(outer table),对于每条记录都执行一次subquery(总计300783)

*************************** 1. row ***************************

           id: 1

  select_type: PRIMARY

        table: subcategory

         type: ALL

possible_keys: NULL

          key: NULL

      key_len: NULL

          ref: NULL

         rows: 300783

        Extra: Using where

*************************** 2. row ***************************

           id: 2

  select_type: DEPENDENT SUBQUERY

        table: c

         type: ref

possible_keys: PRIMARY,category

          key: category

      key_len: 4

          ref: const

         rows: 100

        Extra: Using where; Using index; Using temporary; Using filesort

*************************** 3. row ***************************

           id: 2

  select_type: DEPENDENT SUBQUERY

        table: i

         type: ref

possible_keys: subcategory

          key: subcategory

      key_len: 4

          ref: c.id

         rows: 28

        Extra: Using index

优化:采用物化子查询,所做的就是将子查询改为from clause,即添加一对括号即可;

select * from subcategory

where id in (

    select id from (

        select c.id

        from subcategory as c

            inner join item as i on i.subcategory = c.id

        where c.category = 14

        group by c.id

        having count(*) > 2000

    ) as x

);


5.6优化

引入物化子查询(针对where clausesubquery)

5.6.5引入此功能,在此之前,优化器有时会把非关联子查询重写为相关子查询,导致效率变差;

子查询物化将子查询结果存入临时表,确保子查询只执行一次,该表不记录重复数据且采用哈希索引查找;

Optimizer_switch需设置materialization=on


优化derived table

以前的版本在explain时就会对from clausesubquery进行物化 ,引发了部分执行,5.6消除了这个问题;

另外,优化器可能会为derived table添加索引以加速执行

SELECT * FROM t1 JOIN (SELECT * FROM t2) AS derived_t2 ON t1.f1=derived_t2.f1;

优化器可以为derived_t2f1添加索引以采用ref

http://dev.mysql.com/doc/refman/5.6/en/subquery-materialization.html

郑重声明:本站内容如果来自互联网及其他传播媒体,其版权均属原媒体及文章作者所有。转载目的在于传递更多信息及用于网络分享,并不代表本站赞同其观点和对其真实性负责,也不构成任何其他建议。