Oracle 表的连接方式(2)-----HASH JOIN的基本机制1

浏览数：22 / 时间：2015年06月12日

　　我们对hash join的常见误解，一般包括两个：

　　第一个误解:是我们经常以为hash join需要对两个做join的表都做全表扫描

　　第二个误解:是经常以为hash join会选择比较小的表做build table

纠正第一个误解：

　　我们经常以为hash join需要对两个做join的表都做全表扫描，但实际情况HASH JOIN是不会限制SQL的访问方法的。我们用下面的测试来验证：

--创建测试表probe_tab：
SQL> create table probe_tab
  2  initrans 3
  3  nologging
  4  as
  5  with generator as (
  6   select 
  7   rownum id
  8   from all_objects 
  9   where rownum <= 3000
 10  )
 11  select
 12   10000 + rownum id,
 13   trunc(dbms_random.value(0,5000)) n1,
 14   rpad(rownum,20) probe_vc,
 15   rpad(‘x‘,500) probe_padding
 16  from
 17   generator v1,
 18   generator v2
 19  where
 20   rownum <= 5000
 21  ;

Table created.

--创建主键
SQL> alter table probe_tab add constraint pb_pk primary key(id);

Table altered.

--创建测试表build_tab：
SQL> create table build_tab
  2  initrans 3
  3  nologging
  4  as
  5  with generator as (
  6   select 
  7   rownum id
  8   from all_objects 
  9   where rownum <= 3000
 10  )
 11  select
 12   rownum id,
 13   10001 + trunc(dbms_random.value(0,5000)) id_probe,
 14   rpad(rownum,20) build_vc,
 15   rpad(‘x‘,500) build_padding
 16  from
 17   generator v1,
 18   generator v2
 19  where
 20   rownum <= 5000
 21  ;

Table created.

--创建主键
SQL> alter table build_tab add constraint bu_pk primary key(id);


--创建外键
SQL> alter table build_tab add constraint bu_fk_pb foreign key (id_probe) references probe_tab;

--创建索引
SQL> create index bu_fk_pb on build_tab(id_probe);


--收集两个表的统计信息
SQL> begin
  2   dbms_stats.gather_table_stats(
  3   ‘SYS‘,
  4   ‘build_tab‘,
  5   cascade => true,
  6   estimate_percent => null,
  7   method_opt => ‘for all columns size 1‘
  8   );
  9  end;
 10  /

SQL> begin
  2   dbms_stats.gather_table_stats(
  3   ‘SYS‘,
  4   ‘probe_tab‘,
  5   cascade => true,
  6   estimate_percent => null,
  7   method_opt => ‘for all columns size 1‘
  8   );
  9  end;
 10  /


--我们使用workarea_size_policy = manual并且设置hash_area_size = 1048576，
--因为hash join的效率和我们分配给它的内存多少关系很大，
--所以为了能够精确的控制分配的内存数量，我们采用了手工的分配方式
SQL> begin
  2  
  3   begin execute immediate ‘alter session set workarea_size_policy = manual‘;
  4   exception when others then null;
  5   end;
  6  
  7   begin execute immediate ‘alter session set hash_area_size = 1048576‘;
  8   exception when others then null;
  9   end;
 10  
 11  end;
 12  /

PL/SQL procedure successfully completed.

SQL> set autotrace traceonly explain
SQL> set line 200
SQL> select
  2   bu.build_vc,
  3   pb.probe_vc,
  4   pb.probe_padding
  5  from
  6   build_tab bu,
  7   probe_tab pb
  8  where
  9   bu.id between 1 and 500
 10  and pb.id = bu.id_probe
 11  ;

Execution Plan
----------------------------------------------------------
Plan hash value: 2915561138

------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |   500 |   271K|   149   (1)| 00:00:02 |
|*  1 |  HASH JOIN                   |           |   500 |   271K|   149   (1)| 00:00:02 |
|   2 |   TABLE ACCESS BY INDEX ROWID| BUILD_TAB |   500 | 15000 |    42   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | BU_PK     |   500 |       |     3   (0)| 00:00:01 |
|   4 |   TABLE ACCESS FULL          | PROBE_TAB |  5000 |  2573K|   106   (0)| 00:00:02 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("PB"."ID"="BU"."ID_PROBE")
   3 - access("BU"."ID">=1 AND "BU"."ID"<=500)

SQL>

　　结论：根据计划的第3步（index range scan），可以确定hash join并不会限制SQL的访问方法。

注：在执行计划中靠近HASH JOIN的表为build table（内表）

纠正第二误解：

　　我们通常以为hash join会选择比较小的表做build table，但看看建表语句，这两个表其实是一样大的。所以并不是选择比较小的表，而是看哪个表上得到的结果集比较小，就把哪个表作为build table。进一步说，Oracle是如何比较哪个表上的结果集比较小呢？为了说明这一点，我们需要把原来的SQL拆分成2部分:

--第一部分sql
select 
    bu.id,
    bu.build_vc,
    bu.id_probe
from build_tab bu
where    bu.id between 1 and 500    ;

--第二部分sql
select 
    pb.probe_vc,
    pb.probe_padding,
    pb.id
from probe_tab pb;

　 Oracle会根据这两个虚拟的查询的返回结果决定到底哪个表的返回结果集表较小

　　这个结果集的大小则是 = 结果集的行数*user_tab_columns.sum(avg_col_len)计算得出的

　　需要注意的是：一般我们收集统计数据的方式都是dbms_stats，有的时候由于历史原因我们可能还在使用analyze，当在计算 avg_col_len的时候，我们会发现dbms_stats计算出来的avg_col_len一般要比analyze计算的要大1，这是因为我们表里的列除了数据占用空间，列本身也是需要空间的，当计算avg_col_len的时候dbms_stats注意到了这一点，而analyze忽略了这一点。这可能导致同样的SQL在dbms_stats分析的系统上的执行计划，和analyze分析的系统上的执行计划不一样。

先查询两个表的user_tab_columns字节数：

SQL> select column_name, avg_col_len from user_tab_columns where table_name=‘PROBE_TAB‘;

COLUMN_NAME                    AVG_COL_LEN
------------------------------ -----------
ID                                       5
N1                                       4
PROBE_VC                                21
PROBE_PADDING                          501

SQL> select column_name, avg_col_len from user_tab_columns where table_name=‘BUILD_TAB‘;

COLUMN_NAME                    AVG_COL_LEN
------------------------------ -----------
ID                                       4
ID_PROBE                                 5
BUILD_VC                                21
BUILD_PADDING                          501

SQL>

　　根据拆分后的sql 得到的结果集：

　　表probe_tab的结果集是(5+21+521)*5000=2635000

　　表build_tab的结果集是(4+5+21)*500=15000

　　两个表的大小是相同的，但build_tab的结果集远远小于probe_tab的结果集，所以用表build_tab作为hash内表

作为验证，我们把原来查询的select list做一下修改，即改变查询的结果集:

--更改后的sql
select
 bu.build_vc,
 pb.probe_vc,
 bu.build_padding
from
 build_tab bu,
 probe_tab pb
where
 bu.id between 1 and 500
and pb.id = bu.id_probe;


--拆分后sql1：
select 
    bu.build_vc,
    bu.build_padding,
    bu.id,bu.
    id_probe
from build_tab bu
where bu.id between 1 and 500

--拆分后的sql2：
select 
    pb.probe_vc,
    pb.id
from probe_tab pb

根据拆分后的sql 得到的结果集：

probe_tab的结果集是(21+4)*5000=125000，

build_tab的结果集是(501+21+4+5)*500=265500，

根据结果集推测probe_tab为hash内表，验证结果probe_tab为HASH内表：

SQL> select
  2   bu.build_vc,
  3   pb.probe_vc,
  4   bu.build_padding
  5  from
  6   build_tab bu,
  7   probe_tab pb
  8  where
  9   bu.id between 1 and 500
 10  and pb.id = bu.id_probe;

Execution Plan
----------------------------------------------------------
Plan hash value: 861942995

------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |   500 |   271K|   149   (1)| 00:00:02 |
|*  1 |  HASH JOIN                   |           |   500 |   271K|   149   (1)| 00:00:02 |
|   2 |   TABLE ACCESS FULL          | PROBE_TAB |  5000 |   126K|   106   (0)| 00:00:02 |
|   3 |   TABLE ACCESS BY INDEX ROWID| BUILD_TAB |   500 |   259K|    42   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN          | BU_PK     |   500 |       |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("PB"."ID"="BU"."ID_PROBE")
   4 - access("BU"."ID">=1 AND "BU"."ID"<=500)

SQL>