|
6#
楼主 |
发表于 2010-10-24 13:34
|
只看该作者
【sql调优之执行计划】merge sort join
版本:10g 10.2.0.1
Sort merge join排序合并连接的原理是两个连接的行集按照连接列先分别排序,然后再做连接。
Oracle使用merge join的倾向程度和sort_area_size和db_file_mutliblock_read_count两个初始参数有关,因为需要排序,所以较大的sort_area_size设置会更有利于使用merge join。而多块读的设置,定义了oracle操作一次全表扫描或者索引范围扫描返回的block数量,如果设置的太高,优化器将倾向于全表扫描,而太低则倾向于使用索引。在没有索引的情况下,而两个行集都接近有序,则使用merge join也可能提升效率。
看个例子:
SQL> select /*+ use_merge(a,b)*/
2 *
3 from scott.emp a, scott.dept b
4 where a.deptno = b.deptno
5 ;
已选择15行。
执行计划
----------------------------------------------------------
Plan hash value: 844388907
--------------------------------------------------------------------------------
--------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Ti
me |
--------------------------------------------------------------------------------
--------
| 0 | SELECT STATEMENT | | 15 | 855 | 6 (17)| 00
:00:01 |
| 1 | MERGE JOIN | | 15 | 855 | 6 (17)| 00
:00:01 |
| 2 | TABLE ACCESS BY INDEX ROWID| DEPT | 4 | 80 | 2 (0)| 00
:00:01 |
| 3 | INDEX FULL SCAN | PK_DEPT | 4 | | 1 (0)| 00
:00:01 |
|* 4 | SORT JOIN | | 15 | 555 | 4 (25)| 00
:00:01 |
| 5 | TABLE ACCESS FULL | EMP | 15 | 555 | 3 (0)| 00
:00:01 |
--------------------------------------------------------------------------------
--------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("A"."DEPTNO"="B"."DEPTNO")
filter("A"."DEPTNO"="B"."DEPTNO")
统计信息
----------------------------------------------------------
38 recursive calls
5 db block gets
12 consistent gets
2 physical reads
1012 redo size
1687 bytes sent via SQL*Net to client
385 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
15 rows processed
上述信息可以看到dept使用了索引,而emp为全表,由于索引已经排序,故而后面的1 sort(memory)即一次内存排序完成了对emp表按连接列deptno的排序。
下面稍稍修改一下上述sql,使得两个表都使用全表扫描,则可以看到,进行了2次sort join,后面的2 sorts (memory)两次内存排序也说明了这点,
例如:
SQL>
SQL> select /*+ no_index(a) no_index(b) use_merge(a,b)*/
2 *
3 from scott.emp a, scott.dept b
4 where a.deptno = b.deptno
5 ;
已选择15行。
执行计划
----------------------------------------------------------
Plan hash value: 1407029907
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 15 | 855 | 8 (25)| 00:00:01 |
| 1 | MERGE JOIN | | 15 | 855 | 8 (25)| 00:00:01 |
| 2 | SORT JOIN | | 4 | 80 | 4 (25)| 00:00:01 |
| 3 | TABLE ACCESS FULL| DEPT | 4 | 80 | 3 (0)| 00:00:01 |
|* 4 | SORT JOIN | | 15 | 555 | 4 (25)| 00:00:01 |
| 5 | TABLE ACCESS FULL| EMP | 15 | 555 | 3 (0)| 00:00:01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("A"."DEPTNO"="B"."DEPTNO")
filter("A"."DEPTNO"="B"."DEPTNO")
统计信息
----------------------------------------------------------
21 recursive calls
5 db block gets
14 consistent gets
5 physical reads
1076 redo size
1687 bytes sent via SQL*Net to client
385 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
15 rows processed
SQL>
如果sort_area_size不足以完成排序操作,则将借助temp表空间来完成排序,即为磁盘排序,大量的磁盘排序是merge join性能下降的一个主要原因。
看一个例子:
SQL> create table t_policy_merge as
2 select * from t_policy a where rownum <1000;
表已创建。
create table t_policy_merge_big as
select * from t_policy union all
select * from t_policy union all
select * from t_policy union all
select * from t_policy union all
select * from t_policy ;
insert into t_policy_merge_big select* from t_policy_merge_big ;
commit;
SQL> select /*+ use_merge(a,b)*/
2 *
3 from t_policy_merge_big a, t_policy_merge b
4 where a.policy_id = b.policy_id;
已选择9990行。
执行计划
----------------------------------------------------------
Plan hash value: 2074009417
--------------------------------------------------------------------------------
------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost
(%CPU)| Time |
--------------------------------------------------------------------------------
------------------
| 0 | SELECT STATEMENT | | 3654 | 65M| | 198
K (1)| 00:39:38 |
| 1 | MERGE JOIN | | 3654 | 65M| | 198
K (1)| 00:39:38 |
| 2 | SORT JOIN | | 1000 | 9149K| 15M| 1971
(1)| 00:00:24 |
| 3 | TABLE ACCESS FULL| T_POLICY_MERGE | 1000 | 9149K| | 16
(7)| 00:00:01 |
|* 4 | SORT JOIN | | 99803 | 891M| 1559M| 196
K (1)| 00:39:15 |
| 5 | TABLE ACCESS FULL| T_POLICY_MERGE_BIG | 99803 | 891M| | 1321
(6)| 00:00:16 |
--------------------------------------------------------------------------------
------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("A"."POLICY_ID"="B"."POLICY_ID")
filter("A"."POLICY_ID"="B"."POLICY_ID")
Note
-----
- dynamic sampling used for this statement
统计信息
----------------------------------------------------------
60 recursive calls
226 db block gets
5776 consistent gets
10271 physical reads
132 redo size
744643 bytes sent via SQL*Net to client
7700 bytes received via SQL*Net from client
667 SQL*Net roundtrips to/from client
1 sorts (memory)
1 sorts (disk)
9990 rows processed
SQL>
如果使用hash join则会将小表T_POLICY_MERGE hash,从而避免了对大表的磁盘排序,从下面的黑体的db_block_gets, physical reads和sorts (disk)可以看到这个差别。
SQL> select
2 *
3 from t_policy_merge_big a, t_policy_merge b
4 where a.policy_id = b.policy_id;
已选择9990行。
执行计划
----------------------------------------------------------
Plan hash value: 3615210701
--------------------------------------------------------------------------------
-----------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (
%CPU)| Time |
--------------------------------------------------------------------------------
-----------------
| 0 | SELECT STATEMENT | | 3654 | 65M| | 46107
(1)| 00:09:14 |
|* 1 | HASH JOIN | | 3654 | 65M| 9168K| 46107
(1)| 00:09:14 |
| 2 | TABLE ACCESS FULL| T_POLICY_MERGE | 1000 | 9149K| | 16
(7)| 00:00:01 |
| 3 | TABLE ACCESS FULL| T_POLICY_MERGE_BIG | 99803 | 891M| | 1321
(6)| 00:00:16 |
--------------------------------------------------------------------------------
-----------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("A"."POLICY_ID"="B"."POLICY_ID")
Note
-----
- dynamic sampling used for this statement
统计信息
----------------------------------------------------------
7 recursive calls
0 db block gets
6491 consistent gets
5724 physical reads
0 redo size
2287605 bytes sent via SQL*Net to client
7700 bytes received via SQL*Net from client
667 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
9990 rows processed
SQL>
注意到前面有些redo的情况,特别是scott下表的测试,做个10046
SQL> alter session set events '10046 trace name context forever, level 8';
会话已更改。
查看了trace内容后发现:
insert into sys.aud$( sessionid,entryid,statement,ntimestamp#, userid,
userhost,terminal,action#,returncode, obj$creator,obj$name,auth$privileges,
auth$grantee, new$owner,new$name,ses$actions,ses$tid,logoff$pread,
logoff$lwrite,logoff$dead,comment$text,spare1,spare2, priv$used,clientid,
sessioncpu,proxy$sid,user$guid, instance#,process#,xid,scn,auditid,sqlbind,
sqltext)
values
(:1,:2,:3,SYS_EXTRACT_UTC(SYSTIMESTAMP), :4,:5,:6,:7,:8, :9,:10,:11,
:12, :13,:14,:15,:16,:17, :18,:19,:20,:21,:22, :23,:24,:25,:26,
:27, :28,:29,:30,:31,:32,:33,:34)
insert into plan_table (statement_id, timestamp, operation, options,
object_node, object_owner, object_name, object_instance, object_type,
search_columns, id, parent_id, position, other,optimizer, cost, cardinality,
bytes, other_tag, partition_start, partition_stop, partition_id,
distribution, cpu_cost, io_cost, temp_space, access_predicates,
filter_predicates, projection, time, qblock_name, object_alias, plan_id,
depth, remarks, other_xml )
values
(:1,:2,:3,:4,:5,:6,:7,:8,:9,:10,:11,:12,:13,:14,:15,:16,:17,:18,:19,:20,:21,
:22,:23,:24,:25,:26,:27,:28,:29,:30,:31,:32,:33,:34,:35,:36)
sys.aud$表以及plan_table有insert操作,是由于scott上开了审计功能。
[ 本帖最后由 yellowlee 于 2010-10-24 13:38 编辑 ] |
|