|
谢谢楼上的几位捧场子
我继续:
介绍一些次常用的hint,
我们都熟悉那些常用的Hint,
比如用来调整access path的 (如full, index, parallel...),
调整join order的 (如order, leading...),
调整join path的(use_nl, use_hash...)
调整remote access的 (如driving_site),
调整查询转化的(如rewrite, unnest, merge),
这些都是比较常用的,说起来没什么意思,现在说几个"次常用"的hint.
1)dynamic_sampling,
这个Hint在sql reference里面有介绍:
The DYNAMIC_SAMPLING hint instructs the optimizer how to control dynamic sampling to improve server performance by determining more accurate predicate selectivity and statistics for tables and indexes.
drop table t1;
drop table t2;
create table t1 (x int, y int);
create index t1_xy on t1(x,y);
create table t2 (x int);
create index t2_x on t2(x);
insert into t1 select mod(rownum, 10), mod(rownum, 10) from dba_objects where rownum<=50000;
insert into t2 select mod(rownum, 50) from dba_objects where rownum<=50000;
commit;
exec dbms_stats.gather_table_stats(user,'T1');
exec dbms_stats.gather_table_stats(user,'T2');
SQL> set autotrace traceonly exp
SQL> select t1.x, t2.x from t1,t2
2 where t1.x=1 and t1.y=1 and t2.x=t1.x;
执行计划
----------------------------------------------------------
Plan hash value: 4152676900
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 501K| 3428K| 32 (29)| 00:00:01 |
|* 1 | HASH JOIN | | 501K| 3428K| 32 (29)| 00:00:01 |
|* 2 | INDEX RANGE SCAN| T1_XY | 496 | 2480 | 19 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN| T2_X | 1011 | 2022 | 4 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("T2"."X"="T1"."X")
2 - access("T1"."X"=1 AND "T1"."Y"=1)
3 - access("T2"."X"=1)
SQL> select /*+ dynamic_sampling(4) */ t1.x, t2.x from t1,t2
where t1.x=1 and t1.y=1 and t2.x=t1.x;
执行计划
----------------------------------------------------------
Plan hash value: 224376697
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5015K| 33M| 110 (80)| 00:00:02 |
|* 1 | HASH JOIN | | 5015K| 33M| 110 (80)| 00:00:02 |
|* 2 | INDEX RANGE SCAN| T2_X | 1011 | 2022 | 4 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN| T1_XY | 4963 | 24815 | 19 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("T2"."X"="T1"."X")
2 - access("T2"."X"=1)
3 - access("T1"."X"=1 AND "T1"."Y"=1)
Note
-----
- dynamic sampling used for this statement
那个执行计划更好一些?应该是第二个,
这里的第一个执行计划把t1作为驱动表,t2作为被驱动表,因为oracle认为t1.x=1 and t1.y=1应该返回更少的行数。
oracle在判断t1.x=1 and t1.y=1的总行数 => (t1.x=1的选择率 * t1.y=1的选择率)* t1行数 => (0.1 * 0.1)* 50000 = 500 (计划里是496)
oracle在判断t2.x=1的总行数 => t2.x=1的选择率 * t2行数 => 0.02 * 50000 = 1000.
所以oracle把t1作为驱动表是可以理解的,但问题是实际的数据分布t1.x总是等于t1.y,所以真实的数据t1.x=1 and t1.y=1的总行数 = t1.x=1的选择率 * t1行数 = 0.1 * 50000 = 5000 (计划里的4963) !
为了让oracle认识到这一点,我们使用level 4的dynamic_sampling, dynamic_sampling level 4的意思是:
Level 4: Apply dynamic sampling to all tables that meet Level 3 criteria, plus all
tables that have single-table predicates that reference 2 or more columns. The
number of blocks sampled is the default number of dynamic sampling blocks. For
unanalyzed tables, the number of blocks sampled is two times the default number
of dynamic sampling blocks
从level 4开始oracle开始对“tables that have single-table predicates that reference 2 or more columns” 这种情况开始采样,所以能够意识到真实的数据分布是什么样,所以我们
看到第二个是以t2为驱动表的。
这种类似的问题在复杂的SQL里面会很常见。
但是dynamic_sampling这个Hint会在oracle解析的时候产生额外的开销,我很少真的在代码里用,不过我经常在调优的时候用。
比如当我运行完这个SQL语句之后:
select /*+ dynamic_sampling(4) */ t1.x, t2.x from t1,t2
where t1.x=1 and t1.y=1 and t2.x=t1.x;
我会马上跟一个:
select * from table(dbms_xplan.display_cursor(null, 0 , 'advanced'));
这样就可以得到为了实现这个比较好的计划所需要的所有的Hint,你如果要在系统里加hint, 动态采样+dbms_xplan.display_cursor是个捷径:
/*+
BEGIN_OUTLINE_DATA
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('10.2.0.1')
ALL_ROWS
OUTLINE_LEAF(@"SEL$1")
INDEX(@"SEL$1" "T2"@"SEL$1" ("T2"."X"))
INDEX(@"SEL$1" "T1"@"SEL$1" ("T1"."X" "T1"."Y"))
LEADING(@"SEL$1" "T2"@"SEL$1" "T1"@"SEL$1")
USE_HASH(@"SEL$1" "T1"@"SEL$1")
END_OUTLINE_DATA
*/
在真实使用的时候你必须把outline相关的和IGNORE_OPTIM_EMBEDDED_HINTS这个奇怪的Hint去掉,
也可以把OPTIMIZER_FEATURES_ENABLE('10.2.0.1'),ALL_ROWS去掉,因为这些环境不大会改变。
然后可以把oracle自定义的query block去掉,如果你的sql有多个query block的话可能需要自己定义query block的名字(见我前面的帖子)
我这个SQL最后留下这么几个hint就够了:
/*+
INDEX("T2" ("T2"."X"))
INDEX("T1" ("T1"."X" "T1"."Y"))
LEADING( "T2" "T1")
USE_HASH("T1")
*/
还有一个问题,就是有的SQL一跑就是几天,甚至会影响整个系统,你优化的时候不可能把它跑一遍,所以也不能用dbms_xplan.display_cursor,那你可以考虑这个方法。
alter session set events '10132 trace name context forever, level 1';
这里使用10053效果也类似,但我发现10053只有在hard parse的时候才生成trace,所以一般用10053之前可能需要flush shared_pool。
set autotrace traceonly exp
select /*+ dynamic_sampling(4) */ t1.x, t2.x from t1,t2
where t1.x=1 and t1.y=1 and t2.x=t1.x;
于是在trace里你可以看到下面的内容:
sql_id=62dhqxdq49t6a.
Current SQL statement for this session:
EXPLAIN PLAN SET STATEMENT_ID='PLUS1777' FOR select /*+ dynamic_sampling(4) */ t1.x, t2.x from t1,t2
where t1.x=1 and t1.y=1 and t2.x=t1.x
============
Plan Table
============
-------------------------------------+-----------------------------------+
| Id | Operation | Name | Rows | Bytes | Cost | Time |
-------------------------------------+-----------------------------------+
| 0 | SELECT STATEMENT | | | | 110 | |
| 1 | HASH JOIN | | 4897K | 33M | 110 | 00:00:02 |
| 2 | INDEX RANGE SCAN | T2_X | 1011 | 2022 | 4 | 00:00:01 |
| 3 | INDEX RANGE SCAN | T1_XY | 4963 | 24K | 19 | 00:00:01 |
-------------------------------------+-----------------------------------+
Predicate Information:
----------------------
1 - access("T2"."X"="T1"."X")
2 - access("T2"."X"=1)
3 - access("T1"."X"=1 AND "T1"."Y"=1)
Content of other_xml column
===========================
db_version : 10.2.0.1
parse_schema : HR
dynamic_sampling: yes
plan_hash : 224376697
Outline Data:
/*+
BEGIN_OUTLINE_DATA
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('10.2.0.1')
ALL_ROWS
OUTLINE_LEAF(@"SEL$1")
INDEX(@"SEL$1" "T2"@"SEL$1" ("T2"."X"))
INDEX(@"SEL$1" "T1"@"SEL$1" ("T1"."X" "T1"."Y"))
LEADING(@"SEL$1" "T2"@"SEL$1" "T1"@"SEL$1")
USE_HASH(@"SEL$1" "T1"@"SEL$1")
END_OUTLINE_DATA
*/
[ 本帖最后由 fusnow 于 2010-2-5 09:54 编辑 ] |
|