|
在v8 fp9之后db2增加了statistical view的功能.通过对view里面的列收集统计信息,可以使优化器更准确地统计出多个表进行操作后的统计信息,使得优化器在某些比较特殊的数据分布情况下也能够有效地选择正确的访问计划.
在v8中statistical view还有很多限制,例如只支持最多2个表,而在v9中则得以完善,取消了全部限制.
在下面的连接中有比较详细的说明,为了直观起见,俺也做了一个step by step的test进行说明
http://www.ibm.com/developerwork ... 0612chen/index.html
D:\TEMP>db2level
DB21085I Instance "DB2" uses "32" bits and DB2 code release "SQL09010" with
level identifier "02010107".
Informational tokens are "DB2 v9.1.0.356", "s060629", "NT32", and Fix Pack "0".
Product is installed at "D:\PROGRA~1\IBM\SQLLIB\" with DB2 Copy Name
D:\>db2 connect to sampel2
Database Connection Information
Database server = DB2/NT 9.1.0
SQL authorization ID = TAOEWANG
Local database alias = SAMPEL2
<---创建两个极为简单的表
D:\>db2 "create table t1 (c1 int)"
DB20000I The SQL command completed successfully.
D:\>db2 "create table t2 (c1 int)"
DB20000I The SQL command completed successfully.
<---增加点复杂度,创建索引
D:\>db2 "create index i1 on t1 (c1)"
DB20000I The SQL command completed successfully.
D:\>db2 "create index i2 on t2 (c1)"
DB20000I The SQL command completed successfully.
<---对第一个表插入数据,其中最小的0,最大的1000000,1是low2key,999999是high2key,数据集中分布在999999上,cardinality为27
D:\>db2 "insert into t1 values (0)"
DB20000I The SQL command completed successfully.
D:\>db2 "insert into t1 values (1)"
DB20000I The SQL command completed successfully.
D:\>db2 "insert into t1 values (2)"
DB20000I The SQL command completed successfully.
D:\TEMP>db2 "insert into t1 values (6)"
DB20000I The SQL command completed successfully.
D:\TEMP>db2 "insert into t1 values (7)"
DB20000I The SQL command completed successfully.
D:\TEMP>db2 "insert into t1 values (8)"
DB20000I The SQL command completed successfully.
D:\TEMP>db2 "insert into t1 values (9)"
DB20000I The SQL command completed successfully.
D:\TEMP>db2 "insert into t1 values (10)"
DB20000I The SQL command completed successfully.
D:\TEMP>db2 "insert into t1 values (11)"
DB20000I The SQL command completed successfully.
D:\TEMP>db2 "insert into t1 values (12)"
DB20000I The SQL command completed successfully.
D:\TEMP>db2 "insert into t1 values (13)"
DB20000I The SQL command completed successfully.
D:\TEMP>db2 "insert into t1 values (14)"
DB20000I The SQL command completed successfully.
D:\TEMP>db2 "insert into t1 values (15)"
DB20000I The SQL command completed successfully.
D:\TEMP>db2 "insert into t1 values (16)"
DB20000I The SQL command completed successfully.
D:\TEMP>db2 "insert into t1 values (17)"
DB20000I The SQL command completed successfully.
D:\TEMP>db2 "insert into t1 values (18)"
DB20000I The SQL command completed successfully.
D:\TEMP>db2 "insert into t1 values (19)"
DB20000I The SQL command completed successfully.
D:\TEMP>db2 "insert into t1 values (20)"
DB20000I The SQL command completed successfully.
D:\TEMP>db2 "insert into t1 values (21)"
DB20000I The SQL command completed successfully.
D:\TEMP>db2 "insert into t1 values (22)"
DB20000I The SQL command completed successfully.
D:\TEMP>db2 "insert into t1 values (23)"
DB20000I The SQL command completed successfully.
D:\TEMP>db2 "insert into t1 values (24)"
DB20000I The SQL command completed successfully.
D:\TEMP>db2 "insert into t1 values (25)"
DB20000I The SQL command completed successfully.
D:\TEMP>db2 "insert into t1 values (26)"
DB20000I The SQL command completed successfully.
D:\TEMP>db2 "insert into t1 values (27)"
DB20000I The SQL command completed successfully.
D:\>db2 "insert into t1 values (999999)"
DB20000I The SQL command completed successfully.
D:\>db2 "insert into t1 values (1000000)"
DB20000I The SQL command completed successfully.
D:\>db2 "insert into t1 values (999999)"
DB20000I The SQL command completed successfully.
D:\>db2 "insert into t1 values (999999)"
DB20000I The SQL command completed successfully.
。。。。
<----总共116740行,其中999999占116736行
D:\>db2 "select count(*) from t1"
1
-----------
116762
1 record(s) selected.
D:\>db2 "select count(*) from t1 where c1=999999"
1
-----------
116736
1 record(s) selected.
<----另外一个表,最小999997,最大1000001,同样数据集中分布在999999上,,cardinality为5
D:\>db2 "insert into t2 values (999997)"
DB20000I The SQL command completed successfully.
D:\>db2 "insert into t2 values (999998)"
DB20000I The SQL command completed successfully.
D:\>db2 "insert into t2 values (1000000)"
DB20000I The SQL command completed successfully.
D:\>db2 "insert into t2 values (1000001)"
DB20000I The SQL command completed successfully.
D:\>db2 "insert into t2 values (999999)"
DB20000I The SQL command completed successfully.
D:\>db2 "insert into t2 values (999999)"
DB20000I The SQL command completed successfully.
......
D:\TEMP>db2 "select count(*) from t2"
1
-----------
32772
1 record(s) selected.
D:\TEMP>db2 "select count(*) from t2 where c1=999999"
1
-----------
32768
1 record(s) selected.
<----对两个表作runstats
D:\>db2 runstats on table taoewang.t1
DB20000I The RUNSTATS command completed successfully.
D:\>db2 runstats on table taoewang.t2
DB20000I The RUNSTATS command completed successfully.
D:\>db2 runstats on table taoewang.t1 and indexes all
DB20000I The RUNSTATS command completed successfully.
D:\>db2 runstats on table taoewang.t2 and indexes all
DB20000I The RUNSTATS command completed successfully.
<----做explain
D:\TEMP>db2 set current explain mode explain
DB20000I The SQL command completed successfully.
<---查询
D:\TEMP>db2 "select count(*) from t1, t2 where t1.c1=t2.c1"
SQL0217W The statement was not executed as only Explain information requests
are being processed. SQLSTATE=01604
D:\TEMP>db2 set current explain mode no
DB20000I The SQL command completed successfully.
D:\TEMP>db2exfmt -d sampel2 -g TIC -w -1 -n % -s % -# 0 -o exfmt.txt
DB2 Universal Database Version 9.1, 5622-044 (c) Copyright IBM Corp. 1991, 2006
Licensed Material - Program Property of IBM
IBM DATABASE 2 Explain Table Format Tool
Connecting to the Database.
Connect to Database Successful.
Output is in exfmt.txt.
Executing Connect Reset -- Connect Reset was Successful.
<----我们看到了什么?预期输出1.41723e+008可是实际呢?最少也不应该小于 116736x32768=3825205248=3.825205248+009的,也就是只有真实输出的3.7%左右
Total Cost: 7608.62
Query Degree: 1
Rows
RETURN
( 1)
Cost
I/O
|
1
GRPBY
( 2)
7608.62
998.926
|
1.41723e+008
HSJOIN
( 3)
2309.03
998.926
/-----+-----\
116762 32772
TBSCAN IXSCAN
( 4) ( 5)
469.144 602.804
436 522.789
| |
116762 32772
TABLE: TAOEWANG INDEX: TAOEWANG
T1 I2
<---现在创建statistical view,来统计join后的列信息
D:\TEMP>db2 "create view v1 as select t1.c1 as a, t2.c1 as b from t1, t2 where t1.c1=t2.c1"
DB20000I The SQL command completed successfully.
D:\TEMP>db2 "alter view v1 enable query optimization"
D:\TEMP>db2 runstats on table taoewang.v1 with distribution
DB20000I The RUNSTATS command completed successfully.
<---清空package cache
D:\TEMP>db2 terminate
DB20000I The TERMINATE command completed successfully.
D:\TEMP>db2 connect to sampel2
Database Connection Information
Database server = DB2/NT 9.1.0
SQL authorization ID = TAOEWANG
Local database alias = SAMPEL2
<---从新收集计划
D:\TEMP>db2 set current explain mode explain
DB20000I The SQL command completed successfully.
D:\TEMP>db2 "select count(*) from t1, t2 where t1.c1=t2.c1"
SQL0217W The statement was not executed as only Explain information requests
are being processed. SQLSTATE=01604
D:\TEMP>db2 set current explain mode no
DB20000I The SQL command completed successfully.
D:\TEMP>db2exfmt -d sampel2 -g TIC -w -1 -n % -s % -# 0 -o exfmt1.txt
DB2 Universal Database Version 9.1, 5622-044 (c) Copyright IBM Corp. 1991, 2006
Licensed Material - Program Property of IBM
IBM DATABASE 2 Explain Table Format Tool
Connecting to the Database.
Connect to Database Successful.
Output is in exfmt1.txt.
Executing Connect Reset -- Connect Reset was Successful.
<---这次的计划就得到了正确的输出.
Access Plan:
-----------
Total Cost: 169951
Query Degree: 1
Rows
RETURN
( 1)
Cost
I/O
|
1
GRPBY
( 2)
169951
1066.28
|
3.82521e+009
HSJOIN
( 3)
26911.5
1066.28
/-----+-----\
116762 32772
TBSCAN IXSCAN
( 4) ( 5)
469.138 602.804
436 522.789
| |
116762 32772
TABLE: TAOEWANG INDEX: TAOEWANG
T1 I2
上面的这个例子非常的简单,几乎不可能出现在正常的应用之中.但是假设刚才的那个join是某一个大查询的子查询,并且数据分布的skew更为夸张,有可能和正确输出相差几万倍(上面的那个例子不过20多倍的差距,因为HSJOIN的filter factor只有1/27,如果继续向两个表里面增添更多的distinct row,会造成filter factor的继续减小,estimate结果与真实输出的差异持续增加) ,那么也许正常情况下优化器绝对不可能选择的路径会被使用,造成性能上千倍的下降...
当然,在创建了统计视图后,每次做数据库维护的时候不要忘记对视图也更新统计信息,要不然统计视图只会起到反作用
[ 本帖最后由 wangzhonnew 于 2008-1-1 04:06 编辑 ] |
|