查看: 5195|回复: 3

[精华] statistical view的简单例子

[复制链接]
招聘 : c/c++研发
论坛徽章:
45
技术图书徽章
日期:2014-03-10 14:09:192012新春纪念徽章
日期:2012-02-13 15:12:092012新春纪念徽章
日期:2012-02-13 15:12:092012新春纪念徽章
日期:2012-01-04 11:51:22ITPUB十周年纪念徽章
日期:2011-11-01 16:21:15现任管理团队成员
日期:2011-05-07 01:45:082011新春纪念徽章
日期:2011-01-25 15:42:562011新春纪念徽章
日期:2011-01-25 15:42:332011新春纪念徽章
日期:2011-01-25 15:42:152011新春纪念徽章
日期:2011-01-25 15:41:50
跳转到指定楼层
1#
发表于 2008-1-1 02:57 | 只看该作者 回帖奖励 |倒序浏览 |阅读模式
在v8 fp9之后db2增加了statistical view的功能.通过对view里面的列收集统计信息,可以使优化器更准确地统计出多个表进行操作后的统计信息,使得优化器在某些比较特殊的数据分布情况下也能够有效地选择正确的访问计划.
在v8中statistical view还有很多限制,例如只支持最多2个表,而在v9中则得以完善,取消了全部限制.

在下面的连接中有比较详细的说明,为了直观起见,俺也做了一个step by step的test进行说明
http://www.ibm.com/developerwork ... 0612chen/index.html

D:\TEMP>db2level
DB21085I  Instance "DB2" uses "32" bits and DB2 code release "SQL09010" with
level identifier "02010107".
Informational tokens are "DB2 v9.1.0.356", "s060629", "NT32", and Fix Pack "0".
Product is installed at "D:\PROGRA~1\IBM\SQLLIB\" with DB2 Copy Name

D:\>db2 connect to sampel2

   Database Connection Information

Database server        = DB2/NT 9.1.0
SQL authorization ID   = TAOEWANG
Local database alias   = SAMPEL2

<---创建两个极为简单的表
D:\>db2 "create table t1 (c1 int)"
DB20000I  The SQL command completed successfully.

D:\>db2 "create table t2 (c1 int)"
DB20000I  The SQL command completed successfully.

<---增加点复杂度,创建索引
D:\>db2 "create index i1 on t1 (c1)"
DB20000I  The SQL command completed successfully.

D:\>db2 "create index i2 on t2 (c1)"
DB20000I  The SQL command completed successfully.

<---对第一个表插入数据,其中最小的0,最大的1000000,1是low2key,999999是high2key,数据集中分布在999999上,cardinality为27
D:\>db2 "insert into t1 values (0)"
DB20000I  The SQL command completed successfully.

D:\>db2 "insert into t1 values (1)"
DB20000I  The SQL command completed successfully.

D:\>db2 "insert into t1 values (2)"
DB20000I  The SQL command completed successfully.

D:\TEMP>db2 "insert into t1 values (6)"
DB20000I  The SQL command completed successfully.

D:\TEMP>db2 "insert into t1 values (7)"
DB20000I  The SQL command completed successfully.

D:\TEMP>db2 "insert into t1 values (8)"
DB20000I  The SQL command completed successfully.

D:\TEMP>db2 "insert into t1 values (9)"
DB20000I  The SQL command completed successfully.

D:\TEMP>db2 "insert into t1 values (10)"
DB20000I  The SQL command completed successfully.

D:\TEMP>db2 "insert into t1 values (11)"
DB20000I  The SQL command completed successfully.

D:\TEMP>db2 "insert into t1 values (12)"
DB20000I  The SQL command completed successfully.

D:\TEMP>db2 "insert into t1 values (13)"
DB20000I  The SQL command completed successfully.

D:\TEMP>db2 "insert into t1 values (14)"
DB20000I  The SQL command completed successfully.

D:\TEMP>db2 "insert into t1 values (15)"
DB20000I  The SQL command completed successfully.

D:\TEMP>db2 "insert into t1 values (16)"
DB20000I  The SQL command completed successfully.

D:\TEMP>db2 "insert into t1 values (17)"
DB20000I  The SQL command completed successfully.

D:\TEMP>db2 "insert into t1 values (18)"
DB20000I  The SQL command completed successfully.

D:\TEMP>db2 "insert into t1 values (19)"
DB20000I  The SQL command completed successfully.

D:\TEMP>db2 "insert into t1 values (20)"
DB20000I  The SQL command completed successfully.

D:\TEMP>db2 "insert into t1 values (21)"
DB20000I  The SQL command completed successfully.

D:\TEMP>db2 "insert into t1 values (22)"
DB20000I  The SQL command completed successfully.

D:\TEMP>db2 "insert into t1 values (23)"
DB20000I  The SQL command completed successfully.

D:\TEMP>db2 "insert into t1 values (24)"
DB20000I  The SQL command completed successfully.

D:\TEMP>db2 "insert into t1 values (25)"
DB20000I  The SQL command completed successfully.

D:\TEMP>db2 "insert into t1 values (26)"
DB20000I  The SQL command completed successfully.

D:\TEMP>db2 "insert into t1 values (27)"
DB20000I  The SQL command completed successfully.

D:\>db2 "insert into t1 values (999999)"
DB20000I  The SQL command completed successfully.

D:\>db2 "insert into t1 values (1000000)"
DB20000I  The SQL command completed successfully.

D:\>db2 "insert into t1 values (999999)"
DB20000I  The SQL command completed successfully.

D:\>db2 "insert into t1 values (999999)"
DB20000I  The SQL command completed successfully.

。。。。

<----总共116740行,其中999999占116736行
D:\>db2 "select count(*) from t1"

1
-----------
     116762

  1 record(s) selected.


D:\>db2 "select count(*) from t1 where c1=999999"

1
-----------
     116736

  1 record(s) selected.


<----另外一个表,最小999997,最大1000001,同样数据集中分布在999999上,,cardinality为5
D:\>db2 "insert into t2 values (999997)"
DB20000I  The SQL command completed successfully.

D:\>db2 "insert into t2 values (999998)"
DB20000I  The SQL command completed successfully.

D:\>db2 "insert into t2 values (1000000)"
DB20000I  The SQL command completed successfully.

D:\>db2 "insert into t2 values (1000001)"
DB20000I  The SQL command completed successfully.

D:\>db2 "insert into t2 values (999999)"
DB20000I  The SQL command completed successfully.

D:\>db2 "insert into t2 values (999999)"
DB20000I  The SQL command completed successfully.

......

D:\TEMP>db2 "select count(*) from t2"

1
-----------
      32772

  1 record(s) selected.


D:\TEMP>db2 "select count(*) from t2 where c1=999999"

1
-----------
      32768

  1 record(s) selected.

<----对两个表作runstats
D:\>db2 runstats on table taoewang.t1
DB20000I  The RUNSTATS command completed successfully.

D:\>db2 runstats on table taoewang.t2
DB20000I  The RUNSTATS command completed successfully.

D:\>db2 runstats on table taoewang.t1 and indexes all
DB20000I  The RUNSTATS command completed successfully.

D:\>db2 runstats on table taoewang.t2 and indexes all
DB20000I  The RUNSTATS command completed successfully.

<----做explain
D:\TEMP>db2 set current explain mode explain
DB20000I  The SQL command completed successfully.

<---查询
D:\TEMP>db2 "select count(*) from t1, t2 where t1.c1=t2.c1"
SQL0217W  The statement was not executed as only Explain information requests
are being processed.  SQLSTATE=01604

D:\TEMP>db2 set current explain mode no
DB20000I  The SQL command completed successfully.

D:\TEMP>db2exfmt -d sampel2 -g TIC -w -1 -n % -s % -# 0 -o exfmt.txt
DB2 Universal Database Version 9.1, 5622-044 (c) Copyright IBM Corp. 1991, 2006
Licensed Material - Program Property of IBM
IBM DATABASE 2 Explain Table Format Tool

Connecting to the Database.
Connect to Database Successful.
Output is in exfmt.txt.
Executing Connect Reset -- Connect Reset was Successful.

<----我们看到了什么?预期输出1.41723e+008可是实际呢?最少也不应该小于 116736x32768=3825205248=3.825205248+009的,也就是只有真实输出的3.7%左右
        Total Cost:                 7608.62
        Query Degree:                1

              Rows
             RETURN
             (   1)
              Cost
               I/O
               |
                1
             GRPBY
             (   2)
             7608.62
             998.926
               |
          1.41723e+008
             HSJOIN
             (   3)
             2309.03
             998.926
          /-----+-----\
     116762            32772
     TBSCAN           IXSCAN
     (   4)           (   5)
     469.144          602.804
       436            522.789
       |                |
     116762            32772
TABLE: TAOEWANG  INDEX: TAOEWANG
       T1               I2

<---现在创建statistical view,来统计join后的列信息
D:\TEMP>db2 "create view v1 as select t1.c1 as a, t2.c1 as b from t1, t2 where t1.c1=t2.c1"
DB20000I  The SQL command completed successfully.

D:\TEMP>db2 "alter view v1 enable query optimization"

D:\TEMP>db2 runstats on table taoewang.v1 with distribution
DB20000I  The RUNSTATS command completed successfully.


<---清空package cache
D:\TEMP>db2 terminate
DB20000I  The TERMINATE command completed successfully.

D:\TEMP>db2 connect to sampel2

   Database Connection Information

Database server        = DB2/NT 9.1.0
SQL authorization ID   = TAOEWANG
Local database alias   = SAMPEL2

<---从新收集计划
D:\TEMP>db2 set current explain mode explain
DB20000I  The SQL command completed successfully.

D:\TEMP>db2 "select count(*) from t1, t2 where t1.c1=t2.c1"
SQL0217W  The statement was not executed as only Explain information requests
are being processed.  SQLSTATE=01604

D:\TEMP>db2 set current explain mode no
DB20000I  The SQL command completed successfully.

D:\TEMP>db2exfmt -d sampel2 -g TIC -w -1 -n % -s % -# 0 -o exfmt1.txt
DB2 Universal Database Version 9.1, 5622-044 (c) Copyright IBM Corp. 1991, 2006
Licensed Material - Program Property of IBM
IBM DATABASE 2 Explain Table Format Tool

Connecting to the Database.
Connect to Database Successful.
Output is in exfmt1.txt.
Executing Connect Reset -- Connect Reset was Successful.

<---这次的计划就得到了正确的输出.
Access Plan:
-----------
        Total Cost:                 169951
        Query Degree:                1

              Rows
             RETURN
             (   1)
              Cost
               I/O
               |
                1
             GRPBY
             (   2)
             169951
             1066.28
               |
          3.82521e+009
             HSJOIN
             (   3)
             26911.5
             1066.28
          /-----+-----\
     116762            32772
     TBSCAN           IXSCAN
     (   4)           (   5)
     469.138          602.804
       436            522.789
       |                |
     116762            32772
TABLE: TAOEWANG  INDEX: TAOEWANG
       T1               I2

上面的这个例子非常的简单,几乎不可能出现在正常的应用之中.但是假设刚才的那个join是某一个大查询的子查询,并且数据分布的skew更为夸张,有可能和正确输出相差几万倍(上面的那个例子不过20多倍的差距,因为HSJOIN的filter factor只有1/27,如果继续向两个表里面增添更多的distinct row,会造成filter factor的继续减小,estimate结果与真实输出的差异持续增加) ,那么也许正常情况下优化器绝对不可能选择的路径会被使用,造成性能上千倍的下降...
当然,在创建了统计视图后,每次做数据库维护的时候不要忘记对视图也更新统计信息,要不然统计视图只会起到反作用

[ 本帖最后由 wangzhonnew 于 2008-1-1 04:06 编辑 ]
论坛徽章:
42
ITPUB元老
日期:2005-09-09 13:45:35马上有对象
日期:2014-02-19 11:55:14马上有钱
日期:2014-02-19 11:55:14马上有房
日期:2014-02-19 11:55:14马上有车
日期:2014-02-19 11:55:14优秀写手
日期:2013-12-18 09:29:09ITPUB 11周年纪念徽章
日期:2012-10-09 18:03:32版主3段
日期:2012-05-15 15:24:112012新春纪念徽章
日期:2012-02-13 15:13:362012新春纪念徽章
日期:2012-02-13 15:13:36
2#
发表于 2008-1-1 11:28 | 只看该作者
有点象物化查询表的功能。
加精、置顶!

使用道具 举报

回复
论坛徽章:
71
马上加薪
日期:2014-02-19 11:55:14ITPUB十周年纪念徽章
日期:2011-11-01 16:19:412010广州亚运会纪念徽章:橄榄球
日期:2011-05-22 10:54:33管理团队成员
日期:2011-05-07 01:45:082011新春纪念徽章
日期:2011-01-25 15:42:562011新春纪念徽章
日期:2011-01-25 15:42:332011新春纪念徽章
日期:2011-01-25 15:42:152011新春纪念徽章
日期:2011-01-25 15:41:502011新春纪念徽章
日期:2011-01-25 15:41:012010年世界杯参赛球队:丹麦
日期:2010-04-06 10:23:36
3#
发表于 2008-1-14 08:09 | 只看该作者
对物化视图做enable query optimization

使用道具 举报

回复
论坛徽章:
5
ITPUB新首页上线纪念徽章
日期:2007-10-20 08:38:44生肖徽章2007版:鸡
日期:2008-01-02 17:35:53生肖徽章2007版:鼠
日期:2008-01-02 17:35:532009新春纪念徽章
日期:2009-01-04 14:52:282011新春纪念徽章
日期:2011-02-18 11:43:33
4#
发表于 2008-2-22 12:50 | 只看该作者
学习了:)

使用道具 举报

回复

您需要登录后才可以回帖 登录 | 注册

本版积分规则 发表回复

TOP技术积分榜 社区积分榜 徽章 团队 统计 知识索引树 积分竞拍 文本模式 帮助
  ITPUB首页 | ITPUB论坛 | 数据库技术 | 企业信息化 | 开发技术 | 微软技术 | 软件工程与项目管理 | IBM技术园地 | 行业纵向讨论 | IT招聘 | IT文档
  ChinaUnix | ChinaUnix博客 | ChinaUnix论坛
CopyRight 1999-2011 itpub.net All Right Reserved. 北京盛拓优讯信息技术有限公司版权所有 联系我们 未成年人举报专区 
京ICP备16024965号-8  北京市公安局海淀分局网监中心备案编号:11010802021510 广播电视节目制作经营许可证:编号(京)字第1149号
  
快速回复 返回顶部 返回列表