db2 join基估计的问题

mdkii · 发表于 2010-4-7 16:42

最近在优化SQL时遇到一个问题，
这里高手众多，希望能指点一二。
db2 优化器一般来讲在做基估计的时候都比较准确，但在多表等值连接，
而且参与连接的列数据分布不均的时候估值跟实际值相差很大，
这会导致最终的执行性能很差。

比如

select *
from tt1,tt2
where tt1.c2=tt2.c2
and tt1.c1 >= 800
and tt2.c1 >= 800
这是一个比较常见的sql
在列C2上均有索引，但C2上的数据不是均匀分布的。
runstats 语句是
db2 "runstats on table db2inst1.tt1 with distribution and detailed indexes all"
db2 "runstats on table db2inst1.tt2 with distribution and detailed indexes all"

实际的执行结果集是9801条
但exfmt的结果是：1794条
差了8倍，不知道除了runstats外有什么命令可以让db2更准确的估计结果集？

exfmt 详细

exfmt.txt (17.51 KB, 下载次数: 94) 结果见附件

Access Plan:
-----------
      Total Cost:                210.801
      Query Degree:             1

                     Rows
                     RETURN
                     ( 1)
                     Cost
                        I/O
                        |
                     1794.43
                     HSJOIN
                     ( 2)
                     210.801
                     52.9262
            /----------+---------\
         1340.23                1340.23
         FETCH                   FETCH
         ( 3)                   ( 7)
         105.243                105.243
         26.4631                26.4631
      /---+---\                /---+---\
   3001       3001       3001       3001
   RIDSCN TABLE: DB2INST1  RIDSCN TABLE: DB2INST1
   ( 4)       TT1       ( 8)       TT2
   79.8431                79.8431
   10                      10
   |                      |
   3001                   3001
   SORT                   SORT
   ( 5)                   ( 9)
   79.8425                79.8425
   10                      10
   |                      |
   3001                   3001
   IXSCAN                   IXSCAN
   ( 6)                   (  10)
   78.0361                78.0361
   10                      10
   |                      |
   3001                   3001
INDEX: DB2INST1          INDEX: DB2INST1
   IDX_TT1                IDX_TT2

Objects Used in Access Plan:
---------------------------

      Schema: DB2INST1
      Name:       IDX_TT1
      Type:       Index
                     Time of creation:                2010-04-07-16.10.53.766351
                     Last statistics update:       2010-04-07-16.11.17.224170
                     Number of columns:                1
                     Number of rows:                3001
                     Width of rows:                      -1
                     Number of buffer pool pages:       16
                     Distinct row values:                No
                     Tablespace name:                USER4KTBS
                     Tablespace overhead:                7.500000
                     Tablespace transfer rate:       0.060000
                     Source for statistics:                Single Node
                     Prefetch page count:                32
                     Container extent page count:       32
                     Index clustering statistic:       43.000000
                     Index leaf pages:                10
                     Index tree levels:                2
                     Index full key cardinality:       1001
                     Index first key cardinality:       1001
                     Index first 2 keys cardinality: -1
                     Index first 3 keys cardinality: -1
                     Index first 4 keys cardinality: -1
                     Index sequential pages:       0
                     Index page density:                0
                     Index avg sequential pages:       0
                     Index avg gap between sequences:0
                     Index avg random pages:       10
                     Fetch avg sequential pages:       -1
                     Fetch avg gap between sequences:-1
                     Fetch avg random pages:       -1
                     Index RID count:                3001
                     Index deleted RID count:       0
                     Index empty leaf pages:       0
                     Base Table Schema:                DB2INST1
                     Base Table Name:                TT1
                     Columns in index:
                              C2

      Schema: DB2INST1
      Name:       IDX_TT2
      Type:       Index
                     Time of creation:                2010-04-07-16.11.00.087799
                     Last statistics update:       2010-04-07-16.11.13.842280
                     Number of columns:                1
                     Number of rows:                3001
                     Width of rows:                      -1
                     Number of buffer pool pages:       16
                     Distinct row values:                No
                     Tablespace name:                USER4KTBS
                     Tablespace overhead:                7.500000
                     Tablespace transfer rate:       0.060000
                     Source for statistics:                Single Node
                     Prefetch page count:                32
                     Container extent page count:       32
                     Index clustering statistic:       43.000000
                     Index leaf pages:                10
                     Index tree levels:                2
                     Index full key cardinality:       1001
                     Index first key cardinality:       1001
                     Index first 2 keys cardinality: -1
                     Index first 3 keys cardinality: -1
                     Index first 4 keys cardinality: -1
                     Index sequential pages:       0
                     Index page density:                0
                     Index avg sequential pages:       0
                     Index avg gap between sequences:0
                     Index avg random pages:       10
                     Fetch avg sequential pages:       -1
                     Fetch avg gap between sequences:-1
                     Fetch avg random pages:       -1
                     Index RID count:                3001
                     Index deleted RID count:       0
                     Index empty leaf pages:       0
                     Base Table Schema:                DB2INST1
                     Base Table Name:                TT2
                     Columns in index:
                              C2

      Schema: DB2INST1
      Name:       TT1
      Type:       Table
                     Time of creation:                2010-04-07-13.11.15.839838
                     Last statistics update:       2010-04-07-16.11.17.224170
                     Number of columns:                2
                     Number of rows:                3001
                     Width of rows:                      17
                     Number of buffer pool pages:       16
                     Number of data partitions:       1
                     Distinct row values:                No
                     Tablespace name:                USER4KTBS
                     Tablespace overhead:                7.500000
                     Tablespace transfer rate:       0.060000
                     Source for statistics:                Single Node
                     Prefetch page count:                32
                     Container extent page count:       32
                     Table overflow record count:       0
                     Table Active Blocks:                -1
                     Average Row Compression Ratio:       0
                     Percentage Rows Compressed:       0
                     Average Compressed Row Size:       0

      Schema: DB2INST1
      Name:       TT2
      Type:       Table
                     Time of creation:                2010-04-07-13.11.23.752839
                     Last statistics update:       2010-04-07-16.11.13.842280
                     Number of columns:                2
                     Number of rows:                3001
                     Width of rows:                      17
                     Number of buffer pool pages:       16
                     Number of data partitions:       1
                     Distinct row values:                No
                     Tablespace name:                USER4KTBS
                     Tablespace overhead:                7.500000
                     Tablespace transfer rate:       0.060000
                     Source for statistics:                Single Node
                     Prefetch page count:                32
                     Container extent page count:       32
                     Table overflow record count:       0
                     Table Active Blocks:                -1
                     Average Row Compression Ratio:       0
                     Percentage Rows Compressed:       0
                     Average Compressed Row Size:       0

wangzhonnew · 发表于 2010-4-7 22:42

The filter factor for HSJOIN (2) is 0.000999:
2) HSJOIN: (Hash Join)
Cumulative Total Cost: 210.801
Cumulative CPU Cost: 2.74143e+07
Cumulative I/O Cost: 52.9262
Cumulative Re-Total Cost: 210.801
Cumulative Re-CPU Cost: 2.74143e+07
Cumulative Re-I/O Cost: 52.9262
Cumulative First Row Cost: 210.801
Estimated Bufferpool Buffers: 54.9262

Arguments:
---------
BITFLTR : (Hash Join Bit Filter used)
FALSE
EARLYOUT: (Early Out flag)
NONE
HASHCODE: (Hash Code Size)
24 BIT
TEMPSIZE: (Temporary Table Page Size)
4096

Predicates:
----------
4) Predicate used in Join
Comparison Operator: Equal (=)
Subquery Input Required: No
Filter Factor: 0.000999001

Predicate Text:
--------------
(Q2."C2" = Q1."C2")

Input Streams:
-------------
6) From Operator #3

Estimated number of rows: 1340.23
Number of columns: 2
Subquery predicate ID: Not Applicable

Column Names:
------------
+Q2."C1"+Q2."C2"

12) From Operator #7

Estimated number of rows: 1340.23
Number of columns: 2
Subquery predicate ID: Not Applicable

Column Names:
------------
+Q1."C1"+Q1."C2"

Output Streams:
--------------
13) To Operator #1

Estimated number of rows: 1794.43
Number of columns: 4
Subquery predicate ID: Not Applicable

Column Names:
------------
+Q3."C2"+Q3."C1"+Q3."C2"+Q3."C1"

That means the estimated row is calculated by 1340.23 * 1340.23 * 0.000999001
And how does DB2 get 0.000999? The formula is 1/MAX(CARD(COL1),CARD(COL2))
In this particular case, we have CARD(COL1) and CARD(COL2) both equal : 1001, which means FF=1/1001=0.000999000999000999000999000999001

Why we use this formula? We assume data is normally distributed. But in this case, obviously it's not...

One thing you can try is Column Group Statistics, so when you do runstats, try to collect stats for (c1,c2) instead of each individual column
http://www.ibm.com/developerwork ... 12kapoor/index.html
ex
RUNSTATS ON TABLE schema_name.SHOW_LISTINGS ON COLUMNS ((SHOW_ID, CHANNEL_ID, STATION_ID))

Another thing we can try is manually specify the filter factor for the join.
The expected filter factor should be 9801/1340.23/1340.23=0.0054564693381893028088296495972933
One thing you can consider is to manually specify the filter factor for the join
db2set db2_selectivity=YES
db2stop force
db2start
and make the query like
select *
from tt1,tt2
where tt1.c2=tt2.c2 selectivity 0.005456 and tt1.c1 >= 800 and tt2.c1 >= 800

please try them out and see whether helps...

cngd1 · 发表于 2010-4-8 09:41

学习了

mdkii · 发表于 2010-4-8 16:31

感谢狼兄的回答。
我试了一下你说的两个办法，
1、使用了runstats on table db2inst1.ttXX on columns((c1,c2)) with distribution and detailed indexes all命令
收集了一下Column Group 的统计信息，可惜最后执行计划与原来一样，没有变化，
我想可能是这里的predicate 一个是join predicate，一个是local predicate，列相关性的统计信息用不上。

2、加selectivity语句，
select *
from tt1,tt2
where tt1.c2=tt2.c2 selectivity 0.005456
and tt1.c1 >= 800
and tt2.c1 >= 800

不过db2 报了这样一个错：
select * from tt1,tt2 where tt1.c2=tt2.c2 selectivity 0.005456 and tt1.c1 >= 800 and tt2.c1 >= 800
SQL20046N SELECTIVITY clause following "where TT1.C2=TT2.C2 " can only be
specified for a valid user-defined predicate. SQLSTATE=428E5。
貌似只有在 user-defined predicate才能加 selecttivity。

另 db2set
DB2_NUM_CKPW_DAEMONS=0
DB2_SELECTIVITY=YES
DB2COMM=tcpip
DB2CODEPAGE=1386

请问还有其他办法吗？

wangzhonnew · 发表于 2010-4-8 19:36

try
db2set DB2_SELECTIVITY=ALL
db2stop force
db2start

mdkii · 发表于 2010-4-9 10:38

db2set DB2_SELECTIVITY=ALL确实可以，
这里先谢谢了，
不过，还有一个问题就是，这个selectivity是根据结果估计的。
如果是一个复杂的SQL，一个个通过实际执行去估计selectivity好像不太现实，
而且随着数据量的变化，这个selectivity应该也会不太相同，
总不至于每次都要去修改脚本。。。
不知道狼兄有没有更好的办法？
其实，一般来讲，就算估计不准，db2选择了次优的执行策略，
效率也不会有大问题，
但在以下两种情况下：db2效率会变的很差，这也是很多人误解db2跑得慢的原因。
1、db2会因为结果集很小而选择NLjoin。
2、db2会因为结果集很小选择在中间结果集里排序，
在这两种情况下，你会发现db2跑得奇慢，而且cpu和io利用情况还很低。
形成了所谓的懒惰系统。
不知道有没有什么办法，在db2估计不准的情况下
提高NLJOIN和SORT的执行成本？

wangzhonnew · 发表于 2010-4-9 11:07

in this particular case, i think selectivity is the best solution to give "hint" to optimizer about the join filter factor.
when you find a SQL doesn't use ideal plan, maybe you have to go through the same analysis as what we did above, and see whether we can specify selectivity or some other "tricks" to influence optimizer picking up a better plan...
yeah optimizer issues usually hard and tricky, there's no silver bullet can resolve all cases

(and that's why i'm still keeping my job) ^_^

diablo2 · 发表于 2010-4-9 13:36

LZ，我有一点不明，您的runstats里，两个表都是3001条，怎么你的SELECT结果是9801条呢？

mdkii · 发表于 2010-4-9 16:42

因为有重复记录啊。。。

wangzhonnew · 发表于 2010-4-9 20:11

another possibility i'm thinking maybe statistical view could be useful here, when it's defined as join for 2 tables...
http://www.ibm.com/developerwork ... 0612chen/index.html

ex
Create view SCHEMA.V1 as select * from T1, T2 where T1.C1 = T2.C1

Alter view SCHEMA.V1 enable query optimization

Runstats on table SCHEMA.V1 with distribution

db2 join基估计的问题

浏览过的版块