这个sql怎么优化!

askgyliu · 发表于 2008-5-6 12:53

1) CUST_ID 应当是UNIQUE，不然的话出来的结果基本可以肯定是错误的。在CUST_ID上应该有UNIQUE INDEX或是PRIMARY KEY。

2) MSJOIN 应该是对的。除非CCP.CUST_BILLSUM_200802 a中符合“USEE8_TIME is null”的很少。也就是说，正常情况下TBSCAN在这个QUERY是对的。

3) OPTIMIZATION LEVEL越低考虑的因素就越少吧？

4) SORT时应该是这个的RESULTED ROW在做SORTING，SORTED BY CUST_ID，BUT THE WHOLE ROW WILL BE PLACED IN THE SORT HEAP。LZ的QUERY里有几十个COLUMNS。

5) 回到DESIGN问题。若是商业要求做WEEKLY REPORT，那怎么办？ROLLING REPORT呢？把几个TABLES UNION ALL成一个VIEW？觉得很成问题。

myfriend2010 · 发表于 2008-5-6 13:32

1)CUST_ID是unique的!---------见13楼的UNIQUE INDEX!
2)MSJOIN 的话就必须有一个排序的过程,
因为当前数据库单分区的SORTHEAP=1555,而这个值是数据库配置顾问程序给的---------见20和17楼
3)OPTIMIZATION LEVEL降低可以使走索引,但是看样子索引全扫描的cust和原来差不多!
4)sort应该是针对where的连接列进行的--------这个我猜的

最后,这个sql怎么优化? 3个500万数量级的表的计算!

myfriend2010 · 发表于 2008-5-6 13:33

askgyliu 能加入进来,这个帖子就太有意义了!

askgyliu · 发表于 2008-5-6 17:51

觉得关键是符合“USEE8_TIME is null”的有多少？

比如说，每个TABLE都是五百万，但若是符合条件的只有几万/几十万的话，我倒是觉得改一下SQL可能会有帮助：

with cust_id_tmp as
(select cust_id_tmp from CCP.CUST_BILLSUM_200802
where USEE8_TIME is null)
select ....
from CCP.CUST_BILLSUM_200802 a,
(select * from CCP.CUST_BILLSUM_200801 where cust_id in (select cust_id from cust_id_tmp)) b,
(select * from CCP.CUST_BILLSUM_200712 where cust_id in (select cust_id from cust_id_tmp)) c
where a.cust_id = b.cust_id and a.cust_id =c.cust_id and a.USEE8_TIME is null

Indexing on CUST_BILLSUM_200802 (USEE8_TIME, CUST_ID)
Indexing on CUST_BILLSUM_200801 (CUST_ID)
Indexing on CUST_BILLSUM_200712 (CUST_ID)

runstats with distribution on CUST_BILLSUM_200802

LZ可以把这个的SQL PLAN倒出来看看？

如果符合条件的很多的话(>30%)，那就没什么必要再想了。TABSCAN，DB2已经给答案了。

askgyliu · 发表于 2008-5-6 17:53

BTW, DB2 would be able to use the index to determine whether a column is null if the column is indexed. Oracle will need to other consideration.

wangzhonnew · 发表于 2008-5-6 19:13

原帖由 myfriend2010 于 2008-5-6 12:58 发表
优化级别为0以后,确实走索引了! 不过看样子效果还是不佳

看执行计划....

did you test the runtime performance with optlvl=0? you said "看样子效果还是不佳", were you only talking about the estimation cost or the real run-time performance?

thx

wangzhonnew · 发表于 2008-5-6 19:22

原帖由 askgyliu 于 2008-5-6 13:53 发表
1) CUST_ID 应当是UNIQUE，不然的话出来的结果基本可以肯定是错误的。在CUST_ID上应该有UNIQUE INDEX或是PRIMARY KEY。

2) MSJOIN 应该是对的。除非CCP.CUST_BILLSUM_200802 a中符合“USEE8_TIME is null”的很少。也就是说，正常情况下TBSCAN在这个QUERY是对的。

3) OPTIMIZATION LEVEL越低考虑的因素就越少吧？

4) SORT时应该是这个的RESULTED ROW在做SORTING，SORTED BY CUST_ID，BUT THE WHOLE ROW WILL BE PLACED IN THE SORT HEAP。LZ的QUERY里有几十个COLUMNS。

5) 回到DESIGN问题。若是商业要求做WEEKLY REPORT，那怎么办？ROLLING REPORT呢？把几个TABLES UNION ALL成一个VIEW？觉得很成问题。

got questions, for your comment (2), why do you think MSJOIN is correct? i was thinking SORT may underestimated the cost because for a small SORTHEAP, 3 sorts in the plan may overflow to tempspace or even overflow to disk... (but didn't do manual calculation, just a brief idea)
i think you mean TBSCAN on CUST_BILLSUM_200 was correct, because most of the rows in the table are NOT NULL...

3) when optlvl<2 db2 uses greedy algorithm rather than dynamic programming, especially in opt=0, which will pickup index whenever possible, so that's a common way to test how's the runtime performance with using an index

4) yeah you are right, the whole row will be placed in sortheap, unless the plan uses FETCH on top, TBSCAN+SORT on outter side and table name on inner side...
ex:
9) SORT : (Sort)
Cumulative Total Cost: 2.61297e+006
Cumulative CPU Cost: 1.72482e+010
Cumulative I/O Cost: 474282
Cumulative Re-Total Cost: 0
Cumulative Re-CPU Cost: 0
Cumulative Re-I/O Cost: 159922
Cumulative First Row Cost: 2.61297e+006
Cumulative Comm Cost: 0
Cumulative First Comm Cost: 0
Estimated Bufferpool Buffers: 547612

Arguments:
---------
DUPLWARN: (Duplicates Warning flag)
FALSE
NUMROWS : (Estimated number of rows)
1166257
PARTCOLS: (Table partitioning columns)
1: Q3."CUST_ID"
ROWWIDTH: (Estimated width of rows)
780
SORTKEY : (Sort Key column)
1: Q3."CUST_ID"(A)
SORTTYPE: (Intra-Partition parallelism sort type)
PARTITIONED
SPILLED : (Pages spilled to bufferpool or disk)
233252
TEMPSIZE: (Temporary Table Page Size)
4096
UNIQUE : (Uniqueness required flag)
FALSE

in this sort, please note that ROWWIDTH=780 and NUMROWS=1166257, so from calculation 780*1166257/1024/1024~=867MB.... that means at least 867MB of sortheap will be required for this particular sort.... and please note there are 3 sorts in the original plan....

unixnewbie · 发表于 2008-5-6 20:50

如果table CUST_BILLSUM_200802没有update的distribution stats，则先用
select count(*) from CUST_BILLSUM_200802 where USEE8_TIME is null with ur
看看满足条件的有多少。

我估计要不USEE8_TIME上没有关于“null”值的frequent value stats，要不就是满足null的Row仍然有很多。由于参加joining的三个都是超过million rows的table，从第一个explain output就知道就算merge join已经产生spilling，所以hash join是不可能的了。那么nljoin可能吗？假设由于满足null的Row仍然有很多或者根本没有null的统计值（DB2仍假设有很多满足null的rows），所以就算其作为outer table，nested loop join中的index scan次数仍然很多。且index scan后还要根据RID从Table中再读出ROWS内容（如果cluster ratio不高的话，产生的“小”IO更多）。MSJOIN虽然溢出，不过有可能还在bufferpool中并未产生physical IO，且outer and inter只需要分别scan 一次，还可以利用sequential scan prefeching。

正如askgyliu所说，关键是看符合null值的有多少。如果很少且db2已经知道（通过distribution stats），那么nljoin（须有cust_id index配合）会是个更好的选择。

wangzhonnew · 发表于 2008-5-6 20:53

是这样滴，所以现在并不能肯定MSJOIN或者NLJOIN在运行时到底谁好谁坏，还需要LZ亲自运行一次看一看真正的运行所消耗的时间

myfriend2010 · 发表于 2008-5-7 08:54

==马上贴过来!!

这个sql怎么优化!

浏览过的版块