A performance improvement of more than 100x

richto · posted on 2002-5-24 10:55

To simplify the discussion, let us focus on Nested Loop and Sort Merge first.  Hash Join is quite difficult to discuss here.

Select * from A, B
Where A1=B1

Suppose A1 and B1 are B-Tree indexed on tables A and B.  Let us consider an isolated case: you have unlimited memory and everything can be performed in memory. Now we count the number of operations for Nested Loop.
We assume that A has 1,000 records and B has 200,000 records.  Let us calculate the number of operations for the driving path A->B, which means that we open table A and, for each record from A, fetch the matching records from table B through its index.
Operations= 1000*LN(200,000)/2 ~ 6103

Here LN(200,000) is taken as the depth of the B-tree index, and LN(200,000)/2 as the average number of operations needed to reach the matching node in the tree.
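To make the A->B driving path concrete, here is a minimal Python sketch. It assumes a sorted list of keys can stand in for the B-tree index on B1, and it uses binary search from the standard `bisect` module for the probe. Binary search costs about log2(n) steps rather than LN(n), which differs from the estimate above only by a constant factor. The function name and sample data are made up for illustration.

```python
import bisect

def nested_loop_join(driving_keys, inner_keys_sorted):
    """Join on equal keys; inner_keys_sorted stands in for a B-tree index."""
    matches = []
    for key in driving_keys:  # one pass over the driving table
        # Index probe: logarithmic in the size of the inner table.
        pos = bisect.bisect_left(inner_keys_sorted, key)
        # Collect every inner entry whose key equals the driving key.
        while pos < len(inner_keys_sorted) and inner_keys_sorted[pos] == key:
            matches.append((key, inner_keys_sorted[pos]))
            pos += 1
    return matches

a = [3, 7, 10]                  # small driving table (A1 values)
b = sorted([1, 3, 3, 7, 8, 9])  # "indexed" inner table (B1 values)
result = nested_loop_join(a, b)
print(result)
```

The total work is roughly (rows in the driving table) x (cost of one probe), which is the shape of the Operations formula above.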

Let’s calculate the driving path B->A:
Operations= 200,000*LN(1,000)/2 ~ 690775

So B->A is about 113 times slower than A->B.  Normally, then, a Nested Loop problem is caused by the wrong driving path.
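The two estimates are easy to re-check. The short Python sketch below applies the same formula, driving rows x LN(inner rows) / 2, to both driving paths. The helper name `nested_loop_ops` is only illustrative.

```python
import math

def nested_loop_ops(driving_rows, inner_rows):
    # Estimate used in this post: one index probe per driving row,
    # each costing about LN(inner_rows) / 2 operations.
    return driving_rows * math.log(inner_rows) / 2

a_rows, b_rows = 1_000, 200_000
ops_a_to_b = nested_loop_ops(a_rows, b_rows)  # A drives, B is probed
ops_b_to_a = nested_loop_ops(b_rows, a_rows)  # B drives, A is probed
ratio = ops_b_to_a / ops_a_to_b

print(int(ops_a_to_b), int(ops_b_to_a), ratio)
```

The ratio simplifies to (200,000 x LN(1,000)) / (1,000 x LN(200,000)), so it is dominated by the 200:1 difference in row counts, not by the logarithms.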
Consider the following SQL and you will see why Oracle and other databases cannot always make the right decision.
Select * from A, B
Where A1=B1 and A1<:var

Actually Oracle will improve your SQL by adding a new condition.
Select * from A, B
Where A1=B1 and A1<:var and B1<:var

So both A->B and B->A will be considered. :var is an unknown factor, and it is hard to predict how many records will be filtered by A1<:var or B1<:var, so the decision is not always correct.  Sometimes switching to Hash Join brings a big improvement, but that may not come from the Hash Join operation itself; it may only be that the switch corrected your wrong driving path.  As this is getting long, I will leave it here and discuss Sort Merge next time.
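To illustrate why the unknown :var matters, the following Python sketch reuses the same cost estimate and asks which driving path is cheaper for a given number of rows surviving the filter on each side. The surviving row counts (900 and 50) are hypothetical values chosen for illustration, and the model is my simplification: the driving table is read after its filter, and the inner table is probed through its full index.

```python
import math

def nested_loop_ops(driving_rows, inner_rows):
    # Same estimate as above: driving_rows * LN(inner_rows) / 2.
    return driving_rows * math.log(inner_rows) / 2

def cheaper_path(a_rows_kept, b_rows_kept, a_total=1_000, b_total=200_000):
    # Rows kept by the filter drive the loop; the inner index covers the whole table.
    a_drives = nested_loop_ops(a_rows_kept, b_total)
    b_drives = nested_loop_ops(b_rows_kept, a_total)
    return "A->B" if a_drives <= b_drives else "B->A"

no_filter = cheaper_path(1_000, 200_000)  # :var filters nothing out
selective = cheaper_path(900, 50)         # :var happens to keep very few rows of B
print(no_filter, selective)
```

The best path flips with the bind value, and the optimizer has to commit to a plan without knowing which of these situations it will face.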

Forgive me if I made any mistakes here; this was not proofread before posting!

richto · posted on 2002-5-27 16:00

Let me finish the Sort Merge Join. I use the same example as Nested Loop.
Select * from A, B
Where A1=B1

If Sort Merge is selected by the Oracle optimizer, Oracle will sort tables A and B first. Assume that the data in tables A and B is in random order; then the number of operations for sorting will be:
Table A ~ N*LN(N) where N is the number of records in table A
Table B ~ M*LN(M) where M is the number of records in table B

Total number of operations for A and B = 1,000*LN(1,000) + 200,000*LN(200,000) = 6907 + 2441214 = 2448121
(To simplify the discussion, let me ignore the Merge operations)
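The sort estimate can be re-checked the same way. This Python sketch evaluates N*LN(N) for each table with the record counts from the example and, like the text, ignores the merge step. The helper name `sort_ops` is illustrative.

```python
import math

def sort_ops(rows):
    # Estimate for sorting randomly ordered data: N * LN(N).
    return rows * math.log(rows)

sort_a = sort_ops(1_000)    # table A
sort_b = sort_ops(200_000)  # table B
total = sort_a + sort_b

print(int(sort_a), int(sort_b), total)
```

Almost all of the cost comes from sorting the larger table, so it makes little difference which table is sorted first.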

You can see that the initial overhead of Sort Merge is larger than that of Nested Loop, but the growth rate is almost the same, ~ N*LN(N). The point is that sometimes your records were inserted in a natural order, which requires fewer than N*LN(N) move operations to sort. So Sort Merge may be faster for tables whose records are already sorted (or partially sorted). Memory is an additional overhead for sorting, so environments with many users may need extra memory allocated. Nested Loop is relatively stable, has less overhead, and needs fewer resources (if the driving path is correct), but Sort Merge carries less risk of taking the wrong driving path, since sorting A or B first does not affect the plan significantly. So there is no single winner, and sometimes you have to teach Oracle to pick the right plan with hints or an SQL rewrite. Hash Join is even more complicated. Let’s discuss it later!
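For reference, here is a minimal Python sketch of a sort-merge join on plain key lists. It is not Oracle's implementation, only the textbook technique: sort both inputs, then walk them in step. Python's built-in `sorted` happens to be fast on input that is already ordered, which parallels the point about naturally ordered records, and swapping the two inputs does not change the result, which parallels the point about driving-path risk.

```python
def sort_merge_join(a_keys, b_keys):
    """Equi-join two key lists by sorting both and merging them in step."""
    a_sorted = sorted(a_keys)  # cheap when the input is already in order
    b_sorted = sorted(b_keys)
    i = j = 0
    out = []
    while i < len(a_sorted) and j < len(b_sorted):
        if a_sorted[i] < b_sorted[j]:
            i += 1
        elif a_sorted[i] > b_sorted[j]:
            j += 1
        else:
            key = a_sorted[i]
            # Find the end of the run of equal keys on each side.
            i_end = i
            while i_end < len(a_sorted) and a_sorted[i_end] == key:
                i_end += 1
            j_end = j
            while j_end < len(b_sorted) and b_sorted[j_end] == key:
                j_end += 1
            # Every row in one run matches every row in the other run.
            out.extend((key, key) for _ in range((i_end - i) * (j_end - j)))
            i, j = i_end, j_end
    return out

a = [7, 3, 10]
b = [9, 3, 1, 7, 3, 8]
joined = sort_merge_join(a, b)
print(joined)
```

After the two sorts, the merge itself is a single pass over both lists, which is why the text treats the sorts as the dominant cost.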

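mikesj · posted on 2002-8-29 22:50

Actually, I think that for two large tables, A with X records and B with Y records:
LOOP costs X*Y
MERGE probably uses one of the standard data-structure algorithms, say a sort with LOG N cost, to sort each of the two result sets and then merges them
Exactly which algorithm it is, I do not know
The reasoning is a bit like the relationship between multiplication and addition:
1+2>1*2
5+6<5*6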
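A rough Python sketch of this multiplication-versus-addition intuition follows. The cost model is an assumption of mine, not something stated in the thread: a loop join without any index compares every pair (X*Y), and a merge join sorts each side (N*LN(N)) and then makes one pass over both (X+Y).

```python
import math

def unindexed_loop_ops(x, y):
    # Every record of A is compared with every record of B.
    return x * y

def sort_merge_ops(x, y):
    # Assumed model: sort each side, then one merge pass over both.
    return x * math.log(x) + y * math.log(y) + (x + y)

tiny_loop, tiny_merge = unindexed_loop_ops(1, 2), sort_merge_ops(1, 2)
big_loop, big_merge = unindexed_loop_ops(1_000, 200_000), sort_merge_ops(1_000, 200_000)
print(tiny_loop, tiny_merge)
print(big_loop, big_merge)
```

Under this model the loop wins only for tiny inputs, and for the 1,000 and 200,000 record tables from the earlier posts the merge is cheaper by roughly two orders of magnitude. With an index on the inner table the loop cost is no longer X*Y, which is the case the earlier posts analyze.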

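aleckqian · posted on 2003-2-12 13:37

One small point: in Oracle the rightmost table is always the driving table. So for Select * from A, B
where A1=B1 you also need to look at whether table A or table B returns more rows. If table B returns fewer rows, B should be the driving table and the SQL is written as above; if table A returns fewer rows, it should be written as Select * from B, A Where A1=B1.
Then apply your calculation to that, rather than calculating only from the number of rows stored in the tables.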

shining_leon · posted on 2003-7-5 18:13

biti_rainy · posted on 2003-7-5 18:24

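Originally posted by aleckqian
[B]One small point: in Oracle the rightmost table is always the driving table. So for Select * from A, B
where A1=B1 you also need to look at whether table A or table B returns more rows. If table B returns fewer rows, B should be the driving table and the SQL is written as above; if table A returns fewer rows, it should be written as Select * from B, A Where A1=B1.
Then apply your calculation to that, rather than calculating only from the number of rows stored in the tables. [/B]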

With the rule-based optimizer, the order matters.

With the cost-based optimizer, the order does not matter.

sztangxy · posted on 2003-7-7 11:51

It feels like the experts here still have not settled this. Could the discussion continue?

mophe · posted on 2003-7-8 14:09

A question for krlion: how are use_nl and use_hash used?

Could you post the actual SQL statements in which you used NESTED LOOP or HASH?

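stephen · posted on 2004-9-27 23:57

For the same SQL statement, Oracle changed the execution plan after some time.
I had no fix at hand and it was extremely slow, so for now I used /*+ use_merge(A B) */
to force the execution plan. It is not the best plan, but it returns results within tens of seconds;
otherwise this SQL statement does not return even after 1.5 hours.

It is a multi-table join: one table is 500M, one is 50M, and the other few are 0.5 to 7M,
which is very small. Without the hint it used to run fine; I do not know why the execution plan changed.
The only difference is that the tables were analyzed; nothing else was done.

Creating more indexes does not help now, the current indexes are sufficient, and I still need to analyze further.
There are some principles I have not fully understood.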

gozheng · posted on 2004-9-28 08:55

Once you analyzed the tables, the optimizer mode for this SQL statement switched to CBO!
