Here is one of my hash join examples:
sys@OCN>show sga
Total System Global Area 722540200 bytes
Fixed Size 452264 bytes
Variable Size 301989888 bytes
Database Buffers 419430400 bytes
Redo Buffers 667648 bytes
sys@OCN>select bytes/1024/1024 from user_segments where segment_name = 'MEMBER';
BYTES/1024/1024
---------------
2080
Elapsed: 00:00:00.03
sys@OCN>show parameter pga
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target big integer 1073741824
sys@OCN>show parameter work
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
workarea_size_policy string AUTO
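With workarea_size_policy = AUTO, each hash join's work area is sized automatically out of pga_aggregate_target. While a long-running hash join is executing, a second session can watch its actual memory use and any spill to temp; this is a sketch using Oracle's V$SQL_WORKAREA_ACTIVE view (available from 9i onward):

```sql
-- Sizes are in bytes; TEMPSEG_SIZE becomes non-NULL once the
-- work area has spilled to a temporary segment (one-pass or
-- multi-pass execution instead of fully in-memory).
SELECT operation_type,
       actual_mem_used,
       max_mem_used,
       number_passes,
       tempseg_size
FROM   v$sql_workarea_active
WHERE  operation_type = 'HASH-JOIN';
```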
SQL> set autotrace trace exp
SQL>
SQL> select /*+ use_hash(a b)*/ a.login_id,count(*) dup_num,
sum(least(a.gmt_modified,b.gmt_create)-b.gmt_create) duptime_sum
from member a,member b
where a.login_id=b.login_id
and a.pen_name!=b.first_name
group by a.login_id
having count(*)!=0;
2 3 4 5 6 7
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=57258 Card=24810 Bytes=2605050)
1 0 FILTER
2 1 SORT (GROUP BY) (Cost=57258 Card=24810 Bytes=2605050)
3 2 HASH JOIN (Cost=52856 Card=496200 Bytes=52101000)
4 3 TABLE ACCESS (FULL) OF 'MEMBER' (Cost=22141 Card=4374600 Bytes=87492000)
5 3 TABLE ACCESS (FULL) OF 'MEMBER' (Cost=22141 Card=4374600 Bytes=371841000)
sys@OCN>select /*+ use_hash(a b)*/ a.login_id,count(*) dup_num,
sum(least(a.gmt_modified,b.gmt_create)-b.gmt_create) duptime_sum
from member a,member b
where a.login_id=b.login_id
group by a.login_id
having count(*)!=0;
2 3 4 5 6
5078487 rows selected.
Elapsed: 00:03:13.09
Statistics
----------------------------------------------------------
0 recursive calls
23 db block gets
964505 consistent gets
575690 physical reads
0 redo size
88534739 bytes sent via SQL*Net to client
1117758 bytes received via SQL*Net from client
101571 SQL*Net roundtrips to/from client
0 sorts (memory)
1 sorts (disk)
5078487 rows processed
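The "1 sorts (disk)" line above means one work area (here the SORT GROUP BY or the hash area) spilled to disk. A rough system-wide view of how many work areas ran fully in memory versus spilling is available from the workarea statistics in V$SYSSTAT (9i and later); this is just a sketch:

```sql
-- 'optimal'   = work area completed entirely in memory
-- 'onepass' / 'multipass' = work area spilled to temp
SELECT name, value
FROM   v$sysstat
WHERE  name LIKE 'workarea executions%';
```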
The table is actually 2 GB in size; the only difference from yours is that it has fewer than 6 million rows, simply because the rows are wider. The hash join completed in a little over 3 minutes. In my experience, even if this table grew to 10 million rows, it would not take more than 10 minutes.
This is a test on my own system, just for your reference.
My environment: Linux Red Hat AS 2.1 + Oracle 9.2.0.4, ext3 file system, EMC CX200 RAID 10.