|
The filter factor for HSJOIN (2) is 0.000999:
2) HSJOIN: (Hash Join)
Cumulative Total Cost: 210.801
Cumulative CPU Cost: 2.74143e+07
Cumulative I/O Cost: 52.9262
Cumulative Re-Total Cost: 210.801
Cumulative Re-CPU Cost: 2.74143e+07
Cumulative Re-I/O Cost: 52.9262
Cumulative First Row Cost: 210.801
Estimated Bufferpool Buffers: 54.9262
Arguments:
---------
BITFLTR : (Hash Join Bit Filter used)
FALSE
EARLYOUT: (Early Out flag)
NONE
HASHCODE: (Hash Code Size)
24 BIT
TEMPSIZE: (Temporary Table Page Size)
4096
Predicates:
----------
4) Predicate used in Join
Comparison Operator: Equal (=)
Subquery Input Required: No
Filter Factor: 0.000999001
Predicate Text:
--------------
(Q2."C2" = Q1."C2")
Input Streams:
-------------
6) From Operator #3
Estimated number of rows: 1340.23
Number of columns: 2
Subquery predicate ID: Not Applicable
Column Names:
------------
+Q2."C1"+Q2."C2"
12) From Operator #7
Estimated number of rows: 1340.23
Number of columns: 2
Subquery predicate ID: Not Applicable
Column Names:
------------
+Q1."C1"+Q1."C2"
Output Streams:
--------------
13) To Operator #1
Estimated number of rows: 1794.43
Number of columns: 4
Subquery predicate ID: Not Applicable
Column Names:
------------
+Q3."C2"+Q3."C1"+Q3."C2"+Q3."C1"
That means the estimated row is calculated by 1340.23 * 1340.23 * 0.000999001
And how does DB2 get 0.000999? The formula is 1/MAX(CARD(COL1),CARD(COL2))
In this particular case, we have CARD(COL1) and CARD(COL2) both equal : 1001, which means FF=1/1001=0.000999000999000999000999000999001
Why we use this formula? We assume data is normally distributed. But in this case, obviously it's not...
One thing you can try is Column Group Statistics, so when you do runstats, try to collect stats for (c1,c2) instead of each individual column
http://www.ibm.com/developerwork ... 12kapoor/index.html
ex
RUNSTATS ON TABLE schema_name.SHOW_LISTINGS ON COLUMNS ((SHOW_ID, CHANNEL_ID, STATION_ID))
Another thing we can try is manually specify the filter factor for the join.
The expected filter factor should be 9801/1340.23/1340.23=0.0054564693381893028088296495972933
One thing you can consider is to manually specify the filter factor for the join
db2set db2_selectivity=YES
db2stop force
db2start
and make the query like
select *
from tt1,tt2
where tt1.c2=tt2.c2 selectivity 0.005456 and tt1.c1 >= 800 and tt2.c1 >= 800
please try them out and see whether helps... |
|