奇怪的Global索引，各位版主分析

棉花糖ONE · 发表于 2009-3-25 09:57

改下统计信息把clustoring_factor改小

棉花糖ONE · 发表于 2009-3-25 09:59

alter session set optimizer_dynamic_sampling=6;
select *
from table1
where field1 = 'value1'
and field2 = 'value2'
and from_date <= to_date('20090401', 'yyyymmdd')
and to_date >= to_date('20090401', 'yyyymmdd');

看看rows的评估能对吗

必有我师 · 发表于 2009-3-26 09:43

原帖由 Yong Huang 于 2009-3-25 09:29 发表

Please post the 10053 trace to Itpub. There're a couple of SQL experts on this forum. They can help you.

How many partitions does the table have? 2?

Based on your trace, the index range scan path takes a higher cost (38484) than full table scan (5155.86) so a FTS is chosen. The cost 38484 is

index blevel + ceil(leaf blocks * index selectivity) + ceil(clustering factor * table selectivity)
= 3 + ceil(110049*0.012359) + ceil(102253*0.0036302) = 3 + 1361 + 37120

The table cost 5155.86 is something I can't fully understand. Supposedly it's table blocks divided by a modified db_file_multiblock_read_count (mbrc). Since you don't have workload system stats, according to
http://www.freelists.org/post/or ... ad-count-autotuning,1
we need to look at _db_file_optimizer_read_count, which is 16 in your trace. But if you get 5155.86 with 23374 table blocks in the partition, that means the modified mbrc is about 4.53.

I suggest you collect workload system stats. If you explicitly set mbrc in pfile or spfile, unset it and let Oracle decide.

Yong Huang

Many thanks for you reply!
The cost for index range scan should be:
index blevel + ceil(leaf blocks * index selectivity) + ceil(clustering factor * table selectivity)
= 3 + ceil(110049*0.012359) + ceil(10225329*0.0036302) = 3 + 1361 + 37120

But in this formula the clustering factor is the global index CF, but the "table selectivity" is the partition table selectivity, is it resonable?

[ 本帖最后由必有我师于 2009-3-26 09:50 编辑 ]

必有我师 · 发表于 2009-3-26 09:58

Local分区索引的信息，没有再生成使用local分区索引的trace，但从执行计划可以推断出，优化器使用的是local partition index的单个分区的CF，而不是整个索引的CF。也许这是造成问题的主要原因。
BTW：由于测试环境定期从生产库同步数据，这里的数据与之前的数据会略有差别。

LEAF_BLOCKS    BLEVEL CLUSTERING_FACTOR
----------- ---------- -----------------
   109600       2          9847591

INDEX_NAME PARTITION_NAME    BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
------------- ---------------- -------- ----------- -----------------
table1_indx PART001                2       2760          272251
table1_indx PART002                2       3904          373923
table1_indx PART003                2       3498          314809
table1_indx PART004                2       4661          359226
table1_indx PART005                2       5988          543600
table1_indx PART006                2       5325          480396
table1_indx PART007                2       4119          372753
table1_indx PART008                2       3895          407035
table1_indx PART009                2       6943          586473
table1_indx PART010                2       3218          271085
table1_indx PART011                2       5918          535255
table1_indx PART012                2       3727          347536
table1_indx PART013                2       4348          380869
table1_indx PART014                2       2818          313590
table1_indx PART015                2       5753          503883
table1_indx PART016                2       2679          290398
table1_indx PART017                2       4614          429929
table1_indx PART018                2       3458          347381
table1_indx PART019                2       2860          260179
table1_indx PART020                2       6624          406112
table1_indx PART021                2       4161          408103
table1_indx PART022                2       6949          592897
table1_indx PART023                2       2625          239860
table1_indx PART024                2       6341          567797
table1_indx PART025                2       2414          242251

[ 本帖最后由必有我师于 2009-3-27 11:32 编辑 ]

必有我师 · 发表于 2009-3-26 10:00

原帖由 棉花糖ONE 于 2009-3-25 09:59 发表
alter session set optimizer_dynamic_sampling=6;
select *
from table1
where field1 = 'value1'
and field2 = 'value2'
and from_date = to_date('20090401', 'yyyymmdd');

看看rows的评估能对吗

棉花糖，动态采样不太了解，如果已经有统计信息还会做动态采样吗？你这里是说执行语句还是生成执行计划？

Yong Huang · 发表于 2009-3-26 11:40

> But in this formula the clustering factor is the global index CF, but the
> "table selectivity" is the partition table selectivity, is it resonable?

That's because there's partition elimination (PE). Even though the plan says full table scan, because of PE, Oracle actually only scan that single partition.

> Local分区索引的信息，没有再生成使用local分区索引的trace，

Did you hard parse the SQL that uses the local partitioned index? 10053 trace is not created if you don't hard parse, or don't run explain plan for ...

> 但从这里可以推断出，优化器使用的是partition index的CF，而不是整个索引的CF。也许这是造成问题的主要原因。

You mean CBO uses the CF's of the individual index partitions (Don't say partition index!). Why do you say so?

> 如果已经有统计信息还会做动态采样吗？

If dynamic sampling (DS) level is higher than or equal to 4, DS *will* happen as long as >= 2 columns are referenced in the where clause, even if the table/index already has statistics.

Yong Huang

必有我师 · 发表于 2009-3-26 11:45

原帖由 棉花糖ONE 于 2009-3-25 09:59 发表
alter session set optimizer_dynamic_sampling=6;
select *
from table1
where field1 = 'value1'
and field2 = 'value2'
and from_date = to_date('20090401', 'yyyymmdd');

看看rows的评估能对吗

SQL> select *  from table1 where field1 = 'value1' and field2 = 'value2' and from_date <= to_date('20090401', 'yyyymmdd') and to_date >= to_date('20090401', 'yyyymmdd');

12 rows selected.

Elapsed: 00:00:00.47

Execution Plan
----------------------------------------------------------
Plan hash value: 806976405

--------------------------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time    | Pstart| Pstop |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT    |       |  4774 | 652K| 228  (10)| 00:00:03 |    |    |
| 1 |  PARTITION RANGE SINGLE|       |  4774 | 652K| 228  (10)| 00:00:03 |    3 |    3 |
|*  2 | TABLE ACCESS FULL | table1  |  4774 | 652K| 228  (10)| 00:00:03 |    3 |    3 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("field2"='value2' AND "field1"='value1' AND "TO_DATE">=TO_DATE('2009-04-01
            00:00:00', 'yyyy-mm-dd hh24:mi:ss') AND "FROM_DATE"<=TO_DATE('2009-04-01 00:00:00',
            'yyyy-mm-dd hh24:mi:ss'))

Note
-----
- dynamic sampling used for this statement

Statistics
----------------------------------------------------------
      0  recursive calls
      0  db block gets
   23380  consistent gets
      0  physical reads
      0  redo size
   5539  bytes sent via SQL*Net to client
      492  bytes received via SQL*Net from client
      2  SQL*Net roundtrips to/from client
      0  sorts (memory)
      0  sorts (disk)
      12  rows processed

必有我师 · 发表于 2009-3-26 11:55

原帖由 Yong Huang 于 2009-3-26 11:40 发表
> But in this formula the clustering factor is the global index CF, but the
> "table selectivity" is the partition table selectivity, is it resonable?

That's because there's partition elimination (PE). Even though the plan says full table scan, because of PE, Oracle actually only scan that single partition.

我的意思是CF用global索引(相对于分区而言一定是全局的)的统计值，而表选择率用单个分区的统计值，这样不是不匹配？是不是需要换算一下?呵呵

> Local分区索引的信息，没有再生成使用local分区索引的trace，

Did you hard parse the SQL that uses the local partitioned index? 10053 trace is not created if you don't hard parse, or don't run explain plan for ...
我是指我没有去生成local分区索引的10053 trace，而不是没有生成出来trace。

> 但从这里可以推断出，优化器使用的是partition index的CF，而不是整个索引的CF。也许这是造成问题的主要原因。

You mean CBO uses the CF's of the individual index partitions (Don't say partition index!). Why do you say so?
我认为优化器应该统一标准，就像上面提到的，不应该用全局索引的CF去和分区表的单个分区的选择率去计算Cost，而应该用全局索引的CF和分区表整体的选择率来计算cost。不知道我的理解是否正确？

> 如果已经有统计信息还会做动态采样吗？

If dynamic sampling (DS) level is higher than or equal to 4, DS *will* happen as long as >= 2 columns are referenced in the where clause, even if the table/index already has statistics.
谢谢，DS的autotrace结果已经生成，看前一个回复。
Yong Huang

[ 本帖最后由必有我师于 2009-3-26 12:14 编辑 ]

Yong Huang · 发表于 2009-3-30 05:39

> 我的意思是CF用global索引(相对于分区而言一定是全局的)的统计值，而表选择率用单个分区的统计
> 值，这样不是不匹配？是不是需要换算一下?呵呵
> 我认为优化器应该统一标准，就像上面提到的，不应该用全局索引的CF去和分区表的单个分区的选择
> 率去计算Cost，而应该用全局索引的CF和分区表整体的选择率来计算cost。不知道我的理解是否正确？

If you think about the plan, it makes perfect sense. The CBO cost is the number of I/O blocks Oracle has to read (slightly modified by a factor involving CPU time). In table access preceeded by an index range scan, Oracle has to go through the branch block(s) (including root block) of the index to get to the leaf blocks, scan all the leaf blocks needed (i.e. those picked based on the index selectivity), and read all table blocks pointed to by the rowid's stored in the index leaf blocks (i.e. table blocks times table selectivity). You only have a global index. There's no ambiguity there. Now, you may ask why not full table, the entire table, not just a partition? Think about it. The where clause tells Oracle which single partition it needs to read. Why would Oracle still need to read the whole table? The table selectivity is also about the single partition, because even this selectivity calculation does not need to be based on the whole table (the partition is already clearly identified early on).

Yong Huang

必有我师 · 发表于 2009-3-30 10:45

原帖由 Yong Huang 于 2009-3-30 05:39 发表

If you think about the plan, it makes perfect sense. The CBO cost is the number of I/O blocks Oracle has to read (slightly modified by a factor involving CPU time). In table access preceeded by an index range scan, Oracle has to go through the branch block(s) (including root block) of the index to get to the leaf blocks, scan all the leaf blocks needed (i.e. those picked based on the index selectivity), and read all table blocks pointed to by the rowid's stored in the index leaf blocks (i.e. table blocks times table selectivity). You only have a global index. There's no ambiguity there. Now, you may ask why not full table, the entire table, not just a partition? Think about it. The where clause tells Oracle which single partition it needs to read. Why would Oracle still need to read the whole table? The table selectivity is also about the single partition, because even this selectivity calculation does not need to be based on the whole table (the partition is already clearly identified early on).

Yong Huang

1. 我并是说oracle需要读整个表(做FTS)，因为数据可以只在一个分区内得到，所以只用单个分区的选择率计算基数是没有问题的。
2. 在计算代价公式的第三部分中CF*(table selectivity)，
2.1 CF是针对整个索引，因为索引是非分区的，根据CF的定义，它是针对表中的所有记录而得到
2.2 表选择率是基于单个分区的，假设我选出分区中的所有记录，即selectivity=1时，此时得到的代价是CF，即为根据rowid获取整个表的代价。(但实际上我们只需要根据rowid获取单个分区的记录)
2.3 根据上面两点，我认为在计算代价时的第三部分是不合适的。

奇怪的Global索引，各位版主分析

浏览过的版块