Question about the density formula

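oracledba · posted on 2008-9-18 08:31

Density really reflects a characteristic of part of the data. Previously, if value1 had density d1 and value2 had density d2, the combined density was simply (d1+d2)/2. But if d1 covers only 1 row while d2 covers 100000 rows, is a density computed that way objective? The point of squaring is to amplify the characteristic: the 200% difference between 0.1 and 0.2 becomes the 400% difference between 0.01 and 0.04, so it becomes much less likely that two columns end up with nearly identical densities when their actual distributions differ enormously. When there is no histogram, the optimizer can then use a density that is as accurate as possible to decide which predicate should drive the query. Algorithms of this kind are all much alike; what matters is that the same algorithm is applied consistently.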
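As a rough illustration of the point about squaring, the sketch below compares a plain 1/NDV density with a frequency-weighted one (the sum of squared value frequencies) for the skewed case described above. The function names and the sample counts are hypothetical, and this is not claimed to be Oracle's exact formula; it only shows why squaring makes the skew visible.

```python
def uniform_density(counts):
    """Density if every distinct value is assumed equally common: 1 / NDV."""
    return 1.0 / len(counts)


def weighted_density(counts):
    """Sum of squared value frequencies.

    This is the chance that two rows picked at random (with replacement)
    from the non-null rows hold the same value, so frequent values
    dominate the result.
    """
    total = sum(counts.values())
    return sum((c / total) ** 2 for c in counts.values())


# Hypothetical column: value1 occurs once, value2 occurs 100000 times.
skewed = {"value1": 1, "value2": 100000}
# Hypothetical column with the same two values, evenly spread.
even = {"value1": 50000, "value2": 50000}

for name, counts in (("skewed", skewed), ("even", even)):
    print(name, uniform_density(counts), weighted_density(counts))
```

Both columns have the same 1/NDV of 0.5, but the weighted figure is 0.5 for the even column and very close to 1 for the skewed one.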

宇野 · posted on 2008-9-22 13:16

My understanding is: (value1 rows / non-null rows + value2 rows / non-null rows + ... + valuen rows / non-null rows) / distinct value count
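Read literally, the per-value fractions in this formula add up to 1, so the whole expression collapses to 1 / (distinct value count). A small check of that, using hypothetical row counts and a hypothetical helper name:

```python
from fractions import Fraction


def density_as_written(rows_per_value):
    """(rows_1/non_null + ... + rows_n/non_null) / distinct value count."""
    non_null_rows = sum(rows_per_value)
    distinct_values = len(rows_per_value)
    # Exact fractions avoid any floating-point rounding in the sum.
    share_sum = sum(Fraction(r, non_null_rows) for r in rows_per_value)
    return share_sum / distinct_values


# Hypothetical row counts for four distinct values.
rows_per_value = [1, 7, 42, 950]
print(density_as_written(rows_per_value))  # prints 1/4
```

However the rows are spread across the values, the result depends only on how many distinct values there are.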

morong · posted on 2009-4-8 20:29

Density is calculated as follows:

Pre 7.3
~~~~~~~

Density = 1 / Number of distinct NON null values

The number of distinct NON-null values for a column (COL1) on table TABLE1
can be obtained as follows:

select count(distinct COL1)
from TABLE1
where  COL1 is not null;
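A minimal sketch of this calculation, using Python's built-in sqlite3 module as a stand-in for an Oracle connection (an assumption made only so the example runs anywhere). The table, column, data, and the helper name density_without_histogram are hypothetical. Note that DISTINCT has to sit inside COUNT for the query to count distinct values rather than rows.

```python
import sqlite3


def density_without_histogram(conn, table, column):
    """Return 1 / (number of distinct non-null values), or None if there are none."""
    # Identifiers are interpolated directly, so pass trusted names only.
    sql = f"SELECT COUNT(DISTINCT {column}) FROM {table} WHERE {column} IS NOT NULL"
    (ndv,) = conn.execute(sql).fetchone()
    return 1.0 / ndv if ndv else None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (col1 INTEGER)")
# Hypothetical data: three distinct non-null values plus one NULL.
conn.executemany(
    "INSERT INTO table1 (col1) VALUES (?)",
    [(1,), (1,), (2,), (3,), (3,), (None,)],
)
density = density_without_histogram(conn, "table1", "col1")
print(density)
```

With three distinct non-null values the density comes out as one third; the NULL row does not contribute.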

7.3+
~~~~

The Density calculation has been refined by the use of histograms. If
you have created histograms on your columns we can now use the histogram
information to give more accurate information. Otherwise the Density is
calculated as before. With histograms we can use information on
popular and non-popular values to determine the selectivity.

A non-popular value is one that does not span multiple bucket end points.
A popular value is one that spans multiple end points.

(Refer to <Note:50750.1> for details on histograms)

For non-popular values the density is calculated as the number of non-popular
values divided by the total number of values. Formula:

Density =  Number of non-popular values
            ----------------------------
               total number of values

We only use the density statistic for non-popular values.

For popular values, the selectivity of a particular column value is
calculated from the histogram as follows:

The Selectivity for popular values is calculated as the number of end points
spanned by that value divided by the total number of end points. Formula:

Selectivity = Number of end points spanned by this value
               ------------------------------------------
                     total number of end points
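The popular / non-popular split and the selectivity formula above can be sketched as follows. This is a simplified illustration, assuming the histogram is given as a plain list with one end point value per bucket; the way Oracle actually stores histograms may differ, and the function names and sample end points are hypothetical.

```python
from collections import Counter


def popular_value_selectivity(bucket_end_points):
    """Map each popular value to (end points spanned) / (total end points)."""
    total = len(bucket_end_points)
    spans = Counter(bucket_end_points)
    # A value is popular when it is the end point of more than one bucket.
    return {value: n / total for value, n in spans.items() if n > 1}


def non_popular_values(bucket_end_points):
    """Values that appear as an end point exactly once."""
    spans = Counter(bucket_end_points)
    return [value for value, n in spans.items() if n == 1]


# Hypothetical histogram with 10 buckets, one end point value per bucket.
end_points = [10, 20, 20, 20, 30, 40, 40, 50, 60, 70]
selectivity = popular_value_selectivity(end_points)
print(selectivity)
print(non_popular_values(end_points))
```

Here 20 spans three of the ten end points and 40 spans two, so they are the popular values; the density statistic would apply only to the remaining values.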

morong · posted on 2009-4-8 20:33

I still have one question: what if a column has no histogram in my Oracle 10g database? How is that column's density computed?

morong · posted on 2009-4-8 20:33

I'll work on it when I get home.

Yong Huang · posted on 2009-4-9 03:43

morong,

Thanks for sharing. But here's a friendly reminder. Whenever you post something not written by you, you must tell us the source for reason of copyright.

Message #13 is found at
http://asktom.oracle.com/pls/ask ... ON_ID:2969235095639

But I find that document to be not quite accurate for current versions of Oracle. For instance, on 10g, 10.2.0.2 and up, here's a good example of how density is calculated:

http://www.freelists.org/post/or ... case-of-histogram,3

Also I summarized some of the interesting points at
http://yong321.freeshell.org/oranotes/Histogram.html

Yong Huang

[ Last edited by Yong Huang on 2009-4-9 07:19 ]

morong · posted on 2009-4-14 09:24

Thanks to Yong Huang for the answer!!!

feng_shou_dong · posted on 2009-4-27 16:56

Originally posted by Yong Huang on 2009-4-9 03:43
morong,

Thanks for sharing. But here's a friendly reminder. Whenever you post something not written by you, you must tell us the source for reason of copyright.

Message #13 is found at
http://asktom.oracle.com/pls/ask ... ON_ID:2969235095639

But I find that document to be not quite accurate for current versions of Oracle. For instance, on 10g, 10.2.0.2 and up, here's a good example of how density is calculated:

http://www.freelists.org/post/or ... n-case-of-histogram,3

Also I summarized some of the interesting points at
http://yong321.freeshell.org/oranotes/Histogram.html

Yong Huang

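Yong Huang,
could you also walk everyone through the principles, background, and use cases?
Preferably in Chinese and with examples, so that everyone understands it more deeply!
Please see it through to the end! Thanks in advance on behalf of everyone here.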

wangkxxe · posted on 2009-6-26 14:30

Reading through the featured threads to learn! Reading English is a bit tiring, but I'll keep at it.

wangkxxe · posted on 2009-6-26 14:39

Learning!!!!
