还是查询优化的问题，可能有点怪

4pal · 发表于 2003-5-16 16:27

我也写一个 sql

select B.type from B where exists ( select A.id from A where A.id between B.start_id and B.end_id);

假如你还要id，那就只能用楼主自己的sql语句了。

4pal · 发表于 2003-5-16 16:30

假如你要让你的sql语句用上你所谓的索引，呵呵，你就准备等更长更长的时间吧。
（如果没猜错，你的索引没有包括 type这个 column）

shangym · 发表于 2003-5-16 16:45

能把执行计划贴出来吗？
A表上建索引了吗？

xiaba · 发表于 2003-5-16 16:56

楼主算一算你返回结果集的个数SELECT COUNT（*） FROM 。。。，我有点怀疑你这样子写返回的是笛卡儿积，另外A表ID索引用了没有？可加HINTS强制使用，还有可以分析以下ID列，看他是不是非均匀分布，可使用HISTOGRAN属性。方法很多，漫漫十八

newkid · 发表于 2021-7-17 19:07

测试回帖

sql_tigerliu · 发表于 2021-7-17 19:19

本帖最后由 sql_tigerliu 于 2021-7-17 19:23 编辑

如果是12c及以上版本, 可以用下面写法:
select *
from
(
select id,
(select case when start_id<=a.id then type else null end as type
      from
      (select type,start_id
      from range_b
      where end_id>=a.id
      order by end_id
      ) where rownum=1
) as type
from range_a a
) where type is not null;
需要创建 range_b表end_id字段上的索引

sql_tigerliu · 发表于 2021-7-17 19:33

如果是12c之前的版本, 标量子查询的子查询中不支持对主查询字段的引用, 建议创建一个函数, 然后调用函数:
这里type用的是number类型
create or replace function fun_get_range_type(p_id integer)  return integer
is
v_type integer;
BEGIN
select case when start_id<=p_id then type else null end as type
into v_type
from
(select type,start_id
   from range_b
   where end_id>=p_id order by end_id
) where rownum=1;
return(v_type);
end;
/
--查询语句:
select * from
(
select id,fun_get_range_type(id) as type
from range_a
)where type is not null;

还是需要创建 range_b表end_id字段上的索引

newkid · 发表于 2021-7-17 20:59

A 表有二十万，意味着要执行二十万次标量子查询，效率也不会高。

sql_tigerliu · 发表于 2021-7-17 21:12

本帖最后由 sql_tigerliu 于 2021-7-17 21:14 编辑

newkid 发表于 2021-7-17 20:59
A 表有二十万，意味着要执行二十万次标量子查询，效率也不会高。

因为没找到比这个效率更高的写法, 只能凑合了. 简单测试了一下, 不到1秒. 而且里面还设计了将近一半记录匹配不到区间的情况.

newkid · 发表于 2021-7-18 02:36

本帖最后由 newkid 于 2021-7-18 03:30 编辑

我的思路是构造等值连接条件，这样就可以用HASH JOIN, 原有的比较条件可以作为附加的 filter

下面是刘兄提供的测试用例：
create table range_a(id number);
create table range_b(start_id number,end_id integer,type number);

insert into range_a
select round(dbms_random.value(1,25000000)) from dual connect by level<=200000;

insert into range_b
select 10000*(level-1)+3000 as start_id, 10000*level as end_id , level as type
from dual connect by level<=2000
order by dbms_random.value;

commit;

create index idx_range_b_0 on range_b(start_id,end_id);
create index idx_range_b_1 on range_b(end_id);

create table range_result (id number, type number);

create table range_result2 (id number, type number);

--原sql:
insert into range_result
SELECT ID, TYPE
FROM range_a A,range_b B
WHERE A.ID>=B.START_ID AND A.ID<=B.END_ID;

-------- 刘兄写法：
--12c+版本
insert into range_result
select *
from
(
select id,
(select case when start_id<=a.id then type else null end as type
      from
      (select /*+ index(range_b idx_range_b_1) */ type,start_id
      from range_b
      where end_id>=a.id
      order by end_id
      ) where rownum=1
) as type
from range_a a
) where type is not null;

--- 111824 rows created.

--- Elapsed: 00:00:01.23

按我的思路构造的写法：分析B表数据特点，范围以千计, 所以把 id/10000 进行等值比较。范围除以10000之后可能有多个值，比如从13000—20000这个返回，除以10000之后会映射到1,2两个值，所以一行B表数据会变成多行，我用了12C的LATERAL方法来复制B表数据，在12C之前可以用传统的CONNECT BY方法。

insert into range_result2
with b as (
select b.*,trunc(start_id/10000)+n-1 as compare_id  ------- 构造一个等值比较的 ID
  from range_b b
   ,lateral(select level n from dual connect by level<=trunc(end_id/10000)-trunc(start_id/10000)+1)
)
select a.id,b.type from range_a a,b
where trunc(a.id/10000)=b.compare_id  -------- 等值比较
  and a.id between b.start_id and b.end_id ---------- 附加filter
;

111824 rows created.

Elapsed: 00:00:00.12

从执行计划可以看到，等值比较如愿以偿用上了HASH JOIN：

Execution Plan
----------------------------------------------------------
Plan hash value: 2108890611

----------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name          | Rows  | Bytes | Cost (%CPU)| Time    |
----------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT                |                | 960K| 29M|  4124 (1)| 00:00:01 |
| 1 |  LOAD TABLE CONVENTIONAL       | RANGE_RESULT2 |    |    |          |       |
|*  2 | HASH JOIN                   |                | 960K| 29M|  4124 (1)| 00:00:01 |
| 3 | NESTED LOOPS                |                |  2000 | 52000 |  4004 (1)| 00:00:01 |
| 4 |    TABLE ACCESS FULL          | RANGE_B       |  2000 | 26000 |    3 (0)| 00:00:01 |
| 5 |    VIEW                      | VW_LAT_535DE542 |    1 | 13 |    2 (0)| 00:00:01 |
|*  6 |    CONNECT BY WITHOUT FILTERING|                |    |    |          |       |
| 7 |    FAST DUAL                |                |    1 |    |    2 (0)| 00:00:01 |
| 8 | TABLE ACCESS FULL          | RANGE_A       | 200K|  1171K| 103 (1)| 00:00:01 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access(TRUNC("A"."ID"/10000)=TRUNC("START_ID"/10000)+"N"-1)
   filter("A"."ID"<="B"."END_ID" AND "A"."ID">="B"."START_ID")
6 - filter(LEVEL<=TRUNC("END_ID"/10000)-TRUNC("START_ID"/10000)+1)