楼主: newkid

[精华] SQL解惑(第2版) 的一些样题

[复制链接]
论坛徽章:
520
奥运会纪念徽章:垒球
日期:2008-09-15 01:28:12生肖徽章2007版:鸡
日期:2008-11-17 23:40:58生肖徽章2007版:马
日期:2008-11-18 05:09:48数据库板块每日发贴之星
日期:2008-11-29 01:01:02数据库板块每日发贴之星
日期:2008-12-05 01:01:03生肖徽章2007版:虎
日期:2008-12-10 07:47:462009新春纪念徽章
日期:2009-01-04 14:52:28数据库板块每日发贴之星
日期:2009-02-08 01:01:03生肖徽章2007版:蛇
日期:2009-03-09 22:18:532009日食纪念
日期:2009-07-22 09:30:00
21#
 楼主| 发表于 2010-3-13 00:46 | 只看该作者
PUZZLE 3 THE ANESTHESIA PUZZLE
Leonard C. Medal came up with this nifty little problem many years ago.
Anesthesiologists administer anesthesia during surgeries in hospital
operating rooms. Information about each anesthesia procedure is
recorded in a table.
Procs
proc_id anest_name start_time end_time
======================================
10 'Baker' 08:00 11:00
20 'Baker' 09:00 13:00
30 'Dow' 09:00 15:30
40 'Dow' 08:00 13:30
50 'Dow' 10:00 11:30
60 'Dow' 12:30 13:30
70 'Dow' 13:30 14:30
80 'Dow' 18:00 19:00
Note that some of the times for a given anesthesiologist overlap. This
is not a mistake. Anesthesiologists, unlike surgeons, can move from one
operating room to another while surgeries are underway, checking on
each patient in turn, adjusting dosages, and leaving junior doctors and
anesthesia nurses to monitor the patients on a minute-to-minute basis.
Pay for the anesthesiologist is per procedure, but there’s a catch.
There is a sliding scale for remuneration for each procedure based on
the maximum count of simultaneous procedures that an
anesthesiologist has underway. The higher this count, the lower the
amount paid for the procedure.
For example, for procedure #10, at each instant during that
procedure Dr. Baker counts up how many total procedures in which he
was concurrently involved. This maximum count for procedure #10 is 2.
Based on the “concurrency” rules, Dr. Baker gets paid 75% of the fee for
procedure #10.
The problem then is to determine for each procedure over its
duration, the maximum, instantaneous count of procedures carried out
by the anesthesiologist.
We can derive the answer graphically at first to get a better
understanding of the problem.

Example 1 shows two overlapping procedures. The upper, Ganttlike
graph displays the elapsed time of the procedure we are evaluating
(the subject procedure) and all the doctor’s other procedures that
overlap in time.
The lower graph (Instantaneous Count of In-Progress Procedures)
shows how many procedures are underway at each moment. It helps to
think of a slide rule hairline moving from left to right over the Gantt
chart while each procedure start or end is plotted stepwise on the
lower chart.
We can see in Example 1 by inspection that the maximum is 2.
Example 1—Dr. Baker, Proc #10
Example 2 shows a more complex set of overlapping procedures, but
the principle is the same. The maximum, which happens twice, is 3.
Example 2—Dr. Dow, Proc #30
Note that the correct answer is not the number of overlapping
procedures but the maximum instantaneous count. The puzzle is how to
do this for each procedure using SQL. Here is the desired result for the
sample data:

proc_id max_inst_count
============================
10 2
20 2
30 3
40 3
50 3
60 3
70 2
80 1
Answer #1
The first step is to convert each procedure into two Events—a start and
end event—and put them in a view. The UNION operator appends the set
of end Events to the set of start Events. A (+1) indicates a start event and
a (-1) indicates an end event.
The WHERE clauses assure that the procedures compared overlap and
are for the same anesthesiologist. The NOT condition eliminates
procedures that do not overlap the subject procedure.
CREATE VIEW Events (proc_id, comparison_proc, anest_name,
event_time, event_type)
AS SELECT P1.proc_id,
P2.proc_id,
P1.anest_name,
P2.start_time,
+1
FROM Procs AS P1, Procs AS P2
WHERE P1.anest_name = P2.anest_name
AND NOT (P2.end_time <= P1.start_time
OR P2.start_time >= P1.end_time)
UNION
SELECT P1.proc_id,
P2.proc_id,
P1.anest_name,
P2.end_time,
-1 AS event_type
FROM Procs AS P1, Procs AS P2
WHERE P1.anest_name= P2.anest_name
AND NOT (P2.end_time <= P1.start_time
OR P2.start_time >= P1.end_time);

The result is this view shown here for procedure #10 only and sorted
by event_time for clarity. Notice that the same anesthesiologist can start
more than one procedure at the same time.
Events
proc_id comparison_proc anest_name event_time event_type
==========================================================
10 10 Baker 08:00 +1
10 20 Baker 09:00 +1
10 10 Baker 11:00 -1
10 20 Baker 13:00 -1
Now, for each set of Events with the same proc_id id, we can
compute for each event the sum of the event_types for Events that
occur earlier. This series of backward-looking sums gives us the values
represented by each step in the step charts.
SELECT E1.proc_id, E1.event_time,
(SELECT SUM(E2.event_type)
FROM Events AS E2
WHERE E2.proc_id = E1.proc_id
AND E2.event_time < E1.event_time)
AS instantaneous_count
FROM Events AS E1
ORDER BY E1.proc_id, E1.event_time;
The result of this query is shown here for procedure #10 only.
proc_id instantaneous_count
===============================
10 NULL
10 1
10 2
10 1
You could put this result set into a view called ConcurrentProcs,
then query the view to get the maximum instantaneous count for each
subject procedure using this statement:

SELECT proc_id,
MAX(instantaneous_count) AS max_inst
FROM ConcurrentProcs
GROUP BY proc_id;
But you could also extract the desired result directly from the Events
view. You could do this by merging the two select statements:
SELECT E1.proc_id,
MAX((SELECT SUM(E2.event_type)
FROM Events AS E2
WHERE E2.proc_id = E1.proc_id
AND E2.event_time < E1.event_time)) AS
max_inst_count
FROM Events AS E1
GROUP BY E1.proc_id;
However, it is illegal to put a subquery expression in an aggregate
function in SQL-92, so you are depending on a vendor extension.

Answer #2
Richard Romley’s single-query answer depends on the SQL-92 table
query expressions in the FROM clause, so that what had been a VIEW
can be folded into the query.
SELECT P3.proc_id, MAX(ConcurrentProcs.tally)
FROM (SELECT P1.anest_name, P1.start_time, COUNT(*)
FROM Procs AS P1
INNER JOIN
Procs AS P2
ON P1.anest_name= P2.anest
AND P2.start_time <= P1.start_time
AND P2.stop_time > P1.start_time
GROUP BY P1.anest_name, P1.start_time)
AS ConcurrentProcs(anest_name, start_time, tally)
INNER JOIN
Procs AS P3
ON ConcurrentProcs.anest_name= P3.anest
AND P3.start_time <= ConcurrentProcs.start_time
AND P3.stop_time > ConcurrentProcs.start_time
GROUP BY P3.proc_id;

Answer #3
This answer came from Lex van de Pol (aavdpol@hotmail.com) on June
9, 2000. The idea is to loop through all procedures (P1); for each
procedure P1 you look at procedures P2 where their start time lies in the
interval of procedure P2. For each start time you found of P2, count the
number of procedures (P3) that are ongoing on that time. Then, take the
maximum count for a certain procedure P1.
For doing this, Lex first created this view:
CREATE VIEW Vprocs (id1, id2, total)
AS SELECT P1.prc_id, P2.prc_id, COUNT(*)
FROM Procs AS P1, Procs AS P2, Procs AS P3
WHERE P2.ant_name = P1.ant_name
AND P3.ant_name = P1.ant_name
AND P1.prc_start <= P2.prc_start
AND P2.prc_start < P1.prc_end
AND P3.prc_start <= P2.prc_start
AND P2.prc_start < P3.prc_end
GROUP BY P1.prc_id, P2.prc_id;
He then took the maximum for each procedure P1:
SELECT id1 AS proc_id, MAX(total) AS max_inst_count
FROM Vprocs
GROUP BY id1;
The funny thing is, you do not need to look at the end times for
procedures P2.

Answer #4
Bert C. Hughes (bhughes@twincities.net) came up with a solution in
Microsoft ACCESS, a proprietary near-SQL language. Here is his code
translated into a single SQL statement:
SELECT P1.proc_id, P1.anest_name, MAX(E1.ecount) AS maxops
FROM Procs AS P1,
-- E1 is # of processes active at time for each
anesthetist
(SELECT P2.anest_name, P2.start_time, COUNT(*)
FROM Procs AS P1, Procs AS P2
WHERE P1.anest_name = P2.anest_name
AND P1.start_time <= P2.start_time
AND P1.end_time > P2.start_time
GROUP BY P2.anest_name, P2.start_time)
AS E1(anest_name, etime, ecount)
WHERE E1.anest_name= P1.anest_name
AND E1.etime >= P1.start_time
AND E1.etime < P1.end_time
GROUP BY P1.proc_id, P1.anest;

Answer #5
Another approach is to set up a Clock table, since you probably round
the billing to within a minute. That means we would have a table with
(24 hours * 60 minutes) = 1,440 rows per day, or 525,600 rows; a year’s
worth of scheduling. But you can also set up a VIEW for the current day:
SELECT X.anest_name, MAX(X.proc_tally)
FROM (SELECT P1.anest_name, COUNT(DISTINCT proc_id)
FROM Procs AS P1, Clock AS C
WHERE C1.clock_time BETWEEN P1.start_time
AND P1.end_time
GROUP BY P1.anest_name)
AS X(anest_name, proc_tally)
GROUP BY X.anest_name);
This is just another version of the Calendar auxiliary table. This kind
of table depends on a known granularity—Calendars work to the day,
and Clocks to the minute, usually. You also can create a VIEW that uses
a table of one day’s clock ticks stored in minutes and the system constant
CURRENT_DATE.
CREATE VIEW TodayClock (clock_time)
AS
SELECT CURRENT_DATE + ticks
FROM DayTicks;


题目大意:有一张手术时间表,每个手术指派一名麻醉师。同一名麻醉师服务的手术时间可以重叠。
对于每个手术,求出在时间轴上同时出现的重叠手术个数的最大值。这将被用于计算麻醉师的报酬。
比如手术10的最高重叠数是2(10和20), 手术30的最高重叠数是3 (30/40/50, 或者30/40/60),见附件插图。

CREATE TABLE Procs (proc_id NUMBER, anest_name VARCHAR2(20), start_time VARCHAR2(10), end_time VARCHAR2(10));

INSERT INTO PROCS VALUES (10,'Baker' ,'08:00','11:00');
INSERT INTO PROCS VALUES (20,'Baker' ,'09:00','13:00');
INSERT INTO PROCS VALUES (30,'Dow'   ,'09:00','15:30');
INSERT INTO PROCS VALUES (40,'Dow'   ,'08:00','13:30');
INSERT INTO PROCS VALUES (50,'Dow'   ,'10:00','11:30');
INSERT INTO PROCS VALUES (60,'Dow'   ,'12:30','13:30');
INSERT INTO PROCS VALUES (70,'Dow'   ,'13:30','14:30');
INSERT INTO PROCS VALUES (80,'Dow'   ,'18:00','19:00');

书中答案#1的ORACLE版本:
WITH Events AS (
SELECT P1.proc_id,
       P2.proc_id AS comparison_proc,
       P1.anest_name,
       P2.start_time event_time,
       +1 event_type
FROM Procs  P1, Procs  P2
WHERE P1.anest_name = P2.anest_name
      AND NOT (P2.end_time <= P1.start_time
      OR P2.start_time >= P1.end_time)
UNION ALL
SELECT P1.proc_id,
       P2.proc_id,
       P1.anest_name,
       P2.end_time,
       -1 AS event_type
FROM Procs  P1, Procs  P2
WHERE P1.anest_name = P2.anest_name
      AND NOT (P2.end_time <= P1.start_time
      OR P2.start_time >= P1.end_time)
)
,ConcurrentProcs AS (
SELECT E1.proc_id, E1.event_time,
      (SELECT SUM(E2.event_type)
      FROM Events  E2
      WHERE E2.proc_id = E1.proc_id
      AND E2.event_time < E1.event_time)
      AS instantaneous_count
FROM Events  E1
      )
SELECT proc_id,
MAX(instantaneous_count) AS max_inst
FROM ConcurrentProcs
GROUP BY proc_id
ORDER BY 1;

经我改良过的答案#1:
SELECT proc_id,MAX(cnt)
  FROM (SELECT proc_id,SUM(event_type) OVER (PARTITION BY proc_id ORDER BY event_time) AS cnt
          FROM (SELECT a.*
                      ,(CASE WHEN rn=1 THEN b.start_time ELSE b.end_time END) AS event_time
                      ,(CASE WHEN rn=1 THEN 1 ELSE -1 END) AS event_type
                  FROM Procs a,Procs b,(SELECT ROWNUM rn FROM DUAL CONNECT BY ROWNUM<=2)
                 WHERE a.anest_name = b.anest_name
                       AND a.end_time > b.start_time
                       AND a.start_time < b.end_time
               )
        WHERE event_time<end_time
      )
GROUP BY proc_id;

p3_1.GIF (18.46 KB, 下载次数: 108)

p3_1.GIF

p3_2.GIF (23.82 KB, 下载次数: 88)

p3_2.GIF

使用道具 举报

回复
论坛徽章:
520
奥运会纪念徽章:垒球
日期:2008-09-15 01:28:12生肖徽章2007版:鸡
日期:2008-11-17 23:40:58生肖徽章2007版:马
日期:2008-11-18 05:09:48数据库板块每日发贴之星
日期:2008-11-29 01:01:02数据库板块每日发贴之星
日期:2008-12-05 01:01:03生肖徽章2007版:虎
日期:2008-12-10 07:47:462009新春纪念徽章
日期:2009-01-04 14:52:28数据库板块每日发贴之星
日期:2009-02-08 01:01:03生肖徽章2007版:蛇
日期:2009-03-09 22:18:532009日食纪念
日期:2009-07-22 09:30:00
22#
 楼主| 发表于 2010-3-13 02:20 | 只看该作者
接上贴:
书中答案#2的ORACLE版本:这是一个相当优秀的答案
SELECT P3.proc_id, MAX(ConcurrentProcs.tally)
FROM (SELECT P1.anest_name, P1.start_time, COUNT(*) tally
      FROM Procs  P1
      INNER JOIN
      Procs  P2
      ON P1.anest_name= P2.anest_name
      AND P2.start_time <= P1.start_time
      AND P2.end_time > P1.start_time
      GROUP BY P1.anest_name, P1.start_time
      ) ConcurrentProcs
      INNER JOIN
      Procs P3
      ON ConcurrentProcs.anest_name= P3.anest_name
      AND P3.start_time <= ConcurrentProcs.start_time
      AND P3.end_time > ConcurrentProcs.start_time
GROUP BY P3.proc_id;


经我用分析函数改良过的答案#2: (认不出来了吧?)
SELECT proc_id,MAX(tally)
  FROM (SELECT p2.proc_id, COUNT(*) OVER(PARTITION BY p1.proc_id) tally
         FROM Procs  P1
              INNER JOIN
              Procs  P2
              ON P1.anest_name= P2.anest_name
                 AND P2.start_time <= P1.start_time
                 AND P2.end_time > P1.start_time
        )
GROUP BY proc_id;


#3仔细一看,和#2其实是很类似的,只不过把所有表连接放到同一层:
SELECT id1 AS proc_id, MAX(total) AS max_inst_count
  FROM (SELECT P1.proc_id id1, P2.proc_id id2, COUNT(*) total
         FROM Procs P1, Procs P2, Procs P3
         WHERE P2.anest_name = P1.anest_name
               AND P3.anest_name = P1.anest_name
               AND P1.start_time <= P2.start_time
               AND P2.start_time <  P1.end_time
               AND P3.start_time <= P2.start_time
               AND P2.start_time <  P3.end_time
        GROUP BY P1.proc_id, P2.proc_id
        ) Vprocs
GROUP BY id1;

#4其实和#2也是一样的:
SELECT P1.proc_id, P1.anest_name, MAX(E1.ecount) AS maxops
FROM Procs  P1,
     (SELECT P2.anest_name, P2.start_time etime, COUNT(*) ecount
       FROM Procs P1, Procs P2
      WHERE P1.anest_name = P2.anest_name
            AND P1.start_time <= P2.start_time
            AND P1.end_time > P2.start_time
       GROUP BY P2.anest_name, P2.start_time
      ) E1
WHERE E1.anest_name= P1.anest_name
      AND E1.etime >= P1.start_time
      AND E1.etime < P1.end_time
GROUP BY P1.proc_id, P1.anest_name;


#5有点意思,它的思路是把一天的时间打成碎片然后和手术表连接。可惜的是效率不怎么样。

作者给出的答案,按ORACLE写法实现如下,但其实是错的:
SELECT X.anest_name, MAX(X.proc_tally)
FROM (SELECT P1.anest_name, COUNT(DISTINCT proc_id) proc_tally
       FROM Procs P1,
            (SELECT TO_CHAR(TIMESTAMP '2008-1-1 00:00:00' + numtodsinterval(ROWNUM, 'minute' ),'HH24:MI') clock_time
              FROM DUAL
             CONNECT BY ROWNUM<=24*60-1) C1
       WHERE C1.clock_time BETWEEN P1.start_time AND P1.end_time
       GROUP BY P1.anest_name) X
GROUP BY X.anest_name;

按此思路经过我改正的答案:
SELECT X.proc_id, MAX(X.proc_tally)
FROM (SELECT P1.proc_id, COUNT(DISTINCT proc_id) OVER(PARTITION BY anest_name,clock_time) proc_tally
       FROM Procs P1,
            (SELECT TO_CHAR(TIMESTAMP '2008-1-1 00:00:00' + numtodsinterval(ROWNUM, 'minute' ),'HH24:MI') clock_time
              FROM DUAL
             CONNECT BY ROWNUM<=24*60-1) C1
       WHERE C1.clock_time >= P1.start_time AND C1.clock_time<P1.end_time
      ) X
GROUP BY X.proc_id;

[ 本帖最后由 newkid 于 2010-3-13 02:58 编辑 ]

使用道具 举报

回复
论坛徽章:
520
奥运会纪念徽章:垒球
日期:2008-09-15 01:28:12生肖徽章2007版:鸡
日期:2008-11-17 23:40:58生肖徽章2007版:马
日期:2008-11-18 05:09:48数据库板块每日发贴之星
日期:2008-11-29 01:01:02数据库板块每日发贴之星
日期:2008-12-05 01:01:03生肖徽章2007版:虎
日期:2008-12-10 07:47:462009新春纪念徽章
日期:2009-01-04 14:52:28数据库板块每日发贴之星
日期:2009-02-08 01:01:03生肖徽章2007版:蛇
日期:2009-03-09 22:18:532009日食纪念
日期:2009-07-22 09:30:00
23#
 楼主| 发表于 2010-3-13 06:36 | 只看该作者
根据英文版补充8楼的"谜题12 索赔状态"

书中答案#2的ORACLE版本:

SELECT E1.claim_id, C1.patient_name
     , (SELECT claim_seq
          FROM ClaimStatusCodes S1
         WHERE S1.claim_status = E1.claim_status
        ) claim_seq
  FROM LegalEvents  E1, Claims  C1
WHERE E1.claim_id = C1.claim_id
GROUP BY E1.claim_id, C1.patient_name, E1.claim_status
HAVING COUNT(*) = (SELECT COUNT(DISTINCT defendant_name)
                     FROM LegalEvents  E2
                   WHERE E1.claim_id = E2.claim_id)
ORDER BY E1.claim_id;

作者的答案其实未完工,必须再加上一层:
SELECT claim_id, patient_name, MIN(claim_seq)
  FROM (SELECT E1.claim_id, C1.patient_name
              , (SELECT claim_seq
                   FROM ClaimStatusCodes S1
                  WHERE S1.claim_status = E1.claim_status
                 ) claim_seq
           FROM LegalEvents  E1, Claims  C1
         WHERE E1.claim_id = C1.claim_id
         GROUP BY E1.claim_id, C1.patient_name, E1.claim_status
         HAVING COUNT(*) = (SELECT COUNT(DISTINCT defendant_name)
                              FROM LegalEvents  E2
                            WHERE E1.claim_id = E2.claim_id)
       )
GROUP BY claim_id, patient_name;

用此思路改良的版本,其实已经面目全非了:
SELECT cl.claim_id,cl.patient_name,MIN(claim_status) KEEP(DENSE_RANK FIRST ORDER BY claim_seq DESC)
  FROM (SELECT l.claim_id
              ,l.claim_status
              ,cd.claim_seq
              ,COUNT(*) OVER(PARTITION BY l.claim_id,l.claim_status) cnt_status
              ,COUNT(DISTINCT defendant_name) OVER(PARTITION BY claim_id) cnt_defendant
         FROM LegalEvents l, ClaimStatusCodes cd
        WHERE l.claim_status= cd.claim_status  
        ) l
       ,claims cl
WHERE cnt_status = cnt_defendant
      AND l.claim_id = cl.claim_id
GROUP BY cl.claim_id,cl.patient_name;

#3在ORACLE运行不了:
SELECT claim_id, MAX(patient_name), MAX(T1.claim_status)
FROM ClaimStatusCodes  T1, Claims  C1
WHERE T1.claim_seq =
      (SELECT MAX(T2.claim_seq)
         FROM ClaimStatusCodes  T2, LegalEvents  E1
        WHERE T2.claim_status = E1.claim_status
              AND E1.claim_id = C1.claim_id
       GROUP BY E1.claim_id, defendant_name)
GROUP BY claim_id, T1.claim_seq
HAVING T1.claim_seq = MIN(T1.claim_seq)


Answer #4
This answer came from Francisco Moreno, and it is designed to avoid
subqueries by using the JOIN syntax of SQL-92.
Step 1
Insert a dummy row in the claim_status table:

INSERT INTO ClaimStatusCodes (claim_status, claim_status_desc,
claim_seq)
VALUES ('XX', 'Dummy', 5);

Step 2
The query:

SELECT C1.claim_id, C1.patient_name,
       CASE MIN(S1.claim_seq)  ------- 往前推一级即claim_seq-1
            WHEN 2 THEN 'AP'
            WHEN 3 THEN 'OR'
            WHEN 4 THEN 'SF'
            ELSE 'CL'
       END
FROM ( (Claims  C1
       INNER JOIN
       Defendants  D1
       ON C1.claim_id = D1.claim_id
       )
       CROSS JOIN
       ClaimStatusCodes S1
       )
       LEFT OUTER JOIN
       LegalEvents  E1
       ON C1.claim_id = E1.claim_id
       AND D1.defendant_name= E1.defendant_name
       AND S1.claim_status = E1.claim_status
WHERE E1.claim_id IS NULL   -------- 数据中“没有”的那些状态
GROUP BY C1.claim_id, C1.patient_name;

这个#4思路十分怪异,一般人会从“已有”数据求答案,它是先凑出“没有”的那些状态,然后再倒过来推算“已有”的。
按此怪异思路,用10G的分区外连接改良如下:
SELECT C1.claim_id, C1.patient_name,
       CASE E2.claim_seq  ------- 往前推一级, 即claim_seq-1
            WHEN 2 THEN 'AP'
            WHEN 3 THEN 'OR'
            WHEN 4 THEN 'SF'
            ELSE 'CL'
       END
  FROM (SELECT E1.claim_id,MIN(S1.claim_seq) AS claim_seq
          FROM LegalEvents  E1 PARTITION BY (claim_id, defendant_name)
               RIGHT JOIN ClaimStatusCodes S1
               ON E1.claim_status = S1.claim_status
         WHERE E1.claim_status IS NULL
        GROUP BY E1.claim_id
        ) E2 JOIN Claims C1
       ON E2.claim_id = C1.claim_id;

使用道具 举报

回复
论坛徽章:
17
生肖徽章2007版:鸡
日期:2008-01-02 17:35:53ITPUB十周年纪念徽章
日期:2011-11-01 16:21:152011新春纪念徽章
日期:2011-02-18 11:43:332010新春纪念徽章
日期:2010-03-01 11:04:57生肖徽章2007版:鼠
日期:2009-11-29 12:48:34生肖徽章2007版:兔
日期:2009-11-23 16:38:24祖国60周年纪念徽章
日期:2009-10-09 08:28:00生肖徽章2007版:龙
日期:2009-09-10 11:23:342009日食纪念
日期:2009-07-22 09:30:00生肖徽章2007版:猴
日期:2009-06-14 03:09:34
24#
发表于 2010-3-13 19:20 | 只看该作者
佩服,值得学习

使用道具 举报

回复
论坛徽章:
50
2014年世界杯参赛球队: 荷兰
日期:2014-07-11 07:56:59蛋疼蛋
日期:2012-03-06 07:22:542012新春纪念徽章
日期:2012-02-13 15:09:522012新春纪念徽章
日期:2012-02-13 15:09:522012新春纪念徽章
日期:2012-02-13 15:09:522012新春纪念徽章
日期:2012-02-13 15:09:522012新春纪念徽章
日期:2012-02-13 15:09:522012新春纪念徽章
日期:2012-01-04 11:53:29蛋疼蛋
日期:2011-11-11 15:47:00ITPUB十周年纪念徽章
日期:2011-11-01 16:23:26
25#
发表于 2010-3-13 20:26 | 只看该作者
绝对的偶像!

使用道具 举报

回复
论坛徽章:
55
沸羊羊
日期:2015-04-03 16:29:392013年新春福章
日期:2013-02-25 14:51:24奥运会纪念徽章:乒乓球
日期:2012-10-16 21:48:05奥运会纪念徽章:举重
日期:2012-10-13 21:56:40ITPUB 11周年纪念徽章
日期:2012-10-09 18:13:33奥运会纪念徽章:水球
日期:2012-06-22 21:14:48奥运会纪念徽章:射箭
日期:2012-06-22 13:48:20双黄蛋
日期:2012-01-10 11:13:052012新春纪念徽章
日期:2012-01-04 11:55:23ITPUB十周年纪念徽章
日期:2011-11-01 16:25:22
26#
发表于 2010-3-13 21:57 | 只看该作者
感谢分享,顶~

使用道具 举报

回复
论坛徽章:
520
奥运会纪念徽章:垒球
日期:2008-09-15 01:28:12生肖徽章2007版:鸡
日期:2008-11-17 23:40:58生肖徽章2007版:马
日期:2008-11-18 05:09:48数据库板块每日发贴之星
日期:2008-11-29 01:01:02数据库板块每日发贴之星
日期:2008-12-05 01:01:03生肖徽章2007版:虎
日期:2008-12-10 07:47:462009新春纪念徽章
日期:2009-01-04 14:52:28数据库板块每日发贴之星
日期:2009-02-08 01:01:03生肖徽章2007版:蛇
日期:2009-03-09 22:18:532009日食纪念
日期:2009-07-22 09:30:00
27#
 楼主| 发表于 2010-3-14 03:53 | 只看该作者
谜题17 职业介绍

题目大意:
猎头手里有一些工作(JOB ORDERS)和一些候选人,现在要选出哪些人具备工作所要求的技能。

候选人及其拥有的技能:
CREATE TABLE CandidateSkills
(candidate_id INTEGER NOT NULL,
skill_code CHAR(15) NOT NULL,
PRIMARY KEY (candidate_id, skill_code));

INSERT INTO candidateskills VALUES(100, 'accounting')    ;
INSERT INTO candidateskills VALUES(100, 'inventory')     ;
INSERT INTO candidateskills VALUES(100, 'manufacturing') ;
INSERT INTO candidateskills VALUES(200, 'accounting')    ;
INSERT INTO candidateskills VALUES(200, 'inventory')     ;
INSERT INTO candidateskills VALUES(300, 'manufacturing') ;
INSERT INTO candidateskills VALUES(400, 'inventory')     ;
INSERT INTO candidateskills VALUES(400, 'manufacturing') ;
INSERT INTO candidateskills VALUES(500, 'accounting')    ;
INSERT INTO candidateskills VALUES(500, 'manufacturing') ;

工作及其要求的技能:

CREATE TABLE JobOrders
(job_id INTEGER NOT NULL,
skill_group INTEGER NOT NULL,
skill_code CHAR(15) NOT NULL,
PRIMARY KEY (job_id, skill_group, skill_code));

同一个skill_group中的要求是“与”的关系,即必须全部拥有;不同skill_group之间是或的关系。

例子:
Job 1 = ('inventory' AND 'manufacturing') OR 'accounting'
Job 2 = ('inventory' AND 'manufacturing') OR ('accounting' AND 'manufacturing')
Job 3 = 'manufacturing'
Job 4 = ('inventory' AND 'manufacturing' AND 'accounting')

在表中的数据体现为:
This translates into:

INSERT INTO joborders VALUES(1, 1, 'inventory')     ;
INSERT INTO joborders VALUES(1, 1, 'manufacturing') ;
INSERT INTO joborders VALUES(1, 2, 'accounting')    ;
INSERT INTO joborders VALUES(2, 1, 'inventory')     ;
INSERT INTO joborders VALUES(2, 1, 'manufacturing') ;
INSERT INTO joborders VALUES(2, 2, 'accounting')    ;
INSERT INTO joborders VALUES(2, 2, 'manufacturing') ;
INSERT INTO joborders VALUES(3, 1, 'manufacturing') ;
INSERT INTO joborders VALUES(4, 1, 'inventory')     ;
INSERT INTO joborders VALUES(4, 1, 'manufacturing') ;
INSERT INTO joborders VALUES(4, 1, 'accounting')    ;

answer#1 无实质内容。

ANSWER#2:
SELECT DISTINCT J1.job_id, C1.candidate_id
FROM  JobOrders  J1 INNER JOIN CandidateSkills  C1
       ON J1.skill_code = C1.skill_code
GROUP BY candidate_id, skill_group, job_id
HAVING COUNT(*) >= (SELECT COUNT(*)
                      FROM JobOrders  J2
                     WHERE J1.skill_group = J2.skill_group
                           AND J1.job_id = J2.job_id
                    );
                    
里面用到了相关子查询。改用ORACLE分析函数:
SELECT DISTINCT J1.job_id, C1.candidate_id
FROM  (SELECT j.*
            ,COUNT(*) OVER(PARTITION BY job_id,skill_group) cnt
        FROM JobOrders j            
        )  J1 INNER JOIN CandidateSkills  C1
       ON J1.skill_code = C1.skill_code
GROUP BY candidate_id, skill_group, job_id
HAVING COUNT(*) = MAX(cnt);
也可以这么写:
SELECT DISTINCT J1.job_id, C1.candidate_id
FROM  (SELECT j.*
            ,COUNT(*) OVER(PARTITION BY job_id,skill_group) cnt
        FROM JobOrders j            
        )  J1 INNER JOIN CandidateSkills  C1
       ON J1.skill_code = C1.skill_code
GROUP BY candidate_id, skill_group, job_id,cnt
HAVING COUNT(*) = cnt;

Answer #3
Another answer came from Richard Romley at Smith Barney. He then
came up with an answer that does not involve a correlated subquery in
SQL-92, thus:
SELECT J1.job_id, C1.candidate_id
FROM (SELECT job_id, skill_grp, COUNT(*)
       FROM JobSkillRequirements
      GROUP BY job_id, skill_grp)
       AS J1(job_id, skill_grp, grp_cnt)
     CROSS JOIN
     (SELECT R1.job_id, R1.skill_grp, S1.candidate_id,
             COUNT(*)
       FROM JobSkillRequirements AS R1, CandidateSkills AS S1
      WHERE R1.skillid = S1.skillid
      GROUP BY R1.job_id, R1.skill_grp, S1.candidate_id)
      AS C1(job_id, skill_grp, candidate_id, candidate_cnt)
WHERE J1.job_id = C1.job_id
      AND J1.skill_grp = C1.skill_grp
      AND J1.grp_cnt = C1.candidate_cnt
GROUP BY J1.job_id, C1.candidate_id;

翻译成ORACLE语法:

SELECT J1.job_id, C1.candidate_id
FROM (SELECT job_id, skill_group, COUNT(*) grp_cnt
       FROM JobOrders
      GROUP BY job_id, skill_group) J1
     CROSS JOIN
     (SELECT R1.job_id, R1.skill_group, S1.candidate_id,
             COUNT(*) candidate_cnt
       FROM JobOrders R1, CandidateSkills S1
       WHERE R1.skill_code = S1.skill_code
      GROUP BY R1.job_id, R1.skill_group, S1.candidate_id) C1
WHERE J1.job_id = C1.job_id
      AND J1.skill_group = C1.skill_group
      AND J1.grp_cnt = C1.candidate_cnt
GROUP BY J1.job_id, C1.candidate_id;
其实和#2并无本质区别,还是用的COUNT相等这个判断方法。CROSS JOIN有些莫名其妙,明明下面有连接条件。

使用道具 举报

回复
论坛徽章:
9607
土豪章
日期:2013-12-31 14:11:39土豪章
日期:2013-12-31 14:11:39阿森纳
日期:2013-06-03 17:00:31阿森纳
日期:2013-10-11 09:27:58法拉利
日期:2013-12-27 15:20:30林肯
日期:2013-12-27 15:19:09法拉利
日期:2013-12-27 15:20:30法拉利
日期:2013-12-27 15:20:30法拉利
日期:2013-12-27 15:20:30法拉利
日期:2013-12-27 15:20:30
28#
发表于 2010-3-14 18:31 | 只看该作者
顶一个

使用道具 举报

回复
论坛徽章:
407
紫蛋头
日期:2012-05-21 10:19:41迷宫蛋
日期:2012-06-06 16:02:49奥运会纪念徽章:足球
日期:2012-06-29 15:30:06奥运会纪念徽章:排球
日期:2012-07-10 21:24:24鲜花蛋
日期:2012-07-16 15:24:59奥运会纪念徽章:拳击
日期:2012-08-07 10:54:50奥运会纪念徽章:羽毛球
日期:2012-08-21 15:55:33奥运会纪念徽章:蹦床
日期:2012-08-21 21:09:51奥运会纪念徽章:篮球
日期:2012-08-24 10:29:11奥运会纪念徽章:体操
日期:2012-09-07 16:40:00
29#
发表于 2010-3-14 19:40 | 只看该作者
1

题目——带答案.rar

1.59 MB, 下载次数: 99

使用道具 举报

回复
论坛徽章:
520
奥运会纪念徽章:垒球
日期:2008-09-15 01:28:12生肖徽章2007版:鸡
日期:2008-11-17 23:40:58生肖徽章2007版:马
日期:2008-11-18 05:09:48数据库板块每日发贴之星
日期:2008-11-29 01:01:02数据库板块每日发贴之星
日期:2008-12-05 01:01:03生肖徽章2007版:虎
日期:2008-12-10 07:47:462009新春纪念徽章
日期:2009-01-04 14:52:28数据库板块每日发贴之星
日期:2009-02-08 01:01:03生肖徽章2007版:蛇
日期:2009-03-09 22:18:532009日食纪念
日期:2009-07-22 09:30:00
30#
 楼主| 发表于 2010-3-15 10:33 | 只看该作者
40 PERMUTATIONS 排列组合

题目大意:求出 N 个数的全排列

CREATE TABLE Elements
(i INTEGER NOT NULL PRIMARY KEY);
INSERT INTO Elements VALUES (1);
INSERT INTO Elements VALUES (2);
INSERT INTO Elements VALUES (3);
INSERT INTO Elements VALUES (4);
INSERT INTO Elements VALUES (5);
INSERT INTO Elements VALUES (6);
INSERT INTO Elements VALUES (7);

Answer #1
The obvious and horrible answer is:
SELECT E1.i, E2.i, E3.i, E4.i, E5.i, E6.i, E7.i
FROM Elements AS E1, Elements AS E2, Elements AS E3,
Elements AS E4, Elements AS E5, Elements AS E6,
Elements AS E7
WHERE E1.i NOT IN (E2.i, E3.i, E4.i, E5.i, E6.i, E7.i)
AND E2.i NOT IN (E1.i, E3.i, E4.i, E5.i, E6.i, E7.i)
AND E3.i NOT IN (E1.i, E2.i, E4.i, E5.i, E6.i, E7.i)
AND E4.i NOT IN (E1.i, E2.i, E3.i, E5.i, E6.i, E7.i)
AND E5.i NOT IN (E1.i, E2.i, E3.i, E4.i, E6.i, E7.i)
AND E6.i NOT IN (E1.i, E2.i, E3.i, E4.i, E5.i, E7.i)
AND E7.i NOT IN (E1.i, E2.i, E3.i, E4.i, E5.i, E6.i);

Answer #2
An improvement on this query can be made by adding one more
predicate to the WHERE clause:
SELECT E1.i, E2.i, E3.i, E4.i, E5.i, E6.i, E7.i
FROM Elements AS E1, Elements AS E2, Elements AS E3,
Elements AS E4, Elements AS E5, Elements AS E6,
Elements AS E7
WHERE (E1.i + E2.i + E3.i + E4.i + E5.i + E6.i + E7.i)
= 28
AND E1.i NOT IN (E2.i, E3.i, E4.i, E5.i, E6.i, E7.i)
AND E2.i NOT IN (E1.i, E3.i, E4.i, E5.i, E6.i, E7.i)
AND E3.i NOT IN (E1.i, E2.i, E4.i, E5.i, E6.i, E7.i)
AND E4.i NOT IN (E1.i, E2.i, E3.i, E5.i, E6.i, E7.i)
AND E5.i NOT IN (E1.i, E2.i, E3.i, E4.i, E6.i, E7.i)
AND E6.i NOT IN (E1.i, E2.i, E3.i, E4.i, E5.i, E7.i)
AND E7.i NOT IN (E1.i, E2.i, E3.i, E4.i, E5.i, E6.i);
作者说加上第一个谓词判断可以快速过滤掉一些组合,但我在ORACLE中比较结果并无改善。

Answer #3
But let's carry the totals trick one step further. First, redefine the
Elements table to have a weight for each element in the set:
CREATE TABLE Elements
(i INTEGER NOT NULL
,wgt INTEGER NOT NULL);

INSERT INTO Elements VALUES (1, 1) ;
INSERT INTO elements VALUES (2, 2) ;
INSERT INTO elements VALUES (3, 4) ;
INSERT INTO elements VALUES (4, 8) ;
INSERT INTO elements VALUES (5, 16);
INSERT INTO elements VALUES (6, 32);
INSERT INTO elements VALUES (7, 64);

SELECT E1.i, E2.i, E3.i, E4.i, E5.i, E6.i, E7.i
FROM Elements AS E1, Elements AS E2, Elements AS E3,
Elements AS E4, Elements AS E5, Elements AS E6,
Elements AS E7
WHERE (E1.wgt + E2.wgt + E3.wgt + E4.wgt
+ E5.wgt + E6.wgt + E7.wgt) = 127;

加上了以2的N次幂作为权重,为了快速判断各元素仅出现一次。
听起来不错,不知为何在ORACLE中行不通,SQL计划的成本和逻辑读比上个答案增加了十几倍!

利用这个加权的思路,我重新写了一种解法,把7表连接改为两两连接,每次都尽量过滤掉一些数据:
WITH r2 AS (
SELECT e1.i i1,e2.i i2,e1.wgt + e2.wgt as wgt
  FROM Elements e1, Elements e2
WHERE e1.i<>e2.i
)
,r3 AS (
SELECT r2.i1,r2.i2,e.i i3,r2.wgt+e.wgt as wgt
  FROM r2,Elements e
WHERE e.i NOT IN (r2.i1,r2.i2)
)
,r4 AS (
SELECT r21.i1,r21.i2,r22.i1 i3,r22.i2 i4,r21.wgt+r22.wgt as wgt
  FROM r2 r21,r2 r22
WHERE BITAND(r21.wgt,r22.wgt)=0
)
SELECT r3.i1,r3.i2,r3.i3,r4.i1 i4,r4.i2 i5,r4.i3 i6,r4.i4 i7
  FROM r3,r4
WHERE BITAND(r3.wgt,r4.wgt)=0
;
  

Answer #4
为了在ORACLE中实现这个答案,我们必须先写一个自定义函数,它的功能是在一个字符串中的指定位置塞入另一个串:
CREATE OR REPLACE FUNCTION STUFF(
       p_str IN VARCHAR2
      ,p_pos IN NUMBER
      ,p_len IN NUMBER
      ,p_new IN VARCHAR2
      ) RETURN VARCHAR2
AS
BEGIN
   RETURN SUBSTR(p_str,1,p_pos-1)||p_new||SUBSTR(p_str,p_pos+p_len);
END STUFF;
/
在本例中,p_new总是为NULL, 实际上是在指定位置删除一个字符。

答案#4:
SELECT a || c
FROM (SELECT a || SUBSTR(c , i , 1) a,
             STUFF(c, i, 1, '') c
        FROM Elements,
             (SELECT a || SUBSTR(c , i , 1) a,
                     STUFF(c, i, 1, '') c
                FROM Elements,
                     (SELECT a || SUBSTR(c , i , 1) a,
                             STUFF(c, i, 1, '') c
                       FROM Elements,
                            (SELECT a || SUBSTR(c , i , 1) a,
                                    STUFF(c, i, 1, '') c
                               FROM Elements,
                                    (SELECT a || SUBSTR(c, i, 1) a,
                                            STUFF(c,i, 1, '') c
                                       FROM Elements,
                                            (SELECT SUBSTR('1234567', i, 1) a,
                                                    STUFF('1234567',i, 1, '') c
                                               FROM Elements
                                              WHERE i <= 7) T1
                                       WHERE i <= 6) T2
                                 WHERE i <= 5) T3
                       WHERE i <= 4) T4
                WHERE i <= 3) T5
          WHERE i <= 2) T6;
它也是把7表连接改为两两连接,但是很巧妙,每次连接中总有一个集合很小。

Answer #5
其实和上一种一样,但是我觉得可读性更差。
SELECT SUBSTR('1234567', a, 1) ||
       SUBSTR(STUFF('1234567', a, 1, ''), b, 1) ||
       SUBSTR(STUFF(STUFF('1234567', a, 1, ''), b, 1,''), c,1) ||
       SUBSTR(STUFF(STUFF(STUFF('1234567',a, 1, ''), b, 1, ''), c, 1, ''), d, 1) ||
       SUBSTR(STUFF(STUFF(STUFF(STUFF('1234567',a, 1, ''), b, 1, ''), c, 1, ''), d, 1, ''), e, 1) ||
       SUBSTR(STUFF(STUFF(STUFF(STUFF(STUFF('1234567',a, 1, ''), b, 1, ''), c, 1, ''), d, 1, ''), e, 1,''),f, 1) ||
       STUFF(STUFF(STUFF(STUFF(STUFF(STUFF('1234567',a, 1, ''), b, 1, ''), c, 1, ''), d, 1, ''), e, 1,''),f, 1, '')
  FROM (SELECT i a
         FROM Elements
        WHERE i <= 7) T1,
       (SELECT i b
          FROM Elements
        WHERE i <= 6) T2,
       (SELECT i c
         FROM Elements
        WHERE i <= 5) T3,
       (SELECT i d
          FROM Elements
         WHERE i <= 4) T4,
       (SELECT i e
          FROM Elements
         WHERE i <= 3) T5,
       (SELECT i f
          FROM Elements
         WHERE i <= 2) T6;

最后是我自己的方法,得力于10G的NOCYCLE:
SELECT SYS_CONNECT_BY_PATH(i,',')
  FROM Elements
WHERE LEVEL=7
CONNECT BY NOCYCLE i <> PRIOR i;

使用道具 举报

回复

您需要登录后才可以回帖 登录 | 注册

本版积分规则 发表回复

TOP技术积分榜 社区积分榜 徽章 团队 统计 知识索引树 积分竞拍 文本模式 帮助
  ITPUB首页 | ITPUB论坛 | 数据库技术 | 企业信息化 | 开发技术 | 微软技术 | 软件工程与项目管理 | IBM技术园地 | 行业纵向讨论 | IT招聘 | IT文档
  ChinaUnix | ChinaUnix博客 | ChinaUnix论坛
CopyRight 1999-2011 itpub.net All Right Reserved. 北京盛拓优讯信息技术有限公司版权所有 联系我们 未成年人举报专区 
京ICP备16024965号-8  北京市公安局海淀分局网监中心备案编号:11010802021510 广播电视节目制作经营许可证:编号(京)字第1149号
  
快速回复 返回顶部 返回列表