//一个表有一个ID字段,是主键并且是递增的,现在需要一条SQL把这个表没有的ID查出来//例如ID的记录如下:ID124579//我们SQL需要把3,6,8这三个值查出来//这是一个典型的求集合的差集问题:with ta as(select 1 id from dual union allselect 2 from dual union allselect 4 from dual union allselect 5 from dual union allselect 7 from dual union allselect 9 from dual)select level idfrom dualconnect by level<=(select max(id)-min(id)+1 from ta)minusselect id from ta/ID----------368//对于此问题,这里ta中数据很小量的,如果ta的最后一个元素为1000001呢?//那么怎么样的查询才算高效呢?//求集合的差集还有其他方法吗?//下面我们来看看类似情况(求连续递增的数中没有出现的数)的差集怎么的:create table t asselect 1 id from dual union allselect 2 from dual union allselect 3 from dual union allselect 4 from dual union allselect 456 from dual union allselect 10145 from dual union allselect 1044653 from dual///方法一:使用minusselect count(*) from (select level idfrom dualconnect by level<=(select max(id)-min(id)+1 from t)minusselect id from t)/COUNT(*)----------1044646Executed in 1.547 seconds//方法二:使用not existsselect count(*)from (select *from (select level idfrom dualconnect by level<=(select max(id)-min(id)+1 from t)) awhere not exists(select 1from t bwhere a.id=b.id))/COUNT(*)----------1044646Executed in 2.157 seconds//方法三:使用not inselect count(*)from (select *from (select level idfrom dualconnect by level<=(select max(id)-min(id)+1 from t)) awhere a.id not in(select b.idfrom t bwhere a.id=b.id))/COUNT(*)----------1044646Executed in 8.39 seconds//从这里看出,在处理大量数据时,相比于exists,not in(in)的效率是相当低的//所以建议在应用中要尽量使用exists,少用in//因为in要进行元素匹配,对比,而exists只需要判断存在性即可//方法四:使用lag()分析函数select count(*)from (with temp as(select s,efrom (select lag(id) over(order by id)+1 s,id-1 e from t)where e - s >= 0)select a.s + b.rn -1 hfrom temp a,(select rownum rnfrom (select max(e-s+1) gap from temp)connect by rownum <= gap) bwhere a.s + b.rn-1 <= a.eorder by 1)/COUNT(*)----------1044646Executed in 10.313 secondsSQL> set time on;7:11:37 SQL> /COUNT(*)----------1044646Executed in 10.156 seconds7:11:50 SQL> //此方法效率最低,因为我们这里用到了lag()分析函数,max(),min(),以及with临时表,//在一个查询中,使用的函数越多,就会增加oracle的负担,//所以oracle engine在查理此查询时,也需要额外的开销(时间和资源)