2. start with connect by 语法结构
如上面说看到的 例子, 其语法结构为 start with condition connect by condition (含 prior 关键字)
start with conditon 给出的seed 数据的范围, connect by 后面给出了递归查询的条件,prior 关键字表示父数据,prior 条件表示子数据需要满足父数据的什么条件【要的时候有两种写法:connect by prior day_number=day_number-1 或 connect by day_number=prior day_number-1,前一种写法表示采用自上而下的搜索方式(先找父节点然后找子节点),后一种写法表示采用自下而上的搜索方式(先找叶子节点然后找父节点)】。
在下面的这个start with connect by 结构中,就表示 查找出了从1开始,并且day_number 逐渐+1 递增的,并且 msisdn 相同的那些个数据.
start with day_number=1
connect by prior day_number=day_number-1 and prior msisdn= msisdn
SQL> explain plan for
2
2 select * from test
3 --where bill_month='200803'
4 start with day_number=1
5 connect by prior day_number=day_number-1 and prior msisdn= msisdn
6 ;
Explained
SQL> select * from table( dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
-------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
-------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | |
|* 1 | CONNECT BY WITH FILTERING| | | | |
|* 2 | FILTER | | | | |
| 3 | TABLE ACCESS FULL | TEST | | | |
| 4 | NESTED LOOPS | | | | |
| 5 | BUFFER SORT | | | | |
| 6 | CONNECT BY PUMP | | | | |
|* 7 | TABLE ACCESS FULL | TEST | | | |
| 8 | TABLE ACCESS FULL | TEST | | | |
-------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("TEST"."DAY_NUMBER"=1)
2 - filter("TEST"."DAY_NUMBER"=1)
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
7 - filter("TEST"."MSISDN"=NULL AND "TEST"."DAY_NUMBER"-1=NULL)
Note: rule based optimization
23 rows selected
SQL>
另外,发现了在含有其他条件的递归中,是先处理所有的递归查询,最后才用加入的条件过滤.
请看下面的例子。
和上面的执行计划对比下我们可以知道,加入条件 where bill_month='200803' 后,实际上却是在递归完成后,最后才执行的 1 - filter("TEST"."BILL_MONTH"='200803') 。
所以,为了确保语句的性能,不要直接加入条件在start with connect by 结构中,而是要想办法将原表的数据控制住。这个可以采用子查询的办法,或者使用临时表等(最好采用临时表,将数据量从本源上控制住;因为从子查询的执行计划我们可以看到,它每次也都是访问全表,再用条件过滤,要重复三次,不是一次过滤就够了).
--直接加入条件后的执行计划
SQL> explain plan for
2
2 select * from test
3 where bill_month='200803'
4 start with day_number=1
5 connect by prior day_number=day_number-1 and prior msisdn= msisdn
6 ;
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
2 - filter("TEST"."DAY_NUMBER"=1)
3 - filter("TEST"."DAY_NUMBER"=1)
8 - filter("TEST"."MSISDN"=NULL AND "TEST"."DAY_NUMBER"-1=NULL)
Note: rule based optimization
25 rows selected
SQL>
--使用子查询,将过滤条件嵌在子查询中
SQL> explain plan for
2
2 select * from (select * from test
3 where bill_month='200803')
4 start with day_number=1
5 connect by prior day_number=day_number-1 and prior msisdn= msisdn
6 ;
Explained
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
-------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
-------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | |
|* 1 | CONNECT BY WITH FILTERING| | | | |
|* 2 | FILTER | | | | |
|* 3 | TABLE ACCESS FULL | TEST | | | |
| 4 | NESTED LOOPS | | | | |
| 5 | BUFFER SORT | | | | |
| 6 | CONNECT BY PUMP | | | | |
|* 7 | TABLE ACCESS FULL | TEST | | | |
|* 8 | TABLE ACCESS FULL | TEST | | | |
-------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("TEST"."DAY_NUMBER"=1)
2 - filter("TEST"."DAY_NUMBER"=1)
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
3 - filter("TEST"."BILL_MONTH"='200803')
7 - filter("TEST"."BILL_MONTH"='200803' AND "TEST"."MSISDN"=NULL AND
"TEST"."DAY_NUMBER"-1=NULL)
8 - filter("TEST"."BILL_MONTH"='200803')
Note: rule based optimization
SQL>
SQL> select distinct msisdn from test a
2 where bill_month='200803'
3 and exists
4 ( select msisdn from test
5 where bill_month='200803' and msisdn=a.msisdn
6 start with day_number=a.day_number
7 connect by prior day_number=day_number-1 and prior msisdn= msisdn
8 group by msisdn
9 having count(*)>=3
10 );
SQL>
SQL> select distinct msisdn from test a
2 where bill_month='200803'
3 and exists
4 ( select msisdn from test
5 where bill_month='200803' and msisdn=a.msisdn
6 start with day_number=a.day_number
7 connect by prior day_number=day_number-1 and prior msisdn= msisdn
8 group by msisdn
9 having count(*)>=3
10 );