Data Guard 9i Configuring Transparent Application Failover in a Data Guard Environment [ID 205637.1]
http://blog.csdn.net/xujinyang/article/details/6925073
在这篇文章里,把TAF分2种:connect time failover and application failover.
(1)Connect Time Failover
Connect time failover will reroute incoming connections to the instance that has just become primary.This type of failover should work in cases where the old primary node is down, old primary network is down, old primary listener is down, or old primaryinstance is now the standby.
When the old primary network is down, failover functionality is built into the basic layer of Oracle Net.We simply tcp timeout and fail to the next host in the list.Changing the tcp timeout parameters will determine the speed at which failover occurs.However,the basic configuration of connect time failover is not sufficient for the remaining failure scenarios.
Using the above alias, the failover (graceful or forced will work correctly if the old primary instance/listener is down.However it does not work correctly for the switchover scenario.With switchover, the old primary is now the standby and the old standbyis now the primary.When we issue the connection to the old primary node -- now running as a mounted standby -- we receive the following error:
ORA-01033: ORACLE initialization or shutdown in progress
This is expected behavior.Connect time failover is not programmed to failover on this error.
这种Failover下switchover的切换问题的解决方法:
We can solve this by setting following parameters in the init.ora files:
Primary init.ora:instance_name=DGD_P
Standby init.ora:instance_name=DGD_S
After a switchover, as the standby and primary databases are brought up,PMON will register the service_names AND the instance_name.We also must change the TNS service name to look for these values:
Now when we connect to the old primary/new standby, we get the following error:
'ORA-12521 TNS:Listener could not resolve INSTANCE_NAME given in connect descriptor'
At this point the connection failsover to the second host in the address list. This final connection attempt succeeds as the proper instance_name (DGD_P) is present.
Note thatthe DBA must maintain two init.ora's to maintain the seperate instance_name values or alter parameter with the alter system command once the instance has opened.
For application failover,all existing connections from the current primary must failover to the new primary.One of the biggest obstacles to overcome is the lag time from when the standby databse becomes the primary database.
Client connections should continue to retry the failover until the standby has been opened as the new production.This can be configured by having an alias similar to the following:
DGD_TAF=
(DESCRIPTION=
(address_list=
(load_balance=off)
(failover=on)
(ADDRESS=(PROTOCOL=TCP)(Host=hasunclu1)(Port=1521))
(ADDRESS=(PROTOCOL=TCP)(Host=hasunclu2)(Port=1521))
)
(CONNECT_DATA=
(SERVICE_NAME=DGD)
(FAILOVER_MODE=
(TYPE=session)
(METHOD=BASIC)
(RETRIES=180)
(DELAY =5)))
)
With this alias, TAF will try to failover to the second node in the address_list. If it cannot connect, it will wait five seconds and retry again.It will retry a total of 180 times.This delay will provide the DBA with enough time to perform a switchoveror activate the standby as the new production.This timing can be adjusted to suit your environment and should be tested accordingly.
网友lwd118还提到了LOAD_BALANCE这个参数,在下面的文章中提到了这个参数:
Failover Connections for Data Guard Error with ORA-1033 [ID 461874.1]
http://blog.csdn.net/xujinyang/article/details/6925102
Error ORA-1033 is expected for connections when they attempt to connect to standby instance, because it is mounted and not open.
When LOAD_BALANCE is set, connections can load balance between the addresses in the net service name. When there are two description sections in a net service name, Oracle Net will load balance between them.
3.6新开窗口连接
C:/Users/Administrator.DavidDai>sqlplus /nolog
SQL*Plus: Release 11.2.0.1.0 Production on星期日12月19 19:05:21 2010
Copyright (c) 1982, 2010, Oracle.All rights reserved.
SQL> conn sys/oracle@dg_sf as sysdba;
已连接。
SQL>select db_unique_name,open_mode from v$database;
DB_UNIQUE_NAMEOPEN_MODE
------------------------------ ----------
orcl_stMOUNTED
把原主库的监听停掉:
[oracle@dg2 bdump]$ lsnrctl stop
LSNRCTL for Linux: Version 10.2.0.1.0 - Production on 18-DEC-2010 03:21:58
Copyright (c) 1991, 2005, Oracle.All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=dg2)(PORT=1521)))
The command completed successfully
注意:这里停数据库是不行的,必须要停监听。
在查:
C:/Users/Administrator.DavidDai>sqlplus /nolog
SQL*Plus: Release 11.2.0.1.0 Production on星期日12月19 19:02:57 2010
Copyright (c) 1982, 2010, Oracle.All rights reserved.
SQL> conn sys/oracle@dg_sf as sysdba;
已连接。
SQL>select db_unique_name,open_mode from v$database;
DB_UNIQUE_NAMEOPEN_MODE
------------------------------ ----------
orcl_pdREAD WRITE
SQL>
把主库的监听停掉之后,客户端就切换到备库去了。这里就像lwd118网友说的那样,对于switchover,有问题。这个问题的解决方法在Connect Time Failover中已经做了说明,就是在主备库里分别指定instance_name。然后在客户端的监听里也修改相关的参数。