Data Guard 9i Configuring Transparent Application Failover in a Data Guard Environment [ID 205637.1]
http://blog.csdn.net/tianlesoftware/archive/2010/12/17/6081893.aspx
在这篇文章里,把TAF分2种: connect time failover and application failover.
(1)Connect Time Failover
Connect time failover will reroute incoming connections to the instance that has just become primary. This type of failover should work in cases where the old primary node is down, old primary network is down, old primary listener is down, or old primary instance is now the standby.
When the old primary network is down, failover functionality is built into the basic layer of Oracle Net. We simply tcp timeout and fail to the next host in the list. Changing the tcp timeout parameters will determine the speed at which failover occurs. However, the basic configuration of connect time failover is not sufficient for the remaining failure scenarios.
Using the above alias, the failover (graceful or forced will work correctly if the old primary instance/listener is down. However it does not work correctly for the switchover scenario. With switchover, the old primary is now the standby and the old standby is now the primary. When we issue the connection to the old primary node -- now running as a mounted standby -- we receive the following error:
ORA-01033: ORACLE initialization or shutdown in progress
This is expected behavior. Connect time failover is not programmed to failover on this error.
这种Failover下switchover的切换问题的解决方法:
We can solve this by setting following parameters in the init.ora files:
Primary init.ora: instance_name=DGD_P
Standby init.ora: instance_name=DGD_S
After a switchover, as the standby and primary databases are brought up, PMON will register the service_names AND the instance_name. We also must change the TNS service name to look for these values:
Now when we connect to the old primary/new standby, we get the following error:
'ORA-12521 TNS:Listener could not resolve INSTANCE_NAME given in connect descriptor'
At this point the connection failsover to the second host in the address list. This final connection attempt succeeds as the proper instance_name (DGD_P) is present.
Note that the DBA must maintain two init.ora's to maintain the seperate instance_name values or alter parameter with the alter system command once the instance has opened.
For application failover, all existing connections from the current primary must failover to the new primary. One of the biggest obstacles to overcome is the lag time from when the standby databse becomes the primary database.
Client connections should continue to retry the failover until the standby has been opened as the new production. This can be configured by having an alias similar to the following:
DGD_TAF=
(DESCRIPTION=
(address_list=
(load_balance=off)
(failover=on)
(ADDRESS=(PROTOCOL=TCP)(Host=hasunclu1)(Port=1521))
(ADDRESS=(PROTOCOL=TCP)(Host=hasunclu2)(Port=1521))
)
(CONNECT_DATA=
(SERVICE_NAME=DGD)
(FAILOVER_MODE=
(TYPE=session)
(METHOD=BASIC)
(RETRIES=180)
(DELAY =5)))
)
With this alias, TAF will try to failover to the second node in the address_list. If it cannot connect, it will wait five seconds and retry again. It will retry a total of 180 times. This delay will provide the DBA with enough time to perform a switchover or activate the standby as the new production. This timing can be adjusted to suit your environment and should be tested accordingly.
网友lwd118 还提到了LOAD_BALANCE 这个参数,在下面的文章中提到了这个参数:
Failover Connections for Data Guard Error with ORA-1033 [ID 461874.1]
http://blog.csdn.net/tianlesoftware/archive/2010/12/17/6081898.aspx
Error ORA-1033 is expected for connections when they attempt to connect to standby instance, because it is mounted and not open.
When LOAD_BALANCE is set, connections can load balance between the addresses in the net service name. When there are two description sections in a net service name, Oracle Net will load balance between them.
3.4 Switchover 切换
主库:
SQL> SELECT SWITCHOVER_STATUS FROM V$DATABASE;
SWITCHOVER_STATUS
--------------------
SESSIONS ACTIVE
SQL> Alter database commit to switchover to physical standby with session shutdown;
Database altered.
SQL> shutdown immediate
ORA-01507: database not mounted
ORACLE instance shut down.
SQL>
备库:
SQL> SELECT SWITCHOVER_STATUS FROM V$DATABASE;
SWITCHOVER_STATUS
--------------------
TO PRIMARY
SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;
Database altered.
SQL> shutdown immediate
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
SQL>
启动主备库,注意顺序:先备后主。
3.5 在之前的session里继续查询
SQL> select db_unique_name,open_mode from v$database;
3.6 新开窗口连接
C:/Users/Administrator.DavidDai>sqlplus /nolog
SQL*Plus: Release 11.2.0.1.0 Production on 星期日 12月 19 19:05:21 2010
Copyright (c) 1982, 2010, Oracle. All rights reserved.
SQL> conn sys/oracle@dg_sf as sysdba;
已连接。
SQL> select db_unique_name,open_mode from v$database;
DB_UNIQUE_NAME OPEN_MODE
------------------------------ ----------
orcl_st MOUNTED
把原主库的监听停掉:
[oracle@dg2 bdump]$ lsnrctl stop
LSNRCTL for Linux: Version 10.2.0.1.0 - Production on 18-DEC-2010 03:21:58
Copyright (c) 1991, 2005, Oracle. All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=dg2)(PORT=1521)))
The command completed successfully
注意:这里停数据库是不行的,必须要停监听。
在查:
C:/Users/Administrator.DavidDai>sqlplus /nolog
SQL*Plus: Release 11.2.0.1.0 Production on 星期日 12月 19 19:02:57 2010
Copyright (c) 1982, 2010, Oracle. All rights reserved.
SQL> conn sys/oracle@dg_sf as sysdba;
已连接。
SQL> select db_unique_name,open_mode from v$database;
DB_UNIQUE_NAME OPEN_MODE
------------------------------ ----------
orcl_pd READ WRITE
SQL>
把主库的监听停掉之后,客户端就切换到备库去了。 这里就像lwd118 网友说的那样, 对于switchover,有问题。 这个问题的解决方法在 Connect Time Failover中已经做了说明,就是在主备库里分别指定instance_name。 然后在客户端的监听里也修改相关的参数。