设为首页 收藏本站
查看: 711|回复: 0

[经验分享] Oracle Data Guard TAF(Transparent Application Failover ) 配置说明

[复制链接]

尚未签到

发表于 2016-8-7 06:34:04 | 显示全部楼层 |阅读模式
. TAF 分类

之前写了篇TAF的说明,参考:
Oracle Transparent Application Failover(TAF) 说明
http://blog.csdn.net/tianlesoftware/archive/2010/12/13/6072294.aspx

网友 lwd118 对DG 测试这块提了一些建议。看的也不是很清楚,所以去metalink上搜了一下。

Data Guard 9i Configuring Transparent Application Failover in a Data Guard Environment [ID 205637.1]
http://blog.csdn.net/tianlesoftware/archive/2010/12/17/6081893.aspx


在这篇文章里,把TAF分2种: connect time failover and application failover.

1Connect Time Failover

Connect time failover will reroute incoming connections to the instance that has just become primary. This type of failover should work in cases where the old primary node is down, old primary network is down, old primary listener is down, or old primary instance is now the standby.

When the old primary network is down, failover functionality is built into the basic layer of Oracle Net. We simply tcp timeout and fail to the next host in the list. Changing the tcp timeout parameters will determine the speed at which failover occurs. However, the basic configuration of connect time failover is not sufficient for the remaining failure scenarios.

Consider the following service name:
DGD =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = hasunclu1)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = hasunclu2)(PORT = 1521))
)
(CONNECT_DATA =
(SERVICE_NAME = DGD)
)
)

Using the above alias, the failover (graceful or forced will work correctly if the old primary instance/listener is down. However it does not work correctly for the switchover scenario. With switchover, the old primary is now the standby and the old standby is now the primary. When we issue the connection to the old primary node -- now running as a mounted standby -- we receive the following error:
ORA-01033: ORACLE initialization or shutdown in progress

-- 这种配置不适用于switchover,因为switchover之后,主库变成了mounted standby 但是我们的连接还是会连接主库,这样进行连接时就会报如上错误。

This is expected behavior. Connect time failover is not programmed to failover on this error.

这种Failoverswitchover的切换问题的解决方法:
We can solve this by setting following parameters in the init.ora files:
Primary init.ora: instance_name=DGD_P
Standby init.ora: instance_name=DGD_S

After a switchover, as the standby and primary databases are brought up, PMON will register the service_names AND the instance_name. We also must change the TNS service name to look for these values:

DGD =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = hasunclu1)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = hasunclu2)(PORT = 1521))
)
(CONNECT_DATA =
(SID=DGD_P)
(SERVICE_NAME = DGD)
)
)

Now when we connect to the old primary/new standby, we get the following error:
'ORA-12521 TNS:Listener could not resolve INSTANCE_NAME given in connect descriptor'

At this point the connection failsover to the second host in the address list. This final connection attempt succeeds as the proper instance_name (DGD_P) is present.

Note that the DBA must maintain two init.ora's to maintain the seperate instance_name values or alter parameter with the alter system command once the instance has opened.

这种解决方法就是通过指定不同的Instance_name。 主备库的实例名不一样, 当主备库发生切换之后,在客户端的连接里也修改相应的参数。 所以还是需要人工的参与。 所以switchover 的TAF 意义不并不很大。
还有,在这块,metalink上的是instance_name=sid. 其实这块应该是SID=instance_name。 在tnsnames.ora 文件中,只有SID 和service_name 这2个参数。 我用的是10g。 没有9i的环境进行测试。
个人的理解,既然是需要人工参与,直接修改service_name这个参数就可以了。 关键的是要理解这种思想,就是主备库的实例名不一样。 做DG 切换之后,通过修改监听的参数,来完成连接。 至于service_name还是SID,都可以。 只要保证修改之后能正常连接就可以了。


2Application Failover:

For application failover, all existing connections from the current primary must failover to the new primary. One of the biggest obstacles to overcome is the lag time from when the standby databse becomes the primary database.

-- 这种切换将存在的所有连接都转移到新的主库。 但是最大的问题是备库切换到主库的时间上。 解决方法就是通过设置参数多连几次。

Client connections should continue to retry the failover until the standby has been opened as the new production. This can be configured by having an alias similar to the following:
DGD_TAF=
(DESCRIPTION=
(address_list=
(load_balance=off)
(failover=on)
(ADDRESS=(PROTOCOL=TCP)(Host=hasunclu1)(Port=1521))
(ADDRESS=(PROTOCOL=TCP)(Host=hasunclu2)(Port=1521))
)
(CONNECT_DATA=
(SERVICE_NAME=DGD)
(FAILOVER_MODE=
(TYPE=session)
(METHOD=BASIC)
(RETRIES=180)
(DELAY =5)))
)

With this alias, TAF will try to failover to the second node in the address_list. If it cannot connect, it will wait five seconds and retry again. It will retry a total of 180 times. This delay will provide the DBA with enough time to perform a switchover or activate the standby as the new production. This timing can be adjusted to suit your environment and should be tested accordingly.


网友lwd118 还提到了LOAD_BALANCE 这个参数,在下面的文章中提到了这个参数:

Failover Connections for Data Guard Error with ORA-1033 [ID 461874.1]
http://blog.csdn.net/tianlesoftware/archive/2010/12/17/6081898.aspx

Error ORA-1033 is expected for connections when they attempt to connect to standby instance, because it is mounted and not open.
When LOAD_BALANCE is set, connections can load balance between the addresses in the net service name. When there are two description sections in a net service name, Oracle Net will load balance between them.

在上面配置Failover参数是是把LOAD_BALANCE这个参数off了。 这个参数默认是ON的。 所以必须显示的off。

设置OFF的作用,就是避免自动的负载均衡。 因为在监听的配置中有2个地址,原主库是mount standby的,原备库是open的。 这样,如果连主库,是没有问题的,如果连备库就会报错:
ORA-01033: ORACLE initialization or shutdown in progress.

我在TAF文章里的测试的过程很简单,就是把备库直接shutdown了。我的配置是参考Application Failover,不过没有禁用Load_balance, 也没有进行open的测试。所以网友lwd118提了一些建议。


. 测试Application Failover 下的Switchover
从上面的分析来看,Application Failover更适用一点,下面就这种情况下做一个Switchover的测试。

3.1 先看一下Data Gard的情况:

主库:
SQL> select max(sequence#) from v$archived_log;
MAX(SEQUENCE#)
--------------
124
SQL> select open_mode from v$database;
OPEN_MODE
----------
READ WRITE

备库:
SQL> select max(sequence#) from v$archived_log;
MAX(SEQUENCE#)
--------------
124
SQL> select open_mode from v$database;
OPEN_MODE
----------
MOUNTED

3.2 客户端监听配置:
DG_SF =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.3.3)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.3.2)(PORT = 1521))
(LOAD_BALANCE = OFF)
(FAILOVER=on)
)
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = orcl)
(FAILOVER_MODE =
(TYPE = SELECT)
(METHOD = BASIC)
  (RETRIES=180)
(DELAY =5)
)
)
)


3.3 在客户端连接DG
C:/Users/Administrator.DavidDai>sqlplus /nolog
SQL*Plus: Release 11.2.0.1.0 Production on 星期五 12月 17 14:35:55 2010
Copyright (c) 1982, 2010, Oracle. All rights reserved.
SQL> conn sys/oracle@dg_sf as sysdba;
已连接。
SQL> select db_unique_name,open_mode from v$database;

DB_UNIQUE_NAME OPEN_MODE
------------------------------ ----------
orcl_st READ WRITE


3.4 Switchover 切换
主库:
SQL> SELECT SWITCHOVER_STATUS FROM V$DATABASE;
SWITCHOVER_STATUS
--------------------
SESSIONS ACTIVE
SQL> Alter database commit to switchover to physical standby with session shutdown;
Database altered.
SQL> shutdown immediate
ORA-01507: database not mounted
ORACLE instance shut down.
SQL>

备库:
SQL> SELECT SWITCHOVER_STATUS FROM V$DATABASE;
SWITCHOVER_STATUS
--------------------
TO PRIMARY
SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;
Database altered.
SQL> shutdown immediate
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
SQL>

启动主备库,注意顺序:先备后主。

3.5 在之前的session里继续查询
SQL> select db_unique_name,open_mode from v$database;

DB_UNIQUE_NAME OPEN_MODE
------------------------------ ----------
orcl_st MOUNTED


3.6 新开窗口连接
C:/Users/Administrator.DavidDai>sqlplus /nolog
SQL*Plus: Release 11.2.0.1.0 Production on 星期日 12月 19 19:05:21 2010
Copyright (c) 1982, 2010, Oracle. All rights reserved.
SQL> conn sys/oracle@dg_sf as sysdba;
已连接。
SQL> select db_unique_name,open_mode from v$database;
DB_UNIQUE_NAME OPEN_MODE
------------------------------ ----------
orcl_st MOUNTED


把原主库的监听停掉:
[oracle@dg2 bdump]$ lsnrctl stop
LSNRCTL for Linux: Version 10.2.0.1.0 - Production on 18-DEC-2010 03:21:58
Copyright (c) 1991, 2005, Oracle. All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=dg2)(PORT=1521)))
The command completed successfully

注意:这里停数据库是不行的,必须要停监听。

在查:
C:/Users/Administrator.DavidDai>sqlplus /nolog
SQL*Plus: Release 11.2.0.1.0 Production on 星期日 12月 19 19:02:57 2010
Copyright (c) 1982, 2010, Oracle. All rights reserved.
SQL> conn sys/oracle@dg_sf as sysdba;
已连接。
SQL> select db_unique_name,open_mode from v$database;
DB_UNIQUE_NAME OPEN_MODE
------------------------------ ----------
orcl_pd READ WRITE
SQL>


把主库的监听停掉之后,客户端就切换到备库去了。 这里就像lwd118 网友说的那样, 对于switchover,有问题。 这个问题的解决方法在 Connect Time Failover中已经做了说明,就是在主备库里分别指定instance_name。 然后在客户端的监听里也修改相关的参数。

但是,如果主备库的instance_name 不同的话,每次切换的话,都需要修改客户端的配置。也不方便。 由此可见,TAF 对Failover的意义更大。







------------------------------------------------------------------------------
Blog: http://blog.csdn.net/tianlesoftware
网上资源: http://tianlesoftware.download.csdn.net
相关视频:http://blog.csdn.net/tianlesoftware/archive/2009/11/27/4886500.aspx
DBA1 群:62697716(满); DBA2 群:62697977(满)
DBA3 群:62697850 DBA 超级群:63306533;
聊天 群:40132017
--加群需要在备注说明Oracle表空间和数据文件的关系,否则拒绝申请

运维网声明 1、欢迎大家加入本站运维交流群:群②:261659950 群⑤:202807635 群⑦870801961 群⑧679858003
2、本站所有主题由该帖子作者发表,该帖子作者与运维网享有帖子相关版权
3、所有作品的著作权均归原作者享有,请您和我们一样尊重他人的著作权等合法权益。如果您对作品感到满意,请购买正版
4、禁止制作、复制、发布和传播具有反动、淫秽、色情、暴力、凶杀等内容的信息,一经发现立即删除。若您因此触犯法律,一切后果自负,我们对此不承担任何责任
5、所有资源均系网友上传或者通过网络收集,我们仅提供一个展示、介绍、观摩学习的平台,我们不对其内容的准确性、可靠性、正当性、安全性、合法性等负责,亦不承担任何法律责任
6、所有作品仅供您个人学习、研究或欣赏,不得用于商业或者其他用途,否则,一切后果均由您自己承担,我们对此不承担任何法律责任
7、如涉及侵犯版权等问题,请您及时通知我们,我们将立即采取措施予以解决
8、联系人Email:admin@iyunv.com 网址:www.yunweiku.com

所有资源均系网友上传或者通过网络收集,我们仅提供一个展示、介绍、观摩学习的平台,我们不对其承担任何法律责任,如涉及侵犯版权等问题,请您及时通知我们,我们将立即处理,联系人Email:kefu@iyunv.com,QQ:1061981298 本贴地址:https://www.yunweiku.com/thread-254008-1-1.html 上篇帖子: Oracle CHAR,VARCHAR,VARCHAR2类型的区别与使用 . 下篇帖子: Oracle RAC CRS-0184 --Cannot communicate with the CRS daemon
您需要登录后才可以回帖 登录 | 立即注册

本版积分规则

扫码加入运维网微信交流群X

扫码加入运维网微信交流群

扫描二维码加入运维网微信交流群,最新一手资源尽在官方微信交流群!快快加入我们吧...

扫描微信二维码查看详情

客服E-mail:kefu@iyunv.com 客服QQ:1061981298


QQ群⑦:运维网交流群⑦ QQ群⑧:运维网交流群⑧ k8s群:运维网kubernetes交流群


提醒:禁止发布任何违反国家法律、法规的言论与图片等内容;本站内容均来自个人观点与网络等信息,非本站认同之观点.


本站大部分资源是网友从网上搜集分享而来,其版权均归原作者及其网站所有,我们尊重他人的合法权益,如有内容侵犯您的合法权益,请及时与我们联系进行核实删除!



合作伙伴: 青云cloud

快速回复 返回顶部 返回列表