[Experience Sharing] Oracle RAC CRS-0184 -- Cannot communicate with the CRS daemon

Posted on 2016-8-5 15:54:11
  After installing RAC on Oracle 11gR2, starting CRS failed with the following output:
  
  [iyunv@rac1 bin]# ./crsctl check crs
  CRS-4638: Oracle High Availability Services is online
  CRS-4535: Cannot communicate with Cluster Ready Services
  CRS-4529: Cluster Synchronization Services is online
  CRS-4533: Event Manager is online
  
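  This output shows that CRS (Cluster Ready Services) failed to start. CRS is a critical process: if it cannot start, Clusterware cannot come up either. Many different causes can lead to this.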
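  To make the same check from a script, the output can be filtered for components that do not report "is online". This is only a minimal sketch: the function name `crs_offline_components` is made up for illustration, and it assumes the message wording shown in the output above.

```shell
# Hypothetical helper: reads `crsctl check crs` output on stdin and prints
# every "CRS-nnnn:" line that does not end in "is online".
# Assumes the message wording shown in this post.
crs_offline_components() {
    awk '/^[ \t]*CRS-[0-9]+:/ && !/is online[ \t]*$/ {
        sub(/^[ \t]+/, "")   # drop leading indentation
        print
    }'
}

# Usage (on a cluster node, from the Grid home bin directory):
#   ./crsctl check crs | crs_offline_components
```

  For the output above, this would print only the CRS-4535 line.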
  
  The log is as follows:
  [iyunv@rac1 rac1]# tail -50 /u01/app/11.2.0/grid/log/rac1/crsd/crsd.log
  ORA-15077: could not locate ASM instance serving a required diskgroup
  
  2010-11-16 17:13:44.286: [OCRASM][3046411024]proprasmo: kgfoCheckMount returned [7]
  2010-11-16 17:13:44.286: [OCRASM][3046411024]proprasmo: The ASM instance is down
  2010-11-16 17:13:44.287: [OCRRAW][3046411024]proprioo: Failed to open [+CRS]. Returned proprasmo() with [26]. Marking location as UNAVAILABLE.
  2010-11-16 17:13:44.287: [OCRRAW][3046411024]proprioo: No OCR/OLR devices are usable
  2010-11-16 17:13:44.287: [OCRASM][3046411024]proprasmcl: asmhandle is NULL
  2010-11-16 17:13:44.287: [OCRRAW][3046411024]proprinit:Could not open raw device
  2010-11-16 17:13:44.287: [OCRASM][3046411024]proprasmcl:asmhandle is NULL
  2010-11-16 17:13:44.287: [OCRAPI][3046411024]a_init:16!:Backend init unsuccessful : [26]
  2010-11-16 17:13:44.288: [CRSOCR][3046411024] OCR context init failure.Error: PROC-26: Error while accessing the physical storage ASM error [SLOS: cat=7, opn=kgfoAl06, dep=15077, loc=kgfokge
  ORA-15077: could not locate ASM instance serving a required diskgroup
  ] [7]
  2010-11-16 17:13:44.288: [CRSD][3046411024][PANIC] CRSD exiting:Could not init OCR, code: 26
  2010-11-16 17:13:44.288: [CRSD][3046411024] Done.
  
  The log indicates that the failure was caused by the ASM instance not being up. The issues behind this are fairly complex.
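  When the log is long, it can help to pull out just the error codes first. The following is a rough sketch, not an official tool: `log_error_codes` is a hypothetical name, and `grep -o` is assumed to be available (it is in GNU and BSD grep).

```shell
# Hypothetical helper: prints the distinct ORA-nnnnn / PROC-nn / CRS-nnnn
# codes that appear in the log file given as $1.
log_error_codes() {
    grep -oE '(ORA|PROC|CRS)-[0-9]+' "$1" | sort -u
}

# Usage, with the log path from this post:
#   log_error_codes /u01/app/11.2.0/grid/log/rac1/crsd/crsd.log
```

  In the excerpt above, the codes that appear are ORA-15077 and PROC-26, which together point at ASM and the OCR storage.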
  
  This post does not analyze that problem in detail. An article on Oracle's support site explains it very thoroughly, and I have reposted it on my blog. See:
  How to Troubleshoot Grid Infrastructure Startup Issues [ID 1050908.1]
  http://blog.csdn.net/xujinyang/article/details/6834912
  
  In this Document
  Goal
  Solution
  Start up sequence:
  Cluster status
  Case 1: OHASD.BIN does not start
  Case 2: OHASD Agents does not start
  Case 3: CSSD.BIN does not start
  Case 4: CRSD.BIN does not start
  Case 5: GPNPD.BIN does not start
  Case 6: Various other daemons does not start
  Case 7: CRSD Agents does not start
  Network and Naming Resolution Verification
  Log File Location, Ownership and Permission
  Network Socket File Location, Ownership and Permission
  Diagnostic file collection
  References
  
  
  Here is how I approach the analysis:
  
  1. Check the logs to see whether they reveal the cause. If they do not pinpoint the problem clearly, the analysis has to continue.
  
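  2. Follow the CRS startup sequence.
  The ASM instance has to start first during startup, and that involves the storage.
  (1) Is the network working properly?
  (2) Is the storage mapped to the expected location? My lab uses multipath, which maps the storage under /dev/mapper/*. When a problem occurs, I check whether the expected mappings are present.
  (3) Storage permissions. After mapping, the devices are owned by root by default. I added a script to /etc/rc.d/rc.local that changes the ownership, so that at boot the mapped devices are handed over to the Oracle user.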
  
  3. If all of the above looks fine, try restarting CRS or rebooting the operating system.
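  The ownership check in step 2(3) and the restart in step 3 could be scripted roughly as follows. This is a sketch under assumptions: both function names are hypothetical, the user name `grid` in the usage line stands in for whichever OS user owns the ASM disks, `GRID_HOME` defaults to the path seen in the log above, and `stat -c` is the GNU form. The rc.local script mentioned in step 2(3) is not shown in this post, so it is not reproduced here.

```shell
# Sketch only: helper names, the "grid" user and GRID_HOME are assumptions.

# Prints each path whose owner differs from the expected user ($1).
# Uses GNU stat (`stat -c`), as found on Linux; -L follows symlinks,
# since /dev/mapper entries are often links to /dev/dm-N.
devices_not_owned_by() {
    expected=$1
    shift
    for dev in "$@"; do
        if [ ! -e "$dev" ]; then
            echo "missing: $dev"
            continue
        fi
        owner=$(stat -L -c '%U' "$dev")
        if [ "$owner" != "$expected" ]; then
            echo "owner=$owner: $dev"
        fi
    done
}

# Restarts the local Clusterware stack (run as root). With DRY_RUN=1,
# which is the default here, the crsctl commands are only printed.
restart_crs() {
    grid_home=${GRID_HOME:-/u01/app/11.2.0/grid}
    for action in stop start; do
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "$grid_home/bin/crsctl $action crs"
        else
            "$grid_home/bin/crsctl" "$action" crs || return 1
        fi
    done
}

# Usage:
#   devices_not_owned_by grid /dev/mapper/*
#   DRY_RUN=0 restart_crs
```

  An empty result from the first function means every listed device exists and has the expected owner. The dry-run default is deliberate, so that nothing is stopped by accident.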
  
  
  Addendum:
  
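  I also found online another cause of CSSD failing to start. What caught my attention is that it explains the role of the two directories /tmp/.oracle and /var/tmp/.oracle. Each time the server restarts, lock information (socket files) is stored in these two directories. If, after some restart, the old files there cannot be removed, the locks cannot be refreshed and the daemon fails to start.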
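  Before touching anything, it is worth looking at what is actually in those directories. A minimal sketch, assuming a `find` that supports `-mindepth` and `-maxdepth` (GNU and BSD find do); the function name `list_oracle_sockets` is hypothetical:

```shell
# Hypothetical helper: lists the entries directly inside each .oracle
# directory given as arguments (defaults to the two locations named above).
# Directories that do not exist are silently skipped.
list_oracle_sockets() {
    [ "$#" -gt 0 ] || set -- /tmp/.oracle /var/tmp/.oracle
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        find "$dir" -mindepth 1 -maxdepth 1
    done
}

# Usage:
#   list_oracle_sockets
```

  Piping the result through `xargs ls -ld` additionally shows the owner and timestamp of each entry, which helps to spot files left over from an earlier boot.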
  
  This also explains why these two directories have to be deleted when removing Clusterware.
  
  My RAC removal write-up mentions deleting these two directories when uninstalling RAC. See:
  RAC uninstall notes
  http://blog.csdn.net/xujinyang/article/details/6837237
  
  Contents of crs.log:
  2007-04-11 14:37:34.020: [ COMMCRS][1693]clsc_connect: (100f78610) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_drdb1_crs))
  2007-04-11 14:37:34.020: [ CSSCLNT][1]clsssInitNative: connect failed, rc 9
  2007-04-11 14:37:34.021: [ CRSRTI][1] CSS is not ready. Received status 3 from CSS. Waiting for good status ..
  2007-04-11 14:37:35.740: [ COMMCRS][1695]clsc_connect: (100f78610) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_drdb1_crs))
  2007-04-11 14:37:35.740: [ CSSCLNT][1]clsssInitNative: connect failed, rc 9
  When we checked ocssd.log it contained the following
[ CSSD]2007-04-11 12:53:56.211 [6] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (0//dev/rdsk/c5t8d0s5)
[ CSSD]2007-04-11 12:53:56.211 [10] >TRACE: clssnmvKillBlockThread: spawned for disk 1 (/dev/rdsk/c5t9d0s5) initial sleep interval (1000)ms
[ CSSD]2007-04-11 12:53:56.211 [11] >TRACE: clssnmvKillBlockThread: spawned for disk 0 (/dev/rdsk/c5t8d0s5) initial sleep interval (1000)ms
[ CSSD]2007-04-11 12:53:56.228 [1] >TRACE: clssnmFatalInit: fatal mode enabled
[ CSSD]2007-04-11 12:53:56.269 [13] >TRACE: clssnmconnect: connecting to node 1, flags 0x0001, connector 1
[ CSSD]2007-04-11 12:53:56.274 [13] >TRACE: clssnmClusterListener: Listening on (ADDRESS=(PROTOCOL=tcp)(HOST=drdb1-priv)(PORT=49895))
[ CSSD]2007-04-11 12:53:56.274 [13] >TRACE: clssnmconnect: connecting to node 0, flags 0x0000, connector 1
[ CSSD]2007-04-11 12:53:56.279 [14] >TRACE: clsclisten: Permission denied for (ADDRESS=(PROTOCOL=ipc)(KEY=Oracle_CSS_LclLstnr_crs_1))
  [ CSSD]2007-04-11 12:53:56.279 [14] >ERROR: clssgmclientlsnr: listening failed for (ADDRESS=(PROTOCOL=ipc)(KEY=Oracle_CSS_LclLstnr_crs_1)) (3)
[ CSSD]2007-04-11 12:53:56.279 [14] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=Oracle_CSS_LclLstnr_crs_1))
[ CSSD]2007-04-11 12:53:56.279 [14] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_drdb1_crs))
[ CSSD]2007-04-11 13:07:36.516 >USER: Oracle Database 10g CSS Release 10.2.0.2.0 Production Copyright 1996, 2004 Oracle. All rights reserved.
[ clsdmt]Fail to listen to (ADDRESS=(PROTOCOL=ipc)(KEY=drdb1DBG_CSSD))
[ CSSD]2007-04-11 13:07:36.516 >USER: CSS daemon log for node drdb1, number 1, in cluster crs
[ clsdmt]Terminating clsdm listening thread
[ CSSD]2007-04-11 13:07:36.536 [1] >TRACE: clssscmain: local-only set to false
[ CSSD]2007-04-11 13:07:36.545 [1] >TRACE: clssnmReadNodeInfo: added node 1 (drdb1) to cluster
[ CSSD]2007-04-11 13:07:36.588 [5] >TRACE: clssnm_skgxnmon: skgxn init failed, rc 1
[ CSSD]2007-04-11 13:07:36.588 [1] >TRACE: clssnm_skgxnonline: Using vacuous skgxn monitor
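
  Most of the excerpt above is routine TRACE output. A quick way to surface the relevant lines is to filter for the error tag and the two failure phrases that appear in it. This is only a sketch, and the function name `css_listener_problems` is made up:

```shell
# Hypothetical helper: prints the lines of log file $1 that are tagged
# ">ERROR:" or that mention "Permission denied" / "Fail to listen".
# Exits non-zero (grep's status) when nothing matches.
css_listener_problems() {
    grep -E '>ERROR:|Permission denied|Fail to listen' "$1"
}

# Usage:
#   css_listener_problems ocssd.log
```

  Applied to the excerpt, this would keep the clsclisten "Permission denied" line, the clssgmclientlsnr ">ERROR:" line and the clsdmt "Fail to listen" line.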
  
  Solution:
  From the logs above, we realised that the listener of the CSS daemon was unable to start.
  The reason is that each time the server reboots, it creates a socket in the /tmp/.oracle or /var/tmp/.oracle directory.

  Also, if sockets from a previous run already exist in this .oracle directory, they cannot be reused or deleted automatically.

  The problem was therefore solved by deleting all the files inside the .oracle directory in /var/tmp or /tmp.

  After that, CRS started and the cluster came up.
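
  The cleanup described above might look like the sketch below. It is not taken from the quoted source: the function name `clear_oracle_sockets` is hypothetical, the `find` options assume GNU or BSD find, and the whole thing assumes the Clusterware stack on the node is completely stopped, because removing socket files under running daemons breaks their communication.

```shell
# Hypothetical helper: removes the entries inside each .oracle directory
# given as arguments, keeping the directory itself.
# Only run this as root while the whole Clusterware stack on the node is down.
clear_oracle_sockets() {
    for dir in "$@"; do
        case $dir in
            */.oracle) ;;    # only accept paths that end in /.oracle
            *) echo "skipping $dir" >&2; continue ;;
        esac
        [ -d "$dir" ] || continue
        find "$dir" -mindepth 1 -maxdepth 1 -exec rm -rf -- {} +
    done
}

# Usage:
#   clear_oracle_sockets /tmp/.oracle /var/tmp/.oracle
```

  The directories have to be passed explicitly and must end in /.oracle, which is a small guard against deleting the wrong path by mistake.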
  

  