[Experience Sharing] Using the IBM Hardware Information Center to Locate Hardware Problems (Original)

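Posted on 2017-5-26 10:53:51
  This post walks through the diagnosis of a hardware fault on an AIX server in order to introduce a general troubleshooting approach. Criticism and corrections are welcome.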

# errpt
IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
BFE4C025   0416192308 P H sysplanar0     UNDETERMINED ERROR
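The TIMESTAMP column in the errpt summary is a packed MMDDhhmmYY value, which is easy to misread. The following is a minimal Python sketch, not part of the original post, that decodes the summary line shown above. The function names are hypothetical, and the two-digit year is interpreted with Python's default `%y` pivot, which is an assumption.

```python
from datetime import datetime


def parse_errpt_timestamp(stamp: str) -> datetime:
    """Decode the 10-digit errpt summary timestamp (MMDDhhmmYY)."""
    if len(stamp) != 10 or not stamp.isdigit():
        raise ValueError(f"expected 10 digits, got {stamp!r}")
    # Assumption: %y maps 00-68 to 2000-2068 (Python's default pivot).
    return datetime.strptime(stamp, "%m%d%H%M%y")


def parse_errpt_summary_line(line: str) -> dict:
    """Split one summary line into its six columns:
    IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION."""
    identifier, stamp, err_type, err_class, resource, description = line.split(None, 5)
    return {
        "identifier": identifier,
        "timestamp": parse_errpt_timestamp(stamp),
        "type": err_type,        # P = permanent in this example
        "class": err_class,      # H = hardware in this example
        "resource": resource,
        "description": description.strip(),
    }


record = parse_errpt_summary_line(
    "BFE4C025   0416192308 P H sysplanar0     UNDETERMINED ERROR"
)
print(record["timestamp"].isoformat())
```

The decoded value agrees with the Date/Time field of the detailed report (Apr 16 19:23, 2008), which is a quick way to confirm that the summary entry and the detailed entry are the same event.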

# errpt -aj BFE4C025
---------------------------------------------------------------------------
LABEL:          SCAN_ERROR_CHRP
IDENTIFIER:     BFE4C025

Date/Time:       Wed Apr 16 19:23:10 2008
Sequence Number: 120
Machine Id:      000599F6D700
Node Id:         PEKAX019
Class:           H
Type:            PERM
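Resource Name:   sysplanar0
Resource Class:  planar
Resource Type:   sysplanar_rspc
Location:
# Note: this is a system planar error. From experience, first run
# diag -d sysplanar0 -v -e to review the related logs, then run lsmcode -A to
# check whether the microcode is outdated. If the microcode is fine, the
# cause is most likely a hardware fault.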

Description
UNDETERMINED ERROR

Failure Causes
UNDETERMINED

        Recommended Actions
        RUN SYSTEM DIAGNOSTICS.

Detail Data
PROBLEM DATA
0644 00E0 0000 01B4 8E00 8E00 0000 0000 0000 0000 4942 4D00 5048 0030 0100 EA10

... (some lines omitted)
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000

Diagnostic Analysis
Diagnostic Log sequence number: 104
Resource tested:        sysplanar0
Resource Description:   System Planar
Location:             
SRC:                    B17CE433  
Description:            Surveillance Error Predictive Error, general. Refer to
                        the system service documentation for more information.
Additional Words:       2-030000F0 3-53B71510 4-C13920FF 5-400000FF
                        6-00000000 7-000007F7 8-00000800 9-00000000
Possible FRUs:
    Priority: H Maintainence Procedure: FSPSP33
    Location: n/a
    Priority: M Maintainence Procedure: FSPSP04
    Location: n/a
    Priority: L FRU: 32N1272 S/N: YL1126327097 CCIN: 293A
    Location: U787F.001.DPM2DCM-P1-C7

---------------------------------------------------------------------------
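The Diagnostic Analysis section carries every term that needs to be looked up next: the SRC, the maintenance procedures, the FRU, the CCIN and the location code. As a rough sketch (not from the original post; the function names and the dictionary layout are assumptions), the Python below pulls those values out of a saved copy of the report. The sample text is the excerpt from the output above, and the parser tolerates the FRU number and `S/N:` running together.

```python
import re

# Excerpt copied from the Diagnostic Analysis section of the report.
SAMPLE = """\
SRC:                    B17CE433
Possible FRUs:
    Priority: H Maintainence Procedure: FSPSP33
    Location: n/a
    Priority: M Maintainence Procedure: FSPSP04
    Location: n/a
    Priority: L FRU: 32N1272S/N: YL1126327097 CCIN: 293A
    Location: U787F.001.DPM2DCM-P1-C7
"""


def find_src(text):
    """Return the System Reference Code, or None if no SRC line exists."""
    m = re.search(r"^SRC:\s*(\S+)", text, re.MULTILINE)
    return m.group(1) if m else None


def _field(pattern, text):
    m = re.search(pattern, text)
    return m.group(1) if m else None


def parse_callouts(text):
    """Collect the entries listed under 'Possible FRUs:' in order."""
    callouts = []
    in_block = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Possible FRUs:"):
            in_block = True
            continue
        if not in_block:
            continue
        m = re.match(r"Priority:\s*([HML])\s+(.*)", stripped)
        if m:
            rest = m.group(2)
            callouts.append({
                "priority": m.group(1),
                "procedure": _field(r"Procedure:\s*(\S+)", rest),
                # Stop at S/N: or CCIN: even when no space precedes them.
                "fru": _field(r"FRU:\s*(.+?)\s*(?=S/N:|CCIN:|$)", rest),
                "serial": _field(r"S/N:\s*(\S+)", rest),
                "ccin": _field(r"CCIN:\s*(\S+)", rest),
                "location": None,
            })
        elif stripped.startswith("Location:") and callouts:
            value = stripped[len("Location:"):].strip()
            callouts[-1]["location"] = None if value in ("", "n/a") else value
    return callouts


for callout in parse_callouts(SAMPLE):
    print(callout)
```

Each extracted value (the SRC, FSPSP33, FSPSP04, the FRU number, the CCIN and the location code) is then a separate search term for the information center.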

  Open the IBM Hardware Information Center
  http://publib.boulder.ibm.com/infocenter/powersys/v3r1m5/index.jsp
  and search for each of the following:
  1) SRC B17CE433
  A System Reference Code (SRC) is a code that describes a system error.
  Explanation
This error log entry is generated when the HMC fails to send its heartbeat message within the allotted time. The reason could be network issues, or the Ethernet cable is disconnected.
Response
If this is a tracking event, no service actions are required. Otherwise, use the FRU and procedure callouts detailed with the SRC to determine service actions.

  

  2)FSPSP33:
A problem has been detected in the connection with the HMC.
    Ensure that the cable connectors to the network from the HMC, managed system, managed system partitions, and other HMCs are securely connected. If the connections are not secure, plug the cables back into the proper spots and make sure that the connections are good.
    Check to see if the HMC is working correctly or if the HMC was disconnected incorrectly from the managed system, managed system partitions, and other HMCs. If either has happened, reboot the HMC. For more information, see Shutting down, rebooting, and logging off the HMC.
    Verify that the network connection between the HMC, managed system, managed system partitions, and other HMCs is working properly. If you have a high performance switch (HPS) network, verify that the network connection to the CSM Management Server is also working. If the connection is not working properly, contact the customer network support to correct the problems.
    If applicable, service the next FRU.
    If the problem continues to persist, contact your next level of support. This ends the procedure
  

  3)FSPSP04:
A problem has been detected in the service processor firmware.
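Once the procedures have been looked up, it helps to line them up in the order they should be worked through. The sketch below is illustrative only: it assumes the priorities H, M and L mean high, medium and low and that callouts are serviced in that order, and the one-line summaries are the sentences quoted above. The names `PROCEDURE_SUMMARIES` and `service_checklist` are hypothetical.

```python
# One-line summaries, taken from the information center text quoted above.
PROCEDURE_SUMMARIES = {
    "FSPSP33": "A problem has been detected in the connection with the HMC.",
    "FSPSP04": "A problem has been detected in the service processor firmware.",
}

# Assumption: H, M, L stand for high, medium and low priority.
PRIORITY_ORDER = {"H": 0, "M": 1, "L": 2}


def service_checklist(callouts):
    """Return printable checklist lines, highest priority first.

    callouts is a list of (priority, action) pairs, where action is a
    procedure code or a FRU description.
    """
    ordered = sorted(callouts, key=lambda item: PRIORITY_ORDER[item[0]])
    lines = []
    for priority, action in ordered:
        summary = PROCEDURE_SUMMARIES.get(action, "no summary recorded, look it up")
        lines.append(f"[{priority}] {action}: {summary}")
    return lines


checklist = service_checklist([
    ("L", "FRU 32N1272"),
    ("H", "FSPSP33"),
    ("M", "FSPSP04"),
])
print("\n".join(checklist))
```

For this error the list starts with the HMC connection check, which matches the SRC explanation about a missed HMC heartbeat, and only reaches the hardware part last.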

  

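  4) FRU: 32N1272
  Field Replaceable Unit (FRU)

  FRUs are the replaceable parts of a machine. Mainly to save cost, vendors divide equipment into multiple FRUs that are swapped out on site rather than repaired. (A search for this FRU number returned no results; sometimes that is simply how it goes!)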
  

  5) CCIN: 293A
  Custom Card Identification Number (CCIN)
  

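  6) Location: U787F.001.DPM2DCM-P1-C7
  This is the physical location code. U787F.001.DPM2DCM identifies the enclosure (unit type, model and serial number), and P1-C7 identifies the position inside it (planar 1, card position 7).
  Combining the Location with the FRU and CCIN pinpoints the actual device. While doing so, cross-check against the maintenance procedures (FSPSP33, FSPSP04) to avoid identifying the wrong part.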
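To make the structure of the location code concrete, here is a small Python sketch that splits it into its parts. It is an illustration rather than an IBM tool: the function name is hypothetical, and only the two label prefixes that appear in this post (P and C) are named. Any other prefix is reported as unknown instead of being guessed.

```python
import re

# Pattern: U<type>.<model>.<serial> followed by zero or more -<letter><number> labels.
LOCATION_RE = re.compile(
    r"^(?P<unit>U[^.]+\.[^.]+\.[^-]+)(?P<labels>(?:-[A-Z]\d+)*)$"
)

# Only the prefixes seen in this post; other prefixes exist but are not covered here.
LABEL_NAMES = {"P": "planar", "C": "card"}


def parse_location(code):
    """Split a physical location code into the unit part and its position labels."""
    m = LOCATION_RE.match(code)
    if m is None:
        raise ValueError(f"unrecognized location code: {code!r}")
    labels = [
        (LABEL_NAMES.get(prefix, "unknown"), int(number))
        for prefix, number in re.findall(r"-([A-Z])(\d+)", m.group("labels"))
    ]
    return {"unit": m.group("unit"), "labels": labels}


location = parse_location("U787F.001.DPM2DCM-P1-C7")
print(location["unit"])
print(location["labels"])
```

Read this way, the code says: in the unit with serial DPM2DCM, go to planar 1 and look at card position 7.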

  Result of the lookup

   [Image: DSC0000.jpg]
  

  Related notes
   [Image: DSC0001.jpg]
  

  References: http://rocolex.blog.163.com/blog/static/68446410201062102627624/
             http://www.loveunix.net/archiver/tid-129933.html
             http://www-947.ibm.com/systems/support/i/probsolv/src/index.html
             http://baike.baidu.com/view/1511517.htm
             http://jingh3209.blog.163.com/blog/static/15696672009421113615882/
  

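  This is an original post; please credit the source and the author when reposting.
  If you find any mistakes, corrections are welcome.
  Email: czmcj@163.com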
