fengda posted on 2018-10-1 14:29:56

A mysql-mmm troubleshooting case

  Keywords: FATAL Couldn't configure IP 'x.x.x.x' on interface 'eth1': undef
  Symptom:
  On the mmm_monitor host, pinging the agents' virtual IPs shows that one of them cannot be reached.
  # mmm_control show
  # Warning: agent on host db3 is not reachable
  db1(10.1.1.15) master/ONLINE. Roles: reader(10.1.1.23), writer(10.1.1.20)
  db2(10.1.1.14) master/ONLINE. Roles: reader(10.1.1.22)
  db3(10.1.1.13) slave/ONLINE. Roles: reader(10.1.1.21)
  # Role writer is assigned to it's preferred host db1.
  # ping 10.1.1.21
  PING 10.1.1.21 (10.1.1.21) 56(84) bytes of data.
  From 10.1.1.12 icmp_seq=2 Destination Host Unreachable
  From 10.1.1.12 icmp_seq=3 Destination Host Unreachable
  From 10.1.1.12 icmp_seq=4 Destination Host Unreachable
  --- 10.1.1.21 ping statistics ---
  4 packets transmitted, 0 received, +3 errors, 100% packet loss, time 2998ms, pipe 3
  # ping 10.1.1.22
  PING 10.1.1.22 (10.1.1.22) 56(84) bytes of data.
  64 bytes from 10.1.1.22: icmp_seq=1 ttl=64 time=0.102 ms
  --- 10.1.1.22 ping statistics ---
  1 packets transmitted, 1 received, 0% packet loss, time 0ms
  rtt min/avg/max/mdev = 0.102/0.102/0.102/0.000 ms
  On db3's physical host, 10.1.1.13:
  Check whether the virtual IP is present. It turns out the IP has not been configured on this machine.
  # ip add
  1: lo:mtu 16436 qdisc noqueue
  link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
  inet 127.0.0.1/8 scope host lo
  inet6 ::1/128 scope host
  valid_lft forever preferred_lft forever
  2: eth0:mtu 1500 qdisc pfifo_fast qlen 100
  link/ether 00:80:3f:03:47:ce brd ff:ff:ff:ff:ff:ff
  inet 6.6.6.6/28 brd 122.225.32.143 scope global eth0
  inet6 fe80::280:3fff:fe03:47ce/64 scope link
  valid_lft forever preferred_lft forever
  3: eth1:mtu 1500 qdisc pfifo_fast qlen 1000
  link/ether 00:80:3f:03:47:cf brd ff:ff:ff:ff:ff:ff
  inet 10.1.1.13/24 brd 10.1.1.255 scope global eth1
  inet6 fe80::280:3fff:fe03:47cf/64 scope link
  valid_lft forever preferred_lft forever
  4: sit0:mtu 1480 qdisc noop
  link/sit 0.0.0.0 brd 0.0.0.0
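
Reading the full listing by eye works, but the same check can be scripted. The following is a minimal sketch, not part of mysql-mmm: the helper name has_ip is made up for illustration, and it assumes the one-line output format of `ip -o -4 addr show`, where the fourth field is the address with its prefix length.

```shell
# Hypothetical helper: succeed if the given IPv4 address appears in
# `ip -o -4 addr show` output supplied on stdin.
has_ip() {
    # In the one-line format the 4th field looks like "10.1.1.13/24".
    awk -v ip="$1" '
        { split($4, a, "/"); if (a[1] == ip) found = 1 }
        END { exit !found }
    '
}
```

On db3 it could be used as `ip -o -4 addr show dev eth1 | has_ip 10.1.1.21 || echo "10.1.1.21 is not configured on eth1"`.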
  Check the mysql-mmm-agent log:
  2011/06/02 20:07:50 INFO Changing active master to 'db1'
  2011/06/02 20:07:50 FATAL Failed to change master to 'db1': undef
  2011/06/02 20:07:50 FATAL Couldn't configure IP '10.1.1.21' on interface 'eth1': undef
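
To pull only the failed virtual IPs out of a long agent log, a small filter helps. This is a sketch that assumes the log lines have exactly the shape shown above; the function name failed_vips is invented here, and the log file location is not given in this post, so the function reads the log from stdin.

```shell
# Hypothetical filter: read an mmm agent log on stdin and print
# "<interface> <ip>" for every "Couldn't configure IP" FATAL line.
failed_vips() {
    sed -n "s/.* FATAL Couldn't configure IP '\([^']*\)' on interface '\([^']*\)'.*/\2 \1/p"
}
```

Feeding it the agent log (for example `failed_vips < /path/to/agent.log`, with the path adjusted to your installation) would print `eth1 10.1.1.21` for the entry above.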
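  Going by the mysql-mmm-agent log, a Google search turned up the way to track this down: run the agent's configure_ip helper by hand to see the underlying error.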
  # /usr/lib/mysql-mmm/agent/configure_ip eth1 10.1.1.21
  Can't locate Net/ARP.pm in @INC (@INC contains: /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.8 /usr/lib/perl5/site_perl /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.8 /usr/lib/perl5/vendor_perl /usr/lib/perl5/5.8.8/i386-linux-thread-multi /usr/lib/perl5/5.8.8 .) at /usr/lib/perl5/vendor_perl/5.8.8/MMM/Agent/Helpers/Network.pm line 11.
  BEGIN failed--compilation aborted at /usr/lib/perl5/vendor_perl/5.8.8/MMM/Agent/Helpers/Network.pm line 11.
  Compilation failed in require at /usr/lib/perl5/vendor_perl/5.8.8/MMM/Agent/Helpers/Actions.pm line 5.
  BEGIN failed--compilation aborted at /usr/lib/perl5/vendor_perl/5.8.8/MMM/Agent/Helpers/Actions.pm line 5.
  Compilation failed in require at /usr/lib/mysql-mmm/agent/configure_ip line 6.
  BEGIN failed--compilation aborted at /usr/lib/mysql-mmm/agent/configure_ip line 6.
  So the Perl module Net::ARP (Net/ARP.pm) is not installed. Let's install it now:
  # perl -MCPAN -e shell
  cpan> install Net::ARP
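
Before touching the cluster again, it is worth confirming that the module now loads. The sketch below relies only on the standard `perl -MModule -e 1` idiom; the function name check_perl_modules is made up for this example, and the module list is whatever you pass in (only Net::ARP is confirmed by this post).

```shell
# Hypothetical helper: try to load each named Perl module.
# Prints "MISSING <module>" for each one that fails to load and
# returns non-zero if at least one is missing.
check_perl_modules() {
    missing=0
    for mod in "$@"; do
        # `perl -MFoo -e 1` exits non-zero when Foo cannot be loaded.
        if ! perl -M"$mod" -e 1 >/dev/null 2>&1; then
            echo "MISSING $mod"
            missing=1
        fi
    done
    return "$missing"
}
```

Running `check_perl_modules Net::ARP` on each agent host should print nothing once the install has succeeded.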
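  After the installation finishes, use mmm_control on the monitor to set db3 offline and then online again, then test whether the virtual IP can be pinged.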
  # mmm_control set_offline db3
  OK: State of 'db3' changed to ADMIN_OFFLINE. Now you can wait some time and check all roles!
  # mmm_control set_online db3
  OK: State of 'db3' changed to ONLINE. Now you can wait some time and check its new roles!
  # mmm_control show
  db1(10.1.1.15) master/ONLINE. Roles: reader(10.1.1.23), writer(10.1.1.20)
  db2(10.1.1.14) master/ONLINE. Roles: reader(10.1.1.22)
  db3(10.1.1.13) slave/ONLINE. Roles: reader(10.1.1.21)
  # Role writer is assigned to it's preferred host db1.
  # ping 10.1.1.21
  PING 10.1.1.21 (10.1.1.21) 56(84) bytes of data.
  64 bytes from 10.1.1.21: icmp_seq=1 ttl=64 time=0.181 ms
  64 bytes from 10.1.1.21: icmp_seq=2 ttl=64 time=0.079 ms
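  Problem solved.
  To sum up:
  This problem was really left over from a careless installation. Because db3 is a pure slave, it was normally accessed through its real IP and its virtual IP was never used, so mmm_monitor gave no obvious sign of a fault. The problem only came to light when read/write splitting was being configured and the slave's virtual IP was used for the first time.
  So for any architecture that is going into production, it is best to follow the official documentation and check every item one by one, to avoid unnecessary failures.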
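
One check that would have caught this case early is to ping every reader virtual IP that the monitor reports. The following is a rough sketch to run on the monitor host, assuming the `mmm_control show` output format shown in this post; the function names reader_vips and check_vips are invented for illustration, and the writer IP is deliberately left out.

```shell
# Hypothetical helper: extract every reader VIP from `mmm_control show`
# output read on stdin, one IP per line.
reader_vips() {
    grep -o 'reader([0-9.]*)' | sed 's/reader(\(.*\))/\1/'
}

# Hypothetical helper: ping each IP read from stdin once (1 second timeout).
# Prints "UNREACHABLE <ip>" for failures and returns non-zero if any failed.
check_vips() {
    failed=0
    while read -r ip; do
        if ! ping -c 1 -W 1 "$ip" >/dev/null 2>&1; then
            echo "UNREACHABLE $ip"
            failed=1
        fi
    done
    return "$failed"
}
```

The two pieces chain together as `mmm_control show | reader_vips | check_vips`. Before the fix described here, that pipeline would have been expected to report 10.1.1.21.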
