设为首页 收藏本站
查看: 1523|回复: 0

[经验分享] MySQL MHA高可用环境部署

[复制链接]

尚未签到

发表于 2018-10-4 11:11:36 | 显示全部楼层 |阅读模式
  一,安装MHA基本环境
  安装MHA节点
  (1)基本环境说明
  角色IP地址主机名
  =========================================
  主机192.168.1.121节点1
  从机192.168.1.122节点2
  从机192.168.1.123节点3
  监视主机192.168.1.125节点5
  (2)在node1,node2,node3,node5中操作:
  #vi / etc / hosts
  192.168.1.121 node1
  192.168.1.122 node2
  192.168.1.123 node3
  192.168.1.125 node5
  安装MHA节点节点软件包:
  #rpm -ivh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm#yum
  install perl-DBD-MySQL perl-CPAN -y
  #tar xf mha4mysql -node-0.56.tar.gz
  #cd mha4mysql-node-0.56
  #perl Makefile.PL
  #make && make install
  安装MHA
  在节点5管理节点上操作:注:MHA管理器主机也是需要安装MHA节点,MHA管理器
  #yum install perl-DBD-MySQL perl-CPAN perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-time-HiRes -y
  #tar xf mha4mysql-manager-0.56.tar.gz
  #cd mha4mysql-manager -0.56
  #perl Makefile.PL
  #make && make install
  #说明:安装的脚本程序都在/ usr / local / bin /目录下。
  3.节点间配置SSH登录无密码验证(MHA主机之间使用密钥登录)
  在node5(Monitor)中:
  #ssh-keygen -t rsa
  #ssh-copy-id -i /root/.ssh/id_rsa.pub root @ node1
  #ssh-copy-id -i /root/.ssh/id_rsa.pub root @ node2
  #ssh-     copy-id -i /root/.ssh/id_rsa.pub root @ node3
  在node1(Master)中:
  #ssh-keygen -t rsa
  #ssh-copy-id -i /root/.ssh/id_rsa.pub root @ node2
  #ssh-copy-id -i /root/.ssh/id_rsa.pub root @ node3
  #ssh-     copy-id -i /root/.ssh/id_rsa.pub root @ node5
  在node2(slave)中:
  #ssh-keygen -t rsa
  #ssh-copy-id -i /root/.ssh/id_rsa.pub root @ node1
  #ssh-copy-id -i /root/.ssh/id_rsa.pub root @ node3
  #ssh-     copy-id -i /root/.ssh/id_rsa.pub root @ node5
  在node3(slave)中:
  #ssh-keygen -t rsa
  #ssh-copy-id -i /root/.ssh/id_rsa.pub root @ node1
  #ssh-copy-id -i /root/.ssh/id_rsa.pub root @ node2
  #ssh-     copy-id -i /root/.ssh/id_rsa.pub root @ node5
  二,搭建主从复制环境
  主从复制环境配置过程
  (1)mysql安装过程略,但是三节点要创建如下链接
  node1(主),node2(主备从),node3(从)
  注意:创建如下链接:
  ln -s / usr / local / mysql / bin / * / usr / local / bin /
  node1 my.cnf
  server-id = 1
  binlog-format = ROW
  log-bin = master-bin
  log-bin-index = master-bin.index
  log-slave-updates = true
  relay_log_purge = 0
  node2 my.cnf
  server-id = 2
  binlog-format = ROW
  log-bin = master-bin
  log-bin-index = master-bin.index
  log-slave-updates = true
  relay_log_purge = 0
  node3 my.cnf
  binlog-format = ROW
  log-bin = mysql-bin
  relay-log = slave-relay-bin
  relay-log-index = slave-relay-bin.index
  log-slave-updates = true
  server-id = 11
  skip-name- resolve
  relay_log_purge = 0
  (2)在node1(Master)上备份一份完整的数据:
  #mysqldump -uroot -p123456 --master-data = 2 - 单事务-R - triggers -A> all.sql
  其中--master-data = 2代表备份时刻记录主的Binlog位置和位置。
  (3)在node1(Master)上创建复制用户:
  mysql>授予复制从机*。*到'123456'确定的'repl'@'192.168.1.%';
  刷新权限;
  (4)查看主库备份时的binlog名称和位置,MASTER_LOG_FILE和MASTER_LOG_POS:
  #head -n 30 all.sql | grep'CHANGE MASTER TO'
  - CHANGE MASTER TO MASTER_LOG_FILE ='master-bin.000004',MASTER_LOG_POS = 120;
  (5)把备份复制到192.168.1.122和192.168.1.123
  #scp all.sql
  192.168.1.122:/root/#scp all.sql 192.168.1.123:/root/
  (6)分别在两台服务器上导入备份,执行复制相关命令
  在node2,node3主机上操作:
  #mysql -uroot -p123456  start slave;
  显示从属状态\ G
  创建MHA管理用户,在主上创建。
  将*。*的所有权限授予'123456'标识为'root'@'192.168.1.%'的权限;
  刷新权限;
  三,配置Keepal VIP
  vip配置可以采用两种方式,一种通过keepalived的方式管理虚拟ip的浮动;一人是通过脚本方式,本文通过keepalived方式实现
  1.在node1(Master)与node2(备选主节点)安装keepalived。
  #wget的http://www.keepalived.org/software/keepalived-1.2.12.tar.gz
  #焦油XF的keepalived-1.2.12.tar.gz
  #CD的keepalived-1.2.12
  #的./configure前缀= / usr / local / keepalived
  #make && make install
  #cp /usr/local/keepalived/etc/rc.d/init.d/keepalived /etc/init.d/#cp
  / usr / local / keepalived / etc / sysconfig / keepalived / etc / sysconfig /
  #mkdir / etc / keepalived
  #cp /usr/local/keepalived/etc/keepalived/keepalived.conf / etc / keepalived /
  #cp / usr / local / keepalived / sbin / keepalived / usr / sbin /
  配置keepalived的配置文件,在node1(master)上配置操作如下:
  注意:keepalived配置成备份 - 备份,即IP地址切换后,主起来后IP地址不切换,本文监控脚本由MHA提供,keepalived不提供对mysqld的监控。
  #vi /etc/keepalived/keepalived.conf
  !保持配置文件
  global_defs {
  notification_email {
  abc@163.com
  }
  notification_email_from dba@dbserver.com
  smtp_server 127.0.0.1
  smtp_connect_timeout 30
  router_id MySQL-HA
  }
  vrrp_instance VI_1 {
  state BACKUP
  interface eth0
  virtual_router_id 51
  priority 150
  advert_int 1
  nopreempt
  身份验证{
  auth_type PASS
  auth_pass 1111
  }
  virtual_ipaddress {
  192.168.1.130
  }
  }
  配置keepalived的配置文件,在node2(备用节点)上配置操作如下:
  #vi /etc/keepalived/keepalived.conf
  !保持配置文件
  global_defs {
  notification_email {
  abc@163.com
  }
  notification_email_from dba@dbserver.com
  smtp_server 127.0.0.1
  smtp_connect_timeout 30
  router_id MySQL-HA
  }
  vrrp_instance VI_1 {
  state BACKUP
  interface eth0
  virtual_router_id 51
  priority 120
  advert_int 1
  nopreempt
  身份验证{
  auth_type PASS
  auth_pass 1111
  }
  virtual_ipaddress {
  192.168.1.130
  }
  }
  4. node1,node2启动keepalived服务
  #service keepalived start
  #chkconfig keepalived on
  5. node1查看VIP启动情况
  [root @
  node1]#ip a   1:lo: mtu 65536 qdisc noqueue state UNKNOWN
  link / loopback 00:00:00:00:00:00 brd 00:00:00:00:00 :00
  inet 127.0.0.1/8 scope host lo
  inet6 :: 1/128 scope host
  valid_lft forever preferred_lft forever
  2:eth0: mtu 1500 qdisc pfifo_fast state UP qlen 1000
  link / ether 00:0c :29:4e:53:71 brd ff:ff:ff:ff:ff:ff
  inet 192.168.1.121/24 brd 192.168.1.255范围全局eth0
  inet 192.168.1.130/32范围全局eth0
  inet6 fe80 :: 20c:29ff: fe4e:5371/64范围链接
  valid_lft永远preferred_lft永远
  四,配置MHA
  监控创建MHA的工作目录,并且创建相关配置文件(在软件包解压后的目录里面有样例配置文件)。
  #mkdir -p / etc / masterha
  #mkdir -p / var / log / masterha / app1
  #cp mha4mysql-manager-0.56 / samples / conf / app1.cnf / etc / masterha /
  修改app1.cnf配置文件,修改后的文件内容如下:
  #cat /etc/masterha/app1.cnf
  [server default]
  manager_workdir = / var / log / masterha / app1
  manager_log = / var / log / masterha / app1 / manager.log
  master_binlog_dir = / usr / local / mysql / data /
  master_ip_failover_script = / usr / local / bin / master_ip_failover
  master_ip_online_change_script = / usr / local / bin / master_ip_online_change
  password = 123456
  user = root
  ping_interval = 1
  remote_workdir = / tmp
  repl_password = 123456
  repl_user = repl
  report_script = / usr / local / bin / send_report
  ssh_user = root
  [server1]
  hostname = 192.168.1.121
  port = 3306
  [server2]
  hostname = 192.168.1.122
  port = 3306
  candidate_master = 1
  check_repl_delay = 0
  [server3]
  hostname = 192.168.1.123
  port = 3306
  说明:
  master_ip_failover_script = / usr / local / bin / master_ip_failover            #MHA自动切换执行的脚本,需要修改
  master_ip_online_change_script = / usr / local / bin / master_ip_online_change #手动
  切换  需要执行的脚本,需要修改     report_script = / usr / local / bin / send_report #切换                                时发送邮件进行报告,需要修改
  2.设置中继日志的清除方式(在每个从节点上):
  (1)在节点2,节点3从节点上操作:
  将relay_log_purge = 0加入my.cnf配置文件,前面已经配置。
  (2)设定定期清理继电器脚本(node2,node3上操作):
  #猫purge_relay_log.sh
  #!/斌/ bash的
  用户=根
  的passwd = 123456
  端口= 3306
  LOG_DIR = '/数据/ masterha /日志'
  WORK_DIR = '/数据'
  清除= '在/ usr / local / bin目录/ purge_relay_logs'
  如果[!-d $ log_dir]
  then
  mkdir $ log_dir -p
  fi
  $ purge --user = $ user --password = $ passwd --disable_relay_log_purge --port = $ port --workdir = $ work_dir >> $ log_dir / purge_relay_logs.log 2>&1
  配置定时计划任务
  #crontab -e   0 4 * * * / bin / bash /root/purge_relay_log.sh
  要求把keepalived服务引入MHA,我们只需要修改切换是触发的脚本文件master_ip_failover即可,在该脚本中添加在master发生宕机时对keepalived的处理。
  (1)编辑脚本/ usr / local / bin / master_ip_failover,修改后如下:
  #vi / usr / local / bin / master_ip_failover
  #!/ usr / bin / env perl
  use strict;
  使用警告FATAL =>'all';
  使用Getopt :: Long;
  我的(
  $命令,$ ssh_user,$ orig_master_host,$ orig_master_ip,
  $ orig_master_port,$ new_master_host,$ new_master_ip,$ new_master_port
  );
  我的$ vip ='192.168.1.130';
  我的$ ssh_start_vip =“/etc/init.d/keepalived start”;
  我的$ ssh_stop_vip =“/etc/init.d/keepalived stop”;
  GetOptions(
  'command = s'=> \ $ command,
  'ssh_user = s'=> \ $ ssh_user,
  'orig_master_host = s'=> \ $ orig_master_host,
  'orig_master_ip = s'=> \ $ orig_master_ip,
  'orig_master_port = '=> \ $ orig_master_port,
  'new_master_host = s'=> \ $ new_master_host,
  'new_master_ip = s'=> \ $ new_master_ip,
  'new_master_port = i'=> \ $ new_master_port,
  );
  exit&main();
  sub main {
  print“\ n \ nIN SCRIPT TEST ==== $ ssh_stop_vip == $ ssh_start_vip === \ n \ n”;
  if($ command eq“stop”|| $ command eq“stopssh”){
  my $ exit_code = 1;
  eval {
  print“禁用旧主机上的VIP:$ orig_master_host \ n”;
  &stop_vip();
  $ exit_code = 0;
  };
  if($ @){
  warn“Got Error:$ @ \ n”;
  退出$ exit_code;
  }
  exit $ exit_code;
  }
  elsif($ command eq“start”){
  我的$ exit_code = 10;
  eval {
  print“启用VIP - $ vip on the new master - $ new_master_host \ n”;
  &start_vip();
  $ exit_code = 0;
  };
  if($ @){
  warn $ @;
  退出$ exit_code;
  }
  exit $ exit_code;
  } {
  el } {
  “ } ”{     “ }     ”\“”
  退出0;
  }
  else {
  &usage();
  出口1;
  }
  }
  sub start_vip(){
  `ssh $ ssh_user \ @ $ new_master_host \“$ ssh_start_vip \”`;
  }
  #一个简单的系统调用,禁用在old_master
  子上的VIP     stop_vip(){
  `ssh $ ssh_user \ @ $ orig_master_host \“$ ssh_stop_vip \”`;
  }}
  sub usage {
  print
  “用法:master_ip_failover --command = start | stop | stopssh | status --orig_master_host = host --orig_master_ip = ip --orig_master_port = port --new_master_host = host --new_master_ip = ip --new_master_port = port \ n“;
  }
  (2)编辑脚本master_ip_online_change,修改后如下:
  #!/ usr / bin / env perl
  #版权所有(C)2011 DeNA有限公司
  ##
  这个程序是免费的软件; 您可以
  根据
  #自由软件基金会发布的GNU通用公共许可证的条款重新分配和/或修改     #    许可证的版本2或
  #(根据您的选择)任何更高版本。
  ##
  这个程序是分发的,希望它是有用的,
  但没有任何的保证; 甚至没有
  #适销性或适用于特定用途的默示保证     。有关
  详细信息,请参阅     #GNU通用公共许可证。
  ##
  您应该已经收到了GNU通用公共许可证
  #的副本     以及该程序; 如果不,
  ##注意:这是一个示例脚本,不完整。根据您的环境修改脚本。
  使用严格
  使用警告FATAL =>'all';
  使用Getopt :: Long;
  使用MHA :: DBHelper;
  使用MHA :: NodeUtil;
  使用Time :: HiRes qw(sleep gettimeofday tv_interval);
  使用Data :: Dumper;
  我的$ _tstart;
  我的$ _running_interval = 0.1;
  我的(
  $命令,$ orig_master_is_new_slave,$ orig_master_host,
  $ orig_master_ip,$ orig_master_port,$ orig_master_user,
  $ orig_master_password,$ orig_master_ssh_user,$ new_master_host,
  $ new_master_ip,$ new_master_port,$ new_master_user,
  $ new_master_password,$ new_master_ssh_user
  )
  我的$ vip ='192.168.1.130/24';
  我的$ key ='1';
  我的$ ssh_start_vip =“/ sbin / ifconfig eth0:$ key $ vip”;
  我的$ ssh_stop_vip =“/ sbin / ifconfig eth0:$ key down”;
  我的$ orig_master_ssh_port = 22;
  我的$ new_master_ssh_port = 22;
  exit&main();
  sub current_time_us {
  my($ sec,$ microsec)= gettimeofday();
  我的$ curdate = localtime($ sec);
  返回$ curdate。“”。sprintf(“%06d”,$ microsec);
  }
  sub sleep_until {
  my $ elapsed = tv_interval($ _ tstart);
  if($ _running_interval> $ elapsed){
  sleep($ _running_interval - $ elapsed);
  }
  }
  sub get_threads_util {
  my $ dbh = shift;
  我的$ my_connection_id = shift;
  我的$ running_time_threshold = shift;
  我的$ type = shift;
  $ running_time_threshold = 0,除非($ running_time_threshold);
  $ type = 0,除非($ type);
  我的@threads;
  我的$ sth = $ dbh-> prepare(“SHOW PROCESSLIST”);
  $ sth-> execute();
  while(my $ ref = $ sth-> fetchrow_hashref()){

  my $>  我的$ user = $ ref - > {User};
  我的$ host = $ ref - > {Host};
  我的$ command = $ ref - > {Command};
  我的$ state = $ ref - > {State};
  我的$ query_time = $ ref - > {Time};
  我的$ info = $ ref - > {Info};
  $ info =s / ^ \ s *(。*?)\ s * $ / $ 1 / if defined($ info);

  next if($ my_connection_id == $>  next if(defined($ query_time)&& $ query_time  = 1){
  next if(defined($ command)&& $ command eq“Sleep”);
  next if(defined($ command)&& $ command eq“Connect”);
  }
  if($ type> = 2){
  next if(defined($ info)&& $ info =m / ^ select / i);
  next if(defined($ info)&& $ info =m / ^ show / i);
  }
  推送@threads,$ ref;
  }
  return @threads;
  }
  sub main {
  if($ command eq“stop”){
  ##正当地杀死当前主机上的连接
  #1.在新主机
  #2 上设置read_only = 1。DROP     USER使得没有应用用户可以建立新的连接
  #3。在当前主机
  #4 上设置read_only = 1。杀死当前查询
  #*任何数据库访问失败都会导致脚本死机     。
  我的$ exit_code = 1;
  eval {
  ##在新主机上设置read_only = 1(以避免意外)
  我的$ new_master_handler = new MHA :: DBHelper();
  #args:hostname,port,user,password,raise_error(die_on_error)_or_not
  $ new_master_handler-> connect($ new_master_ip,$ new_master_port,
  $ new_master_user,$ new_master_password,1);
  打印current_time_us()。“设置read_only在新的主人..”;
  $ new_master_handler-> enable_read_only();
  if($ new_master_handler-> is_read_only()){
  print“ok。\ n”;
  }
  else {
  die“Failed!\ n”;
  }
  $ new_master_handler-> disconnect();
  #连接到原始主机,如果发生任何数据库错误,则会死亡
  $ orig_master_handler = new MHA :: DBHelper();
  $ orig_master_handler-> connect($ orig_master_ip,$ orig_master_port,
  $ orig_master_user,$ orig_master_password,1);
  ##删除应用程序用户,以便没有人可以连接。事先禁用每会话binlog
  $ orig_master_handler-> disable_log_bin_local();
  打印current_time_us()。“在原始主机上吸引应用用户.. \ n”;
  #FIXME_xxx_drop_app_user($ orig_master_handler);
  等待N * 100毫秒,以便当前的连接可以退出
  我的$ time_until_read_only = 15;
  $ _tstart = [gettimeofday];
  我的@threads = get_threads_util($ orig_master_handler - > {dbh},
  $ orig_master_handler - > {connection_id});
  while($ time_until_read_only> 0 && $#threads> = 0){
  if($ time_until_read_only%5 == 0){
  printf
  “%s等待所有正在运行的%d线程断开连接..(最大%d毫秒)\ n”
  current_time_us(),$#threads + 1,$ time_until_read_only * 100;
  if($#threads  new([$ _]) - > Indent(0) - > Terse(1) - > Dump。
  “\ n” foreach(@threads);
  }
  }
  sleep_until();
  $ _tstart = [gettimeofday];
  $ time_until_read_only--;
  @threads = get_threads_util($ orig_master_handler - > {dbh},
  $ orig_master_handler - > {connection_id});
  }}
  ##在当前主设备上设置read_only = 1,以便没有人(SUPER除外)可以写入
  print_time_us()。“在原始主机上设置read_only = 1”。
  $ orig_master_handler-> enable_read_only();
  if($ orig_master_handler-> is_read_only()){
  print“ok。\ n”;
  }
  else {
  die“Failed!\ n”;
  }}
  等待M * 100毫秒,以便当前的更新查询可以完成
  我的$ time_until_kill_threads = 5;
  @threads = get_threads_util($ orig_master_handler - > {dbh},
  $ orig_master_handler - > {connection_id});
  while($ time_until_kill_threads> 0 && $#threads> = 0){
  if($ time_until_kill_threads%5 == 0){
  printf
  “%s等待所有运行的%d查询断开连接..(最大%d毫秒)\ n”
  current_time_us(),$#threads + 1,$ time_until_kill_threads * 100;
  if($#threads  new([$ _]) - > Indent(0) - > Terse(1) - > Dump。“\ n”
  foreach(@threads);
  }
  }
  sleep_until();
  $ _tstart = [gettimeofday];
  $ time_until_kill_threads--;
  @threads = get_threads_util($ orig_master_handler - > {dbh},
  $ orig_master_handler - > {connection_id});
  }}
  ## Terminating all threads
  print current_time_us() . " Killing all application threads..\n";
  $orig_master_handler->kill_threads(@threads) if ( $#threads >= 0 );
  print current_time_us() . " done.\n";
  $orig_master_handler->enable_log_bin_local();
  $orig_master_handler->disconnect();
  ## After finishing the script, MHA executes FLUSH TABLES WITH READ LOCK
  eval {
  `ssh -p$orig_master_ssh_port $orig_master_ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
  };
  if ($@) {
  warn $@;
  }
  $exit_code = 0;
  };
  if ($@) {
  warn "Got Error: $@\n";
  exit $exit_code;
  }
  exit $exit_code;
  }
  elsif ( $command eq "start" ) {
  ## Activating master ip on the new master
  # 1. Create app user with write privileges
  # 2. Moving backup script if needed
  # 3. Register new master's ip to the catalog database
  # We don't return error even though activating updatable accounts/ip failed so that we don't interrupt slaves' recovery.
  # If exit code is 0 or 10, MHA does not abort
  my $exit_code = 10;
  eval {
  my $new_master_handler = new MHA::DBHelper();
  # args: hostname, port, user, password, raise_error_or_not
  $new_master_handler->connect( $new_master_ip, $new_master_port,
  $new_master_user, $new_master_password, 1 );
  ## Set read_only=0 on the new master
  $new_master_handler->disable_log_bin_local();
  print current_time_us() . " Set read_only=0 on the new master.\n";
  $new_master_handler->disable_read_only();
  ## Creating an app user on the new master
  print current_time_us() . " Creating app user on the new master..\n";
  #FIXME_xxx_create_app_user($new_master_handler);
  $new_master_handler->enable_log_bin_local();
  $new_master_handler->disconnect();
  ## Update master ip on the catalog database, etc
  `ssh -p$new_master_ssh_port $new_master_ssh_user\@$new_master_host \" $ssh_start_vip \"`;
  $exit_code = 0;
  };
  if ($@) {
  warn "Got Error: $@\n";
  exit $exit_code;
  }
  exit $exit_code;
  }
  elsif ( $command eq "status" ) {
  # do nothing
  exit 0;
  }
  else {
  &usage();
  exit 1;
  }
  }
  sub usage {
  print
  "Usage: master_ip_online_change --command=start|stop|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
  die;
  }
  (3) 编辑脚本send_report,修改后如下:
  #!/usr/bin/perl
  use strict;
  use warnings FATAL => 'all';
  use Mail::Sender;
  use Getopt::Long;
  #new_master_host and new_slave_hosts are set only when recovering master succeeded
  my ( $dead_master_host, $new_master_host, $new_slave_hosts, $subject, $body );
  my $smtp='smtp.163.com';
  my $mail_from='xxxx';
  my $mail_user='xxxxx';
  my $mail_pass='xxxxx';
  my $mail_to=['xxxx','xxxx'];
  GetOptions(
  'orig_master_host=s' => \$dead_master_host,
  'new_master_host=s'  => \$new_master_host,
  'new_slave_hosts=s'  => \$new_slave_hosts,
  'subject=s'          => \$subject,
  'body=s'             => \$body,
  );
  mailToContacts($smtp,$mail_from,$mail_user,$mail_pass,$mail_to,$subject,$body);
  sub mailToContacts {
  my ( $smtp, $mail_from, $user, $passwd, $mail_to, $subject, $msg ) = @_;
  open my $DEBUG, "> /tmp/monitormail.log"
  or die "Can't open the debug      file:$!\n";
  my $sender = new Mail::Sender {
  ctype       => 'text/plain; ',
  encoding    => 'utf-8',
  smtp        => $smtp,
  from        => $mail_from,
  auth        => 'LOGIN',
  TLS_allowed => '0',
  authid      => $user,
  authpwd     => $passwd,
  to          => $mail_to,
  subject     => $subject,
  debug       => $DEBUG
  };
  $sender->MailMsg(
  {   msg   => $msg,
  debug => $DEBUG
  }
  ) or print $Mail::Sender::Error;
  return 1;
  }
  # Do whatever you want here
  exit 0;
  五、MHA的日常管理
  1. 检查SSH配置(node5 Monitor 监控节点上操作),如下:
  # masterha_check_ssh --conf=/etc/masterha/app1.cnf
  Sun May  1 22:05:12 2016 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
  Sun May  1 22:05:12 2016 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
  Sun May  1 22:05:12 2016 - [info] Reading server configuration from /etc/masterha/app1.cnf..
  Sun May  1 22:05:12 2016 - [info] Starting SSH connection tests..
  Sun May  1 22:05:14 2016 - [debug]
  Sun May  1 22:05:12 2016 - [debug]  Connecting via SSH from root@192.168.1.121(192.168.1.121:22) to root@192.168.1.122(192.168.1.122:22)..
  Sun May  1 22:05:13 2016 - [debug]   ok.
  Sun May  1 22:05:13 2016 - [debug]  Connecting via SSH from root@192.168.1.121(192.168.1.121:22) to root@192.168.1.123(192.168.1.123:22)..
  Sun May  1 22:05:13 2016 - [debug]   ok.
  Sun May  1 22:05:14 2016 - [debug]
  Sun May  1 22:05:13 2016 - [debug]  Connecting via SSH from root@192.168.1.122(192.168.1.122:22) to root@192.168.1.121(192.168.1.121:22)..
  Sun May  1 22:05:13 2016 - [debug]   ok.
  Sun May  1 22:05:13 2016 - [debug]  Connecting via SSH from root@192.168.1.122(192.168.1.122:22) to root@192.168.1.123(192.168.1.123:22)..
  Sun May  1 22:05:14 2016 - [debug]   ok.
  Sun May  1 22:05:14 2016 - [debug]
  Sun May  1 22:05:13 2016 - [debug]  Connecting via SSH from root@192.168.1.123(192.168.1.123:22) to root@192.168.1.121(192.168.1.121:22)..
  Sun May  1 22:05:14 2016 - [debug]   ok.
  Sun May  1 22:05:14 2016 - [debug]  Connecting via SSH from root@192.168.1.123(192.168.1.123:22) to root@192.168.1.122(192.168.1.122:22)..
  Sun May  1 22:05:14 2016 - [debug]   ok.
  Sun May  1 22:05:14 2016 - [info] All SSH connection tests passed successfully.
  2. 检查整个复制环境状况(node5 监控节点上操作),如下:
  # masterha_check_repl --conf=/etc/masterha/app1.cnf
  Sun May  1 22:46:44 2016 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
  Sun May  1 22:46:44 2016 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
  Sun May  1 22:46:44 2016 - [info] Reading server configuration from /etc/masterha/app1.cnf..
  Sun May  1 22:46:44 2016 - [info] MHA::MasterMonitor version 0.56.
  Sun May  1 22:46:45 2016 - [info] GTID failover mode = 0
  Sun May  1 22:46:45 2016 - [info] Dead Servers:
  Sun May  1 22:46:45 2016 - [info] Alive Servers:
  Sun May  1 22:46:45 2016 - [info]   192.168.1.121(192.168.1.121:3306)
  Sun May  1 22:46:45 2016 - [info]   192.168.1.122(192.168.1.122:3306)
  Sun May  1 22:46:45 2016 - [info]   192.168.1.123(192.168.1.123:3306)
  Sun May  1 22:46:45 2016 - [info] Alive Slaves:
  Sun May  1 22:46:45 2016 - [info]   192.168.1.122(192.168.1.122:3306)  Version=5.6.29-log (oldest major version between slaves) log-bin:enabled
  Sun May  1 22:46:45 2016 - [info]     Replicating from 192.168.1.121(192.168.1.121:3306)
  Sun May  1 22:46:45 2016 - [info]     Primary candidate for the new Master (candidate_master is set)
  Sun May  1 22:46:45 2016 - [info]   192.168.1.123(192.168.1.123:3306)  Version=5.6.29-log (oldest major version between slaves) log-bin:enabled
  Sun May  1 22:46:45 2016 - [info]     Replicating from 192.168.1.121(192.168.1.121:3306)
  Sun May  1 22:46:45 2016 - [info] Current Alive Master: 192.168.1.121(192.168.1.121:3306)
  Sun May  1 22:46:45 2016 - [info] Checking slave configurations..
  Sun May  1 22:46:45 2016 - [info]  read_only=1 is not set on slave 192.168.1.122(192.168.1.122:3306).

  Sun May  1 22:46:45 2016 - [warning] >  Sun May  1 22:46:45 2016 - [info]  read_only=1 is not set on slave 192.168.1.123(192.168.1.123:3306).

  Sun May  1 22:46:45 2016 - [warning] >  Sun May  1 22:46:45 2016 - [info] Checking replication filtering settings..
  Sun May  1 22:46:45 2016 - [info]  binlog_do_db= , binlog_ignore_db=
  Sun May  1 22:46:45 2016 - [info]  Replication filtering check ok.
  Sun May  1 22:46:45 2016 - [info] GTID (with auto-pos) is not supported
  Sun May  1 22:46:45 2016 - [info] Starting SSH connection tests..
  Sun May  1 22:46:46 2016 - [info] All SSH connection tests passed successfully.
  Sun May  1 22:46:46 2016 - [info] Checking MHA Node version..
  Sun May  1 22:46:47 2016 - [info]  Version check ok.
  Sun May  1 22:46:47 2016 - [info] Checking SSH publickey authentication settings on the current master..
  Sun May  1 22:46:47 2016 - [info] HealthCheck: SSH to 192.168.1.121 is reachable.
  Sun May  1 22:46:47 2016 - [info] Master MHA Node version is 0.56.
  Sun May  1 22:46:47 2016 - [info] Checking recovery script configurations on 192.168.1.121(192.168.1.121:3306)..
  Sun May  1 22:46:47 2016 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/usr/local/mysql/data/ --output_file=/tmp/save_binary_logs_test --manager_version=0.56 --start_file=master-bin.000008
  Sun May  1 22:46:47 2016 - [info]   Connecting to root@192.168.1.121(192.168.1.121:22)..
  Creating /tmp if not exists..    ok.
  Checking output directory is accessible or not..
  ok.
  Binlog found at /usr/local/mysql/data/, up to master-bin.000008
  Sun May  1 22:46:48 2016 - [info] Binlog setting check done.
  Sun May  1 22:46:48 2016 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
  Sun May  1 22:46:48 2016 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=192.168.1.122 --slave_ip=192.168.1.122 --slave_port=3306 --workdir=/tmp --target_version=5.6.29-log --manager_version=0.56 --relay_log_info=/usr/local/mysql/data/relay-log.info  --relay_dir=/usr/local/mysql/data/  --slave_pass=xxx
  Sun May  1 22:46:48 2016 - [info]   Connecting to root@192.168.1.122(192.168.1.122:22)..
  Checking slave recovery environment settings..
  Opening /usr/local/mysql/data/relay-log.info ... ok.

  >
  Temporary>  Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
  done.
  Testing mysqlbinlog output.. done.
  Cleaning up test file(s).. done.
  Sun May  1 22:46:48 2016 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=192.168.1.123 --slave_ip=192.168.1.123 --slave_port=3306 --workdir=/tmp --target_version=5.6.29-log --manager_version=0.56 --relay_log_info=/usr/local/mysql/data/relay-log.info  --relay_dir=/usr/local/mysql/data/  --slave_pass=xxx
  Sun May  1 22:46:48 2016 - [info]   Connecting to root@192.168.1.123(192.168.1.123:22)..
  Checking slave recovery environment settings..
  Opening /usr/local/mysql/data/relay-log.info ... ok.

  >
  Temporary>  Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
  done.
  Testing mysqlbinlog output.. done.
  Cleaning up test file(s).. done.
  Sun May  1 22:46:48 2016 - [info] Slaves settings check done.
  Sun May  1 22:46:48 2016 - [info]
  192.168.1.121(192.168.1.121:3306) (current master)
  +--192.168.1.122(192.168.1.122:3306)
  +--192.168.1.123(192.168.1.123:3306)
  Sun May  1 22:46:48 2016 - [info] Checking replication health on 192.168.1.122..
  Sun May  1 22:46:48 2016 - [info]  ok.
  Sun May  1 22:46:48 2016 - [info] Checking replication health on 192.168.1.123..
  Sun May  1 22:46:48 2016 - [info]  ok.
  Sun May  1 22:46:48 2016 - [info] Checking master_ip_failover_script status:
  Sun May  1 22:46:48 2016 - [info]   /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.1.121 --orig_master_ip=192.168.1.121 --orig_master_port=3306
  IN SCRIPT TEST====/etc/init.d/keepalived stop==/etc/init.d/keepalived start===
  Checking the Status of the script.. OK
  Sun May  1 22:46:48 2016 - [info]  OK.
  Sun May  1 22:46:48 2016 - [warning] shutdown_script is not defined.
  Sun May  1 22:46:48 2016 - [info] Got exit code 0 (Not master dead).
  MySQL Replication Health is OK.
  [root@node5 masterha]#
  3. 开启MHA Manager监控(node5操作)如下:
  # mkdir -p  /var/log/masterha/app1/
  # nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
  参数说明:
  --remove_dead_master_conf     #该参数代表当发生主从切换后,老的主库的ip将会从配置文件中移除。
  --manger_log                  #日志存放位置
  --ignore_last_failover        #在缺省情况下,如果MHA检测到连续发生宕机,会生成app1.failover.complete文件,会造成MHA管理进程无法启动。
  4. 查看MHA Manager监控是否正常:
  # masterha_check_status --conf=/etc/masterha/app1.cnf
  app1 (pid:2480) is running(0:PING_OK), master:192.168.1.121
  5. 查看启动日志(node5操作)如下:
  # tail -n20 /var/log/masterha/app1/manager.log
  6. 关闭MHA Manage监控:
  (1) 关闭
  # masterha_stop --conf=/etc/masterha/app1.cnf
  (2) 启动
  # nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1
  六、MHA Failover切换
  1. 自动Failover切换
  (1) 模拟master mysql关闭
  (2) VIP将会切换到node2
  (3) /etc/masterha/app1.cnf中将原主服务器配置文件清掉。
  (4) masterha_manager监控进程会自动退出关闭,并在/var/log/masterha/app1下生成app1.failover.complete文件,manager.log会记录全过程,从服务器会自动从新的主服务器复制。
  (5) 原主服务器mysqld启动的,需要清掉/var/log/masterha/app1下生成app1.failover.complete文件,添加node1配置文件到/etc/masterha/app1.cnf,通过manager.log中的记录的故障点,重新同步主服务器,成为从节点。
  2. 手动Failover切换
  (1) 先停MHA Manager进程。
  masterha_stop --conf=/etc/masterha/app1.cnf
  (2) 停掉master mysqld
  (3) 手动切换,在Manager主机上操作如下:
  # masterha_master_switch --master_state=dead --conf=/etc/masterha/app1.cnf --dead_master_host=192.168.1.122 --dead_master_port=3306 --new_master_host=192.168.1.121 --new_master_port=3306 --ignore_last_failover
  通过观察日志可以观察切换全过程。
  (4) 如上节方式恢复节点为从服务器。
  3.  正常运行情况下切换(Master正在运行)
  等补充。
  4. 小结
  通过对MMM,MHA的环境搭建测试,MHA由于采用复制架构,原理简单,在一些对数据要求比较高的环境,为了保证可靠性,最好与半同步结合使用。


运维网声明 1、欢迎大家加入本站运维交流群:群②:261659950 群⑤:202807635 群⑦870801961 群⑧679858003
2、本站所有主题由该帖子作者发表,该帖子作者与运维网享有帖子相关版权
3、所有作品的著作权均归原作者享有,请您和我们一样尊重他人的著作权等合法权益。如果您对作品感到满意,请购买正版
4、禁止制作、复制、发布和传播具有反动、淫秽、色情、暴力、凶杀等内容的信息,一经发现立即删除。若您因此触犯法律,一切后果自负,我们对此不承担任何责任
5、所有资源均系网友上传或者通过网络收集,我们仅提供一个展示、介绍、观摩学习的平台,我们不对其内容的准确性、可靠性、正当性、安全性、合法性等负责,亦不承担任何法律责任
6、所有作品仅供您个人学习、研究或欣赏,不得用于商业或者其他用途,否则,一切后果均由您自己承担,我们对此不承担任何法律责任
7、如涉及侵犯版权等问题,请您及时通知我们,我们将立即采取措施予以解决
8、联系人Email:admin@iyunv.com 网址:www.yunweiku.com

所有资源均系网友上传或者通过网络收集,我们仅提供一个展示、介绍、观摩学习的平台,我们不对其承担任何法律责任,如涉及侵犯版权等问题,请您及时通知我们,我们将立即处理,联系人Email:kefu@iyunv.com,QQ:1061981298 本贴地址:https://www.yunweiku.com/thread-611798-1-1.html 上篇帖子: CentOS安装mysql遇到问题 下篇帖子: centos7.2安装Mysql5.7.13-12179708
您需要登录后才可以回帖 登录 | 立即注册

本版积分规则

扫码加入运维网微信交流群X

扫码加入运维网微信交流群

扫描二维码加入运维网微信交流群,最新一手资源尽在官方微信交流群!快快加入我们吧...

扫描微信二维码查看详情

客服E-mail:kefu@iyunv.com 客服QQ:1061981298


QQ群⑦:运维网交流群⑦ QQ群⑧:运维网交流群⑧ k8s群:运维网kubernetes交流群


提醒:禁止发布任何违反国家法律、法规的言论与图片等内容;本站内容均来自个人观点与网络等信息,非本站认同之观点.


本站大部分资源是网友从网上搜集分享而来,其版权均归原作者及其网站所有,我们尊重他人的合法权益,如有内容侵犯您的合法权益,请及时与我们联系进行核实删除!



合作伙伴: 青云cloud

快速回复 返回顶部 返回列表