q4561231 发表于 2018-10-9 09:40:02

MySQL之高可用架构—MHA

  MySQL高可用目前有heartbeat+drbd、MHA、MySQL复制等几种较成熟的方案,heartbeat+drbd的方案可扩展性较差,而且读写都由主服务器负责,从库并不提供读功能,适合于数据增长量不大、一致性要求很高的环境,如银行、金融业等。今天重点讲下MHA的高可用架构。
  MHA是一款优秀的高可用环境下故障切换和主从提升的高可用软件。在MySQL故障切换过程中,MHA能做到0-30秒之内自动完成数据库的故障切换,并且在切换的过程中,最大限度的保证数据的一致性,以达到真正意义上的高可用。MHA高可用建立在MySQL主从复制的基础上,先了解下MySQL复制最常见的两种方式:

[*]  异步复制:主库写入并提交事务之后,把记录写进主库二进制日志即返回客户端,主库和从库的数据存在一定的延迟,这样就存在一定的隐患,当主库提交了一个事务,并且写入了二进制日志,而从库尚未得到主库推送的二进制日志时,此时主库宕机,将造成主从服务器的数据不一致。
[*]  半同步复制:主库在每次提交事务成功时,并不及时反馈给客户端,而是等待其中一个从库也接收到二进制日志并写入中继日志之后,才返回操作成功给客户端。
  MHA组成:

[*]  MHA Manager:管理节点,可以单独的部署在一台独立的服务器上,管理多个master-slave集群,也可以部署在一台Slave上。
[*]  MHA Node:数据节点,运行在每台MySQL服务器上。
  MHA Manager会定时探测集群中的master节点,当master出现故障时,它可以自动将最新数据的slave提升为新master,然后将其它所有的slave重新指向新的master。整个故障转移过程对应用程序是完全透明的。
  MHA工作原理:
  1)从宕机的master保存二进制日志事件
  2)识别含有最新更新的Slave
  3) 应用差异的中继日志到其它从服务器
  4)应用从master保存的二进制日志事件
  5)提升一个新的Slave为master
  6)使其它的Slave连接到新的master并复制
  示例:MHA高可用架构(如果在内网可以关闭防火墙,否则请开启相应的端口)
  Manager:node1:192.168.154.128
  Master:node2:192.168.154.156
  Slave:node3:192.168.154.130
  Slave:node4:192.168.154.154
  一 配置主从复制:
  1)主节点:
  # vim /etc/my.cnf
  innodb_file_per_table=1      #开启独立的表空间
  skip_name_resolve            #禁止域名解析
  log-bin=master-bin
  relay-log=relay-bin
  server-id=1
  # service mysqld restart
  查看二进制日志信息
  mysql> show master status;
  +-------------------+----------+--------------+------------------+
  | File            | Position | Binlog_Do_DB | Binlog_Ignore_DB |
  +-------------------+----------+--------------+------------------+
  | master-bin.000001 |      106 |            |                  |
  +-------------------+----------+--------------+------------------+
  1 row in set (0.00 sec)
  建立授权用户:

  mysql> grant replication slave,replication client on *.* to 'slave'@'192.168.154.%'>  Query OK, 0 rows affected (0.06 sec)
  mysql> flush privileges;
  Query OK, 0 rows affected (0.00 sec)
  2)从节点:
  # vim /etc/my.cnf
  innodb_file_per_table=1
  skip_name_resolve
  log-bin=slave-bin
  relay-log=relay-bin
  server_id=2
  read_only=1
  relay_log_purge=0
  # service mysqld restart
  # vim /etc/my.cnf
  innodb_file_per_table=1
  skip_name_resolve
  log-bin=slave-bin
  relay-log=relay-bin
  server_id=3
  read_only=1                  #开启只读模式
  relay_log_purge=0            #关闭自动清理中继日志
  # service mysqld restart
  设置同步:
  mysql> change master to master_host='192.168.154.156',master_user='slave',master_password='slave',master_log_file='master-bin.000001',master_log_pos=106;
  Query OK, 0 rows affected (0.03 sec)
  mysql> start slave;
  Query OK, 0 rows affected (0.01 sec)
  mysql> show slave status\G
  *************************** 1. row ***************************
  Slave_IO_State: Waiting for master to send event
  Master_Host: 192.168.154.156
  Master_User: slave
  Master_Port: 3306
  Connect_Retry: 60
  Master_Log_File: master-bin.000001
  Read_Master_Log_Pos: 354

  Relay_Log_File:>
  >
  >  Slave_IO_Running: Yes
  Slave_SQL_Running: Yes
  Replicate_Do_DB:
  Replicate_Ignore_DB:
  Replicate_Do_Table:
  Replicate_Ignore_Table:
  Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
  Last_Errno: 0
  Last_Error:
  Skip_Counter: 0
  Exec_Master_Log_Pos: 354

  >  Until_Condition: None
  Until_Log_File:
  Until_Log_Pos: 0
  Master_SSL_Allowed: No
  Master_SSL_CA_File:
  Master_SSL_CA_Path:
  Master_SSL_Cert:
  Master_SSL_Cipher:
  Master_SSL_Key:
  Seconds_Behind_Master: 0
  Master_SSL_Verify_Server_Cert: No
  Last_IO_Errno: 0
  Last_IO_Error:
  Last_SQL_Errno: 0
  Last_SQL_Error:
  1 row in set (0.00 sec)
  3)在master节点上创建具有管理权限的账号

  mysql> grant all on *.* to 'zwj'@'192.168.154.%'>  Query OK, 0 rows affected (0.00 sec)
  二 配置集群间的密钥登陆
  在node1上:
  # ssh-copy-id -i ./.ssh/id_rsa.pub root@192.168.154.156
  # ssh-copy-id -i ./.ssh/id_rsa.pub root@192.168.154.130
  # ssh-copy-id -i ./.ssh/id_rsa.pub root@192.168.154.154
  # ssh 192.168.154.154 'ifconfig'            #验证
  eth0      Link encap:EthernetHWaddr 00:0C:29:67:65:ED
  inet addr:192.168.154.154Bcast:192.168.154.255Mask:255.255.255.0
  inet6 addr: fe80::20c:29ff:fe67:65ed/64 Scope:Link
  UP BROADCAST RUNNING MULTICASTMTU:1500Metric:1
  RX packets:26253 errors:0 dropped:0 overruns:0 frame:0
  TX packets:42416 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000
  RX bytes:23453164 (22.3 MiB)TX bytes:2514457 (2.3 MiB)
  Interrupt:19 Base address:0x2024
  在node2上:
  # ssh-keygen -t rsa
  # ssh-copy-id -i ./.ssh/id_rsa.pub root@192.168.154.128
  # ssh-copy-id -i ./.ssh/id_rsa.pub root@192.168.154.130
  # ssh-copy-id -i ./.ssh/id_rsa.pub root@192.168.154.154
  在node3上:
  # ssh-keygen -t rsa
  # ssh-copy-id -i ./.ssh/id_rsa.pub root@192.168.154.128
  # ssh-copy-id -i ./.ssh/id_rsa.pub root@192.168.154.156
  # ssh-copy-id -i ./.ssh/id_rsa.pub root@192.168.154.154
  在node4上:
  # ssh-keygen -t rsa
  # ssh-copy-id -i ./.ssh/id_rsa.pub root@192.168.154.128
  # ssh-copy-id -i ./.ssh/id_rsa.pub root@192.168.154.156
  # ssh-copy-id -i ./.ssh/id_rsa.pub root@192.168.154.130
  三 安装MHA Manager,在node1上:
# yum install perl-DBD-MySQL -y  # tar -zxf mha4mysql-node-0.56.tar.gz
  # cd mha4mysql-node-0.56
  # perl Makefile.PL
  # make
  # make install
  # yum install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes -y      #安装MHA Manger依赖的perl模块
  # tar -zxf mha4mysql-manager-0.56.tar.gz
  # cd mha4mysql-manager-0.56
  # perl Makefile.PL
  # make
  # make install
  四 安装MySQL node(在所有MySQL服务器上)
#yum install perl-DBD-MySQL -y  # cd mha4mysql-node-0.56/
  # perl Makefile.PL
  # make
  # make install
  五 创建工作目录,配置MHA:
  # mkdir -pv /etc/masterha
  # vim /etc/masterha/appl.cnf
  
  user=zwj
  password=zwj
  manager_workdir=/etc/masterha/appl
  manager_log=/etc/masterha/appl/manager.log
  remote_workdir=/etc/masterha/appl
  ssh_user=root
  repl_user=slave
  repl_password=slave
  ping_interval=1
  
  hostname=192.168.154.156
  
  hostname=192.168.154.130
  candidate_master=1                        #设置为备选的master
  
  hostname=192.168.154.154
  六 检查SSH连接状态:
  # masterha_check_ssh --conf=/etc/masterha/appl.cnf
  Wed May 10 00:12:58 2017 - Global configuration file /etc/masterha_default.cnf not found. Skipping.
  Wed May 10 00:12:58 2017 - Reading application default configuration from /etc/masterha/appl.cnf..
  Wed May 10 00:12:58 2017 - Reading server configuration from /etc/masterha/appl.cnf..
  Wed May 10 00:12:58 2017 - Starting SSH connection tests..
  Wed May 10 00:13:15 2017 -
  Wed May 10 00:12:59 2017 - Connecting via SSH from root@192.168.154.154(192.168.154.154:22) to root@192.168.154.156(192.168.154.156:22)..
  Wed May 10 00:13:05 2017 -    ok.
  Wed May 10 00:13:05 2017 - Connecting via SSH from root@192.168.154.154(192.168.154.154:22) to root@192.168.154.130(192.168.154.130:22)..
  Wed May 10 00:13:15 2017 -    ok.
  Wed May 10 00:13:20 2017 -
  Wed May 10 00:12:58 2017 - Connecting via SSH from root@192.168.154.130(192.168.154.130:22) to root@192.168.154.156(192.168.154.156:22)..
  Wed May 10 00:13:11 2017 -    ok.
  Wed May 10 00:13:11 2017 - Connecting via SSH from root@192.168.154.130(192.168.154.130:22) to root@192.168.154.154(192.168.154.154:22)..
  Wed May 10 00:13:20 2017 -    ok.
  Wed May 10 00:13:35 2017 -
  Wed May 10 00:12:58 2017 - Connecting via SSH from root@192.168.154.156(192.168.154.156:22) to root@192.168.154.130(192.168.154.130:22)..
  Wed May 10 00:13:15 2017 -    ok.
  Wed May 10 00:13:15 2017 - Connecting via SSH from root@192.168.154.156(192.168.154.156:22) to root@192.168.154.154(192.168.154.154:22)..
  Wed May 10 00:13:35 2017 -    ok.
  Wed May 10 00:13:35 2017 - All SSH connection tests passed successfully.
  七 检查整个复制环境:
  # masterha_check_repl --conf=/etc/masterha/appl.cnf
  ...
  192.168.154.156(192.168.154.156:3306) (current master)
  +--192.168.154.130(192.168.154.130:3306)
  +--192.168.154.154(192.168.154.154:3306)
  Wed May 10 00:33:36 2017 - Checking replication health on 192.168.154.130..
  Wed May 10 00:33:36 2017 - ok.
  Wed May 10 00:33:36 2017 - Checking replication health on 192.168.154.154..
  Wed May 10 00:33:36 2017 - ok.
  Wed May 10 00:33:36 2017 - master_ip_failover_script is not defined.
  Wed May 10 00:33:36 2017 - shutdown_script is not defined.
  Wed May 10 00:33:36 2017 - Got exit code 0 (Not master dead).
  MySQL Replication Health is OK.
  八 开启MHA Manager监控:
  # nohup masterha_manager --conf=/etc/masterha/appl.cnf > /etc/masterha/appl/manager.log 2>&1 &
   8300
  查看MHA Manager监控:
  # masterha_check_status --conf=/etc/masterha/appl.cnf
  appl (pid:8300) is running(0:PING_OK), master:192.168.154.156
  关闭MHA Manager监控:
  # masterha_stop --conf=/etc/masterha/appl.cnf
  Stopped appl successfully.
  +Exit 1                  nohup masterha_manager --conf=/etc/masterha/appl.cnf > /etc/masterha/appl/manager.log 2>&1
  九 模拟主库宕机:
  # service mysqld stop
  Stopping mysqld:                                          
  查看slave(node4),可见master已发生变化,
  ...
  mysql> show slave status\G
  *************************** 1. row ***************************
  Slave_IO_State: Waiting for master to send event
  Master_Host: 192.168.154.130
  Master_User: slave
  Master_Port: 3306
  Connect_Retry: 60
  Master_Log_File: slave-bin.000003
  Read_Master_Log_Pos: 106

  Relay_Log_File:>
  >
  >  Slave_IO_Running: Yes
  Slave_SQL_Running: Yes


页: [1]
查看完整版本: MySQL之高可用架构—MHA