wss1051 posted on 2018-10-1 10:49:26

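Corosync + Pacemaker + DRBD + MySQL High-Availability (HA) Configuration

  Operating system: CentOS 6.6 x64. This article installs corosync, pacemaker, and drbd from RPM packages and installs mysql-5.6.29 from the binary tarball. The procedure builds on an earlier Corosync+Pacemaker+DRBD+NFS high-availability setup, modified for MySQL and then tested.
I. Two-node configuration
1. Configure the hosts file and the hostname on app1 and app2.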
  # vi /etc/hosts
  127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
  ::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
  192.168.0.24         app1
  192.168.0.25         app2
  10.10.10.24          app1-priv
  10.10.10.25          app2-priv
  Note: the 10.10.10.x network carries the heartbeat, the 192.168.0.x network carries service traffic, and the VIP is 192.168.0.26.
2. Disable SELinux and the firewall
  sed -i '/SELINUX/s/enforcing/disabled/' /etc/selinux/config
  setenforce 0
  chkconfig iptables off
  service iptables stop
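3. Set up SSH key trust between the nodes. This appears to be optional, but it makes management easier.
  app1:
  # ssh-keygen -t rsa -f ~/.ssh/id_rsa -P ''
  # ssh-copy-id -i ~/.ssh/id_rsa.pub root@app2
  app2:
  # ssh-keygen -t rsa -f ~/.ssh/id_rsa -P ''
  # ssh-copy-id -i ~/.ssh/id_rsa.pub root@app1
II. DRBD installation and configuration
1. On app1 and app2, configure the hosts file and prepare the disk partition
  app1: /dev/sdb1 -> app2: /dev/sdb1
2. Install drbd on app1 and app2
(1) Download the drbd packages. On CentOS 6.6, only the kmod-drbd84-8.4.5-504.1 package works.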
  http://rpm.pbone.net/
  drbd84-utils-8.9.1-1.el6.elrepo.x86_64.rpm
  kmod-drbd84-8.4.5-504.1.el6.x86_64.rpm
  # rpm -ivh drbd84-utils-8.9.5-1.el6.elrepo.x86_64.rpm kmod-drbd84-8.4.5-504.1.el6.x86_64.rpm
  Preparing...                ###########################################
  1:drbd84-utils         ########################################### [ 50%]
  2:kmod-drbd84            ###########################################
  Working. This may take some time ...
  Done.
  #
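(2) Load the DRBD kernel module
  Run the following on both app1 and app2, and add the modprobe line to /etc/rc.local.
  modprobe drbd
  lsmod | grep drbd
3. Create and edit the configuration file. Node 1 and node 2 use the same configuration.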
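  Adding the modprobe line to /etc/rc.local by hand is easy to do twice. A minimal sketch of an idempotent append follows; the helper name append_once is hypothetical and not part of the original procedure.

```shell
# Append a line to a file only if that exact line is not already present.
# append_once is a hypothetical helper name chosen for this sketch.
append_once() {
    line=$1
    file=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# On a real node (as root) this would be called as:
#   append_once 'modprobe drbd' /etc/rc.local
```

  Calling the helper a second time with the same arguments leaves the file unchanged.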
  # vi /etc/drbd.d/global_common.conf
  global {
      usage-count no;
  }
  common {
      protocol C;
      disk {
          on-io-error detach;
          no-disk-flushes;
          no-md-flushes;
      }
      net {
          sndbuf-size 512k;
          max-buffers 8000;
          unplug-watermark 1024;
          max-epoch-size 8000;
          cram-hmac-alg "sha1";
          shared-secret "hdhwXes23sYEhart8t";
          after-sb-0pri disconnect;
          after-sb-1pri disconnect;
          after-sb-2pri disconnect;
          rr-conflict disconnect;
      }
      syncer {
          rate 300M;
          al-extents 517;
      }
  }
  resource data {
      on app1 {
          device    /dev/drbd0;
          disk      /dev/sdb1;
          address   10.10.10.24:7788;
          meta-disk internal;
      }
      on app2 {
          device    /dev/drbd0;
          disk      /dev/sdb1;
          address   10.10.10.25:7788;
          meta-disk internal;
      }
  }
4. Initialize the resource
  Run on both app1 and app2:
  # drbdadm create-md data
  initializing activity log
  NOT initializing bitmap
  Writing meta data...
  New drbd meta data block successfully created.
5. Start the service
  Run on both app1 and app2 (drbdadm up data can be used instead):
  # service drbd start
  Starting DRBD resources: [
  create res: data
  prepare disk: data
  adjust disk: data
  adjust net: data
  ]
  ..........
  #
6. Check the status after startup. Both nodes should be in the Secondary state.
  cat /proc/drbd       # or simply run drbd-overview
  Node 1:
  # cat /proc/drbd
  version: 8.4.5 (api:1/proto:86-101)
  GIT-hash: 1d360bde0e095d495786eaeb2a1ac76888e4db96 build by root@node1.magedu.com, 2015-01-02 12:06:20
  0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
  ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:20964116
  Node 2:
  # cat /proc/drbd
  version: 8.4.5 (api:1/proto:86-101)
  GIT-hash: 1d360bde0e095d495786eaeb2a1ac76888e4db96 build by root@node1.magedu.com, 2015-01-02 12:06:20
  0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
  ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:20964116
7. Promote one node to primary
  One of the nodes must be set to Primary. On the node that is to become Primary, run either one of the following two commands:
  drbdadm -- --overwrite-data-of-peer primary data
  drbdadm primary --force data
  Check the synchronization status on the primary node:
  # cat /proc/drbd
  version: 8.4.5 (api:1/proto:86-101)
  GIT-hash: 1d360bde0e095d495786eaeb2a1ac76888e4db96 build by root@node1.magedu.com, 2015-01-02 12:06:20
  0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
  ns:1229428 nr:0 dw:0 dr:1230100 al:0 bm:0 lo:0 pe:2 ua:0 ap:0 ep:1 wo:d oos:19735828
  [>...................] sync'ed:5.9% (19272/20472)M
  finish: 0:27:58 speed: 11,744 (11,808) K/sec
  #
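  The connection state, roles, and disk states in the /proc/drbd output above can also be read from a script, for example to wait for the initial sync to finish. The following is a sketch only; the function name drbd_field is hypothetical, and it assumes the 8.4-style /proc/drbd layout shown above.

```shell
# Print one status field (cs, ro or ds) of a DRBD minor, reading
# /proc/drbd-style text on stdin. drbd_field is a hypothetical helper name.
drbd_field() {
    minor=$1
    field=$2
    awk -v m="$minor:" -v f="$field:" '
        $1 == m {
            for (i = 2; i <= NF; i++)
                if (index($i, f) == 1) { print substr($i, length(f) + 1); exit }
        }'
}

# On a real node:
#   drbd_field 0 ds < /proc/drbd
# Waiting until both sides are up to date could look like:
#   until [ "$(drbd_field 0 ds < /proc/drbd)" = "UpToDate/UpToDate" ]; do sleep 10; done
```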
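8. Create the file system
  The file system can be mounted only on the Primary node, and the drbd device can be formatted only after a primary node has been set. Format the device, then test a manual mount.
  # mkfs.ext4 /dev/drbd0
  # mkdir -p /data
  # mount /dev/drbd0 /data
III. Install and configure MySQL 5.6.x
1. On app1 and app2, download and install the precompiled (binary) MySQL release
  wget http://mirrors.sohu.com/mysql/MySQL-5.6/mysql-5.6.29-linux-glibc2.5-x86_64.tar.gz
  tar zxvf mysql-5.6.29-linux-glibc2.5-x86_64.tar.gz -C /usr/local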
  cd /usr/local/
  ln -sv mysql-5.6.29-linux-glibc2.5-x86_64 mysql
  groupadd mysql
  useradd -g mysql -M -s /sbin/nologin mysql
  chown -R mysql:mysql /usr/local/mysql
2. On app1, initialize the database (the data directory lives on the drbd0-replicated mount)
  /usr/local/mysql/scripts/mysql_install_db --user=mysql --basedir=/usr/local/mysql --datadir=/data/mysql3306
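3. On app1 and app2, create the configuration file and the service
  cd /usr/local/mysql
  cp support-files/my-default.cnf /etc/my.cnf
  cp support-files/mysql.server /etc/rc.d/init.d/mysqld
  chkconfig --add mysqld
4. On app1 and app2, create symlinks for the MySQL commands. If the bin directory is added to the PATH environment variable instead, the symlinks can be skipped.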
  ln -sf /usr/local/mysql/bin/mysql /usr/bin/mysql
  ln -sf /usr/local/mysql/bin/mysqldump /usr/bin/mysqldump
  ln -sf /usr/local/mysql/bin/myisamchk /usr/bin/myisamchk
  ln -sf /usr/local/mysql/bin/mysqld_safe /usr/bin/mysqld_safe
  Alternatively, add the bin directory to the environment:
  # vi /etc/profile
  export PATH=/usr/local/mysql/bin/:$PATH
  # source /etc/profile
  ln -sv /usr/local/mysql/include /usr/include/mysql
  echo '/usr/local/mysql/lib' > /etc/ld.so.conf.d/mysql.conf
  ldconfig
5. MySQL configuration file on app1 (keep the configuration file identical on both nodes)
  vi /etc/my.cnf
  [client]
  port                    = 3306
  default-character-set   = utf8
  socket                  = /tmp/mysql.sock
  
  [mysqld]
  character-set-server    = utf8
  collation-server        = utf8_general_ci
  port                    = 3306
  socket                  = /tmp/mysql.sock
  basedir                 = /usr/local/mysql
  datadir                 = /data/mysql3306
  skip-external-locking
  key_buffer_size         = 16M
  max_allowed_packet      = 1M
  table_open_cache        = 64
  sort_buffer_size        = 512K
  net_buffer_length       = 8K
  read_buffer_size        = 256K
  read_rnd_buffer_size    = 512K
  myisam_sort_buffer_size = 8M
  log-bin                 = mysql-bin
  binlog_format           = mixed
  server-id               = 1
  
  [mysqldump]
  quick
  max_allowed_packet = 16M
  
  [mysql]
  no-auto-rehash
  
  [myisamchk]
  key_buffer_size = 20M
  sort_buffer_size = 20M
  read_buffer = 2M
  write_buffer = 2M
  
  [mysqlhotcopy]
  interactive-timeout
6. Start mysql; do not enable it at boot.
  service mysqld start
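7. Change the administrator password and test it
  # /usr/local/mysql/bin/mysqladmin -u root password 'admin' # set the administrator password
  # /usr/local/mysql/bin/mysql -u root -p   # test logging in with the password
8. Copy the configuration file to app2
  # scp /etc/my.cnf app2:/etc/
9. On app1, stop mysql and disable it at boot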
  # service mysqld stop
  # chkconfig mysqld off
10. Make DRBD on app2 the primary node and mount it
(1) On app1, unmount /dev/drbd0
  # umount /data/
  # drbdadm secondary data
  # drbd-overview
  0:web/0  Connected Secondary/Secondary UpToDate/UpToDate C r-----
(2) On app2, make drbd primary, then test starting mysql.
  # drbdadm primary data
  # drbd-overview
  0:web/0  Connected Primary/Secondary UpToDate/UpToDate C r-----
  # mkdir /data
  # mount /dev/drbd0 /data/
  # service mysqld start
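IV. corosync + pacemaker
1. Install corosync and pacemaker on app1 and app2
  # yum install corosync pacemaker -y
2. Install crmsh on app1 and app2
  Since RHEL 6.4, the crmsh command-line cluster configuration tool is no longer shipped, so crmsh has to be installed separately to manage cluster resources.
  The crmsh RPMs can be downloaded from: http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6/
  # yum install python-dateutil -y
  Note: python-pssh and pssh depend on the python-dateutil package.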
  # rpm -ivh pssh-2.3.1-4.2.x86_64.rpm python-pssh-2.3.1-4.2.x86_64.rpm crmsh-2.1-1.6.x86_64.rpm

  warning: pssh-2.3.1-4.2.x86_64.rpm: Header V3 RSA/SHA1 Signature, key
  Preparing...                ###########################################
  1:python-pssh            ########################################### [ 33%]
  2:pssh                   ########################################### [ 67%]
  3:crmsh                  ###########################################
  #
  #
3. Create the corosync configuration file; it is identical on app1 and app2.
  cd /etc/corosync/
  cp corosync.conf.example corosync.conf
  vi /etc/corosync/corosync.conf
  # Please read the corosync.conf.5 manual page
  compatibility: whitetank
  totem {
      version: 2
      secauth: on
      threads: 0
      interface {
          ringnumber: 0
          bindnetaddr: 10.10.10.0
          mcastaddr: 226.94.8.8
          mcastport: 5405
          ttl: 1
      }
  }
  logging {
      fileline: off
      to_stderr: no
      to_logfile: yes
      to_syslog: no
      logfile: /var/log/cluster/corosync.log
      debug: off
      timestamp: on
      logger_subsys {
          subsys: AMF
          debug: off
      }
  }
  amf {
      mode: disabled
  }
  service {
      ver: 1
      name: pacemaker
  }
  aisexec {
      user: root
      group: root
  }
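4. Create the authentication key; app1 and app2 use the same key
  Communication between the nodes is authenticated with a shared key. Once generated, the key is saved automatically as authkey (in /etc/corosync, the current directory here) with mode 400.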
  # corosync-keygen
  Corosync Cluster Engine Authentication key generator.
  Gathering 1024 bits for key from /dev/random.
  Press keys on your keyboard to generate entropy.
  Press keys on your keyboard to generate entropy (bits = 128).
  Press keys on your keyboard to generate entropy (bits = 192).
  Press keys on your keyboard to generate entropy (bits = 256).
  Press keys on your keyboard to generate entropy (bits = 320).
  Press keys on your keyboard to generate entropy (bits = 384).
  Press keys on your keyboard to generate entropy (bits = 448).
  Press keys on your keyboard to generate entropy (bits = 512).
  Press keys on your keyboard to generate entropy (bits = 576).
  Press keys on your keyboard to generate entropy (bits = 640).
  Press keys on your keyboard to generate entropy (bits = 704).
  Press keys on your keyboard to generate entropy (bits = 768).
  Press keys on your keyboard to generate entropy (bits = 832).
  Press keys on your keyboard to generate entropy (bits = 896).
  Press keys on your keyboard to generate entropy (bits = 960).
  Writing corosync key to /etc/corosync/authkey.
  #
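5. Copy the authentication key and the configuration file to app2. Because bindnetaddr is a network address (10.10.10.0), corosync.conf needs no per-node changes.
  # scp -p authkey corosync.conf root@app2:/etc/corosync/
6. Start the corosync and pacemaker services and check that they run correctly
  Node 1: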
  # service corosync start
  Starting Corosync Cluster Engine (corosync):               
  # service pacemaker start
  Starting Pacemaker Cluster Manager                        
  Enable the services at boot:
  chkconfig corosync on
  chkconfig pacemaker on
  Node 2:
  # service corosync start
  Starting Corosync Cluster Engine (corosync):               
  # service pacemaker start
  Starting Pacemaker Cluster Manager                        
  Enable the services at boot:
  chkconfig corosync on
  chkconfig pacemaker on
7. Verify the corosync, pacemaker, and crmsh installation
(1) Check the node status
  # crm status
  Last updated: Tue Jan 26 13:13:19 2016
  Last change: Mon Jan 25 17:46:04 2016 via cibadmin on app1

  Stack:
  Current DC: app1 - partition with quorum
  Version: 1.1.10-14.el6-368c726
  2 Nodes configured, 2 expected votes
  0 Resources configured
  Online: [ app1 app2 ]
(2) Check the listening ports
  # netstat -tunlp
  Active Internet connections (only servers)
  Proto Recv-Q Send-Q Local Address               Foreign Address             State       PID/Program name
  udp      0      0 10.10.10.25:5404            0.0.0.0:*                               2828/corosync
  udp      0      0 10.10.10.25:5405            0.0.0.0:*                               2828/corosync
  udp      0      0 226.94.8.8:5405             0.0.0.0:*                               2828/corosync
(3) Check the log
  # tail -f /var/log/cluster/corosync.log
  Key messages to look for in the log:
  Jan 23 16:09:30 corosync Corosync Cluster Engine ('1.4.7'): started and ready to provide service.
  Jan 23 16:09:30 corosync Successfully read main configuration file '/etc/corosync/corosync.conf'.
  ....
  Jan 23 16:09:30 corosync Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
  Jan 23 16:09:31 corosync The network interface is now up.
  Jan 23 16:09:31 corosync A processor joined or left the membership and a new membership was formed.
  Jan 23 16:09:48 corosync A processor joined or left the membership and a new membership was formed.
  #
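V. Configure pacemaker
1. Basic configuration
  Pacemaker enables STONITH by default, but the cluster being configured here has no STONITH device, so STONITH must be disabled when setting the cluster-wide properties.
  # crm
  crm(live)# configure                                      ## enter configuration mode
  crm(live)configure# property stonith-enabled=false        ## disable STONITH
  crm(live)configure# property no-quorum-policy=ignore      ## action to take when quorum is lost
  crm(live)configure# rsc_defaults resource-stickiness=100  ## default resource stickiness: how strongly a resource prefers to stay on its current node
  crm(live)configure# verify                                ## validate
  crm(live)configure# commit                                ## commit only if validation reports no errors
  crm(live)configure# show                                  ## show the current configuration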
  node app1
  node app2
  property cib-bootstrap-options: \
  dc-version=1.1.11-97629de \
  cluster-infrastructure="classic openais (with plugin)" \
  expected-quorum-votes=2 \
  stonith-enabled=false \
  default-resource-stickiness=100 \
  no-quorum-policy=ignore
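2. Resource configuration
  # Usage tip: if verify reports an error, either quit without committing or use edit to correct the configuration until it is right.
  # crm configure edit opens the configuration directly in an editor
(1) Add the VIP
  Do not commit resources one at a time; commit once all resources and constraints have been defined.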
  crm(live)configure# primitive vip ocf:heartbeat:IPaddr params ip=192.168.0.26 cidr_netmask=24 nic=eth0:1 op monitor interval=30s timeout=20s on-fail=restart
  crm(live)configure# verify
(2) Add the DRBD service
  crm(live)configure# primitive mydrbd ocf:linbit:drbd params drbd_resource=data op monitor role=Master interval=20 timeout=30 op monitor role=Slave interval=30 timeout=30 op start timeout=240 op stop timeout=100
  crm(live)configure# verify
  Make drbd a master/slave resource:
  crm(live)configure# ms ms_mydrbd mydrbd meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
  crm(live)configure# verify
(3) File system mount service:
  crm(live)configure# primitive mystore ocf:heartbeat:Filesystem params device=/dev/drbd0 directory=/data fstype=ext4 op start timeout=60s op stop timeout=60s op monitor interval=30s timeout=40s on-fail=restart
  crm(live)configure# verify
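(4) Create the constraints. This is the key step: the VIP, the DRBD master, and the mounted directory must all be on the same node, and both the VIP and the mount depend on the DRBD master.
  Create a resource group containing vip and mystore.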
  crm(live)configure# group g_service vip mystore
  crm(live)configure# verify
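  Create a colocation constraint: the group must run on the node where drbd is master
  crm(live)configure# colocation c_g_service inf: g_service ms_mydrbd:Master
  Create a colocation constraint: the mystore mount must run on the node where drbd is master
  crm(live)configure# colocation mystore_with_drbd_master inf: mystore ms_mydrbd:Master
  Create an order constraint: the g_service group starts only after drbd has been promoted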
  crm(live)configure# order o_g_service inf: ms_mydrbd:promote g_service:start
  crm(live)configure# verify
  crm(live)configure# commit
(5) Add the mysql resource
  crm(live)# configure
  crm(live)configure# primitive mysqld lsb:mysqld op monitor interval=20 timeout=20 on-fail=restart
  Colocate the mysql service with the g_service group
  crm(live)configure# colocation mysqld_with_g_service inf: mysqld g_service
  crm(live)configure# verify
  crm(live)configure# show
  Create the start order: the mysql service starts after the g_service group
  crm(live)configure# order mysqld_after_g_service mandatory: g_service mysqld
  crm(live)configure# verify
  crm(live)configure# show
  crm(live)configure# commit
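  The resources and constraints entered interactively above can also be kept in one file and loaded in a single step, which makes the "commit everything together" advice easier to follow. The sketch below only writes the file; the path /tmp/mysql-ha.crm and the CRM_FILE variable are assumptions made for this example, and the definitions are the ones from sub-steps (1) to (5).

```shell
# Write all resource and constraint definitions from this section to one file.
# CRM_FILE and its default path are hypothetical names for this sketch.
CRM_FILE=${CRM_FILE:-/tmp/mysql-ha.crm}

cat > "$CRM_FILE" <<'EOF'
primitive vip ocf:heartbeat:IPaddr params ip=192.168.0.26 cidr_netmask=24 nic=eth0:1 op monitor interval=30s timeout=20s on-fail=restart
primitive mydrbd ocf:linbit:drbd params drbd_resource=data op monitor role=Master interval=20 timeout=30 op monitor role=Slave interval=30 timeout=30 op start timeout=240 op stop timeout=100
ms ms_mydrbd mydrbd meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
primitive mystore ocf:heartbeat:Filesystem params device=/dev/drbd0 directory=/data fstype=ext4 op start timeout=60s op stop timeout=60s op monitor interval=30s timeout=40s on-fail=restart
group g_service vip mystore
colocation c_g_service inf: g_service ms_mydrbd:Master
colocation mystore_with_drbd_master inf: mystore ms_mydrbd:Master
order o_g_service inf: ms_mydrbd:promote g_service:start
primitive mysqld lsb:mysqld op monitor interval=20 timeout=20 on-fail=restart
colocation mysqld_with_g_service inf: mysqld g_service
order mysqld_after_g_service mandatory: g_service mysqld
EOF

echo "wrote $CRM_FILE"
# On a cluster node the file could then be applied with:
#   crm configure load update "$CRM_FILE"
```

  Keeping the definitions in a file also makes it possible to review them or put them under version control before they reach the live cluster.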
3. After configuration, check the status
  # crm status
  Last updated: Fri Apr 29 14:59:14 2016
  Last change: Fri Apr 29 14:59:05 2016 via cibadmin on app1

  Stack:
  Current DC: app1 - partition with quorum
  Version: 1.1.10-14.el6-368c726
  2 Nodes configured, 2 expected votes
  5 Resources configured
  Online: [ app1 app2 ]
  Master/Slave Set: ms_mydrbd
  Masters: [ app1 ]
  Slaves: [ app2 ]
  mysqld (lsb:mysqld):   Started app1
  Resource Group: g_service
  vip      (ocf::heartbeat:IPaddr):      Started app1
  mystore    (ocf::heartbeat:Filesystem):    Started app1
  #
4. Simulate a failover
(1) Put app1 into standby (run on app1)
  # crm node standby app1
(2) Check the status again on app1: every resource has moved over successfully.
  # crm status
  Last updated: Fri Apr 29 15:12:01 2016
  Last change: Fri Apr 29 15:01:49 2016 via crm_attribute on app1

  Stack:
  Current DC: app1 - partition with quorum
  Version: 1.1.10-14.el6-368c726
  2 Nodes configured, 2 expected votes
  5 Resources configured
  Node app1: standby
  Online: [ app2 ]
  Master/Slave Set: ms_mydrbd
  Masters: [ app2 ]
  Stopped: [ app1 ]
  mysqld (lsb:mysqld):   Started app2
  Resource Group: g_service
  vip      (ocf::heartbeat:IPaddr):      Started app2
  mystore    (ocf::heartbeat:Filesystem):    Started app2
  #
(3) The MySQL login can now be tested on app2:
  # mysql -uroot -padmin
  Warning: Using a password on the command line interface can be insecure.
  Welcome to the MySQL monitor.  Commands end with ; or \g.
  Your MySQL connection
  Server version: 5.6.29-log MySQL Community Server (GPL)
  Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.
  Oracle is a registered trademark of Oracle Corporation and/or its
  affiliates. Other names may be trademarks of their respective
  owners.
  Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
  mysql> \q
  Bye
(4) On app2, check that the drbd directory is mounted
  # df -h
  Filesystem                    Size  Used Avail Use% Mounted on
  /dev/mapper/vg_app2-lv_root    36G  5.0G   29G  16% /
  tmpfs                        1004M   29M  976M   3% /dev/shm
  /dev/sda1                     485M   39M  421M   9% /boot
  /dev/drbd0                    5.0G  249M  4.5G   6% /data
  #
  #
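  # Note: during switchover tests a warning sometimes appears and obscures the real status. It can be cleared as shown below: clean up whichever resource is reported, then run crm status again and the status displays normally.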
  Failed actions:
  mystore_stop_0 on app1 'unknown error' (1): call=97, status=complete, last-rc-change='Tue Jan 26 14:39:21 2016', queued=6390ms, exec=0ms
  # crm resource cleanup mystore
  Cleaning up mystore on app1
  Cleaning up mystore on app2
  Waiting for 2 replies from the CRMd.. OK
  #
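  After a switchover it helps to confirm that vip, mystore, and mysqld all ended up on the same node. The following is a small sketch that reads crm status text on stdin; the function name resource_node is hypothetical, and it assumes the "name (agent): Started node" line layout shown in the status output above.

```shell
# Print the node a resource is reported as "Started" on, reading
# `crm status` text from stdin. resource_node is a hypothetical helper name.
resource_node() {
    awk -v r="$1" '$1 == r && $(NF - 1) == "Started" { print $NF; exit }'
}

# On a cluster node:
#   crm status | resource_node mysqld
# Checking all three resources in one go could look like:
#   status=$(crm status)
#   for r in vip mystore mysqld; do
#       printf '%s -> %s\n' "$r" "$(printf '%s\n' "$status" | resource_node "$r")"
#   done
```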
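5. Configuration summary
  The biggest issue during a switchover is DRBD synchronization. The data lives on disk, so if the two nodes are out of sync the data becomes inconsistent. A standby-based switchover does not truly simulate a DRBD failover: after a real failover, once drbd has stopped on the failed node and the secondary has taken over as primary, the stopped peer shows up in the Unknown state, and the data then has to be resynchronized.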
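
  The original post does not show the resynchronization commands. As a hedged sketch, assuming the failed node comes back with stale or diverged data and that its local changes may be thrown away, the manual recovery sequence from the DRBD 8.4 documentation can be wrapped as follows. The helper names run and resync_from_peer and the DRY_RUN variable are hypothetical; by default the commands are only printed.

```shell
# Print (DRY_RUN=1, the default) or execute a command.
# run, resync_from_peer and DRY_RUN are hypothetical names for this sketch.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Steps for the node whose local data is to be discarded and
# overwritten by the current primary.
resync_from_peer() {
    res=$1
    run drbdadm disconnect "$res"
    run drbdadm secondary "$res"
    run drbdadm connect --discard-my-data "$res"
}

resync_from_peer data
# If the surviving primary is in the StandAlone state, it also needs:
#   drbdadm connect data
```

  Run with DRY_RUN=0 only on the node that should lose its changes; running it on the wrong node discards the newer data.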
