Linux Clustering: MySQL High Availability with corosync + pacemaker
1. Lab topology
http://s3.51cto.com/wyfs02/M00/6D/E6/wKioL1VuxNKg7ZAgAAVSnCqdVNE455.bmp
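2. Topology description
MySQL is deployed on both nodes. The database files live on a back-end NFS host and are mounted on whichever node is active. corosync and pacemaker are installed on both nodes to provide high availability for MySQL, and pacemaker is configured through crmsh. When one node fails, the VIP used for front-end access moves to the other node, which then mounts the NFS database storage and starts MySQL, so that MySQL stays available across the two nodes.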
3. Architecture
Servers: CentOS 6.6 x86_64;
Database IP address (the VIP): 172.16.9.100;
The two nodes are node-02 and node-03, with IP addresses 172.16.9.82 and 172.16.9.83 respectively;
NFS server: IP address 172.16.9.84, hostname node-04;
Gateway server: also serves as the time server; gateway address 172.16.0.1
MySQL version: mariadb-5.5.43-linux-x86_64.tar.gz
corosync version: corosync-1.4.7-1.el6.x86_64
pacemaker version: pacemaker-1.1.12-4.el6.x86_64
crmsh version: crmsh-2.1-1.6.x86_64.rpm
crmsh dependency: pssh-2.3.1-2.el6.x86_64.rpm
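4. Preparation
Four things need to be prepared before building a high-availability cluster:
① Time must be synchronized between the nodes, using the NTP protocol;
② The nodes communicate with each other by hostname, so each hostname must resolve to an IP address;
(a) Using the hosts file for name resolution is recommended;
(b) The name used for communication must match the node name, that is, the name reported by "uname -n" or "hostname";
③ Decide whether a quorum (arbitration) device will be needed;
④ Set up key-based authentication for the root user between the nodes;
1) Configure time synchronization on the nodes
Synchronize the time with the ntpdate command, and set up a cron job so that the synchronization repeats periodically. The same commands are run on both nodes.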
# ntpdate 172.16.0.1
 3 Jun 08:56:53 ntpdate: step time server 172.16.0.1 offset 22520.390088 sec
# crontab -l
*/3 * * * * /usr/sbin/ntpdate 172.16.0.1 &> /dev/null
# /usr/sbin/ntpdate 172.16.0.1
 3 Jun 08:57:50 ntpdate: step time server 172.16.0.1 offset 23311.837688 sec
# crontab -l
*/3 * * * * /usr/sbin/ntpdate 172.16.0.1 &> /dev/null
2) Hostname-based communication between the nodes, configured in /etc/hosts
# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
172.16.0.1 server.magelinux.com server
172.16.9.82 node-02 node2
172.16.9.83 node-03 node3
# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
172.16.0.1 server.magelinux.com server
172.16.9.82 node-02 node2
172.16.9.83 node-03 node3
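Because the cluster stack depends on each node's own name resolving consistently, a quick lookup against the hosts file can catch a mismatch early. The following is a minimal sketch and not part of the original setup: `hosts_ip_for` is a hypothetical helper name, and it only reads a hosts-format file instead of querying the system resolver.

```shell
# Sketch: print the address that a hosts-format file assigns to a name.
# hosts_ip_for is a hypothetical helper, not part of any package.
hosts_ip_for() {
    name="$1"
    file="${2:-/etc/hosts}"
    awk -v n="$name" '
        {
            sub(/#.*/, "")                  # drop comments
            for (i = 2; i <= NF; i++)       # field 1 is the address
                if ($i == n) { print $1; exit }
        }
    ' "$file"
}
```

On node-02, for example, `hosts_ip_for "$(uname -n)"` would be expected to print 172.16.9.82 with the hosts file shown above; empty output means the name is missing from the file.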
3) Key-based authentication for the root user between the nodes
# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
93:c1:f9:42:63:cd:2c:98:b3:a9:ef:7e:02:24:db:f2 root@node-02
The key's randomart image is:
+--[ RSA 2048]----+
| |
| + = |
| + O + |
|..* * |
|=o S . |
|oo. o |
|o.. |
| E.. . |
| o+o |
+-----------------+
# ssh-copy-id -i .ssh/id_rsa.pub node3
Warning: Permanently added 'node3' (RSA) to the list of known hosts.
root@node3's password:
Now try logging into the machine, with "ssh 'node3'", and check in:
.ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.
# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
b7:b1:3b:ca:78:9c:bc:72:0e:a0:18:a6:8d:ac:d0:99 root@node-03
The key's randomart image is:
+--[ RSA 2048]----+
| |
| |
| |
| |
|.. . S o |
|+* + .. + |
|=.E.o .o |
|o .+* .. |
|. .==o.. |
+-----------------+
#
# ssh-copy-id -i .ssh/id_rsa.pub node2
root@node2's password:
Now try logging into the machine, with "ssh 'node2'", and check in:
.ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.
Test key-based login between the nodes
# ssh node3
Last login: Wed Jun  3 02:28:06 2015 from 172.16.9.9
# exit
logout
Connection to node3 closed.
#
# ssh node2
Last login: Wed Jun  3 02:41:17 2015 from 172.16.9.9
# exit
logout
Connection to node2 closed.
#
5. Configuring the NFS shared storage
5.1 Configure NFS
# mkdir /mydata/data -p
# cat /etc/exports
/web/htdoc 172.16.0.0/16(rw)
/mydata/data 172.16.0.0/16(rw,no_root_squash)
# Note: once the database has been installed, it is advisable to remove the no_root_squash option, because leaving it enabled is too risky.
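To follow up on that note later, it helps to be able to list which exports still carry no_root_squash. The sketch below is only an illustration under the assumption that the exports file uses the usual "path client(options)" layout; `exports_with_no_root_squash` is a hypothetical helper name.

```shell
# Sketch: list exported paths that still have no_root_squash set.
# exports_with_no_root_squash is a hypothetical helper name.
exports_with_no_root_squash() {
    file="${1:-/etc/exports}"
    awk '
        {
            sub(/#.*/, "")                          # ignore comments
            if (NF >= 2 && $0 ~ /no_root_squash/)   # path plus at least one client
                print $1
        }
    ' "$file"
}
```

After the option has been removed from /etc/exports, running `exportfs -r` on the NFS server re-reads the file; the helper should then print nothing.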
5.2 Set the owner and group of the shared directory /mydata/data
# userdel -r mysql
# useradd -r -u 336 mysql
# id mysql
uid=336(mysql) gid=336(mysql) groups=336(mysql)
# chown -R mysql:mysql /mydata/data/
5.3 Start the NFS service
# service rpcbind start
# service nfs start
Starting NFS services:
Starting NFS quotas:
Starting NFS mountd:
Starting NFS daemon:
Starting RPC idmapd:
5.4 Check the exported NFS share
# showmount -e 172.16.9.84
Export list for 172.16.9.84:
/mydata/data 172.16.0.0/16
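6. Installing and deploying the MySQL database
The MySQL server used here is MariaDB. The database only needs to be initialized on one of the two nodes, because both nodes share the same database files. Here the installation and initialization are done on node2. On node3 every step is the same except that the database is not initialized, so those steps are not shown.
6.1 Create the user that MariaDB runs as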
# useradd -r -u 336 mysql
# id mysql
uid=336(mysql) gid=336(mysql) groups=336(mysql)
6.2 Mount the shared NFS database directory
# mkdir /data
# mount -t nfs 172.16.9.84:/mydata/data /data
# mount | tail -1
172.16.9.84:/mydata/data on /data type nfs (rw,vers=4,addr=172.16.9.84,clientaddr=172.16.9.82)
6.3 Extract the MariaDB package into /usr/local
# tar xf mariadb-5.5.43-linux-x86_64.tar.gz -C /usr/local/
6.4 Create a symbolic link
# cd /usr/local/
# ln -s mariadb-5.5.43-linux-x86_64/ mysql
6.5 Initialize the database
# cd mysql
# chown -R root:mysql ./*
# scripts/mysql_install_db --datadir=/data --user=mysql
6.6 Provide the main MySQL configuration file
# mkdir /etc/mysql
# cp support-files/my-large.cnf /etc/mysql/my.cnf
6.7 Edit the /etc/mysql/my.cnf configuration file
In /etc/mysql/my.cnf, add the database directory and the related settings under the [mysqld] section.
datadir = /data
innodb_file_per_table = on
skip_name_resolve = on
6.8 Provide an init script for MySQL
# cp support-files/mysql.server /etc/rc.d/init.d/mysqld
# chmod +x /etc/rc.d/init.d/mysqld
# chkconfig --add mysqld
# chkconfig mysqld off
6.9 Start the MariaDB service and test it
# service mysqld start
Starting MySQL....
# bin/mysql
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 2
Server version: 5.5.43-MariaDB-log MariaDB Server
Copyright (c) 2000, 2015, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> create database node2;
Query OK, 1 row affected (0.03 sec)
MariaDB [(none)]> flush privileges;
Query OK, 0 rows affected (0.00 sec)
MariaDB [(none)]> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| node2 |
| performance_schema |
| test |
+--------------------+
5 rows in set (0.01 sec)
MariaDB [(none)]> exit
Bye
6.10 Stop the MySQL service and unmount the NFS share /data
# service mysqld stop
Shutting down MySQL.
# umount /data
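7. Installing and configuring the HA software
7.1 Install corosync and pacemaker
With the yum repositories configured, install corosync and pacemaker directly by running the yum command on each of the two nodes
# yum install corosync pacemaker -y
7.2 The default corosync configuration file explained
After installation, the corosync configuration files are located in /etc/corosync, and the init script is /etc/init.d/corosync.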
# cd /etc/corosync/
# ls
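corosync.conf.example  corosync.conf.example.udpu  service.d  uidgid.d
# cp corosync.conf.example corosync.conf
# egrep -v "#|^$" corosync.conf
compatibility: whitetank # whether to stay compatible with whitetank
totem { # properties of how the underlying messaging layer communicates
    version: 2 # version number
    secauth: off # whether to enable authentication; if enabled, generate a key with corosync-keygen
    threads: 0 # number of threads used to encrypt and send messages; 0 means no threaded sending
    interface { # which network address, multicast address, and port this interface uses for multicast communication
        ringnumber: 0 # ring number of this interface; it starts at 0 and must be unique per interface
        bindnetaddr: 192.168.1.0 # network address that the interface binds to
        mcastaddr: 239.255.1.1 # multicast address
        mcastport: 5405 # multicast port
        ttl: 1 # TTL value
    }
}
logging { # logging properties
    fileline: off # whether to print the source file and line in log messages
    to_stderr: no # whether to send log output to standard error (the terminal)
    to_logfile: yes # write the log to a log file
    logfile: /var/log/cluster/corosync.log
    to_syslog: yes # whether to also send the log to syslog
    debug: off
    timestamp: on # whether to timestamp log entries; leaving it off is suggested
    logger_subsys { # logging settings for a specific subsystem
        subsys: AMF
        debug: off
    }
}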
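7.3 Configure pacemaker
pacemaker can run alongside corosync in two ways: as a corosync plugin, or as an independent daemon. On CentOS 6, running it as a plugin is recommended; this may produce warnings in the log, which can be ignored. Append the following to corosync.conf: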
service {
    ver: 0
    name: pacemaker
    use_mgmtd: yes
}
aisexec {
    user: root
    group: root
}
7.4 Generate the authentication key file for corosync; corosync-keygen needs to read 1024 bits of random data from /dev/random
# corosync-keygen
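Corosync Cluster Engine Authentication key generator.
Gathering 1024 bits for key from /dev/random.
Press keys on your keyboard to generate entropy.
Press keys on your keyboard to generate entropy (bits = 176).
# At this point the command stalls because not enough entropy is available. You can open another terminal and keep typing on the keyboard, but that takes a while; alternatively, download a large file over FTP, which generates a lot of I/O.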
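Whether corosync-keygen is likely to stall can be estimated beforehand from the kernel's entropy counter. This is a rough sketch only: `entropy_avail` and `enough_entropy` are hypothetical helper names, and it assumes the Linux file /proc/sys/kernel/random/entropy_avail reflects the pool size, which holds for older kernels such as the one shipped with CentOS 6.

```shell
# Sketch: check the kernel entropy estimate before running corosync-keygen.
# Helper names are hypothetical; the optional path argument exists only for testing.
entropy_avail() {
    cat "${1:-/proc/sys/kernel/random/entropy_avail}"
}

enough_entropy() {
    needed="$1"     # bits required, e.g. 1024 for corosync-keygen
    [ "$(entropy_avail "$2")" -ge "$needed" ]
}
```

A possible use is `enough_entropy 1024 || echo "entropy is low, generate some disk or network I/O first"`.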
7.5 The final corosync + pacemaker configuration file
# egrep -v "#|^$" corosync.conf
compatibility: whitetank
totem {
    version: 2
    secauth: on
    threads: 0
    interface {
        ringnumber: 0
        bindnetaddr: 172.16.0.0
        mcastaddr: 239.255.9.9
        mcastport: 5405
        ttl: 1
    }
}
logging {
    fileline: off
    to_stderr: no
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    to_syslog: no
    debug: off
    timestamp: on
    logger_subsys {
        subsys: AMF
        debug: off
    }
}
service {
    ver: 0
    name: pacemaker
    use_mgmtd: yes
}
aisexec {
    user: root
    group: root
}
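Before copying the file to the other node, it can be worth confirming that the directives that matter carry the intended values, since a misspelled key name is easy to overlook in this file. The helper below is a hedged sketch, not a corosync tool: `corosync_conf_get` is a hypothetical name, and it does a flat lookup that returns the first matching "key: value" line without regard to which block the key is in.

```shell
# Sketch: print the value of the first "key: value" line in a corosync.conf-style file.
# corosync_conf_get is a hypothetical helper; it ignores block nesting.
corosync_conf_get() {
    key="$1"
    file="${2:-/etc/corosync/corosync.conf}"
    awk -v k="$key" '
        {
            sub(/#.*/, "")                          # strip comments
            line = $0
            gsub(/^[ \t]+|[ \t]+$/, "", line)       # trim surrounding whitespace
            n = index(line, ":")
            if (n == 0) next                        # block openers and closers
            name = substr(line, 1, n - 1)
            val = substr(line, n + 1)
            gsub(/[ \t]/, "", name)
            gsub(/^[ \t]+/, "", val)
            if (name == k) { print val; exit }
        }
    ' "$file"
}
```

For the configuration above, `corosync_conf_get secauth` would be expected to print "on" and `corosync_conf_get group` to print "root"; empty output for `group` would point to a misspelled directive. Keys that occur more than once, such as debug, only report their first occurrence.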
7.6 Copy the configuration file and key file to node3
# scp authkey corosync.conf node3:/etc/corosync/
authkey 100% 128 0.1KB/s 00:00
corosync.conf 100% 2794 2.7KB/s 00:00
7.7 Start the corosync service
# service corosync start; ssh node3 'service corosync start'
Starting Corosync Cluster Engine (corosync): [ OK ]
Starting Corosync Cluster Engine (corosync):
7.8 Install crmsh
Install the prepared packages directly with yum so that dependencies are resolved. In production, installing on a single node is enough; here it is installed on both nodes to make testing easier.
# yum install crmsh-2.1-1.6.x86_64.rpm pssh-2.3.1-2.el6.x86_64.rpm -y
8. Configuring the highly available MySQL service
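8.1 Initial configuration
# crm # enter the crm shell
crm(live)# configure # switch to configure mode
crm(live)configure# property stonith-enabled=false # disable STONITH, because there is no STONITH device in this setup
crm(live)configure# property no-quorum-policy=ignore # keep resources running when the partition loses quorum; the default policy is stop
crm(live)configure# verify # check the configuration
crm(live)configure# commit # commit and save; the change takes effect immediately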
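8.2 Configure the VIP resource
crm(live)configure# primitive mysqlip ocf:heartbeat:IPaddr \
params ip=172.16.9.100 nic=eth0 cidr_netmask=16 \
op monitor interval=10s timeout=20s
crm(live)configure# verify
# primitive: defines a primitive (basic) resource
# mysqlip: the resource name, here for the VIP address
# ocf:heartbeat:IPaddr: the IPaddr agent from the heartbeat provider of the OCF class, used to set an IP address
# params: parameters, that is, the values to configure for ocf:heartbeat:IPaddr
# ip=172.16.9.100: sets the IP address to 172.16.9.100
# nic: which network interface the VIP is placed on; may be omitted
# cidr_netmask=16: the netmask in CIDR notation
# op: the operations attached to this resource
# monitor: the monitoring operation
# interval: how often the resource is monitored
# timeout: the timeout for each monitoring run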
8.3 Configure the NFS mount resource
crm(live)configure# primitive mysqlnfs ocf:heartbeat:Filesystem \
params device="172.16.9.84:/mydata/data" directory="/data" fstype=nfs \
op monitor interval=20s timeout=40s op start timeout=60s op stop timeout=60s
crm(live)configure# verify
# ocf:heartbeat:Filesystem: the Filesystem agent from the heartbeat provider of the OCF class
# device="172.16.9.84:/mydata/data": the device path
# directory="/data": the mount point
# fstype=nfs: the filesystem type
8.4 Configure the mysql service resource
crm(live)configure# primitive mysqlserver lsb:mysqld op monitor interval=20s timeout=40s
crm(live)configure# verify
8.5 Define the start order of the resources
The VIP must come up first, then the NFS filesystem is mounted, and only after a successful mount can MySQL start. A group is used here to define this order: group members start in the listed order and run on the same node.
crm(live)configure# group mysqlservice mysqlip mysqlnfs mysqlserver
crm(live)configure# verify
crm(live)configure# commit
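The interactive steps in 8.2 through 8.5 can also be kept in a file, which makes the configuration easier to review and repeat. The sketch below writes the same three primitives and the group to a file; `write_mysql_ha_config` and the file path are hypothetical, and loading the file with crmsh's `configure load update` is an assumption about the installed crmsh version that should be checked with `crm configure help load`.

```shell
# Sketch: write the resource definitions from sections 8.2-8.5 to a crmsh batch file.
# write_mysql_ha_config is a hypothetical helper name.
write_mysql_ha_config() {
    out="$1"
    # Quoted heredoc delimiter keeps the backslash continuations literal.
    cat > "$out" <<'EOF'
primitive mysqlip ocf:heartbeat:IPaddr \
    params ip=172.16.9.100 nic=eth0 cidr_netmask=16 \
    op monitor interval=10s timeout=20s
primitive mysqlnfs ocf:heartbeat:Filesystem \
    params device="172.16.9.84:/mydata/data" directory="/data" fstype=nfs \
    op monitor interval=20s timeout=40s \
    op start timeout=60s \
    op stop timeout=60s
primitive mysqlserver lsb:mysqld \
    op monitor interval=20s timeout=40s
group mysqlservice mysqlip mysqlnfs mysqlserver
EOF
}
```

One way to apply it would be `write_mysql_ha_config /tmp/mysql-ha.crm && crm configure load update /tmp/mysql-ha.crm`. The order of names on the group line is what expresses "VIP, then NFS mount, then MySQL".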
9. Testing the resources
9.1 Check the running state of the cluster resources
# crm status
Last updated: Wed Jun  3 11:18:21 2015
Last change: Wed Jun  3 11:15:35 2015
Stack: classic openais (with plugin)
Current DC: node-02 - partition with quorum # the current DC, and whether the partition has quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes # number of nodes and number of votes
3 Resources configured # number of resources currently configured
Online: [ node-02 node-03 ] # nodes that are online
Resource Group: mysqlservice
mysqlip (ocf::heartbeat:IPaddr): Started node-02
mysqlnfs (ocf::heartbeat:Filesystem): Started node-02
mysqlserver (lsb:mysqld): Started node-02
9.2 Test the MySQL service
# /usr/local/mysql/bin/mysql
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 2
Server version: 5.5.43-MariaDB-log MariaDB Server
Copyright (c) 2000, 2015, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> create database testnode2; # create the test database testnode2
Query OK, 1 row affected (0.02 sec)
MariaDB [(none)]> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| node2 |
| performance_schema |
| test |
| testnode2 |
+--------------------+
6 rows in set (0.13 sec)
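9.2 Put node2 into standby
After node2 is put into standby, the resources move to node3. When node2 is then brought back online, the resources do not move back to node2, because neither a location preference of the resources for a node nor any colocation setting between resources has been configured.
# crm node standby
# crm status
Last updated: Wed Jun  3 11:25:28 2015
Last change: Wed Jun  3 11:25:21 2015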
Stack: classic openais (with plugin)
Current DC: node-02 - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
3 Resources configured
Node node-02: standby # node2 is now in standby
Online: [ node-03 ]
Resource Group: mysqlservice # all resources have moved to node3
mysqlip (ocf::heartbeat:IPaddr): Started node-03
mysqlnfs (ocf::heartbeat:Filesystem): Started node-03
mysqlserver (lsb:mysqld): Started node-03
Bring node2 back online; the status shows that the resources have not moved back to node2
# crm node online
# crm status
Last updated: Wed Jun  3 11:27:13 2015
Last change: Wed Jun  3 11:27:10 2015
Stack: classic openais (with plugin)
Current DC: node-02 - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
3 Resources configured
Online: [ node-02 node-03 ]
Resource Group: mysqlservice
mysqlip (ocf::heartbeat:IPaddr): Started node-03
mysqlnfs (ocf::heartbeat:Filesystem): Started node-03
mysqlserver (lsb:mysqld): Started node-03
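The resources stay on node-03 because nothing tells the cluster to prefer one node over the other. If failback to a particular node were wanted, one option in crmsh is a location constraint. The sketch below only prints such a constraint line and does not change the cluster; `prefer_node`, the constraint id it builds, and the score of 200 are all illustrative assumptions, not part of the original configuration.

```shell
# Sketch: print a crmsh location constraint that makes a resource prefer a node.
# prefer_node and the generated constraint id are hypothetical.
prefer_node() {
    rsc="$1"
    node="$2"
    score="${3:-100}"
    printf 'location %s-prefers-%s %s %s: %s\n' "$rsc" "$node" "$rsc" "$score" "$node"
}
```

Running `prefer_node mysqlservice node-02 200` prints `location mysqlservice-prefers-node-02 mysqlservice 200: node-02`, which could then be entered at the `crm(live)configure#` prompt followed by verify and commit. Automatic failback restarts MySQL one more time, so it is a trade-off rather than a default recommendation.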
9.3 Test the MySQL service on node3
# /usr/local/mysql/bin/mysql
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 2
Server version: 5.5.43-MariaDB-log MariaDB Server
Copyright (c) 2000, 2015, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| node2 |
| performance_schema |
| test |
| testnode2 |
+--------------------+
6 rows in set (0.04 sec)
MariaDB [(none)]> create database testnode3;
Query OK, 1 row affected (0.07 sec)
MariaDB [(none)]> flush privileges;
Query OK, 0 rows affected (0.01 sec)
MariaDB [(none)]>
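The tests above show that MySQL is highly available across node2 and node3: the database created on node2 was still present after the storage was mounted on node3.
9.4 Simulate the database service being shut down by accident
Here the mysqld processes are killed directly with killall, to check whether the HA stack starts the MySQL service again
# ss -tanp | grep ":3306" # check the MySQL process
LISTEN 0 50 *:3306 *:* users:(("mysqld",5781,15))
# killall mysqld # kill all MySQL processes
# killall mysqld
mysqld: no process killed
# ss -tanp | grep ":3306"
# ss -tanp | grep ":3306"
# ss -tanp | grep ":3306"
# crm status # check the cluster state; the MySQL resource has stopped
Last updated: Wed Jun  3 11:41:45 2015
Last change: Wed Jun  3 11:41:24 2015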
Stack: classic openais (with plugin)
Current DC: node-02 - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
3 Resources configured
Online: [ node-02 node-03 ]
Resource Group: mysqlservice
mysqlip (ocf::heartbeat:IPaddr): Started node-03
mysqlnfs (ocf::heartbeat:Filesystem): Started node-03
mysqlserver (lsb:mysqld): Stopped
# ss -tanp | grep ":3306"
# ss -tanp | grep ":3306"
# ss -tanp | grep ":3306"
# ss -tanp | grep ":3306"
# ss -tanp | grep ":3306" # a few seconds later the cluster has started the MySQL service again automatically
LISTEN 0 50 *:3306 *:* users:(("mysqld",11361,15))
# crm status
Last updated: Wed Jun  3 11:37:36 2015
Last change: Wed Jun  3 11:27:10 2015
Stack: classic openais (with plugin)
Current DC: node-02 - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
3 Resources configured
Online: [ node-02 node-03 ]
Resource Group: mysqlservice
mysqlip (ocf::heartbeat:IPaddr): Started node-03
mysqlnfs (ocf::heartbeat:Filesystem): Started node-03
mysqlserver (lsb:mysqld): Started node-03
# mysql
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.5.43-MariaDB-log MariaDB Server
Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| node2 |
| performance_schema |
| test |
| testnode2 |
| testnode3 |
+--------------------+
7 rows in set (0.02 sec)
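Summary:
A simple highly available MySQL setup is now complete. The NFS server remains a single point of failure for the cluster, and the concurrency that NFS can sustain is limited, so this architecture does not provide full high availability and there is still a lot of room for improvement.
Readers are welcome to send Xiaowu (the author) their feedback.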