High Availability with Keepalived + MySQL Replication
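MySQL runs in a master-slave architecture: when the master fails, service switches automatically to the slave. A dual-master setup is also possible, but it has a drawback. Suppose a user publishes an article while the master is under heavy load and replication is, say, 2000 s behind. The master then goes down, the other host takes over, and the VIP floats to it. Because of the large replication lag, the article the user just published has not been replicated yet, so the user publishes it again. Once the original master is repaired, its SQL and IO threads are still running, so it resumes replicating the data it had not finished copying. This may overwrite the article the user newly published, and user data is lost. With the master-slave architecture, replication against the new master is re-established manually after a failover.
1.1. Installation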
# tar -zxvf keepalived-1.2.7.tar.gz
# cd keepalived-1.2.7
# ./configure --prefix=/usr/local/keepalived
........
Keepalived configuration
------------------------
Keepalived version : 1.2.7
Compiler : gcc
Compiler flags : -g -O2 -DETHERTYPE_IPV6=0x86dd
Extra Lib : -lpopt -lssl -lcrypto
Use IPVS Framework : Yes
IPVS sync daemon support : Yes
IPVS use libnl : No
Use VRRP Framework : Yes
Use VRRP VMAC : No
SNMP support : No
Use Debug flags : No
# make && make install
# mkdir /etc/keepalived
# cp /usr/local/keepalived/etc/rc.d/init.d/keepalived /etc/rc.d/init.d/
# cp /usr/local/keepalived/etc/sysconfig/keepalived /etc/sysconfig/
# cp /usr/local/keepalived/etc/keepalived/keepalived.conf /etc/keepalived/
# cp /usr/local/keepalived/sbin/keepalived /usr/sbin/
# chkconfig --add keepalived
# chkconfig --level 345 keepalived on
#
1.1.1. Verifying the keepalived installation
# ll
total 16
drwxr-xr-x 2 root root 4096 Jul 24 13:26 bin
drwxr-xr-x 5 root root 4096 Jul 24 13:26 etc
drwxr-xr-x 2 root root 4096 Jul 24 13:26 sbin
drwxr-xr-x 3 root root 4096 Jul 24 13:26 share
# cd etc/
# ll
total 12
drwxr-xr-x 3 root root 4096 Jul 24 13:26 keepalived
drwxr-xr-x 3 root root 4096 Jul 24 13:26 rc.d
drwxr-xr-x 2 root root 4096 Jul 24 13:26 sysconfig
# cd keepalived/
1.2. Master server configuration
1.2.1. keepalived configuration
# cd /etc/keepalived/
# vi keepalived.conf
global_defs {
    router_id KeepAlive_Mysql
}
vrrp_script check_run {
    script "/home/sh/mysql_check.sh"
    interval 300
}
vrrp_sync_group VG1 {
    group {
        VI_1
    }
}
vrrp_instance VI_1 {
    state MASTER
    # on the backup server, change this to BACKUP
    # state BACKUP
    interface eth0
    virtual_router_id 51
    priority 100
    # on the backup server, change this to 90
    # priority 90
    advert_int 1
    # note: the keepalived documentation says nopreempt only takes effect
    # when the initial state is BACKUP
    nopreempt
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    track_script {
        check_run
    }
    notify_master /home/sh/master.sh
    notify_backup /home/sh/backup.sh
    notify_stop /home/sh/stop.sh
    virtual_ipaddress {
        192.168.6.200
    }
}
notify_master: script run after the state changes to master;
notify_backup: script run after the state changes to backup;
notify_stop: script run after VRRP stops.
Four scripts are used: mysql_check.sh, master.sh, backup.sh, stop.sh
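As a rough illustration of what a notify script can do with a state change, the sketch below builds a log line for each transition. It assumes keepalived's generic `notify` hook, which is documented as calling the script with three arguments (type, name, new state). The function name `format_transition` and the message wording are hypothetical and not part of the original setup.

```shell
#!/bin/bash
# Hypothetical helper: build one log line for a VRRP state change.
# Arguments mirror what keepalived's generic "notify" hook is assumed to pass:
#   $1 = GROUP or INSTANCE, $2 = group/instance name, $3 = new state
format_transition() {
    local kind="$1" name="$2" state="$3"
    case "$state" in
        MASTER) echo "$kind $name became MASTER: this host now holds the VIP" ;;
        BACKUP) echo "$kind $name became BACKUP: this host is standing by" ;;
        FAULT)  echo "$kind $name entered FAULT: a tracked check failed" ;;
        *)      echo "$kind $name reported unknown state: $state" ;;
    esac
}

# Example: print the line for the instance name used in this article.
format_transition INSTANCE VI_1 MASTER
```

Appending such a line to a log file from each notify script would make it easier to reconstruct when a switch happened.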
1.2.2. Health check script
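# vi mysql_check.sh
#!/bin/bash
. /root/.bash_profile
count=1
while true
do
    # i: exit code of a test query sent through the mysql client
    mysql -e "show status;" > /dev/null 2>&1
    i=$?
    # j: 0 if a mysqld process is running
    ps aux | grep mysqld | grep -v grep > /dev/null 2>&1
    j=$?
    if [ $i = 0 ] && [ $j = 0 ]
    then
        exit 0
    else
        if [ $i = 1 ] && [ $j = 0 ]
        then
            # the query failed but mysqld is still running: treat it as alive
            exit 0
        else
            # retry; give up once count exceeds 5
            if [ $count -gt 5 ]
            then
                break
            fi
            let count++
            continue
        fi
    fi
done
# every retry failed: stop keepalived so the VIP moves to the other host
/etc/init.d/keepalived stop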
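The decision made by the check script can be isolated into a small function, which makes the three cases easier to see and to try out without a running MySQL server. This is only a sketch: the function name `classify_health` and the labels it prints are hypothetical, and the two arguments stand for the exit codes the script stores in `i` and `j`.

```shell
#!/bin/bash
# Hypothetical helper: decide what the check should do from two exit codes.
#   $1 = exit code of the mysql client query   (i in mysql_check.sh)
#   $2 = exit code of the mysqld process lookup (j in mysql_check.sh)
classify_health() {
    local client_rc="$1" proc_rc="$2"
    if [ "$client_rc" -eq 0 ] && [ "$proc_rc" -eq 0 ]; then
        echo "healthy"      # query succeeded and mysqld is running
    elif [ "$client_rc" -eq 1 ] && [ "$proc_rc" -eq 0 ]; then
        echo "tolerated"    # query failed with code 1, but mysqld is still running
    else
        echo "retry"        # anything else counts as one failed attempt
    fi
}

# Example: the query failed and no mysqld process was found.
classify_health 1 1
```

In the script above, both "healthy" and "tolerated" correspond to `exit 0`, while "retry" corresponds to the branch that increments `count`.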
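The master script (master.sh) first checks whether replication has finished applying. If it has not, the script waits up to 1 minute and then moves on regardless, stopping the replication threads. It then changes the privileges and password of admin, the business account that the front-end application connects with, and records the binary log file and position in effect after the switch.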
1.2.3. Master script
# vi master.sh
#!/bin/bash
. /root/.bash_profile
# Print one field of "show slave status\G", e.g. slave_field Exec_Master_Log_Pos
slave_field() {
    mysql -e "show slave status\G" | grep -w "$1" | awk -F": " '{print $2}'
}
# Wait up to 60 seconds until the SQL thread has applied everything the IO thread has read
i=1
while true
do
    Master_Log_File=$(slave_field Master_Log_File)
    Relay_Master_Log_File=$(slave_field Relay_Master_Log_File)
    Read_Master_Log_Pos=$(slave_field Read_Master_Log_Pos)
    Exec_Master_Log_Pos=$(slave_field Exec_Master_Log_Pos)
    if [ "$Master_Log_File" = "$Relay_Master_Log_File" ] && [ "$Read_Master_Log_Pos" -eq "$Exec_Master_Log_Pos" ]
    then
        echo "ok"
        break
    fi
    if [ $i -gt 60 ]
    then
        break
    fi
    sleep 1
    let i++
done
mysql -e "stop slave;"
mysql -e "set global innodb_support_xa=0;"
mysql -e "set global sync_binlog=0;"
mysql -e "set global innodb_flush_log_at_trx_commit=0;"
mysql -e "flush logs;GRANT ALL PRIVILEGES ON *.* TO 'admin'@'%' IDENTIFIED BY '123456';flush privileges;"
# record the binlog file and position in effect after the switch
mysql -e "show master status;" > /tmp/master_status_$(date "+%y%m%d-%H%M").txt
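The "has replication caught up" test in master.sh compares four fields of `show slave status\G`. The sketch below performs the same comparison on text read from standard input, so it can be tried without a live slave. The function name `replication_state`, the labels it prints, and the sample values are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read "show slave status\G" text on stdin and print
# "caught_up" when the SQL thread has applied everything the IO thread read,
# otherwise "behind".
replication_state() {
    awk -F': ' '
        { gsub(/^[ \t]+/, "", $1) }   # strip the left padding of the field name
        $1 == "Master_Log_File"       { read_file = $2 }
        $1 == "Relay_Master_Log_File" { exec_file = $2 }
        $1 == "Read_Master_Log_Pos"   { read_pos  = $2 }
        $1 == "Exec_Master_Log_Pos"   { exec_pos  = $2 }
        END {
            if (read_file != "" && read_file == exec_file && read_pos == exec_pos)
                print "caught_up"
            else
                print "behind"
        }'
}

# Example with made-up values in the layout that \G output uses.
sample='          Master_Log_File: mysql-bin.000012
      Read_Master_Log_Pos: 4096
    Relay_Master_Log_File: mysql-bin.000012
      Exec_Master_Log_Pos: 4096'
printf '%s\n' "$sample" | replication_state
```

On a real slave the input would come from `mysql -e "show slave status\G" | replication_state`.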
1.2.4. Backup script
# vi backup.sh
#!/bin/bash
. /root/.bash_profile
mysql -e "GRANT ALL PRIVILEGES ON *.* TO 'admin'@'%' IDENTIFIED BY '1q2w3e4r';flush privileges;"
mysql -e "set global event_scheduler=0;"
mysql -e "set global innodb_support_xa=0;"
mysql -e "set global sync_binlog=0;"
mysql -e "set global innodb_flush_log_at_trx_commit=0;"
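The scripts in this article switch the same three durability variables between 0 and 1. A possible way to keep that in one place is a small generator for the `set global` statements, sketched below. The function name `durability_sql` and the mode names `strict` and `relaxed` are hypothetical; the variables and values are the ones the scripts use.

```shell
#!/bin/bash
# Hypothetical helper: print the "set global" statements for one durability mode.
#   strict  -> value 1 for all three variables
#   relaxed -> value 0 for all three variables
durability_sql() {
    local value
    case "$1" in
        strict)  value=1 ;;
        relaxed) value=0 ;;
        *) echo "usage: durability_sql strict|relaxed" >&2; return 1 ;;
    esac
    echo "set global innodb_support_xa=$value;"
    echo "set global sync_binlog=$value;"
    echo "set global innodb_flush_log_at_trx_commit=$value;"
}

# Example: show the statements for the relaxed mode.
durability_sql relaxed
```

Piping the output into the `mysql` client would apply the statements; that step is left out here so the sketch runs without a server.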
1.2.5. Script run after keepalived stops
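stop.sh is the script run after keepalived stops. It first changes the admin password, then sets parameters so that no data is lost, and finally checks whether writes are still arriving; whether or not they have finished, it exits after 1 minute: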
# vi stop.sh
#!/bin/bash
. /root/.bash_profile
mysql -e "GRANT ALL PRIVILEGES ON *.* TO 'admin'@'%' IDENTIFIED BY '1q2w3e4r';flush privileges;"
mysql -e "set global innodb_support_xa=1;"
mysql -e "set global sync_binlog=1;"
mysql -e "set global innodb_flush_log_at_trx_commit=1;"
# Wait up to 60 seconds until the binlog position stops moving, i.e. no more writes arrive
i=1
while true
do
    # sample the binlog file and position twice, one second apart
    M_File1=$(mysql -e "show master status\G" | awk -F': ' '/File/{print $2}')
    M_Position1=$(mysql -e "show master status\G" | awk -F': ' '/Position/{print $2}')
    sleep 1
    M_File2=$(mysql -e "show master status\G" | awk -F': ' '/File/{print $2}')
    M_Position2=$(mysql -e "show master status\G" | awk -F': ' '/Position/{print $2}')
    if [ "$M_File1" = "$M_File2" ] && [ "$M_Position1" -eq "$M_Position2" ]
    then
        echo "ok"
        break
    fi
    if [ $i -gt 60 ]
    then
        break
    fi
    let i++
done
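Both master.sh and stop.sh use the same pattern: repeat a check until it passes or a limit is reached. A generic helper for that pattern might look like the sketch below. The names `retry_until` and `third_time_lucky` are hypothetical, and the stand-in check replaces the MySQL queries so the example runs anywhere.

```shell
#!/bin/bash
# Hypothetical helper: run a command up to $1 times, pausing $2 seconds
# between attempts. Returns 0 as soon as the command succeeds, 1 if every
# attempt failed.
retry_until() {
    local max_tries="$1" pause="$2"
    shift 2
    local attempt=1
    while [ "$attempt" -le "$max_tries" ]; do
        if "$@"; then
            return 0
        fi
        attempt=$((attempt + 1))
        # no pause is needed after the final attempt
        [ "$attempt" -le "$max_tries" ] && sleep "$pause"
    done
    return 1
}

# Example with a stand-in check that succeeds on its third call.
calls=0
third_time_lucky() { calls=$((calls + 1)); [ "$calls" -ge 3 ]; }
retry_until 5 0 third_time_lucky && echo "succeeded after $calls calls"
```

Because the counter is advanced before the limit is tested, the loop always terminates.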
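Configuration is now complete. If the master is powered off or MySQL is stopped, the VIP floats to the slave. After the original master has been repaired, the VIP does not float back on its own; it stays on the slave. The benefit is that this prevents frequent switching, which would leave the data inconsistent.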
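To confirm which host currently holds the VIP after a switch, the address list of each host can be inspected. The sketch below checks text in the one-line format printed by `ip -o -4 addr show` for a given address. The function name `holds_vip` and the sample host address 192.168.6.10 are hypothetical; 192.168.6.200 is the VIP from the configuration above.

```shell
#!/bin/bash
# Hypothetical helper: read `ip -o -4 addr show` output on stdin and report
# whether the given address is configured on this host.
holds_vip() {
    local vip="$1"
    # -F matches a fixed string, so the dots in the address are not wildcards;
    # the trailing "/" stops 192.168.6.20 from matching 192.168.6.200.
    if grep -qF " inet $vip/"; then
        echo "yes"
    else
        echo "no"
    fi
}

# Example with made-up output for a host that holds the VIP.
sample='2: eth0    inet 192.168.6.10/24 brd 192.168.6.255 scope global eth0
2: eth0    inet 192.168.6.200/32 scope global eth0'
printf '%s\n' "$sample" | holds_vip 192.168.6.200
```

On a real host the input would come from `ip -o -4 addr show | holds_vip 192.168.6.200`.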
1.3. Managing keepalived
Check the processes:
# ps -aux | grep keepalived
Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.7/FAQ
root 4147 0.0 0.0 4016 676 pts/1 R+ 13:42 0:00 grep keepalived
# service keepalived start
Starting keepalived:
# ps -aux | grep keepalived
Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.7/FAQ
root 4158 0.0 0.0 5040 576 ? Ss 13:42 0:00 keepalived -D
root 4159 0.0 0.1 5088 1428 ? S 13:42 0:00 keepalived -D
root 4160 0.1 0.0 5088 936 ? S 13:42 0:00 keepalived -D
root 4163 0.0 0.0 4016 672 pts/1 R+ 13:43 0:00 grep keepalived
# lsmod | grep ip_vs
ip_vs_rr 6081 3
ip_vs 78081 5 ip_vs_rr
# tail -f /var/log/messages
Jul 24 13:44:44 gflinux Keepalived_healthcheckers: Timeout connect, timeout server:1358.
Jul 24 13:44:50 gflinux Keepalived_healthcheckers: Timeout connect, timeout server :443.
Jul 24 13:44:50 gflinux Keepalived_healthcheckers: Timeout connect, timeout server:1358.
Jul 24 13:44:56 gflinux Keepalived_healthcheckers: Timeout connect, timeout server:443.
Jul 24 13:44:56 gflinux Keepalived_healthcheckers: Timeout connect, timeout server:1358.
Jul 24 13:44:56 gflinux Keepalived_healthcheckers: Timeout connect, timeout server:1358.
Jul 24 13:44:56 gflinux Keepalived_healthcheckers: Timeout connect, timeout server:1358.
Jul 24 13:44:56 gflinux Keepalived_healthcheckers: Timeout connect, timeout server:1358.
Jul 24 13:45:02 gflinux Keepalived_healthcheckers: Timeout connect, timeout server:443.
Jul 24 13:45:02 gflinux Keepalived_healthcheckers: Timeout connect, timeout server:1358.
Jul 24 13:45:08 gflinux Keepalived_healthcheckers: Timeout connect, timeout server:443.
Jul 24 13:45:08 gflinux Keepalived_healthcheckers: Timeout connect, timeout server:1358.
Jul 24 13:45:08
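When the log fills with repeated health checker timeouts like these, a per-port count gives a quicker overview than scrolling. The sketch below tallies the port at the end of each "Timeout connect" line read from standard input. The function name `count_timeouts` is hypothetical, and the parsing assumes every such line ends with `:PORT.` as in the excerpt.

```shell
#!/bin/bash
# Hypothetical helper: count "Timeout connect" log lines per server port.
# Reads log lines on stdin and prints "PORT COUNT", sorted by port number.
count_timeouts() {
    awk '/Timeout connect/ {
             # the port is the number between the last ":" and the final "."
             if (match($0, /:[0-9]+\.$/)) {
                 port = substr($0, RSTART + 1, RLENGTH - 2)
                 hits[port]++
             }
         }
         END { for (port in hits) print port, hits[port] }' | sort -n
}

# Example using three lines taken from the log excerpt.
sample='Jul 24 13:44:44 gflinux Keepalived_healthcheckers: Timeout connect, timeout server:1358.
Jul 24 13:44:50 gflinux Keepalived_healthcheckers: Timeout connect, timeout server:443.
Jul 24 13:44:50 gflinux Keepalived_healthcheckers: Timeout connect, timeout server:1358.'
printf '%s\n' "$sample" | count_timeouts
```

On a live system the input could come from `grep Keepalived_healthcheckers /var/log/messages | count_timeouts`.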