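Automatic failover for high availability with keepalived + Redis
Install Redis and keepalived on both server A (10.0.11.2) and server B (10.0.12.2) (installation steps omitted).
A is the default master and B is the slave; adding "SLAVEOF 10.0.11.2 6379" to B's Redis configuration file is enough.
Enable persistence for Redis on both A and B: appendonly yes
Configuration on server A
keepalived configuration file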
-------begin------
! Configuration File for keepalived
global_defs {
lvs_id LVS_redis
}
vrrp_script chk_redis {
script "/opt/redis/sh/redis_check.sh"
weight -20
interval 2
}
vrrp_instance VI_1 {
state BACKUP
#state MASTER
interface bond0
virtual_router_id 51
nopreempt
priority 200
advert_int 5
authentication {
auth_type PASS
auth_pass 1111
}
track_script {
chk_redis
}
virtual_ipaddress {
10.0.11.0
}
notify_master /opt/redis/sh/redis_master.sh
notify_backup /opt/redis/sh/redis_backup.sh
notify_fault /opt/redis/sh/redis_fault.sh
notify_stop /opt/redis/sh/redis_stop.sh
}
-----end-----
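Notes:
The email settings in global_defs can be filled with anything (none are set in this example); for email notifications to actually work they must be real values.
script "/opt/redis/sh/redis_check.sh" # path of the check script
weight -20 # when the Redis check fails, the priority is lowered by 20; the priority change triggers a keepalived state transition and the VIP moves with it
interval 2 # how often the check runs, in seconds
state MASTER # the default (initial) state
state BACKUP # backup state; both servers are set to BACKUP here
nopreempt # do not preempt; priority decides who becomes master
interface bond0 # network interface name
virtual_router_id 51 # must be the same on A and B
priority 200 # priority; just needs to be higher than B's
advert_int 5 # interval between VRRP advertisements, in seconds
authentication { # must be the same on A and B
auth_type PASS
auth_pass 1111
}
track_script { # name of the check defined above
chk_redis
}
virtual_ipaddress { # virtual IP; clients use this IP to reach Redis
10.0.11.0
}
# Script paths run on each state change
notify_master /opt/redis/sh/redis_master.sh # on becoming master
notify_backup /opt/redis/sh/redis_backup.sh # on becoming backup
notify_fault /opt/redis/sh/redis_fault.sh # on entering the FAULT state (with a non-zero weight, a failed check lowers the priority rather than triggering FAULT)
notify_stop /opt/redis/sh/redis_stop.sh # when the keepalived service stops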
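To make the weight -20 arithmetic concrete, here is a minimal shell sketch. The function name effective_priority is made up for illustration and is not part of keepalived; it only models the rule described in the notes (the negative weight is added to the base priority while the check is failing), using the priorities 200 for A and 190 for B from the configuration files.

```shell
#!/bin/bash
# Hypothetical helper, not part of keepalived: models how a negative
# vrrp_script weight lowers the priority while the check is failing.
# usage: effective_priority BASE WEIGHT CHECK_EXIT_CODE
effective_priority() {
    local base=$1 weight=$2 check_rc=$3
    if [ "$check_rc" -eq 0 ]; then
        echo "$base"                 # check passed: priority unchanged
    else
        echo $(( base + weight ))    # check failed: the negative weight is added
    fi
}

# A has priority 200, B has 190, weight is -20.
a_ok=$(effective_priority 200 -20 0)
a_failed=$(effective_priority 200 -20 1)
echo "A healthy: $a_ok, A with Redis down: $a_failed, B: 190"
```

With Redis down on A the result is 180, which is below B's 190, so B can take over the VIP.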
/opt/redis/sh/redis_check.sh
--begin--
#!/bin/bash
ALIVE=`/usr/local/bin/redis-cli PING`
LOGFILE="/opt/redis/logs/keepalived-redis-state.log"
if [ "$ALIVE" == "PONG" ]; then
echo $ALIVE
#echo "check master pong" >> $LOGFILE
exit 0
else
echo $ALIVE
exit 1
fi
--end--
/opt/redis/sh/redis_master.sh
--begin--
#!/bin/bash
REDISCLI="/usr/local/bin/redis-cli"
LOGFILE="/opt/redis/logs/keepalived-redis-state.log"
echo "[master]" >> $LOGFILE
date >> $LOGFILE
echo "Being master...." >> $LOGFILE 2>&1
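# Sync from B first, so writes that B accepted while it was master are picked up
echo "Run SLAVEOF cmd ..." >> $LOGFILE
$REDISCLI SLAVEOF 10.0.12.2 6379 >> $LOGFILE 2>&1
# Allow time for the sync to finish
sleep 15
# Then stop replicating and become the Redis master
echo "Run SLAVEOF NO ONE cmd ..." >> $LOGFILE
$REDISCLI SLAVEOF NO ONE >> $LOGFILE 2>&1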
--end--
/opt/redis/sh/redis_backup.sh
--begin--
#!/bin/bash
REDISCLI="/usr/local/bin/redis-cli"
LOGFILE="/opt/redis/logs/keepalived-redis-state.log"
echo "[backup]" >> $LOGFILE
date >> $LOGFILE
echo "Being slave...." >> $LOGFILE 2>&1
sleep 15
echo "Run SLAVEOF cmd..." >> $LOGFILE
$REDISCLI SLAVEOF 10.0.12.2 6379 >> $LOGFILE 2>&1
--end--
/opt/redis/sh/redis_fault.sh
--begin--
#!/bin/bash
LOGFILE="/opt/redis/logs/keepalived-redis-state.log"
echo "[fault]" >> $LOGFILE
date >> $LOGFILE
sh /opt/redis/sh/redis_backup.sh
--end--
/opt/redis/sh/redis_stop.sh
--begin--
#!/bin/bash
LOGFILE="/opt/redis/logs/keepalived-redis-state.log"
echo "[stop]" >> $LOGFILE
date >> $LOGFILE
sh /opt/redis/sh/redis_backup.sh
--end--
Configuration on server B
keepalived configuration file
-------begin------
! Configuration File for keepalived
global_defs {
lvs_id LVS_redis
}
vrrp_script chk_redis {
script "/opt/redis/sh/redis_check.sh"
weight -20
interval 2
}
vrrp_instance VI_1 {
state BACKUP
interface bond0
virtual_router_id 51
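priority 190
# advert_int is not set here, so the default applies; A uses advert_int 5, and VRRP peers are normally configured with the same interval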
authentication {
auth_type PASS
auth_pass 1111
}
track_script {
chk_redis
}
virtual_ipaddress {
10.0.11.0
}
notify_master /opt/redis/sh/redis_master.sh
notify_backup /opt/redis/sh/redis_backup.sh
notify_fault /opt/redis/sh/redis_fault.sh
notify_stop /opt/redis/sh/redis_stop.sh
}
-----end-----
/opt/redis/sh/redis_check.sh
--begin--
#!/bin/bash
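# Note: this pings Redis on A (10.0.11.2), not the local instance
ALIVE=`/usr/local/bin/redis-cli -h 10.0.11.2 -p 6379 PING`
LOGFILE="/opt/redis/logs/keepalived-redis-state.log"
# The result is inverted compared with A's script: the check passes
# only when A's Redis does NOT answer PONG
if [ "$ALIVE" != "PONG" ]; then
echo $ALIVE
exit 0
else
# A's Redis is healthy: fail the check, which lowers B's priority by 20
echo $ALIVE
exit 1
fi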
--end--
/opt/redis/sh/redis_master.sh
--begin--
#!/bin/bash
REDISCLI="/usr/local/bin/redis-cli"
LOGFILE="/opt/redis/logs/keepalived-redis-state.log"
echo "[master]" >> $LOGFILE
date >> $LOGFILE
echo "Being master...." >> $LOGFILE 2>&1
##echo "master Run SLAVEOF 10.0.11.2 cmd ..." >> $LOGFILE
##REDISCLI SLAVEOF 10.0.11.2 6379 >> $LOGFILE 2>&1
#sleep 10
echo "Run SLAVEOF NO ONE cmd ..." >> $LOGFILE
$REDISCLI SLAVEOF NO ONE >> $LOGFILE 2>&1
--end--
/opt/redis/sh/redis_backup.sh
--begin--
#!/bin/bash
REDISCLI="/usr/local/bin/redis-cli"
LOGFILE="/opt/redis/logs/keepalived-redis-state.log"
echo "[backup]" >> $LOGFILE
date >> $LOGFILE
echo "Being slave...." >> $LOGFILE 2>&1
#sleep 10
echo "backup Run SLAVEOF 10.0.11.2 cmd..." >> $LOGFILE
$REDISCLI SLAVEOF 10.0.11.2 6379 >> $LOGFILE 2>&1
--end--
/opt/redis/sh/redis_fault.sh
--begin--
#!/bin/bash
LOGFILE="/opt/redis/logs/keepalived-redis-state.log"
echo "[fault]" >> $LOGFILE
date >> $LOGFILE
sh /opt/redis/sh/redis_backup.sh
--end--
/opt/redis/sh/redis_stop.sh
--begin--
#!/bin/bash
LOGFILE="/opt/redis/logs/keepalived-redis-state.log"
echo "[stop]" >> $LOGFILE
date >> $LOGFILE
sh /opt/redis/sh/redis_backup.sh
--end--
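About the scripts:
The logic is that while Redis is healthy on both A and B, A is the master and B is the slave.
If A's service is detected as unhealthy, B becomes the master. The command "/usr/local/bin/redis-cli SLAVEOF NO ONE" turns off replication and makes that instance a Redis master.
Once A's service is back, A switches back to master, and before becoming master it syncs the latest data from B. At the same time, "/usr/local/bin/redis-cli SLAVEOF 10.0.11.2 6379" must be run on B
so that B becomes A's slave again; otherwise B remains a master.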
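The master script on A waits a fixed 15 seconds for the sync from B before running SLAVEOF NO ONE. As a hedged alternative (this is not what the scripts in this post do), the replication fields of Redis INFO could be checked instead. The sketch below parses INFO replication text from standard input. The function name replication_caught_up is an assumption for illustration; master_link_status and master_sync_in_progress are fields Redis reports on a slave.

```shell
#!/bin/bash
# Hypothetical helper: reads `redis-cli INFO replication` output on stdin and
# succeeds only if the link to the master is up and no sync is in progress.
replication_caught_up() {
    local info link sync
    info=$(tr -d '\r')     # INFO lines end in CRLF; drop the CR
    link=$(printf '%s\n' "$info" | sed -n 's/^master_link_status://p')
    sync=$(printf '%s\n' "$info" | sed -n 's/^master_sync_in_progress://p')
    [ "$link" = "up" ] && [ "$sync" = "0" ]
}

# Example with canned INFO text (no Redis server needed):
sample=$'role:slave\r\nmaster_host:10.0.12.2\r\nmaster_link_status:up\r\nmaster_sync_in_progress:0\r\n'
if printf '%s' "$sample" | replication_caught_up; then
    echo "slave is in sync, safe to run SLAVEOF NO ONE"
fi
```

In a real script the input would come from "/usr/local/bin/redis-cli INFO replication", polled in a loop with an upper time limit so that a dead peer cannot block promotion forever.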
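In a Redis master/slave setup, "/usr/local/bin/redis-cli -h 10.0.11.2 INFO" shows the current state of an instance, including whether that server is currently a master or a slave.
On the command line:
"tail -30 /opt/redis/logs/keepalived-redis-state.log" shows keepalived's state transitions
"tail -30 /var/log/messages" shows changes to the keepalived virtual IP.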
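For scripting the INFO check rather than reading the output by eye, the role line can be extracted. This is a minimal sketch; redis_role is a hypothetical helper name, and the canned text stands in for real redis-cli output.

```shell
#!/bin/bash
# Hypothetical helper: prints the value of the "role:" line found in
# `redis-cli INFO` output supplied on stdin (master or slave).
redis_role() {
    tr -d '\r' | sed -n 's/^role://p'
}

# Example with canned INFO text; a live check would pipe in the output of
# `/usr/local/bin/redis-cli -h 10.0.11.2 INFO` instead.
printf 'role:master\r\nconnected_slaves:1\r\n' | redis_role
```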
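The setup above has been tested in a real production environment. Some parts may not be ideal, but failover works and the data was not lost. An earlier setup based on online tutorials could switch the VIP but had problems with the data, which is why it was changed to this.
If you know a better way, please let me know. Thanks!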
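How keepalived works
By default keepalived only monitors network failures and keepalived itself, that is, it switches over when the network fails or keepalived itself has a problem. What we care about more is the service running on the machine: if the service fails and the VIP does not move, the setup as a whole has still failed. So the decision to switch between master and backup has to depend on the state of the service process. Fortunately keepalived supports custom check scripts, which is what is used here to control the service.
Overall approach:
keepalived's custom script feature monitors the local Redis service. When the check script detects that Redis is unhealthy, the local keepalived priority changes, which in turn changes the master/backup roles. keepalived also runs the configured scripts on a role change, which gives us the chance to change the Redis master/slave state. The goal is to keep the data consistent between the Redis master and slave.
There are four cases when running keepalived + Redis:
1 keepalived is down and Redis is down too. The VIP simply moves away and no Redis data sync is needed: with Redis down there is nothing to sync from on the master. Data already written to the master but not yet replicated to the slave is lost.
2 keepalived is down but Redis is not. After the VIP moves, the Redis master/slave relationship is still the old one. If nothing changes, data would be written to the Redis slave and would never be replicated to the master, so the scripts have to reverse the Redis master/slave relationship. Some time has to be allowed for data sync first, then the roles are reversed.
3 keepalived is up but Redis is down. The check script detects that Redis is down and the priority of the keepalived master is lowered, which also moves the VIP. This is the same as case 2: sync the data, then reverse the current Redis master/slave relationship.
4 keepalived is up and Redis is up. All good, nothing to do.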
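The four cases can be summarized as a small lookup. This is only an illustrative sketch of the list above: failover_action is a hypothetical helper, and the messages paraphrase the text rather than anything keepalived prints.

```shell
#!/bin/bash
# Hypothetical helper: maps the four cases above to the action the text
# describes for the node that currently holds the VIP.
# usage: failover_action KEEPALIVED_STATE REDIS_STATE   (each "up" or "down")
failover_action() {
    case "$1/$2" in
        down/down) echo "VIP moves; no sync possible, unreplicated writes are lost" ;;
        down/up)   echo "VIP moves; sync data, then reverse master/slave" ;;
        up/down)   echo "priority lowered, VIP moves; sync data, then reverse master/slave" ;;
        up/up)     echo "nothing to do" ;;
        *)         echo "unknown state: $1/$2" >&2; return 1 ;;
    esac
}

failover_action up down
```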
Copyright notice: this is the author's original article and may not be reproduced without the author's permission.