34fw posted on 2015-1-14 08:50:24

keepalived+redis automatic master/slave failover

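Contents
keepalived+redis automatic master/slave failover
Design approach
Installation
Install keepalived on both master and backup hosts
Install redis on both master and backup hosts
Master host configuration
Script for the master state
Script for the slave state
Script for the fault state
Script for the stop state
Backup host configuration
Script for the master state
Script for the slave state
Script for the fault state
Script for the stop state
Workflow test
Simulating a failure
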
Environment: centos 5.8 64bit
keepalived : keepalived-1.1.15
redis: redis-2.8.19

Version history
Date        Version   Description                                        Author
2015-1-9    1.1       keepalived+redis automatic master/slave failover   csc

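Design approach:
When keepalived and redis run together there are four cases:
1. keepalived is down and redis is down as well. The VIP simply moves away and no redis data synchronization is needed, because with redis down there is nothing on the master to synchronize from. Any data already written to the master but not yet replicated to the slave is lost.
2. keepalived is down but redis is still up. After the VIP moves, redis keeps the old master/slave relationship. If nothing changes, writes land on the redis slave and are never replicated to the master, so a monitoring script has to reverse the redis master/slave relationship. Some time must be reserved for data synchronization before the roles are reversed.
3. keepalived is up but redis is down. The monitoring script detects that redis is down and lowers the priority of the keepalived master, which also makes the VIP move. As in case 2, the data must be synchronized and then the current redis master/slave relationship reversed.
4. keepalived is up and redis is up. All is well and nothing needs to be done.
The test environment in this article covers all four cases. Case 1 needs no data synchronization; the scripts attempt it anyway by default, but it will not succeed. The scripts mainly exist to handle cases 2 and 3.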
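The four cases above can be condensed into a small lookup. This is only an illustrative sketch: the function name failover_action and the action labels are made up for this example and do not appear in the scripts used later.

```shell
#!/bin/bash
# Hypothetical helper: map (keepalived state, redis state) on the current
# master host to the action described in the four cases above.
failover_action() {
    case "$1:$2" in
        down:down)       echo "vip-moves-no-sync" ;;               # case 1
        down:up|up:down) echo "vip-moves-sync-then-swap-roles" ;;  # cases 2 and 3
        up:up)           echo "no-action" ;;                       # case 4
        *)               echo "unknown"; return 1 ;;
    esac
}

# Print the action for every combination.
for k in up down; do
    for r in up down; do
        echo "keepalived=$k redis=$r -> $(failover_action "$k" "$r")"
    done
done
```

Cases 2 and 3 share one branch because in both the VIP moves while one redis instance still holds data that has to be synchronized before the roles are reversed.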

Installation:
Install keepalived on both master and backup hosts
yum -y install ipvsadm    (apparently optional)
lsmod | grep ip_vs
modprobe ip_vs

tar -xvzf keepalived-1.1.15.tar.gz
cd keepalived-1.1.15
./configure
make
make install

mkdir /etc/keepalived
cp /usr/local/etc/rc.d/init.d/keepalived /etc/rc.d/init.d/
cp /usr/local/etc/sysconfig/keepalived /etc/sysconfig/
cp /usr/local/etc/keepalived/keepalived.conf /etc/keepalived/
cp /usr/local/sbin/keepalived /usr/sbin/
/etc/init.d/keepalived start
ps -ef | grep keepalived


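Install redis on both master and backup hosts and configure master/slave replication. Refer to the redis master/slave configuration.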

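Master host configuration:
# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived
vrrp_script chk_redis {
    script "/etc/keepalived/scripts/redis_check.sh"
    interval 1    # check interval
    weight -20
    # The weight line is required. When the script returns 1 the priority is
    # changed, which triggers the master/backup switch; when it returns 0 the
    # priority is left alone. Many sites omit this line, and then the switch
    # never happens.
    # Priority change driven by the script result: a weight of 20 means
    # priority +20; -20 means priority -20.
}
Once the vrrp_script block is defined, it can be used in an instance.

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    nopreempt    # keepalived honours this only when state is BACKUP; see the nopreempt note after this configuration
    priority 100
    advert_int 1
    ## Heartbeat advertisement interval in seconds. By default three checks
    ## are made, 1 s x 3 = 3 s in total. Delaying the switch by 3 s gives the
    ## redis_backup/master.sh scripts enough time to finish.
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    track_script {
        chk_redis
    }
    virtual_ipaddress {
        10.8.10.130
    }

    notify_master /etc/keepalived/scripts/redis_master.sh
    notify_backup /etc/keepalived/scripts/redis_slave.sh
    notify_fault  /etc/keepalived/scripts/redis_fault.sh
    notify_stop   /etc/keepalived/scripts/redis_stop.sh
}
notify_stop     script run before keepalived stops
notify_master   script run when keepalived switches to master
notify_backup   script run when keepalived switches to backup
notify_fault    script run when keepalived enters the fault state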
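The effect of weight -20 on the election can be checked with simple arithmetic. The sketch below models how a vrrp_script weight adjusts priority (a negative weight is applied when the check fails, a positive one when it succeeds); effective_priority is a name invented for this example, and the numbers are the priorities used in this article (100 on the master, 90 on the backup).

```shell
#!/bin/bash
# Hypothetical helper, not part of keepalived.
# Usage: effective_priority BASE WEIGHT CHECK_EXIT_STATUS
effective_priority() {
    local base=$1 weight=$2 rc=$3
    if [ "$weight" -lt 0 ] && [ "$rc" -ne 0 ]; then
        echo $((base + weight))    # check failed: negative weight lowers priority
    elif [ "$weight" -gt 0 ] && [ "$rc" -eq 0 ]; then
        echo $((base + weight))    # check passed: positive weight raises priority
    else
        echo "$base"
    fi
}

master=$(effective_priority 100 -20 1)   # redis_check.sh failed on the master
backup=$(effective_priority 90 -20 0)    # redis_check.sh passed on the backup
if [ "$master" -lt "$backup" ]; then
    echo "VIP moves to the backup ($master < $backup)"
fi
```

With these values the master drops to 80, below the backup's 90, which is why the absolute value of the weight has to exceed the priority gap between the two nodes.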
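The scripts directory holds five scripts. One is redis_check.sh, which checks the state of redis; the other four run when the keepalived state changes. keepalived has four states: master, backup, stop and fault. Because what matters here is the service running on the system, a keepalived that enters the fault or stop state is also treated as having entered the backup state, and the redis master/slave relationship must be reversed. Otherwise the VIP would move while the redis roles stay unchanged, which leads to inconsistent data. As a result the four scripts end up with only two kinds of content.
One more point needs attention: when the master goes down and the backup takes over, the master must not become master again once it comes back. If the recovered master took over again, the service would switch back and forth. This is what the nopreempt parameter is for. nopreempt disables preemption; it can only be set on a node whose state is BACKUP, and that node's priority must be higher than the other node's.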
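The article refers to redis_check.sh but does not list it. A minimal sketch of such a check is shown here, assuming that a PONG reply to redis-cli PING is a sufficient health signal and that redis-cli lives at the same path as in the other scripts; the function name redis_is_alive is invented for this example.

```shell
#!/bin/bash
# Sketch of /etc/keepalived/scripts/redis_check.sh (an assumption, not the
# author's original script).
REDISCLI="${REDISCLI:-/usr/local/bin/redis-cli}"

# Succeeds (status 0) only when the local redis answers PING with PONG.
redis_is_alive() {
    local reply
    reply=$("$REDISCLI" PING 2>/dev/null)
    [ "$reply" = "PONG" ]
}

# Run the check only when the file is installed under its real name, so the
# function can also be sourced or tested on its own.
if [ "${0##*/}" = "redis_check.sh" ]; then
    redis_is_alive
    exit $?
fi
```

keepalived treats a non-zero exit status as a failed check, so with weight -20 a missing PONG lowers this node's priority by 20.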

Script for the master state:
# cat /etc/keepalived/scripts/redis_master.sh
#!/bin/bash
REDISCLI="/usr/local/bin/redis-cli"
LOGFILE="/var/log/keepalived-redis-state.log"

echo "" >> $LOGFILE
date >> $LOGFILE
echo "Being master...." >> $LOGFILE 2>&1

echo "Run SLAVEOF cmd ..." >> $LOGFILE
$REDISCLI SLAVEOF 10.8.10.128 6379 >> $LOGFILE 2>&1
sleep 10    # wait 10 seconds for the data sync to finish before leaving slave mode
echo "Run SLAVEOF NO ONE cmd ..." >> $LOGFILE
$REDISCLI SLAVEOF NO ONE >> $LOGFILE 2>&1
## Running /usr/local/bin/redis-cli SLAVEOF NO ONE keeps redis in the master role
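The fixed sleep 10 in the master script is a guess at how long the sync takes. One possible refinement, sketched here as an assumption rather than as part of the original setup, is to poll the master_link_status field of INFO replication and stop waiting as soon as it reports up. The function name wait_for_sync and its timeout argument are invented for this example.

```shell
#!/bin/bash
REDISCLI="${REDISCLI:-/usr/local/bin/redis-cli}"

# Poll the local redis once per second until the replication link to its
# master is up, or until MAX_WAIT seconds (default 10) have passed.
# Returns 0 when the link came up, 1 on timeout.
wait_for_sync() {
    local max_wait=${1:-10} waited=0
    while [ "$waited" -lt "$max_wait" ]; do
        if "$REDISCLI" INFO replication 2>/dev/null \
            | grep -q '^master_link_status:up'; then
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 1
}
```

In redis_master.sh this could stand in for the sleep, for example wait_for_sync 10 placed between the SLAVEOF command and SLAVEOF NO ONE. The 10 second cap keeps the behaviour of the original script in case 1, where the old master is unreachable and the link never comes up.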

Script for the slave state
# cat /etc/keepalived/scripts/redis_slave.sh
#!/bin/bash
REDISCLI="/usr/local/bin/redis-cli"
LOGFILE="/var/log/keepalived-redis-state.log"

echo "" >> $LOGFILE
date >> $LOGFILE
echo "Being slave...." >> $LOGFILE 2>&1

sleep 15    # wait 15 seconds for the other side to finish syncing the data before switching roles
echo "Run SLAVEOF cmd ..." >> $LOGFILE
$REDISCLI SLAVEOF 10.8.10.128 6379 >> $LOGFILE 2>&1

Script for the fault state
# cat /etc/keepalived/scripts/redis_fault.sh
#!/bin/bash
LOGFILE=/var/log/keepalived-redis-state.log
echo "" >> $LOGFILE
date >> $LOGFILE
Same on the backup host.

Script for the stop state
# cat /etc/keepalived/scripts/redis_stop.sh
#!/bin/bash
LOGFILE=/var/log/keepalived-redis-state.log
echo "" >> $LOGFILE
date >> $LOGFILE
Same on the backup host.


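Backup host configuration:
# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived
vrrp_script chk_redis {
    script "/etc/keepalived/scripts/redis_check.sh"
    interval 1    # check interval
    weight -20
    # The weight line is required. When the script returns 1 the priority is
    # changed, which triggers the master/backup switch; when it returns 0 the
    # priority is left alone. Many sites omit this line, and then the switch
    # never happens.
    # Priority change driven by the script result: a weight of 20 means
    # priority +20; -20 means priority -20.
}
Once the vrrp_script block is defined, it can be used in an instance.
vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 90
    advert_int 1
    ## Heartbeat advertisement interval in seconds. By default three checks
    ## are made, 1 s x 3 = 3 s in total. Delaying the switch by 3 s gives the
    ## redis_backup/master.sh scripts enough time to finish.
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    track_script {
        chk_redis
    }
    virtual_ipaddress {
        10.8.10.130
    }
    notify_master /etc/keepalived/scripts/redis_master.sh
    notify_backup /etc/keepalived/scripts/redis_slave.sh
    notify_fault  /etc/keepalived/scripts/redis_fault.sh
    notify_stop   /etc/keepalived/scripts/redis_stop.sh
}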

Script for the master state:
# cat /etc/keepalived/scripts/redis_master.sh
#!/bin/bash
REDISCLI="/usr/local/bin/redis-cli"
LOGFILE="/var/log/keepalived-redis-state.log"

echo "" >> $LOGFILE
date >> $LOGFILE
echo "Being master...." >> $LOGFILE 2>&1

echo "Run SLAVEOF cmd ..." >> $LOGFILE
$REDISCLI SLAVEOF 10.8.10.115 6379 >> $LOGFILE 2>&1
sleep 10    # wait 10 seconds for the data sync to finish before leaving slave mode
echo "Run SLAVEOF NO ONE cmd ..." >> $LOGFILE
$REDISCLI SLAVEOF NO ONE >> $LOGFILE 2>&1
## Running /usr/local/bin/redis-cli SLAVEOF NO ONE keeps redis in the master role

Script for the slave state
# cat /etc/keepalived/scripts/redis_slave.sh
#!/bin/bash
REDISCLI="/usr/local/bin/redis-cli"
LOGFILE="/var/log/keepalived-redis-state.log"

echo "" >> $LOGFILE
date >> $LOGFILE
echo "Being slave...." >> $LOGFILE 2>&1

sleep 15    # wait 15 seconds for the other side to finish syncing the data before switching roles
echo "Run SLAVEOF cmd ..." >> $LOGFILE
$REDISCLI SLAVEOF 10.8.10.115 6379 >> $LOGFILE 2>&1

Script for the fault state
# cat /etc/keepalived/scripts/redis_fault.sh
#!/bin/bash
LOGFILE=/var/log/keepalived-redis-state.log
echo "" >> $LOGFILE
date >> $LOGFILE
Same as on the master host.

Script for the stop state
# cat /etc/keepalived/scripts/redis_stop.sh
#!/bin/bash
LOGFILE=/var/log/keepalived-redis-state.log
echo "" >> $LOGFILE
date >> $LOGFILE
Same as on the master host.



Workflow test:
1. Start Redis on the Master
# pwd
/usr/local/bin
# ./redis-server redis.conf
2. Start Redis on the Slave
# pwd
/usr/local/bin
# ./redis-server redis.conf
3. Start Keepalived on the Master
/etc/init.d/keepalived start
4. Start Keepalived on the Slave
/etc/init.d/keepalived start
5. Try connecting to Redis through the VIP:
# pwd
/usr/local/bin
# ./redis-cli -h 192.168.1.237 info
role:master
slave0:192.168.1.236,6379,online
The connection succeeds, and the Slave has connected as well.
6. Try inserting some data:
# ./redis-cli -h 192.168.1.237 SET Hello Redis
Read the data through the VIP
# ./redis-cli -h 192.168.1.237 GET Hello
"Redis"
Read the data from the Master
# ./redis-cli -h 192.168.1.235 GET Hello
"Redis"
Read the data from the Slave
# ./redis-cli -h 192.168.1.236 GET Hello
"Redis"
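
Simulating a failure:
Kill the Redis process on the Master:
# ./redis-cli shutdown
Check the Keepalived log on the Master
# tail /var/log/keepalived-redis-state.log
Thu Sep 27 08:29:01 CST 2012
Meanwhile the log on the Slave shows:
# tail /var/log/keepalived-redis-state.log
Thu Nov 15 12:06:04 CST 2012
Being master....
Run SLAVEOF cmd ...
OK
Run SLAVEOF NO ONE cmd ...
OK

We can then see that the Slave has taken over the service and now acts as Master.
./redis-cli -h 192.168.1.237 info
./redis-cli -h 192.168.1.236 info
role:master
Next, restore the Redis process on the Master; the former master becomes a slave.
Then stop redis on 236 so that 235 regains the master role, and start redis on 236 again.
235 is master again and 236 is the backup.
Automatic failover succeeded.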
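Reading the full info output by eye gets tedious when the failover is repeated several times. A small helper can print only the role of a given host. This is a sketch assuming redis-cli at the path used by the scripts in this article; the function name redis_role is invented for this example.

```shell
#!/bin/bash
REDISCLI="${REDISCLI:-/usr/local/bin/redis-cli}"

# Print "master" or "slave" for the redis instance at HOST.
# INFO lines end in CR LF, so the carriage returns are stripped first.
redis_role() {
    "$REDISCLI" -h "$1" INFO replication 2>/dev/null \
        | tr -d '\r' \
        | sed -n 's/^role://p'
}
```

For example, redis_role 192.168.1.237 queries the VIP and redis_role 192.168.1.236 queries the former slave; after the failover described above, both are expected to report master.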

