Automatic failover for a highly available Redis cluster with keepalived

Posted by 阿牛 on 2018-11-7 08:06:30

System architecture diagram:

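Our goal is simple.
104 and 107 run keepalived together with Redis as a master/slave pair. The other servers, such as 105 and 106, are Redis slaves that replicate from the VIP 192.168.56.180.
The master keepalived node handles the day-to-day workload and the backup keepalived node stands by. If the master node goes down, the backup node immediately promotes its Redis from slave to master and takes over service, so the switchover is seamless for the application. Once the failed server is repaired and back online, it resumes the master role and the backup's Redis automatically returns to slave state, without disrupting replication for the Redis slaves attached to the VIP. This meets the needs of a high-concurrency architecture.
Installing keepalived on Ubuntu is straightforward: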
apt-get install libssl-dev
apt-get install openssl
apt-get install libpopt-dev
apt-get install keepalived
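Installing Redis is also simple:
There are many guides online. You can also refer to another article of mine: "A shell install script for Redis that sets up a master/slave pair on a single Linux machine".
An article by 郭冬 gave me a lot of inspiration and is used as a reference here: "Implementing automatic Redis failover with Keepalived".
Below is the keepalived configuration for the master, 192.168.56.104.
/etc/keepalived/keepalived.conf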
global_defs {
    notification_email {
        409011500@qq.com
    }
    notification_email_from 409011500@qq.com
    smtp_server 127.0.0.1             ### if an SMTP server is configured on this host
    smtp_connect_timeout 30
    router_id redis-ha
}

vrrp_script chk_redis {
    script "/home/lhb/sh/redis_check.sh"    ### health-check script
    interval 2                              ### check interval in seconds
}

vrrp_instance VI_1 {
    state MASTER                      ### set to MASTER
    interface eth0                    ### network interface to use
    virtual_router_id 52
    priority 101                      ### priority
    authentication {
        auth_type PASS                ### authentication type
        auth_pass redis               ### password
    }
    track_script {
        chk_redis                     ### run the chk_redis script defined above
    }
    virtual_ipaddress {
        192.168.56.180                ### VIP
    }
    notify_master /home/lhb/sh/redis_master.sh
    notify_backup /home/lhb/sh/redis_backup.sh
}
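To make the election behaviour concrete, here is a minimal sketch in shell that models which node should end up holding the VIP. It is only an illustration under simplifying assumptions: `vip_holder` is a hypothetical helper and not part of keepalived, node A stands for 104 (priority 101) and node B for 107 (priority 100), preemption is assumed to be left at its default, a failing track script with no weight set is assumed to take the node out of the running, and priority ties are not modelled.

```shell
#!/bin/bash
# vip_holder PRIO_A CHECK_A PRIO_B CHECK_B
# Hypothetical model, not part of keepalived: prints which node ("A", "B" or
# "none") should hold the VIP. CHECK_* is the exit code of that node's
# track script (0 = Redis healthy). Priority ties are not modelled.
vip_holder() {
    prio_a="$1"; check_a="$2"; prio_b="$3"; check_b="$4"
    if [ "$check_a" -ne 0 ] && [ "$check_b" -ne 0 ]; then
        echo "none"          # both nodes failed their check
    elif [ "$check_a" -ne 0 ]; then
        echo "B"             # A failed its check, B takes over
    elif [ "$check_b" -ne 0 ]; then
        echo "A"             # B failed its check, A keeps the VIP
    elif [ "$prio_a" -gt "$prio_b" ]; then
        echo "A"             # both healthy: higher priority wins
    else
        echo "B"
    fi
}

vip_holder 101 0 100 0   # both healthy, prints A
vip_holder 101 1 100 0   # Redis on node A is down, prints B
```

The second call corresponds to the failover case in this post: once the check script on 104 fails, 107 is the only healthy candidate and takes the VIP.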
　　/home/lhb/sh/redis_master.sh
#!/bin/bash
REDISCLI="/usr/local/redis/bin/redis-cli"
LOGFILE="/usr/local/redis/log/keepalived-redis-state.log"

echo "" >> "$LOGFILE"
date >> "$LOGFILE"
echo "Being master...." >> "$LOGFILE" 2>&1

# First replicate from the peer (192.168.56.107), which may hold newer data
echo "Run SLAVEOF cmd ..." >> "$LOGFILE"
"$REDISCLI" SLAVEOF 192.168.56.107 6379 >> "$LOGFILE" 2>&1
sleep 10 # wait 10 seconds for the sync to finish before leaving slave state

# Then stop replicating and become the master
echo "Run SLAVEOF NO ONE cmd ..." >> "$LOGFILE"
"$REDISCLI" SLAVEOF NO ONE >> "$LOGFILE" 2>&1
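The fixed `sleep 10` assumes the sync always finishes within ten seconds. One possible alternative, offered here as a sketch and not as part of the original setup, is to poll `INFO replication` until it reports `master_link_status:up`. The function name `wait_for_sync` and its arguments are hypothetical; `REDISCLI` uses the same path as the scripts in this post.

```shell
#!/bin/bash
# Same redis-cli path as in this post; can be overridden from the environment.
REDISCLI="${REDISCLI:-/usr/local/redis/bin/redis-cli}"

# wait_for_sync MAX_TRIES [INTERVAL]
# Hypothetical helper: returns 0 as soon as INFO replication reports
# master_link_status:up, or 1 if that never happens within MAX_TRIES polls.
wait_for_sync() {
    max_tries="$1"
    interval="${2:-1}"
    i=0
    while [ "$i" -lt "$max_tries" ]; do
        # redis-cli output uses CRLF line endings, so strip the CR first
        if "$REDISCLI" INFO replication | tr -d '\r' \
                | grep -q '^master_link_status:up$'; then
            return 0
        fi
        sleep "$interval"
        i=$((i + 1))
    done
    return 1
}
```

In redis_master.sh a call such as `wait_for_sync 10 1` could stand in for the `sleep 10` line. If the peer's Redis is down the link never comes up, so the script should continue to `SLAVEOF NO ONE` regardless of the return value.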
　　/home/lhb/sh/redis_backup.sh
#!/bin/bash
REDISCLI="/usr/local/redis/bin/redis-cli"
LOGFILE="/usr/local/redis/log/keepalived-redis-state.log"

echo "" >> "$LOGFILE"
date >> "$LOGFILE"
echo "Being slave...." >> "$LOGFILE" 2>&1

sleep 15 # wait 15 seconds so the peer can finish syncing our data before we switch roles

echo "Run SLAVEOF cmd ..." >> "$LOGFILE"
"$REDISCLI" SLAVEOF 192.168.56.107 6379 >> "$LOGFILE" 2>&1
　　/usr/local/redis/etc/redis.conf
daemonize yes
pidfile /var/run/redis.pid
port 6379
tcp-backlog 511
timeout 0
tcp-keepalive 0
loglevel notice
logfile "/usr/local/redis/log/redis.log"
databases 16
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename dump.rdb
dir /usr/local/redis/data
slave-serve-stale-data yes
slave-read-only no
repl-disable-tcp-nodelay no
slave-priority 100
appendonly yes
appendfilename "appendonly.aof"
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
lua-time-limit 5000
slowlog-log-slower-than 10000
slowlog-max-len 128
notify-keyspace-events ""
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-entries 512
list-max-ziplist-value 64
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit slave 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
hz 10
aof-rewrite-incremental-fsync yes
keepalived configuration for the backup, 192.168.56.107
　　/etc/keepalived/keepalived.conf
global_defs {
    notification_email {
        409011500@qq.com
    }
    notification_email_from 409011500@qq.com
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id redis-ha
}

vrrp_script chk_redis {
    script "/home/lhb/sh/redis_check.sh"    ### health-check script
    interval 2                              ### check interval in seconds
}

vrrp_instance VI_1 {
    state BACKUP                      ### set to BACKUP
    interface eth0                    ### network interface to use
    virtual_router_id 52
    priority 100                      ### lower than the MASTER's priority
    authentication {
        auth_type PASS
        auth_pass redis               ### same password as on the MASTER
    }
    track_script {
        chk_redis                     ### run the chk_redis script defined above
    }
    virtual_ipaddress {
        192.168.56.180                ### VIP
    }
    notify_master /home/lhb/sh/redis_master.sh
    notify_backup /home/lhb/sh/redis_backup.sh
}
　　/home/lhb/sh/redis_master.sh
#!/bin/bash
REDISCLI="/usr/local/redis/bin/redis-cli"
LOGFILE="/usr/local/redis/log/keepalived-redis-state.log"

echo "" >> "$LOGFILE"
date >> "$LOGFILE"
echo "Being master...." >> "$LOGFILE" 2>&1

# First replicate from the peer (192.168.56.104), which may hold newer data
echo "Run SLAVEOF cmd ..." >> "$LOGFILE"
"$REDISCLI" SLAVEOF 192.168.56.104 6379 >> "$LOGFILE" 2>&1
sleep 10 # wait 10 seconds for the sync to finish before leaving slave state

# Then stop replicating and become the master
echo "Run SLAVEOF NO ONE cmd ..." >> "$LOGFILE"
"$REDISCLI" SLAVEOF NO ONE >> "$LOGFILE" 2>&1
　　/home/lhb/sh/redis_backup.sh
#!/bin/bash
REDISCLI="/usr/local/redis/bin/redis-cli"
LOGFILE="/usr/local/redis/log/keepalived-redis-state.log"

echo "" >> "$LOGFILE"
date >> "$LOGFILE"
echo "Being slave...." >> "$LOGFILE" 2>&1

sleep 15 # wait 15 seconds so the peer can finish syncing our data before we switch roles

echo "Run SLAVEOF cmd ..." >> "$LOGFILE"
"$REDISCLI" SLAVEOF 192.168.56.104 6379 >> "$LOGFILE" 2>&1
　　/home/lhb/sh/redis_check.sh
#!/bin/bash
# Exit 0 if the local Redis answers PING with PONG, exit 1 otherwise
ALIVE=$(/usr/local/redis/bin/redis-cli PING)

if [ "$ALIVE" = "PONG" ]; then
    echo "$ALIVE"
    exit 0
else
    echo "$ALIVE"
    exit 1
fi
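A single failed PING makes the check fail, which may trigger a switchover on a momentary hiccup. As a sketch only, and not something the original setup does, the check could retry a few times before reporting failure. `check_redis` is a hypothetical function name; there is deliberately no sleep between attempts so the check can still finish within the 2-second `interval`.

```shell
#!/bin/bash
# Same redis-cli path as in this post; can be overridden from the environment.
REDISCLI="${REDISCLI:-/usr/local/redis/bin/redis-cli}"

# check_redis TRIES
# Hypothetical helper: returns 0 if any of TRIES consecutive PING attempts
# is answered with PONG, and 1 if all of them fail.
check_redis() {
    tries="$1"
    i=0
    while [ "$i" -lt "$tries" ]; do
        # Discard error text and any trailing CR before comparing
        reply=$("$REDISCLI" PING 2>/dev/null | tr -d '\r')
        if [ "$reply" = "PONG" ]; then
            return 0
        fi
        i=$((i + 1))
    done
    return 1
}
```

A check script built on this would end with `check_redis 3`, so that its exit status becomes the status keepalived sees.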
　　/usr/local/redis/etc/redis_slave.conf
daemonize yes
pidfile /var/run/redis_salve.pid
port 6379
tcp-backlog 511
timeout 0
tcp-keepalive 0
loglevel notice
logfile "/usr/local/redis/log/redis_slave.log"
databases 16
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename dump_salve.rdb
dir /usr/local/redis/data
slave-serve-stale-data yes
slave-read-only no
repl-disable-tcp-nodelay no
slave-priority 100
appendonly yes
appendfilename "appendonly.aof"
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
lua-time-limit 5000
slowlog-log-slower-than 10000
slowlog-max-len 128
notify-keyspace-events ""
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-entries 512
list-max-ziplist-value 64
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit slave 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
hz 10
aof-rewrite-incremental-fsync yes
SLAVEOF 192.168.56.104 6379
192.168.56.105 and 192.168.56.106 use the same Redis configuration file:
　　/usr/local/redis/etc/redis_salve.conf
daemonize yes
pidfile /var/run/redis_salve.pid
port 6379
tcp-backlog 511
timeout 0
tcp-keepalive 0
loglevel notice
logfile "/usr/local/redis/log/redis_slave.log"
databases 16
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename dump_salve.rdb
dir /usr/local/redis/data
slave-serve-stale-data yes
slave-read-only no
repl-disable-tcp-nodelay no
slave-priority 100
appendonly no
appendfilename "appendonly.aof"
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
lua-time-limit 5000
slowlog-log-slower-than 10000
slowlog-max-len 128
notify-keyspace-events ""
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-entries 512
list-max-ziplist-value 64
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit slave 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
hz 10
aof-rewrite-incremental-fsync yes
SLAVEOF 192.168.56.180 6379
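Now let's check whether everything behaves as we expect.
Start Redis and keepalived on 192.168.56.104 and 192.168.56.107, and Redis on 192.168.56.105 and 192.168.56.106.
On 192.168.56.104 we see the following result: it has acquired the VIP 192.168.56.180.

On 192.168.56.107 we see the following result: it has not acquired the VIP 192.168.56.180.

On 192.168.56.105 and 192.168.56.106 we see the same result: Redis is attached to the VIP 192.168.56.180.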
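For anyone who wants to repeat these checks from a terminal, the following sketch shows one way to do it. The helper names `has_vip` and `redis_role` are hypothetical; they only filter the output of `ip addr` and of `redis-cli INFO replication`, both of which are read from standard input.

```shell
#!/bin/bash
# has_vip VIP
# Hypothetical helper: reads `ip addr` output on stdin and succeeds
# if an "inet VIP/..." line is present, i.e. the VIP is bound locally.
has_vip() {
    grep -qF "inet $1/"
}

# redis_role
# Hypothetical helper: reads `INFO replication` output on stdin and
# prints the value of the role field (master or slave).
redis_role() {
    tr -d '\r' | sed -n 's/^role://p'
}
```

On a node, `ip addr show eth0 | has_vip 192.168.56.180 && echo "VIP held"` reports whether the VIP is bound to eth0, and `/usr/local/redis/bin/redis-cli INFO replication | redis_role` prints the current Redis role.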

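Now we shut down Redis on 192.168.56.104 and look at the result: the VIP has been released.

Then we look at 192.168.56.107: it has acquired the VIP, Redis has switched from slave to master, and the slaves 105 and 106 are both online.

Redis on 192.168.56.105 and 192.168.56.106: still attached to the VIP, with link_status up.

So when the master Redis goes down, Redis on the backup machine is promoted to master right away and the application keeps running normally.
Next we restart Redis on the master and see the following: the master server acquires the VIP again.

Redis info: the backup 192.168.56.107 now appears in the master's slave list.

Then we switch to 192.168.56.107: it has released the VIP, Redis has switched from master back to slave, and the master it points to is 192.168.56.104.

Finally, the Redis info on 192.168.56.105 and 192.168.56.106 shows that they still point to 192.168.56.180.

At this point every feature we originally planned for this architecture is working. Anyone interested is welcome to get in touch and discuss. Thanks.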
