[Experience Sharing] Redis High Availability (Usage Part)

  Redis replication solves the single-point problem, but if the master fails, a human has to step in and perform the failover. Let's first look at how failover is done in a 1-master-2-slave Redis setup (master, slave-1 and slave-2).
  1. After the master fails, clients can no longer connect to it, and both slaves lose their connection to the master, so replication is interrupted.
  2. Pick one slave (slave-1) and run slaveof no one on it to promote it to the new master (new-master).
  3. Once slave-1 has become the new master, update the application's master address and restart the application.
  4. Tell the other slave (slave-2) to replicate the new master (new-master).
  5. When the original master comes back, make it replicate the new master as well. (A command-level sketch of these steps follows.)
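  The commands behind these steps are not shown above; a minimal sketch using redis-cli might look like the following, assuming the ports (6479/6480/6481) and password (abcdefg) used in the deployment later in this post:
  # 2. promote slave-1 (6480) to be the new master
  redis-cli -h 127.0.0.1 -p 6480 -a abcdefg slaveof no one
  # 3. update the application's master address and restart it (application-specific)
  # 4. point slave-2 (6481) at the new master
  redis-cli -h 127.0.0.1 -p 6481 -a abcdefg slaveof 127.0.0.1 6480
  # 5. once the old master (6479) is reachable again, have it replicate the new master
  redis-cli -h 127.0.0.1 -p 6479 -a abcdefg slaveof 127.0.0.1 6480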
  A manual process like this is hard to get right in terms of both accuracy and timeliness, and that is exactly the problem Redis Sentinel solves.
  Redis Sentinel is a distributed architecture made up of several Sentinel nodes and Redis data nodes. Each Sentinel node monitors the data nodes and the other Sentinel nodes; when it finds a node unreachable, it marks that node as down. If the marked node is the master, the Sentinel also negotiates with the other Sentinels, and once a majority of them agree that the master is unreachable, they elect one Sentinel to carry out the automatic failover and notify the Redis application side of the change in real time. The whole process needs no human intervention, which effectively solves Redis's high-availability problem.
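  The "notify the application side in real time" part relies on Sentinel's pub/sub channels. As a rough illustration (using the Sentinel port and master name from the deployment below), an application or script could subscribe to the +switch-master channel and learn the new master address the moment a failover completes:
  redis-cli -p 26479 subscribe +switch-master
  # on failover the message payload is: <master-name> <old-ip> <old-port> <new-ip> <new-port>
  # e.g. sfmaster 127.0.0.1 6479 127.0.0.1 6480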
  Deploying the Redis Sentinel high-availability architecture
  1. Set up 3 Redis data nodes. Initial state: the master on port 6479, slave-1 on port 6480 and slave-2 on port 6481.
  127.0.0.1:6479> info replication
  # Replication
  role:master
  connected_slaves:2
  slave0:ip=127.0.0.1,port=6480,state=online,offset=845,lag=0
  slave1:ip=127.0.0.1,port=6481,state=online,offset=845,lag=0
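  The post does not show the data nodes' configuration; a hypothetical minimal config for slave-1 on port 6480 might look like this (paths follow the directory layout of the Sentinel config below, and requirepass/masterauth are assumptions based on the auth-pass line there; slave-2 on 6481 is analogous):
  port 6480
  daemonize yes
  dir "/home/redis/stayfoolish/6480/data"
  logfile "/home/redis/stayfoolish/6480/log/redis.log"
  requirepass abcdefg
  masterauth abcdefg
  slaveof 127.0.0.1 6479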
  2. Set up 3 Sentinel nodes. The initial configuration file is as follows (the three nodes use ports 26479, 26480 and 26481; the one for 26479 is shown):
  port 26479
  daemonize yes
  loglevel notice
  dir "/home/redis/stayfoolish/26479/data"
  logfile "/home/redis/stayfoolish/26479/log/sentinel.log"
  pidfile "/home/redis/stayfoolish/26479/log/sentinel.pid"
  unixsocket "/home/redis/stayfoolish/26479/log/sentinel.sock"
  # sfmaster
  sentinel monitor sfmaster 127.0.0.1 6479 2
  sentinel auth-pass sfmaster abcdefg
  sentinel down-after-milliseconds sfmaster 30000
  sentinel parallel-syncs sfmaster 1
  sentinel failover-timeout sfmaster 180000
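  The startup command is not shown; each Sentinel is presumably started with its own config file, along these lines (config file path assumed):
  redis-sentinel /home/redis/stayfoolish/26479/sentinel.conf
  # equivalent form: redis-server /home/redis/stayfoolish/26479/sentinel.conf --sentinel
  With the quorum set to 2 in sentinel monitor and 3 Sentinels deployed, any 2 Sentinels agreeing is enough to mark the master objectively down, and a failover leader can still be elected even if one Sentinel is lost.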
  Start the Sentinel nodes and check their info: each one has found the master, discovered the 2 slaves, and also sees a total of 3 Sentinel nodes.
  127.0.0.1:26479> info sentinel
  # Sentinel
  sentinel_masters:1
  sentinel_tilt:0
  sentinel_running_scripts:0
  sentinel_scripts_queue_length:0
  sentinel_simulate_failure_flags:0
  master0:name=sfmaster,status=ok,address=127.0.0.1:6479,slaves=2,sentinels=3
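  Besides info sentinel, the discovered topology can be inspected in more detail with the SENTINEL subcommands, for example:
  redis-cli -p 26479 sentinel masters            # detailed state of each monitored master
  redis-cli -p 26479 sentinel slaves sfmaster    # slaves discovered for sfmaster
  redis-cli -p 26479 sentinel sentinels sfmaster # the other Sentinels monitoring sfmaster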
  At this point Redis Sentinel is up and running; with a working replication setup already in place, the process is fairly easy.
  Next, kill -9 the master on port 6479 to simulate a failure, and follow the failover process through the logs.
  1. Kill the master on port 6479
  $ ps -ef | egrep 'redis-server.*6479' | egrep -v 'egrep' | awk '{print $2}' | xargs kill -9
  127.0.0.1:6479> info replication
  Could not connect to Redis at 127.0.0.1:6479: Connection refused
  not connected>
  2. Look at the log of the Redis node on port 6480: it shows the node failing to reach port 6479, being promoted to the new master by a Sentinel, and then serving the replication request from port 6481.
  ~/stayfoolish/6480/log $ tail -f redis.log
  20047:S 22 Jul 03:03:22.946 # Error condition on socket for SYNC: Connection refused
  20047:S 22 Jul 03:03:23.954 * Connecting to MASTER 127.0.0.1:6479
  20047:S 22 Jul 03:03:23.955 * MASTER <-> SLAVE sync started
  20047:S 22 Jul 03:03:23.955 # Error condition on socket for SYNC: Connection refused
  ...
  20047:S 22 Jul 03:03:38.061 * MASTER <-> SLAVE sync started
  20047:S 22 Jul 03:03:38.061 # Error condition on socket for SYNC: Connection refused
  20047:M 22 Jul 03:03:38.963 * Discarding previously cached master state.
  20047:M 22 Jul 03:03:38.963 * MASTER MODE enabled (user request from 'id=27 addr=127.0.0.1:37972 fd=10 name=sentinel-68102904-cmd age=882 ...)
  20047:M 22 Jul 03:03:38.963 # CONFIG REWRITE executed with success.
  20047:M 22 Jul 03:03:40.075 * Slave 127.0.0.1:6481 asks for synchronization
  20047:M 22 Jul 03:03:40.076 * Full resync requested by slave 127.0.0.1:6481
  20047:M 22 Jul 03:03:40.077 * Starting BGSAVE for SYNC with target: disk
  20047:M 22 Jul 03:03:40.077 * Background saving started by pid 20452
  20452:C 22 Jul 03:03:40.086 * DB saved on disk
  20452:C 22 Jul 03:03:40.086 * RDB: 0 MB of memory used by copy-on-write
  20047:M 22 Jul 03:03:40.175 * Background saving terminated with success
  20047:M 22 Jul 03:03:40.176 * Synchronization with slave 127.0.0.1:6481 succeeded
  Now look at the log on port 6481: it shows the node failing to reach port 6479, receiving the command from a Sentinel, and replicating the new master.
  ~/stayfoolish/6481/log $ tail -f redis.log
  20051:S 22 Jul 03:03:08.590 # Connection with master lost.
  20051:S 22 Jul 03:03:08.590 * Caching the disconnected master state.
  20051:S 22 Jul 03:03:08.844 * Connecting to MASTER 127.0.0.1:6479
  20051:S 22 Jul 03:03:08.844 * MASTER <-> SLAVE sync started
  20051:S 22 Jul 03:03:08.844 # Error condition on socket for SYNC: Connection refused
  ...
  20051:S 22 Jul 03:03:39.067 # Error condition on socket for SYNC: Connection refused
  20051:S 22 Jul 03:03:39.342 * Discarding previously cached master state.
  20051:S 22 Jul 03:03:39.342 * SLAVE OF 127.0.0.1:6480 enabled (user request from 'id=27 addr=127.0.0.1:38660 fd=10 name=sentinel-68102904-cmd age=883 ...)
  20051:S 22 Jul 03:03:39.343 # CONFIG REWRITE executed with success.
  20051:S 22 Jul 03:03:40.074 * Connecting to MASTER 127.0.0.1:6480
  20051:S 22 Jul 03:03:40.074 * MASTER <-> SLAVE sync started
  20051:S 22 Jul 03:03:40.074 * Non blocking connect for SYNC fired the event.
  20051:S 22 Jul 03:03:40.074 * Master replied to PING, replication can continue...
  20051:S 22 Jul 03:03:40.075 * Partial resynchronization not possible (no cached master)
  20051:S 22 Jul 03:03:40.084 * Full resync from master: 84b623afc0824be14bb9187245ff00cab43427c1:1
  20051:S 22 Jul 03:03:40.176 * MASTER <-> SLAVE sync: receiving 77 bytes from master
  20051:S 22 Jul 03:03:40.176 * MASTER <-> SLAVE sync: Flushing old data
  20051:S 22 Jul 03:03:40.176 * MASTER <-> SLAVE sync: Loading DB in memory
  20051:S 22 Jul 03:03:40.176 * MASTER <-> SLAVE sync: Finished with success
  3. Look at the logs of the Sentinel nodes on ports 26479, 26480 and 26481: they show how the Sentinels cooperate to complete the failover (the principles behind it will be covered in the next post).
  ~/stayfoolish/26479/log $ tail -f sentinel.log
  20169:X 22 Jul 03:03:38.720 # +sdown master sfmaster 127.0.0.1 6479
  20169:X 22 Jul 03:03:38.742 # +new-epoch 1
  20169:X 22 Jul 03:03:38.743 # +vote-for-leader 68102904daa4df70bf945677f62498bbdffee1d4 1
  20169:X 22 Jul 03:03:38.778 # +odown master sfmaster 127.0.0.1 6479 #quorum 3/2
  20169:X 22 Jul 03:03:38.779 # Next failover delay: I will not start a failover before Sun Jul 22 03:09:39 2018
  20169:X 22 Jul 03:03:39.346 # +config-update-from sentinel 68102904daa4df70bf945677f62498bbdffee1d4 127.0.0.1 26481 @ sfmaster 127.0.0.1 6479
  20169:X 22 Jul 03:03:39.346 # +switch-master sfmaster 127.0.0.1 6479 127.0.0.1 6480
  20169:X 22 Jul 03:03:39.346 * +slave slave 127.0.0.1:6481 127.0.0.1 6481 @ sfmaster 127.0.0.1 6480
  20169:X 22 Jul 03:03:39.346 * +slave slave 127.0.0.1:6479 127.0.0.1 6479 @ sfmaster 127.0.0.1 6480
  20169:X 22 Jul 03:04:09.393 # +sdown slave 127.0.0.1:6479 127.0.0.1 6479 @ sfmaster 127.0.0.1 6480
  ~/stayfoolish/26480/log $ tail -f sentinel.log
  20171:X 22 Jul 03:03:38.665 # +sdown master sfmaster 127.0.0.1 6479
  20171:X 22 Jul 03:03:38.741 # +new-epoch 1
  20171:X 22 Jul 03:03:38.742 # +vote-for-leader 68102904daa4df70bf945677f62498bbdffee1d4 1
  20171:X 22 Jul 03:03:39.343 # +config-update-from sentinel 68102904daa4df70bf945677f62498bbdffee1d4 127.0.0.1 26481 @ sfmaster 127.0.0.1 6479
  20171:X 22 Jul 03:03:39.344 # +switch-master sfmaster 127.0.0.1 6479 127.0.0.1 6480
  20171:X 22 Jul 03:03:39.344 * +slave slave 127.0.0.1:6481 127.0.0.1 6481 @ sfmaster 127.0.0.1 6480
  20171:X 22 Jul 03:03:39.344 * +slave slave 127.0.0.1:6479 127.0.0.1 6479 @ sfmaster 127.0.0.1 6480
  20171:X 22 Jul 03:04:09.379 # +sdown slave 127.0.0.1:6479 127.0.0.1 6479 @ sfmaster 127.0.0.1 6480
  ~/stayfoolish/26481/log $ tail -f sentinel.log
  20177:X 22 Jul 03:03:38.671 # +sdown master sfmaster 127.0.0.1 6479
  20177:X 22 Jul 03:03:38.730 # +odown master sfmaster 127.0.0.1 6479 #quorum 2/2
  20177:X 22 Jul 03:03:38.730 # +new-epoch 1
  20177:X 22 Jul 03:03:38.730 # +try-failover master sfmaster 127.0.0.1 6479
  20177:X 22 Jul 03:03:38.731 # +vote-for-leader 68102904daa4df70bf945677f62498bbdffee1d4 1
  20177:X 22 Jul 03:03:38.742 # 88fc1c8a5cdb41f3f92ed8e83e92e11b244b6e1a voted for 68102904daa4df70bf945677f62498bbdffee1d4 1
  20177:X 22 Jul 03:03:38.744 # fc2182cf6c2cc8ae88dbe4bec35f1cdd9e9b8d65 voted for 68102904daa4df70bf945677f62498bbdffee1d4 1
  20177:X 22 Jul 03:03:38.815 # +elected-leader master sfmaster 127.0.0.1 6479
  20177:X 22 Jul 03:03:38.815 # +failover-state-select-slave master sfmaster 127.0.0.1 6479
  20177:X 22 Jul 03:03:38.871 # +selected-slave slave 127.0.0.1:6480 127.0.0.1 6480 @ sfmaster 127.0.0.1 6479
  20177:X 22 Jul 03:03:38.871 * +failover-state-send-slaveof-noone slave 127.0.0.1:6480 127.0.0.1 6480 @ sfmaster 127.0.0.1 6479
  20177:X 22 Jul 03:03:38.962 * +failover-state-wait-promotion slave 127.0.0.1:6480 127.0.0.1 6480 @ sfmaster 127.0.0.1 6479
  20177:X 22 Jul 03:03:39.269 # +promoted-slave slave 127.0.0.1:6480 127.0.0.1 6480 @ sfmaster 127.0.0.1 6479
  20177:X 22 Jul 03:03:39.269 # +failover-state-reconf-slaves master sfmaster 127.0.0.1 6479
  20177:X 22 Jul 03:03:39.342 * +slave-reconf-sent slave 127.0.0.1:6481 127.0.0.1 6481 @ sfmaster 127.0.0.1 6479
  20177:X 22 Jul 03:03:39.859 # -odown master sfmaster 127.0.0.1 6479
  20177:X 22 Jul 03:03:40.335 * +slave-reconf-inprog slave 127.0.0.1:6481 127.0.0.1 6481 @ sfmaster 127.0.0.1 6479
  20177:X 22 Jul 03:03:40.335 * +slave-reconf-done slave 127.0.0.1:6481 127.0.0.1 6481 @ sfmaster 127.0.0.1 6479
  20177:X 22 Jul 03:03:40.410 # +failover-end master sfmaster 127.0.0.1 6479
  20177:X 22 Jul 03:03:40.410 # +switch-master sfmaster 127.0.0.1 6479 127.0.0.1 6480
  20177:X 22 Jul 03:03:40.411 * +slave slave 127.0.0.1:6481 127.0.0.1 6481 @ sfmaster 127.0.0.1 6480
  20177:X 22 Jul 03:03:40.411 * +slave slave 127.0.0.1:6479 127.0.0.1 6479 @ sfmaster 127.0.0.1 6480
  20177:X 22 Jul 03:04:10.501 # +sdown slave 127.0.0.1:6479 127.0.0.1 6479 @ sfmaster 127.0.0.1 6480
  4. Start the node on port 6479 again and check the log of the Sentinel on port 26481: it shows the node being reconfigured to replicate port 6480.
  ~/stayfoolish/26481/log $ tail -f sentinel.log
  20177:X 22 Jul 03:33:36.960 # -sdown slave 127.0.0.1:6479 127.0.0.1 6479 @ sfmaster 127.0.0.1 6480
  20177:X 22 Jul 03:33:46.959 * +convert-to-slave slave 127.0.0.1:6479 127.0.0.1 6479 @ sfmaster 127.0.0.1 6480
  5. Check the new replication topology.
  127.0.0.1:6480> info replication
  # Replication
  role:master
  connected_slaves:2
  slave0:ip=127.0.0.1,port=6481,state=online,offset=405522,lag=0
  slave1:ip=127.0.0.1,port=6479,state=online,offset=405389,lag=0
  127.0.0.1:26479> info sentinel
  # Sentinel
  sentinel_masters:1
  ...
  master0:name=sfmaster,status=ok,address=127.0.0.1:6480,slaves=2,sentinels=3
  Getting familiar with the Sentinel API
  A Sentinel node is a special kind of Redis node: it supports only a small set of commands and has its own dedicated API. Let's look at a few of the important ones.
  1. sentinel get-master-addr-by-name <master-name>: returns the IP and port of the given master.
  127.0.0.1:26479> sentinel get-master-addr-by-name sfmaster
  1) "127.0.0.1"
  2) "6480"
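  This is the command an application or script should rely on instead of hard-coding the master address. A minimal shell sketch (the abcdefg password is the one assumed throughout this post):
  MASTER_ADDR=$(redis-cli -p 26479 sentinel get-master-addr-by-name sfmaster)
  MASTER_IP=$(echo "$MASTER_ADDR" | sed -n '1p')
  MASTER_PORT=$(echo "$MASTER_ADDR" | sed -n '2p')
  redis-cli -h "$MASTER_IP" -p "$MASTER_PORT" -a abcdefg ping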
  2. sentinel failover <master-name>: forces a failover of the given master (without negotiating with the other Sentinels); once the failover completes, the other Sentinels update their own configuration to match the result.
  127.0.0.1:26479> sentinel failover sfmaster
  OK
  127.0.0.1:26479> info sentinel
  # Sentinel
  sentinel_masters:1
  ...
  master0:name=sfmaster,status=ok,address=127.0.0.1:6481,slaves=2,sentinels=3
  3. sentinel remove <master-name>: stops the current Sentinel node from monitoring the given master; the command only affects the Sentinel it is run on.
  127.0.0.1:26479> sentinel remove sfmaster
  OK
  127.0.0.1:26479> info sentinel
  # Sentinel
  sentinel_masters:0
  sentinel_tilt:0
  sentinel_running_scripts:0
  sentinel_scripts_queue_length:0
  sentinel_simulate_failure_flags:0
  4. sentinel monitor <master-name> <ip> <port> <quorum>: adds monitoring of a master. Run the command below to re-add monitoring of the master, now on port 6481, then check with info sentinel.
  127.0.0.1:26479> sentinel monitor sfmaster 127.0.0.1 6481 2
  OK
  127.0.0.1:26479> info sentinel
  # Sentinel
  sentinel_masters:1
  ...
  sentinel_simulate_failure_flags:0
  master0:name=sfmaster,status=ok,address=127.0.0.1:6481,slaves=0,sentinels=3
  It shows slaves=0, but it should be slaves=2. Why?
  It turns out that when sentinel remove removes a master, it deletes that master's configuration from this Sentinel node, including the auth line sentinel auth-pass sfmaster abcdefg; but sentinel monitor does not add that auth line back when you re-add the master you just removed (a small wart). Add it back manually and restart the Sentinel on port 26479, and everything looks normal again.
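  What "add it back manually and restart" might look like in practice (the Sentinel config file path is an assumption; the pidfile path is the one from the config above):
  kill "$(cat /home/redis/stayfoolish/26479/log/sentinel.pid)"
  echo 'sentinel auth-pass sfmaster abcdefg' >> /home/redis/stayfoolish/26479/sentinel.conf
  redis-sentinel /home/redis/stayfoolish/26479/sentinel.conf
  Depending on the Redis version, sentinel set sfmaster auth-pass abcdefg may also work at runtime, avoiding the restart.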
  127.0.0.1:26479> info sentinel
  # Sentinel
  sentinel_masters:1
  ...
  master0:name=sfmaster,status=ok,address=127.0.0.1:6481,slaves=2,sentinels=3
  If you're interested, follow the WeChat subscription account "数据库最佳实践" (DBBestPractice).