Redis Walkthrough (7): Failover with Redis Sentinel

Posted by jiaxp on 2018-11-4 14:23:07

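Continuing from the previous post.
　　The previous post only configured a Redis master-slave environment, in two master-slave topologies:
　　(1) a directed acyclic (chained) layout and (2) a star layout. Both are very simple to configure. One loose end was left unexplained, though: what happens if the master goes down? How does Redis achieve fail-over? This post focuses on exactly that. Main topics: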

[*]　　Manual fail-over
[*]　　Automatic fail-over with sentinel
　　1. Manual fail-over
# Directed acyclic (chained) topology (see Redis Walkthrough (6), on setting up master-slave replication)
# ps -ef |grep redis
　　
root    2495 12 20:06 ?    00:00:01 bin/redis-server *:6379
　　
root    2503 11 20:06 ?    00:00:00 bin/redis-server *:6381
　　
root    2508 11 20:06 ?    00:00:00 bin/redis-server *:6380
　　

　　
# Master (has one slave, 6380)
　　
127.0.0.1:6379> info Replication
　　
# Replication
　　
role:master
　　
connected_slaves:1
　　
slave0:ip=127.0.0.1,port=6380,state=online,offset=99,lag=1
　　
master_repl_offset:99
　　
repl_backlog_active:1
　　
repl_backlog_size:1048576
　　
repl_backlog_first_byte_offset:2
　　
repl_backlog_histlen:98
　　

　　

　　
# Slave1, connected to master 6379
　　
127.0.0.1:6380> info Replication
　　
# Replication
　　
role:slave
　　
master_host:127.0.0.1
　　
master_port:6379
　　
master_link_status:up
　　
master_last_io_seconds_ago:5
　　
master_sync_in_progress:0
　　
slave_repl_offset:197
　　
slave_priority:100
　　
slave_read_only:1
　　
connected_slaves:1
　　
slave0:ip=127.0.0.1,port=6381,state=online,offset=197,lag=0
　　
master_repl_offset:197
　　
repl_backlog_active:1
　　
repl_backlog_size:1048576
　　
repl_backlog_first_byte_offset:2
　　
repl_backlog_histlen:196
　　

　　
# Slave of 6380
　　
127.0.0.1:6381> info Replication
　　
# Replication
　　
role:slave
　　
master_host:127.0.0.1
　　
master_port:6380
　　
master_link_status:up
　　
master_last_io_seconds_ago:6
　　
master_sync_in_progress:0
　　
slave_repl_offset:573
　　
slave_priority:100
　　
slave_read_only:1
　　
connected_slaves:0
　　
master_repl_offset:0
　　
repl_backlog_active:0
　　
repl_backlog_size:1048576
　　
repl_backlog_first_byte_offset:0
　　
repl_backlog_histlen:0
　　

　　
####################################
　　
Simulate a crash of 6379
　　
#####################################
　　
# bin/redis-cli shutdown
　　
# bin/redis-cli -p 6379 shutdown
　　
Could not connect to Redis at 127.0.0.1:6379: Connection refused
　　
# Observe: master_link_status:down shows that the slave has lost its link to the master (here, because 6379 is down)
　　
127.0.0.1:6380> info Replication
　　
# Replication
　　
role:slave
　　
master_host:127.0.0.1
　　
master_port:6379
　　
master_link_status:down
　　
master_last_io_seconds_ago:-1
　　
master_sync_in_progress:0
　　
slave_repl_offset:1049
　　
master_link_down_since_seconds:42
　　
slave_priority:100
　　
slave_read_only:1
　　
connected_slaves:1
　　
slave0:ip=127.0.0.1,port=6381,state=online,offset=1105,lag=0
　　
master_repl_offset:1105
　　
repl_backlog_active:1
　　
repl_backlog_size:1048576
　　
repl_backlog_first_byte_offset:2
　　
repl_backlog_histlen:1104
　　
# Start the switch from slave to master (6380 takes over from 6379)
　　
# Running just the two commands below promotes 6380 to master
　　
127.0.0.1:6380> slaveof no one
　　
OK
　　
127.0.0.1:6380> config set slave-read-only no
　　
OK
　　
127.0.0.1:6380> set title "sentinel"
　　
OK
　　
# Connect to the slave (6381): the new key has replicated without problems
　　
127.0.0.1:6381> get title
　　
"sentinel"
　　Log (6379)
　　2495:M 05 Sep 20:06:23.615 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
　　2495:M 05 Sep 20:06:23.615 # Server started, Redis version 3.2.3
　　2495:M 05 Sep 20:06:23.617 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
　　2495:M 05 Sep 20:06:24.815 * DB loaded from append only file: 1.199 seconds
　　2495:M 05 Sep 20:06:24.816 * The server is now ready to accept connections on port 6379
　　2495:M 05 Sep 20:06:24.816 - DB 0: 20019 keys (0 volatile) in 32768 slots HT.
　　2495:M 05 Sep 20:06:24.816 - 0 clients connected (0 slaves), 3764336 bytes in use
　　2495:M 05 Sep 20:06:29.841 - DB 0: 20019 keys (0 volatile) in 32768 slots HT.
　　2495:M 05 Sep 20:06:29.841 - 0 clients connected (0 slaves), 3764336 bytes in use
　　2495:M 05 Sep 20:06:34.867 - DB 0: 20019 keys (0 volatile) in 32768 slots HT.
　　2495:M 05 Sep 20:06:34.875 - 0 clients connected (0 slaves), 3764336 bytes in use
　　2495:M 05 Sep 20:06:39.919 - DB 0: 20019 keys (0 volatile) in 32768 slots HT.
　　2495:M 05 Sep 20:06:39.921 - 0 clients connected (0 slaves), 3764336 bytes in use
　　2495:M 05 Sep 20:06:44.971 - DB 0: 20019 keys (0 volatile) in 32768 slots HT.
　　2495:M 05 Sep 20:06:44.971 - 0 clients connected (0 slaves), 3764336 bytes in use
　　2495:M 05 Sep 20:06:50.022 - DB 0: 20019 keys (0 volatile) in 32768 slots HT.
　　2495:M 05 Sep 20:06:50.022 - 0 clients connected (0 slaves), 3764336 bytes in use
　　2495:M 05 Sep 20:06:55.134 - DB 0: 20019 keys (0 volatile) in 32768 slots HT.
　　2495:M 05 Sep 20:06:55.134 - 0 clients connected (0 slaves), 3764336 bytes in use
　　2495:M 05 Sep 20:06:58.775 - Accepted 127.0.0.1:44408
　　2495:M 05 Sep 20:06:58.775 * Slave 127.0.0.1:6380 asks for synchronization
　　2495:M 05 Sep 20:06:58.775 * Full resync requested by slave 127.0.0.1:6380
　　2495:M 05 Sep 20:06:58.775 * Starting BGSAVE for SYNC with target: disk
　　2495:M 05 Sep 20:06:58.776 * Background saving started by pid 2511
　　2511:C 05 Sep 20:06:58.868 * DB saved on disk
　　2511:C 05 Sep 20:06:58.870 * RDB: 0 MB of memory used by copy-on-write
　　2495:M 05 Sep 20:06:58.916 * Background saving terminated with success
　　2495:M 05 Sep 20:06:58.920 * Synchronization with slave 127.0.0.1:6380 succeeded
　　....
　　2495:M 05 Sep 20:19:19.471 # User requested shutdown...
　　2495:M 05 Sep 20:19:19.471 * Calling fsync() on the AOF file.
　　2495:M 05 Sep 20:19:19.471 * Removing the pid file.
　　2495:M 05 Sep 20:19:19.472 # Redis is now ready to exit, bye bye...
　　Log (6380)
　　2508:S 05 Sep 20:06:58.714 # Server started, Redis version 3.2.3
　　2508:S 05 Sep 20:06:58.714 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
　　2508:S 05 Sep 20:06:58.775 * DB loaded from disk: 0.060 seconds
　　2508:S 05 Sep 20:06:58.775 * The server is now ready to accept connections on port 6380
　　2508:S 05 Sep 20:06:58.775 * Connecting to MASTER 127.0.0.1:6379
　　2508:S 05 Sep 20:06:58.775 * MASTER <-> SLAVE sync started
　　2508:S 05 Sep 20:06:58.775 * Non blocking connect for SYNC fired the event.
　　2508:S 05 Sep 20:06:58.775 * Master replied to PING, replication can continue...
　　2508:S 05 Sep 20:06:58.775 * Partial resynchronization not possible (no cached master)
　　2508:S 05 Sep 20:06:58.802 * Full resync from master: 8d0d86237c36a8d6ace4eed9b5f6e5871b40da29:1
　　2508:S 05 Sep 20:06:58.917 * MASTER <-> SLAVE sync: receiving 489615 bytes from master
　　2508:S 05 Sep 20:06:58.922 * MASTER <-> SLAVE sync: Flushing old data
　　2508:S 05 Sep 20:06:58.938 * MASTER <-> SLAVE sync: Loading DB in memory
　　2508:S 05 Sep 20:06:58.969 * MASTER <-> SLAVE sync: Finished with success
　　2508:S 05 Sep 20:06:59.788 * Slave 127.0.0.1:6381 asks for synchronization
　　2508:S 05 Sep 20:06:59.788 * Full resync requested by slave 127.0.0.1:6381
　　2508:S 05 Sep 20:06:59.788 * Starting BGSAVE for SYNC with target: disk
　　2508:S 05 Sep 20:06:59.788 * Background saving started by pid 2512
　　2512:C 05 Sep 20:06:59.832 * DB saved on disk
　　2512:C 05 Sep 20:06:59.832 * RDB: 0 MB of memory used by copy-on-write
　　2508:S 05 Sep 20:06:59.896 * Background saving terminated with success
　　2508:S 05 Sep 20:06:59.899 * Synchronization with slave 127.0.0.1:6381 succeeded
　　2508:S 05 Sep 20:10:46.786 * 10000 changes in 60 seconds. Saving...
　　2508:S 05 Sep 20:10:46.786 * Background saving started by pid 2595
　　2595:C 05 Sep 20:10:46.800 * DB saved on disk
　　2595:C 05 Sep 20:10:46.801 * RDB: 0 MB of memory used by copy-on-write
　　2508:S 05 Sep 20:10:46.887 * Background saving terminated with success
　　2508:S 05 Sep 20:19:19.472 # Connection with master lost.
　　2508:S 05 Sep 20:19:19.472 * Caching the disconnected master state.
　　2508:S 05 Sep 20:19:19.594 * Connecting to MASTER 127.0.0.1:6379
　　2508:S 05 Sep 20:19:19.595 * MASTER <-> SLAVE sync started
　　2508:S 05 Sep 20:19:19.595 # Error condition on socket for SYNC: Connection refused
　　2508:S 05 Sep 20:19:20.619 * Connecting to MASTER 127.0.0.1:6379
　　2508:S 05 Sep 20:19:20.619 * MASTER <-> SLAVE sync started
　　...
　　2508:S 05 Sep 20:20:49.783 # Error condition on socket for SYNC: Connection refused
　　2508:M 05 Sep 20:20:50.283 * Discarding previously cached master state.

　　2508:M 05 Sep 20:20:50.283 * MASTER MODE enabled (user request from 'id=6 addr=127.0.0.1:54717 fd=8 name= age=696>
　　2508:M 05 Sep 20:25:47.073 * 1 changes in 900 seconds. Saving...
　　2508:M 05 Sep 20:25:47.074 * Background saving started by pid 2722
　　2722:C 05 Sep 20:25:47.087 * DB saved on disk
　　2722:C 05 Sep 20:25:47.088 * RDB: 0 MB of memory used by copy-on-write
　　2508:M 05 Sep 20:25:47.176 * Background saving terminated with success
　　2508:M 05 Sep 20:40:48.064 * 1 changes in 900 seconds. Saving...
　　2508:M 05 Sep 20:40:48.064 * Background saving started by pid 2813
　　2813:C 05 Sep 20:40:48.075 * DB saved on disk
　　2813:C 05 Sep 20:40:48.075 * RDB: 0 MB of memory used by copy-on-write
　　2508:M 05 Sep 20:40:48.165 * Background saving terminated with success
　　Log (6381)
　　2503:S 05 Sep 20:06:54.667 * DB loaded from disk: 0.087 seconds
　　2503:S 05 Sep 20:06:54.667 * The server is now ready to accept connections on port 6381
　　2503:S 05 Sep 20:06:54.667 * Connecting to MASTER 127.0.0.1:6380
　　2503:S 05 Sep 20:06:54.667 * MASTER <-> SLAVE sync started
　　2503:S 05 Sep 20:06:54.667 # Error condition on socket for SYNC: Connection refused
　　2503:S 05 Sep 20:06:55.691 * Connecting to MASTER 127.0.0.1:6380
　　2503:S 05 Sep 20:06:55.692 * MASTER <-> SLAVE sync started
　　2503:S 05 Sep 20:06:55.692 # Error condition on socket for SYNC: Connection refused
　　2503:S 05 Sep 20:06:56.716 * Connecting to MASTER 127.0.0.1:6380
　　2503:S 05 Sep 20:06:56.717 * MASTER <-> SLAVE sync started
　　2503:S 05 Sep 20:06:56.717 # Error condition on socket for SYNC: Connection refused
　　2503:S 05 Sep 20:06:57.741 * Connecting to MASTER 127.0.0.1:6380
　　2503:S 05 Sep 20:06:57.742 * MASTER <-> SLAVE sync started
　　2503:S 05 Sep 20:06:57.742 # Error condition on socket for SYNC: Connection refused
　　2503:S 05 Sep 20:06:58.764 * Connecting to MASTER 127.0.0.1:6380
　　2503:S 05 Sep 20:06:58.764 * MASTER <-> SLAVE sync started
　　2503:S 05 Sep 20:06:58.764 * Non blocking connect for SYNC fired the event.
　　2503:S 05 Sep 20:06:58.775 * Master replied to PING, replication can continue...
　　2503:S 05 Sep 20:06:58.775 * Partial resynchronization not possible (no cached master)
　　2503:S 05 Sep 20:06:58.776 * Master does not support PSYNC or is in error state (reply: -ERR Can't SYNC while not connected with my master)
　　2503:S 05 Sep 20:06:58.776 * Retrying with SYNC...
　　2503:S 05 Sep 20:06:58.803 # MASTER aborted replication with an error: ERR Can't SYNC while not connected with my master
　　2503:S 05 Sep 20:06:59.786 * Connecting to MASTER 127.0.0.1:6380
　　2503:S 05 Sep 20:06:59.787 * MASTER <-> SLAVE sync started
　　2503:S 05 Sep 20:06:59.787 * Non blocking connect for SYNC fired the event.
　　2503:S 05 Sep 20:06:59.787 * Master replied to PING, replication can continue...
　　2503:S 05 Sep 20:06:59.787 * Partial resynchronization not possible (no cached master)
　　2503:S 05 Sep 20:06:59.788 * Full resync from master: e1bfca531c87795977333fca30c7a75eea64a1de:1
　　2503:S 05 Sep 20:06:59.897 * MASTER <-> SLAVE sync: receiving 489615 bytes from master
　　2503:S 05 Sep 20:06:59.900 * MASTER <-> SLAVE sync: Flushing old data
　　2503:S 05 Sep 20:06:59.917 * MASTER <-> SLAVE sync: Loading DB in memory
　　2503:S 05 Sep 20:06:59.969 * MASTER <-> SLAVE sync: Finished with success
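　　The manual procedure in this section comes down to reading `info Replication` on the slave and deciding when to run `slaveof no one`. As a rough sketch of that decision (not part of the original walkthrough), the Python below parses the text of `info Replication` and reports whether a slave has been cut off long enough to be worth promoting. The function names and the `min_down_seconds` threshold are made up for illustration; they are not Redis settings.

```python
def parse_info(text):
    """Parse the text of `INFO replication` into a dict of strings."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Replication".
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        info[key] = value
    return info


def should_promote(info, min_down_seconds=30):
    """Return True if this slave has lost its master for long enough.

    min_down_seconds is an illustrative threshold, not a Redis option.
    """
    if info.get("role") != "slave":
        return False
    if info.get("master_link_status") != "down":
        return False
    down_for = int(info.get("master_link_down_since_seconds", "0"))
    return down_for >= min_down_seconds


# Fields copied from the 6380 output shown earlier in this section.
sample = """\
# Replication
role:slave
master_host:127.0.0.1
master_port:6379
master_link_status:down
master_link_down_since_seconds:42
"""

print(should_promote(parse_info(sample)))  # True
```

　　A check like this only sees the link from one slave's point of view, so it cannot tell a crashed master from a network problem. That gap is what sentinel addresses in the next section.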
　　2. Automatic fail-over with sentinel
Copy sentinel.conf from the source tree
cp /usr/local/src/redis-3.2.3/sentinel.conf /usr/local/redis/
　　
# Edit and confirm the following parameters
　　
sentinel monitor mymaster 127.0.0.1 6379 1
　　
sentinel down-after-milliseconds mymaster 5000
　　
sentinel failover-timeout mymaster 180000
　　
sentinel parallel-syncs mymaster 1
　　Reference:
　　http://www.redis.cn/topics/sentinel.html
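　　To make the four directives above easier to read: `sentinel monitor` takes a master name, IP, port and quorum, and the other three each take a master name followed by a single numeric value (milliseconds for `down-after-milliseconds` and `failover-timeout`, a slave count for `parallel-syncs`). The sketch below collects them per master. `parse_sentinel_conf` is a hypothetical helper written for this post, not something Redis ships.

```python
def parse_sentinel_conf(text):
    """Collect `sentinel ...` directives per master name."""
    numeric = ("down-after-milliseconds", "failover-timeout", "parallel-syncs")
    masters = {}
    for raw in text.splitlines():
        parts = raw.split()
        # Ignore comments, blank lines and anything that is not
        # "sentinel <directive> <master-name> <value...>".
        if len(parts) < 4 or parts[0] != "sentinel":
            continue
        directive, name = parts[1], parts[2]
        if directive == "monitor" and len(parts) == 6:
            # sentinel monitor <name> <ip> <port> <quorum>
            entry = masters.setdefault(name, {})
            entry["ip"] = parts[3]
            entry["port"] = int(parts[4])
            entry["quorum"] = int(parts[5])
        elif directive in numeric:
            masters.setdefault(name, {})[directive] = int(parts[3])
    return masters


# The values used in this walkthrough.
conf = """\
sentinel monitor mymaster 127.0.0.1 6379 1
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 180000
sentinel parallel-syncs mymaster 1
"""

settings = parse_sentinel_conf(conf)["mymaster"]
print(settings["port"], settings["quorum"], settings["down-after-milliseconds"])
```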
bin/redis-server redis.conf　　
bin/redis-server redis6380.conf
　　
bin/redis-server redis6381.conf
　　
bin/redis-server sentinel.conf --sentinel
　　Port    Role
　　6379    Master
　　6380    Slave
　　6381    Slave
　　Monitoring with sentinel (in the normal initial state, sentinel reports the following)
　　2.1 Master status
　　127.0.0.1:26379> sentinel masters
　　1) 1) "name"
　　2) "mymaster"
　　3) "ip"
　　4) "127.0.0.1"
　　5) "port"
　　6) "6379"
　　7) "runid"
　　8) "4d2b8e087e297f5d6347e1599a37c4998ad056d6"
　　9) "flags"
　　10) "master"
　　11) "link-pending-commands"
　　12) "0"
　　13) "link-refcount"
　　14) "1"
　　15) "last-ping-sent"
　　16) "0"
　　17) "last-ok-ping-reply"
　　18) "410"
　　19) "last-ping-reply"
　　20) "410"
　　21) "down-after-milliseconds"
　　22) "5000"
　　23) "info-refresh"
　　24) "7817"
　　25) "role-reported"
　　26) "master"
　　27) "role-reported-time"
　　28) "58045"
　　29) "config-epoch"
　　30) "0"
　　31) "num-slaves"
　　32) "2"
　　33) "num-other-sentinels"
　　34) "0"
　　35) "quorum"
　　36) "1"
　　37) "failover-timeout"
　　38) "180000"
　　39) "parallel-syncs"
　　40) "1"
　　This shows the master's port, the number of slaves, and other information.
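　　The reply above is a flat list in which field names and values alternate (`name`, `mymaster`, `ip`, `127.0.0.1`, ...). If you read it from a script, pairing the items up makes it much easier to work with. A minimal sketch, assuming the reply has already been read into a Python list of strings (`reply_to_dict` is an illustrative name):

```python
def reply_to_dict(flat):
    """Turn ['name', 'mymaster', 'ip', ...] into {'name': 'mymaster', ...}."""
    if len(flat) % 2 != 0:
        raise ValueError("expected an even number of items (field/value pairs)")
    # Even positions are field names, odd positions are their values.
    return dict(zip(flat[0::2], flat[1::2]))


# A few of the pairs from the `sentinel masters` output above.
reply = ["name", "mymaster", "ip", "127.0.0.1", "port", "6379",
         "flags", "master", "num-slaves", "2", "quorum", "1"]

master = reply_to_dict(reply)
print(master["port"], master["num-slaves"])  # 6379 2
```

　　All values stay strings, exactly as sentinel returns them; convert with `int()` where a number is needed.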
　　2.2 Initial slave information
　　127.0.0.1:26379> sentinel slaves mymaster
　　1) 1) "name"
　　2) "127.0.0.1:6380"
　　3) "ip"
　　4) "127.0.0.1"
　　5) "port"
　　6) "6380"
　　7) "runid"
　　8) "c344769d6d1cfd814437034b39f04b17851dca66"
　　9) "flags"
　　10) "slave"
　　11) "link-pending-commands"
　　12) "0"
　　13) "link-refcount"
　　14) "1"
　　15) "last-ping-sent"
　　16) "0"
　　17) "last-ok-ping-reply"
　　18) "693"
　　19) "last-ping-reply"
　　20) "693"
　　21) "down-after-milliseconds"
　　22) "5000"
　　23) "info-refresh"
　　24) "6445"
　　25) "role-reported"
　　26) "slave"
　　27) "role-reported-time"
　　28) "96788"
　　29) "master-link-down-time"
　　30) "0"
　　31) "master-link-status"
　　32) "ok"
　　33) "master-host"
　　34) "127.0.0.1"
　　35) "master-port"
　　36) "6379"
　　37) "slave-priority"
　　38) "100"
　　39) "slave-repl-offset"
　　40) "6058"
　　2) 1) "name"
　　2) "127.0.0.1:6381"
　　3) "ip"
　　4) "127.0.0.1"
　　5) "port"
　　6) "6381"
　　7) "runid"
　　8) "9f8666ce6e7b30d01449f6fb10d8556030a96186"
　　9) "flags"
　　10) "slave"
　　11) "link-pending-commands"
　　12) "0"
　　13) "link-refcount"
　　14) "1"
　　15) "last-ping-sent"
　　16) "0"
　　17) "last-ok-ping-reply"
　　18) "693"
　　19) "last-ping-reply"
　　20) "693"
　　21) "down-after-milliseconds"
　　22) "5000"
　　23) "info-refresh"
　　24) "6444"
　　25) "role-reported"
　　26) "slave"
　　27) "role-reported-time"
　　28) "96788"
　　29) "master-link-down-time"
　　30) "0"
　　31) "master-link-status"
　　32) "ok"
　　33) "master-host"
　　34) "127.0.0.1"
　　35) "master-port"
　　36) "6379"
　　37) "slave-priority"
　　38) "100"
　　39) "slave-repl-offset"
　　40) "6058"
　　At this point the sentinel log is quiet:

　　2847:X 07 Sep 21:01:27.567 # Sentinel>
　　2847:X 07 Sep 21:01:27.567 # +monitor master mymaster 127.0.0.1 6379 quorum 1
　　2.3 Simulate a crash of master 6379
　　127.0.0.1:6379> debug sleep 100
　　OK
　　2.4 sentinel performs the failover automatically
　　Watch the sentinel log (it shows in detail what sentinel does):
　　2847:X 07 Sep 21:01:27.567 # +monitor master mymaster 127.0.0.1 6379 quorum 1
　　2847:X 07 Sep 21:03:49.117 # +sdown master mymaster 127.0.0.1 6379
　　2847:X 07 Sep 21:03:49.117 # +odown master mymaster 127.0.0.1 6379 #quorum 1/1
　　2847:X 07 Sep 21:03:49.117 # +new-epoch 4
　　2847:X 07 Sep 21:03:49.117 # +try-failover master mymaster 127.0.0.1 6379
　　2847:X 07 Sep 21:03:49.128 # +vote-for-leader 1b9d1d720b11ecf5568c3dc0194305e86c47ed9a 4
　　2847:X 07 Sep 21:03:49.129 # +elected-leader master mymaster 127.0.0.1 6379
　　2847:X 07 Sep 21:03:49.129 # +failover-state-select-slave master mymaster 127.0.0.1 6379
　　2847:X 07 Sep 21:03:49.185 # +selected-slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
　　2847:X 07 Sep 21:03:49.185 * +failover-state-send-slaveof-noone slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
　　2847:X 07 Sep 21:03:49.252 * +failover-state-wait-promotion slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
　　2847:X 07 Sep 21:03:50.262 # +promoted-slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
　　2847:X 07 Sep 21:03:50.262 # +failover-state-reconf-slaves master mymaster 127.0.0.1 6379
　　2847:X 07 Sep 21:03:50.315 * +slave-reconf-sent slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
　　2847:X 07 Sep 21:03:51.308 * +slave-reconf-inprog slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
　　2847:X 07 Sep 21:03:51.308 * +slave-reconf-done slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
　　2847:X 07 Sep 21:03:51.365 # +failover-end master mymaster 127.0.0.1 6379
　　2847:X 07 Sep 21:03:51.365 # +switch-master mymaster 127.0.0.1 6379 127.0.0.1 6381
　　2847:X 07 Sep 21:03:51.365 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6381
　　2847:X 07 Sep 21:03:51.365 * +slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381
　　2847:X 07 Sep 21:03:56.399 # +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381
　　2847:X 07 Sep 21:05:23.708 # -sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381
　　2847:X 07 Sep 21:05:33.730 * +convert-to-slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381
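　　Of all the events in this log, `+switch-master` is the one that matters most to clients: its arguments are the master name, the old IP and port, and the new IP and port. Below is a small sketch that pulls those fields out of a log line. It assumes the log format shown above, and `parse_switch_master` is a name invented for this example.

```python
def parse_switch_master(line):
    """Extract old and new master addresses from a +switch-master log line.

    Returns None for any other kind of line.
    """
    tokens = line.split()
    if "+switch-master" not in tokens:
        return None
    # Arguments: <master-name> <old-ip> <old-port> <new-ip> <new-port>
    args = tokens[tokens.index("+switch-master") + 1:]
    if len(args) != 5:
        return None
    name, old_ip, old_port, new_ip, new_port = args
    return {
        "master": name,
        "old": (old_ip, int(old_port)),
        "new": (new_ip, int(new_port)),
    }


line = "2847:X 07 Sep 21:03:51.365 # +switch-master mymaster 127.0.0.1 6379 127.0.0.1 6381"
event = parse_switch_master(line)
print(event["old"], "->", event["new"])  # ('127.0.0.1', 6379) -> ('127.0.0.1', 6381)
```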
　　2.6 Monitoring information after the failover
　　127.0.0.1:26379> sentinel masters
　　1) 1) "name"
　　2) "mymaster"
　　3) "ip"
　　4) "127.0.0.1"
　　5) "port"
　　6) "6381" (this has now changed to 6381)
　　7) "runid"
　　8) "9f8666ce6e7b30d01449f6fb10d8556030a96186"
　　9) "flags"
　　10) "master"
　　11) "link-pending-commands"
　　12) "0"
　　13) "link-refcount"
　　14) "1"
　　15) "last-ping-sent"
　　16) "0"
　　17) "last-ok-ping-reply"
　　18) "190"
　　19) "last-ping-reply"
　　20) "190"
　　21) "down-after-milliseconds"
　　22) "5000"
　　23) "info-refresh"
　　24) "1905"
　　25) "role-reported"
　　26) "master"
　　27) "role-reported-time"
　　28) "42128"
　　29) "config-epoch"
　　30) "4"
　　31) "num-slaves"
　　32) "2"
　　33) "num-other-sentinels"
　　34) "0"
　　35) "quorum"
　　36) "1"
　　37) "failover-timeout"
　　38) "180000"
　　39) "parallel-syncs"
　　40) "1"
　　Slave information
　　127.0.0.1:26379> sentinel slaves mymaster
　　1) 1) "name"
　　2) "127.0.0.1:6379"
　　3) "ip"
　　4) "127.0.0.1"
　　5) "port"
　　6) "6379"
　　7) "runid"
　　8) ""
　　9) "flags"
　　10) "s_down,slave" (6379 is still sleeping at this point; the flag is updated once the sleep ends)
　　11) "link-pending-commands"
　　12) "44"
　　13) "link-refcount"
　　14) "1"
　　15) "last-ping-sent"
　　16) "48488"
　　17) "last-ok-ping-reply"
　　18) "48488"
　　19) "last-ping-reply"
　　20) "48488"
　　21) "s-down-time"
　　22) "43454"
　　23) "down-after-milliseconds"
　　24) "5000"
　　25) "info-refresh"
　　26) "1473253479853"
　　27) "role-reported"
　　28) "slave"
　　29) "role-reported-time"
　　30) "48488"
　　31) "master-link-down-time"
　　32) "0"
　　33) "master-link-status"
　　34) "err"
　　35) "master-host"
　　36) "?"
　　37) "master-port"
　　38) "0"
　　39) "slave-priority"
　　40) "100"
　　41) "slave-repl-offset"
　　42) "0"
　　2) 1) "name"
　　2) "127.0.0.1:6380"
　　3) "ip"
　　4) "127.0.0.1"
　　5) "port"
　　6) "6380"
　　7) "runid"
　　8) "c344769d6d1cfd814437034b39f04b17851dca66"
　　9) "flags"
　　10) "slave"
　　11) "link-pending-commands"
　　12) "0"
　　13) "link-refcount"
　　14) "1"
　　15) "last-ping-sent"
　　16) "0"
　　17) "last-ok-ping-reply"
　　18) "308"
　　19) "last-ping-reply"
　　20) "308"
　　21) "down-after-milliseconds"
　　22) "5000"
　　23) "info-refresh"
　　24) "8265"
　　25) "role-reported"
　　26) "slave"
　　27) "role-reported-time"
　　28) "48488"
　　29) "master-link-down-time"
　　30) "0"
　　31) "master-link-status"
　　32) "ok"
　　33) "master-host"
　　34) "127.0.0.1"
　　35) "master-port"
　　36) "6381"
　　37) "slave-priority"
　　38) "100"
　　39) "slave-repl-offset"
　　40) "11780"
　　127.0.0.1:26379> sentinel slaves mymaster
　　1) 1) "name"
　　2) "127.0.0.1:6379"
　　3) "ip"
　　4) "127.0.0.1"
　　5) "port"
　　6) "6379"
　　7) "runid"
　　8) "4d2b8e087e297f5d6347e1599a37c4998ad056d6"
　　9) "flags"
　　10) "slave" (s_down is gone)
　　11) "link-pending-commands"
　　12) "0"
　　13) "link-refcount"
　　14) "1"
　　15) "last-ping-sent"
　　16) "0"
　　17) "last-ok-ping-reply"
　　18) "869"
　　19) "last-ping-reply"
　　20) "869"
　　21) "down-after-milliseconds"
　　22) "5000"
　　23) "info-refresh"
　　24) "3426"
　　25) "role-reported"
　　26) "slave"
　　27) "role-reported-time"
　　28) "3426"
　　29) "master-link-down-time"
　　30) "0"
　　31) "master-link-status"
　　32) "ok"
　　33) "master-host"
　　34) "127.0.0.1"
　　35) "master-port"
　　36) "6381"
　　37) "slave-priority"
　　38) "100"
　　39) "slave-repl-offset"
　　40) "16556"
　　...
　　Triggering a failover manually
127.0.0.1:26379> SENTINEL failover mymaster　　
OK
　　
# The master has switched to 6379
　　
127.0.0.1:26379> SENTINEL get-master-addr-by-name mymaster
　　
1) "127.0.0.1"
　　
2) "6379"
　　2847:X 07 Sep 21:46:46.793 # +switch-master mymaster 127.0.0.1 6381 127.0.0.1 6379
　　2847:X 07 Sep 21:46:46.794 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
　　2847:X 07 Sep 21:46:46.794 * +slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
　　2847:X 07 Sep 21:46:56.910 * +convert-to-slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
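　　When sentinel performs a failover, it also rewrites the redis.conf files of the affected master and slaves.
　　In the config file for 6379, a slaveof directive was added: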
　　# cat redis.conf | grep slaveof
　　# Master-Slave replication. Use slaveof to make a Redis instance a copy of
　　# slaveof
　　slaveof 127.0.0.1 6381
　　In the config file for 6380, the slaveof directive was changed:
　　# cat redis6380.conf | grep slaveof
　　# Master-Slave replication. Use slaveof to make a Redis instance a copy of
　　slaveof 127.0.0.1 6381
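　　Note that `grep slaveof` also matches the commented-out lines from the stock config, so the active directive has to be picked out by eye. As a small illustration, the sketch below returns only the uncommented `slaveof` line from a config file's text. `active_slaveof` is a hypothetical helper, and taking the last match when there are several is an assumption made for this sketch.

```python
def active_slaveof(conf_text):
    """Return (host, port) from the last uncommented `slaveof` line, or None."""
    result = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Commented-out directives such as "# slaveof" do not count.
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if parts[0].lower() == "slaveof" and len(parts) == 3 and parts[2].isdigit():
            result = (parts[1], int(parts[2]))
    return result


# The grep output for redis.conf shown above.
conf_6379 = """\
# Master-Slave replication. Use slaveof to make a Redis instance a copy of
# slaveof
slaveof 127.0.0.1 6381
"""

print(active_slaveof(conf_6379))  # ('127.0.0.1', 6381)
```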
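　　That concludes the walkthrough.
　　3. Miscellaneous
　　3.1 The important quorum parameter
　　The walkthrough sets quorum=1 with a single sentinel purely for simplicity; do not do this in production.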
　　http://redis.io/topics/sentinel discusses 4 scenarios, which I will work through in later posts.
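　　As I understand the Sentinel documentation, quorum only controls how many sentinels must agree that the master is down. Actually starting a failover additionally needs the votes of a majority of all known sentinels, or of at least quorum sentinels if quorum is larger than the majority. The arithmetic is sketched below; the function names are made up for this example and the rule is my reading of the documentation, not code taken from Redis.

```python
def majority(total_sentinels):
    """Smallest number of sentinels that is more than half of the total."""
    return total_sentinels // 2 + 1


def votes_needed_for_failover(total_sentinels, quorum):
    """Votes a sentinel needs before it may lead a failover."""
    return max(quorum, majority(total_sentinels))


def can_fail_over(total_sentinels, reachable_sentinels, quorum):
    """True if enough sentinels are reachable to detect the failure
    and to authorize the failover."""
    return reachable_sentinels >= votes_needed_for_failover(total_sentinels, quorum)


print(can_fail_over(1, 1, 1))  # True  (this walkthrough: one sentinel, quorum 1)
print(can_fail_over(3, 2, 2))  # True  (one of three sentinels lost)
print(can_fail_over(3, 1, 2))  # False (only one of three sentinels left)
print(can_fail_over(5, 2, 2))  # False (quorum met, but no majority of five)
```

　　With a single sentinel the check always passes while that sentinel is alive, but the sentinel itself becomes a single point of failure, which is one reason the setup here is not suitable for production.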
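　　When copying a sentinel.conf file, the information that sentinel generated itself has to be dealt with, for example
　　sentinel myid 575cb680ff3d3cbad55cdb978c1d6b5962abe7ac
　　Otherwise the sentinels will have trouble communicating with each other.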
