Redis Sentinel Mode and Redis Cluster

trzxycx · posted on 2018-11-2 12:27:19

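Chapter 1 Redis Sentinel mode:

1.1 What sentinel does:
　　1.    Monitoring: sentinel continuously checks whether your master and slave servers are running normally.
　　2.    Notification: when a monitored redis server has a problem, sentinel can notify the administrator or other applications through an API.
　　3.    Automatic failover.
1.2 Server connections:
1.2.1 sentinel discovers the master through the user-supplied configuration file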

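　　sentinel creates two network connections to each master it monitors:
　　1. A command connection, used to send commands to the master.
　　2. A subscription connection, used to subscribe to a designated channel and thereby discover the other sentinels monitoring the same master.
1.2.2 Discovering and connecting to slaves: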

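1.2.3 Discovering other sentinels:
　　Over the command connection, sentinel sends hello messages to the masters and slaves it monitors. Each message carries the sentinel's IP, port, ID, and so on, and announces the sentinel's existence to other sentinels. At the same time, sentinel receives other sentinels' hello messages over the subscription connection, which is how it discovers the other sentinels monitoring the same master.

1.2.4 Connections between sentinels:
　　sentinels create only command connections to one another, which they use to communicate. Because the masters and slaves already act as the relay for sending and receiving hello messages, sentinels do not create subscription connections between themselves.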

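1.3 failover:
　　The steps of one failover:
　　1. Detect that the master has entered the objectively down state.
　　2. Hold a vote, based on the Raft leader election protocol, to elect the sentinel that will lead the failover.
　　3. If the sentinel is not elected, it tries to get elected again after twice the configured failover timeout. If it is elected, it carries out the following steps.
　　4. Select one slave and promote it to master.
　　5. Send SLAVEOF NO ONE to the selected slave to turn it into a master.
　　6. Propagate the updated configuration to all other Sentinels through publish/subscribe; the other Sentinels update their own configuration.
　　7. Send SLAVEOF to the slaves of the failed master so that they replicate the new master.
　　8. Once all slaves have started replicating the new master, the leader Sentinel ends the failover.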
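　　Steps 4, 5 and 7 can be sketched in plain Python. This is only a toy model, not Sentinel's code: the Server class, the failover function and the offsets are made up for illustration, and the selection rule (lowest priority value, then largest replication offset) is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Server:
    name: str
    role: str                      # "master" or "slave"
    master: Optional[str] = None   # name of the master this slave replicates
    priority: int = 100            # assumed rule: lower value is preferred
    offset: int = 0                # replication offset (illustrative numbers)
    online: bool = True


def failover(servers: List[Server], failed_master: str) -> str:
    """Promote one slave of failed_master and repoint the other slaves."""
    slaves = [s for s in servers
              if s.role == "slave" and s.master == failed_master and s.online]
    if not slaves:
        raise RuntimeError("no slave available to promote")
    # Step 4: pick a slave (assumed rule: lowest priority, then largest offset).
    chosen = min(slaves, key=lambda s: (s.priority, -s.offset, s.name))
    # Step 5: the equivalent of SLAVEOF NO ONE on the chosen slave.
    chosen.role, chosen.master = "master", None
    # Step 7: the equivalent of SLAVEOF <new master> on the remaining slaves.
    for s in slaves:
        if s is not chosen:
            s.master = chosen.name
    return chosen.name


servers = [
    Server("6380", "master", online=False),
    Server("6381", "slave", master="6380", offset=1928),
    Server("6382", "slave", master="6380", offset=1900),
]
new_master = failover(servers, "6380")
print(new_master, [(s.name, s.role, s.master) for s in servers])
```

　　The election in steps 2 and 3 and the configuration broadcast in step 6 are left out on purpose; the sketch only shows how the roles change.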
1.4 Deploying sentinel:
　　Create the directory
　　[root@gitlab data]# mkdir 26380
　　[root@gitlab data]# cd 26380/
　　Write the configuration file
　　[root@gitlab 26380]# vim sentinel.conf
　　port 26380
　　dir "/data/26380"
　　sentinel monitor mymaster 127.0.0.1 6380 1
　　sentinel down-after-milliseconds mymaster 60000
　　Start the sentinel service
　　redis-sentinel /data/26380/sentinel.conf &
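　　Configuration file notes:
　　# The master to monitor
　　sentinel monitor mymaster 127.0.0.1 6380 2
　　# {the trailing 2 is the number of sentinels that must agree that the master is down}
　　# Authentication: the password used to connect to the master
　　sentinel auth-pass mymaster root
　　# Consider the master down once it has been unreachable for more than 15000 ms
　　sentinel down-after-milliseconds mymaster 15000
　　# How long (in ms) a failover may take before it is considered failed
　　sentinel failover-timeout mymaster 900000
　　# The numbers after these two settings must match on master and slave; epoch is the master's version
　　sentinel leader-epoch mymaster 1
　　sentinel config-epoch mymaster 1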
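　　To make the fields of the "sentinel monitor" line concrete, here is a small Python sketch that splits the line into its parts and applies the quorum rule described above. parse_monitor and enough_votes are hypothetical helper names, not part of Redis or of any client library.

```python
def parse_monitor(line: str) -> dict:
    """Split 'sentinel monitor <name> <host> <port> <quorum>' into its fields."""
    parts = line.split()
    if len(parts) != 6 or parts[:2] != ["sentinel", "monitor"]:
        raise ValueError("not a 'sentinel monitor' line: %r" % line)
    _, _, name, host, port, quorum = parts
    return {"name": name, "host": host, "port": int(port), "quorum": int(quorum)}


def enough_votes(down_votes: int, quorum: int) -> bool:
    """True once at least `quorum` sentinels agree that the master is down."""
    return down_votes >= quorum


monitor = parse_monitor("sentinel monitor mymaster 127.0.0.1 6380 2")
print(monitor)
print(enough_votes(1, monitor["quorum"]), enough_votes(2, monitor["quorum"]))
```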
1.4.1 Confirm that the one-master, two-slave environment is healthy, then take down node 6380:
　　127.0.0.1:6380> shutdown
1.4.2 Wait for the failover to happen, then verify:
　　127.0.0.1:6381> info replication
　　# Replication
　　role:master
　　connected_slaves:1
　　slave0:ip=127.0.0.1,port=6382,state=online,offset=1928,lag=1
　　master_repl_offset:1928
　　repl_backlog_active:1
　　repl_backlog_size:1048576
　　repl_backlog_first_byte_offset:2
　　repl_backlog_histlen:1927
　　repl_backlog_histlen:0
　　127.0.0.1:6382> info replication
　　# Replication
　　role:slave
　　master_host:127.0.0.1
　　master_port:6381
　　master_link_status:up
　　master_last_io_seconds_ago:2
　　master_sync_in_progress:0
　　slave_repl_offset:2474
　　slave_priority:100
　　slave_read_only:1
　　connected_slaves:0
　　master_repl_offset:0
　　repl_backlog_active:0
　　repl_backlog_size:1048576
　　repl_backlog_first_byte_offset:0
　　repl_backlog_histlen:0
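Chapter 2 Redis cluster
2.1 redis cluster
　　Ø  A redis cluster is a facility for sharing data across multiple redis nodes.
　　Ø  A redis cluster does not support redis commands that operate on multiple keys stored in different hash slots, because executing them would require moving data between redis nodes; under high load such commands would degrade the cluster's performance and lead to unpredictable behavior.
　　Ø  A redis cluster can automatically split the data across multiple nodes.
　　Ø  A redis cluster provides a degree of availability through partitioning: it keeps processing command requests even when some of the nodes in the cluster fail or cannot communicate.
2.2 redis cluster data sharing
　　A redis cluster shards data with hash slots rather than consistent hashing. A cluster contains 16384 hash slots, every key in the database belongs to exactly one of them, and the cluster uses the formula CRC16(key) % 16384 to compute which slot a key belongs to. For example, in a cluster with three nodes:
　　Node A handles hash slots 0 to 5500.
　　Node B handles hash slots 5501 to 11000.
　　Node C handles hash slots 11001 to 16383.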
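　　The slot calculation can be tried out in plain Python. The sketch below assumes the CRC16 variant is CRC16/XMODEM (polynomial 0x1021, initial value 0) and ignores hash tags; crc16_xmodem, key_slot and node_for_key are illustrative names, and SLOT_RANGES simply copies the A/B/C layout from the example above.

```python
HASH_SLOTS = 16384

# Slot layout of the three-node example (range() excludes its upper bound).
SLOT_RANGES = {
    "A": range(0, 5501),
    "B": range(5501, 11001),
    "C": range(11001, 16384),
}


def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC16 with polynomial 0x1021 and initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Hash slot of a key: CRC16(key) % 16384 (hash tags are not handled)."""
    return crc16_xmodem(key.encode("utf-8")) % HASH_SLOTS


def node_for_key(key: str) -> str:
    """Name of the example node whose slot range contains the key's slot."""
    slot = key_slot(key)
    for node, slots in SLOT_RANGES.items():
        if slot in slots:
            return node
    raise LookupError("slot %d is not covered by any node" % slot)


print(key_slot("foo"), node_for_key("foo"))
```

　　On a running cluster the result can be cross-checked with the CLUSTER KEYSLOT command.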

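2.3 How the cluster operates
　　Ø  All redis nodes are interconnected and exchange PING-PONG messages; internally they use a binary protocol to optimize transfer speed and bandwidth.
　　Ø  A node is marked as failed only when more than half of the master nodes in the cluster detect it as failed.
　　Ø  Clients connect directly to redis nodes, with no proxy layer in between. A client does not need to connect to every node in the cluster; connecting to any one available node is enough.
　　Ø  All physical nodes are mapped onto the hash slots, and the cluster maintains that mapping.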
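　　The "more than half of the masters" rule is easy to state as code. The function below is a simplified sketch with made-up names; it only counts reports and does not model how the reports travel between nodes.

```python
def marked_as_failed(reporting_nodes: set, masters: set) -> bool:
    """True if more than half of the masters report the node as failed."""
    votes = len(reporting_nodes & masters)   # only reports from masters count
    return votes > len(masters) // 2


masters = {"7000", "7001", "7002"}
print(marked_as_failed({"7000"}, masters))            # one report out of three
print(marked_as_failed({"7000", "7001"}, masters))    # two reports out of three
```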

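　　So that the cluster keeps working when some nodes go offline or cannot communicate with the majority of the cluster, redis cluster uses master-slave replication for its nodes: every node in the cluster has 1 to N replicas, of which one is the master and the remaining N-1 are slaves.
　　In the earlier example with nodes A, B and C, if node B goes offline the cluster can no longer operate normally, because no node is left to handle hash slots 5501 to 11000.
　　If, when creating the cluster (or at least before node B goes offline), we had added a slave B1 to master B, then when master B goes offline the cluster makes B1 the new master and lets it take over hash slots 5501 to 11000 in place of B, so the cluster does not stop working because master B went offline.
　　However, if B and B1 both go offline, the Redis cluster still stops working.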
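　　The A/B/B1 scenario can be replayed with a few lines of Python. This is a toy model under the assumption that the cluster works only while every slot range has a reachable master or slave; the names SLOT_OWNERS, REPLICAS, serving_node and cluster_can_serve are invented for the sketch.

```python
# Slot ranges of the example masters and the one slave added for B.
SLOT_OWNERS = {
    "A": range(0, 5501),
    "B": range(5501, 11001),
    "C": range(11001, 16384),
}
REPLICAS = {"B": ["B1"]}


def serving_node(master: str, online: set):
    """Return the node that serves the master's slots, or None if none is left."""
    for node in [master] + REPLICAS.get(master, []):
        if node in online:
            return node
    return None


def cluster_can_serve(online: set) -> bool:
    """True while every slot range still has a reachable node."""
    return all(serving_node(m, online) is not None for m in SLOT_OWNERS)


print(cluster_can_serve({"A", "B", "C", "B1"}))   # everything up
print(serving_node("B", {"A", "C", "B1"}))        # B down, B1 takes over
print(cluster_can_serve({"A", "C"}))              # B and B1 both down
```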
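2.4 Cluster failover
　　1.    Within the cluster, nodes perform failure detection on one another.
　　2.    When a master goes offline, the other nodes in the cluster carry out the failover for the offline master.
　　3.    In other words, the cluster nodes integrate sentinel-like functions such as failure detection and failover.
　　4.    sentinel is a standalone monitoring program, whereas the cluster's failure detection and failover are built into the nodes. Because the two run in very different ways, the cluster implementation does not reuse sentinel's code even though the functionality is very similar.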
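2.5 Two cases when executing commands in a cluster:
　　Example 1-1 The command is sent to the correct node: it behaves just like a standalone redis server

　　Ø Slot layout:
　　7000: slots 0~5000
　　7001: slots 5001~10000
　　7002: slots 10001~16383
　　Example 1-2 The command is sent to the wrong node:
　　If the node that receives the command is not the node responsible for the key's slot, it returns a redirection error (MOVED) that tells the client which node to run the command on; the client then re-sends the command according to the error.

　　Ø The key date belongs to slot 2022, which is handled by 7000. The command was mistakenly sent to 7001, so 7001 returns a redirection error to the client.

　　Ø  Following the error, the client redirects to 7000 and re-sends the command.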
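　　What a client does with the redirection error can be sketched as follows. The error text is assumed to have the form "MOVED <slot> <host>:<port>"; parse_moved, run_with_redirect and fake_send are hypothetical helpers, and fake_send stands in for a real network call so the example runs without a cluster.

```python
from typing import NamedTuple


class Moved(NamedTuple):
    slot: int
    host: str
    port: int


def parse_moved(error: str) -> Moved:
    """Parse an error of the assumed form 'MOVED <slot> <host>:<port>'."""
    kind, slot, address = error.split()
    if kind != "MOVED":
        raise ValueError("not a MOVED error: %r" % error)
    host, _, port = address.rpartition(":")
    return Moved(int(slot), host, int(port))


def run_with_redirect(send, node, command, max_redirects=5):
    """Send the command and follow MOVED replies until a node accepts it."""
    for _ in range(max_redirects + 1):
        reply = send(node, command)
        if isinstance(reply, str) and reply.startswith("MOVED "):
            moved = parse_moved(reply)
            node = (moved.host, moved.port)   # retry on the node named in the error
            continue
        return reply
    raise RuntimeError("too many redirects")


def fake_send(node, command):
    """Stand-in for the network: only 7000 owns slot 2022 in this example."""
    if node != ("127.0.0.1", 7000):
        return "MOVED 2022 127.0.0.1:7000"
    return "OK"


print(run_with_redirect(fake_send, ("127.0.0.1", 7001), "SET date 2018-11-02"))
```

　　redis-cli does this automatically when started with the -c option, as in the cluster management examples later in this post.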
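2.6 redis cluster deployment:
　　Ø  Install ruby support
　　yum install ruby rubygems -y
　　gem sources -a http://mirrors.aliyun.com/rubygems/
　　gem sources  --remove http://rubygems.org/
　　gem sources -l
　　gem install redis -v 3.3.3
　　Ø  Create the instance directories:
　　[root@gitlab data]# mkdir {7000..7005}
　　Ø  Write the configuration file /data/7000/redis.conf (create one per instance, replacing 7000 with that instance's port):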
　　port 7000
　　daemonize yes
　　pidfile /data/7000/redis.pid
　　loglevel notice
　　logfile "/data/7000/redis.log"
　　dbfilename dump.rdb
　　dir /data/7000
　　protected-mode no
　　cluster-enabled yes
　　cluster-config-file nodes.conf
　　cluster-node-timeout 5000
　　appendonly yes
　　Ø  Start the instances:
　　for i in {0..5};do redis-server /data/700${i}/redis.conf ; done
　　[root@gitlab data]# ps -ef |grep redis
　　root    19490    1  0 03:40 ?       00:00:00 redis-server *:7000 [cluster]
　　root    19492    1  0 03:40 ?       00:00:00 redis-server *:7001 [cluster]
　　root    19494    1  0 03:40 ?       00:00:00 redis-server *:7002 [cluster]
　　root    19496    1  0 03:40 ?       00:00:00 redis-server *:7003 [cluster]
　　root    19498    1  0 03:40 ?       00:00:00 redis-server *:7004 [cluster]
　　root    19500    1  0 03:40 ?       00:00:00 redis-server *:7005 [cluster]
　　Ø  Load the nodes and create the cluster:
　　[root@gitlab data]# redis-trib.rb create --replicas 1 127.0.0.1:7000 127.0.0.1:7001 \
　　> 127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005
　　>>> Creating cluster
　　>>> Performing hash slots allocation on 6 nodes...
　　Using 3 masters:
　　127.0.0.1:7000
　　127.0.0.1:7001
　　127.0.0.1:7002
　　Adding replica 127.0.0.1:7003 to 127.0.0.1:7000
　　Adding replica 127.0.0.1:7004 to 127.0.0.1:7001
　　Adding replica 127.0.0.1:7005 to 127.0.0.1:7002
　　M: 41679c9a4392f205496746f51fe2d167ce307c86 127.0.0.1:7000
　　slots:0-5460 (5461 slots) master
　　M: b22bd736f693bf1573c0e3aff0403516871865ce 127.0.0.1:7001
　　slots:5461-10922 (5462 slots) master
　　M: e61c8c741d9a1e69ca2d9a6f36e46177915393c0 127.0.0.1:7002
　　slots:10923-16383 (5461 slots) master
　　S: 1b87225b8c1c8a9ebfbb9ac37a8c5a963d569513 127.0.0.1:7003
　　replicates 41679c9a4392f205496746f51fe2d167ce307c86
　　S: 3b26d115ce27c4e72b60a5fd4985658dacfe44fb 127.0.0.1:7004
　　replicates b22bd736f693bf1573c0e3aff0403516871865ce
　　S: 4a3c744e24e980f84836d6cd708dcadf5e505158 127.0.0.1:7005
　　replicates e61c8c741d9a1e69ca2d9a6f36e46177915393c0
　　Can I set the above configuration? (type 'yes' to accept): yes
　　>>> Nodes configuration updated
　　>>> Assign a different config epoch to each node
　　>>> Sending CLUSTER MEET messages to join the cluster
　　Waiting for the cluster to join...
　　>>> Performing Cluster Check (using node 127.0.0.1:7000)
　　M: 41679c9a4392f205496746f51fe2d167ce307c86 127.0.0.1:7000
　　slots:0-5460 (5461 slots) master
　　1 additional replica(s)
　　S: 1b87225b8c1c8a9ebfbb9ac37a8c5a963d569513 127.0.0.1:7003
　　slots: (0 slots) slave
　　replicates 41679c9a4392f205496746f51fe2d167ce307c86
　　S: 4a3c744e24e980f84836d6cd708dcadf5e505158 127.0.0.1:7005
　　slots: (0 slots) slave
　　replicates e61c8c741d9a1e69ca2d9a6f36e46177915393c0
　　M: e61c8c741d9a1e69ca2d9a6f36e46177915393c0 127.0.0.1:7002
　　slots:10923-16383 (5461 slots) master
　　1 additional replica(s)
　　M: b22bd736f693bf1573c0e3aff0403516871865ce 127.0.0.1:7001
　　slots:5461-10922 (5462 slots) master
　　1 additional replica(s)
　　S: 3b26d115ce27c4e72b60a5fd4985658dacfe44fb 127.0.0.1:7004
　　slots: (0 slots) slave
　　replicates b22bd736f693bf1573c0e3aff0403516871865ce
　　[OK] All nodes agree about slots configuration.
　　>>> Check for open slots...
　　>>> Check slots coverage...
　　[OK] All 16384 slots covered.
　　[root@gitlab data]#
2.7 Cluster management:
　　Ø  Write data:
　　[root@gitlab data]# redis-cli -c -p 7000
　　127.0.0.1:7000> set too bar
　　OK
　　127.0.0.1:7000> get too
　　"bar"
　　Ø  Check the cluster state:
　　[root@gitlab data]# redis-cli -p 7000 cluster nodes | grep master
　　41679c9a4392f205496746f51fe2d167ce307c86 127.0.0.1:7000 myself,master - 0 0 1 connected 0-5460
　　e61c8c741d9a1e69ca2d9a6f36e46177915393c0 127.0.0.1:7002 master - 0 1523908330620 3 connected 10923-16383
　　b22bd736f693bf1573c0e3aff0403516871865ce 127.0.0.1:7001 master - 0 1523908331131 2 connected 5461-10922
　　Ø  Resharding in practice:
　　[root@gitlab data]# redis-trib.rb reshard 127.0.0.1:7000
　　>>> Performing Cluster Check (using node 127.0.0.1:7000)
　　M: 41679c9a4392f205496746f51fe2d167ce307c86 127.0.0.1:7000
　　slots:0-5460 (5461 slots) master
　　1 additional replica(s)
　　S: 1b87225b8c1c8a9ebfbb9ac37a8c5a963d569513 127.0.0.1:7003
　　slots: (0 slots) slave
　　replicates 41679c9a4392f205496746f51fe2d167ce307c86
　　S: 4a3c744e24e980f84836d6cd708dcadf5e505158 127.0.0.1:7005
　　slots: (0 slots) slave
　　replicates e61c8c741d9a1e69ca2d9a6f36e46177915393c0
　　M: e61c8c741d9a1e69ca2d9a6f36e46177915393c0 127.0.0.1:7002
　　slots:10923-16383 (5461 slots) master
　　1 additional replica(s)
　　M: b22bd736f693bf1573c0e3aff0403516871865ce 127.0.0.1:7001
　　slots:5461-10922 (5462 slots) master
　　1 additional replica(s)
　　S: 3b26d115ce27c4e72b60a5fd4985658dacfe44fb 127.0.0.1:7004
　　slots: (0 slots) slave
　　replicates b22bd736f693bf1573c0e3aff0403516871865ce
　　[OK] All nodes agree about slots configuration.
　　>>> Check for open slots...
　　>>> Check slots coverage...
　　[OK] All 16384 slots covered.
　　How many slots do you want to move (from 1 to 16384)? 3             (move three slots)
　　What is the receiving node ID?                                         (enter the ID of the receiving node, 127.0.0.1:7002 in this run)
　　Please enter all the source node IDs.
　　Type 'all' to use all the nodes as source nodes for the hash slots.
　　Type 'done' once you entered all the source nodes IDs.
　　Source node #1:41679c9a4392f205496746f51fe2d167ce307c86                (the ID of the source node)
　　Source node #2:done                                                    (enter done when there are no more)
　　Ready to move 3 slots.
　　Source nodes:
　　M: 41679c9a4392f205496746f51fe2d167ce307c86 127.0.0.1:7000
　　slots:0-5460 (5461 slots) master
　　1 additional replica(s)
　　Destination node:
　　M: e61c8c741d9a1e69ca2d9a6f36e46177915393c0 127.0.0.1:7002
　　slots:10923-16383 (5461 slots) master
　　1 additional replica(s)
　　Resharding plan:
　　Moving slot 0 from 41679c9a4392f205496746f51fe2d167ce307c86
　　Moving slot 1 from 41679c9a4392f205496746f51fe2d167ce307c86
　　Moving slot 2 from 41679c9a4392f205496746f51fe2d167ce307c86
　　Do you want to proceed with the proposed reshard plan (yes/no)? yes
　　Moving slot 0 from 127.0.0.1:7000 to 127.0.0.1:7002:
　　Moving slot 1 from 127.0.0.1:7000 to 127.0.0.1:7002:
　　Moving slot 2 from 127.0.0.1:7000 to 127.0.0.1:7002:
　　Ø  Delete a node:
　　A node that still holds slots cannot be deleted
　　[root@gitlab data]# redis-trib.rb del-node 127.0.0.1:7000 '41679c9a4392f205496746f51fe2d167ce307c86'
　　>>> Removing node 41679c9a4392f205496746f51fe2d167ce307c86 from cluster 127.0.0.1:7000
　　>>> Sending CLUSTER FORGET messages to the cluster...
　　>>> SHUTDOWN the node.
　　After a node is deleted its service shuts down automatically; to add it back, it has to be started again
　　[root@gitlab data]# ps -ef |grep redis
　　root    19492    1  0 03:40 ?       00:00:12 redis-server *:7001 [cluster]
　　root    19494    1  0 03:40 ?       00:00:20 redis-server *:7002 [cluster]
　　root    19496    1  0 03:40 ?       00:00:04 redis-server *:7003 [cluster]
　　root    19498    1  0 03:40 ?       00:00:04 redis-server *:7004 [cluster]
　　root    19500    1  0 03:40 ?       00:00:04 redis-server *:7005 [cluster]
　　root    19608  19171  0 04:33 pts/3 00:00:00 grep --color=auto redis
　　[root@gitlab data]# redis-server /data/7000/redis.conf
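　　Ø  Add a node (the first address is the new node, the second is a node already in the cluster):
　　redis-trib.rb add-node 127.0.0.1:7000 127.0.0.1:7002
　　Note: the node being added must be brand new
　　Ø  Add a slave node (replace $[nodeid] with the node ID of its master):
　　redis-trib.rb add-node --slave --master-id $[nodeid] 127.0.0.1:7008 127.0.0.1:7000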
Chapter 3 Redis API:
3.1 Connecting to redis from PHP
　　Ø  Connection test code
　　[root@clsn ~]# cat /application/nginx/html/check.php
　　
　　Ø  String operations
　　
3.2 Connecting to redis from Python
　　unzip redis-py-master.zip
　　cd redis-py-master
　　python setup.py install
　　>>> import redis
　　>>> r = redis.StrictRedis(host='localhost', port=6379, db=0, password='')
　　>>> r.set('foo', 'bar')
　　True
　　>>> r.get('foo')
　　'bar'
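3.3 Connecting to and operating on redis cluster
　　Connecting from Python requires Python 2.7 or later (StrictRedisCluster comes from the redis-py-cluster package):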
　　>>> from rediscluster import StrictRedisCluster
　　>>> startup_nodes = [{"host": "127.0.0.1", "port": "7000"}]
　　>>> rc = StrictRedisCluster(startup_nodes=startup_nodes, decode_responses=True)
　　>>> rc.set("foo", "bar")
　　True
　　>>> print(rc.get("foo"))
　　bar
3.4 Connecting to and operating on a sentinel cluster:
　　>>> from redis.sentinel import Sentinel
　　>>> sentinel = Sentinel([('localhost', 26380)], socket_timeout=0.1)
　　>>> sentinel.discover_master('mymaster')
　　>>> sentinel.discover_slaves('mymaster')
　　>>> master = sentinel.master_for('mymaster', socket_timeout=0.1)
　　>>> slave = sentinel.slave_for('mymaster', socket_timeout=0.1)
　　>>> master.set('oldboy', '123')
　　>>> slave.get('oldboy')
　　'123'
