Redis Cluster Node Management

ndlli · Posted on 2018-11-3 08:04:13

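How to manage the cluster
Managing a Redis cluster mainly means adding and removing master and slave nodes and resharding slots between nodes. All of these operations can be performed with the redis-trib.rb tool, as described below: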

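I. Adding a Master Node
Adding a node to a Redis cluster takes three steps: create a new empty node, add that empty node to the cluster, and finally assign hash slots to it. In detail: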
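To make the term "hash slot" concrete, here is a minimal Python sketch of how a key is mapped to one of the 16384 slots (CRC16 of the key, modulo 16384, with hash-tag handling). The function name key_slot is only an illustration and is not part of redis-trib.rb; it assumes that Python's binascii.crc_hqx with an initial value of 0 matches the CRC16 variant Redis Cluster uses.

```python
import binascii

HASH_SLOTS = 16384  # total number of hash slots in a Redis Cluster


def key_slot(key: str) -> int:
    """Return the hash slot of a key: CRC16(key) mod 16384."""
    data = key.encode("utf-8")
    # Hash tags: if the key contains "{...}" with a non-empty body, only
    # the text between the first "{" and the next "}" is hashed.
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    # crc_hqx uses the CRC-CCITT polynomial 0x1021 (assumed to match the
    # CRC16 variant of Redis Cluster when the initial value is 0).
    return binascii.crc_hqx(data, 0) % HASH_SLOTS


if __name__ == "__main__":
    for k in ("foo", "hello", "{user1000}.following"):
        print(k, "->", key_slot(k))
```

A master that owns no slots, such as a freshly added node, never receives any keys, which is why the slot assignment step is required.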

A. Create the empty node

　　

// Create the Redis node on port 7006 with the install script:
# cd /usr/local/redis-3.2.8
# ./utils/install_server.sh
Welcome to the redis service installer
This script will help you easily set up a running redis server

Please select the redis port for this instance: [6379] 7006    # use port 7006
Please select the redis config file name [/etc/redis/7006.conf]
Selected default - /etc/redis/7006.conf
Please select the redis log file name [/var/log/redis_7006.log]
Selected default - /var/log/redis_7006.log
Please select the data directory for this instance [/var/lib/redis/7006]
Selected default - /var/lib/redis/7006
Please select the redis executable path [/usr/local/bin/redis-server]
Selected config:
Port           : 7006
Config file    : /etc/redis/7006.conf
Log file       : /var/log/redis_7006.log
Data dir       : /var/lib/redis/7006
Executable     : /usr/local/bin/redis-server
Cli Executable : /usr/local/bin/redis-cli
Is this ok? Then press ENTER to go on or Ctrl-C to abort.
Copied /tmp/7006.conf => /etc/init.d/redis_7006
Installing service...
Successfully added to chkconfig!
Successfully added to runlevels 345!
Starting Redis server...
Installation successful!
　　

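// Manually fix the port in the Redis config file (the script did not set it correctly):
# sed -i 's/6379/7006/g' /etc/redis/7006.conf
// Make sure cluster mode is enabled in /etc/redis/7006.conf (cluster-enabled yes); a node that is not in cluster mode cannot join a cluster.
// Update the Redis init script (only needed if the listening address was changed):
# sed -i 's/$CLIEXEC -p/$CLIEXEC -h 192.168.0.2 -p/g' /etc/rc.d/init.d/redis_7006
// Shut down the instance on the default port:
# /usr/local/redis-3.2.8/src/redis-cli -h 192.168.0.2 -p 6379 shutdown
// Start the new node:
# /etc/init.d/redis_7006 start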
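The sed command above replaces every occurrence of 6379 in the file, including comments. As an alternative, the following Python sketch rewrites only real directives and also sets the cluster directives a new node needs. The helper names set_directive and prepare_cluster_conf are hypothetical; port, cluster-enabled and cluster-config-file are standard redis.conf directives, and the file name nodes_7006.conf follows the naming used later in this post.

```python
def set_directive(conf_text: str, name: str, value: str) -> str:
    """Set `name value` in redis.conf text.

    Replaces an existing uncommented directive, or appends a new line
    if the directive is not present.
    """
    lines = conf_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        parts = line.split()
        # Match only real directives, not comments or other settings
        # that merely contain the same digits or words.
        if parts and parts[0] == name:
            lines[i] = f"{name} {value}"
            replaced = True
    if not replaced:
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


def prepare_cluster_conf(conf_text: str, port: int) -> str:
    """Return config text for a cluster node listening on `port`."""
    conf_text = set_directive(conf_text, "port", str(port))
    conf_text = set_directive(conf_text, "cluster-enabled", "yes")
    conf_text = set_directive(
        conf_text, "cluster-config-file", f"nodes_{port}.conf"
    )
    return conf_text


if __name__ == "__main__":
    sample = "# default port is 6379\nport 6379\n"
    print(prepare_cluster_conf(sample, 7006))
```

In practice the function would be applied to the contents of /etc/redis/7006.conf and the result written back before starting the node.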
　　

B. Add the empty node to the cluster

　　

# cd /usr/local/redis-3.2.8/src
# ./redis-trib.rb add-node 192.168.0.2:7006 192.168.0.2:7000
NOTE:
192.168.0.2:7006 is the new empty node being added to the cluster;
192.168.0.2:7000 is any node that already belongs to the cluster.
　　

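Possible error: [ERR] Node 192.168.30.198:16381 is not empty. Either the node already knows other nodes (check with CLUSTER NODES) or contains some key in database 0

How to fix it:
1) Delete the local persistence files (AOF, RDB and so on) of the node you want to add;
2) Also delete the new node's cluster config file nodes_7006.conf, that is, the file named by cluster-config-file in its redis.conf;
3) If adding the node still fails, log in to the new node with ./redis-cli -h <host> -p <port> and clear the database:
192.168.0.2:7006> flushdb # clear the current database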
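Steps 1) and 2) can be sketched in Python as follows. This is only an illustration: the file names dump.rdb and appendonly.aof are the Redis defaults and nodes_7006.conf is the cluster config file name used in this post, so all three are assumptions that must be adjusted to the actual redis.conf. The helper defaults to a dry run, and the node should be stopped before anything is deleted.

```python
from pathlib import Path
from typing import List

# Assumed file names: Redis defaults for RDB and AOF, plus the cluster
# config file name used in this post. Adjust them to match redis.conf.
STALE_NAMES = ("dump.rdb", "appendonly.aof", "nodes_7006.conf")


def find_stale_files(data_dir: str, names=STALE_NAMES) -> List[Path]:
    """Return the persistence and cluster-state files found in data_dir."""
    root = Path(data_dir)
    return [root / name for name in names if (root / name).is_file()]


def remove_stale_files(data_dir: str, dry_run: bool = True) -> List[Path]:
    """List the files that keep a node non-empty; delete them unless dry_run."""
    found = find_stale_files(data_dir)
    if not dry_run:
        for path in found:
            path.unlink()
    return found


if __name__ == "__main__":
    # Data directory chosen during the install step above.
    for path in remove_stale_files("/var/lib/redis/7006"):
        print("would delete", path)
```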
　　

After the command completes, you can use redis-trib.rb to verify that the newly added node is a master:

# ./redis-trib.rb check 192.168.0.2:7006

// Output:
>>> Performing Cluster Check (using node 192.168.0.2:7006)
M: 7765fdc83ea8859a0d2398bfff8c633415d12777 192.168.0.2:7006
   slots: (0 slots) master
   0 additional replica(s)
S: f40ea2a234f7251fa5ee32f31cb987334b104263 192.168.0.2:7005
   slots: (0 slots) slave
   replicates 920d755a8dbd2b62cbdf6053d62102c7379140d8
S: 45be7a1f9a2a8f891fa5807ecd7bb6ae37d95eab 192.168.0.2:7003
   slots: (0 slots) slave
   replicates 01c944eb66564d41e355388dc468ee79e71fe789
M: 01c944eb66564d41e355388dc468ee79e71fe789 192.168.0.2:7000
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
M: 357a9140f1b836afda4a623a757cfa54c3ab932b 192.168.0.2:7001
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
M: 920d755a8dbd2b62cbdf6053d62102c7379140d8 192.168.0.2:7002
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
S: c9335f2ea4c8d2bfa1766d59aa4631d9a8df36b3 192.168.0.2:7004
   slots: (0 slots) slave
   replicates 357a9140f1b836afda4a623a757cfa54c3ab932b
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
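The check output can also be read by a script, for example to find masters that do not own any slots yet. The Python sketch below parses the node lines of the output shown above. The function name parse_check_output is hypothetical, and the parser assumes the line layout seen in this post (redis-trib.rb from Redis 3.2.8); other versions may print a different format.

```python
import re
from typing import Dict, List

NODE_RE = re.compile(r"^([MS]): ([0-9a-f]+) (\S+)$")
SLOTS_RE = re.compile(r"\((\d+) slots\) (master|slave)")

SAMPLE = """\
M: 7765fdc83ea8859a0d2398bfff8c633415d12777 192.168.0.2:7006
   slots: (0 slots) master
   0 additional replica(s)
M: 01c944eb66564d41e355388dc468ee79e71fe789 192.168.0.2:7000
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
S: 45be7a1f9a2a8f891fa5807ecd7bb6ae37d95eab 192.168.0.2:7003
   slots: (0 slots) slave
   replicates 01c944eb66564d41e355388dc468ee79e71fe789
"""


def parse_check_output(text: str) -> List[Dict[str, object]]:
    """Parse `redis-trib.rb check` output into one dict per node."""
    nodes: List[Dict[str, object]] = []
    for raw in text.splitlines():
        line = raw.strip()
        node = NODE_RE.match(line)
        if node:
            nodes.append({"id": node.group(2), "addr": node.group(3),
                          "role": None, "slots": 0})
            continue
        slots = SLOTS_RE.search(line)
        if slots and nodes:
            # The "slots:" line belongs to the most recent node line.
            nodes[-1]["slots"] = int(slots.group(1))
            nodes[-1]["role"] = slots.group(2)
    return nodes


if __name__ == "__main__":
    for n in parse_check_output(SAMPLE):
        if n["role"] == "master" and n["slots"] == 0:
            print("master without slots:", n["addr"])
```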
　　

C. Assign slots to the empty node

　　

# ./redis-trib.rb reshard 192.168.0.2:7006

// Output; answer the prompts as shown:
How many slots do you want to move (from 1 to 16384)? 500 // number of slots to take from the source master
What is the receiving node ID? 7765fdc83ea8859a0d2398bfff8c633415d12777 // ID of the new 7006 node
Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
Source node #1:01c944eb66564d41e355388dc468ee79e71fe789 // the 7000 master that gives up the slots
Source node #2:done
Do you want to proceed with the proposed reshard plan (yes/no)? yes // confirm the reshard
　　

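Note: enter specific source node IDs and finish the list with 'done'. If you enter 'all', slots are taken from every other master and given to the receiving node. It is best to avoid 'all', because it fragments the slot ranges.

Once the reshard finishes, the new master has been added and has its slots assigned; it now owns 500 slots.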
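The fragmentation caused by 'all' can be illustrated with a small model. The Python sketch below splits the requested number of slots across all masters in proportion to the slots each one owns. It is a simplified model under my own assumptions (one contiguous range per master, the largest master absorbs the rounding), not the actual redis-trib.rb code, and the name split_all_sources is hypothetical.

```python
import math
from typing import Dict, List, Tuple


def split_all_sources(owned: Dict[str, range],
                      numslots: int) -> List[Tuple[str, range]]:
    """Model how a reshard with source 'all' spreads numslots over masters.

    Each master gives up a share proportional to the slots it owns, taken
    from the low end of its range.
    """
    # Largest master first; it absorbs the rounding remainder.
    sources = sorted(owned.items(), key=lambda kv: len(kv[1]), reverse=True)
    total = sum(len(slots) for _, slots in sources)
    if total == 0:
        return []
    plan: List[Tuple[str, range]] = []
    taken = 0
    for i, (name, slots) in enumerate(sources):
        share = numslots / total * len(slots)
        n = math.ceil(share) if i == 0 else math.floor(share)
        n = min(n, numslots - taken)  # never plan more than requested
        if n > 0:
            plan.append((name, slots[:n]))
            taken += n
    return plan


if __name__ == "__main__":
    # Slot layout of the three masters shown in the check output above.
    owned = {
        "7000": range(0, 5461),
        "7001": range(5461, 10923),
        "7002": range(10923, 16384),
    }
    for name, slots in split_all_sources(owned, 500):
        print(name, "gives", slots.start, "-", slots.stop - 1)
```

With the layout above, the model takes a small block from each of the three masters, so the receiving node ends up with three disjoint ranges instead of the single contiguous block it gets from one source node.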

D. Verify that the node works

The new node has now been added and configured successfully and works without problems.

If you run into an error like the following during the migration:

>>> Check for open slots...
[WARNING] Node 192.168.0.2:7000 has slots in importing state (0).
[WARNING] Node 192.168.0.2:7006 has slots in migrating state (0).
[WARNING] The following slots are open: 0

you can try to repair it with the command "redis-trib.rb fix 192.168.0.2:7000". If the output still shows nodes in the migrating or importing state, log in to each of those nodes and run "cluster setslot 0 stable", where 0 is the ID of the slot reported in the warning.
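When several slots are open, the commands can be derived from the warnings themselves. The Python sketch below extracts the node address and slot IDs from the [WARNING] lines and prints the matching "cluster setslot ... stable" commands. The function name stable_commands is hypothetical, and the assumption that multiple slot IDs are separated by commas inside the parentheses is mine. The sketch only prints commands, so redis-trib.rb fix should still be tried first.

```python
import re
from typing import List, Tuple

WARN_RE = re.compile(
    r"\[WARNING\] Node (\S+) has slots in (importing|migrating) state "
    r"\(([\d,\s]+)\)"
)

SAMPLE = """\
>>> Check for open slots...
[WARNING] Node 192.168.0.2:7000 has slots in importing state (0).
[WARNING] Node 192.168.0.2:7006 has slots in migrating state (0).
[WARNING] The following slots are open: 0
"""


def stable_commands(check_output: str) -> List[Tuple[str, str]]:
    """Return (node address, command) pairs that reset open slots."""
    commands: List[Tuple[str, str]] = []
    for match in WARN_RE.finditer(check_output):
        addr = match.group(1)
        # Assumption: several slot IDs are listed comma-separated.
        for slot in match.group(3).split(","):
            slot = slot.strip()
            if slot:
                commands.append((addr, f"cluster setslot {slot} stable"))
    return commands


if __name__ == "__main__":
    for addr, command in stable_commands(SAMPLE):
        print(f"{addr}> {command}")
```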

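II. Adding a Slave Node to a Master
Adding a slave starts the same way as adding a master: create the empty node and add it to the cluster (steps A and B above). Do not assign slots to it. Instead, log in to the new node with redis-cli and use cluster replicate to point it at its master, as follows:
Method 1:

// Log in to the slave node with redis-cli and use cluster replicate to set its master
192.168.0.2:7006> cluster replicate 01c944eb66564d41e355388dc468ee79e71fe789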
　　

Method 2:

# ./redis-trib.rb add-node --slave --master-id '01c944eb66564d41e355388dc468ee79e71fe789' 192.168.0.2:7006 192.168.0.2:7001

Notes:
--slave: the node being added is a slave;
--master-id 01c944eb66564d41e355388dc468ee79e71fe789: the node ID of the master;
192.168.0.2:7006: the new node;
192.168.0.2:7001: any existing node in the cluster.
　　

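Result:

(error) ERR To set a master the node must be empty and without assigned slots.
NOTE:
This error means that a node which is currently a master and still holds keys or assigned slots cannot be turned into a slave. Here 7006 is still the master that received 500 slots in section I.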
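Whether a node still owns slots can be checked from its own line in the CLUSTER NODES output before running cluster replicate. The Python sketch below counts the slots listed on such a line. The sample lines are made up for illustration (only the node ID comes from this post), and the field layout assumed here is the usual one: eight fixed fields followed by slot entries.

```python
def owned_slot_count(cluster_nodes_line: str) -> int:
    """Count the slots a node owns from one line of CLUSTER NODES output."""
    fields = cluster_nodes_line.split()
    # The first eight fields are id, address, flags, master id, ping-sent,
    # pong-recv, config-epoch and link-state; slot entries follow.
    count = 0
    for entry in fields[8:]:
        if entry.startswith("["):
            continue  # slot being migrated or imported, not an owned range
        if "-" in entry:
            first, last = entry.split("-")
            count += int(last) - int(first) + 1
        else:
            count += 1  # a single slot number
    return count


def can_become_slave(cluster_nodes_line: str, key_count: int) -> bool:
    """A master can be turned into a slave only with no slots and no keys."""
    return owned_slot_count(cluster_nodes_line) == 0 and key_count == 0


if __name__ == "__main__":
    # Illustrative line; the epoch and timestamp fields are invented.
    line = ("7765fdc83ea8859a0d2398bfff8c633415d12777 192.168.0.2:7006 "
            "myself,master - 0 0 7 connected 0-499")
    print(owned_slot_count(line), can_become_slave(line, key_count=0))
```

The key count would come from a command such as dbsize on the node itself.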
　　

Because of this, we now empty the 7006 master and move away the slots assigned to it, as follows:

A. First move the slots off the 7006 master

# ./redis-trib.rb reshard 192.168.0.2:7006

How many slots do you want to move (from 1 to 16384)? 500          // move away all 500 slots
What is the receiving node ID?          // enter the ID of the master that will receive the slots
Source node #1:7765fdc83ea8859a0d2398bfff8c633415d12777    // the slots come from the current node (7006)
Source node #2:done
Do you want to proceed with the proposed reshard plan (yes/no)? yes          // start moving the slots
　　

B. Set the master

# ./redis-cli -c -h 192.168.0.2 -p 7006
192.168.0.2:7006> cluster replicate 01c944eb66564d41e355388dc468ee79e71fe789
OK
192.168.0.2:7006> exit
　　

C. Check the master-slave relationship
# ./redis-trib.rb check 192.168.0.2:7006

III. Removing Nodes

1. Remove a slave node

# cd /usr/local/redis-3.2.8/src
# ./redis-trib.rb del-node 192.168.0.2:7006 '7765fdc83ea8859a0d2398bfff8c633415d12777'

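2. Remove a master node
A. If the master has slaves, move them to another master before deleting it.
B. Move the master's slots to another master.
This is the same reshard operation described above, so it is not repeated here.
C. Delete the master node
# ./redis-trib.rb del-node 192.168.0.2:7000 920d755a8dbd2b62cbdf6053d62102c7379140d8
Notes:
192.168.0.2:7000 is any node of the cluster;
920d755a8dbd2b62cbdf6053d62102c7379140d8 is the ID of the node to delete, here node 7002.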
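The order of steps A to C matters, so it can help to write the plan down before running anything. The Python sketch below only builds the command strings for removing a master; it does not execute them. The function name plan_master_removal is hypothetical, and the non-interactive reshard options --from, --to, --slots and --yes are used as I understand them from redis-trib.rb, so verify them against your version before relying on them.

```python
from typing import List


def plan_master_removal(entry: str, master_id: str, slot_count: int,
                        target_id: str, slave_addrs: List[str]) -> List[str]:
    """Return the commands, in order, for removing a master node.

    entry is any node of the cluster (host:port), target_id is the master
    that takes over the slaves and slots of the node being removed.
    """
    steps: List[str] = []
    # A. Re-point each slave at a master that stays in the cluster.
    for addr in slave_addrs:
        host, port = addr.split(":")
        steps.append(
            f"./redis-cli -h {host} -p {port} cluster replicate {target_id}"
        )
    # B. Move every slot off the master being removed.
    if slot_count > 0:
        steps.append(
            f"./redis-trib.rb reshard --from {master_id} --to {target_id} "
            f"--slots {slot_count} --yes {entry}"
        )
    # C. Only a master without slots can be deleted.
    steps.append(f"./redis-trib.rb del-node {entry} {master_id}")
    return steps


if __name__ == "__main__":
    # Values taken from the check output in section I: remove 7002,
    # hand its slave 7005 and its 5461 slots over to 7000.
    for step in plan_master_removal(
        entry="192.168.0.2:7000",
        master_id="920d755a8dbd2b62cbdf6053d62102c7379140d8",
        slot_count=5461,
        target_id="01c944eb66564d41e355388dc468ee79e71fe789",
        slave_addrs=["192.168.0.2:7005"],
    ):
        print(step)
```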
