Deploying a Ceph Cluster on CentOS 7 Virtual Machines
Chapter 1: Ceph Deployment
1.1 Overview
A Ceph deployment mainly consists of the following types of nodes:
- Ceph OSDs: A Ceph OSD daemon stores data and handles data replication, recovery, backfilling, and rebalancing; it also provides some monitoring information to the Ceph Monitors by checking the heartbeats of other OSD daemons. When the Ceph Storage Cluster is configured to keep two copies of the data (the default), at least two Ceph OSD daemons must reach the active+clean state (see the status-check sketch after this list).
- Monitors: A Ceph Monitor maintains maps of the cluster state, including the monitor map, OSD map, Placement Group (PG) map, and CRUSH map. Ceph keeps a history (called an "epoch") of each state change in the Ceph Monitors, Ceph OSD daemons, and PGs.
- MDSs: A Ceph Metadata Server (MDS) stores metadata on behalf of the Ceph File System (Ceph Block Devices and Ceph Object Storage do not use an MDS). Metadata Servers make it possible for users to run basic POSIX file system commands such as ls and find.
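Once the cluster built in the following sections is up, these roles can be inspected directly. A few read-only commands as a sketch (they assume the admin keyring is readable on the node where they are run):
ceph -s                      # overall health, monitor quorum, OSD up/in counts
ceph mon dump                # the monitor map described above
ceph osd tree                # the OSDs and their up/down, in/out status
ceph osd pool get rbd size   # replica count of the default rbd pool (2 in this guide)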
1.2 Cluster Planning
Before creating the cluster, plan it out as follows.
1.2.1 Network Topology
The Ceph cluster is deployed on VMware virtual machines.
1.2.2 Node IP Assignment
Node IP          Hostname     Description
192.168.92.100   node0        admin, osd
192.168.92.101   node1        osd, mon
192.168.92.102   node2        osd, mon
192.168.92.103   node3        osd, mon
192.168.92.109   client-node  Client node, used to mount the storage exported by the Ceph cluster for testing
1.2.3 OSD Planning
node2: /var/local/osd0
node3: /var/local/osd0
1.3 Host Preparation
1.3.1 Update /etc/hosts on the Admin Node
Edit /etc/hosts (a sketch for propagating the entries to the other nodes follows the listing):
$ sudo cat /etc/hosts
[sudo] password for ceph:
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.92.100 node0
192.168.92.101 node1
192.168.92.102 node2
192.168.92.103 node3
$
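Whether the other nodes also need these entries depends on your name resolution setup; if they do, the file can simply be copied out. A minimal sketch, assuming password-based SSH as root is still acceptable at this early stage:
# Copy the admin node's hosts file to the other nodes (skip if DNS already resolves the names).
for h in node1 node2 node3; do
  scp /etc/hosts root@$h:/etc/hosts
done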
1.3.2 Creating the ceph User
Create a ceph user on each of the five hosts listed above (run the following as root or with root privileges; a loop that applies it to all nodes is sketched after these commands).
Create the user:
sudo adduser -d /home/ceph -m ceph
Set its password:
sudo passwd ceph
Grant passwordless sudo:
echo "ceph ALL = (root) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/ceph
sudo chmod 0440 /etc/sudoers.d/ceph
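If root SSH access to the nodes is available, the same three steps can be driven from the admin node in one loop. A sketch, using the hostnames from 1.2.2; the password is a placeholder, and passwd is replaced by chpasswd so the loop runs non-interactively:
for h in node0 node1 node2 node3 client-node; do
  ssh root@$h 'useradd -d /home/ceph -m ceph;
               echo "ceph:CHANGE_ME" | chpasswd;
               echo "ceph ALL = (root) NOPASSWD:ALL" > /etc/sudoers.d/ceph;
               chmod 0440 /etc/sudoers.d/ceph'
done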
1.3.3 Preparing sudo Settings
Run visudo and edit the sudoers file:
Change the line "Defaults requiretty" to "Defaults:ceph !requiretty".
Without this change, ceph-deploy will fail when it runs commands over SSH.
If the ceph-deploy new <node-hostname> step still fails:
$ ceph-deploy new node3
found configuration file at: /home/ceph/.cephdeploy.conf
Invoked (1.5.28): /usr/bin/ceph-deploy new node3
ceph-deploy options:
username :None
func : <function new at 0xee0b18>
verbose :False
overwrite_conf :False
quiet :False
cd_conf :<ceph_deploy.conf.cephdeploy.Conf instance at 0xef9a28>
cluster :ceph
ssh_copykey :True
mon : ['node3']
public_network :None
ceph_conf :None
cluster_network :None
default_release :False
fsid :None
Creating new cluster named ceph
making sure passwordless SSH succeeds
connected to host: admin-node
Running command: ssh -CT -o BatchMode=yes node3
connection detected need for sudo
connected to host: node3
RuntimeError: remote connection got closed, ensure ``requiretty`` is disabled for node3
$
then configure passwordless sudo as follows.
Edit with sudo visudo:
$ sudo grep "ceph" /etc/sudoers
Defaults:ceph !requiretty
ceph ALL=(ALL) NOPASSWD: ALL
$
Additionally (a non-interactive sketch for both settings follows these notes):
1. Comment out Defaults requiretty
Change "Defaults requiretty" to "#Defaults requiretty", meaning a controlling terminal is not required.
Otherwise you will get: sudo: sorry, you must have a tty to run sudo
2. Add the line Defaults visiblepw
Otherwise you will get: sudo: no tty present and no askpass program specified
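If you prefer not to open visudo interactively, the same settings can be dropped into a file under /etc/sudoers.d. A minimal sketch, assuming drop-in files are acceptable in your environment; the file name is arbitrary:
# Write the sudo settings for the ceph user, then verify the syntax.
cat <<'EOF' | sudo tee /etc/sudoers.d/ceph
Defaults:ceph !requiretty
Defaults:ceph visiblepw
ceph ALL=(ALL) NOPASSWD: ALL
EOF
sudo chmod 0440 /etc/sudoers.d/ceph
sudo visudo -c   # syntax check; a broken sudoers file can lock you out of sudo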
1.3.4 Passwordless SSH Access from the Admin Node
Configure the admin-node so it can SSH to the other nodes without a password.
Step 1: On the admin-node host, run:
ssh-keygen
Note: for simplicity, just press Enter at each prompt.
Step 2: Copy the key generated in step 1 to the other nodes:
ssh-copy-id ceph@node0
ssh-copy-id ceph@node1
ssh-copy-id ceph@node2
ssh-copy-id ceph@node3
Also edit ~/.ssh/config and add the following:
[ceph@admin-node my-cluster]$ cat ~/.ssh/config
Host node1
Hostname node1
User ceph
Host node2
Hostname node2
User ceph
Host node3
Hostname node3
User ceph
Host client-node
Hostname client-node
User ceph
$
1.3.4.1 Fixing "Bad owner or permissions on .ssh/config"
Error message:
Bad owner or permissions on /home/ceph/.ssh/config fatal: The remote end hung up unexpectedly
Solution:
$ sudo chmod 600 ~/.ssh/config
1.4 Installing ceph-deploy on the Admin Node
Step 1: Add a yum repository file
sudo vim /etc/yum.repos.d/ceph.repo
Add the following content:
[ceph-noarch]
name=Ceph noarch packages
baseurl=http://mirrors.163.com/ceph/rpm-jewel/el7/noarch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=http://mirrors.163.com/ceph/keys/release.asc
Step 2: Update the package sources and install ceph-deploy and the NTP time-synchronization packages
sudo yum update && sudo yum install ceph-deploy
sudo yum install ntp ntpdate ntp-doc
Step 3: On every node, stop the firewall and disable SELinux enforcement, plus a few other steps (a persistent variant is sketched after these commands)
sudo systemctl stop firewalld.service
sudo setenforce 0
sudo yum install yum-plugin-priorities
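The commands above only last until the next reboot. A minimal sketch to make the changes persistent, assuming it is acceptable to disable firewalld entirely (otherwise keep it and open only the Ceph ports, as in the commented alternative):
# Run on every node.
sudo systemctl stop firewalld.service
sudo systemctl disable firewalld.service
sudo sed -i 's/^SELINUX=enforcing/SELINUX=permissive/' /etc/selinux/config
# Alternative: keep firewalld and open the monitor and OSD ports instead.
# sudo firewall-cmd --zone=public --add-port=6789/tcp --permanent
# sudo firewall-cmd --zone=public --add-port=6800-7300/tcp --permanent
# sudo firewall-cmd --reload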
Summary: with the steps above, all prerequisites are in place; next comes the actual Ceph deployment.
1.5 Creating the Ceph Cluster
As the ceph user created earlier, create a working directory on the admin-node:
mkdir my-cluster
cd my-cluster
1.5.1 How to Wipe Existing Ceph Data
First wipe any previous Ceph data. Skip this step for a fresh installation; if you are redeploying, run the following commands:
ceph-deploy purgedata {ceph-node} [{ceph-node}]
ceph-deploy forgetkeys
For example:
$ ceph-deploy purgedata admin-node node1 node2 node3
found configuration file at: /home/ceph/.cephdeploy.conf
Invoked (1.5.28): /usr/bin/ceph-deploy purgedata admin-node node1 node2 node3
…
Running command: sudo rm -rf --one-file-system -- /var/lib/ceph
Running command: sudo rm -rf --one-file-system -- /etc/ceph/
$ ceph-deploy forgetkeys
found configuration file at: /home/ceph/.cephdeploy.conf
Invoked (1.5.28): /usr/bin/ceph-deploy forgetkeys
…
default_release : False
$
Command:
ceph-deploy purge {ceph-node} [{ceph-node}]
For example:
$ ceph-deploy purge admin-node node1 node2 node3
1.5.2 Creating the Cluster and Defining the Monitor Nodes
On the admin node, create the cluster with ceph-deploy. The arguments after new are the hostnames of the monitor nodes; if there are several monitors, separate their hostnames with spaces. Multiple monitor nodes back each other up.
$ sudo vim/etc/ssh/sshd_config
$ sudo visudo
$ ceph-deploy new node1 node2 node3
found configuration file at: /home/ceph/.cephdeploy.conf
Invoked (1.5.34): /usr/bin/ceph-deploy new node1 node2 node3
ceph-deploy options:
username :None
func : <function new at0x29f2b18>
verbose :False
overwrite_conf :False
quiet :False
cd_conf :<ceph_deploy.conf.cephdeploy.Conf instance at 0x2a15a70>
cluster :ceph
ssh_copykey :True
mon : ['node1', 'node2','node3']
public_network :None
ceph_conf :None
cluster_network :None
default_release :False
fsid :None
Creatingnew cluster named ceph
making sure passwordless SSH succeeds
connected to host:node0
Running command: ssh -CT -o BatchMode=yesnode1
connection detectedneed for sudo
connected to host:node1
detect platforminformation from remote host
detect machine type
find the location ofan executable
Running command: sudo /usr/sbin/ip linkshow
Running command: sudo /usr/sbin/ip addrshow
IP addresses found:['192.168.92.101', '192.168.1.102', '192.168.122.1']
Resolvinghost node1
Monitornode1 at 192.168.92.101
making sure passwordless SSH succeeds
connected to host:node0
Running command: ssh -CT -o BatchMode=yesnode2
connection detectedneed for sudo
connected to host:node2
detect platforminformation from remote host
detect machine type
find the location ofan executable
Running command: sudo /usr/sbin/ip linkshow
Running command: sudo /usr/sbin/ip addrshow
IP addresses found:['192.168.1.103', '192.168.122.1', '192.168.92.102']
Resolvinghost node2
Monitornode2 at 192.168.92.102
making sure passwordless SSH succeeds
connected to host:node0
Running command: ssh -CT -o BatchMode=yesnode3
connection detectedneed for sudo
connected to host:node3
detect platforminformation from remote host
detect machine type
find the location ofan executable
Running command: sudo /usr/sbin/ip linkshow
Running command: sudo /usr/sbin/ip addrshow
IP addresses found:['192.168.122.1', '192.168.1.104', '192.168.92.103']
Resolvinghost node3
Monitornode3 at 192.168.92.103
Monitor initial members are ['node1', 'node2', 'node3']
Monitor addrs are ['192.168.92.101', '192.168.92.102', '192.168.92.103']
Creating a random mon key...
Writing monitor keyring to ceph.mon.keyring...
Writing initial config to ceph.conf...
$
View the generated files:
[ceph@admin-node my-cluster]$ ls
ceph.conf  ceph.log  ceph.mon.keyring
Inspect the Ceph configuration file; node1, node2, and node3 are now listed as the monitor nodes:
$ cat ceph.conf
fsid = 3c9892d0-398b-4808-aa20-4dc622356bd0
mon_initial_members = node1, node2, node3
mon_host = 192.168.92.101,192.168.92.102,192.168.92.103
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
filestore_xattr_use_omap = true
$
1.5.2.1 Changing the Number of Replicas
Change the default number of replicas to 2 by setting osd_pool_default_size = 2 in ceph.conf; if the line does not exist, add it (a sketch for applying and distributing the change follows the check below).
$ grep "osd_pool_default_size" ./ceph.conf
osd_pool_default_size = 2
$
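A minimal sketch for applying the change from the my-cluster directory, assuming ceph.conf still contains only the [global] section (so appending lands in the right place) and that node0 through node3 are the nodes that should receive the updated file:
# Append the setting if it is missing, then distribute ceph.conf to every node.
grep -q "^osd_pool_default_size" ceph.conf || echo "osd_pool_default_size = 2" >> ceph.conf
ceph-deploy --overwrite-conf config push node0 node1 node2 node3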
1.5.2.2 Handling Multiple Networks
If the nodes have more than one IP address, i.e. networks other than the one the Ceph cluster uses, for example:
eno16777736: 192.168.175.100
eno50332184: 192.168.92.110
virbr0:      192.168.122.1
then the public_network parameter must be added to the ceph.conf configuration file:
public_network = {ip-address}/{netmask}
For example:
public_network = 192.168.92.0/24
1.5.3 Installing Ceph
From the admin-node, use ceph-deploy to install Ceph on each node:
ceph-deploy install {ceph-node} [{ceph-node} ...]
For example:
$ ceph-deploy install node0 node1 node2 node3
$ ceph-deploy install node0 node1 node2 node3
found configuration file at: /home/ceph/.cephdeploy.conf
Invoked (1.5.34): /usr/bin/ceph-deploy install node0 node1 node2 node3
ceph-deploy options:
verbose :False
testing :None
cd_conf :<ceph_deploy.conf.cephdeploy.Conf instance at 0x2ae0560>
cluster :ceph
dev_commit :None
install_mds :False
stable : None
default_release :False
username :None
adjust_repos :True
func : <function install at 0x2a53668>
install_all :False
repo :False
host :['node0', 'node1', 'node2', 'node3']
install_rgw :False
install_tests :False
repo_url :None
ceph_conf : None
install_osd :False
version_kind :stable
install_common :False
overwrite_conf : False
quiet :False
dev :master
local_mirror :None
release : None
install_mon :False
gpg_url :None
Installing stable version jewel on cluster ceph hosts node0 node1 node2 node3
Detecting platform for host node0 ...
connection detectedneed for sudo
connected to host:node0
detect platforminformation from remote host
detect machine type
Distro info: CentOS Linux 7.2.1511 Core
installing Ceph on node0
Running command: sudo yum clean all
Loaded plugins:fastestmirror, langpacks, priorities
Cleaning repos: CephCeph-noarch base ceph-source epel extras updates
Cleaning upeverything
Cleaning up list offastest mirrors
Running command: sudo yum -y installepel-release
Loaded plugins:fastestmirror, langpacks, priorities
Determining fastestmirrors
* epel: mirror01.idc.hinet.net
25 packages excludeddue to repository priority protections
Packageepel-release-7-7.noarch already installed and latest version
Nothing to do
Running command: sudo yum -y installyum-plugin-priorities
Loaded plugins:fastestmirror, langpacks, priorities
Loading mirrorspeeds from cached hostfile
* epel: mirror01.idc.hinet.net
25 packages excludeddue to repository priority protections
Packageyum-plugin-priorities-1.1.31-34.el7.noarch already installed and latest version
Nothing to do
Configure Yumpriorities to include obsoletes
check_obsoletes hasbeen enabled for Yum priorities plugin
Running command: sudo rpm --import https://download.ceph.com/keys/release.asc
Running command: sudo rpm -Uvh --replacepkgs https://download.ceph.com/rpm-jewel/el7/noarch/ceph-release-1-0.el7.noarch.rpm
Retrieving https://download.ceph.com/rpm-jewel/el7/noarch/ceph-release-1-0.el7.noarch.rpm
Preparing... ########################################
Updating /installing...
ceph-release-1-1.el7 ########################################
ensuring that/etc/yum.repos.d/ceph.repo contains a high priority
altered ceph.repopriorities to contain: priority=1
Running command: sudo yum -y install ceph ceph-radosgw
Loaded plugins:fastestmirror, langpacks, priorities
Loading mirrorspeeds from cached hostfile
* epel: mirror01.idc.hinet.net
25 packages excludeddue to repository priority protections
Package1:ceph-10.2.2-0.el7.x86_64 already installed and latest version
Package1:ceph-radosgw-10.2.2-0.el7.x86_64 already installed and latest version
Nothing to do
Running command: sudo ceph --version
ceph version 10.2.2(45107e21c568dd033c2f0a3107dec8f0b0e58374)
….
1.5.3.1 Problem: No section: 'ceph'
Error log:
$ ceph-deploy install node0 node1 node2 node3
found configuration file at: /home/ceph/.cephdeploy.conf
…
ensuring that /etc/yum.repos.d/ceph.repo contains a high priority
RuntimeError: NoSectionError: No section: 'ceph'
$
Workaround:
yum remove ceph-release
Then run again:
$ ceph-deploy install node0 node1 node2 node3
Alternative cause and fix:
The Ceph installation may simply have timed out on the failing node; in that case install it manually by running the following on that node:
sudo yum -y install ceph ceph-radosgw
1.5.4 Initializing the Monitor Nodes
Initialize the monitor nodes and gather the keyrings:
$ ceph-deploy mon create-initial
$ ceph-deploy mon create-initial
found configuration file at: /home/ceph/.cephdeploy.conf
Invoked (1.5.34): /usr/bin/ceph-deploy mon create-initial
ceph-deploy options:
username :None
verbose : False
overwrite_conf :False
subcommand :create-initial
quiet :False
cd_conf :<ceph_deploy.conf.cephdeploy.Conf instance at 0x7fbe46804cb0>
cluster :ceph
func :<function mon at 0x7fbe467f6aa0>
ceph_conf :None
default_release :False
keyrings :None
Deployingmon, cluster ceph hosts node1 node2 node3
detectingplatform for host node1 ...
connection detectedneed for sudo
connected to host:node1
detect platforminformation from remote host
detect machine type
find the location ofan executable
distro info: CentOS Linux 7.2.1511 Core
determining ifprovided host has same hostname in remote
get remote shorthostname
deploying mon to node1
get remote shorthostname
remote hostname:node1
write clusterconfiguration to /etc/ceph/{cluster}.conf
create the mon pathif it does not exist
checking for donepath: /var/lib/ceph/mon/ceph-node1/done
done path does notexist: /var/lib/ceph/mon/ceph-node1/done
creating keyring file:/var/lib/ceph/tmp/ceph-node1.mon.keyring
create the monitorkeyring file
Running command: sudo ceph-mon --clusterceph --mkfs -i node1 --keyring /var/lib/ceph/tmp/ceph-node1.mon.keyring--setuser 1001 --setgroup 1001
ceph-mon:mon.noname-a 192.168.92.101:6789/0 is local, renaming to mon.node1
ceph-mon: set fsidto 4f8f6c46-9f67-4475-9cb5-52cafecb3e4c
ceph-mon: createdmonfs at /var/lib/ceph/mon/ceph-node1 for mon.node1
unlinking keyring file/var/lib/ceph/tmp/ceph-node1.mon.keyring
create a done fileto avoid re-doing the mon deployment
create the init pathif it does not exist
Running command: sudo systemctl enableceph.target
Running command: sudo systemctl enableceph-mon@node1
Created symlink from/etc/systemd/system/ceph-mon.target.wants/ceph-mon@node1.service to/usr/lib/systemd/system/ceph-mon@.service.
Running command: sudo systemctl startceph-mon@node1
Running command: sudo ceph --cluster=ceph--admin-daemon /var/run/ceph/ceph-mon.node1.asok mon_status
********************************************************************************
status for monitor:mon.node1
{
"election_epoch": 0,
"extra_probe_peers": [
"192.168.92.102:6789/0",
"192.168.92.103:6789/0"
],
"monmap": {
"created": "2016-06-2414:43:29.944474",
"epoch": 0,
"fsid":"4f8f6c46-9f67-4475-9cb5-52cafecb3e4c",
"modified": "2016-06-2414:43:29.944474",
"mons": [
{
"addr":"192.168.92.101:6789/0",
"name": "node1",
"rank": 0
},
{
"addr":"0.0.0.0:0/1",
"name": "node2",
"rank": 1
},
{
"addr":"0.0.0.0:0/2",
"name": "node3",
"rank": 2
}
]
},
"name": "node1",
"outside_quorum": [
"node1"
],
"quorum": [],
"rank": 0,
"state": "probing",
"sync_provider": []
}
********************************************************************************
monitor: mon.node1 is running
Running command: sudo ceph --cluster=ceph--admin-daemon /var/run/ceph/ceph-mon.node1.asok mon_status
detectingplatform for host node2 ...
connection detectedneed for sudo
connected to host:node2
detect platforminformation from remote host
detect machine type
find the location ofan executable
distro info: CentOS Linux 7.2.1511 Core
determining ifprovided host has same hostname in remote
get remote shorthostname
deploying mon tonode2
get remote shorthostname
remote hostname:node2
write clusterconfiguration to /etc/ceph/{cluster}.conf
create the mon pathif it does not exist
checking for donepath: /var/lib/ceph/mon/ceph-node2/done
done path does notexist: /var/lib/ceph/mon/ceph-node2/done
creating keyring file:/var/lib/ceph/tmp/ceph-node2.mon.keyring
create the monitorkeyring file
Running command: sudo ceph-mon --clusterceph --mkfs -i node2 --keyring /var/lib/ceph/tmp/ceph-node2.mon.keyring--setuser 1001 --setgroup 1001
ceph-mon:mon.noname-b 192.168.92.102:6789/0 is local, renaming to mon.node2
ceph-mon: set fsidto 4f8f6c46-9f67-4475-9cb5-52cafecb3e4c
ceph-mon: createdmonfs at /var/lib/ceph/mon/ceph-node2 for mon.node2
unlinking keyring file/var/lib/ceph/tmp/ceph-node2.mon.keyring
create a done fileto avoid re-doing the mon deployment
create the init pathif it does not exist
Running command: sudo systemctl enableceph.target
Running command: sudo systemctl enableceph-mon@node2
Created symlink from/etc/systemd/system/ceph-mon.target.wants/ceph-mon@node2.service to/usr/lib/systemd/system/ceph-mon@.service.
Running command: sudo systemctl startceph-mon@node2
Running command: sudo ceph --cluster=ceph--admin-daemon /var/run/ceph/ceph-mon.node2.asok mon_status
********************************************************************************
status for monitor:mon.node2
{
"election_epoch": 1,
"extra_probe_peers": [
"192.168.92.101:6789/0",
"192.168.92.103:6789/0"
],
"monmap": {
"created": "2016-06-2414:43:34.865908",
"epoch": 0,
"fsid":"4f8f6c46-9f67-4475-9cb5-52cafecb3e4c",
"modified": "2016-06-2414:43:34.865908",
"mons": [
{
"addr":"192.168.92.101:6789/0",
"name": "node1",
"rank": 0
},
{
"addr":"192.168.92.102:6789/0",
"name": "node2",
"rank": 1
},
{
"addr":"0.0.0.0:0/2",
"name": "node3",
"rank": 2
}
]
},
"name": "node2",
"outside_quorum": [],
"quorum": [],
"rank": 1,
"state": "electing",
"sync_provider": []
}
********************************************************************************
monitor: mon.node2 is running
Running command: sudo ceph --cluster=ceph--admin-daemon /var/run/ceph/ceph-mon.node2.asok mon_status
detectingplatform for host node3 ...
connection detectedneed for sudo
connected to host:node3
detect platforminformation from remote host
detect machine type
find the location ofan executable
distro info: CentOS Linux 7.2.1511 Core
determining ifprovided host has same hostname in remote
get remote shorthostname
deploying mon tonode3
get remote shorthostname
remote hostname:node3
write clusterconfiguration to /etc/ceph/{cluster}.conf
create the mon pathif it does not exist
checking for donepath: /var/lib/ceph/mon/ceph-node3/done
done path does notexist: /var/lib/ceph/mon/ceph-node3/done
creating keyring file:/var/lib/ceph/tmp/ceph-node3.mon.keyring
create the monitorkeyring file
Running command: sudo ceph-mon --clusterceph --mkfs -i node3 --keyring /var/lib/ceph/tmp/ceph-node3.mon.keyring--setuser 1001 --setgroup 1001
ceph-mon:mon.noname-c 192.168.92.103:6789/0 is local, renaming to mon.node3
ceph-mon: set fsidto 4f8f6c46-9f67-4475-9cb5-52cafecb3e4c
ceph-mon: createdmonfs at /var/lib/ceph/mon/ceph-node3 for mon.node3
unlinking keyring file/var/lib/ceph/tmp/ceph-node3.mon.keyring
create a done fileto avoid re-doing the mon deployment
create the init pathif it does not exist
Running command: sudo systemctl enableceph.target
Running command: sudo systemctl enableceph-mon@node3
Created symlink from/etc/systemd/system/ceph-mon.target.wants/ceph-mon@node3.service to/usr/lib/systemd/system/ceph-mon@.service.
Running command: sudo systemctl startceph-mon@node3
Running command: sudo ceph --cluster=ceph--admin-daemon /var/run/ceph/ceph-mon.node3.asok mon_status
********************************************************************************
status for monitor:mon.node3
{
"election_epoch": 1,
"extra_probe_peers": [
"192.168.92.101:6789/0",
"192.168.92.102:6789/0"
],
"monmap": {
"created": "2016-06-2414:43:39.800046",
"epoch": 0,
"fsid":"4f8f6c46-9f67-4475-9cb5-52cafecb3e4c",
"modified": "2016-06-2414:43:39.800046",
"mons": [
{
"addr":"192.168.92.101:6789/0",
"name": "node1",
"rank": 0
},
{
"addr":"192.168.92.102:6789/0",
"name": "node2",
"rank": 1
},
{
"addr":"192.168.92.103:6789/0",
"name": "node3",
"rank": 2
}
]
},
"name": "node3",
"outside_quorum": [],
"quorum": [],
"rank": 2,
"state": "electing",
"sync_provider": []
}
********************************************************************************
monitor: mon.node3 is running
Running command: sudo ceph --cluster=ceph--admin-daemon /var/run/ceph/ceph-mon.node3.asok mon_status
processing monitor mon.node1
connection detectedneed for sudo
connected to host:node1
detect platforminformation from remote host
detect machine type
find the location ofan executable
Running command: sudo ceph --cluster=ceph--admin-daemon /var/run/ceph/ceph-mon.node1.asok mon_status
mon.node1monitor is not yet in quorum, tries left: 5
waiting 5seconds before retrying
Running command: sudo ceph --cluster=ceph--admin-daemon /var/run/ceph/ceph-mon.node1.asok mon_status
mon.node1monitor is not yet in quorum, tries left: 4
waiting 10seconds before retrying
Running command: sudo ceph --cluster=ceph--admin-daemon /var/run/ceph/ceph-mon.node1.asok mon_status
mon.node1 monitor has reached quorum!
processing monitor mon.node2
connection detectedneed for sudo
connected to host:node2
detect platforminformation from remote host
detect machine type
find the location ofan executable
Running command: sudo ceph --cluster=ceph--admin-daemon /var/run/ceph/ceph-mon.node2.asok mon_status
mon.node2 monitor has reached quorum!
processing monitor mon.node3
connection detectedneed for sudo
connected to host:node3
detect platforminformation from remote host
detect machine type
find the location ofan executable
Running command: sudo ceph --cluster=ceph--admin-daemon /var/run/ceph/ceph-mon.node3.asok mon_status
mon.node3 monitor has reached quorum!
all initial monitors are running and have formed quorum
Running gatherkeys...
Storing keys in temp directory /tmp/tmp5_jcSr
connection detectedneed for sudo
connected to host:node1
detect platforminformation from remote host
detect machine type
get remote shorthostname
fetch remote file
Running command: sudo /usr/bin/ceph--connect-timeout=25 --cluster=ceph--admin-daemon=/var/run/ceph/ceph-mon.node1.asok mon_status
Running command: sudo /usr/bin/ceph--connect-timeout=25 --cluster=ceph --name mon.--keyring=/var/lib/ceph/mon/ceph-node1/keyring auth get-or-create client.adminosdallow * mds allow * mon allow *
Running command: sudo /usr/bin/ceph--connect-timeout=25 --cluster=ceph --name mon.--keyring=/var/lib/ceph/mon/ceph-node1/keyring auth get-or-createclient.bootstrap-mdsmon allow profile bootstrap-mds
Running command: sudo /usr/bin/ceph--connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-node1/keyringauth get-or-create client.bootstrap-osdmon allow profile bootstrap-osd
Running command: sudo /usr/bin/ceph--connect-timeout=25 --cluster=ceph --name mon.--keyring=/var/lib/ceph/mon/ceph-node1/keyring auth get-or-createclient.bootstrap-rgwmon allow profile bootstrap-rgw
Storing ceph.client.admin.keyring
Storing ceph.bootstrap-mds.keyring
keyring 'ceph.mon.keyring' already exists
Storing ceph.bootstrap-osd.keyring
Storing ceph.bootstrap-rgw.keyring
Destroy temp directory /tmp/tmp5_jcSr
$
View the generated files (a sketch for distributing the admin keyring follows the listing):
$ ls
ceph.bootstrap-mds.keyring  ceph.bootstrap-rgw.keyring  ceph.conf             ceph.mon.keyring
ceph.bootstrap-osd.keyring  ceph.client.admin.keyring   ceph-deploy-ceph.log
$
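A step that is common at this point, though not shown in the original transcript: push the configuration and the admin keyring to the nodes so the ceph CLI can be used on them directly. A sketch, assuming node0 through node3 should all be able to run admin commands:
ceph-deploy admin node0 node1 node2 node3
# On each node, make the admin keyring readable so "ceph -s" works for the ceph user:
sudo chmod +r /etc/ceph/ceph.client.admin.keyring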
Chapter 2: OSD Operations
2.1 Adding OSDs
2.1.1 Initializing the OSDs
Command:
ceph-deploy osd prepare {ceph-node}:/path/to/directory
Example, using the directories planned in section 1.2.3 (a note on pre-creating these directories follows the transcript):
$ ceph-deploy osd prepare node2:/var/local/osd0 node3:/var/local/osd0
found configuration file at: /home/ceph/.cephdeploy.conf
Invoked (1.5.34): /usr/bin/ceph-deploy osd prepare node2:/var/local/osd0 node3:/var/local/osd0
ceph-deploy options:
username :None
disk : [('node2', '/var/local/osd0',None), ('node3', '/var/local/osd0', None)]
dmcrypt :False
verbose :False
bluestore : None
overwrite_conf :False
subcommand :prepare
dmcrypt_key_dir :/etc/ceph/dmcrypt-keys
quiet :False
cd_conf :<ceph_deploy.conf.cephdeploy.Conf instance at 0x12dddd0>
cluster :ceph
fs_type : xfs
func :<function osd at 0x12d2398>
ceph_conf :None
default_release :False
zap_disk :False
Preparingcluster ceph disks node2:/var/local/osd0: node3:/var/local/osd0:
connection detectedneed for sudo
connected to host:node2
detect platforminformation from remote host
detect machine type
find the location ofan executable
Distro info: CentOS Linux 7.2.1511 Core
Deployingosd to node2
write clusterconfiguration to /etc/ceph/{cluster}.conf
Preparinghost node2 disk /var/local/osd0 journal None activate False
find the location ofan executable
Running command: sudo /usr/sbin/ceph-disk-v prepare --cluster ceph --fs-type xfs -- /var/local/osd0
command: Runningcommand: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
command: Runningcommand: /usr/bin/ceph-osd --check-allows-journal -i 0 --cluster ceph
command: Runningcommand: /usr/bin/ceph-osd --check-wants-journal -i 0 --cluster ceph
command: Runningcommand: /usr/bin/ceph-osd --check-needs-journal -i 0 --cluster ceph
command: Runningcommand: /usr/bin/ceph-osd --cluster=ceph --show-config-value=osd_journal_size
populate_data_path:Preparing osd data dir /var/local/osd0
command: Runningcommand: /sbin/restorecon -R /var/local/osd0/ceph_fsid.3504.tmp
command: Runningcommand: /usr/bin/chown -R ceph:ceph /var/local/osd0/ceph_fsid.3504.tmp
command: Runningcommand: /sbin/restorecon -R /var/local/osd0/fsid.3504.tmp
command: Runningcommand: /usr/bin/chown -R ceph:ceph /var/local/osd0/fsid.3504.tmp
command: Runningcommand: /sbin/restorecon -R /var/local/osd0/magic.3504.tmp
command: Runningcommand: /usr/bin/chown -R ceph:ceph /var/local/osd0/magic.3504.tmp
checking OSD status...
find the location ofan executable
Running command: sudo /bin/ceph--cluster=ceph osd stat --format=json
Host node2 is now ready for osd use.
connection detectedneed for sudo
connected to host:node3
detect platforminformation from remote host
detect machine type
find the location ofan executable
Distro info: CentOS Linux 7.2.1511 Core
Deployingosd to node3
write clusterconfiguration to /etc/ceph/{cluster}.conf
Preparinghost node3 disk /var/local/osd0 journal None activate False
find the location ofan executable
Running command: sudo /usr/sbin/ceph-disk-v prepare --cluster ceph --fs-type xfs -- /var/local/osd0
command: Runningcommand: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
command: Runningcommand: /usr/bin/ceph-osd --check-allows-journal -i 0 --cluster ceph
command: Runningcommand: /usr/bin/ceph-osd --check-wants-journal -i 0 --cluster ceph
command: Runningcommand: /usr/bin/ceph-osd --check-needs-journal -i 0 --cluster ceph
command: Runningcommand: /usr/bin/ceph-osd --cluster=ceph --show-config-value=osd_journal_size
populate_data_path:Preparing osd data dir /var/local/osd0
command: Runningcommand: /sbin/restorecon -R /var/local/osd0/ceph_fsid.3553.tmp
command: Runningcommand: /usr/bin/chown -R ceph:ceph /var/local/osd0/ceph_fsid.3553.tmp
command: Runningcommand: /sbin/restorecon -R /var/local/osd0/fsid.3553.tmp
command: Runningcommand: /usr/bin/chown -R ceph:ceph /var/local/osd0/fsid.3553.tmp
command: Runningcommand: /sbin/restorecon -R /var/local/osd0/magic.3553.tmp
command: Runningcommand: /usr/bin/chown -R ceph:ceph /var/local/osd0/magic.3553.tmp
checking OSD status...
find the location ofan executable
Running command: sudo /bin/ceph--cluster=ceph osd stat --format=json
Host node3 is now ready for osd use.
$
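One prerequisite is worth calling out because it does not appear in the transcript above: when a directory rather than a raw disk is handed to osd prepare, the directory must already exist on the target node. A sketch matching the OSD plan in 1.2.3:
ssh node2 "sudo mkdir -p /var/local/osd0"
ssh node3 "sudo mkdir -p /var/local/osd0"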
2.1.2 Activating the OSDs
Command:
ceph-deploy osd activate {ceph-node}:/path/to/directory
Example:
$ ceph-deploy osd activate node2:/var/local/osd0 node3:/var/local/osd0
Check the cluster status:
$ ceph -s
    cluster 4f8f6c46-9f67-4475-9cb5-52cafecb3e4c
     health HEALTH_WARN
            64 pgs degraded
            64 pgs stuck unclean
            64 pgs undersized
            mon.node2 low disk space
            mon.node3 low disk space
     monmap e1: 3 mons at {node1=192.168.92.101:6789/0,node2=192.168.92.102:6789/0,node3=192.168.92.103:6789/0}
            election epoch 18, quorum 0,1,2 node1,node2,node3
     osdmap e12: 3 osds: 3 up, 3 in
            flags sortbitwise
      pgmap v173: 64 pgs, 1 pools, 0 bytes data, 0 objects
            20254 MB used, 22120 MB / 42374 MB avail
                  64 active+undersized+degraded
$
2.1.2.1 Error: creating empty object store in *: (13) Permission denied
Error log:
ceph_disk.main.Error: Error: ['ceph-osd', '--cluster', 'ceph', '--mkfs', '--mkkey', '-i', '0', '--monmap', '/var/local/osd0/activate.monmap', '--osd-data', '/var/local/osd0', '--osd-journal', '/var/local/osd0/journal', '--osd-uuid', '76f06d28-7e0d-4894-8625-4f55d43962bf', '--keyring', '/var/local/osd0/keyring', '--setuser', 'ceph', '--setgroup', 'ceph'] failed : 2016-06-24 15:31:39.931825 7fd1150c1800 -1 filestore(/var/local/osd0) mkfs: write_version_stamp() failed: (13) Permission denied
2016-06-24 15:31:39.931861 7fd1150c1800 -1 OSD::mkfs: ObjectStore::mkfs failed with error -13
2016-06-24 15:31:39.932024 7fd1150c1800 -1 ** ERROR: error creating empty object store in /var/local/osd0: (13) Permission denied
RuntimeError: command returned non-zero exit status: 1
RuntimeError: Failed to execute command: /usr/sbin/ceph-disk -v activate --mark-init systemd --mount /var/local/osd0
Solution:
The fix is simple: change the owner and group of every disk or directory the Ceph cluster uses to the ceph user:
chown ceph:ceph /var/local/osd0
$ ssh node2 "sudo chown ceph:ceph /var/local/osd0"
$ ssh node3 "sudo chown ceph:ceph /var/local/osd0"
Follow-up issue:
Even after this fix, the disk permissions revert after a system reboot and the OSD service fails to start. This permission problem is a real trap, so the following for loop was added to rc.local to reset the disk permissions automatically at every boot (a udev-based alternative is sketched below):
for i in a b c d e f g h i j k l; do chown ceph:ceph /dev/sd"$i"*; done
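A udev rule is arguably a cleaner way to get the same effect, because it runs whenever the device node is created rather than only at boot. A minimal sketch; the rule file name and the assumption that the OSD disks are /dev/sdb and /dev/sdc are specific to this example:
# /etc/udev/rules.d/99-ceph-osd-perms.rules
KERNEL=="sd[bc]*", OWNER="ceph", GROUP="ceph", MODE="0660"
Reload the rules afterwards:
sudo udevadm control --reload-rules && sudo udevadm trigger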
2.1.3 Example: /dev/sdb
2.1.3.1 Preparation
$ ceph-deploy osd prepare node1:/dev/sdb
found configuration file at: /home/ceph/.cephdeploy.conf
Invoked (1.5.34): /usr/bin/ceph-deploy osd prepare node1:/dev/sdb
ceph-deploy options:
username :None
disk :[('node1', '/dev/sdb', None)]
dmcrypt :False
verbose :False
bluestore :None
overwrite_conf :False
subcommand :prepare
dmcrypt_key_dir :/etc/ceph/dmcrypt-keys
quiet : False
cd_conf :<ceph_deploy.conf.cephdeploy.Conf instance at 0x1acfdd0>
cluster :ceph
fs_type : xfs
func :<function osd at 0x1ac4398>
ceph_conf :None
default_release :False
zap_disk :False
Preparingcluster ceph disks node1:/dev/sdb:
connection detectedneed for sudo
connected to host:node1
detect platforminformation from remote host
detect machine type
find the location ofan executable
Distro info: CentOS Linux 7.2.1511 Core
Deployingosd to node1
write clusterconfiguration to /etc/ceph/{cluster}.conf
Preparinghost node1 disk /dev/sdb journal None activate False
find the location ofan executable
Running command: sudo /usr/sbin/ceph-disk-v prepare --cluster ceph --fs-type xfs -- /dev/sdb
command: Runningcommand: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
command: Runningcommand: /usr/bin/ceph-osd --check-allows-journal -i 0 --cluster ceph
command: Runningcommand: /usr/bin/ceph-osd --check-wants-journal -i 0 --cluster ceph
command: Runningcommand: /usr/bin/ceph-osd --check-needs-journal -i 0 --cluster ceph
get_dm_uuid:get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
set_type: Willcolocate journal with data on /dev/sdb
command: Runningcommand: /usr/bin/ceph-osd --cluster=ceph --show-config-value=osd_journal_size
get_dm_uuid:get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
get_dm_uuid:get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
get_dm_uuid:get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
command: Runningcommand: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookuposd_mkfs_options_xfs
command: Runningcommand: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookuposd_fs_mkfs_options_xfs
command: Runningcommand: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookuposd_mount_options_xfs
command: Runningcommand: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookuposd_fs_mount_options_xfs
get_dm_uuid:get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
get_dm_uuid:get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
ptype_tobe_for_name:name = journal
get_dm_uuid:get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
create_partition:Creating journal partition num 2 size 5120 on /dev/sdb
command_check_call:Running command: /sbin/sgdisk --new=2:0:+5120M --change-name=2:ceph journal--partition-guid=2:ddc560cc-f7b8-40fb-8f19-006ae2ef03a2 --typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106--mbrtogpt-- /dev/sdb
Creating new GPTentries.
The operation hascompleted successfully.
update_partition:Calling partprobe on created device /dev/sdb
command_check_call:Running command: /usr/bin/udevadm settle --timeout=600
command: Runningcommand: /sbin/partprobe /dev/sdb
command_check_call:Running command: /usr/bin/udevadm settle --timeout=600
get_dm_uuid:get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
get_dm_uuid:get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
get_dm_uuid:get_dm_uuid /dev/sdb2 uuid path is /sys/dev/block/8:18/dm/uuid
prepare_device:Journal is GPT partition/dev/disk/by-partuuid/ddc560cc-f7b8-40fb-8f19-006ae2ef03a2
prepare_device:Journal is GPT partition/dev/disk/by-partuuid/ddc560cc-f7b8-40fb-8f19-006ae2ef03a2
get_dm_uuid: get_dm_uuid/dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
set_data_partition:Creating osd partition on /dev/sdb
get_dm_uuid:get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
ptype_tobe_for_name:name = data
get_dm_uuid:get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
create_partition:Creating data partition num 1 size 0 on /dev/sdb
command_check_call:Running command: /sbin/sgdisk --largest-new=1 --change-name=1:ceph data--partition-guid=1:805bfdb4-97b8-48e7-a42e-a734a47aa533--typecode=1:89c57f98-2fe5-4dc0-89c1-f3ad0ceff2be--mbrtogpt -- /dev/sdb
The operation hascompleted successfully.
update_partition:Calling partprobe on created device /dev/sdb
command_check_call:Running command: /usr/bin/udevadm settle --timeout=600
command: Runningcommand: /sbin/partprobe /dev/sdb
command_check_call:Running command: /usr/bin/udevadm settle --timeout=600
get_dm_uuid:get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
get_dm_uuid:get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
get_dm_uuid:get_dm_uuid /dev/sdb1 uuid path is /sys/dev/block/8:17/dm/uuid
populate_data_path_device: Creating xfs fs on /dev/sdb1
command_check_call:Running command: /sbin/mkfs -t xfs -f -i size=2048 -- /dev/sdb1
meta-data=/dev/sdb1 isize=2048 agcount=4,agsize=982975 blks
= sectsz=512 attr=2, projid32bit=1
= crc=0 finobt=0
data = bsize=4096 blocks=3931899, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0 ftype=0
log =internal log bsize=4096 blocks=2560, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
mount: Mounting/dev/sdb1 on /var/lib/ceph/tmp/mnt.9sdF7v with options noatime,inode64
command_check_call:Running command: /usr/bin/mount -t xfs -o noatime,inode64 -- /dev/sdb1/var/lib/ceph/tmp/mnt.9sdF7v
command: Running command:/sbin/restorecon /var/lib/ceph/tmp/mnt.9sdF7v
populate_data_path:Preparing osd data dir /var/lib/ceph/tmp/mnt.9sdF7v
command: Runningcommand: /sbin/restorecon -R /var/lib/ceph/tmp/mnt.9sdF7v/ceph_fsid.5102.tmp
command: Runningcommand: /usr/bin/chown -R ceph:ceph/var/lib/ceph/tmp/mnt.9sdF7v/ceph_fsid.5102.tmp
command: Runningcommand: /sbin/restorecon -R /var/lib/ceph/tmp/mnt.9sdF7v/fsid.5102.tmp
command: Runningcommand: /usr/bin/chown -R ceph:ceph /var/lib/ceph/tmp/mnt.9sdF7v/fsid.5102.tmp
command: Runningcommand: /sbin/restorecon -R /var/lib/ceph/tmp/mnt.9sdF7v/magic.5102.tmp
command: Runningcommand: /usr/bin/chown -R ceph:ceph /var/lib/ceph/tmp/mnt.9sdF7v/magic.5102.tmp
command: Runningcommand: /sbin/restorecon -R /var/lib/ceph/tmp/mnt.9sdF7v/journal_uuid.5102.tmp
command: Runningcommand: /usr/bin/chown -R ceph:ceph /var/lib/ceph/tmp/mnt.9sdF7v/journal_uuid.5102.tmp
adjust_symlink:Creating symlink /var/lib/ceph/tmp/mnt.9sdF7v/journal ->/dev/disk/by-partuuid/ddc560cc-f7b8-40fb-8f19-006ae2ef03a2
command: Runningcommand: /sbin/restorecon -R /var/lib/ceph/tmp/mnt.9sdF7v
command: Runningcommand: /usr/bin/chown -R ceph:ceph /var/lib/ceph/tmp/mnt.9sdF7v
unmount: Unmounting/var/lib/ceph/tmp/mnt.9sdF7v
command_check_call:Running command: /bin/umount -- /var/lib/ceph/tmp/mnt.9sdF7v
get_dm_uuid:get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
command_check_call:Running command: /sbin/sgdisk --typecode=1:4fbd7e29-9d25-41b8-afd0-062c0ceff05d-- /dev/sdb
Warning: The kernelis still using the old partition table.
The new table willbe used at the next reboot.
The operation hascompleted successfully.
update_partition:Calling partprobe on prepared device /dev/sdb
command_check_call:Running command: /usr/bin/udevadm settle --timeout=600
command: Runningcommand: /sbin/partprobe /dev/sdb
command_check_call:Running command: /usr/bin/udevadm settle --timeout=600
command_check_call:Running command: /usr/bin/udevadm trigger --action=add --sysname-match sdb1
checking OSD status...
find the location ofan executable
Running command: sudo /bin/ceph--cluster=ceph osd stat --format=json
there is 1 OSD down
there is 1 OSD out
Host node1 is now ready for osd use.
$
2.1.3.2 Activation
2.1.4 Cannot discover filesystem type
will use init type: systemd
find the location of an executable
Running command: sudo /usr/sbin/ceph-disk -v activate --mark-init systemd --mount /dev/sdb
main_activate: path = /dev/sdb
get_dm_uuid: get_dm_uuid /dev/sdb uuid path is /sys/dev/block/8:16/dm/uuid
command: Running command: /sbin/blkid -p -s TYPE -o value -- /dev/sdb
Traceback (most recent call last):
  File "/usr/sbin/ceph-disk", line 9, in <module>
    load_entry_point('ceph-disk==1.0.0', 'console_scripts', 'ceph-disk')()
  File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 4994, in run
    main(sys.argv)
  File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 4945, in main
    args.func(args)
  File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3299, in main_activate
    reactivate=args.reactivate,
  File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3009, in mount_activate
    e,
ceph_disk.main.FilesystemTypeError: Cannot discover filesystem type: device /dev/sdb: Line is truncated:
RuntimeError: command returned non-zero exit status: 1
RuntimeError: Failed to execute command: /usr/sbin/ceph-disk -v activate --mark-init systemd --mount /dev/sdb
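The activate command above was pointed at the whole device, while prepare created a separate data partition on it. Activating the data partition instead usually resolves this error; a sketch, assuming the data partition came up as /dev/sdb1 as in the prepare log above:
ceph-deploy osd activate node1:/dev/sdb1
# If the kernel is still using the old partition table (see the warning in the prepare log),
# re-read it or reboot node1 first:
# ssh node1 "sudo partprobe /dev/sdb"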
2.2 Removing an OSD
# ceph auth del osd.5
updated
# ceph osd rm 5
removed osd.5
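The two commands above only delete the OSD's authentication key and its id. A fuller removal sequence is sketched here for osd.5; the id and the hostname holding it are examples only:
ceph osd out 5                                     # stop placing data on the OSD; rebalancing starts
ssh {osd-host} "sudo systemctl stop ceph-osd@5"    # stop the daemon on the node that hosts it
ceph osd crush remove osd.5                        # remove it from the CRUSH map
ceph auth del osd.5                                # delete its authentication key
ceph osd rm 5                                      # remove the OSD from the cluster map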