我爱小虾 posted on 2019-2-2 10:08:35

Handling Ceph clock skew faults

  Symptoms of the clock skew fault:
  # ceph -w
  cluster b516386f-cb9d-49d5-bf48-07f0dac29e97
  health HEALTH_WARN 192 pgs degraded; 192 pgs stuck unclean
  monmap e1: 3 mons at {node1=10.240.217.101:6789/0,node4=10.240.217.104:6789/0,node5=10.240.217.105:6789/0}, election epoch 18, quorum 0,1,2 node1,node4,node5
  osdmap e63: 3 osds: 2 up, 2 in
  pgmap v249: 192 pgs, 3 pools, 0 bytes data, 0 objects
  10314 MB used, 2063 GB / 2073 GB avail
  192 active+degraded
  

  2014-06-19 10:46:24.736860 mon.0 mon.1 10.240.217.104:6789/0 clock skew 0.060021s > max 0.05s
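
To see how far a monitor is over the limit, the warning line can be split into its parts. The following is only a minimal sketch: `parse_clock_skew` is a hypothetical helper name, and it assumes the log line has the same field layout as the one shown above (monitor name, address, then "clock skew <value>s > max <value>s"). Other Ceph releases may format the line differently.

```shell
# Print "<monitor> <skew> <max>" for a clock skew log line given as $1.
# Prints nothing if the line is not a clock skew message.
# Assumption: the field layout matches the log line quoted in this post.
parse_clock_skew() {
    printf '%s\n' "$1" | awk '
        {
            for (i = 3; i + 5 <= NF; i++) {
                if ($i == "clock" && $(i + 1) == "skew") {
                    skew = $(i + 2); max = $(i + 5)
                    sub(/s$/, "", skew)   # drop the trailing unit "s"
                    sub(/s$/, "", max)
                    print $(i - 2), skew, max
                    exit
                }
            }
        }'
}
```

Called with the log line from this post as its single argument, the function prints the monitor name followed by the measured skew and the allowed maximum, which makes it easy to see that the skew here is only about 0.01s over the limit.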
  

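  How to fix the problem above:
  Ceph's default allowed clock drift between monitors is 0.05s. Because this threshold is so tight, the clock offsets between cluster nodes end up larger than 0.05s. To fix this,
  edit ceph.conf on each monitor node and add the following setting to the configuration file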
  # vi /etc/ceph/ceph.conf
  
  mon clock drift allowed = .50
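
Before restarting, it can help to confirm that every monitor node really carries the new value. This is a rough sketch only: `get_drift_allowed` is a hypothetical helper, and it assumes the option is written on a single line using either spaces or underscores in its name. It does not check which section ([global] or [mon]) the line sits in.

```shell
# Print the value of "mon clock drift allowed" found in the config file $1.
# If the option appears more than once, the last occurrence is printed.
# Assumption: the option is on one line, spelled with spaces or underscores.
get_drift_allowed() {
    sed -n 's/^[[:space:]]*mon[ _]clock[ _]drift[ _]allowed[[:space:]]*=[[:space:]]*\([0-9.]*\).*/\1/p' "$1" | tail -n 1
}
```

Running `get_drift_allowed /etc/ceph/ceph.conf` on each monitor node should print the same value everywhere; empty output means the option was not found in that file.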
  

  After the change, restart the ceph processes
  # service ceph restart
  === mon.node1 ===
  === mon.node1 ===
  Stopping Ceph mon.node1 on node1...kill 4723...done
  === mon.node1 ===
  Starting Ceph mon.node1 on node1...
  Starting ceph-create-keys on node1...
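
Once the monitors are back, it is worth checking whether the warning has actually gone away. The sketch below reads health text on standard input and only looks for the phrase "clock skew". `has_clock_skew` and `report_clock_skew` are hypothetical helper names, and the exact wording of Ceph health messages may differ between releases, so treat the match as an assumption.

```shell
# Read health text on stdin; exit status 0 if a clock skew warning is present.
# Assumption: the warning contains the phrase "clock skew".
has_clock_skew() {
    grep -qi 'clock skew'
}

# Print a one-line verdict for the health text on stdin.
report_clock_skew() {
    if has_clock_skew; then
        echo "clock skew still reported"
    else
        echo "no clock skew reported"
    fi
}
```

A typical use would be `ceph health detail | report_clock_skew`. If the skew is still reported after the restart, the node clocks themselves probably need to be brought back in sync rather than the threshold raised further.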
  

  For a more detailed treatment, see the official documentation
  http://ceph.com/docs/master/rados/configuration/mon-config-ref/#monitor-store-synchronization


