haixin3036 posted on 2018-10-25 12:53:32

[MongoDB] Automatic failover in a replica set

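  A replica set provides automatic failover, and MongoDB handles the whole process itself. It chooses the new primary according to the secondaries' priority or data freshness (that is, the node that most recently synced data from the primary). When the old primary comes back up, it becomes a secondary and receives the oplog from the new primary.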
http://blog.itpub.net/attachments/2011/10/22664653_201110312221291.jpg
  A complete replica set
http://blog.itpub.net/attachments/2011/10/22664653_201110312221371.jpg
  The primary goes down
http://blog.itpub.net/attachments/2011/10/22664653_201110312221431.jpg
  MongoDB chooses the next primary based on data freshness
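
To make the selection rule above concrete, here is a minimal Python sketch. It is a simplified model, not MongoDB's actual election protocol (which also involves voting and vetoes between members). The `Member` class, the `pick_primary` function, and the optime values are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Member:
    host: str
    priority: float  # configured priority; 0 means never eligible
    optime: int      # position of the last applied oplog entry (made-up units)
    up: bool = True


def pick_primary(members):
    """Return the host of the preferred candidate, or None if there is none."""
    candidates = [m for m in members if m.up and m.priority > 0]
    if not candidates:
        return None
    # Prefer higher priority; break ties with the freshest data.
    best = max(candidates, key=lambda m: (m.priority, m.optime))
    return best.host


members = [
    Member("10.250.7.220:27018", 1, 100, up=False),  # the failed primary
    Member("10.250.7.220:27019", 1, 98),
    Member("10.250.7.220:27020", 1, 100),
]
print(pick_primary(members))
```

With equal priorities, the member with the most recent data is preferred, which matches the outcome shown later in this post.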
  
  Continuing from the previous post, where the replica set was set up, check the state of the two servers on ports 27018 and 27020:
  $ ./mongo 127.0.0.1:27018
  MongoDB shell version: 2.0.1
  connecting to: 127.0.0.1:27018/test
  PRIMARY> db.isMaster();
  {
  "setName" : "myset",
  "ismaster" : true, -- this node is the primary
  "secondary" : false,
  "hosts" : [
  "10.250.7.220:27018",
  "10.250.7.220:27020",
  "10.250.7.220:27019"
  ],
  "primary" : "10.250.7.220:27018",
  "me" : "10.250.7.220:27018",
  "maxBsonObjectSize" : 16777216,
  "ok" : 1
  }
  PRIMARY> exit
  bye
  $ ./mongo 127.0.0.1:27020
  MongoDB shell version: 2.0.1
  connecting to: 127.0.0.1:27020/test
  SECONDARY>
  SECONDARY> db.isMaster();
  {
  "setName" : "myset",
  "ismaster" : false,
  "secondary" : true, -- this node is a secondary
  "hosts" : [
  "10.250.7.220:27020",
  "10.250.7.220:27019",
  "10.250.7.220:27018"
  ],
  "primary" : "10.250.7.220:27018",
  "me" : "10.250.7.220:27020",
  "maxBsonObjectSize" : 16777216,
  "ok" : 1
  }
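
The two replies above can be read programmatically as well. The following Python sketch classifies a member from the `ismaster` and `secondary` fields of an isMaster reply. The function name `describe_role` is hypothetical, and the reply is a hand-copied dict rather than the result of a live query.

```python
def describe_role(reply):
    """Classify a member from the fields of an isMaster reply."""
    if reply.get("ismaster"):
        return "primary"
    if reply.get("secondary"):
        return "secondary"
    return "other"  # for example, still starting up or recovering


# Fields copied by hand from the reply of the server on port 27020.
reply_27020 = {
    "setName": "myset",
    "ismaster": False,
    "secondary": True,
    "primary": "10.250.7.220:27018",
    "me": "10.250.7.220:27020",
}
print(describe_role(reply_27020), "-> current primary is", reply_27020["primary"])
```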
  Manually kill the primary:
  # ps -ef | grep 27018
  mongodb  14826 14794  1 20:24 pts/4    00:00:05 ./mongod --dbpath /opt/mongodata/r1 --port 27018 --replSet myset --rest
  mongodb  14999 14430  0 20:28 pts/2    00:00:00 ./mongo 127.0.0.1:27018
  # kill -9 14826 14794
  # ps -ef | grep mongodb |grep -v root
  mongodb  14883 14853  1 20:26 pts/7    00:00:05 ./mongod --dbpath /opt/mongodata/r2 --port 27019 --replSet myset --rest
  mongodb  14901 14548  1 20:27 pts/6    00:00:07 ./mongod --dbpath /opt/mongodata/r3 --port 27020 --replSet myset --rest
  mongodb  14999 14430  0 20:28 pts/2    00:00:00 ./mongo 127.0.0.1:27018
  mongodb  15102 15072  0 20:30 pts/5    00:00:00 ./mongo 127.0.0.1:27019
  mongodb  15136 15106  0 20:30 pts/8    00:00:00 ./mongo 127.0.0.1:27020
  #
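
Picking the PID out of the `ps -ef` listing by eye is easy to get wrong. A small Python sketch of the same lookup follows. It assumes the usual `ps -ef` column order (UID PID PPID C STIME TTY TIME CMD) with whitespace between columns. The function name `find_mongod_pid` is hypothetical, and the sample lines are typed in by hand rather than captured from a live system.

```python
def find_mongod_pid(ps_lines, port):
    """Return the PID of the mongod started with --port <port>, or None."""
    for line in ps_lines:
        fields = line.split()
        # ps -ef columns: UID PID PPID C STIME TTY TIME CMD...
        if len(fields) < 8:
            continue
        cmd = fields[7:]
        if not cmd[0].endswith("mongod"):
            continue  # skip mongo shell clients and the grep itself
        if "--port" in cmd:
            idx = cmd.index("--port")
            if idx + 1 < len(cmd) and cmd[idx + 1] == str(port):
                return int(fields[1])
    return None


sample = [
    "mongodb 14826 14794 1 20:24 pts/4 00:00:05 ./mongod --dbpath /opt/mongodata/r1 --port 27018 --replSet myset --rest",
    "mongodb 14999 14430 0 20:28 pts/2 00:00:00 ./mongo 127.0.0.1:27018",
]
print(find_mongod_pid(sample, 27018))
```

Matching on the command name keeps the mongo shell client connected to the same port from being mistaken for the server.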
  The log of the mongod on port 27019 records the election of 10.250.7.220:27020 as the new primary:
  Mon Oct 31 20:27:59 allocating new datafile /opt/mongodata/r2/local.2, filling with zeroes...
  Mon Oct 31 20:27:59 replSet info member 10.250.7.220:27018 is up
  Mon Oct 31 20:27:59 replSet member 10.250.7.220:27018 is now in state SECONDARY
  Mon Oct 31 20:27:59 replSet info 10.250.7.220:27020 is down (or slow to respond): still initializing
  Mon Oct 31 20:27:59 replSet member 10.250.7.220:27020 is now in state DOWN
  Mon Oct 31 20:28:01 connection accepted from 10.250.7.220:10857 #3
  Mon Oct 31 20:28:05 replSet RECOVERING
  Mon Oct 31 20:28:05 replSet info voting yea for 10.250.7.220:27018 (0)
  Mon Oct 31 20:28:07 replSet member 10.250.7.220:27018 is now in state PRIMARY

  Mon Oct 31 20:28:09 done allocating datafile /opt/mongodata/r2/local.2,
  Mon Oct 31 20:28:10 ******
  Mon Oct 31 20:28:10 replSet initial sync pending
  Mon Oct 31 20:28:10 replSet syncing to: 10.250.7.220:27018
  Mon Oct 31 20:28:10 build index local.me { _id: 1 }
  Mon Oct 31 20:28:10 build index done 0 records 0.001 secs
  Mon Oct 31 20:28:10 replSet initial sync drop all databases
  Mon Oct 31 20:28:10 dropAllDatabasesExceptLocal 1
  Mon Oct 31 20:28:10 replSet initial sync clone all databases
  Mon Oct 31 20:28:10 replSet initial sync query minValid
  Mon Oct 31 20:28:10 replSet initial oplog application from 10.250.7.220:27018 starting at Oct 31 20:27:53:1 to Oct 31 20:27:53:1
  Mon Oct 31 20:28:13 replSet info member 10.250.7.220:27020 is up
  Mon Oct 31 20:28:13 replSet member 10.250.7.220:27020 is now in state STARTUP2
  Mon Oct 31 20:28:14 replSet initial sync finishing up
  Mon Oct 31 20:28:14 replSet set minValid=4eae9449:1
  Mon Oct 31 20:28:14 build index local.replset.minvalid { _id: 1 }
  Mon Oct 31 20:28:14 build index done 0 records 0.005 secs
  Mon Oct 31 20:28:14 replSet initial sync done
  Mon Oct 31 20:28:15 replSet syncing to: 10.250.7.220:27018
  Mon Oct 31 20:28:15 replSet SECONDARY
  Mon Oct 31 20:28:15 replSet member 10.250.7.220:27020 is now in state RECOVERING
  Mon Oct 31 20:28:26 mem (MB) res:16 virt:2677 mapped:1232
  Mon Oct 31 20:28:52 connection accepted from 10.250.7.220:10872 #4
  Mon Oct 31 20:28:52 connection accepted from 10.250.7.220:10873 #5
  Mon Oct 31 20:28:52 handshake between 2 and 10.250.7.220:27018
  Mon Oct 31 20:28:53 build index local.slaves { _id: 1 }
  Mon Oct 31 20:28:53 build index done 0 records 0.003 secs
  Mon Oct 31 20:28:55 end connection 10.250.7.220:10873
  Mon Oct 31 20:28:55 end connection 10.250.7.220:10872
  Mon Oct 31 20:28:57 replSet member 10.250.7.220:27020 is now in state SECONDARY
  Mon Oct 31 20:29:27 mem (MB) res:19 virt:2693 mapped:1232
  Mon Oct 31 20:30:21 connection accepted from 127.0.0.1:44672 #6
  Mon Oct 31 20:33:35 end connection 10.250.7.220:42493
  Mon Oct 31 20:33:35 replSet syncThread: 10278 dbclient error communicating with server: 10.250.7.220:27018
  Mon Oct 31 20:33:35 DBClientCursor::init call() failed
  Mon Oct 31 20:33:35 replSet info 10.250.7.220:27018 is down (or slow to respond): DBClientBase::findN: transport error: 10.250.7.220:27018 query: { replSetHeartbeat: "myset", v: 1, pv: 1, checkEmpty: false, from: "10.250.7.220:27019" }
  Mon Oct 31 20:33:35 replSet member 10.250.7.220:27018 is now in state DOWN
  Mon Oct 31 20:33:35 not electing self, 10.250.7.220:27020 would veto
  Mon Oct 31 20:33:36 replSet info voting yea for 10.250.7.220:27020 (2)
  Mon Oct 31 20:33:37 replSet member 10.250.7.220:27020 is now in state PRIMARY
  Mon Oct 31 20:33:46 replSet syncing to: 10.250.7.220:27020
  Mon Oct 31 20:34:27 mem (MB) res:19 virt:2693 mapped:1232
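
The lines that matter in a log like this are the member state transitions. As a rough aid, the Python sketch below extracts them with a regular expression. The pattern is written against the log wording shown in this post (MongoDB 2.0.1) and may not match other versions. The names `STATE_RE` and `state_changes` are hypothetical.

```python
import re

# Matches lines such as "... replSet member HOST is now in state STATE".
STATE_RE = re.compile(r"replSet member (\S+) is now in state (\w+)")


def state_changes(log_lines):
    """Yield (timestamp, host, state) for each member state transition."""
    for line in log_lines:
        m = STATE_RE.search(line)
        if m:
            timestamp = line[:m.start()].strip()
            yield timestamp, m.group(1), m.group(2)


log = [
    "Mon Oct 31 20:33:35 replSet member 10.250.7.220:27018 is now in state DOWN",
    "Mon Oct 31 20:33:35 not electing self, 10.250.7.220:27020 would veto",
    "Mon Oct 31 20:33:36 replSet info voting yea for 10.250.7.220:27020 (2)",
    "Mon Oct 31 20:33:37 replSet member 10.250.7.220:27020 is now in state PRIMARY",
]
for ts, host, state in state_changes(log):
    print(ts, host, state)
```

Run on the excerpt above, it reports 27018 going DOWN at 20:33:35 and 27020 becoming PRIMARY at 20:33:37.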
  The log of the mongod on port 27020 records the same election, in which it elects itself primary:
  Mon Oct 31 20:33:35 replSet syncThread: 10278 dbclient error communicating with server: 10.250.7.220:27018
  Mon Oct 31 20:33:36 DBClientCursor::init call() failed
  Mon Oct 31 20:33:36 replSet info 10.250.7.220:27018 is down (or slow to respond): DBClientBase::findN: transport error: 10.250.7.220:27018 query: { replSetHeartbeat: "myset", v: 1, pv: 1, checkEmpty: false, from: "10.250.7.220:27020" }
  Mon Oct 31 20:33:36 replSet member 10.250.7.220:27018 is now in state DOWN
  Mon Oct 31 20:33:36 replSet info electSelf 2
  Mon Oct 31 20:33:36 replSet PRIMARY
  Mon Oct 31 20:33:46 connection accepted from 10.250.7.220:37261 #5
  Mon Oct 31 20:33:47 build index local.slaves { _id: 1 }
  Mon Oct 31 20:33:47 build index done 0 records 0.001 secs
  Mon Oct 31 20:33:48 mem (MB) res:19 virt:2692 mapped:1232
  Mon Oct 31 20:34:35 end connection 127.0.0.1:17500
  Mon Oct 31 20:34:37 connection accepted from 127.0.0.1:36525 #6
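
The election could go ahead here because two of the three members were still reachable. A replica set can elect a primary only while a strict majority of its voting members can see each other. The Python sketch below shows that arithmetic, assuming every member has exactly one vote. The function name `can_elect` is hypothetical.

```python
def can_elect(total_voters, reachable_voters):
    """A primary can be elected only if a strict majority of voters is reachable."""
    majority = total_voters // 2 + 1
    return reachable_voters >= majority


print(can_elect(3, 2))  # one of three members down, as in this test
print(can_elect(3, 1))  # two of three members down
```

Had 27019 also been down, 27020 would have been alone and could not have become primary.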
  Connect to the database to check:
  $ ./mongo 127.0.0.1:27020
  MongoDB shell version: 2.0.1
  connecting to: 127.0.0.1:27020/test
  PRIMARY>
  PRIMARY>
  PRIMARY> db.isMaster();
  {
  "setName" : "myset",
  "ismaster" : true, -- this node has become the primary (master)
  "secondary" : false,
  "hosts" : [
  "10.250.7.220:27020",
  "10.250.7.220:27019",
  "10.250.7.220:27018"
  ],
  "primary" : "10.250.7.220:27020",
  "me" : "10.250.7.220:27020",
  "maxBsonObjectSize" : 16777216,
  "ok" : 1
  }
  PRIMARY>
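
A client or script that has to wait out the failover can poll until some member reports a primary. The Python sketch below shows such a loop. `fetch_is_master` is a hypothetical callable standing in for a real isMaster query, and the sketch assumes the reply has no `primary` field while no primary is known. The canned replies are made up for the demonstration.

```python
import time


def wait_for_primary(fetch_is_master, timeout=30.0, interval=1.0, sleep=time.sleep):
    """Poll until an isMaster reply names a primary; return its address or None."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            reply = fetch_is_master()
        except OSError:
            reply = {}  # node unreachable; keep polling
        primary = reply.get("primary")
        if primary:
            return primary
        if time.monotonic() >= deadline:
            return None
        sleep(interval)


# Stand-in for a real query: no primary during the election, then 27020 wins.
replies = iter([{}, {"primary": "10.250.7.220:27020"}])
print(wait_for_primary(lambda: next(replies), sleep=lambda s: None))
```

Passing the query and the sleep function as parameters keeps the loop testable without a running server.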
  Restart the MongoDB server on port 27018. Its log shows the recovery steps it performs:
  $ ./mongod --dbpath /opt/mongodata/r1 --port 27018 --rest --replSet myset &
   16290
  $ Mon Oct 31 20:48:32 MongoDB starting : pid=16290 port=27018 dbpath=/opt/mongodata/r1 64-bit host=rac4
  Mon Oct 31 20:48:32 db version v2.0.1, pdfile version 4.5
  Mon Oct 31 20:48:32 git version: 3a5cf0e2134a830d38d2d1aae7e88cac31bdd684
  Mon Oct 31 20:48:32 build info: Linux bs-linux64.10gen.cc 2.6.21.7-2.ec2.v1.2.fc8xen #1 SMP Fri Nov 20 17:48:28 EST 2009 x86_64 BOOST_LIB_VERSION=1_41
  Mon Oct 31 20:48:32 options: { dbpath: "/opt/mongodata/r1", port: 27018, replSet: "myset", rest: true }
  Mon Oct 31 20:48:32 journal dir=/opt/mongodata/r1/journal
  Mon Oct 31 20:48:32 recover begin
  Mon Oct 31 20:48:32 recover lsn: 231055
  Mon Oct 31 20:48:32 recover /opt/mongodata/r1/journal/j._0
  Mon Oct 31 20:48:32 recover skipping application of section seq:198962 < lsn:231055
  Mon Oct 31 20:48:32 recover cleaning up
  Mon Oct 31 20:48:32 removeJournalFiles
  Mon Oct 31 20:48:32 recover done
  Mon Oct 31 20:48:32 waiting for connections on port 27018
  Mon Oct 31 20:48:32 admin web console waiting for connections on port 28018
  Mon Oct 31 20:48:32 connection accepted from 127.0.0.1:11930 #1
  Mon Oct 31 20:48:32 replSet STARTUP2
  Mon Oct 31 20:48:32 replSet info member 10.250.7.220:27019 is up
  Mon Oct 31 20:48:32 replSet member 10.250.7.220:27019 is now in state SECONDARY
  Mon Oct 31 20:48:32 replSet info member 10.250.7.220:27020 is up
  Mon Oct 31 20:48:32 replSet member 10.250.7.220:27020 is now in state PRIMARY
  Mon Oct 31 20:48:32 replSet SECONDARY
  Mon Oct 31 20:48:33 connection accepted from 10.250.7.220:35971 #2
  Mon Oct 31 20:48:34 connection accepted from 10.250.7.220:35972 #3
  Mon Oct 31 20:48:36 replSet syncing to: 10.250.7.220:27020
  Mon Oct 31 20:48:36 build index local.me { _id: 1 }
  Mon Oct 31 20:48:36 build index done 0 records 0 secs
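
When the restart is scripted, building the command as a list of arguments avoids spacing mistakes between a value and the next flag. The Python sketch below assembles the same mongod command line used in this post. The helper name `mongod_argv` is hypothetical, and the sketch only builds and prints the command; it does not start a server.

```python
import shlex


def mongod_argv(dbpath, port, repl_set, rest=True):
    """Build the mongod command line as a list, one element per argument."""
    argv = ["./mongod", "--dbpath", dbpath, "--port", str(port), "--replSet", repl_set]
    if rest:
        argv.append("--rest")
    return argv


argv = mongod_argv("/opt/mongodata/r1", 27018, "myset")
print(shlex.join(argv))
```

A list like this could be handed to `subprocess.Popen` directly, so no shell quoting is involved.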
  $ ./mongo 127.0.0.1:27018
  MongoDB shell version: 2.0.1
  connecting to: 127.0.0.1:27018/test
  SECONDARY>
  SECONDARY> db.isMaster();
  {
  "setName" : "myset",
  "ismaster" : false,   -- the server on port 27018 is now a secondary
  "secondary" : true,
  "hosts" : [
  "10.250.7.220:27018",
  "10.250.7.220:27020",
  "10.250.7.220:27019"
  ],
  "primary" : "10.250.7.220:27020",
  "me" : "10.250.7.220:27018",
  "maxBsonObjectSize" : 16777216,
  "ok" : 1
  }
  SECONDARY>
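
As a final check, the isMaster replies from the different members should all name the same primary. The Python sketch below compares them. The function name `agreed_primary` is hypothetical, and the replies are hand-copied subsets of the outputs shown in this post.

```python
def agreed_primary(replies):
    """Return the primary all members agree on, or None if views differ."""
    views = {r.get("primary") for r in replies}
    if len(views) == 1:
        return views.pop()
    return None


replies = [
    {"me": "10.250.7.220:27018", "ismaster": False, "primary": "10.250.7.220:27020"},
    {"me": "10.250.7.220:27020", "ismaster": True, "primary": "10.250.7.220:27020"},
]
print(agreed_primary(replies))
```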
