V 8 nfs+drbd+heartbeat

  nfs+drbd+heartbeat: wherever NFS, or a distributed store such as MFS, has a single point of failure, this scheme can remove it.
  In real production environments, NFS is one of the most common storage architectures for small and medium-sized enterprises: it is simple to deploy and easy to maintain. A simple, efficient inotify+rsync setup replicates the NFS master's data to one or more slave machines, giving MySQL-style rw splitting; the read slaves can in turn be load-balanced with LVS or haproxy, which both spreads the load of concurrent reads and removes the slaves as single points of failure. A sketch of that replication loop follows.
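  A minimal sketch of the inotify+rsync replication loop just described (the slave hostnames are illustrative assumptions):
  ---------------script start-------------
  #!/bin/bash
  # Watch the exported directory and push every change to the read slaves.
  SRC=/drbd/
  SLAVES="nfs-slave1 nfs-slave2"   # hypothetical slave hostnames
  inotifywait -mrq -e create,delete,modify,move --format '%w%f' "$SRC" |
  while read -r FILE; do
      for HOST in $SLAVES; do
          # -az: archive + compress; --delete keeps each slave identical to the master
          rsync -az --delete "$SRC" "$HOST:$SRC"
      done
  done
  ----------------script end--------------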

  Storage on the web servers:
  Option 1: reads and writes may hit any web server, and inotify+rsync replicates every web server's data to the others, e.g. web1-->web2-->web3-->web2-->web1.
  Option 2: configure the LB so that writes (file uploads) only reach web3 while reads go to web{1,2}; inotify+rsync then syncs web3-->web2 and web3-->web1.
  Option 3: use shared NFS storage. One box taking both reads and writes is a single point, so add a second box, one master and one backup, syncing to each other with inotify+rsync; you can keep rw on one box and use the other purely for backup, or read from the backup and write to the master. Reads usually far outnumber writes, so add another backup, one master with several slaves, to spread the read load, syncing master-->backup1 and master-->backup2. If a storage node fails, web{1,2,3} must remount; losing one backup does no harm, but if the master dies nothing can be written. So make the master highly available as master-active and master-inactive: only one of the pair serves at any time, and master-inactive stays idle until a failover promotes it.
  Option 4: drop the shared NFS mounts; keep master-active and master-inactive, write only to the shared storage, then sync the shared storage's data back to local disk on web{1,2,3} and serve all reads from local disk.
  In the one-master-many-slaves model, to keep writes working (and flowing on to the slaves) when the master dies, use nfs+drbd+heartbeat to make the master highly available and remove its single point. When master-active nfs fails, service switches to master-inactive nfs; the two masters hold identical data, and master-inactive automatically resumes syncing with all the nfs slaves, giving a hot-standby NFS storage system.
  When master-active fails over to master-inactive, the backup node must still be able to sync data to the nfs slaves, and it must not resync everything, only the data changed since the switch. sersync can replace inotify here, using sersync's -r option (alternatively, keep inotify stopped and start it only after the backup node's heartbeat is up and the DRBD filesystem is mounted); a typical invocation is sketched below.
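  A commonly documented sersync startup for this case (the binary and config paths are sersync's usual defaults, assumed here; -r does one full rsync pass before watching, -d daemonizes, -o names the XML config):
  /usr/local/sersync/sersync2 -r -d -o /usr/local/sersync/confxml.xml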
  Note: compared with MFS, FastDFS, GFS and other filesystems, this scheme is simple to deploy and easy to maintain and control, in line with the keep-it-simple principle. It has drawbacks too: every node holds the complete data set (as in MySQL replication), so syncing a large volume of files can easily lag. You can split the sync by data directory (akin to MySQL database/table sharding), run several sync instances and route reads/writes in application code to fight the lag, and you must be able to monitor sync status.
  To stop the nfs slaves from hanging on reads while the two master nodes switch over, the NFS HA scheme should cover the following (see the watchdog sketch after this list):
  keep the rpcbind service running at all times (on the master node, the backup node, and every NFS client);
  have each nfs client (nfs slave) monitor its locally mounted NFS share and remount when reads start failing;
  have the nfs clients watch for the VIP appearing on the master-inactive node, or its DRBD state turning Primary, and remount when it does (during an NFS failover, trigger the remount on the clients over SSH or a similar channel; or monitor with nagios and, when the VIP shows up on the master-inactive node, run a script that remounts all the nfs clients).
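  A minimal sketch of that client-side watchdog (the VIP and paths come from this article's environment; the 10s timeout and mount options are assumptions):
  ---------------script start-------------
  #!/bin/bash
  # On each nfs client: remount the share if a read on it stalls or fails.
  VIP=10.96.20.8
  MNT=/drbd
  if ! timeout 10 ls "$MNT" &> /dev/null; then
      umount -lf "$MNT"                            # lazy+force drop the dead handle
      mount -t nfs -o soft,timeo=30 "$VIP:/drbd" "$MNT"
  fi
  ----------------script end--------------
  Run it from cron on every nfs client, e.g. * * * * * /usr/local/sbin/nfs_remount.sh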

  See the figure: the oval callouts mark the operations covered in this section.

  Note: a single server needs no dedicated file storage; keep the data local. Dedicated storage only becomes necessary once you build a cluster.

  Note: the problems are the single point of failure and the poor performance of one box taking both reads and writes; in enterprise ops you must account for data protection and 7x24 continuous service.

  Note:
  web1 and web2 typically run LNMP;
  IMG1 and IMG2 typically run nginx or lighttpd;
  this scheme removes the nfs master single point and solves concurrent-read performance, but as concurrent writes keep growing it hits the following problems:
  it copes with roughly 200-300 uploaded images/s with acceptable sync throughput; above 300 images/s, master-to-slave sync may lag. Remedies: multi-threaded sync, plus tuning of watch events, disk I/O and network I/O;
  with many IMG servers and a single master, the master both takes every write and feeds every slave, so it comes under heavy load;
  once the image volume is very large, every node still holds a complete copy; past roughly 3 TB total, a single server may run out of space. Remedies: (1) borrow MySQL's sharding idea to address capacity, write performance and sync lag, e.g. start with five directories img1--img5 mapped to five domains, mount those five directories, and grow each imgNUM into its own NFS master/slave HA cluster with rw splitting (rw splitting via POST or WebDAV); (2) scale out to multiple masters via DNS, though every service added this way is another single point; (3) use the built-in replication of MySQL, Oracle, MongoDB, Cassandra and similar databases to sync the file data; iQIYI uses MongoDB's GridFS for image storage;
  Note: on MongoDB's GridFS for image storage (it is distributed by design; design ideas: each image is stored exactly once; store only the original; generate the thumbnail and its static file on first request; URLs stay fixed, with different URLs producing different thumbnails; see "Abusing Amazon images")
  Note: Facebook's image-management architecture

  Note:
  giving NFS HA removes the single point at the cost of one extra server;
  the two NFS masters are coupled with heartbeat+drbd, replicating in real time with DRBD protocol C;
  nfs(M) and nfs(S) sync asynchronously via inotify+rsync, the nfs(S) machines syncing from nfs(M) through the VIP; the nfs slaves serve reads and the nfs master takes writes, which solves concurrent-read performance. Alternatively, make the nfs master write-only and have it push data out to the app servers (dropping the NFS mounts);
  pick RAID10 or RAID0 for the physical disks according to your performance and redundancy needs; servers connect to each other and to the switch over dual gigabit NICs bonded together (a bonding sketch follows this list); application servers (web and others) reach nfs(M) through the VIP and the load-balanced nfs(S) storage pool through other VIPs; nfs(M) keeps its data on the DRBD partition;
  when the data volume is small, you can sync the data from nfs(M) straight to each app server's local disk, serve all reads locally, and direct writes to nfs(M);
  with inotify+rsync handling master-->slave sync, heavy concurrent writes will cause lag or missed updates;
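  A minimal RHEL 6 bonding sketch for the dual-NIC pairing above (interface names, IP and mode are illustrative assumptions; mode=1 is active-backup):
  # /etc/sysconfig/network-scripts/ifcfg-bond0
  DEVICE=bond0
  IPADDR=10.96.20.113
  NETMASK=255.255.255.0
  ONBOOT=yes
  BOOTPROTO=none
  BONDING_OPTS="mode=1 miimon=100"
  # /etc/sysconfig/network-scripts/ifcfg-eth0 (and likewise for the second NIC)
  DEVICE=eth0
  MASTER=bond0
  SLAVE=yes
  ONBOOT=yes
  BOOTPROTO=none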
  Note:
  In real enterprise work, touch the DB and file-storage layers only as a last resort. Day to day, adjust the site architecture so user requests hit the DB and storage as little as possible, e.g. file caching and data caching (the core principle of high concurrency: push every user request as far forward as it will go), instead of reaching for a distributed storage system straight away. For small and medium shops, distributed storage is a cannon aimed at a mosquito; in 2012 Facebook, already enormous, was still running NFS storage (distributed systems are no panacea: they consume serious manpower and hardware, and handled badly they end in disaster).

  Note:
  To relieve pressure on the site, push the content users fetch as far forward as possible: what can live on the user's own machine should not go to the CDN, and what can live on the CDN should not sit on your own servers; use every cache layer fully and let users reach the backend DB only when nothing else will do. If that still cannot hold, use SSD+SATA tiering, and only after that distributed storage.
  1. Install and configure heartbeat
  Environment:
  VIP: 10.96.20.8
  master: eth0 (10.96.20.113), eth1 (172.16.1.113, no gateway or DNS), hostname test-master
  backup: eth0 (10.96.20.114), eth1 (172.16.1.114, no gateway or DNS), hostname test-backup
  dual NICs, dual disks
  Note: eth0 is the management IP; eth1 carries the heartbeat and the DRBD replication channel. In production, if heartbeat and data replication share one NIC, rate-limit the replication traffic so the heartbeat keeps some bandwidth.
  Note: label your VMware guests and Xshell tabs consistently; in a production environment every host should have an entry in /etc/hosts, which makes distribution, management and maintenance easier.
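  A minimal /etc/hosts sketch matching this environment (added on both nodes):
  10.96.20.113   test-master
  10.96.20.114   test-backup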
  test-master (on each node: set the hostname in /etc/sysconfig/network so it matches uname -n exactly, fill in /etc/hosts, set up mutual ssh trust, sync time, and deal with iptables and selinux):
  [root@test-master ~]# cat /etc/redhat-release
  Red Hat Enterprise Linux Server release 6.5 (Santiago)
  [root@test-master ~]# uname -rm
  2.6.32-431.el6.x86_64 x86_64
  [root@test-master ~]# uname -n
  test-master
  [root@test-master ~]# ifconfig | grep eth0 -A 1
  eth0     Link encap:Ethernet  HWaddr 00:0C:29:1F:B6:AC
  inet addr:10.96.20.113 Bcast:10.96.20.255 Mask:255.255.255.0
  [root@test-master ~]# ifconfig | grep eth1 -A 1
  eth1     Link encap:Ethernet  HWaddr 00:0C:29:1F:B6:B6
  inet addr:172.16.1.113 Bcast:172.16.1.255 Mask:255.255.255.0
  [root@test-master ~]# route add -host 172.16.1.114 dev eth1  #(add a host route so the heartbeat leaves through the given NIC; append this line to /etc/rc.local, or configure a static route: #vim /etc/sysconfig/network-scripts/route-eth1 and add: 172.16.1.114/24 via 172.16.1.113)
  [root@test-master ~]# ssh-keygen -t rsa -f ./.ssh/id_rsa -P ''
  Generating public/private rsa key pair.
  Your identification has been saved in ./.ssh/id_rsa.
  Your public key has been saved in ./.ssh/id_rsa.pub.
  The key fingerprint is:
  29:c3:a3:68:81:43:59:2f:0a:ad:8a:54:56:b0:1e:12 root@test-master
  The key's randomart image is:
  +--[ RSA 2048]----+
  | E o..          |
  | .+ +           |
  |.+.* .          |
  |oo* o.  .       |
  |+o..  =S        |
  |+. o . +        |
  |o o .           |
  | .               |
  |                |
  +-----------------+
  [root@test-master ~]# ssh-copy-id -i ./.ssh/id_rsa root@test-backup
  The authenticity of host 'test-backup (10.96.20.114)' can't be established.
  RSA key fingerprint is 63:f5:2e:dc:96:64:54:72:8e:14:7e:ec:ef:b8:a1:0c.
  Are you sure you want to continue connecting (yes/no)? yes
  Warning: Permanently added 'test-backup' (RSA) to the list of known hosts.
  root@test-backup's password:
  Now try logging into the machine, with "ssh 'root@test-backup'", and check in:
  .ssh/authorized_keys
  to make sure we haven't added extra keys that you weren't expecting.
  [root@test-master ~]# crontab -l
  */5 * * * * /usr/sbin/ntpdate time.windows.com &> /dev/null
  [root@test-master ~]# service crond restart
  Stopping crond:                                            [  OK  ]
  Starting crond:                                           [  OK  ]
  [root@test-master ~]# wget http://mirrors.ustc.edu.cn/fedora/epel/6/x86_64/epel-release-6-8.noarch.rpm
  [root@test-master ~]# rpm -ivh epel-release-6-8.noarch.rpm
  warning: epel-release-6-8.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID 0608b895: NOKEY
  Preparing...               ########################################### [100%]
  1:epel-release          ########################################### [100%]
  [root@test-master ~]# yum search heartbeat
  ……
  heartbeat-devel.i686 : Heartbeat development package
  heartbeat-devel.x86_64 : Heartbeat development package
  heartbeat-libs.i686 : Heartbeat libraries
  heartbeat-libs.x86_64 : Heartbeat libraries
  heartbeat.x86_64 : Messaging and membership subsystem for High-Availability Linux
  [root@test-master ~]# yum -y install heartbeat
  [root@test-master ~]# chkconfig heartbeat off
  [root@test-master ~]# chkconfig --list heartbeat
  heartbeat         0:off 1:off 2:off 3:off 4:off 5:off 6:off
  test-backup:
  [root@test-backup ~]# uname -n
  test-backup
  [root@test-backup ~]# ifconfig | grep eth0 -A 1
  eth0     Link encap:Ethernet  HWaddr 00:0C:29:15:E6:BB
  inet addr:10.96.20.114 Bcast:10.96.20.255 Mask:255.255.255.0
  [root@test-backup ~]# ifconfig | grep eth1 -A 1
  eth1     Link encap:Ethernet  HWaddr 00:0C:29:15:E6:C5
  inet addr:172.16.1.114 Bcast:172.16.1.255 Mask:255.255.255.0
  [root@test-backup ~]# route add -host 172.16.1.113 dev eth1
  [root@test-backup ~]# ssh-keygen -t rsa -f ./.ssh/id_rsa -P ''
  Generating public/private rsa key pair.
  Your identification has been saved in ./.ssh/id_rsa.
  Your public key has been saved in ./.ssh/id_rsa.pub.
  The key fingerprint is:
  08:ea:6a:44:7f:1a:c9:bf:ff:01:d5:32:e5:39:1b:b8 root@test-backup
  The key's randomart image is:
  +--[ RSA 2048]----+
  |          .     |
  |          =.    |
  |    .    = *    |
  | . . . .. + +   |
  |. + . ..SE .    |
  | o = .  .       |
  |. . =   .       |
  | o . .   .      |
  |o   .o...       |
  +-----------------+
  [root@test-backup ~]# ssh-copy-id -i ./.ssh/id_rsa root@test-master
  The authenticity of host 'test-master (10.96.20.113)' can't be established.
  RSA key fingerprint is 63:f5:2e:dc:96:64:54:72:8e:14:7e:ec:ef:b8:a1:0c.
  Are you sure you want to continue connecting (yes/no)? yes
  Warning: Permanently added 'test-master' (RSA) to the list of known hosts.
  root@test-master's password:
  Now try logging into the machine, with "ssh 'root@test-master'", and check in:
  .ssh/authorized_keys
  to make sure we haven't added extra keys that you weren't expecting.
  [root@test-backup ~]# crontab -l
  */5 * * * * /usr/sbin/ntpdate time.windows.com &> /dev/null
  [root@test-backup ~]# service crond restart
  Stopping crond:                                           [  OK  ]
  Starting crond:                                           [  OK  ]
  [root@test-backup ~]# wget http://mirrors.ustc.edu.cn/fedora/epel/6/x86_64/epel-release-6-8.noarch.rpm
  [root@test-backup ~]# rpm -ivh epel-release-6-8.noarch.rpm
  [root@test-backup ~]# yum -y install heartbeat
  [root@test-backup ~]# chkconfig heartbeat off
  [root@test-backup ~]# chkconfig --list heartbeat
  heartbeat         0:off 1:off 2:off 3:off 4:off 5:off 6:off
  test-master:
  [root@test-master ~]# cp /usr/share/doc/heartbeat-3.0.4/{ha.cf,authkeys,haresources} /etc/ha.d/
  [root@test-master ~]# cd /etc/ha.d
  [root@test-master ha.d]# ls
  authkeys ha.cf  harc  haresources rc.d  README.config  resource.d shellfuncs
  [root@test-master ha.d]# vim authkeys   #(generate a random string with #dd if=/dev/random count=1 bs=512 | md5sum, and put it after sha1)
  auth 1
  1 sha1 912d6402295ac8d47109e56b177073b9
  [root@test-master ha.d]# chmod 600 authkeys   #(this file must be mode 600, otherwise the service refuses to start)
  [root@test-master ha.d]# ll !$
  ll authkeys
  -rw-------. 1 root root 692 Aug  7 21:51 authkeys
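  A one-shot sketch that builds authkeys following the dd | md5sum hint above:
  KEY=$(dd if=/dev/random count=1 bs=512 2>/dev/null | md5sum | awk '{print $1}')
  printf 'auth 1\n1 sha1 %s\n' "$KEY" > /etc/ha.d/authkeys
  chmod 600 /etc/ha.d/authkeys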
  [root@test-master ha.d]# vim ha.cf
  debugfile /var/log/ha-debug   #(debug log)
  logfile /var/log/ha-log
  logfacility    local1   #(configure rsyslog to receive these logs via the local1 facility; see the sketch below)
  keepalive 2  #(heartbeat interval: broadcast every 2s)
  deadtime 30  #(if the backup node receives no heartbeat from the master for 30s, it takes over the service resources immediately)
  warntime 10  #(heartbeat-delay warning: if the backup node gets no heartbeat from the master within 10s, it writes a warning to the log without switching services)
  initdead 120  #(after heartbeat first starts, wait 120s before bringing up this node's resources, giving the peer's heartbeat time to start; must be at least twice deadtime)
  udpport 694
  #bcast  eth0   #(send heartbeats as Ethernet broadcast on eth0; to use two physical networks: bcast eth0 eth1)
  mcast eth0 225.0.0.11 694 1 0   #(multicast parameters; the multicast address must be unique on the LAN since several heartbeat clusters may coexist; use a class D address (224.0.0.0--239.255.255.255); format: mcast dev mcast_group port ttl loop)
  auto_failback on   #(fail resources back once the master node recovers)
  node test-master   #(master node hostname; must match uname -n)
  node test-backup   #(backup node hostname)
  crm no   #(whether to enable the CRM)
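  A minimal rsyslog rule for the logfacility local1 line above (an assumption; heartbeat also writes its own files via logfile/debugfile):
  # /etc/rsyslog.d/heartbeat.conf
  local1.*    /var/log/heartbeat.log
  # then: service rsyslog restart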
  [root@test-master ha.d]# vim haresources
  test-master     IPaddr::10.96.20.8/24/eth0   #(equivalent to running #/etc/ha.d/resource.d/IPaddr 10.96.20.8/24/eth0 stop|start; IPaddr is simply a script under /etc/ha.d/resource.d/)
  [root@test-master ha.d]# scp authkeys ha.cf haresources root@test-backup:/etc/ha.d/
  authkeys                                                                                           100%  692     0.7KB/s  00:00
  ha.cf                                                                                              100%   10KB 10.3KB/s   00:00
  haresources                                                                                        100% 5944     5.8KB/s   00:00
  [root@test-master ha.d]# service heartbeat start
  Starting High-Availability services: INFO:  Resource is stopped
  Done.
  [root@test-master ha.d]# ssh test-backup 'service heartbeat start'
  Starting High-Availability services: 2016/08/07_22:39:00 INFO:  Resource is stopped
  Done.
  [root@test-master ha.d]# ps aux | grep heartbeat
  root     63089  0.0  3.1 50124  7164 ?        SLs  22:38   0:00 heartbeat: master control process
  root     63093  0.0  3.1 50076  7116 ?        SL   22:38   0:00 heartbeat: FIFO reader
  root     63094  0.0  3.1 50072  7112 ?        SL   22:38   0:00 heartbeat: write: mcast eth0
  root     63095  0.0  3.1 50072  7112 ?        SL   22:38   0:00 heartbeat: read: mcast eth0
  root     63136  0.0  0.3 103264  836 pts/0    S+   22:39  0:00 grep heartbeat
  [root@test-master ha.d]# ssh test-backup 'ps aux | grep heartbeat'
  root      3050  0.0  3.1 50124  7164 ?        SLs  22:39   0:00 heartbeat: master control process
  root      3054  0.0  3.1 50076  7116 ?        SL   22:39   0:00 heartbeat: FIFO reader
  root      3055  0.0  3.1 50072  7112 ?        SL   22:39   0:00 heartbeat: write: mcast eth0
  root      3056  0.0  3.1 50072  7112 ?        SL   22:39   0:00 heartbeat: read: mcast eth0
  root      3094  0.0  0.5 106104 1368 ?        Ss   22:39  0:00 bash -c ps aux | grep heartbeat
  root      3108  0.0  0.3 103264  832 ?        S    22:39  0:00 grep heartbeat
  [root@test-master ha.d]# netstat -tnulp | grep heartbeat
  udp       0      0 225.0.0.11:694              0.0.0.0:*                               63094/heartbeat:wr
  udp       0      0 0.0.0.0:50268               0.0.0.0:*                               63094/heartbeat:wr
  [root@test-master ha.d]# ssh test-backup 'netstat -tnulp | grep heartbeat'
  udp       0      0 0.0.0.0:58019               0.0.0.0:*                               3055/heartbeat:wri
  udp       0      0 225.0.0.11:694              0.0.0.0:*                               3055/heartbeat: wri
  [root@test-master ha.d]# ip addr | grep 10.96.20
  inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0
  inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0
  [root@test-master ha.d]# ssh test-backup 'ip addr | grep 10.96.20'
  inet 10.96.20.114/24 brd 10.96.20.255 scope global eth0
  [root@test-master ha.d]# service heartbeat stop
  Stopping High-Availability services: Done.
  [root@test-master ha.d]# ip addr | grep 10.96.20
  inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0
  [root@test-master ha.d]# ssh test-backup 'ip addr | grep 10.96.20'
  inet 10.96.20.114/24 brd 10.96.20.255 scope global eth0
  inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0
  [root@test-master ha.d]# service heartbeat start
  Starting High-Availability services: INFO:  Resource is stopped
  Done.
  [root@test-master ha.d]# ip addr | grep 10.96.20
  inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0
  inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0
  [root@test-master ha.d]# ssh test-backup 'ip addr | grep 10.96.20'
  inet 10.96.20.114/24 brd 10.96.20.255 scope global eth0
  [root@test-master ~]# service heartbeat stop
  Stopping High-Availability services: Done.
  [root@test-master ~]# ssh test-backup 'service heartbeat stop'
  Stopping High-Availability services: Done.
  2. Install and configure drbd
  test-master:
  [root@test-master ~]# fdisk -l
  ……
  Disk /dev/sdb: 2147 MB, 2147483648 bytes
  255 heads, 63 sectors/track, 261 cylinders
  Units = cylinders of 16065 * 512 = 8225280 bytes
  Sector size (logical/physical): 512 bytes / 512 bytes
  I/O size (minimum/optimal): 512 bytes / 512 bytes
  Disk identifier: 0x00000000
  [root@test-master ~]# parted /dev/sdb  #(parted handles disks larger than 2T; split the new disk into two partitions, one for data and one for DRBD's meta data)
  GNU Parted 2.1
  Using /dev/sdb
  Welcome to GNU Parted! Type 'help' to view a list of commands.
  (parted) h
  align-check TYPE N                       check partition N for TYPE(min|opt) alignment
  check NUMBER                             do a simple check on the file system
  cp [FROM-DEVICE] FROM-NUMBER TO-NUMBER   copy file system to another partition
  help [COMMAND]                           print general help, or help on COMMAND
  mklabel,mktable LABEL-TYPE               create a new disklabel (partition table)
  mkfs NUMBER FS-TYPE                      make a FS-TYPE file system on partition NUMBER
  mkpart PART-TYPE [FS-TYPE] START END     make a partition
  mkpartfs PART-TYPE FS-TYPE START END     make a partition with a file system
  move NUMBER START END                    move partition NUMBER
  name NUMBER NAME                         name partition NUMBER as NAME
  print [devices|free|list,all|NUMBER]     display the partition table, available devices, free space, all found partitions, or a particular partition
  quit                                     exit program
  rescue START END                         rescue a lost partition near START and END
  resize NUMBER START END                  resize partition NUMBER and its file system
  rm NUMBER                                delete partition NUMBER
  select DEVICE                            choose the device to edit
  set NUMBER FLAG STATE                    change the FLAG on partition NUMBER
  toggle [NUMBER [FLAG]]                   toggle the state of FLAG on partition NUMBER
  unit UNIT                                set the default unit to UNIT
  version                                  display the version number and copyright information of GNU Parted
  (parted) mklabel gpt
  (parted) mkpart primary 0 1024
  Warning: The resulting partition is not properly aligned for best performance.
  Ignore/Cancel? Ignore
  (parted) mkpart primary 1025 2147
  Warning: The resulting partition is not properly aligned for best performance.
  Ignore/Cancel? Ignore
  (parted) p
  Model: VMware, VMware Virtual S (scsi)
  Disk /dev/sdb: 2147MB
  Sector size (logical/physical): 512B/512B
  Partition Table: gpt
  Number Start   End     Size   File system  Name     Flags
  1      17.4kB 1024MB  1024MB               primary
  2      1025MB 2147MB  1122MB               primary
  [root@test-master ~]# wget http://www.elrepo.org/elrepo-release-6-6.el6.elrepo.noarch.rpm
  [root@test-master ~]# rpm -ivh elrepo-release-6-6.el6.elrepo.noarch.rpm
  warning: elrepo-release-6-6.el6.elrepo.noarch.rpm: Header V4 DSA/SHA1 Signature, key ID baadae52: NOKEY
  Preparing...               ########################################### [100%]
  1:elrepo-release        ########################################### [100%]
  [root@test-master ~]# yum -y install drbd kmod-drbd84
  [root@test-master ~]# modprobe drbd
  FATAL: Module drbd not found.
  [root@test-master ~]# yum -y install kernel*   #(after updating the kernel, reboot the system)
  [root@test-master ~]# uname -r
  2.6.32-642.3.1.el6.x86_64
  [root@test-master ~]# depmod
  [root@test-master ~]# lsmod | grep drbd
  drbd                  372759  0
  libcrc32c               1246  1 drbd
  [root@test-master ~]# ll /usr/src/kernels/
  total 12
  drwxr-xr-x. 22 root root 4096 Mar 31 06:46 2.6.32-431.el6.x86_64
  drwxr-xr-x. 22 root root 4096 Aug  8 03:40 2.6.32-642.3.1.el6.x86_64
  drwxr-xr-x. 22 root root 4096 Aug  8 03:40 2.6.32-642.3.1.el6.x86_64.debug
  [root@test-master ~]# echo "modprobe drbd >/dev/null 2>&1" > /etc/sysconfig/modules/drbd.modules
  [root@test-master ~]# cat !$
  cat /etc/sysconfig/modules/drbd.modules
  modprobe drbd > /dev/null 2>&1
  test-backup:
  [root@test-backup ~]# parted /dev/sdb
  (parted) mklabel gpt
  (parted) mkpart primary 0 4096
  Warning: The resulting partition is not properly aligned for best performance.
  Ignore/Cancel? Ignore
  (parted) mkpart primary 4097 5368
  (parted) p
  Model: VMware, VMware Virtual S (scsi)
  Disk /dev/sdb: 5369MB
  Sector size (logical/physical): 512B/512B
  Partition Table: gpt
  Number Start   End     Size   File system  Name     Flags
  1      17.4kB 4096MB  4096MB               primary
  2      4097MB 5368MB  1271MB               primary
  [root@test-backup ~]# wget http://www.elrepo.org/elrepo-release-6-6.el6.elrepo.noarch.rpm
  [root@test-backup ~]# rpm -ivh elrepo-release-6-6.el6.elrepo.noarch.rpm
  [root@test-backup ~]# ll /etc/yum.repos.d/
  total 20
  -rw-r--r--. 1 root root 1856 Jul 19 00:28 CentOS6-Base-163.repo
  -rw-r--r--. 1 root root 2150 Feb  9  2014 elrepo.repo
  -rw-r--r--. 1 root root  957 Nov  4  2012 epel.repo
  -rw-r--r--. 1 root root 1056 Nov  4  2012 epel-testing.repo
  -rw-r--r--. 1 root root  529 Mar 30 23:00 rhel-source.repo.bak
  [root@test-backup ~]# yum -y install drbd kmod-drbd84
  [root@test-backup ~]# yum -y install kernel*
  [root@test-backup ~]# depmod
  [root@test-backup ~]# lsmod | grep drbd
  drbd                  372759  0
  libcrc32c               1246  1 drbd
  [root@test-backup ~]# chkconfig drbd off
  [root@test-backup ~]# chkconfig --list drbd
  drbd             0:off 1:off 2:off 3:off 4:off 5:off 6:off
  [root@test-backup ~]# echo "modprobe drbd >/dev/null 2>&1" > /etc/sysconfig/modules/drbd.modules
  [root@test-backup ~]# cat !$
  cat /etc/sysconfig/modules/drbd.modules
  modprobe drbd > /dev/null 2>&1
  test-master:
  [root@test-master ~]# vim /etc/drbd.d/global_common.conf
  [root@test-master ~]# egrep -v "#|^$" /etc/drbd.d/global_common.conf
  global {
      usage-count no;
  }
  common {
      handlers {
      }
      startup {
      }
      options {
      }
      disk {
          on-io-error detach;
      }
      net {
      }
      syncer {
          rate 50M;
          verify-alg crc32c;
      }
  }
  [root@test-master ~]# vim /etc/drbd.d/data.res
  resource data {
      protocol C;
      on test-master {
          device  /dev/drbd0;
          disk    /dev/sdb1;
          address 172.16.1.113:7788;
          meta-disk       /dev/sdb2[0];
      }
      on test-backup {
          device  /dev/drbd0;
          disk    /dev/sdb1;
          address 172.16.1.114:7788;
          meta-disk       /dev/sdb2[0];
      }
  }
  [root@test-master ~]# cd /etc/drbd.d
  [root@test-master drbd.d]# scp global_common.conf data.res root@test-backup:/etc/drbd.d/
  global_common.conf                                                                                     100% 2144     2.1KB/s   00:00
  data.res                                                                                               100%  251     0.3KB/s  00:00
  [root@test-master drbd.d]# drbdadm --help
  USAGE: drbdadm COMMAND [OPTION...] {all|RESOURCE...}
  GENERAL OPTIONS:
  --stacked, -S
  --dry-run, -d
  --verbose, -v
  --config-file=..., -c ...
  --config-to-test=..., -t ...
  --drbdsetup=..., -s ...
  --drbdmeta=..., -m ...
  --drbd-proxy-ctl=..., -p ...
  --sh-varname=..., -n ...
  --peer=..., -P ...
  --version, -V
  --setup-option=..., -W ...
  --help, -h
  COMMANDS:
  attach                             disk-options
  detach                             connect
  net-options                        disconnect
   up                                 resource-options
   down                               primary
  secondary                          invalidate
  invalidate-remote                  outdate
  resize                             verify
  pause-sync                         resume-sync
  adjust                            adjust-with-progress
  wait-connect                       wait-con-int
  role                               cstate
  dstate                             dump
  dump-xml                           create-md
  show-gi                            get-gi
  dump-md                            wipe-md
  apply-al                           hidden-commands
  [root@test-master drbd.d]# drbdadm create-md data
  initializing activity log
  NOT initializing bitmap
  Writing meta data...
  New drbd meta data block successfully created.
  [root@test-master drbd.d]# ssh test-backup 'drbdadm create-md data'
  NOT initializing bitmap
  initializing activity log
  Writing meta data...
  New drbd meta data block successfully created.
  [root@test-master drbd.d]# drbdadm up data
  [root@test-master drbd.d]# ssh test-backup 'drbdadm up data'
  [root@test-master drbd.d]# cat /proc/drbd
  version: 8.4.7-1 (api:1/proto:86-101)
  GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11
  0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
  ns:0 nr:0 dw:0 dr:0 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:999984
  [root@test-master drbd.d]# ssh test-backup 'cat /proc/drbd'
  version: 8.4.7-1 (api:1/proto:86-101)
  GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11
  0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
  ns:0 nr:0 dw:0 dr:0 al:16 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:999984
  [root@test-master drbd.d]# drbdadm -- --overwrite-data-of-peer primary data  #(run only on the primary)
  [root@test-master drbd.d]# cat /proc/drbd
  version: 8.4.7-1 (api:1/proto:86-101)
  GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11
  0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
  ns:339968 nr:0 dw:0 dr:340647 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:660016
  [=====>..............] sync'ed: 34.3% (660016/999984)K
  finish: 0:00:15 speed: 42,496 (42,496) K/sec
  [root@test-master drbd.d]# cat /proc/drbd
  version: 8.4.7-1 (api:1/proto:86-101)
  GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11
  0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
  ns:630784 nr:0 dw:0 dr:631463 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:369200
  [===========>........] sync'ed: 63.3% (369200/999984)K
  finish: 0:00:09 speed: 39,424 (39,424) K/sec
  [root@test-master drbd.d]# cat /proc/drbd
  version: 8.4.7-1 (api:1/proto:86-101)
  GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11
  0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
  ns:942080 nr:0 dw:0 dr:942759 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:57904
  [=================>..] sync'ed: 94.3% (57904/999984)K
  finish: 0:00:01 speed: 39,196 (39,252) K/sec
  [root@test-master drbd.d]# cat /proc/drbd
  version: 8.4.7-1 (api:1/proto:86-101)
  GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11
  0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
  ns:999983 nr:0 dw:0 dr:1000662 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
  [root@test-master drbd.d]# ssh test-backup 'cat /proc/drbd'
  version: 8.4.7-1 (api:1/proto:86-101)
  GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11
  0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
  ns:0 nr:999983 dw:999983 dr:0 al:16 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
  [root@test-master drbd.d]# mkdir /drbd
  [root@test-master drbd.d]# ssh test-backup 'mkdir /drbd'
  [root@test-master drbd.d]# mkfs.ext4 -b 4096 /dev/drbd0   #(run only on the primary; do not format the meta partition)
  Writing superblocks and filesystem accounting information: done
  [root@test-master drbd.d]# tune2fs -c -1 /dev/drbd0
  tune2fs 1.41.12 (17-May-2010)
  Setting maximal mount count to -1
  [root@test-master drbd.d]# mount /dev/drbd0 /drbd
  [root@test-master drbd.d]# cd /drbd
  [root@test-master drbd]# for i in `seq 1 10`; do touch test$i; done
  [root@test-master drbd]# ls
  lost+found test1  test10  test2 test3  test4  test5 test6  test7  test8 test9
  [root@test-master drbd]# cd
  [root@test-master ~]# umount /dev/drbd0
  [root@test-master ~]# drbdadm secondary data
  [root@test-master ~]# cat /proc/drbd
  version: 8.4.7-1 (api:1/proto:86-101)
  GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49build by mockbuild@Build64R6, 2016-01-12 13:27:11
  0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
  ns:1032538 nr:0 dw:32554 dr:1001751 al:19 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
  test-backup:
  [root@test-backup ~]# cat /proc/drbd
  version: 8.4.7-1 (api:1/proto:86-101)
  GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49build by mockbuild@Build64R6, 2016-01-12 13:27:11
  0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
  ns:0 nr:1032538 dw:1032538 dr:0 al:16 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
  [root@test-backup ~]# drbdadm primary data
  [root@test-backup ~]# cat /proc/drbd
  version: 8.4.7-1 (api:1/proto:86-101)
  GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49build by mockbuild@Build64R6, 2016-01-12 13:27:11
  0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
  ns:0 nr:1032538 dw:1032538 dr:679 al:16 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
  [root@test-backup ~]# mount /dev/drbd0 /drbd
  [root@test-backup ~]# ls /drbd
  lost+found test1  test10  test2 test3  test4  test5 test6  test7  test8 test9
  3. Test heartbeat+drbd together
  [root@test-master ~]# ssh test-backup 'umount /drbd'
  [root@test-master ~]# ssh test-backup 'drbdadm secondary data'
  [root@test-master ~]# service drbd stop
  Stopping all DRBD resources: .
  [root@test-master ~]# ssh test-backup 'service drbd stop'
  Stopping all DRBD resources: .
  [root@test-master ~]# service heartbeat status
  heartbeat is stopped. No process
  [root@test-master ~]# ssh test-backup 'service heartbeat status'
  heartbeat is stopped. No process
  [root@test-master ~]# ll /etc/ha.d/resource.d/{Filesystem,drbddisk}
  -rwxr-xr-x. 1 root root 3162 Jan 12  2016 /etc/ha.d/resource.d/drbddisk
  -rwxr-xr-x. 1 root root 1903 Dec  2  2013/etc/ha.d/resource.d/Filesystem
  [root@test-master ~]# vim /etc/ha.d/haresources   #(each entry works like a script run with arguments, e.g. #/etc/ha.d/resource.d/IPaddr 10.96.20.8/24/eth0 start|stop, #/etc/ha.d/resource.d/drbddisk data start|stop, #/etc/ha.d/resource.d/Filesystem /dev/drbd0 /drbd ext4 start|stop; heartbeat controls the resources in exactly this configured order, so when heartbeat misbehaves, check the logs and run these commands one by one to locate the fault; a debugging sketch follows)
  test-master     IPaddr::10.96.20.8/24/eth0      drbddisk::data  Filesystem::/dev/drbd0::/drbd::ext4
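  Per the note above, the same resource chain can be exercised by hand, which is the quickest way to isolate a failing resource (a sketch; stop order is the reverse of start order):
  /etc/ha.d/resource.d/IPaddr 10.96.20.8/24/eth0 start
  /etc/ha.d/resource.d/drbddisk data start
  /etc/ha.d/resource.d/Filesystem /dev/drbd0 /drbd ext4 start
  /etc/ha.d/resource.d/Filesystem /dev/drbd0 /drbd ext4 stop
  /etc/ha.d/resource.d/drbddisk data stop
  /etc/ha.d/resource.d/IPaddr 10.96.20.8/24/eth0 stop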
  [root@test-master ~]# scp /etc/ha.d/haresources root@test-backup:/etc/ha.d/
  haresources                                                                                               100% 5996     5.9KB/s   00:00
  [root@test-master ~]# service drbd start   #(run on the master node)
  Starting DRBD resources: [
  create res: data
  prepare disk: data
  adjust disk: data
  adjust net: data
  ]
  ..........
  ***************************************************************
  DRBD's startup script waits for the peer node(s) to appear.
  - If this node was already a degraded cluster before the
  reboot, the timeout is 0 seconds. [degr-wfc-timeout]
  - If the peer was available before the reboot, the timeout
  is 0 seconds. [wfc-timeout]
  (These values are for resource 'data'; 0 sec -> wait forever)
  To abort waiting enter 'yes' [  23]:
  [root@test-backup ~]# service drbd start   #(run on the backup node)
  Starting DRBD resources: [
  create res: data
  prepare disk: data
  adjust disk: data
  adjust net: data
  ]
  .
  [root@test-master ~]# drbdadm role data
  Secondary/Secondary
  [root@test-master ~]# ssh test-backup 'drbdadm role data'
  Secondary/Secondary
  [root@test-master ~]# drbdadm -- --overwrite-data-of-peer primary data
  [root@test-master ~]# drbdadm role data
  Primary/Secondary
  [root@test-master ~]# service heartbeat start
  Starting High-Availability services: INFO:  Resource is stopped
  Done.
  [root@test-master ~]# ssh test-backup 'service heartbeat start'
  Starting High-Availability services: 2016/08/09_03:08:11 INFO:  Resource is stopped
  Done.
  [root@test-master ~]# ip addr | grep 10.96.20
  inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0
  inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0
  [root@test-master ~]# drbdadm role data
  Primary/Secondary
  [root@test-master ~]# df -h
  Filesystem     Size  Used Avail Use% Mounted on
  /dev/sda2       18G  6.3G   11G 38% /
  tmpfs          112M     0  112M  0% /dev/shm
  /dev/sda1      283M   83M  185M 31% /boot
  /dev/sr0       3.6G  3.6G     0 100% /mnt/cdrom
  /dev/drbd0      946M 1.3M  896M   1% /drbd
  [root@test-master ~]# ls /drbd
  lost+found test1  test10  test2 test3  test4  test5 test6  test7  test8 test9
  [root@test-master ~]# service heartbeat stop
  Stopping High-Availability services: Done.
  [root@test-master ~]# ssh test-backup 'ip addr | grep 10.96.20'
  inet 10.96.20.114/24 brd 10.96.20.255 scope global eth0
  inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0
  [root@test-master ~]# ssh test-backup 'df -h'
  Filesystem     Size  Used Avail Use% Mounted on
  /dev/sda2       18G  3.9G   13G 24% /
  tmpfs          112M     0  112M  0% /dev/shm
  /dev/sda1      283M   83M  185M 31% /boot
  /dev/sr0       3.6G  3.6G     0 100% /mnt/cdrom
  /dev/drbd0     946M  1.3M  896M  1% /drbd
  [root@test-master ~]# ssh test-backup 'ls /drbd'
  lost+found
  test1
  test10
  test2
  test3
  test4
  test5
  test6
  test7
  test8
  test9
  [root@test-master ~]# drbdadm role data
  Secondary/Primary
  [root@test-master ~]# service heartbeat start   #(once a node recovers, first get drbd back in order, and only then start the heartbeat service)
  Starting High-Availability services: INFO:  Resource is stopped
  Done.
  [root@test-master ~]# drbdadm role data
  Primary/Secondary
  [root@test-master ~]# ip addr | grep 10.96.20
  inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0
  inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0
  [root@test-master ~]# df -h
  Filesystem     Size  Used Avail Use% Mounted on
  /dev/sda2       18G  6.3G   11G 38% /
  tmpfs          112M     0  112M  0% /dev/shm
  /dev/sda1       283M  83M  185M  31% /boot
  /dev/sr0       3.6G  3.6G     0 100% /mnt/cdrom
  /dev/drbd0     946M  1.3M  896M  1% /drbd
  [root@test-master ~]# ls /drbd
  lost+found test1  test10  test2 test3  test4  test5 test6  test7  test8 test9
  Note: if both ends show Primary/Unknown or Secondary/Unknown (split brain), recover like this:
  #service heartbeat stop   #(stop heartbeat on both ends)
  #drbdadm secondary data   #(on the backup node: demote its drbd to Secondary)
  #drbdadm disconnect data   #(backup node)
  #drbdadm -- --discard-my-data connect data   #(backup node: reconnect and discard its local changes)
  #drbdadm role data
  #drbdadm connect data   #(on the master node)
  4. Install and configure nfs
  Do all of the following on both master nodes and on nfs slave1:
  [root@test-master ~]# yum -y groupinstall 'NFS file server'
  [root@test-master ~]# rpm -qa nfs-utils rpcbind
  nfs-utils-1.2.3-70.el6_8.1.x86_64
  rpcbind-0.2.0-12.el6.x86_64
  [root@test-master ~]# service rpcbind start
  [root@test-master ~]# service nfs start
  Starting NFS services:                                     [  OK  ]
  Starting NFS quotas:                                       [  OK  ]
  Starting NFS mountd:                                       [  OK  ]
  Starting NFS daemon:                                       [  OK  ]
  Starting RPC idmapd:                                       [  OK  ]
  [root@test-master ~]# chkconfig rpcbind on
  [root@test-master ~]# chkconfig nfs on
  [root@test-master ~]# chkconfig --list rpcbind
  rpcbind            0:off 1:off 2:on 3:on 4:on 5:on 6:off
  [root@test-master ~]# chkconfig --list nfs
  nfs               0:off 1:off 2:on 3:on 4:on 5:on 6:off
  On both master nodes:
  [root@test-master ~]# vim /etc/exports
  /drbd  10.96.20.*(rw,sync,all_squash,anonuid=65534,anongid=65534,mp,fsid=2)
  [root@test-master ~]# chmod 777 -R /drbd
  [root@test-master ~]# service nfs reload   #(equivalent to #exportfs -r)
  5. Test
  Start heartbeat on both master nodes.
  Test from the nfs-slave: everything works.
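  The client-side check can be sketched like this (run on the nfs slave; the mount point is an assumption):
  showmount -e 10.96.20.8                      # the VIP should export /drbd
  mkdir -p /mnt/nfs
  mount -t nfs 10.96.20.8:/drbd /mnt/nfs
  touch /mnt/nfs/write_test && ls /mnt/nfs     # confirm both writes and reads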

  [root@test-master ~]# service heartbeat stop
  Stopping High-Availability services:
  /sbin/service: line 66: 17235 Killed                  env -i PATH="$PATH" TERM="$TERM" "${SERVICEDIR}/${SERVICE}" ${OPTIONS}
  [root@test-master ~]# tail -f /var/log/ha-log   #(this test shows that when heartbeat is stopped, the failover repeatedly fails to unmount the mounted partition and finally forces the server to reboot)
  Filesystem(Filesystem_/dev/drbd0)[19791]:  2016/08/09_04:36:21 INFO: No processes on /drbd were signalled. force_unmount is
  Filesystem(Filesystem_/dev/drbd0)[19791]:  2016/08/09_04:36:22 ERROR: Couldn't unmount /drbd; trying cleanup with KILL
  Filesystem(Filesystem_/dev/drbd0)[19791]:  2016/08/09_04:36:22 INFO: No processes on /drbd were signalled. force_unmount is
  Filesystem(Filesystem_/dev/drbd0)[19791]:  2016/08/09_04:36:23 ERROR: Couldn't unmount /drbd, giving up!
  /usr/lib/ocf/resource.d//heartbeat/Filesystem(Filesystem_/dev/drbd0)[19783]: 2016/08/09_04:36:23 ERROR:  Generic error
  ResourceManager(default)[17256]:        2016/08/09_04:36:23 ERROR: Return code 1 from /etc/ha.d/resource.d/Filesystem
  /usr/lib/ocf/resource.d//heartbeat/Filesystem(Filesystem_/dev/drbd0)[20014]: 2016/08/09_04:36:23 INFO:  Running OK
  ResourceManager(default)[17256]:        2016/08/09_04:36:23 CRIT: Resource STOP failure. Reboot required!
  ResourceManager(default)[17256]:        2016/08/09_04:36:23 CRIT: Killing heartbeat ungracefully!
  [root@test-backup ~]# drbdadm role data   #(after the master server rebooted, the backup node shows it has taken over)
  Primary/Unknown
  [root@test-backup ~]# ip addr
  ……
  2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
  link/ether 00:0c:29:15:e6:bb brd ff:ff:ff:ff:ff:ff
  inet 10.96.20.114/24 brd 10.96.20.255 scope global eth0
  inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0
  inet6 fe80::20c:29ff:fe15:e6bb/64 scope link
  valid_lft forever preferred_lft forever
  [root@test-backup ~]# df -h
  Filesystem     Size  Used Avail Use% Mounted on
  /dev/sda2       18G  3.9G  13G  24% /
  tmpfs          112M     0  112M  0% /dev/shm
  /dev/sda1      283M   83M  185M 31% /boot
  /dev/sr0       3.6G  3.6G     0 100% /mnt/cdrom
  /dev/drbd0     946M  1.3M  896M  1% /drbd
  [root@test-backup ~]# ls /drbd
  lost+found test111  test2  test222.txt test3  test4  test5 test6  test7  test8 test9
  Hot standby between the two master nodes now works, but the nfs slave hangs whenever it tries to mount: the NFS server (the active nfs master) keeps per-client mount state, so the NFS server must be restarted after a switch. Add a script to heartbeat's haresources so that each failover restarts nfs.
  Stop drbd and heartbeat on both master nodes.
  [root@test-master ~]# vim /etc/ha.d/haresources
  test-master    IPaddr::10.96.20.8/24/eth0     drbddisk::data Filesystem::/dev/drbd0::/drbd::ext4    killnfs
  [root@test-master ~]# cd /etc/ha.d/resource.d/
  [root@test-master resource.d]# vim killnfs
  ---------------script start-------------
  #!/bin/bash
  # Kill all nfsd threads so stale client mount state is dropped, then restart NFS.
  for i in {1..10}; do
      killall nfsd       # repeat to be sure every nfsd thread is gone
  done
  service nfs start
  exit 0
  ----------------script end--------------
  [root@test-master resource.d]# chmod 755 killnfs
  [root@test-master resource.d]# ll killnfs
  -rwxr-xr-x. 1 root root 79 Aug  9 21:02 killnfs
  [root@test-master resource.d]# scp killnfs root@test-backup:/etc/ha.d/resource.d/
  killnfs                                                                                                   100%   79     0.1KB/s  00:00
  [root@test-master resource.d]# cd ..
  [root@test-master ha.d]# scp haresources root@test-backup:/etc/ha.d/
  haresources                                                                                               100% 6003     5.9KB/s   00:00
  Get drbd back in order, start heartbeat again, and retest: the nfs slave now rides through a master switchover with no mount failures or hangs.

  Note: the overriding rule when testing: make sure drbd is healthy first, and only then start heartbeat; done in that order nothing goes wrong.
  Note: evolution of Ganji's image architecture


  Note: after a user uploads an image to a web server, the web server POSTs it to the image server selected by the configured ID; PHP on that image server receives the POSTed image, writes it to local disk and returns a success status code; on success, the front-end web server writes the image server's ID and the image path into the DB server. When the user later visits a page, the image server ID and image URL are read from the DB and the image is fetched from the matching image server. A sketch of the upload hop follows.
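  A sketch of that upload hop from the web server's side (the URL, form field and recorded path are illustrative assumptions):
  ---------------script start-------------
  #!/bin/bash
  # POST the uploaded file to the image server chosen by ID (hypothetical URL),
  # and record server ID + path in the DB only if the upload returned success.
  IMG_SERVER_ID=3
  CODE=$(curl -s -o /dev/null -w '%{http_code}' \
         -F "file=@/tmp/upload.jpg" \
         "http://img${IMG_SERVER_ID}.example.com/upload.php")
  if [ "$CODE" = "200" ]; then
      echo "record (server_id=${IMG_SERVER_ID}, path=/2016/08/upload.jpg) in the DB"
  fi
  ----------------script end--------------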
  




