Implementing Heartbeat + DRBD
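Continuing from the earlier steps: once DRBD is deployed, combine it with heartbeat to make the DRBD service highly available, so that the active node mounts the device automatically and a failure triggers automatic failover. With the earlier deployment in place, only the heartbeat resources need to change, that is, the contents of /etc/ha.d/haresources.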
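1. Preparation
Note: before configuring high availability for DRBD, the DRBD service must be running, and both ends must be in the Secondary state, as shown below: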
# cat /proc/drbd
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2014-11-24 14:51:37
0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
Therefore, DRBD must be started and set to start automatically at boot on both DRBD nodes.
/etc/init.d/drbd start
chkconfig drbd on
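The precondition above (connected, Secondary on both ends, data up to date) can be checked by script before heartbeat is started. The following is a minimal sketch, assuming the DRBD 8.3 /proc/drbd layout shown above; the function name drbd_ready is hypothetical and not part of DRBD.

```shell
#!/bin/sh
# drbd_ready: hypothetical helper. Reads /proc/drbd-style text on stdin and
# succeeds only if at least one resource line is present and every resource
# line is Connected, Secondary/Secondary and UpToDate/UpToDate.
drbd_ready() {
    awk '
        / cs:/ {
            seen = 1
            if ($0 !~ /cs:Connected/ ||
                $0 !~ /ro:Secondary\/Secondary/ ||
                $0 !~ /ds:UpToDate\/UpToDate/)
                bad = 1
        }
        END { exit ((seen && !bad) ? 0 : 1) }
    '
}
```

For example, running `drbd_ready < /proc/drbd || echo "DRBD is not ready" >&2` on each node reports a problem before heartbeat is involved.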
With the above done, edit the haresources file so that it contains the following:
# tail -1 /etc/ha.d/haresources
heartbeat01.contoso.com IPaddr::172.16.49.100/24/eth1 drbddisk::test Filesystem::/dev/drbd0::/data::ext4
# heartbeat01 is shown here as an example; the configuration on heartbeat02 must be identical.
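The haresources line reads as follows: the first field is the preferred node, and each following field is a resource script, with "::" separating the script name from its arguments. Resources are started from left to right, so the VIP comes up first, then DRBD is promoted, then the filesystem is mounted. The sketch below expands such a line into the start commands it implies. The helper name ha_start_cmds is hypothetical, and the sketch only handles the explicit script::argument form used here.

```shell
#!/bin/sh
# ha_start_cmds: hypothetical helper. Prints, one per line and in start
# order, the resource script invocations implied by one haresources line.
ha_start_cmds() (
    set -f          # disable globbing while the line is word-split
    set -- $1       # split the line into fields
    shift           # drop the first field (the preferred node name)
    for res in "$@"; do
        # "::" separates the script name from its arguments
        args=$(printf '%s' "$res" | sed 's/::/ /g')
        printf '/etc/ha.d/resource.d/%s start\n' "$args"
    done
)
```

Calling `ha_start_cmds "$(tail -1 /etc/ha.d/haresources)"` prints three commands for the line above, one each for IPaddr, drbddisk and Filesystem.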
2. Start heartbeat
Then start the heartbeat service on both nodes at the same time:
/etc/init.d/heartbeat start
3. Check the services on both nodes
1) Status on node 1 (heartbeat01):
# ip a |grep 49.100
inet 172.16.49.100/24 brd 172.16.49.255 scope global secondary eth1
As shown, node 1 (heartbeat01) has acquired the VIP.
# cat /proc/drbd
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2014-11-24 14:51:37
0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
ns:4 nr:0 dw:4 dr:709 al:1 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
heartbeat01 is also the Primary node in DRBD.
# mount
/dev/mapper/VolGroup-lv_root on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
/dev/sda1 on /boot type ext4 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
/dev/drbd0 on /data type ext4 (rw)
heartbeat01 has automatically mounted /dev/drbd0 on /data.
# ls /data
10.txt  1.txt   29.txt  38.txt  47.txt  56.txt  65.txt  74.txt  83.txt  92.txt
11.txt  20.txt  2.txt   39.txt  48.txt  57.txt  66.txt  75.txt  84.txt  93.txt
12.txt  21.txt  30.txt  3.txt   49.txt  58.txt  67.txt  76.txt  85.txt  94.txt
13.txt  22.txt  31.txt  40.txt  4.txt   59.txt  68.txt  77.txt  86.txt  95.txt
14.txt  23.txt  32.txt  41.txt  50.txt  5.txt   69.txt  78.txt  87.txt  96.txt
15.txt  24.txt  33.txt  42.txt  51.txt  60.txt  6.txt   79.txt  88.txt  97.txt
16.txt  25.txt  34.txt  43.txt  52.txt  61.txt  70.txt  7.txt   89.txt  98.txt
17.txt  26.txt  35.txt  44.txt  53.txt  62.txt  71.txt  80.txt  8.txt   99.txt
18.txt  27.txt  36.txt  45.txt  54.txt  63.txt  72.txt  81.txt  90.txt  9.txt
19.txt  28.txt  37.txt  46.txt  55.txt  64.txt  73.txt  82.txt  91.txt  lost+found
The files previously synchronized by DRBD are all still there.
2) Status on node 2 (heartbeat02):
# ip a |grep 49.100
Node 2 does not hold the VIP.
# cat /proc/drbd
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2014-11-24 14:51:37
0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
ns:0 nr:4 dw:4 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
Node 2 (heartbeat02) is in the Secondary state in DRBD.
# mount -n
/dev/mapper/VolGroup-lv_root on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
/dev/sda1 on /boot type ext4 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
heartbeat02 has not mounted /dev/drbd0 either.
# ll /data
total 0
As expected, the /data mount point on heartbeat02 is empty.
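The three manual checks in this section (VIP, DRBD role, mount) can be wrapped in small filters so that the same test runs on either node. This is a sketch only: the function names are hypothetical, and it assumes the `ip a`, /proc/drbd and `mount` output formats shown above, where the first role after "ro:" is the local node's role.

```shell
#!/bin/sh
# local_drbd_role: reads /proc/drbd-style text on stdin and prints the
# local role (the part of "ro:local/peer" before the slash).
local_drbd_role() {
    sed -n 's|.* ro:\([A-Za-z]*\)/.*|\1|p'
}

# has_vip: reads `ip a` output on stdin; succeeds if address $1 is configured.
has_vip() {
    grep -qF "inet $1/"
}

# is_mounted: reads `mount` output on stdin; succeeds if device $1 is
# mounted on directory $2.
is_mounted() {
    grep -qF "$1 on $2 "
}
```

On the active node, `ip a | has_vip 172.16.49.100`, `local_drbd_role < /proc/drbd` and `mount | is_mounted /dev/drbd0 /data` should succeed, print Primary and succeed respectively; on the standby node the first and last should fail and the role should be Secondary.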
4. Simulate a failover
Now stop the heartbeat service on heartbeat01 and check whether the DRBD device is automatically mounted on heartbeat02.
# /etc/init.d/heartbeat stop
Stopping High-Availability services: Done.
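When this test is scripted rather than run by hand, the surviving node can be polled until the takeover is visible instead of being checked at an arbitrary moment. The helper below is a generic retry loop and a sketch only; the name wait_for and the 30-second limit in the usage note are assumptions, not values from this setup.

```shell
#!/bin/sh
# wait_for: hypothetical helper. Retries a command once per second until it
# succeeds, or fails once the limit in seconds ($1) has elapsed.
wait_for() {
    limit=$1
    shift
    elapsed=0
    until "$@"; do
        if [ "$elapsed" -ge "$limit" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

On heartbeat02, for instance, `wait_for 30 grep -q 'ro:Primary/' /proc/drbd` returns as soon as the local DRBD role has become Primary.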
1) Status on node 1 (heartbeat01):
# ip a|grep 49.100
# cat /proc/drbd
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2014-11-24 14:51:37
0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
ns:16 nr:4 dw:20 dr:1418 al:1 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
# ll /data
total 0
2) Status on node 2 (heartbeat02):
# ip a |grep 49.100
inet 172.16.49.100/24 brd 172.16.49.255 scope global secondary eth1
# cat /proc/drbd
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2014-11-24 14:51:37
0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
ns:4 nr:16 dw:20 dr:705 al:1 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
# ls /data
10.txt  1.txt   29.txt  38.txt  47.txt  56.txt  65.txt  74.txt  83.txt  92.txt
11.txt  20.txt  2.txt   39.txt  48.txt  57.txt  66.txt  75.txt  84.txt  93.txt
12.txt  21.txt  30.txt  3.txt   49.txt  58.txt  67.txt  76.txt  85.txt  94.txt
13.txt  22.txt  31.txt  40.txt  4.txt   59.txt  68.txt  77.txt  86.txt  95.txt
14.txt  23.txt  32.txt  41.txt  50.txt  5.txt   69.txt  78.txt  87.txt  96.txt
15.txt  24.txt  33.txt  42.txt  51.txt  60.txt  6.txt   79.txt  88.txt  97.txt
16.txt  25.txt  34.txt  43.txt  52.txt  61.txt  70.txt  7.txt   89.txt  98.txt
17.txt  26.txt  35.txt  44.txt  53.txt  62.txt  71.txt  80.txt  8.txt   99.txt
18.txt  27.txt  36.txt  45.txt  54.txt  63.txt  72.txt  81.txt  90.txt  9.txt
19.txt  28.txt  37.txt  46.txt  55.txt  64.txt  73.txt  82.txt  91.txt  lost+found
3) Check the log on heartbeat02:
Sep 26 00:32:04 heartbeat02.contoso.com heartbeat: : info: Received shutdown notice from 'heartbeat01.contoso.com'.
Sep 26 00:32:04 heartbeat02.contoso.com heartbeat: : info: Resources being acquired from heartbeat01.contoso.com.
Sep 26 00:32:04 heartbeat02.contoso.com heartbeat: : info: acquire local HA resources (standby).
Sep 26 00:32:04 heartbeat02.contoso.com heartbeat: : info: local HA resource acquisition completed (standby).
Sep 26 00:32:04 heartbeat02.contoso.com heartbeat: : info: Standby resource acquisition done .
Sep 26 00:32:04 heartbeat02.contoso.com heartbeat: : info: No local resources to acquire.
harc(default): 2016/09/26_00:32:04 info: Running /etc/ha.d//rc.d/status status
mach_down(default): 2016/09/26_00:32:04 info: Taking over resource group IPaddr::172.16.49.100/24/eth1
ResourceManager(default): 2016/09/26_00:32:04 info: Acquiring resource group: heartbeat01.contoso.com IPaddr::172.16.49.100/24/eth1 drbddisk::test Filesystem::/dev/drbd0::/data::ext4
/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_172.16.49.100): 2016/09/26_00:32:04 INFO:Resource is stopped
ResourceManager(default): 2016/09/26_00:32:04 info: Running /etc/ha.d/resource.d/IPaddr 172.16.49.100/24/eth1 start
IPaddr(IPaddr_172.16.49.100): 2016/09/26_00:32:04 INFO: Adding inet address 172.16.49.100/24 with broadcast address 172.16.49.255 to device eth1
IPaddr(IPaddr_172.16.49.100): 2016/09/26_00:32:04 INFO: Bringing device eth1 up
IPaddr(IPaddr_172.16.49.100): 2016/09/26_00:32:04 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-172.16.49.100 eth1 172.16.49.100 auto not_used not_used
/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_172.16.49.100): 2016/09/26_00:32:04 INFO:Success
ResourceManager(default): 2016/09/26_00:32:04 info: Running /etc/ha.d/resource.d/drbddisk test start
/usr/lib/ocf/resource.d//heartbeat/Filesystem(Filesystem_/dev/drbd0): 2016/09/26_00:32:04 INFO:Resource is stopped
ResourceManager(default): 2016/09/26_00:32:04 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /data ext4 start
Filesystem(Filesystem_/dev/drbd0): 2016/09/26_00:32:04 INFO: Running start for /dev/drbd0 on /data
/usr/lib/ocf/resource.d//heartbeat/Filesystem(Filesystem_/dev/drbd0): 2016/09/26_00:32:04 INFO:Success
mach_down(default): 2016/09/26_00:32:04 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
mach_down(default): 2016/09/26_00:32:04 info: mach_down takeover complete for node heartbeat01.contoso.com.
Sep 26 00:32:04 heartbeat02.contoso.com heartbeat: : info: mach_down takeover complete.
Sep 26 00:32:36 heartbeat02.contoso.com heartbeat: : WARN: node heartbeat01.contoso.com: is dead
Sep 26 00:32:36 heartbeat02.contoso.com heartbeat: : info: Dead node heartbeat01.contoso.com gave up resources.
Sep 26 00:32:36 heartbeat02.contoso.com heartbeat: : info: Link heartbeat01.contoso.com:eth1 dead.
Sep 26 00:32:36 heartbeat02.contoso.com ipfail: : info: Status update: Node heartbeat01.contoso.com now has status dead
Sep 26 00:32:38 heartbeat02.contoso.com ipfail: : info: NS: We are dead. :<
Sep 26 00:32:38 heartbeat02.contoso.com ipfail: : info: Link Status update: Link heartbeat01.contoso.com/eth1 now has status dead
Sep 26 00:32:39 heartbeat02.contoso.com ipfail: : info: We are dead. :<
Sep 26 00:32:39 heartbeat02.contoso.com ipfail: : info: Asking other side for ping node count.
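The log above is long, but the takeover itself comes down to the three "Running /etc/ha.d/resource.d/..." lines: IPaddr, then drbddisk, then Filesystem, in the order given in haresources. A small filter can pull out just those lines. This is a sketch; the function name is hypothetical, and the log path in the usage note is an assumption, since the actual path is whatever the logfile directive in ha.cf points to.

```shell
#!/bin/sh
# ha_resource_actions: hypothetical helper. Reads heartbeat log text on stdin
# and prints only the resource script actions (script, arguments, operation).
ha_resource_actions() {
    sed -n 's|.*info: Running /etc/ha.d/resource.d/||p'
}
```

Usage, assuming the log is written to /var/log/ha-log: `ha_resource_actions < /var/log/ha-log`.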