华风 posted on 2018-12-29 08:09:30

keepalived cluster high availability

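    keepalived is an implementation based on VRRP. It was originally written to provide high availability for ipvs, and it can also health-check the backend real servers. It was later extended to provide high availability for other services: it can call external scripts to monitor resource state and fail over accordingly. It suits setups with few nodes and no shared storage. Compared with heartbeat, corosync, and RHCS, it is a lightweight high-availability solution.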


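    keepalived has two main components: one implements and manages the VRRP protocol, the other monitors resources. Hosts running VRRP send multicast advertisements to each other to announce their priority and other attributes. Priorities range from 0 to 255: 0 means the host does not take part in the election (in the VRRP specification a master advertises 0 when it gives up the role), and 255 is the highest priority. The host with the highest priority becomes MASTER and brings up the VIP and VMAC, and every packet sent to the VIP is handled by the MASTER. When the MASTER fails or one of its resources has a problem, it stops sending heartbeats or lowers its own priority. When a BACKUP host finds that its priority is now higher than the peer's, or can no longer detect the peer, it promotes itself to MASTER. When the original MASTER recovers, by default it takes the MASTER role back (preemption); this can be disabled in the configuration.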
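The election rule described above can be sketched in a few lines of shell. This is only a simplified model, not how keepalived itself is implemented: the function name `elect_master` and the `name:priority` argument format are made up for illustration, the node names are examples, and the tie-break that real VRRP performs on the primary IP address is ignored.

```shell
#!/bin/bash
# Simplified model of the VRRP election (hypothetical helper, not part of keepalived).
# Each argument is "name:priority"; the node with the highest priority wins.
# Ties are not handled the VRRP way here: the first node listed simply wins.
elect_master() {
    local best_name="" best_prio=-1 entry name prio
    for entry in "$@"; do
        name=${entry%%:*}
        prio=${entry##*:}
        if [ "$prio" -gt "$best_prio" ]; then
            best_name=$name
            best_prio=$prio
        fi
    done
    echo "$best_name"
}

elect_master node4:100 node5:95   # prints node4
elect_master node5:95             # node4 is unreachable: prints node5
```

Calling it with only the surviving node models the case where the BACKUP can no longer detect the MASTER.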


CentOS 6.4 ships a keepalived rpm package, so it can be installed directly:


# yum -y install keepalived
Installed:
keepalived.x86_64 0:1.2.7-3.el6


Dependency Installed:
lm_sensors-libs.x86_64 0:3.1.1-17.el6   net-snmp-libs.x86_64 1:5.5-44.el6


The configuration file is /etc/keepalived/keepalived.conf and the service init script is /etc/init.d/keepalived.


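# cat keepalived.conf
! Configuration File for keepalived


global_defs {
    notification_email {
        acassen@firewall.loc    # recipients notified on state changes; one address per line
    }
    notification_email_from Alexandre.Cassen@firewall.loc    # sender address
    smtp_server 192.168.200.1    # SMTP server address
    smtp_connect_timeout 30      # connection timeout in seconds
    router_id LVS_DEVEL          # identifier of this VRRP host
}


vrrp_script chk_state_down {
    script "[[ -f /etc/keepalived/down ]] && exit 1 || exit 0"    # custom check: fails while this file exists
    interval 1     # run the check every 1s
    weight -10     # subtract 10 from the priority while the check fails
}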


vrrp_instance VI_1 {         # define a VRRP instance
    state MASTER             # initial state is MASTER
    interface eth0           # interface that carries the VIP
    virtual_router_id 10     # group ID; must be identical on all hosts in the same group
    priority 100             # priority
    advert_int 1             # advertisement interval in seconds
    authentication {         # authentication
        auth_type PASS       # type
        auth_pass 1111       # password
    }
    virtual_ipaddress {
        172.16.1.250         # the VIP
    }

    track_script {
        chk_state_down
    }
}


virtual_server 172.16.1.100 80 {
    delay_loop 6
    lb_algo rr
    lb_kind NAT
    nat_mask 255.255.255.0
    persistence_timeout 50
    protocol TCP

    real_server 172.16.1.2 80 {
        weight 1
        HTTP_GET {                  # health-check the backend real server with an HTTP GET
            url {
                path /
                status_code 200     # expect status code 200 in the response
            }
            connect_timeout 3       # check timeout of 3s
            nb_get_retry 3          # retry 3 times
            delay_before_retry 3    # interval between retries
        }
    }
}
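To make the retry settings of the HTTP_GET check more concrete, here is a rough shell sketch of the same idea. It is not keepalived's own code: `check_with_retry` and `http_status_is_200` are hypothetical helpers, curl stands in for keepalived's built-in HTTP client, and whether the first attempt counts toward `nb_get_retry` is an assumption of this sketch.

```shell
#!/bin/bash
# Run a check command up to (1 + RETRIES) times, sleeping DELAY seconds between
# attempts; succeed as soon as one attempt succeeds. (Hypothetical helper.)
# Usage: check_with_retry RETRIES DELAY COMMAND [ARGS...]
check_with_retry() {
    local retries=$1 delay=$2 attempt=0
    shift 2
    while [ "$attempt" -le "$retries" ]; do
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -lt "$retries" ]; then
            sleep "$delay"
        fi
        attempt=$(( attempt + 1 ))
    done
    return 1
}

# One probe: succeed only if the server answers GET <url> with status 200
# within 3 seconds (mirrors status_code 200 and connect_timeout 3).
http_status_is_200() {
    [ "$(curl -s -o /dev/null --max-time 3 -w '%{http_code}' "$1")" = "200" ]
}

# Example call on a live network (not executed here):
#   check_with_retry 3 3 http_status_is_200 http://172.16.1.2/
```

If the call returns non-zero, the real server would be treated as failed, which is the point at which keepalived removes it from the ipvs table.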




Sending an email notification when a host changes state
First, define a script:
#!/bin/bash
#
vip=172.16.1.100
contact='root@localhost'
# Send a mail reporting the new VRRP state passed as $1
notify () {
    mailsubject="$(hostname) became $1, $vip floated."
    mailbody="$(date +'%F %T'): vrrp status changed. $(hostname) became $1"
    echo "$mailbody" | mail -s "$mailsubject" "$contact"
}


case $1 in
master)
    notify master ;;
backup)
    notify backup ;;
fault)
    notify fault ;;
*)
    echo "Usage: $(basename "$0") {master|backup|fault}"
    exit 1 ;;
esac
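Before wiring the script into keepalived, it can be handy to see what it would send without delivering any mail. The following dry-run sketch repeats the `notify` logic and replaces `mail` with a shell function that only prints; the stub and the fallback to `uname -n` are additions for illustration, not part of the original script.

```shell
#!/bin/bash
# Dry-run harness: same notify() logic as the script, but `mail` is replaced
# by a stub that prints instead of sending anything.
vip=172.16.1.100
contact='root@localhost'

mail() {
    # Stub with the calling convention used by notify(): mail -s SUBJECT RECIPIENT
    local subject=$2 recipient=$3
    echo "To: $recipient"
    echo "Subject: $subject"
    cat    # the message body arrives on stdin
}

notify() {
    local host mailsubject mailbody
    host=$(hostname 2>/dev/null || uname -n)
    mailsubject="$host became $1, $vip floated."
    mailbody="$(date +'%F %T'): vrrp status changed. $host became $1"
    echo "$mailbody" | mail -s "$mailsubject" "$contact"
}

notify master
```

Running it prints the recipient, the subject, and the body that the real script would hand to `mail`.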


Edit the configuration file:
vrrp_instance Instance1 {
    state MASTER
    interface eth0
    virtual_router_id 10
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1119
    }
    virtual_ipaddress {
        172.16.1.100
    }
    track_script {
        chk_state_down
    }
    notify_master "/etc/keepalived/notify.sh master"
    notify_backup "/etc/keepalived/notify.sh backup"
    notify_fault "/etc/keepalived/notify.sh fault"
}








# cat keepalived.conf
! Configuration File for keepalived


global_defs {
notification_email {
    root@localhost
}
notification_email_from admin@163.com
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id Node4.magedu.com
}


vrrp_script chk_state_down {
    script "[[ -f /etc/keepalived/down ]] && exit 1 || exit 0"
    interval 1
    weight -10
}


vrrp_instance Instance1 {
    state MASTER
    interface eth0
    virtual_router_id 10
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1119
    }
    virtual_ipaddress {
        172.16.1.100
    }
    track_script {
        chk_state_down
    }
    notify_master "/etc/keepalived/notify.sh master"
    notify_backup "/etc/keepalived/notify.sh backup"
    notify_fault "/etc/keepalived/notify.sh fault"
}


virtual_server 172.16.1.100 80 {
    delay_loop 6
    lb_algo rr
    lb_kind NAT
    nat_mask 255.255.255.0
    persistence_timeout 50
    protocol TCP

    real_server 172.16.1.3 80 {
        weight 1
        HTTP_GET {
            url {
                path /
                status_code 200
            }
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
    real_server 172.16.1.2 80 {
        weight 1
        HTTP_GET {
            url {
                path /
                status_code 200
            }
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
}


If the health checks succeed, the ipvs rules are generated automatically:
# ipvsadm -L -n
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  172.16.1.100:80 rr persistent 50
  -> 172.16.1.2:80                Masq    1      0          0
  -> 172.16.1.3:80                Masq    1      0          0
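For comparison, the same table could be built by hand with ipvsadm. The sketch below only prints the commands rather than running them (running them needs root and the ip_vs module); `ipvs_commands` is a hypothetical helper, and the scheduler, persistence timeout, and weight are hard-coded to the values in the configuration above.

```shell
#!/bin/bash
# Print (do not execute) the ipvsadm commands matching the table above.
# -A add virtual service, -t TCP service, -s scheduler, -p persistence timeout,
# -a add real server, -r real server address, -m masquerading (NAT), -w weight.
# Usage: ipvs_commands VIP:PORT REALSERVER:PORT...
ipvs_commands() {
    local vip_port=$1 rs
    shift
    echo "ipvsadm -A -t $vip_port -s rr -p 50"
    for rs in "$@"; do
        echo "ipvsadm -a -t $vip_port -r $rs -m -w 1"
    done
}

ipvs_commands 172.16.1.100:80 172.16.1.2:80 172.16.1.3:80
```

keepalived maintains these entries itself, so the commands are only useful for understanding what the rules mean or for testing ipvs without keepalived.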
When one real server fails, the rules are updated automatically:
# ipvsadm -L -n
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  172.16.1.100:80 rr persistent 50
  -> 172.16.1.3:80                Masq    1      0          0
# tail /var/log/messages
Sep 25 04:22:20 node5 Keepalived_vrrp: VRRP_Instance(Instance1) Sending gratuitous ARPs on eth0 for 172.16.1.100
Sep 25 04:22:25 node5 Keepalived_vrrp: VRRP_Instance(Instance1) Sending gratuitous ARPs on eth0 for 172.16.1.100
Sep 25 04:22:28 node5 Keepalived_vrrp: VRRP_Instance(Instance1) Received higher prio advert
Sep 25 04:22:28 node5 Keepalived_vrrp: VRRP_Instance(Instance1) Entering BACKUP STATE
Sep 25 04:22:28 node5 Keepalived_vrrp: VRRP_Instance(Instance1) removing protocol VIPs.
Sep 25 04:22:28 node5 Keepalived_healthcheckers: Netlink reflector reports IP 172.16.1.100 removed
Sep 25 04:36:11 node5 Keepalived_healthcheckers: Error connecting server :80.
Sep 25 04:36:11 node5 Keepalived_healthcheckers: Removing service :80 from VS:80
Sep 25 04:36:11 node5 Keepalived_healthcheckers: Remote SMTP server :25 connected.
Sep 25 04:36:11 node5 Keepalived_healthcheckers: SMTP alert successfully sent.


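# How to provide high availability for a specific service:
Idea:
Let vrrp_script monitor whether the program is running. When the service stops, the priority is lowered, and the lower priority triggers a VRRP state transition. On the transition the system sends an email, so the administrator can deal with the fault promptly. Once the service is started again, the priority returns to its original value. Configure preempt according to the actual situation. If the fault is left unhandled and the service on the second machine also stops, the whole system stops working.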
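The priority arithmetic behind this idea can be sketched as follows. The helper `effective_priority` is hypothetical and models only one tracked script with a negative weight; the base priority of 100 and the weight of -20 are taken from the configuration in this post, while the peer priority of 95 mentioned afterwards is an assumed value.

```shell
#!/bin/bash
# Effective priority for one tracked script with a NEGATIVE weight:
# while the script fails, |weight| is subtracted from the base priority.
# (Positive weights behave differently and are not modelled here.)
# Usage: effective_priority BASE WEIGHT SCRIPT_EXIT_STATUS
effective_priority() {
    local base=$1 weight=$2 status=$3
    if [ "$weight" -lt 0 ] && [ "$status" -ne 0 ]; then
        echo $(( base + weight ))
    else
        echo "$base"
    fi
}

effective_priority 100 -20 0   # httpd running: prints 100
effective_priority 100 -20 1   # httpd stopped: prints 80
```

With a peer at priority 95, the drop from 100 to 80 is what lets the peer win the election and take over the VIP.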
# cat keepalived.conf
! Configuration File for keepalived


global_defs {
notification_email {
    root@localhost
}
notification_email_from admin@163.com
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id Node4.magedu.com
}


vrrp_script chk_state_down {
    script "[[ -f /etc/keepalived/down ]] && exit 1 || exit 0"
    interval 1
    weight -2
}
vrrp_script chk_httpd {
    script "killall -0 httpd"
    interval 1
    fall 2
    rise 1
    weight -20
}


vrrp_instance Instance1 {
    state MASTER
    interface eth0
    virtual_router_id 10
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1119
    }
    virtual_ipaddress {
        172.16.1.100
    }
    track_script {
        chk_state_down
        chk_httpd
    }
    notify_master "/etc/keepalived/notify.sh master"
    notify_backup "/etc/keepalived/notify.sh backup"
    notify_fault "/etc/keepalived/notify.sh fault"
}
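The `fall 2` and `rise 1` settings of chk_httpd mean that two consecutive failures mark the script as failed and a single success marks it as healthy again. A small shell model of that counting is shown below; `track_state` is a made-up helper, and starting in the "up" state is an assumption of this sketch rather than a statement about keepalived's startup behaviour.

```shell
#!/bin/bash
# Feed a sequence of script exit statuses (0 = ok) through fall/rise counting
# and print the resulting state after each run. (Hypothetical helper.)
# Usage: track_state FALL RISE STATUS...
track_state() {
    local fall=$1 rise=$2 state=up fails=0 oks=0 status
    shift 2
    for status in "$@"; do
        if [ "$status" -eq 0 ]; then
            fails=0
            oks=$(( oks + 1 ))
            if [ "$state" = down ] && [ "$oks" -ge "$rise" ]; then
                state=up
            fi
        else
            oks=0
            fails=$(( fails + 1 ))
            if [ "$state" = up ] && [ "$fails" -ge "$fall" ]; then
                state=down
            fi
        fi
        echo "$state"
    done
}

# fall 2, rise 1: a single failure is tolerated, two in a row mark the
# script as failed, and one success recovers it.
track_state 2 1 1 0 1 1 0 | tr '\n' ' '; echo   # prints: up up up down up
```

With `interval 1`, this means httpd has to be missing for two checks in a row, about two seconds, before the priority is lowered.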
# cat notify.sh
#!/bin/bash
vip=172.16.1.100
contact='root@localhost'
# Send a mail reporting the new VRRP state passed as $1
notify () {
    mailsubject="$(hostname) became $1, $vip floated."
    mailbody="$(date +'%F %T'): vrrp status changed. $(hostname) became $1"
    echo "$mailbody" | mail -s "$mailsubject" "$contact"
}


case $1 in
    master)
      notify master
      /etc/init.d/httpd start ;;
    backup)
      notify backup
      /etc/init.d/httpd stop ;;
    fault)
      notify fault
      /etc/init.d/httpd stop ;;
    *)
      echo "Usage: $(basename "$0") {master|backup|fault}"
      exit 1 ;;
esac


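keepalived dual-master model:
Provide two VRRP instances and two VIPs. Each node is MASTER for one instance and BACKUP for the other; the configuration below is for one node, and on the peer the state and priority of the two instances are swapped.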


vrrp_instance Instance1 {
    state MASTER
    interface eth0
    virtual_router_id 10
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1119
    }
    virtual_ipaddress {
        172.16.1.100
    }
    track_script {
        chk_state_down
        chk_httpd
    }
    notify_master "/etc/keepalived/notify.sh master"
    notify_backup "/etc/keepalived/notify.sh backup"
    notify_fault "/etc/keepalived/notify.sh fault"
}


vrrp_instance Instance2 {
    state BACKUP
    interface eth0
    virtual_router_id 20
    priority 95
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 2119
    }
    virtual_ipaddress {
        172.16.1.101
    }
    track_script {
        chk_state_down
        chk_httpd
    }
    notify_master "/etc/keepalived/notify.sh master"
    notify_backup "/etc/keepalived/notify.sh backup"
    notify_fault "/etc/keepalived/notify.sh fault"
}
Restart the keepalived service:
# service keepalived restart
# ip addr show
1: lo: mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:85:22:ac brd ff:ff:ff:ff:ff:ff
    inet 172.16.1.4/16 brd 172.16.255.255 scope global eth0
    inet 172.16.1.100/32 scope global eth0
    inet6 fe80::20c:29ff:fe85:22ac/64 scope link
       valid_lft forever preferred_lft forever




When the other host fails, this host holds both VIPs:
# ip addr show
1: lo: mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:85:22:ac brd ff:ff:ff:ff:ff:ff
    inet 172.16.1.4/16 brd 172.16.255.255 scope global eth0
    inet 172.16.1.100/32 scope global eth0
    inet 172.16.1.101/32 scope global eth0
    inet6 fe80::20c:29ff:fe85:22ac/64 scope link
       valid_lft forever preferred_lft forever
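Reading the address list by eye gets tedious when testing failover repeatedly. One possible way to check which VIPs a node currently holds is to filter the `ip addr show` output, as in this sketch; `held_vips` is a hypothetical helper, and the sample input is copied from the output above.

```shell
#!/bin/bash
# Print which of the given VIPs appear in `ip addr show`-style text on stdin.
# Usage: ip addr show eth0 | held_vips VIP...
held_vips() {
    local text vip
    text=$(cat)
    for vip in "$@"; do
        # Match "inet <vip>/" so that 172.16.1.10 does not match 172.16.1.100
        if printf '%s\n' "$text" | grep -qF "inet $vip/"; then
            echo "$vip"
        fi
    done
}

# Sample input taken from the listing above; on a live node use:
#   ip addr show eth0 | held_vips 172.16.1.100 172.16.1.101
held_vips 172.16.1.100 172.16.1.101 <<'EOF'
inet 172.16.1.4/16 brd 172.16.255.255 scope global eth0
inet 172.16.1.100/32 scope global eth0
inet 172.16.1.101/32 scope global eth0
EOF
```

On a healthy dual-master pair each node should report exactly one VIP; a node reporting both means its peer has failed over to it.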






