lenf 发表于 2018-12-29 10:44:08

PGSQL主从+keepalived高可用配置

  环境说明:
  主机与IP:

  192.168.11.177 主库
  192.168.11.180 备库
  192.168.11.210VIP
  系统:

  centos7.2
  PGSQL9.6主从已安装配置完成(参考我之前的博客)
  安装配置:
  1、安装配置keepalived
  主备安装:
  # yum install -y keepalived

  主配置:
  # vi /etc/keepalived/keepalived.conf
  bal_defs {
  smtp_server 127.0.0.1
  smtp_connect_timeout 30
  router_id pg
  }
  vrrp_script chk_pgsql {
  script "/etc/keepalived/scripts/pgsql_check.sh"
  interval 2
  weight -5
  fall 2
  rise 1
  }
  vrrp_instance VI_1 {
  state BACKUP
  interface eth0
  virtual_router_id 61
  priority 100
  nopreempt
  advert_int 1
  authentication {
  auth_type PASS
  auth_pass 1111
  }
  track_script {
  chk_pgsql
  }
  virtual_ipaddress {
  192.168.11.210
  }
  }
  备配置:
  # vi /etc/keepalived/keepalived.conf
  bal_defs {
  smtp_server 127.0.0.1
  smtp_connect_timeout 30
  router_id pg
  }
  vrrp_script chk_pgsql {
  script "/etc/keepalived/scripts/pgsql_check.sh"
  interval 2
  weight -5
  fall 2
  rise 1
  }
  vrrp_instance VI_1 {
  state BACKUP
  interface eth0
  virtual_router_id 61
  priority 80
  nopreempt
  advert_int 1
  authentication {
  auth_type PASS
  auth_pass 1111
  }
  track_script {
  chk_pgsql
  }
  virtual_ipaddress {
  192.168.11.210
  }
  }
  注意:

  这里virtual_router_id按照默认的值51会出错bogus VRRP packet received on eth0 !!!,所以改值为61。
  主备创建目录和脚本
  # mkdir /etc/keepalived/scripts

  # vi /etc/keepalived/scripts/pgsql_check.sh
  #!/bin/bash
  #判断pg是否活着
  A=`ps -C postgres --no-header | wc -l`
  #判断vip浮到哪里
  B=`ip a | grep 192.168.11.210 | wc -l`
  #判断是否是从库处于等待的状态
  C=`ps -ef | grep postgres | grep 'startup process' | wc -l`
  #判断从库链接主库是否正常
  D=`ps -ef | grep postgres | grep 'receiver' | wc -l`
  #判断主库连接从库是否正常
  E=`ps -ef | grep postgres | grep 'sender' | wc -l`
  #如果pg死了,将消息写入日记并且关闭keepalived
  if [ $A -eq 0 ];then
  echo "`date "+%Y-%m-%d--%H:%M:%S"` postgresql stop so vip stop " >> /etc/keepalived/log/check_pg.log
  systemctl stop keepalived
  else
  #判断出主挂了,vip浮到了从,提升从的地位让他可读写
  if [ $B -eq 1 -a $C -eq 1 -a $D -eq 0 ];then
  su - postgres -c "pg_ctl promote -D /data/pg_data"
  echo "`date "+%Y-%m-%d--%H:%M:%S"` standby promote " >> /etc/keepalived/log/check_pg.log
  fi
  #判断出自己是主并且和从失去联系
  if [ $B -eq 1 -a $C -eq 0 -a $D -eq 0 -a $E -eq 0 ];then
  sleep 10
  echo "`date "+%Y-%m-%d--%H:%M:%S"` can't find standby " >> /etc/keepalived/log/check_pg.log
  fi
  fi
  主备配置日志:
  修改 /etc/sysconfig/keepalived
  把KEEPALIVED_OPTIONS="-D" 修改为KEEPALIVED_OPTIONS="-D -d -S 0"
  # vi /etc/rsyslog.conf
  加入如下配置:
  #keepalived -S 0
  local0.*/var/log/keepalived.log
  主备启动服务
  # systemctl start keepalived.service
  # systemctl enable keepalived.service
  2、停止主库服务,并切换主库为备库

  停止主库服务,之后发现主库上的VIP消失,备库上的VIP生成,备库变为主库,可以进行建库建表等操作。
  原来的主库切换为备库:
  $ cd /data/pg_data
  $ rm -rf *
  $ pg_basebackup -h 192.168.11.180 -U repuser -D /data/pg_data -X stream -P
  $ mv recovery.done recovery.conf
  $ vi recovery.conf
  primary_conninfo = 'host=192.168.11.177 port=5432 user=repuser password=password123! keepalives_idle=60'
  >>
  primary_conninfo = 'host=192.168.11.180 port=5432 user=repuser password=password123! keepalives_idle=60'
  启动主机keepalived
  # systemctl start keepalived
  3、检查验证

  查看原来备库服务和库状态
  $ ps -ef | grep postgres
  postgres   1081   10 Aug07 ?      00:00:06 /usr/local/postgresql/bin/postgres -D /data/pg_data
  postgres   108310810 Aug07 ?      00:00:01 postgres: checkpointer process
  postgres   108410810 Aug07 ?      00:00:02 postgres: writer process
  postgres   108510810 Aug07 ?      00:00:00 postgres: stats collector process
  postgres1396110810 11:09 ?      00:00:00 postgres: wal writer process
  postgres1396210810 11:09 ?      00:00:00 postgres: autovacuum launcher process
  postgres1396310810 11:09 ?      00:00:00 postgres: archiver process   last was 000000020000000000000010
  postgres2706510810 11:25 ?      00:00:00 postgres: wal sender process repuser 192.168.11.177(47074) streaming 0/11000060
  root   2792235900 11:26 pts/1    00:00:00 grep --color=auto postgres
  postgres=# SELECT pg_is_in_recovery from pg_is_in_recovery();
  pg_is_in_recovery
  -------------------
  f
  (1 row)
  查看原来主库服务和库状态
  $ ps -ef | grep postgres
  postgres   2602   10 11:25 pts/1    00:00:00 /usr/local/postgresql/bin/postgres -D /data/pg_data
  postgres   260326020 11:25 ?      00:00:00 postgres: startup process   recovering 000000020000000000000011
  postgres   260426020 11:25 ?      00:00:00 postgres: checkpointer process
  postgres   260526020 11:25 ?      00:00:00 postgres: writer process
  postgres   260626020 11:25 ?      00:00:00 postgres: stats collector process
  postgres   260726020 11:25 ?      00:00:00 postgres: wal receiver process   streaming 0/11000060
  postgres   261323250 11:25 pts/1    00:00:00 grep --color=auto postgres
  postgres=# SELECT pg_is_in_recovery from pg_is_in_recovery();
  pg_is_in_recovery
  -------------------
  t
  (1 row)
  4、测试结果

  PGSQL高可用测试结果如下:
  高可用方案:
  PGSQL 主从 + keepalived
  资源:
  192.168.11.177 主库(称为服务器A)
  192.168.11.180 备库(称为服务器B)
  192.168.11.210VIP
  1)、模拟A的PGSQL服务停止
  B接管VIP(自动)
  B由从库变为主库(自动)
  启动A的PGSQL服务,并把A配置为备库(手动)
  2)、模拟B的PGSQL服务停止
  A接管VIP(自动)
  A由从库变为主库(自动)
  启动B的PGSQL服务,并把B配置为备库(手动)
  参考:
  https://blog.csdn.net/vanilla_he/article/details/79001890



chenping1995 发表于 2019-8-30 08:02:53

你好,pgsql的主从复制可以给个链接吗?我从库配置后无法启动了
页: [1]
查看完整版本: PGSQL主从+keepalived高可用配置