HeartBeat + NFS Configuration
Continuing from the previous post, which covered the DRBD system environment, installation, and configuration...
I. Heartbeat Configuration
1. Install heartbeat
# yum install epel-release -y
# yum --enablerepo=epel install heartbeat -y
2. Set up the heartbeat configuration file
(node1)
Edit ha.cf and add the following:
# vi /etc/ha.d/ha.cf
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 5
ucast eth0 192.168.0.192    # the peer node's NIC and IP
auto_failback off
node drbd1.corp.com drbd2.corp.com
(node2)
Edit ha.cf and add the following:
# vi /etc/ha.d/ha.cf
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 5
ucast eth0 192.168.0.191
auto_failback off
node drbd1.corp.com drbd2.corp.com
3. Edit the inter-node authentication file authkeys and add the following: (node1, node2)
# vi /etc/ha.d/authkeys
auth 1
1 crc
Set the authentication file to mode 600:
# chmod 600 /etc/ha.d/authkeys
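Note that crc provides integrity checking only, with no real authentication; it is fine on a dedicated crossover link, but on a shared network sha1 with a shared secret is the usual choice. A hypothetical variant (the secret string here is an assumption, pick your own):

```shell
# vi /etc/ha.d/authkeys
auth 1
1 sha1 MySharedSecret    # hypothetical secret string; replace with your own
```

Whichever method is used, the file must be mode 600 and identical on both nodes.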
4. Edit the cluster resources file: (node1, node2)
# vi /etc/ha.d/haresources
drbd1.corp.com IPaddr::192.168.0.190/24/eth0 drbddisk::r0 Filesystem::/dev/drbd0::/store::ext4 killnfsd
Note: The IPaddr, Filesystem, and other scripts referenced in this file live under /etc/ha.d/resource.d/. You can also place service start scripts of your own (e.g. mysql, www) in that directory and add the matching script name to /etc/ha.d/haresources, so the service is started along with heartbeat.
IPaddr::192.168.0.190/24/eth0: use the IPaddr script to configure the floating virtual IP that serves clients
drbddisk::r0: use the drbddisk script to switch the DRBD resource r0 between the Primary and Secondary roles on the two nodes
Filesystem::/dev/drbd0::/store::ext4: use the Filesystem script to mount and unmount the filesystem
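To make the `::` syntax concrete: heartbeat splits each haresources token on `::`, treats the first piece as a script name under /etc/ha.d/resource.d/ (or /etc/init.d/), and passes the remaining pieces plus start or stop as arguments. Roughly, a takeover by the active node amounts to:

```shell
# Sketch of what heartbeat runs on takeover (left to right),
# derived from the haresources line above:
/etc/ha.d/resource.d/IPaddr 192.168.0.190/24/eth0 start
/etc/ha.d/resource.d/drbddisk r0 start
/etc/ha.d/resource.d/Filesystem /dev/drbd0 /store ext4 start
/etc/ha.d/resource.d/killnfsd start
# On release, the same scripts are run right to left with "stop".
```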
5. Edit the killnfsd script, which restarts the NFS service: (node1, node2)
# vi /etc/ha.d/resource.d/killnfsd
killall -9 nfsd; /etc/init.d/nfs restart; exit 0
Make it executable (mode 755):
# chmod 755 /etc/ha.d/resource.d/killnfsd
II. Create the DRBD resource script drbddisk: (node1, node2)
Edit drbddisk and add the following script:
# vi /etc/ha.d/resource.d/drbddisk
#!/bin/bash
#
# This script is intended to be used as resource script by heartbeat
#
# Copyright 2003-2008 LINBIT Information Technologies
# Philipp Reisner, Lars Ellenberg
#
###

DEFAULTFILE="/etc/default/drbd"
DRBDADM="/sbin/drbdadm"

if [ -f $DEFAULTFILE ]; then
    . $DEFAULTFILE
fi

if [ "$#" -eq 2 ]; then
    RES="$1"
    CMD="$2"
else
    RES="all"
    CMD="$1"
fi

## EXIT CODES
# since this is a "legacy heartbeat R1 resource agent" script,
# exit codes actually do not matter that much as long as we conform to
# http://wiki.linux-ha.org/HeartbeatResourceAgent
# but it does not hurt to conform to lsb init-script exit codes,
# where we can.
# http://refspecs.linux-foundation.org/LSB_3.1.0/
#LSB-Core-generic/LSB-Core-generic/iniscrptact.html
####

drbd_set_role_from_proc_drbd()
{
    local out
    if ! test -e /proc/drbd; then
        ROLE="Unconfigured"
        return
    fi

    dev=$( $DRBDADM sh-dev $RES )
    minor=${dev#/dev/drbd}
    if [[ $minor = *[!0-9]* ]] ; then
        # sh-minor is only supported since drbd 8.3.1
        minor=$( $DRBDADM sh-minor $RES )
    fi
    if [[ -z $minor ]] || [[ $minor = *[!0-9]* ]] ; then
        ROLE=Unknown
        return
    fi

    if out=$(sed -ne "/^ *$minor: cs:/ { s/:/ /g; p; q; }" /proc/drbd); then
        set -- $out
        ROLE=${5%/**}
        : ${ROLE:=Unconfigured} # if it does not show up
    else
        ROLE=Unknown
    fi
}

case "$CMD" in
    start)
        # try several times, in case heartbeat deadtime
        # was smaller than drbd ping time
        try=6
        while true; do
            $DRBDADM primary $RES && break
            let "--try" || exit 1 # LSB generic error
            sleep 1
        done
        ;;
    stop)
        # heartbeat (haresources mode) will retry failed stop
        # for a number of times in addition to this internal retry.
        try=3
        while true; do
            $DRBDADM secondary $RES && break
            # We used to lie here, and pretend success for anything != 11,
            # to avoid the reboot on failed stop recovery for "simple
            # config errors" and such. But that is incorrect.
            # Don't lie to your cluster manager.
            # And don't do config errors...
            let --try || exit 1 # LSB generic error
            sleep 1
        done
        ;;
    status)
        if [ "$RES" = "all" ]; then
            echo "A resource name is required for status inquiries."
            exit 10
        fi
        ST=$( $DRBDADM role $RES )
        ROLE=${ST%/**}
        case $ROLE in
        Primary|Secondary|Unconfigured)
            # expected
            ;;
        *)
            # unexpected. whatever...
            # If we are unsure about the state of a resource, we need to
            # report it as possibly running, so heartbeat can, after failed
            # stop, do a recovery by reboot.
            # drbdsetup may fail for obscure reasons, e.g. if /var/lock/ is
            # suddenly readonly. So we retry by parsing /proc/drbd.
            drbd_set_role_from_proc_drbd
        esac
        case $ROLE in
            Primary)
                echo "running (Primary)"
                exit 0 # LSB status "service is OK"
                ;;
            Secondary|Unconfigured)
                echo "stopped ($ROLE)"
                exit 3 # LSB status "service is not running"
                ;;
            *)
                # NOTE the "running" in below message.
                # this is a "heartbeat" resource script,
                # the exit code is _ignored_.
                echo "cannot determine status, may be running ($ROLE)"
                exit 4 # LSB status "service status is unknown"
                ;;
        esac
        ;;
    *)
        echo "Usage: drbddisk [resource] {start|stop|status}"
        exit 1
        ;;
esac

exit 0
Make it executable (mode 755):
# chmod 755 /etc/ha.d/resource.d/drbddisk
III. Start the HeartBeat Service
Start the HeartBeat service on both nodes, node1 first: (node1, node2)
# service heartbeat start
# chkconfig heartbeat on
If the virtual IP 192.168.0.190 can now be pinged from another machine, the configuration is working.
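A quick way to check, assuming the topology above:

```shell
# From any other machine on the LAN: the floating IP should answer
ping -c 3 192.168.0.190

# On whichever node is currently active, heartbeat should have added
# the VIP as an extra address on eth0
ip addr show eth0 | grep 192.168.0.190
```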
IV. Configure NFS: (node1, node2)
Edit the exports file and add the following:
# vi /etc/exports
/store *(rw,no_root_squash)
Restart the NFS services:
# service rpcbind restart
# service nfs restart
# chkconfig rpcbind on
# chkconfig nfs off
Note: NFS is deliberately not set to start at boot here, because the /etc/ha.d/resource.d/killnfsd script controls starting NFS.
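Once NFS is running, the export can be checked from either node or from a client (showmount queries the host's mountd):

```shell
# Should list /store with the * client wildcard
showmount -e 192.168.0.190
```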
V. Testing High Availability
1. Normal hot-standby switchover
Mount the NFS share on a client:
# mount -t nfs 192.168.0.190:/store /tmp
Stop the heartbeat service on the primary node node1 to simulate a failure; the standby node node2 takes over immediately and seamlessly, and reads and writes on the client's NFS mount keep working.
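Concretely, the test can be run like this, assuming the client mount above (the test file name is arbitrary):

```shell
# On node1: simulate a graceful failure
service heartbeat stop

# On the client: I/O through the mount should keep working across the takeover
dd if=/dev/zero of=/tmp/failover-test bs=1M count=10
ls -l /tmp/failover-test
```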
DRBD status on the standby node node2 at this point:
# service drbd status
drbd driver loaded OK; device status:
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@drbd2.corp.com, 2015-05-12 21:05:41
m:res  cs         ro                 ds                 p  mounted  fstype
0:r0   Connected  Primary/Secondary  UpToDate/UpToDate  C  /store   ext4
2. Crash switchover
Force a failure by cutting power to node1 directly.
node2 again takes over immediately and seamlessly; reads and writes on the client's NFS mount keep working.
DRBD status on node2 at this point:
# service drbd status
drbd driver loaded OK; device status:
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@drbd2.corp.com, 2015-05-12 21:05:41
m:res  cs         ro               ds                 p  mounted  fstype
0:r0   Connected  Primary/Unknown  UpToDate/DUnknown  C  /store   ext4
Original article: CentOS6.5下DRBD+HeartBeat+NFS配置(二). Please credit the source when reprinting!