Database environment:
test=# select version();
version
------------------------------------------------------------------------------------------------------------------
KingbaseES V008R006C003B0010 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit
(1 row)
Operating system:
[kingbase@node1 bin]$ cat /etc/centos-release
CentOS Linux release 7.2.1511 (Core)
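Cluster architecture:
Case description:
1) This case was run in a general-purpose server environment. sys_backup.sh calls sys_rman to perform physical backups. In a cluster environment it connects to remote nodes over the ssh port, so changing the ssh port prevents sys_backup.sh from running normally.
2) For the cluster itself to keep running after the ssh port change, only the ssh_options variable in repmgr.conf needs to be updated.
3) To take physical backups with sys_backup.sh after the ssh port change, the port must be changed in every ssh statement in the sys_backup.sh script, and there are many such places.
4) If the ssh port has been changed and many applications rely on sys_backup.sh for backups, it is recommended to specify the ssh port through a variable in the sys_backup.sh script.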
I. Check the current cluster status
[kingbase@node2 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+---------+---------+-----------+----------+----------+----------+----------+----------------
1 | node248 | standby | running | node249 | default | 100 | 6 | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
2 | node249 | primary | * running | | default | 100 | 6 | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count
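II. Change the ssh port in the operating system and the cluster configuration file (all nodes)
1) Check the system's original ssh port (default 22)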
[kingbase@node2 bin]$ netstat -antlp |grep 22
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp 0 0 192.168.122.1:53 0.0.0.0:* LISTEN -
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN -
tcp 0 0 192.168.7.249:22 192.168.7.116:55883 ESTABLISHED -
tcp6 0 0 :::22 :::* LISTEN -
2) Check the ssh port used by the cluster in repmgr.conf
[kingbase@node2 bin]$ cat ../etc/repmgr.conf|grep ssh
ssh_options='-q -o ConnectTimeout=10 -o StrictHostKeyChecking=no -o ServerAliveInterval=2 -o ServerAliveCountMax=5 -p 22'
=== By default, -p 22 specifies the cluster's ssh communication port ===
3) Change the operating system ssh port
[root@node1 ~]# cat /etc/ssh/sshd_config|grep -i Port
# If you want to change the port on a SELinux system, you have to tell
# semanage port -a -t ssh_port_t -p tcp #PORTNUMBER
Port 2222
4) Change the cluster ssh communication port (to 2222)
[kingbase@node1 bin]$ cat ../etc/repmgr.conf |grep ssh
ssh_options='-q -o ConnectTimeout=10 -o StrictHostKeyChecking=no -o ServerAliveInterval=2 -o ServerAliveCountMax=5 -p 2222'
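Because the change has to be repeated on every node, it may be convenient to rewrite only the port inside the ssh_options line instead of editing the file by hand. The following is a minimal sketch, assuming ssh_options already contains a `-p <port>` option as in the output above. The function name set_repmgr_ssh_port is hypothetical and is not part of KingbaseES or repmgr.

```shell
# Hypothetical helper: replace the "-p <port>" value on the ssh_options line.
# Usage: set_repmgr_ssh_port <path-to-repmgr.conf> <new-port>
set_repmgr_ssh_port() {
    local conf_file="$1" new_port="$2" tmp_file
    tmp_file="$(mktemp)" || return 1
    # Only the line starting with ssh_options= is touched; other lines pass through.
    sed "/^ssh_options=/ s/-p [0-9][0-9]*/-p ${new_port}/" "${conf_file}" > "${tmp_file}" \
        && cat "${tmp_file}" > "${conf_file}"
    rm -f "${tmp_file}"
}
```

For example, `set_repmgr_ssh_port ../etc/repmgr.conf 2222` would be run from the bin directory on each node. Writing through a temporary file and copying the content back keeps the original file's owner and permissions.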
5) Restart the sshd service
[root@node1 ~]# systemctl restart sshd
[root@node1 ~]# netstat -an |grep 22
tcp 0 0 0.0.0.0:2222 0.0.0.0:* LISTEN
6) Test the ssh connection over the non-default port
[root@node1 ~]# ssh -p 2222 node2
Last failed login: Mon Mar 1 17:06:07 CST 2021 from 192.168.7.116 on ssh:notty
There were 2 failed login attempts since the last successful login.
Last login: Mon Mar 1 16:43:29 2021 from 192.168.7.249
=== As shown above, the ssh trust relationship still works after the port change ===
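The interactive login above can also be checked non-interactively, which is closer to how the cluster scripts use ssh. The sketch below assumes key-based trust is already configured. BatchMode=yes makes ssh fail rather than prompt for a password. The names check_ssh_port and SSH_BIN are hypothetical; SSH_BIN exists only so the ssh binary can be swapped out.

```shell
# Hypothetical override for the ssh binary; defaults to the real ssh.
SSH_BIN="${SSH_BIN:-ssh}"

# Report whether a password-less ssh login to <host> on <port> succeeds.
# Usage: check_ssh_port <host> <port>
check_ssh_port() {
    local host="$1" port="$2"
    if "${SSH_BIN}" -p "${port}" -o BatchMode=yes -o ConnectTimeout=5 "${host}" true >/dev/null 2>&1; then
        echo "${host}:${port} OK"
    else
        echo "${host}:${port} FAILED"
        return 1
    fi
}
```

Running `check_ssh_port 192.168.7.248 2222` and `check_ssh_port 192.168.7.249 2222` from each node covers both directions of the trust relationship.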
7) Restart the cluster with sys_monitor.sh as a test
[kingbase@node1 bin]$ ./sys_monitor.sh restart
2021-03-01 17:29:55 Ready to stop all DB ...
Service process "node_export" was killed at process 11833
Service process "postgres_ex" was killed at process 11834
Service process "node_export" was killed at process 9343
Service process "postgres_ex" was killed at process 9344
2021-03-01 17:30:00 begin to stop repmgrd on "[192.168.7.248]".
2021-03-01 17:30:01 repmgrd on "[192.168.7.248]" stop success.
2021-03-01 17:30:01 begin to stop repmgrd on "[192.168.7.249]".
2021-03-01 17:30:02 repmgrd on "[192.168.7.249]" stop success.
2021-03-01 17:30:02 begin to stop DB on "[192.168.7.249]".
waiting for server to shut down..... done
server stopped
2021-03-01 17:30:04 DB on "[192.168.7.249]" stop success.
2021-03-01 17:30:04 begin to stop DB on "[192.168.7.248]".
waiting for server to shut down......... done
server stopped
2021-03-01 17:30:11 DB on "[192.168.7.248]" stop success.
2021-03-01 17:30:11 Done.
2021-03-01 17:30:11 Ready to start all DB ...
2021-03-01 17:30:11 begin to start DB on "[192.168.7.248]".
waiting for server to start.... done
server started
2021-03-01 17:30:12 execute to start DB on "[192.168.7.248]" success, connect to check it.
2021-03-01 17:30:13 DB on "[192.168.7.248]" start success.
2021-03-01 17:30:13 Try to ping trusted_servers on host 192.168.7.248 ...
2021-03-01 17:30:16 Try to ping trusted_servers on host 192.168.7.249 ...
2021-03-01 17:30:18 begin to start DB on "[192.168.7.249]".
waiting for server to start.... done
server started
2021-03-01 17:30:20 execute to start DB on "[192.168.7.249]" success, connect to check it.
2021-03-01 17:30:21 DB on "[192.168.7.249]" start success.
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string ----+---------+---------+-----------+-----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node248 | standby | running | ! node249 | default | 100 | 6 | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
2 | node249 | primary | * running | | default | 100 | 6 | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
WARNING: following issues were detected
  - node "node248" (ID: 1) is not attached to its upstream node "node249" (ID: 2)
2021-03-01 17:30:21 The primary DB is started.
2021-03-01 17:30:25 Success to load virtual ip [192.168.7.240/24] on primary host [192.168.7.249].
2021-03-01 17:30:25 Try to ping vip on host 192.168.7.248 ...
2021-03-01 17:30:28 Try to ping vip on host 192.168.7.249 ...
2021-03-01 17:30:30 begin to start repmgrd on "[192.168.7.248]".
[2021-03-01 17:30:31] [NOTICE] using provided configuration file "/home/kingbase/cluster/R6HA/KHA/kingbase/bin/../etc/repmgr.conf"
[2021-03-01 17:30:31] [NOTICE] redirecting logging output to "/home/kingbase/cluster/R6HA/KHA/kingbase/hamgr.log"
2021-03-01 17:30:31 repmgrd on "[192.168.7.248]" start success.
2021-03-01 17:30:31 begin to start repmgrd on "[192.168.7.249]".
[2021-03-01 17:29:25] [NOTICE] using provided configuration file "/home/kingbase/cluster/R6HA/KHA/kingbase/bin/../etc/repmgr.conf"
[2021-03-01 17:29:25] [NOTICE] redirecting logging output to "/home/kingbase/cluster/R6HA/KHA/kingbase/hamgr.log"
2021-03-01 17:30:32 repmgrd on "[192.168.7.249]" start success.
ID | Name | Role | Status | Upstream | repmgrd | PID | Paused? | Upstream last seen
----+---------+---------+-----------+----------+---------+-------+---------+--------------------
1 | node248 | standby | running | node249 | running | 16767 | no | 0 second(s) ago
2 | node249 | primary | * running | | running | 17865 | no | n/a
2021-03-01 17:30:38 Done.
8) Check the cluster node status
[kingbase@node1 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+---------+---------+-----------+----------+----------+----------+----------+----------------
1 | node248 | standby | running | node249 | default | 100 | 6 | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
2 | node249 | primary | * running | | default | 100 | 6 | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count
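=== As shown above, cluster communication is normal after the ssh port change ===
III. Run sys_backup.sh backups after the ssh port change (all nodes)
1) Test stopping the backup that was configured before the ssh port change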
[kingbase@node1 bin]$ ./sys_backup.sh stop
Disable all sys_rman in crontab-daemon
ssh: connect to host 192.168.7.248 port 22: Connection refused
ssh: connect to host 192.168.7.248 port 22: Connection refused
ssh: connect to host 192.168.7.248 port 22: Connection refused
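=== As shown above, when sys_backup.sh runs a backup in a cluster environment it connects to the remote node over ssh; after the port change it still tries port 22, so the ssh connection fails ===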
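Since the script contains many ssh statements, it can help to list them all before editing. The following is a rough sketch, not an exact parser: it matches the word ssh followed by whitespace, so names such as ssh-keygen or paths such as .ssh/ are skipped. The function name list_ssh_calls is hypothetical.

```shell
# Hypothetical helper: print "line-number:line" for each ssh invocation in a script.
# Usage: list_ssh_calls <script-file>
list_ssh_calls() {
    # "ssh" must start the line or follow a character that cannot be part of a
    # longer name, and must be followed by whitespace (so ssh-keygen is ignored).
    grep -n -E '(^|[^[:alnum:]_.-])ssh[[:space:]]' "$1"
}
```

Running `list_ssh_calls ./sys_backup.sh` gives a checklist of the places where a `-p` option has to be added or changed.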
2) Change the ssh port in the sys_backup.sh script
=== Modify the "ssh_cmd" variable ===
# local function_ssh_cmd_="ssh -p 2222 -n -o ConnectTimeout=30 -o StrictHostKeyChecking=no -o PreferredAuthentications=publickey -- "
function _log () {
echo "$*" >> /tmp/sys_backup.sh.log
} # end of _log
=== Change the ssh communication port in "_gene_ssh_pwd_less" ===
function _gene_ssh_pwd_less() {
_ip="${1}"
_user="${2}"
# 1. check whether pwd-less work
ssh -p 2222 -t -o ConnectTimeout=30 -o PreferredAuthentications=publickey ${_user}@${_ip} date 1>/dev/null 2>/dev/null
_local2remote_rt=$?
ssh -p 2222 -t -o ConnectTimeout=30 -o PreferredAuthentications=publickey ${_user}@${_ip} "ssh -p 2222 ${_user}@${_repo_ip} date>/dev/null 2>/dev/null" 2>/dev/null
=== Change the ssh port used when configuring password-less ssh ===
# set local.pub to remote, get remote.pub to local
_remote_pub_buf=` ssh -p 2222 -q -o StrictHostKeyChecking=no -o ConnectTimeout=30 -o PreferredAuthentications=password -- ${_user}@${_ip} \
"if [ ! -f \\${HOME}/.ssh/id_rsa.pub ] ; then echo -e '\ny' | ssh-keygen -t rsa -N '' >/dev/null 2>/dev/null ; fi;echo ${_t_buf_pub} >> \\${HOME}/.ssh/authorized_keys;chmod 600 \\${HOME}/.ssh/authorized_keys;cat \\${HOME}/.ssh/id_rsa.pub;" `
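Hardcoding 2222 at every call site means the same edits have to be repeated if the port changes again, which is why the case description recommends a variable. The following is a minimal sketch of that idea only. The names SYS_BACKUP_SSH_PORT, _ssh_port and _ssh_base_cmd are hypothetical and do not come from sys_backup.sh.

```shell
# Hypothetical setting: the ssh port is defined once, defaulting to 22.
_ssh_port="${SYS_BACKUP_SSH_PORT:-22}"

# Print the shared ssh prefix so that every call site uses the same port.
_ssh_base_cmd() {
    printf 'ssh -p %s -o ConnectTimeout=30 -o PreferredAuthentications=publickey' "${_ssh_port}"
}
```

A call site could then be written as `$(_ssh_base_cmd) -t ${_user}@${_ip} date`, and a later port change would touch only the one assignment.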
IV. Run sys_backup.sh backups
1) Run init to initialize the backup
[kingbase@node2 bin]$ ./sys_backup.sh init
# generate local sys_rman.conf...DONE
# update all node: sys_rman.conf and archive_command with sys_rman.archive-push...
# update all node: sys_rman.conf and archive_command with sys_rman.archive-push...DONE
# create stanza and check...(maybe 60+ seconds)
ERROR: check stanza failed, check log file /tmp/sys_rman_check.log
=== The script fails at the check stanza step ===
2) Check the log:
[kingbase@node2 bin]$ cat /tmp/sys_rman_check.log
2021-03-01 12:38:56.011 P00 INFO: check command begin 2.27: --config=/home/kingbase/kbbr_repo/sys_rman.conf --log-level-console=info --log-level-file=info --log-path=/tmp --log-subprocess --kb2-host=192.168.7.248 --kb2-host-user=kingbase --kb1-path=/home/kingbase/cluster/R6HA/KHA/kingbase/data --kb2-path=/home/kingbase/cluster/R6HA/KHA/kingbase/data --kb1-port=54321 --kb2-port=54321 --kb1-user=esrep --kb2-user=esrep --repo1-path=/home/kingbase/kbbr_repo --stanza=kingbase
WARN: unable to check kb-2: [UnknownError] remote-0 process on '192.168.7.248' terminated unexpectedly [255]: ssh: connect to host 192.168.7.248 port 22: Connection refused
ERROR: [125]: remote-0 process on '192.168.7.248' terminated unexpectedly [255]: ssh: connect to host 192.168.7.248 port 22: Connection refused
2021-03-01 12:38:56.529 P00 INFO: check command end: aborted with exception [125]
=== The log shows that check stanza has to connect to the standby over ssh. That connection still uses the original port 22 rather than the new port 2222, so it cannot reach the standby and check stanza fails ===
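When the log is long, the port that ssh actually tried can be pulled out directly. This is a small sketch that assumes the OpenSSH message format seen in the log above ("port <n>: Connection refused"). The function name refused_ssh_ports is hypothetical.

```shell
# Hypothetical helper: list the distinct ports on which ssh connections were refused.
# Usage: refused_ssh_ports <log-file>
refused_ssh_ports() {
    # grep -o emits each match on its own line, even when several share one log line;
    # awk takes the second word ("22:") and tr removes the trailing colon.
    grep -o 'port [0-9][0-9]*: Connection refused' "$1" | awk '{print $2}' | tr -d ':' | sort -u
}
```

If `refused_ssh_ports /tmp/sys_rman_check.log` prints the old port, some ssh call is still not picking up the new setting.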
3) Comment out the stanza check in the sys_backup.sh script (skip check stanza)
#${_rman_bin} --config=${_rman_conf_file} --stanza=${_stanza_name} --log-level-console=info check >>/tmp/sys_rman_check.log 2>&1
#if [ "X0" != "X$?" ] ; then
#    echo "ERROR: check stanza failed, check log file /tmp/sys_rman_check.log"
#    exit 3
#fi
echo "# create stanza and check...DONE"
4) Run the sys_backup.sh backup again
# init: initialize
[kingbase@node2 bin]$ ./sys_backup.sh init
# generate local sys_rman.conf...DONE
# update all node: sys_rman.conf and archive_command with sys_rman.archive-push...
# update all node: sys_rman.conf and archive_command with sys_rman.archive-push...DONE
# create stanza and check...(maybe 60+ seconds)
# create stanza and check...DONE
# initial first full backup...(maybe several minutes)
# initial first full backup...DONE
# Initial sys_rman OK.
'sys_backup.sh start' should be executed when need back-rest feature.
# start: start the backup schedule
[kingbase@node2 bin]$ ./sys_backup.sh start
Enable some sys_rman in crontab-daemon
Set full-backup in 7 days
Set incr-backup in 1 days
0 2 */7 * * kingbase /home/kingbase/cluster/R6HA/KHA/kingbase/bin/sys_rman --config=/home/kingbase/kbbr_repo/sys_rman.conf --stanza=kingbase --archive-copy --type=full backup >>/tmp/sys_rman_backup_full.log 2>&1
0 4 */1 * * kingbase /home/kingbase/cluster/R6HA/KHA/kingbase/bin/sys_rman --config=/home/kingbase/kbbr_repo/sys_rman.conf --stanza=kingbase --archive-copy --type=incr backup >>/tmp/sys_rman_backup_incr.log 2>&1
# pause: pause the backup
[kingbase@node2 bin]$ ./sys_backup.sh pause
Puase the sys_rman...DONE
# unpause: resume the paused backup
[kingbase@node2 bin]$ ./sys_backup.sh unpause
Un-Puase the sys_rman...DONE
# stop: stop the backup
[kingbase@node2 bin]$ ./sys_backup.sh stop
Disable all sys_rman in crontab-daemon
[kingbase@node2 bin]$ cat /etc/cron.d/KINGBASECRON
*/1 * * * * kingbase . /etc/profile;/home/kingbase/cluster/R6HA/KHA/kingbase/bin/kbha -A daemon -f /home/kingbase/cluster/R6HA/KHA/kingbase/bin/../etc/repmgr.conf >> /home/kingbase/cluster/R6HA/KHA/kingbase/bin/../kbha.log 2>&1
#*/1 * * * * kingbase /home/kingbase/cluster/kha/db/bin/network_rewind.sh
#*/1 * * * * root /home/kingbase/cluster/kha/kingbasecluster/bin/restartcluster.sh
=== As shown above, after the system ssh port change, backups through sys_backup.sh succeed ===
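The effect of start and stop can be confirmed by counting the active sys_rman entries in the cron file. This sketch assumes, as in the start output above, that each scheduled backup line invokes the binary as `.../sys_rman ` followed by its options. The function name count_rman_cron_entries is hypothetical.

```shell
# Hypothetical helper: count uncommented cron lines that invoke sys_rman.
# Usage: count_rman_cron_entries <cron-file>
count_rman_cron_entries() {
    # Drop commented lines first, then count lines calling the sys_rman binary.
    # grep -c still prints 0 when nothing matches; "|| true" hides its non-zero status.
    grep -v '^[[:space:]]*#' "$1" | grep -c '/sys_rman ' || true
}
```

With the schedule shown above, `count_rman_cron_entries /etc/cron.d/KINGBASECRON` would be expected to report the full and incremental entries after start, and none after stop.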