KingbaseES V8R6C5B041手工创建集群测试案例
阅读原文时间:2023年07月09日阅读:2

案例说明:

KingbaseES V8R6C5B041版本和以前的KingbaseES R6有一定的区别,增加了“securecmdd”的工具,并且在install.conf配置文件中增加了“两地三中心”部署的配置项。本案例在最新的版本手工部署集群进行测试,用于研究和以前R6版本手工部署的异同点,本案例在通用机环境下测试。

关于securecmdd工具说明:

7.2.1.1. 免密配置约束
1、 目前在通用机上支持两种部署方式:① 通过sshd服务自动分发文件并部署;② 手动分发文件,启动sys_securecmdd工具并进行部署。
2、 第①种部署方式,需要配置节点间ssh免密,第②种部署方式需要节点间sys_securecmdd工具秘钥协商。两种方式都有如下免密登陆要求:所有即将部署集群的设备之间,root用户和root用户、root用户和普通用户、普通用户和普通用户免密登陆,同一主机的root用户和root用户、root用户和普通用户免密登陆。
3、 sys_securecmdd工具秘钥协商在初始化、启动过程中自动完成;配置ssh免密请用 root 用户执行脚本trust_cluster.sh,脚本会读取install.conf配置文件,所以请先按要求配置install.conf文件。

集群节点信息:

[root@node101 r6_install]# cat /etc/hosts
192.168.1.101   node101      ### Primary
192.168.1.102   node102      ### Standby

数据库版本:

[kingbase@node101 bin]$ ./ksql -V
ksql (Kingbase) V008R006C005B0041

系统环境参数配置:(参考官方文档)

https://help.kingbase.com.cn/stage-api/profile/document/kes/v8r6/html/highly/highly-tools/cluster-use/cluster-use-7.html

一、部署前的准备

1、首先在主节点安装KingbaseES软件

=以下为KingbaseES软件安装后,手工部署所需要的文件=

[kingbase@node101 zip]$ pwd
/opt/Kingbase/ES/V8R6_041/ClientTools/guitools/DeployTools/zip

[kingbase@node101 zip]$ ls -lh
total 264M
-rw-rw-r--. 1 kingbase kingbase 261M Feb 16 20:41 db.zip
-rw-rw-r--. 1 kingbase kingbase 8.0K Feb 16 20:41 install.conf
-rw-rw-r--. 1 kingbase kingbase 2.1M Feb 16 20:41 securecmdd.zip
-rwxrwxr-x. 1 kingbase kingbase 3.3K Feb 16 20:41 trust_cluster.sh
-rwxrwxr-x. 1 kingbase kingbase  87K Feb 16 20:41 V8R6_cluster_install.sh

2、指定部署文件存储路径

=创建一目录,将部署所需要的文件拷贝到同一目录下,并包含license.dat文件。=

[kingbase@node101 ~]$ mkdir r6_install
[kingbase@node101 ~]$ cp -r /opt/Kingbase/ES/V8R6_041/ClientTools/guitools/DeployTools/zip/* r6_install/

[kingbase@node101 r6_install]$ ls -lh
total 264M
-rw-rw-r-- 1 kingbase kingbase 261M Apr  7 10:33 db.zip
-rw-rw-r-- 1 kingbase kingbase 8.0K Apr  7 10:33 install.conf
-rwxr-x--- 1 kingbase kingbase 3.2K Apr  7 10:42 license.dat
-rw-rw-r-- 1 kingbase kingbase 2.1M Apr  7 10:33 securecmdd.zip
-rwxrwxr-x 1 kingbase kingbase 3.3K Apr  7 10:33 trust_cluster.sh
-rwxrwxr-x 1 kingbase kingbase  87K Apr  7 10:33 V8R6_cluster_install.sh

部署文件信息:

安装目录/ClientTools/guitools/DeployTools/zip/中的db.zip
安装目录/Server/bin/下的部署脚本 V8R6_cluster_install.sh
安装目录/Server/bin/下的配置文件install.conf
安装目录/Server/bin/下的配置SSH免密脚本 trust_cluster.sh
授权文件license.dat

3、创建集群安装目录

[kingbase@node101 ~]$ mkdir -p /home/kingbase/cluster/HA_R6/kha/

4、配置部署conf文件

[kingbase@node101 r6_install]$ cat install.conf |grep -v ^# |grep -v ^$
on_bmj=0
all_ip=(192.168.1.101 192.168.1.102)
witness_ip=""
production_ip=()
local_disaster_recovery_ip=()
remote_disaster_recovery_ip=()
install_dir="/home/kingbase/cluster/HA_R6/kha"
zip_package="/home/kingbase/r6_install/db.zip"
license_file=(license.dat)
db_user="system"                 # the user name of database
db_port="54321"                  # the port of database, defaults is 54321
db_mode="oracle"                 # database mode: pg, oracle
db_auth="scram-sha-256"          # database authority: scram-sha-256, md5, default is scram-sha-256
db_case_sensitive="yes"          # database case sensitive settings: yes, no. default is yes - case sensitive; no - case insensitive (NOTE. cannot set to 'no' when db_mode="pg").
trusted_servers="192.168.1.1"
data_directory="/home/kingbase/cluster/HA_R6/kha/kingbase/data"
virtual_ip=""
net_device=(enp0s3 enp0s3)
net_device_ip=(192.168.1.101 192.168.1.102)
ipaddr_path="/usr/sbin"
arping_path="/opt/Kingbase/ES/V8R6_041/Server/bin/"
ping_path="/bin"
super_user="root"
execute_user="kingbase"
deploy_by_sshd=1                 # choose whether to use sshd when deploy, 0 means not to use (deploy by sys_securecmdd), 1 means to use (deploy by sshd), default value is 1; when on_bmj=1, it will auto set to no(deploy_by_sshd=0)
use_scmd=1                       # Is the cluster running on sys_securecmdd or sshd? 1 means yes (on sys_securecmdd), 0 means no (on sshd), default value is 1; when on_bmj=1, it will auto set to yes(use_scmd=1)
reconnect_attempts="10"          # the number of retries in the event of an error
reconnect_interval="6"           # retry interval
recovery="standby"               # the way of cluster recovery: standby/automatic/manual
ssh_port="22"                    # the port of ssh, default is 22
scmd_port="8890"                 # the port of sys_securecmdd, default is 8890
auto_cluster_recovery_level='1'
use_check_disk='off'

新增参数文件:

二、执行脚本部署集群

1、建立主机节点的ssh信任关系

=执行此脚本可以用于建立主机节点见的ssh信任关系。=

[root@node101 r6_install]# sh trust_cluster.sh
authorized_keys                                                                                       100% 3173     3.1KB/s   00:00
id_rsa                                                                                                100% 1679     1.6KB/s   00:00
id_rsa.pub                                                                                            100%  394     0.4KB/s   00:00
known_hosts                                                                                           100%  537     0.5KB/s   00:00

测试节点间的ssh信任关系:
[root@node101 r6_install]# ssh node102
[root@node101 r6_install]# ssh kingbase@node101

[root@node101 r6_install]# su - kingbase
Last login: Thu Apr  7 11:00:31 CST 2022 from 192.168.1.101 on pts/2
[kingbase@node101 ~]$ ssh node102
[kingbase@node101 ~]$ ssh root@node102

2、部署集群

=部署过程出现错误,具体看部署日志信息=

[kingbase@node101 r6_install]$ sh V8R6_cluster_install.sh

[CONFIG_CHECK] will deploy the cluster of DG
[CONFIG_CHECK] file format is correct ... OK
[CONFIG_CHECK] the number of license_num matches the length of all_ip or the number of license_num is 1 ... OK
[RUNNING] check if the host can be reached ...
[RUNNING] success connect to the target "192.168.1.101" ..... OK
[RUNNING] success connect to the target "192.168.1.102" ..... OK
[RUNNING] check the db is running or not...
[RUNNING] the db is not running on "192.168.1.101:54321" ..... OK
[RUNNING] the db is not running on "192.168.1.102:54321" ..... OK
[RUNNING] check the sys_securecmdd is running or not...
[RUNNING] the sys_securecmdd is not running on "192.168.1.101:8890" ..... OK
[RUNNING] the sys_securecmdd is not running on "192.168.1.102:8890" ..... OK
[RUNNING] check if the install dir is already exist ...
[RUNNING] the install dir is not exist on "192.168.1.101" ..... OK
[RUNNING] the install dir is not exist on "192.168.1.102" ..... OK
.......
2022-04-07 11:18:08 repmgrd on "[192.168.1.102]" start success.
 ID | Name  | Role    | Status    | Upstream | repmgrd | PID   | Paused? | Upstream last seen
----+-------+---------+-----------+----------+---------+-------+---------+--------------------
 1  | node1 | primary | * running |          | running | 17018 | no      | n/a
 2  | node2 | standby |   running | node1    | running | 13693 | no      | 1 second(s) ago
[2022-04-07 11:18:09] [NOTICE] redirecting logging output to "/home/kingbase/cluster/HA_R6/kha/kingbase/log/kbha.log"

[2022-04-07 11:18:12] [NOTICE] redirecting logging output to "/home/kingbase/cluster/HA_R6/kha/kingbase/log/kbha.log"

2022-04-07 11:18:13 Done.
[INSTALL] start up the whole cluster ... OK

=== 从以上信息获知,集群部署成功!===

三、部署后验证

1、查看集群节点状态

[kingbase@node101 bin]$ ./repmgr cluster show
 ID | Name  | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
 1  | node1 | primary | * running |          | default  | 100      | 1        | user=esrep dbname=esrep port=54321 host=192.168.1.101 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
 2  | node2 | standby |   running | node1    | default  | 100      | 1        | user=esrep dbname=esrep port=54321 host=192.168.1.102 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3

2、查看流复制状态信息

prod=# select * from sys_stat_replication;
  pid  | usesysid | usename | application_name |  client_addr  | client_hostname | client_port |         backend_start         | backend
_xmin |   state   | sent_lsn  | write_lsn | flush_lsn | replay_lsn | write_lag | flush_lag | replay_lag | sync_priority | sync_state |
        reply_time
-------+----------+---------+------------------+---------------+-----------------+-------------+-------------------------------+--------
------+-----------+-----------+-----------+-----------+------------+-----------+-----------+------------+---------------+------------+--
-----------------------------
 16260 |    16385 | esrep   | node2            | 192.168.1.102 |                 |       39854 | 2022-04-07 11:17:53.190928+08 |
      | streaming | 0/30BEFD8 | 0/30BEFD8 | 0/30BEFD8 | 0/30BEFD8  |           |           |            |             1 | quorum     | 2
022-04-07 11:24:15.542427+08
(1 row)

prod=# select * from sys_replication_slots;
   slot_name   | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirme
d_flush_lsn
---------------+--------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------
------------
 repmgr_slot_2 |        | physical  |        |          | f         | t      |      16260 |  912 |              | 0/30BF008   |
(1 row)

3、数据库数据同步测试

# 主库DML操作
test=# \l
                               List of databases
   Name    | Owner  | Encoding |   Collate   |    Ctype    | Access privileges
-----------+--------+----------+-------------+-------------+-------------------
 esrep     | system | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 security  | system | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 template0 | system | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/system        +
           |        |          |             |             | system=CTc/system
 template1 | system | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/system        +
           |        |          |             |             | system=CTc/system
 test      | system | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
(5 rows)

test=# create database prod;
CREATE DATABASE
test=# \c prod
You are now connected to database "prod" as user "system".
prod=# create table t1 (id int,name varchar(10));
CREATE TABLE
prod=# insert into t1 values(generate_series(1,10000),'usr'||generate_series(1,10000));
INSERT 0 10000
prod=# select count(*) from t1;
 count
-------
 10000
(1 row)

#备库查看数据
test=# \c prod
You are now connected to database "prod" as user "system".
prod=# select count(*) from t1;
 count
-------
 10000
(1 row)

四、重启集群验证

[kingbase@node102 bin]$ ./sys_monitor.sh restart

2022-04-07 11:25:35 Ready to stop all DB ...
Service process "node_export" was killed at process 17698
Service process "postgres_ex" was killed at process 17699
Service process "node_export" was killed at process 14347
Service process "postgres_ex" was killed at process 14348
2022-04-07 11:25:41 begin to stop repmgrd on "[192.168.1.101]".
2022-04-07 11:25:42 repmgrd on "[192.168.1.101]" stop success.
2022-04-07 11:25:42 begin to stop repmgrd on "[192.168.1.102]".
2022-04-07 11:25:43 repmgrd on "[192.168.1.102]" stop success.
2022-04-07 11:25:43 begin to stop DB on "[192.168.1.101]".
waiting for server to shut down.... done
server stopped
......
2022-04-07 11:26:06 repmgrd on "[192.168.1.102]" start success.
 ID | Name  | Role    | Status    | Upstream | repmgrd | PID   | Paused? | Upstream last seen
----+-------+---------+-----------+----------+---------+-------+---------+--------------------
 1  | node1 | primary | * running |          | running | 21305 | no      | n/a
 2  | node2 | standby |   running | node1    | running | 17837 | no      | 1 second(s) ago
[2022-04-07 11:26:09] [NOTICE] redirecting logging output to "/home/kingbase/cluster/HA_R6/kha/kingbase/log/kbha.log"

[2022-04-07 11:26:13] [NOTICE] redirecting logging output to "/home/kingbase/cluster/HA_R6/kha/kingbase/log/kbha.log"

2022-04-07 11:26:15 Done.

五、总结

KingbaseES V8R6C5B041一键部署集群,基本和以前R6版本操作方法一致,将安装目录zip下的文件及license文件放在同一部署目录下,执行部署即可。

并且在启动集群前,需要先启动securecmdd服务,集群部署后,会自动加入到系统自启:

如果不做ssh免密执行部署:(密码输入会让人崩溃!!!)

手机扫一扫

移动阅读更方便

阿里云服务器
腾讯云服务器
七牛云服务器

你可能感兴趣的文章