KingbaseES R6 集群一键修改集群和数据库参数测试案例
阅读原文时间:2023年07月09日阅读:2

案例说明:

集群环境修改集群或数据库参数,需要在每个node上都要修改,在每个节点而执行修改操作,容易出现漏改或节点上参数不一致等错误;在KingbaseES V8R6的集群中增加了,一键修改参数的新功能,可以在一个节点配置后,同时修改集群中所有节点的参数,这样用户在修改集群中的参数时,避免以上错误的发生。

官方文档说明:

https://help.kingbase.com.cn/stage-api/profile/document/kes/v8r6/html/highly/highly-tools/cluster-use/cluster-use-6.html#id2

6. 一键修改集群参数

6.1. 命令行命令

6.1.1. sys_monitor.sh set/set --restart

sys_monitor.sh set/set --restart —一键修改集群配置和数据库配置

6.1.1.1. 描述

sys_monitor.sh set 脚本读取配置文件set_repmgr.conf和set_db.conf中的配置,分别修改集群的配置文件repmgr.conf和es_rep.conf

6.1.1.2. 使用

在$db_etc目录下配置set_repmgr.conf,该配置文件中设置参数,修改集群repmgr.conf文件

例:

图 6.1.20 修改集群repmgr.conf文件

在$db_etc目录下配置set_db.conf,该配置文件中设置参数,修改集群es_rep.conf文件

例:

图 6.1.21 修改集群es_rep.conf文件

6.1.1.3. 选项

以下是脚本执行参数:

set
脚本调用修改集群和数据库配置函数
set --restart
脚本调用修改集群和数据库配置函数,并重启集群。如果修改的参数必须重启数据库才能生效时,使用该项

例:

图 6.1.22 一键修改集群配置和数据库配置

例:

图 6.1.23 一键修改集群配置和数据库配置

6.1.1.4. 注意事项

1.必须在集群所有节点状态正常的情况下才能执行修改参数操作
2.set_db.conf中
不可以配置port参数
不可以配置kingbase.conf中没有的参数
不可以给参数配置错误的值
3.set_repmgr.conf中
不建议修改路径相关和数据库连接串等参数,以下是适合一键修改的参数 例如:
reconnect_attempts
reconnect_interval
recovery
trust_servers

测试案例:

测试数据库版本:

test=# select version();
                                                       version
----------------------------------------------------------------------------------------------------------------------
 KingbaseES V008R006C005B0041 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit
(1 row)

集群node信息:

[kingbase@node102 bin]$ cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.1.101   node101     ###  Standby
192.168.1.102   node102     ###  Primary

 ID | Name    | Role    | Status    | Upstream | repmgrd | PID   | Paused? | Upstream last seen
----+---------+---------+-----------+----------+---------+-------+---------+--------------------
 1  | node101 | standby |   running | node102  | running | 15547 | no      | 1 second(s) ago
 2  | node102 | primary | * running |          | running | 16388 | no      | n/a

一、查看集群节点状态

=注意:在修改参数前,必须保证集群中所有节点是正常状态。=

[kingbase@node101 bin]$ ./repmgr cluster show
 ID | Name    | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string
----+---------+---------+-----------+----------+----------+----------+----------+----------------------------------------------------------------------------------------------------------------------------------------------------
 1  | node101 | standby |   running | node102  | default  | 100      | 2        | host=192.168.1.101 user=system dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
 2  | node102 | primary | * running |          | default  | 100      | 2        | host=192.168.1.102 user=system dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3

二、修改参数测试

=在主节点的etc目录下,分别配置set_repmgr.conf和set_db.conf,将需要修改的repmgr和数据库参数及value写入到此配置文件中。=

1、查看需要修改的repmgr参数

[kingbase@node101 data]$ cat ../etc/repmgr.conf |grep -i recovery
#recovery='standby'
recovery='automatic'

[kingbase@node102 etc]$ cat repmgr.conf |grep recovery
#recovery='standby'
recovery='automatic'

2、配置repmgr参数文件(将需要修改的参数写入此配置文件,注意文件名)

[kingbase@node102 etc]$ cat set_repmgr.conf
recovery=standby

3、查看需要修改的数据库参数

[kingbase@node101 data]$ cat kingbase.conf |grep work_mem
#work_mem = 4MB                         # min 64kB
#maintenance_work_mem = 64MB            # min 1MB
#autovacuum_work_mem = -1               # min 1MB, or -1 to use maintenance_work_mem

[kingbase@node102 bin]$ cat ../data/kingbase.conf |grep -i work_mem
#work_mem = 4MB                         # min 64kB
#maintenance_work_mem = 64MB            # min 1MB

4、配置数据库参数文件(将需要修改的参数写入此配置文件,注意文件名)

[kingbase@node102 etc]$ cat set_db.conf
work_mem = 16MB

5、一键修改参数

=注意:如果修改的参数不需要重启就生效,可以用set参数;如果参数修改需要重启集群或数据库才能生效,需要用set --restart参数。=

[kingbase@node102 bin]$ ./sys_monitor.sh set
2022-04-06 11:58:29 Begin to set the es_rep.conf for database ...
2022-04-06 11:58:30 End to set the es_rep.conf for database ... OK
2022-04-06 11:58:30 Begin to set the repmgr.conf for repmgrd ...
2022-04-06 11:58:31 End to set the repmgr.conf for repmgrd ... OK
2022-04-06 11:58:31 Sending signal to reload the database on "192.168.1.102"
server signaled
2022-04-06 11:58:31 Sending signal to reload the database on "192.168.1.101"
server signaled
2022-04-06 11:58:32 Sending signal(SIGKILL) to stop the kbha on "192.168.1.102"
2022-04-06 11:58:34 Sending signal(SIGKILL) to stop the repmgrd on "192.168.1.102"
2022-04-06 11:58:34 begin to stop repmgrd on "[192.168.1.102]".
2022-04-06 11:58:35 repmgrd on "[192.168.1.102]" stop success.
2022-04-06 11:58:36 Start the repmgrd on "192.168.1.102"
2022-04-06 11:58:36 begin to start repmgrd on "[192.168.1.102]".
[2022-04-06 11:58:36] [NOTICE] using provided configuration file "/home/kingbase/cluster/R6HA/kha/kingbase/bin/../etc/repmgr.conf"
[2022-04-06 11:58:36] [NOTICE] redirecting logging output to "/home/kingbase/cluster/R6HA/kha/kingbase/log/hamgr.log"

2022-04-06 11:58:38 repmgrd on "[192.168.1.102]" start success.
2022-04-06 11:58:38 Start the kbha on "192.168.1.102"
[2022-04-06 11:58:41] [NOTICE] redirecting logging output to "/home/kingbase/cluster/R6HA/kha/kingbase/log/kbha.log"

2022-04-06 11:58:42 Sending signal(SIGKILL) to stop the kbha on "192.168.1.101"
2022-04-06 11:58:45 Sending signal(SIGKILL) to stop the repmgrd on "192.168.1.101"
2022-04-06 11:58:45 begin to stop repmgrd on "[192.168.1.101]".
2022-04-06 11:58:46 repmgrd on "[192.168.1.101]" stop success.
2022-04-06 11:58:46 Start the repmgrd on "192.168.1.101"
2022-04-06 11:58:46 begin to start repmgrd on "[192.168.1.101]".
[2022-04-06 11:58:47] [NOTICE] using provided configuration file "/home/kingbase/cluster/R6HA/kha/kingbase/bin/../etc/repmgr.conf"
[2022-04-06 11:58:47] [NOTICE] redirecting logging output to "/home/kingbase/cluster/R6HA/kha/kingbase/log/hamgr.log"

2022-04-06 11:58:48 repmgrd on "[192.168.1.101]" start success.
2022-04-06 11:58:48 Start the kbha on "192.168.1.101"
[2022-04-06 11:58:51] [NOTICE] redirecting logging output to "/home/kingbase/cluster/R6HA/kha/kingbase/log/kbha.log"

Some changes may not take effect by RELOAD, you can execute "./sys_monitor.sh restart" to restart the cluster

三、验证参数的修改

1、查看修改后的repmgr参数

如下所示:被修改的repmgr参数将追加到repmgr.conf文件的之后

[kingbase@node102 bin]$ cat ../etc/repmgr.conf |grep -i recovery
#recovery='standby'
recovery='automatic'
auto_cluster_recovery_level=1
recovery=standby          ### 修改参数

2、查看数据库参数的修改

如下所示:被修改的数据库参数将追加到es_rep.conf文件的之后

[kingbase@node102 bin]$ cat ../data/es_rep.conf |grep -i work_mem
work_mem = 16MB
You have new mail in /var/spool/mail/kingbase

#数据库环境查看:
[kingbase@node102 bin]$ ./ksql -U system test
ksql (V8.0)
Type "help" for help.

test=# show work_mem;
 work_mem
----------
 16MB
(1 row)

四、验证重启集群参数的修改

=注意:在集群或数据库中有些参数的修改,需要重启集群或数据库服务才能生效,建议此类参数的修改,在业务低峰期间执行,防止生产环境中,业务访问受到影响。=

[kingbase@node102 bin]$ ./sys_monitor.sh set --restart
2022-04-06 12:00:23 Begin to set the es_rep.conf for database ...
2022-04-06 12:00:24 End to set the es_rep.conf for database ... OK
2022-04-06 12:00:24 Begin to set the repmgr.conf for repmgrd ...
2022-04-06 12:00:25 End to set the repmgr.conf for repmgrd ... OK
2022-04-06 12:00:25 Ready to stop all DB ...
Service process "node_export" was killed at process 8009
Service process "postgres_ex" was killed at process 8010
Service process "node_export" was killed at process 8135
Service process "postgres_ex" was killed at process 8136
2022-04-06 12:00:32 begin to stop repmgrd on "[192.168.1.102]".
2022-04-06 12:00:33 repmgrd on "[192.168.1.102]" stop success.
2022-04-06 12:00:33 begin to stop repmgrd on "[192.168.1.101]".
2022-04-06 12:00:34 repmgrd on "[192.168.1.101]" stop success.
2022-04-06 12:00:34 begin to stop DB on "[192.168.1.101]".
waiting for server to shut down.... done
server stopped
2022-04-06 12:00:34 DB on "[192.168.1.101]" stop success.
2022-04-06 12:00:34 begin to stop DB on "[192.168.1.102]".
waiting for server to shut down.... done
server stopped
2022-04-06 12:00:35 DB on "[192.168.1.102]" stop success.
2022-04-06 12:00:35 Done.
2022-04-06 12:00:35 Ready to start all DB ...
2022-04-06 12:00:35 begin to start DB on "[192.168.1.102]".
waiting for server to start.... done
server started
2022-04-06 12:00:36 execute to start DB on "[192.168.1.102]" success, connect to check it.
2022-04-06 12:00:37 DB on "[192.168.1.102]" start success.
2022-04-06 12:00:37 Try to ping trusted_servers on host 192.168.1.102 ...
2022-04-06 12:00:40 Try to ping trusted_servers on host 192.168.1.101 ...
2022-04-06 12:00:42 begin to start DB on "[192.168.1.101]".
waiting for server to start.... done
server started
2022-04-06 12:00:43 execute to start DB on "[192.168.1.101]" success, connect to check it.
2022-04-06 12:00:44 DB on "[192.168.1.101]" start success.
 ID | Name    | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string
----+---------+---------+-----------+----------+----------+----------+----------+----------------------------------------------------------------------------------------------------------------------------------------------------
 1  | node101 | standby |   running | node102  | default  | 100      | 2        | host=192.168.1.101 user=system dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
 2  | node102 | primary | * running |          | default  | 100      | 2        | host=192.168.1.102 user=system dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
2022-04-06 12:00:44 The primary DB is started.
2022-04-06 12:00:44 begin to start repmgrd on "[192.168.1.102]".
[2022-04-06 12:00:45] [NOTICE] using provided configuration file "/home/kingbase/cluster/R6HA/kha/kingbase/bin/../etc/repmgr.conf"
[2022-04-06 12:00:45] [NOTICE] redirecting logging output to "/home/kingbase/cluster/R6HA/kha/kingbase/log/hamgr.log"

2022-04-06 12:00:47 repmgrd on "[192.168.1.102]" start success.
2022-04-06 12:00:47 begin to start repmgrd on "[192.168.1.101]".
[2022-04-06 12:00:49] [NOTICE] using provided configuration file "/home/kingbase/cluster/R6HA/kha/kingbase/bin/../etc/repmgr.conf"
[2022-04-06 12:00:49] [NOTICE] redirecting logging output to "/home/kingbase/cluster/R6HA/kha/kingbase/log/hamgr.log"

2022-04-06 12:00:50 repmgrd on "[192.168.1.101]" start success.
 ID | Name    | Role    | Status    | Upstream | repmgrd | PID   | Paused? | Upstream last seen
----+---------+---------+-----------+----------+---------+-------+---------+--------------------
 1  | node101 | standby |   running | node102  | running | 15547 | no      | 1 second(s) ago
 2  | node102 | primary | * running |          | running | 16388 | no      | n/a
[2022-04-06 12:00:53] [NOTICE] redirecting logging output to "/home/kingbase/cluster/R6HA/kha/kingbase/log/kbha.log"

[2022-04-06 12:00:57] [NOTICE] redirecting logging output to "/home/kingbase/cluster/R6HA/kha/kingbase/log/kbha.log"

2022-04-06 12:00:58 Done.

五、参数修改故障案例

=在修改参数时,如果只是配置了set_repmgr.conf文件,缺少set_db.conf文件,则执行一键修改时出现以下错误,需要同时创建以上两个文件。=

[kingbase@node102 bin]$ ./sys_monitor.sh set
2022-04-06 11:51:18 WRNING: there is no file "/home/kingbase/cluster/R6HA/kha/kingbase/bin/../etc//set_db.conf", don't change the es_rep.conf for database
2022-04-06 11:51:18 WRNING: there is no file "/home/kingbase/cluster/R6HA/kha/kingbase/bin/../etc//set_repmgr.conf", don't change the repmgr.conf for repmgrd
2022-04-06 11:51:18 ERROR: the file "/home/kingbase/cluster/R6HA/kha/kingbase/bin/../etc//set_db.conf" and "/home/kingbase/cluster/R6HA/kha/kingbase/bin/../etc//set_repmgr.conf" do not exist, exit

六、总结

对于KingbaseES V8R6集群一键修改参数的功能,方便了集群的维护和管理,但注意,并不是所有参数都能修改,具体可以参考官方文档说明。