问题描述:收到监控软件告警,提示数据库发生重启,进去查看,截止到6/27 10:00 之前,作为主节点的orch1先重启,然后故障转移到orch2和orch3节点上。在持续到6/27 9:00 左右,orch2又接着重启,到现在只剩下一个orch3节点单独支撑业务。
环境:
redhat7.6+orchestrator+MySQL8.0.26+一主两从的物理机集群
orch1 主库
orch2 从库
orch3 从库
orch1宕机,vip会漂移到orch2,选举为新的主库,orch2宕机,故障转移到orch3,此时会变成单节点
1.数据库提示重启,首先需要查看节点orch1现在的状态,在1d 22h之前发生了重启,此时状态正常
/soft/mysql/bin/mysql Ver 8.0.26 for Linux on x86_64 (MySQL Community Server - GPL)
Connection id: 1929
Current database:
Current user: root@localhost
SSL: Not in use
Current pager: stdout
Using outfile: ''
Using delimiter: ;
Server version: 8.0.26 MySQL Community Server - GPL
Protocol version: 10
Connection: Localhost via UNIX socket
Server characterset: utf8mb4
Db characterset: utf8mb4
Client characterset: utf8mb4
Conn. characterset: utf8mb4
UNIX socket: /home/mysql/db_orch1/mysql.sock
Binary data as: Hexadecimal
Uptime: 1 day 22 hours 39 min 34 sec
查看错误日志,有一个重要的ERROR信息,提示too many open files0
2022-06-25T10:29:05.298867+08:00 0 [ERROR] [MY-012592] [InnoDB] Operating system error number 24 in a file operation.
2022-06-25T10:29:05.298916+08:00 0 [ERROR] [MY-012596] [InnoDB] Error number 24 means 'Too many open files'
2022-06-25T10:29:05.298937+08:00 0 [ERROR] [MY-012646] [InnoDB] File ./bznaqgk/sc_task_position.ibd: 'open' returned OS error 124. Cannot continue operation
2022-06-25T10:29:05.298950+08:00 0 [ERROR] [MY-012981] [InnoDB] Cannot continue operation.
2022-06-25T02:29:06.331542Z mysqld_safe Number of processes running now: 0
2.查看orch2节点的状态以及错误日志,二节点在55min之前发生了重启
/soft/mysql/bin/mysql Ver 8.0.26 for Linux on x86_64 (MySQL Community Server - GPL)
Connection id: 53
Current database:
Current user: root@10.157.200.8
SSL: Cipher in use is TLS_AES_256_GCM_SHA384
Current pager: stdout
Using outfile: ''
Using delimiter: ;
Server version: 8.0.26 MySQL Community Server - GPL
Protocol version: 10
Connection: 10.157.200.8 via TCP/IP
Server characterset: utf8mb4
Db characterset: utf8mb4
Client characterset: utf8mb4
Conn. characterset: utf8mb4
TCP port: 13306
Binary data as: Hexadecimal
Uptime: 55 min 27 sec
Threads: 11 Questions: 20655 Slow queries: 0 Opens: 326 Flush tables: 3 Open tables: 232 Queries per second avg: 6.208
查看二节点的错误日志,这次就比较清晰,得到的错误 [InnoDB] Open files 406 exceeds the limit 400,innodb打开文件数受到了限制,所以下面就要查看一下数据库有关innodb有关文件数的限制
2022-06-26T12:56:32.140572+08:00 0 [Warning] [MY-012152] [InnoDB] Open files 406 exceeds the limit 400
2022-06-26T13:18:19.851180+08:00 0 [Warning] [MY-012152] [InnoDB] Open files 406 exceeds the limit 400
2022-06-26T19:09:27.950568+08:00 0 [Warning] [MY-012152] [InnoDB] Open files 406 exceeds the limit 400
2022-06-26T19:17:09.227432+08:00 0 [Warning] [MY-012152] [InnoDB] Open files 407 exceeds the limit 400
2022-06-26T19:17:09.227462+08:00 0 [Warning] [MY-012152] [InnoDB] Open files 407 exceeds the limit 400
2022-06-26T23:46:27.098079+08:00 0 [Warning] [MY-012152] [InnoDB] Open files 406 exceeds the limit 400
2022-06-27T06:50:13.584841+08:00 0 [Warning] [MY-012152] [InnoDB] Open files 406 exceeds the limit 400
2022-06-27T07:50:18.753974+08:00 0 [Warning] [MY-012152] [InnoDB] Open files 406 exceeds the limit 400
2022-06-27T08:03:56.200996+08:00 0 [Warning] [MY-012152] [InnoDB] Open files 407 exceeds the limit 400
2022-06-27T08:03:56.201006+08:00 0 [Warning] [MY-012152] [InnoDB] Open files 407 exceeds the limit 400
2022-06-27T08:03:56.201011+08:00 0 [Warning] [MY-012152] [InnoDB] Open files 407 exceeds the limit 400
2022-06-27T08:13:20.471561+08:00 0 [ERROR] [MY-012592] [InnoDB] Operating system error number 24 in a file operation.
2022-06-27T08:13:20.471608+08:00 0 [ERROR] [MY-012596] [InnoDB] Error number 24 means 'Too many open files'
2022-06-27T08:13:20.471631+08:00 0 [ERROR] [MY-012646] [InnoDB] File ./bznaqgk/drp_synlog_detail.ibd: 'open' returned OS error 124. Cannot continue operation
2022-06-27T08:13:20.471644+08:00 0 [ERROR] [MY-012981] [InnoDB] Cannot continue operation.
2022-06-27T00:13:21.227135Z mysqld_safe Number of processes running now: 0
2022-06-27T00:13:21.233845Z mysqld_safe mysqld restarted
3.根本原因,在orch1和orch2上分别查看到有关文件数的限制分别是400和1024,那么在mysql8上直接修改这个参数是不是就行了呢
mysql> show variables like '%open_file%';
+-------------------+-------+
| Variable_name | Value |
+-------------------+-------+
| innodb_open_files | 400 |
| open_files_limit | 1024 |
+-------------------+-------+
2 rows in set (0.00 sec)
4.调整表打开数量,修改失败,只读参数,需要写进配置文件进行重启
mysql> set persist open_files_limit=60000;
ERROR 1238 (HY000): Variable 'open_files_limit' is a read only variable
mysql> set persist innodb_open_files=60000;
ERROR 1238 (HY000): Variable 'innodb_open_files' is a read only variable
mysql>
5.然后进行数据库重启,但是参数没有生效,这又是怎么回事呢。有意思的是之前这三台数据库的服务器就因为/etc/security/limits.conf 系统最大文件数不够导致连接失败,当时也并没有数据发生重启或者宕机,只是VIP连接不上,然后就进行了limits.conf的修改,此时的conf大小
这时的操作系统的limits.conf跟数据库的最大打开文件数有什么关系么?在配置文件中修改了参数,但是重启完并没有生效。图为数据库配置文件
查询资料,centos7使用了systemd管理进程,安装mysql时mysqld_safe不再被安装。若要配置open_file_limit可通过systemctl edit mysqld,编辑service模块。只需要使用root用户启动mysqld_safe进程可使参数生效,mysql在启动时,这个参数参考的是操作系统的limits.conf的配置
6.所以在使用root用户进行重启之后,参数正常。用root启动成功的标志是grep mysql进程属主是属于mysql的。这里如果没有单独设置set persist innodb_open_files的值,此时的innodb_open_files的默认值就会是63000
7.在另外连个节点上进行参数的修改以及使用root用户启动mysql进程,就可以解决问题
手机扫一扫
移动阅读更方便
你可能感兴趣的文章