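【Environment】 Test environment
【OB or other component】 OB
【Version】 V3
【Problem description】 Describe the problem clearly and precisely
【Steps to reproduce】 Operations performed before and after the problem occurred
【Symptoms and impact】
Virtual machines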
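The OB cluster (1-1-1) was deployed with OCP. The day before, a network or host problem took all 3 OBServer nodes offline. About a day later the observer hosts came back online and could be reached over ssh. I logged in to each host, confirmed that clock synchronization was fine, and then started the observer process manually. After the observer process had started successfully on every host, I tried to log in to OB, and no password was required!!! It looks as if the cluster had never been initialized, yet all the data files are still there.
Log directory: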
[root@OLMSH log]# ls |wc -l
601
[root@OLMSH log]# du -h
76G .
[root@OLMSH log]# ls -lt|more
total 79208684
-rw-r--r-- 1 admin admin 85174502 Jul 6 11:38 obesi-daemon.log
-rw-r--r-- 1 admin admin 106645551 Jul 6 11:38 observer.log
-rw-r--r-- 1 admin admin 1719734 Jul 6 11:38 observer.log.wf
-rw-r--r-- 1 admin admin 168834953 Jul 6 11:38 election.log
-rw-r--r-- 1 admin admin 81137146 Jul 6 11:38 rootservice.log
-rw-r--r-- 1 admin admin 268435791 Jul 6 11:36 observer.log.20230706113652099
-rw-r--r-- 1 admin admin 16346 Jul 6 11:36 observer.log.wf.20230706113652099
-rw-r--r-- 1 admin admin 268439580 Jul 6 11:35 observer.log.20230706113505875
-rw-r--r-- 1 admin admin 294 Jul 6 11:33 observer.log.wf.20230706113505875
-rw-r--r-- 1 admin admin 268443760 Jul 6 11:33 observer.log.20230706113336168
-rw-r--r-- 1 admin admin 294 Jul 6 11:32 observer.log.wf.20230706113336168
-rw-r--r-- 1 admin admin 268435481 Jul 6 11:32 observer.log.20230706113208721
-rw-r--r-- 1 admin admin 410317 Jul 6 11:31 observer.log.wf.20230706113208721
-rw-r--r-- 1 admin admin 268435683 Jul 6 11:30 observer.log.20230706113053011
-rw-r--r-- 1 admin admin 268436618 Jul 6 11:29 observer.log.20230706112928330
-rw-r--r-- 1 admin admin 294 Jul 6 11:29 observer.log.wf.20230706113053011
-rw-r--r-- 1 admin admin 2532525 Jul 6 11:28 observer.log.wf.20230706112928330
I can't read the logs: opening any log file floods the screen and hangs the terminal, and tail keeps scrolling nonstop too!!
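One possible way to look at a log that is still being written, as a minimal sketch: print a fixed number of lines from the end of the file once, instead of following it. The helper name snapshot_tail and the default of 200 lines are made up for illustration; it only wraps the standard tail command.

```shell
# snapshot_tail FILE [N]  (hypothetical helper, not an OceanBase tool)
# Print the last N lines of FILE once (default 200) and exit,
# instead of following the file the way `tail -f` does.
snapshot_tail() {
  tail -n "${2:-200}" "$1"
}

# Example, run inside the log directory:
#   snapshot_tail observer.log.wf 50
#   snapshot_tail observer.log | less
```

Because the output has a fixed size, the terminal should not keep scrolling even while observer is still appending to the file.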
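After logging in to OB, show database fails as follows. It looks uninitialized, because the login password I had set earlier has also become empty.
In the end I could only kill the observer process and then look at the logs, shown below (excerpt). It is the 4388 internal error again (internal errcode -4013), memory allocation failed.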
[2023-07-06 11:38:38.808142] ERROR issue_dba_error (ob_log.cpp:2322) [29864][0][Y0-0000000000000000-0-0] [lt=42216] [dc=0][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-4013, file="achunk_mgr.cpp", line_no=127, info="low alloc fail")
[2023-07-06 11:38:38.808202] ERROR issue_dba_error (ob_log.cpp:2322) [29864][0][Y0-0000000000000000-0-0] [lt=23] [dc=0][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-4013, file="ob_malloc.h", line_no=54, info="allocate memory fail")
[2023-07-06 11:38:38.849800] ERROR issue_dba_error (ob_log.cpp:2322) [29884][0][Y0-0000000000000000-0-0] [lt=21] [dc=0][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-4013, file="ob_log_direct_reader.cpp", line_no=69, info="ob_malloc fail")
[2023-07-06 11:38:38.857308] ERROR issue_dba_error (ob_log.cpp:2322) [29864][0][Y0-0000000000000000-0-0] [lt=10] [dc=0][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-4013, file="achunk_mgr.cpp", line_no=127, info="low alloc fail")
[2023-07-06 11:38:38.857341] ERROR issue_dba_error (ob_log.cpp:2322) [29864][0][Y0-0000000000000000-0-0] [lt=18] [dc=0][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-4013, file="ob_malloc.h", line_no=54, info="allocate memory fail")
[2023-07-06 11:38:38.874231] ERROR issue_dba_error (ob_log.cpp:2322) [29864][0][Y0-0000000000000000-0-0] [lt=7] [dc=0][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-4013, file="achunk_mgr.cpp", line_no=127, info="low alloc fail")
[2023-07-06 11:38:38.874274] ERROR issue_dba_error (ob_log.cpp:2322) [29864][0][Y0-0000000000000000-0-0] [lt=18] [dc=0][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-4013, file="ob_malloc.h", line_no=54, info="allocate memory fail")
[2023-07-06 11:38:38.892255] ERROR issue_dba_error (ob_log.cpp:2322) [29864][0][Y0-0000000000000000-0-0] [lt=7] [dc=0][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-4013, file="achunk_mgr.cpp", line_no=127, info="low alloc fail")
[2023-07-06 11:38:39.388893] ERROR issue_dba_error (ob_log.cpp:2322) [29864][0][Y0-0000000000000000-0-0] [lt=58] [dc=0][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-4013, file="ob_malloc.h", line_no=54, info="allocate memory fail")
[2023-07-06 11:38:39.772786] ERROR issue_dba_error (ob_log.cpp:2322) [29864][0][Y0-0000000000000000-0-0] [lt=146548] [dc=0][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-4013, file="achunk_mgr.cpp", line_no=127, info="low alloc fail")
[2023-07-06 11:38:40.893676] ERROR issue_dba_error (ob_log.cpp:2322) [29864][0][Y0-0000000000000000-0-0] [lt=171693] [dc=0][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-4013, file="ob_malloc.h", line_no=54, info="allocate memory fail")
[2023-07-06 11:38:41.256685] ERROR issue_dba_error (ob_log.cpp:2322) [29864][0][Y0-0000000000000000-0-0] [lt=53915] [dc=0][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-4013, file="achunk_mgr.cpp", line_no=127, info="low alloc fail")
[2023-07-06 11:38:42.318609] ERROR issue_dba_error (ob_log.cpp:2322) [29884][0][Y0-0000000000000000-0-0] [lt=318110] [dc=0][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-4013, file="achunk_mgr.cpp", line_no=127, info="low alloc fail")
[2023-07-06 11:38:42.506838] ERROR issue_dba_error (ob_log.cpp:2322) [29864][0][Y0-0000000000000000-0-0] [lt=31175] [dc=0][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-4013, file="ob_malloc.h", line_no=54, info="allocate memory fail")
[2023-07-06 11:38:43.629141] ERROR issue_dba_error (ob_log.cpp:2322) [29864][0][Y0-0000000000000000-0-0] [lt=417806] [dc=0][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-4013, file="achunk_mgr.cpp", line_no=127, info="low alloc fail")
[2023-07-06 11:38:44.228812] ERROR issue_dba_error (ob_log.cpp:2322) [29884][0][Y0-0000000000000000-0-0] [lt=50688] [dc=0][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-4013, file="ob_malloc.h", line_no=54, info="allocate memory fail")
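It looks like the log writing itself has used up most of the memory. The key question is: why does it keep writing logs nonstop?? The same error, flooding the screen endlessly!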
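Since the excerpt is the same few messages repeated, one way to confirm that without scrolling is to count the distinct error sources. This is only a rough sketch: it assumes the ERROR lines keep the `file=..., line_no=..., info=...` layout seen in the excerpt, summarize_errors is a hypothetical helper name, and `grep -o` is assumed to be available (GNU and BSD grep both provide it).

```shell
# summarize_errors FILE  (hypothetical helper, not an OceanBase tool)
# Count how often each (file, line_no, info) triple appears on ERROR lines,
# most frequent first. Assumes the line layout shown in the excerpt.
summarize_errors() {
  grep 'ERROR' "$1" \
    | grep -o 'file=[^,]*, line_no=[0-9]*, info=[^)]*' \
    | sort | uniq -c | sort -rn
}

# Example, run inside the log directory:
#   summarize_errors observer.log.wf
```

Each output line is a count followed by the source triple, so a handful of lines can stand in for the whole flood and show which allocation site fails most often.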