OceanBase_CE 4.2.1.5: OCP web page does not respond on login

For log analysis, I can only reach the ob-server hosts on port 2881; port 2883 does not connect:
obdiag config -h 10.5.2.83 -u root -p'&EcT[pS3zE5(Cn!B=m' -P 2881
obdiag config -h 10.5.2.84 -u root -p'&EcT[pS3zE5(Cn!B=m' -P 2881
obdiag config -h 10.5.2.85 -u root -p'&EcT[pS3zE5(Cn!B=m' -P 2881
obdiag config -h 10.5.2.86 -u root -p'G&X;]]UmA}QLJ:WG' -P 2881

Once that is configured, run the log analysis:
obdiag analyze log --from "2025-08-25 11:00:00" --to "2025-08-25 11:30:00"

I don't understand this part. It is a bucket in the cloud. I'll ask someone about it later; let's set it aside for now.

obd cluster display bewg_ocp_test01

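Items 4 and 5:
The earlier cluster restart did not affect the ob-servers (10.5.2.83, 10.5.2.84, 10.5.2.85). It only brought down the database on the local host (10.5.2.86), the one behind ob-proxy, and that database then failed to start.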

I've been very busy these past few days. Could you please take another look this afternoon?

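1. It looks like 83, 84, and 85 form one cluster and 86 is a separate cluster; the cluster on 86 is the OCP metadb.
Please post the yaml file. By default it is at /home/admin/.obd/cluster/<deployment name>/config.yaml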

2. The OSS archive error may cause tenant anomalies.

3. The OB on 86 is down.

4, 5. On 86, OB failed to start, while obproxy started successfully.

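The environment looks fairly messy and there are quite a few problems. If it is convenient, please open an official bounty; then we can talk directly over DingTalk and look at the problem remotely.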

config.zip (1.1 KB)

:joy: I don't have any points.

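Indeed, 86 is the OCP metadb database. It is a single node that also hosts the obproxy used by OCP and the OCP server.
Back to the original problem: OCP login does not respond because the OBServer on 86 has failed.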

The OB on 86 did not come back up after your restart, so please post observer.log.

83, 84, and 85 are a separate OB cluster, presumably the one managed by OCP.

On August 25, the database did not come up after the cluster restart.

observer.zip (27.4 MB)

The directory /data/1/slog/server no longer seems to exist. Please check.

[2025-08-25 10:44:43.511747] WDIAG [SHARE] scan_dir (ob_local_device.cpp:476) [31849][LeaseHB][T0][Y0-0000000000000000-0-0] [lt=13][errcode=-9100] dir does not exist(ret=-9100, dir_name="/data/1/slog/server")
[2025-08-25 10:44:43.511765] WDIAG [COMMON] get_total_used_size (ob_log_file_group.cpp:167) [31849][LeaseHB][T0][Y0-0000000000000000-0-0] [lt=18][errcode=-9100] fail to scan dir(ret=-9100, log_dir="/data/1/slog/server")
[2025-08-25 10:44:43.511773] WDIAG [STORAGE.REDO] get_using_disk_space (ob_storage_log_writer.cpp:204) [31849][LeaseHB][T0][Y0-0000000000000000-0-0] [lt=7][errcode=-9100] Fail to get the used size(ret=-9100, using_space=0)
[2025-08-25 10:44:43.511782] WDIAG [STORAGE.REDO] get_using_disk_space (ob_storage_logger_manager.cpp:310) [31849][LeaseHB][T0][Y0-0000000000000000-0-0] [lt=8][errcode=-9100] fail to get using disk space(ret=-9100, using_space=0)
[2025-08-25 10:44:43.511789] WDIAG [STORAGE.REDO] get_reserved_size (ob_storage_logger_manager.cpp:349) [31849][LeaseHB][T0][Y0-0000000000000000-0-0] [lt=6][errcode=-9100] fail to get using size for slog(ret=-9100)
[2025-08-25 10:44:43.511799] WDIAG [SERVER] get_server_resource_info (ob_service.cpp:1622) [31849][LeaseHB][T0][Y0-0000000000000000-0-0] [lt=9][errcode=-9100] Failed to get reserved size (ret=-9100, ret="OB_NO_SUCH_FILE_OR_DIRECTORY")
[2025-08-25 10:44:43.511808] WDIAG [SERVER] init_lease_request (ob_heartbeat.cpp:130) [31849][LeaseHB][T0][Y0-0000000000000000-0-0] [lt=8][errcode=-9100] fail to get server resource info(ret=-9100, ret="OB_NO_SUCH_FILE_OR_DIRECTORY")
[2025-08-25 10:44:43.511821] WDIAG [SERVER] do_renew_lease (ob_lease_state_mgr.cpp:366) [31849][LeaseHB][T0][Y0-0000000000000000-0-0] [lt=13][errcode=-9100] init lease request failed(ret=-9100)
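
To pull the affected path out of a large observer.log rather than reading it by eye, something like the following may help. It is only a sketch: `missing_dirs` is a hypothetical helper name (not an obd or obdiag command), and it assumes the log keeps the `errcode=-9100` marker and the `dir_name="..."` / `log_dir="..."` fields exactly as they appear in the excerpt above.

```shell
# Hypothetical helper (not part of obd or obdiag): list each distinct
# directory that observer.log reports as missing (errcode -9100).
missing_dirs() {
  grep -F 'errcode=-9100' "$1" |
    grep -oE '(dir_name|log_dir)="[^"]+"' |
    cut -d'"' -f2 |
    sort -u
}
```

Called as `missing_dirs observer.log` on the extracted log file, it prints one missing directory per line.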

That directory does not exist. I searched, and it may have been moved under /home.
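
A search of this kind could look roughly like the sketch below. `find_slog_dirs` is a hypothetical helper name, and it assumes the relocated directory still ends in `slog/server`, as the path in the log does.

```shell
# Hypothetical helper: search a directory tree for relocated slog directories.
# Assumes the moved directory still ends in slog/server, as in the log path.
find_slog_dirs() {
  find "$1" -type d -path '*/slog/server' 2>/dev/null
}
```

For example, `find_slog_dirs /home` would list candidate locations under /home.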

The data directory has been moved away, so this OB really cannot start.

[2025-08-25 10:45:46.775446] ERROR issue_dba_error (ob_log.cpp:1891) [12185][observer][T0][Y0-0000000000000000-0-0] [lt=4][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-9100, file="ob_server.cpp", line_no=513, info="[OBSERVER_NOTICE] fail to init observer")
[2025-08-25 10:45:46.775454] EDIAG [SERVER] init (ob_server.cpp:513) [12185][observer][T0][Y0-0000000000000000-0-0] [lt=8][errcode=-9100] [OBSERVER_NOTICE] fail to init observer(ret=-9100, ret="OB_NO_SUCH_FILE_OR_DIRECTORY") BACKTRACE:0x16bb358d 0x8063add 0x8063684 0x8063312 0x804105b 0xb4f914a 0xb4eb06a 0x80331d1 0x2b1f217ff555 0x5f851a2
[2025-08-25 10:45:46.775495] ERROR init (ob_server.cpp:514) [12185][observer][T0][Y0-0000000000000000-0-0] [lt=38][errcode=-4393] observer start process failure(msg="observer init() has failure", ret=-9100, ret="OB_NO_SUCH_FILE_OR_DIRECTORY")
[2025-08-25 10:45:46.775505] ERROR issue_dba_error (ob_log.cpp:1891) [12185][observer][T0][Y0-0000000000000000-0-0] [lt=9][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-9100, file="main.cpp", line_no=586, info="observer init fail")
[2025-08-25 10:45:46.775511] EDIAG [SERVER] main (main.cpp:586) [12185][observer][T0][Y0-0000000000000000-0-0] [lt=6][errcode=-9100] observer init fail(ret=-9100) BACKTRACE:0x16bb358d 0x5df9ef0 0x803c6dc 0x803c4e4 0x5e2e385 0x80344f9 0x8033589 0x2b1f217ff555 0x5f851a2
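
When triaging a log this size, a quick tally of error codes can show which failure dominates. Here the log itself labels -9100 as OB_NO_SUCH_FILE_OR_DIRECTORY. The sketch below assumes the `errcode=-NNNN` format seen in the excerpt; `errcode_summary` is a hypothetical helper name.

```shell
# Hypothetical helper: count how often each errcode appears in a log,
# most frequent first. Output lines look like "errcode=-9100 <count>".
errcode_summary() {
  grep -oE 'errcode=-[0-9]+' "$1" | sort | uniq -c | sort -rn | awk '{print $2, $1}'
}
```

It would be run as `errcode_summary observer.log`.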

Normally these directories must not be moved.
I suggest redeploying: redeploy OCP 4.3.6, then have it take over the other cluster.

Please provide a copy of the logs so we can take a look.

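Rebuilt. The problem is solved. The root cause was probably that the data_dir and redo_dir parameters in the configuration file did not match the actual directories.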
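
One way to check for this kind of mismatch is to compare the paths in config.yaml with what exists on the host. The following is a rough sketch, not an obd feature: `yaml_values` and `check_dirs` are hypothetical helper names, the parsing assumes plain `key: /path` lines with unquoted values that contain no spaces, and it only makes sense when run on the host that owns those directories.

```shell
# Hypothetical helpers (not part of obd): compare config.yaml with the filesystem.
# Assumes plain "key: /path" lines with unquoted values that contain no spaces.

# Print every value configured for the given key ($1) in the file ($2).
yaml_values() {
  sed -nE "s/^[[:space:]]*$1:[[:space:]]*([^#[:space:]]+).*/\1/p" "$2"
}

# Report whether each configured data_dir and redo_dir exists on this host.
# Returns non-zero if any of them is missing.
check_dirs() {
  status=0
  for key in data_dir redo_dir; do
    for dir in $(yaml_values "$key" "$1"); do
      if [ -d "$dir" ]; then
        echo "OK      $key=$dir"
      else
        echo "MISSING $key=$dir"
        status=1
      fi
    done
  done
  return "$status"
}
```

For this deployment it could be invoked as `check_dirs /home/admin/.obd/cluster/bewg_ocp_test01/config.yaml`; any `MISSING` line points at a parameter that no longer matches the disk layout.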

Memory on 10.5.2.86 was not being released, so I rebooted the host.

su - admin
Stop the current cluster (in case it shows as running even though the Observer never actually came up):
obd cluster stop bewg_ocp_test01

Back up the parameter file, then edit data_dir and redo_dir in the copy so they match the actual directories:
cp /home/admin/.obd/cluster/bewg_ocp_test01/config.yaml /tmp/config.yaml
vi /tmp/config.yaml
Reset all state:
obd cluster destroy bewg_ocp_test01
Rebuild:
obd cluster deploy bewg_ocp_test01 --config=/tmp/config.yaml
obd cluster start bewg_ocp_test01

Yes, the problem was caused by OB's data_dir and redo_dir having been changed. Going forward, just use the newly built OCP to take over the other cluster.
