A cluster created by OCP is unavailable because one of its hosts cannot be reached. How can this be fixed?

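[Environment] Test environment
[OB or other component]
[Version] OCP: 4.3.6-20250709105610; observer: 4.2.1.8
[Problem description] A cluster created by OCP is unavailable because one of its hosts cannot be reached. How can this be fixed?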

Check the tenant's replica distribution. Under normal circumstances a 1-1-1 cluster should still be able to provide service.

Upvoted.

Check the cluster's alert logs and troubleshoot from what they report.

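The screenshot shows that one host is unavailable, and that has made the cluster unavailable. Check which host RootService is running on; my guess is that it was not automatically switched to another host.
In the overview shown in the screenshot, check whether RootService is on host 178, the one that is down.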

Learning from the experts here.

Could it be that with one host down, the remaining hosts no longer form a Paxos majority? Is that possible?
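
As a rough check on the majority question, here is a minimal Python sketch. It assumes the usual three-replica layout of a 1-1-1 cluster (one replica per zone) and the standard Paxos rule that a strict majority of replicas must be alive. The function names are made up for illustration and are not part of OceanBase or OCP.

```python
def majority(replica_count: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    return replica_count // 2 + 1


def has_quorum(replica_count: int, failed: int) -> bool:
    """True if the surviving replicas still form a majority."""
    return replica_count - failed >= majority(replica_count)


# Assumed layout: 3 replicas (1-1-1) with one host down.
print(has_quorum(3, 1))  # 2 of 3 survive, majority is 2
# Same layout with two hosts down.
print(has_quorum(3, 2))  # only 1 of 3 survives
```

Under these assumptions a single failed host does not by itself break the majority. If the cluster is still unavailable, it is worth confirming that every tenant really has three replicas, which is what checking the replica distribution would reveal.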

Impressive.

Should I look up RootService in OCP's own database or in the cluster itself? If it is in the cluster itself, I cannot even connect to it.

There is no observer process running on host 178.

[2026-05-21 09:15:32.537244] INFO [SERVER] destroy (ob_server.cpp:530) [1427][observer][T0][Y0-0000000000000000-0-0] [lt=4] [OBSERVER_NOTICE] destroy observer begin
[2026-05-21 09:15:32.537250] INFO [SERVER] destroy (ob_server.cpp:532) [1427][observer][T0][Y0-0000000000000000-0-0] [lt=6] begin to destroy config manager
[2026-05-21 09:15:32.537253] INFO [SERVER] destroy (ob_server.cpp:534) [1427][observer][T0][Y0-0000000000000000-0-0] [lt=3] destroy config manager success
[2026-05-21 09:15:32.537257] WDIAG [CLOG] destroy (ob_server_log_block_mgr.cpp:125) [1427][observer][T0][Y0-0000000000000000-0-0] [lt=2][errcode=0] ObServerLogBlockMgr destroy(this={dir::"", dir_fd:-1, meta_fd:-1, log_pool_meta:{curr_total_size:0, next_total_size:0, status:0}, min_block_id:0, max_block_id:0, min_log_disk_size_for_all_tenants_:0, is_inited:false})
[2026-05-21 09:15:32.560944] INFO [STORAGE.TRANS] destroy (ob_weak_read_service.cpp:54) [1427][observer][T0][Y0-0000000000000000-0-0] [lt=8] [WRS] weak read service begin destroy
[2026-05-21 09:15:32.560971] INFO [STORAGE.TRANS] destroy (ob_weak_read_service.cpp:60) [1427][observer][T0][Y0-0000000000000000-0-0] [lt=27] [WRS] weak read service destroy succ
[2026-05-21 09:15:32.789738] WDIAG begin (ob_hashtable.h:914) [1427][observer][T0][Y0-0000000000000000-0-0] [lt=10][errcode=-4006] hashtable not init, backtrace=0x12435d5c 0x8970e64 0x89c861e 0x8aff243 0xa6b4219 0x700a7ba47a76 0x700a7ba47bbe 0x700a7ba2a1d1 0x700a7ba2a28b 0x52dde1e
[2026-05-21 09:15:32.789788] WDIAG begin (ob_hashtable.h:914) [1427][observer][T0][Y0-0000000000000000-0-0] [lt=36][errcode=-4006] hashtable not init, backtrace=0x12435d5c 0x8ace8ca 0x89c86d7 0x8aff243 0xa6b4219 0x700a7ba47a76 0x700a7ba47bbe 0x700a7ba2a1d1 0x700a7ba2a28b 0x52dde1e
[2026-05-21 09:15:32.789801] WDIAG begin (ob_hashtable.h:914) [1427][observer][T0][Y0-0000000000000000-0-0] [lt=4][errcode=-4006] hashtable not init, backtrace=0x12435d5c 0x8acea16 0x89c8797 0x8aff243 0xa6b4219 0x700a7ba47a76 0x700a7ba47bbe 0x700a7ba2a1d1 0x700a7ba2a28b 0x52dde1e
[2026-05-21 09:15:32.789809] WDIAG begin (ob_hashtable.h:914) [1427][observer][T0][Y0-0000000000000000-0-0] [lt=3][errcode=-4006] hashtable not init, backtrace=0x12435d5c 0x8970e64 0x89c886e 0x8aff243 0xa6b4219 0x700a7ba47a76 0x700a7ba47bbe 0x700a7ba2a1d1 0x700a7ba2a28b 0x52dde1e
[2026-05-21 09:15:32.789820] WDIAG begin (ob_hashtable.h:914) [1427][observer][T0][Y0-0000000000000000-0-0] [lt=2][errcode=-4006] hashtable not init, backtrace=0x12435d5c 0x8aceb62 0x89c8927 0x8aff243 0xa6b4219 0x700a7ba47a76 0x700a7ba47bbe 0x700a7ba2a1d1 0x700a7ba2a28b 0x52dde1e
[2026-05-21 09:15:32.789995] INFO [SHARE.LOCATION] destroy (ob_tablet_location_refresh_service.cpp:333) [1427][observer][T0][Y0-0000000000000000-0-0] [lt=2] [REFRESH_TABLET_LOCATION] destroy service begin
[2026-05-21 09:15:32.790002] INFO [SHARE.LOCATION] stop (ob_tablet_location_refresh_service.cpp:317) [1427][observer][T0][Y0-0000000000000000-0-0] [lt=6] [REFRESH_TABLET_LOCATION] stop service begin
[2026-05-21 09:15:32.790007] WDIAG [SHARE] logical_stop (ob_reentrant_thread.cpp:103) [1427][observer][T0][Y0-0000000000000000-0-0] [lt=3][errcode=-4006] not init(ret=-4006)
[2026-05-21 09:15:32.790013] INFO [SHARE.LOCATION] stop (ob_tablet_location_refresh_service.cpp:320) [1427][observer][T0][Y0-0000000000000000-0-0] [lt=5] [REFRESH_TABLET_LOCATION] stop service end
[2026-05-21 09:15:32.790017] INFO [SHARE.LOCATION] wait (ob_tablet_location_refresh_service.cpp:325) [1427][observer][T0][Y0-0000000000000000-0-0] [lt=4] [REFRESH_TABLET_LOCATION] wait service begin
[2026-05-21 09:15:32.790020] WDIAG [SHARE] logical_wait (ob_reentrant_thread.cpp:118) [1427][observer][T0][Y0-0000000000000000-0-0] [lt=3][errcode=-4006] not init(ret=-4006)
[2026-05-21 09:15:32.790024] INFO [SHARE.LOCATION] wait (ob_tablet_location_refresh_service.cpp:328) [1427][observer][T0][Y0-0000000000000000-0-0] [lt=4] [REFRESH_TABLET_LOCATION] wait service end
[2026-05-21 09:15:32.790037] INFO [SHARE.LOCATION] destroy (ob_tablet_location_refresh_service.cpp:352) [1427][observer][T0][Y0-0000000000000000-0-0] [lt=12] [REFRESH_TABLET_LOCATION] destroy service end
[2026-05-21 09:15:32.879417] WDIAG ~ObConfigManager (ob_config_manager.cpp:34) [1427][observer][T0][Y0-0000000000000000-0-0] [lt=3][errcode=-4016] null tg(lib::TGDefIDs::CONFIG_MGR=119)
[2026-05-21 09:15:32.879456] WDIAG ~ObConfigManager (ob_config_manager.cpp:35) [1427][observer][T0][Y0-0000000000000000-0-0] [lt=38][errcode=-4016] null tg(lib::TGDefIDs::CONFIG_MGR=119)
[2026-05-21 09:15:32.950540] WDIAG begin (ob_hashtable.h:914) [1427][observer][T0][Y0-0000000000000000-0-0] [lt=3][errcode=-4006] hashtable not init, backtrace=0x12435d5c 0x116204cc 0x11418847 0xfae7886 0x700a7ba47a76 0x700a7ba47bbe 0x700a7ba2a1d1 0x700a7ba2a28b 0x52dde1e
[2026-05-21 09:15:32.950606] WDIAG begin (ob_hashtable.h:914) [1427][observer][T0][Y0-0000000000000000-0-0] [lt=41][errcode=-4006] hashtable not init, backtrace=0x12435d5c 0x116204cc 0x11418847 0x11418489 0xfae7955 0x700a7ba47a76 0x700a7ba47bbe 0x700a7ba2a1d1 0x700a7ba2a28b 0x52dde1e
[2026-05-21 09:15:33.268070] WDIAG begin (ob_hashtable.h:914) [1427][observer][T0][Y0-0000000000000000-0-0] [lt=5][errcode=-4006] hashtable not init, backtrace=0x12435d5c 0x104babd0 0x102a2fdc 0x102a2de9 0x700a7ba47a76 0x700a7ba47bbe 0x700a7ba2a1d1 0x700a7ba2a28b 0x52dde1e
[2026-05-21 09:15:33.287937] WDIAG begin (ob_hashtable.h:914) [1427][observer][T0][Y0-0000000000000000-0-0] [lt=36][errcode=-4006] hashtable not init, backtrace=0x12435d5c 0xfc2609a 0x102a30e3 0x102a2de9 0x700a7ba47a76 0x700a7ba47bbe 0x700a7ba2a1d1 0x700a7ba2a28b 0x52dde1e
[2026-05-21 09:15:33.287983] WDIAG begin (ob_hashtable.h:914) [1427][observer][T0][Y0-0000000000000000-0-0] [lt=37][errcode=-4006] hashtable not init, backtrace=0x12435d5c 0xfc2609a 0x102a31e3 0x102a2de9 0x700a7ba47a76 0x700a7ba47bbe 0x700a7ba2a1d1 0x700a7ba2a28b 0x52dde1e
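
To get a quick overview of a log excerpt like the one above, a short Python sketch can tally the entries by level and error code. The regular expressions are an assumption based only on the lines shown in this post (a bracketed timestamp, a level such as INFO or WDIAG, and an optional `errcode=` field). This is not an official observer log parser, and the `summarize` helper is hypothetical.

```python
import re
from collections import Counter

# Patterns inferred from the excerpt: "[timestamp] LEVEL ..." plus an optional errcode.
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z]+)\b")
ERRCODE_RE = re.compile(r"\[errcode=(?P<code>-?\d+)\]")


def summarize(lines):
    """Count log entries per level and per non-zero errcode."""
    levels, errcodes = Counter(), Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip blank or unrecognised lines
        levels[m.group("level")] += 1
        e = ERRCODE_RE.search(line)
        if e and e.group("code") != "0":
            errcodes[int(e.group("code"))] += 1
    return levels, errcodes


# Three lines copied (one shortened) from the excerpt above.
sample = [
    "[2026-05-21 09:15:32.537244] INFO [SERVER] destroy (ob_server.cpp:530) [1427][observer][T0][Y0-0000000000000000-0-0] [lt=4] [OBSERVER_NOTICE] destroy observer begin",
    "[2026-05-21 09:15:32.789738] WDIAG begin (ob_hashtable.h:914) [1427][observer][T0][Y0-0000000000000000-0-0] [lt=10][errcode=-4006] hashtable not init",
    "[2026-05-21 09:15:32.879417] WDIAG ~ObConfigManager (ob_config_manager.cpp:34) [1427][observer][T0][Y0-0000000000000000-0-0] [lt=3][errcode=-4016] null tg(lib::TGDefIDs::CONFIG_MGR=119)",
]
levels, errcodes = summarize(sample)
print(levels)
print(errcodes)
```

The excerpt begins at "destroy observer begin", so whatever triggered the shutdown is not visible in it. A tally like this only helps confirm that the warnings shown (-4006, -4016) were emitted while the process was already being torn down.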

What about starting that node manually?

How can this be troubleshot without any logs?

Following along to learn.

Shared experience like this is very valuable.

Can the node be started manually on its own?

Looking forward to more posts like this.