Use obdiag to run a pre-deployment check.
Deployment environment check:
obdiag check run --cases=build_before
https://www.oceanbase.com/docs/common-obdiag-cn-1000000003242092
You rolled back, didn't you? That would explain why the related files in the directory were cleaned up.
2025-07-25 14:49:43.998 INFO 9623 --- [manual-subtask-executor15,840818ceae3ab1e1,a6ecc6ffc7ac1820] c.o.o.s.task.util.AgentAsyncTaskHelper : try to request task result(EXECUTE), result:true,null,,<null>
Set state for subtask: 303, operation:ROLLBACK, state: PENDING
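Deploy once more, and do not roll back after the error. Then check that directory again to see whether anything is there, and grab observer.log.
Could a firewall or SELinux be restricting it?
Both are fully disabled!
Understood.
Is there any problem with root login? Some sites do not allow direct root login.
How much memory does each of these 3 machines have? The log shows memory_limit is only about 4G, and the memory configuration update failed.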
[2025-07-26 00:25:44.425669] ERROR issue_dba_error (ob_log.cpp:1875) [30216][observer][T0][Y0-0000000000000000-0-0] [lt=20][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-4147, file="ob_server_config.cpp", line_no=365, info="update observer memory config failed")
[2025-07-26 00:25:44.425690] EDIAG [SHARE.CONFIG] reload_config (ob_server_config.cpp:365) [30216][observer][T0][Y0-0000000000000000-0-0] [lt=20][errcode=-4147] update observer memory config failed(memory_limit=4751353446, system_memory=1073741824, hidden_sys_memory=1073741824, min_server_avail_memory=5368709120) BACKTRACE:0x12435d5c 0x5069065 0x4f72fb9 0x4f72aef 0x4f72a38 0x4f72861 0xfda7f37 0xfda7968 0xa6c1666 0xa6b75e5 0x72b7334 0x7fdeb1846445 0x52dde1e
[2025-07-26 00:25:44.425858] WDIAG [SHARE.CONFIG] reload_config (ob_server_config.cpp:377) [30216][observer][T0][Y0-0000000000000000-0-0] [lt=151][errcode=-4147] the hold memory of tenant_500 is over the reserved memory(tenant_500_hold=6291456, tenant_500_reserved=0)
[2025-07-26 00:25:44.425874] ERROR issue_dba_error (ob_log.cpp:1875) [30216][observer][T0][Y0-0000000000000000-0-0] [lt=10][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-4147, file="ob_server.cpp", line_no=1889, info="reload memory config failed")
[2025-07-26 00:25:44.425886] EDIAG [SERVER] init_config (ob_server.cpp:1889) [30216][observer][T0][Y0-0000000000000000-0-0] [lt=12][errcode=-4147] reload memory config failed(ret=-4147, ret="OB_INVALID_CONFIG") BACKTRACE:0x12435d5c 0x5069065 0x516175d 0x516121f 0x514d874 0x51610c3 0xa6eb2c6 0xa6c25a8 0xa6b75e5 0x72b7334 0x7fdeb1846445 0x52dde1e
[2025-07-26 00:25:44.425947] ERROR issue_dba_error (ob_log.cpp:1875) [30216][observer][T0][Y0-0000000000000000-0-0] [lt=53][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-4147, file="ob_server.cpp", line_no=264, info="init config failed")
[2025-07-26 00:25:44.425954] EDIAG [SERVER] init (ob_server.cpp:264) [30216][observer][T0][Y0-0000000000000000-0-0] [lt=7][errcode=-4147] init config failed(ret=-4147, ret="OB_INVALID_CONFIG") BACKTRACE:0x12435d5c 0x5069065 0x516175d 0x516121f 0x514d874 0x51610c3 0xa6c3d84 0xa6b9afd 0x72b7334 0x7fdeb1846445 0x52dde1e
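To make the numbers in the `update observer memory config failed` line easier to read, here is a minimal Python sketch that pulls the key=value pairs out of that message and prints them in GiB. The helper name `parse_fields` is made up for illustration, and the final comparison is an assumption drawn from the field names (that startup fails because `memory_limit` is below `min_server_avail_memory`), not something confirmed from the observer source.

```python
import re

# Message copied from the observer.log excerpt above (key=value part only).
LOG = ("update observer memory config failed(memory_limit=4751353446, "
       "system_memory=1073741824, hidden_sys_memory=1073741824, "
       "min_server_avail_memory=5368709120)")

GIB = 1024 ** 3


def parse_fields(line):
    """Return every integer key=value pair found in a log line (hypothetical helper)."""
    return {key: int(value) for key, value in re.findall(r"(\w+)=(\d+)", line)}


fields = parse_fields(LOG)
for name, value in fields.items():
    print(f"{name:>24} = {value / GIB:.2f} GiB")

# Assumption: the config is rejected because memory_limit is smaller than
# min_server_avail_memory; the exact rule inside observer may differ.
shortfall = fields["min_server_avail_memory"] - fields["memory_limit"]
print(f"memory_limit is short by {shortfall / GIB:.2f} GiB")
```

Read this way, the configured limit is a little over 4 GiB while the minimum the server asks for is 5 GiB, which matches the later finding that the 5G floor has to be lowered on small machines.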
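Because of PC resource limits, the machines were given very little memory. When I test-created a cluster with obd, it detected that observer memory was insufficient, and I had to set memory_limit before creation would succeed. After that I also had to change __min_full_resource_pool_memory before I could create a tenant (to get past the 5G minimum). My guess is that memory_limit was not set when the cluster was created through OCP, which caused the cluster creation to fail. I will rebuild the environment and reproduce it later!
What memory_limit did you set for the cluster created with obd?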
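My test environment is 1 (OCP, 9G x 1) + 3 (obproxy + observer, 6G x 3). Each observer has a little over 4G actually available. Since that is below 5G, I ran the following tests.
Test results:
(1) Creating the cluster with OBD succeeds with memory_limit=4G; before creating a tenant, __min_full_resource_pool_memory=1G must be set to get past the default 5G minimum.
(2) Creating the cluster through the OCP web console succeeds once __min_full_resource_pool_memory is set to get past the default 5G minimum.
In summary, when resources are tight, the hidden parameter __min_full_resource_pool_memory is the key!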
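As a rough illustration of the 1G setting mentioned above, the following Python sketch converts a size such as `1G` to bytes and prints a statement that could be run in the sys tenant. The helpers `to_bytes` and `min_pool_statement` are hypothetical, and both the `ALTER SYSTEM SET` form and the assumption that the parameter takes a value in bytes should be checked against the OceanBase documentation for your version before use.

```python
# Binary size suffixes, as used for values like memory_limit=4G.
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def to_bytes(size):
    """Convert a size such as '1G' or '512M' to a byte count (hypothetical helper)."""
    size = size.strip().upper()
    if size[-1] in UNITS:
        return int(size[:-1]) * UNITS[size[-1]]
    return int(size)


def min_pool_statement(size):
    """Render the statement text; assumes the parameter is set in bytes."""
    return f"ALTER SYSTEM SET __min_full_resource_pool_memory = {to_bytes(size)};"


print(min_pool_statement("1G"))
```

Generating the statement rather than typing the byte count by hand mainly helps avoid off-by-a-digit mistakes in a value this long.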
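You can try adjusting the parameters, mainly the memory ones.
Right! It mainly comes down to setting the hidden parameters.
Thanks for sharing.
Thanks for the troubleshooting approach, thumbs up!
Learning from this!
Very valuable experience to share.
Very detailed write-up.
Full of practical content, I learned a lot.
Nice one!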