Freshly installed OB logs a large number of errors (errcode=-4389)

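OS: openEuler 20.03
Machine configuration: 32 GB memory, 8 CPUs (virtual machines on the same physical host)
Test environment (no workload at all)
OB version: v3.5
Deployed with obd, all parameters left at their defaults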

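The configuration is not low-end, yet with no workload at all it keeps reporting [errcode=-4389] write log cost too much
Which parameters should I change? Or are virtual machines simply unsuitable?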

[2025-08-13 14:42:28.814821] WDIAG [PALF] inner_append_log (palf_handle_impl.cpp:2236) [2956194][T1_IOWorker][T1][Y0-0000000000000000-0-0] [lt=118][errcode=-4389] write log cost too much time(ret=-4389, this={palf_id:1, self:"172.10.10.41:2882", has_set_deleted:false}, lsn_array=[{lsn:107081719}], scn_array=[{val:1755067348732092000, v:0}], curr_size=598, accum_size=1318, time_cost=27077)
[2025-08-13 14:42:28.943720] WDIAG [PALF] inner_append_log (palf_handle_impl.cpp:2236) [2956194][T1_IOWorker][T1][Y0-0000000000000000-0-0] [lt=88][errcode=-4389] write log cost too much time(ret=-4389, this={palf_id:1, self:"172.10.10.41:2882", has_set_deleted:false}, lsn_array=[{lsn:107082317}], scn_array=[{val:1755067348732092001, v:0}], curr_size=122, accum_size=1440, time_cost=13767)
[2025-08-13 14:42:28.969022] WDIAG [PALF] inner_append_log (palf_handle_impl.cpp:2236) [2956194][T1_IOWorker][T1][Y0-0000000000000000-0-0] [lt=83][errcode=-4389] write log cost too much time(ret=-4389, this={palf_id:1, self:"172.10.10.41:2882", has_set_deleted:false}, lsn_array=[{lsn:107082439}], scn_array=[{val:1755067348931406000, v:0}], curr_size=598, accum_size=2038, time_cost=17346)
[2025-08-13 14:42:29.073260] WDIAG [SERVER.OMT] check_cgroup_root_dir (ob_cgroup_ctrl.cpp:212) [2956022][MultiTenant][T0][Y0-0000000000000000-0-0] [lt=35][errcode=-4027] dir not exist(OBSERVER_ROOT_CGROUP_DIR="cgroup", ret=-4027)
[2025-08-13 14:42:29.153552] WDIAG [PALF] inner_append_log (palf_handle_impl.cpp:2236) [2956194][T1_IOWorker][T1][Y0-0000000000000000-0-0] [lt=111][errcode=-4389] write log cost too much time(ret=-4389, this={palf_id:1, self:"172.10.10.41:2882", has_set_deleted:false}, lsn_array=[{lsn:107083037}], scn_array=[{val:1755067348931406001, v:0}], curr_size=122, accum_size=2160, time_cost=23296)
[2025-08-13 14:42:29.209731] WDIAG [PALF] inner_append_log (palf_handle_impl.cpp:2236) [2956194][T1_IOWorker][T1][Y0-0000000000000000-0-0] [lt=66][errcode=-4389] write log cost too much time(ret=-4389, this={palf_id:1, self:"172.10.10.41:2882", has_set_deleted:false}, lsn_array=[{lsn:107083159}], scn_array=[{val:1755067349132240000, v:0}], curr_size=596, accum_size=2756, time_cost=56072)
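
To get a feel for how slow the log writes actually are, the time_cost values can be pulled out of warnings like the ones above and summarized. The following is only a rough sketch: the regular expression is written against the log lines quoted in this thread, the function names slow_writes and summarize are made up for illustration, and the unit of time_cost is assumed (not confirmed here) to be microseconds.

```python
import re
import statistics

# Pattern written against the -4389 warnings quoted in this thread.
# Assumption: time_cost is in microseconds (not confirmed by the thread).
PATTERN = re.compile(
    r"^\[(?P<ts>[^\]]+)\].*\[errcode=-4389\].*"
    r"curr_size=(?P<size>\d+).*time_cost=(?P<cost>\d+)\)"
)


def slow_writes(lines):
    """Yield (timestamp, curr_size, time_cost) for each -4389 warning line."""
    for line in lines:
        m = PATTERN.search(line)
        if m:
            yield m["ts"], int(m["size"]), int(m["cost"])


def summarize(lines):
    """Return count/min/median/max of time_cost, or None if nothing matched."""
    costs = [cost for _, _, cost in slow_writes(lines)]
    if not costs:
        return None
    return {
        "count": len(costs),
        "min": min(costs),
        "median": statistics.median(costs),
        "max": max(costs),
    }
```

It can be fed an open log file, for example `summarize(open("observer.log", errors="replace"))`, where the file name is just a placeholder. Lines with other error codes, such as the -4027 cgroup warning, are skipped.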

Are the disks SSDs or spinning disks? Are the data disk and the clog disk the same disk? You can run an obdiag inspection first.

Deployment environment check:

obdiag check run --cases=build_before
https://www.oceanbase.com/docs/common-obdiag-cn-1000000003607664

SAS disks; data and logs are on the same disk.
Installing the diagnostic tool fails with an error:
[admin@ob4 rpms]$ obd obdiag deploy
[ERROR] No such deploy: deploy.
See https://www.oceanbase.com/product/ob-deployer/error-codes .
Trace ID: 7b5d1126-7821-11f0-b359-000c29fcb17a
If you want to view detailed obd logs, please run: obd display-trace 7b5d1126-7821-11f0-b359-000c29fcb17a
[admin@ob4 rpms]$ obd display-trace 7b5d1126-7821-11f0-b359-000c29fcb17a
[2025-08-13 16:42:44.744] [DEBUG] - cmd: ['deploy']
[2025-08-13 16:42:44.744] [DEBUG] - opts: {}
[2025-08-13 16:42:44.744] [DEBUG] - mkdir /home/admin/.obd/lock/
[2025-08-13 16:42:44.744] [DEBUG] - unknown lock mode
[2025-08-13 16:42:44.745] [DEBUG] - try to get share lock /home/admin/.obd/lock/global
[2025-08-13 16:42:44.745] [DEBUG] - share lock /home/admin/.obd/lock/global, count 1
[2025-08-13 16:42:44.745] [DEBUG] - try to get exclusive lock /home/admin/.obd/lock/global
[2025-08-13 16:42:44.745] [DEBUG] - exclusive lock /home/admin/.obd/lock/global, count 1
[2025-08-13 16:42:44.745] [DEBUG] - Get Deploy by name
[2025-08-13 16:42:44.745] [DEBUG] - mkdir /home/admin/.obd/cluster/
[2025-08-13 16:42:44.745] [DEBUG] - mkdir /home/admin/.obd/config_parser/
[2025-08-13 16:42:44.746] [DEBUG] - try to get share lock /home/admin/.obd/lock/deploy_deploy
[2025-08-13 16:42:44.746] [DEBUG] - share lock /home/admin/.obd/lock/deploy_deploy, count 1
[2025-08-13 16:42:44.746] [ERROR] [ERROR] No such deploy: deploy.
[2025-08-13 16:42:44.746] [DEBUG] - share lock /home/admin/.obd/lock/deploy_deploy release, count 0
[2025-08-13 16:42:44.746] [DEBUG] - unlock /home/admin/.obd/lock/deploy_deploy
[2025-08-13 16:42:44.746] [DEBUG] - exclusive lock /home/admin/.obd/lock/global release, count 0
[2025-08-13 16:42:44.746] [DEBUG] - try to get share lock /home/admin/.obd/lock/global
[2025-08-13 16:42:44.746] [DEBUG] - share lock /home/admin/.obd/lock/global release, count 0
[2025-08-13 16:42:44.746] [DEBUG] - unlock /home/admin/.obd/lock/global
[2025-08-13 16:42:44.746] [INFO] See https://www.oceanbase.com/product/ob-deployer/error-codes .
[2025-08-13 16:42:44.746] [INFO] Trace ID: 7b5d1126-7821-11f0-b359-000c29fcb17a
[2025-08-13 16:42:44.746] [INFO] If you want to view detailed obd logs, please run: obd display-trace 7b5d1126-7821-11f0-b359-000c29fcb17a

I'll open a separate thread for the diagnostic tool error.


It is probably I/O wait caused by the spinning disks. You can use the tsar or sar command to look at the disk I/O.
https://www.oceanbase.com/knowledge-base/oceanbase-database-1000000000532684


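How do I read this? Apart from OB there should be basically no other I/O.
Besides changing the log level, is there a way to suppress just this one error?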

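If you are just trying out OB or using it in a test environment, you can ignore this error. Spinning disks are not recommended for production, though.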


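1. When the system is under load, monitor disk I/O on the server: iostat -mxdt 3
2. In OCP, go to Tenant -> Performance Monitoring -> Storage & Cache and check physical I/O count, physical I/O throughput, and physical I/O latency.
3. In OCP, go to Tenant -> Performance Monitoring -> Performance & SQL and check clog sync latency, tenant CPU consumption, and memory usage.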
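
For step 1, the iostat output can also be checked by a small script rather than by eye. This is a minimal sketch, not an official tool: it assumes the extended-statistics layout with a header row starting with "Device" and columns named w_await and %util, which depends on the sysstat version. The function names and the thresholds (10 ms write wait, 80% utilization) are hypothetical values picked for illustration only.

```python
def parse_iostat(text):
    """Parse `iostat -x` style text into one {column: value} dict per device row."""
    header = None
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # Older sysstat prints "Device:", newer prints "Device".
        if fields[0].rstrip(":") == "Device":
            header = ["Device"] + fields[1:]
            continue
        if header and len(fields) == len(header):
            row = {"Device": fields[0]}
            for name, value in zip(header[1:], fields[1:]):
                try:
                    row[name] = float(value)
                except ValueError:
                    break  # not a numeric device row (e.g. a timestamp line)
            else:
                rows.append(row)
    return rows


def busy_devices(rows, await_ms=10.0, util_pct=80.0):
    """Return rows above the thresholds; both defaults are hypothetical."""
    return [
        r for r in rows
        if r.get("w_await", 0.0) > await_ms or r.get("%util", 0.0) > util_pct
    ]
```

Captured output (for example from `iostat -mxdt 3 > iostat.txt`) can be passed in as a string. A device that keeps showing up in busy_devices while the -4389 warnings are being logged would point toward the disk rather than OB parameters.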