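OMS unexpected error

[Environment] Production
[OB or other component] OMS 4.2.5
[Version] OMS 4.2.5
[Problem description]
1. While OMS was running, the store of one task suddenly reported an error.

2. I checked the error log: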
[2024-10-11 09:06:37.012227] WDIAG [LIB] print_leak_slice (ob_slice_alloc.cpp:40) [3924566][][T0][Y0-0000000000000000-0-0] [lt=6][errcode=-4016] (item=0x7ff071be80d0, slice=0x7ff071be80e0)
[2024-10-11 09:06:37.012347] WDIAG [LIB] print_leak_slice (ob_slice_alloc.cpp:40) [3924566][][T0][Y0-0000000000000000-0-0] [lt=15][errcode=-4016] (item=0x7ff071be8220, slice=0x7ff071be8230)
[2024-10-11 09:06:37.012500] WDIAG [LIB] print_leak_slice (ob_slice_alloc.cpp:40) [3924566][][T0][Y0-0000000000000000-0-0] [lt=6][errcode=-4016] (item=0x7ff071be8370, slice=0x7ff071be8380)
[2024-10-11 09:06:37.012614] WDIAG [LIB] print_leak_slice (ob_slice_alloc.cpp:40) [3924566][][T0][Y0-0000000000000000-0-0] [lt=8][errcode=-4016] (item=0x7ff071be84c0, slice=0x7ff071be84d0)
[2024-10-11 09:06:37.012730] WDIAG [LIB] print_leak_slice (ob_slice_alloc.cpp:40) [3924566][][T0][Y0-0000000000000000-0-0] [lt=6][errcode=-4016] (item=0x7ff071be8610, slice=0x7ff071be8620)
[2024-10-11 09:06:37.012846] WDIAG [LIB] print_leak_slice (ob_slice_alloc.cpp:40) [3924566][][T0][Y0-0000000000000000-0-0] [lt=7][errcode=-4016] (item=0x7ff071be8760, slice=0x7ff071be8770)
[2024-10-11 09:06:37.012956] WDIAG [LIB] print_leak_slice (ob_slice_alloc.cpp:40) [3924566][][T0][Y0-0000000000000000-0-0] [lt=5][errcode=-4016] (item=0x7ff071be88b0, slice=0x7ff071be88c0)
[2024-10-11 09:06:37.013071] WDIAG [LIB] print_leak_slice (ob_slice_alloc.cpp:40) [3924566][][T0][Y0-0000000000000000-0-0] [lt=6][errcode=-4016] (item=0x7ff071be8a00, slice=0x7ff071be8a10)
[2024-10-11 09:06:37.013184] WDIAG [LIB] print_leak_slice (ob_slice_alloc.cpp:40) [3924566][][T0][Y0-0000000000000000-0-0] [lt=5][errcode=-4016] (item=0x7ff071be8b50, slice=0x7ff071be8b60)
[2024-10-11 09:06:37.013295] WDIAG [LIB] print_leak_slice (ob_slice_alloc.cpp:40) [3924566][][T0][Y0-0000000000000000-0-0] [lt=6][errcode=-4016] (item=0x7ff071be8ca0, slice=0x7ff071be8cb0)
[2024-10-11 09:06:37.013408] WDIAG [LIB] print_leak_slice (ob_slice_alloc.cpp:40) [3924566][][T0][Y0-0000000000000000-0-0] [lt=8][errcode=-4016] (item=0x7ff071be8df0, slice=0x7ff071be8e00)
[2024-10-11 09:06:37.013530] WDIAG [LIB] print_leak_slice (ob_slice_alloc.cpp:40) [3924566][][T0][Y0-0000000000000000-0-0] [lt=6][errcode=-4016] (item=0x7ff071be8f40, slice=0x7ff071be8f50)
[2024-10-11 09:06:37.013644] WDIAG [LIB] print_leak_slice (ob_slice_alloc.cpp:40) [3924566][][T0][Y0-0000000000000000-0-0] [lt=9][errcode=-4016] (item=0x7ff071be9090, slice=0x7ff071be90a0)
[2024-10-11 09:06:37.013753] WDIAG [LIB] print_leak_slice (ob_slice_alloc.cpp:40) [3924566][][T0][Y0-0000000000000000-0-0] [lt=5][errcode=-4016] (item=0x7ff071be91e0, slice=0x7ff071be91f0)
[2024-10-11 09:06:37.013864] WDIAG [LIB] print_leak_slice (ob_slice_alloc.cpp:40) [3924566][][T0][Y0-0000000000000000-0-0] [lt=5][errcode=-4016] (item=0x7ff071be9330, slice=0x7ff071be9340)
[2024-10-11 09:06:37.013979] WDIAG [LIB] print_leak_slice (ob_slice_alloc.cpp:40) [3924566][][T0][Y0-0000000000000000-0-0] [lt=6][errcode=-4016] (item=0x7ff071be9480, slice=0x7ff071be9490)

The log keeps repeating errcode=-4016, and I found no obvious errors in the other logs.
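To confirm that -4016 really is the only error code in a large libobcdc.log, a quick tally can help. The following is a minimal sketch, assuming the log lines follow the layout shown in the excerpt of this post; the regular expression and the function name `summarize_errcodes` are illustrative and not part of OMS or libobcdc.

```python
import re
from collections import Counter

# Matches the fields visible in the excerpt: timestamp, level, module,
# function name, and the errcode tag. The layout is assumed from the
# sample lines only.
LINE_RE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\]\s+"
    r"(?P<level>\w+)\s+\[(?P<module>\w+)\]\s+(?P<func>\w+)\s+"
    r".*?\[errcode=(?P<errcode>-?\d+)\]"
)


def summarize_errcodes(lines):
    """Count (function, errcode) pairs over the lines that carry an errcode."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            counts[(match.group("func"), int(match.group("errcode")))] += 1
    return counts


# Two lines copied from the excerpt in this post.
sample = [
    "[2024-10-11 09:06:37.012227] WDIAG [LIB] print_leak_slice (ob_slice_alloc.cpp:40) "
    "[3924566][][T0][Y0-0000000000000000-0-0] [lt=6][errcode=-4016] "
    "(item=0x7ff071be80d0, slice=0x7ff071be80e0)",
    "[2024-10-11 09:06:37.012347] WDIAG [LIB] print_leak_slice (ob_slice_alloc.cpp:40) "
    "[3924566][][T0][Y0-0000000000000000-0-0] [lt=15][errcode=-4016] "
    "(item=0x7ff071be8220, slice=0x7ff071be8230)",
]

for (func, errcode), count in summarize_errcodes(sample).most_common():
    print(func, errcode, count)
```

To run it against the real file, pass an open file handle instead of `sample`, for example `summarize_errcodes(open("libobcdc.log"))`.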

[Steps to reproduce] While OMS was running, the store of one task suddenly reported an error.
[Attachments and logs]
libobcdc.log
libobcdc.log.log (193.4 KB)

congo.log
congo.log.log (51.8 KB)

Is the source OB version 4.x? If so, error code -4016 means that the clog on the source has been recycled.

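Hello, the clog shouldn't have been recycled. Under what circumstances would that happen?

Synchronization had been running all along with no latency.

While synchronization is in progress, the clog shouldn't get recycled, should it?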

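Whether the clog gets recycled depends on disk usage on the OB server: has usage reached the threshold that triggers recycling?
You can check with the queries below.
# 3.x: query the earliest log position on OB
SELECT svr_min_log_timestamp FROM
oceanbase.__all_virtual_server_clog_stat WHERE zone_status='ACTIVE';
# 4.x: under the sys tenant this covers the logs of all tenants; under other tenants you can omit the WHERE TENANT_ID=xxxx condition
select scn_to_timestamp(BEGIN_SCN),BEGIN_SCN
from oceanbase.GV$OB_LOG_STAT
where TENANT_ID=1004;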
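
One way to read the query result is to compare the earliest log timestamp it returns with the position the store needs to resume from. The sketch below assumes both values were copied by hand from the query output and from the OMS task; the function name and the sample timestamps are hypothetical and only illustrate the comparison.

```python
from datetime import datetime


def clog_still_available(store_checkpoint: datetime, earliest_clog: datetime) -> bool:
    """Return True if the store's resume position is not older than the
    earliest log still kept on the source.

    Both arguments are assumed to be in the same time zone.
    """
    return store_checkpoint >= earliest_clog


# Hypothetical values, for illustration only.
earliest_clog = datetime(2024, 10, 11, 8, 0, 0)      # from scn_to_timestamp(BEGIN_SCN)
store_checkpoint = datetime(2024, 10, 11, 7, 30, 0)  # position the store must pull from

if clog_still_available(store_checkpoint, earliest_clog):
    print("The required log is still on the source.")
else:
    print("The required log is older than the earliest clog; it was probably recycled.")
```

If the check fails, the store cannot simply resume from its old position, because the log it needs is no longer on the source.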


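Hello, we adjusted the RESOURCE UNIT of the tenant and expanded its clog capacity. Is that enough?

How can we set the tenant's clog threshold so that the clog is not recycled too readily?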

Set the log disk size to 3 or 4 times the memory limit where possible (at least 3 times in production), so that the clog disk has enough capacity.
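
As a worked example of that rule of thumb, the sketch below computes the 3x to 4x range for a given memory limit and checks a configured log disk size against the 3x production minimum. The function names and the 16 GB / 40 GB figures are hypothetical; take the real values from your own unit configuration.

```python
def log_disk_range_gb(memory_limit_gb: float) -> tuple:
    """Return the (minimum, upper) log disk size in GB: 3x and 4x the memory limit."""
    return (memory_limit_gb * 3, memory_limit_gb * 4)


def log_disk_meets_minimum(log_disk_gb: float, memory_limit_gb: float) -> bool:
    """Check a log disk size against the 3x production minimum."""
    return log_disk_gb >= memory_limit_gb * 3


# Hypothetical unit with a 16 GB memory limit and a 40 GB log disk.
low, high = log_disk_range_gb(16)
print(f"Suggested log disk size: {low} GB to {high} GB")
print("Meets the 3x minimum:", log_disk_meets_minimum(40, 16))
```

With these sample numbers the 40 GB log disk falls short of the 48 GB minimum, so it would need to be enlarged.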