烂笔头
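2025-12-22 10:35
#1
[Environment] Production or test environment
[OB or other component]
[Version] 5.7.25-OceanBase_CE-v4.3.5.4
[Problem description] OCP raised the alert "OBServer merge failed" (ob_log_merge_error): [errcode=-4034] failed to merge partition(ret=-4034). Every row in CDB_OB_MAJOR_COMPACTION shows a status of IDLE.
Question 1: How does the OCP alert "OBServer merge failed" (ob_log_merge_error) differ from a tenant-level compaction?
Question 2: How should this alert be investigated and handled?
@论坛小助手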
论坛小助手
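2025-12-22 10:36
#3
For the OCP alert "OBServer merge failed" (ob_log_merge_error) with error code -4034, you can use the obdiag tool to collect logs and run a root cause analysis, which helps pin down why the problem occurred. The recommended obdiag scenes and commands are:
Root cause analysis scene: stuck compaction
Root cause analysis command: obdiag rca run --scene=major_hold
Log collection scene: compaction issues
Log collection command: obdiag gather scene run --scene=observer.compaction
Please run the commands above first and share the collected log files so that we can diagnose the problem further. To help locate it more precisely, please also answer the following questions:
Can you provide the exact alert time and details of the affected OBServer instance?
Did any particular operation or event precede the merge failure (for example, heavy writes or DDL statements)?
Does the OBServer have enough storage space? Is disk usage high?
Are there any other related logs or error messages for reference?
Thank you for your cooperation. We look forward to your reply.
Attached: link to the usage guide for the obdiag agile diagnostic tool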
烂笔头
2025-12-22 11:10
#5
sql_result.txt produced by obdiag gather scene run --scene=observer.compaction:
sql_result.txt (104.7 KB)
辞霜
2025-12-22 11:20
#6
select * from GV$OB_COMPACTION_DIAGNOSE_INFO;
辞霜
2025-12-22 11:28
#8
select * from cdb_ob_major_compaction; -- get the global SCN
select svr_ip, svr_port, count(1) from __all_virtual_tablet_meta_table where tenant_id = xxxx and compaction_scn < <GLOBAL SCN> group by svr_ip, svr_port; -- count, per server, the tablets that have not finished merging
烂笔头
2025-12-22 11:33
#10
During the alert I checked the views related to tenant compaction and found nothing abnormal, but OCP kept alerting.
辞霜
2025-12-22 11:36
#11
The compaction status all looks normal at the moment. Please send a copy of the observer log.
烂笔头
2025-12-22 11:36
#12
observer.log.zip (10.4 MB)
This is the observer.log from the period before the fault recovered.
烂笔头
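2025-12-22 14:49
#14
According to the OCP records, the alert started at 2025-12-22 10:00:27 and ended at 2025-12-22 11:02:36.
The alert clearing period is 5 minutes. Queries against the OB_COMPACTION_xxx views during the alert showed nothing abnormal.
Question 1: What exactly does the OCP alert ob_log_merge_error indicate?
Question 2: Is the OCP alert "OBServer merge failed" (ob_log_merge_error) the same thing as a tenant compaction? No tenant compaction took place during the alert.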
辞霜
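2025-12-22 15:14
#15
1. This alert monitors merge failures. It is raised by grepping the observer log for key error messages.
2. According to the log, a MINOR_MERGE reported an error while merging SSTables. This is not a tenant-level compaction.
MINOR_MERGE: Minor Compaction. Several Mini SSTables are merged into a new Mini SSTable, or several Mini SSTables and one Minor SSTable are merged into a new Minor SSTable.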
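The keyword check described in this reply can be approximated locally with a short script. This is a minimal sketch, assuming the failure lines contain the string `failed to merge partition` followed by a `ret=<code>` field, as in the alert text quoted in #1. The sample log lines, their timestamp and level layout, and the function name `find_merge_errors` are illustrative assumptions, not OCP's actual matching rule.

```python
import re
from collections import Counter

# Pattern built from the alert text quoted in this thread; the real OCP rule may differ.
MERGE_ERROR = re.compile(r"failed to merge partition\(ret=(-?\d+)")


def find_merge_errors(lines):
    """Return (line_number, ret_code) for every line that matches the alert keyword."""
    hits = []
    for lineno, line in enumerate(lines, start=1):
        match = MERGE_ERROR.search(line)
        if match:
            hits.append((lineno, int(match.group(1))))
    return hits


# Hypothetical sample lines, not taken from the attached observer.log.
sample = [
    "[2025-12-22 10:00:27.000000] INFO  some unrelated message",
    "[2025-12-22 10:00:27.100000] WDIAG [errcode=-4034] failed to merge partition(ret=-4034)",
    "[2025-12-22 10:00:28.000000] INFO  another unrelated message",
]

hits = find_merge_errors(sample)
counts = Counter(code for _, code in hits)
print(hits)
print(dict(counts))
```

To scan a real file, pass an open file handle instead of the sample list; the matching lines then give the timestamps and trace IDs to follow up on.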
烂笔头
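2025-12-22 15:23
#16
Does ob_log_merge_error here mean that a partition-level merge was triggered?
When a partition merge fails, is there any SQL or other information that identifies the tenant and table involved, and how should it be handled?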
辞霜
2025-12-22 15:48
#17
obclient [oceanbase]> SELECT * FROM oceanbase.GV$OB_TABLET_COMPACTION_HISTORY WHERE TENANT_ID=1007 AND TABLET_ID=49402 ORDER BY START_TIME DESC LIMIT 10\G
Please run this query and share the result.
烂笔头
2025-12-22 15:53
#18
root 15:52: [oceanbase]> SELECT * FROM oceanbase.GV$OB_TABLET_COMPACTION_HISTORY WHERE TENANT_ID=1007 AND TABLET_ID=49402 ORDER BY START_TIME DESC LIMIT 10\G
*************************** 1. row ***************************
SVR_IP: 10.1.224.53
SVR_PORT: 2882
TENANT_ID: 1007
LS_ID: 1
TABLET_ID: 49402
TYPE: MINI_MERGE
COMPACTION_SCN: 1766389967741750000
START_TIME: 2025-12-22 15:52:47.835730
FINISH_TIME: 2025-12-22 15:52:47.963803
TASK_ID: YB420A01E035-000646090545727D-0-0
OCCUPY_SIZE: 1288952
MACRO_BLOCK_COUNT: 1
MULTIPLEXED_MACRO_BLOCK_COUNT: 0
NEW_MICRO_COUNT_IN_NEW_MACRO: 27
MULTIPLEXED_MICRO_COUNT_IN_NEW_MACRO: 0
TOTAL_ROW_COUNT: 4407
INCREMENTAL_ROW_COUNT: 4407
COMPRESSION_RATIO: 1
NEW_FLUSH_DATA_RATE: 12261
PROGRESSIVE_COMPACTION_ROUND: 0
PROGRESSIVE_COMPACTION_NUM: 0
PARALLEL_DEGREE: 1
PARALLEL_INFO: -
PARTICIPANT_TABLE: table_cnt=1,start_scn=1766389667694039000,end_scn=1766389967741750000;
MACRO_ID_LIST: 11897
COMMENTS: comment="cost_mb=4;";
START_CG_ID: 0
END_CG_ID: 0
KEPT_SNAPSHOT:
MERGE_LEVEL: MACRO_BLOCK_LEVEL
EXEC_MODE: EXEC_MODE_LOCAL
IS_FULL_MERGE: FALSE
IO_COST_TIME_PERCENTAGE: 92
MERGE_REASON:
BASE_MAJOR_STATUS:
CO_MERGE_TYPE:
MDS_FILTER_INFO:
EXECUTE_TIME: 101910
*************************** 2. row ***************************
SVR_IP: 10.1.224.55
SVR_PORT: 2882
TENANT_ID: 1007
LS_ID: 1
TABLET_ID: 49402
TYPE: MINI_MERGE
COMPACTION_SCN: 1766389878550884000
START_TIME: 2025-12-22 15:51:18.679942
FINISH_TIME: 2025-12-22 15:51:18.738273
TASK_ID: YB420A01E037-00064608F5AC9A58-0-0
OCCUPY_SIZE: 1300460
MACRO_BLOCK_COUNT: 1
MULTIPLEXED_MACRO_BLOCK_COUNT: 0
NEW_MICRO_COUNT_IN_NEW_MACRO: 27
MULTIPLEXED_MICRO_COUNT_IN_NEW_MACRO: 0
TOTAL_ROW_COUNT: 4431
INCREMENTAL_ROW_COUNT: 4431
COMPRESSION_RATIO: 1
NEW_FLUSH_DATA_RATE: 24228
PROGRESSIVE_COMPACTION_ROUND: 0
PROGRESSIVE_COMPACTION_NUM: 0
PARALLEL_DEGREE: 1
PARALLEL_INFO: -
PARTICIPANT_TABLE: table_cnt=1,start_scn=1766389576601411000,end_scn=1766389878550884000;
MACRO_ID_LIST: 9715
COMMENTS: comment="cost_mb=4;";
START_CG_ID: 0
END_CG_ID: 0
KEPT_SNAPSHOT:
MERGE_LEVEL: MACRO_BLOCK_LEVEL
EXEC_MODE: EXEC_MODE_LOCAL
IS_FULL_MERGE: FALSE
IO_COST_TIME_PERCENTAGE: 84
MERGE_REASON:
BASE_MAJOR_STATUS:
CO_MERGE_TYPE:
MDS_FILTER_INFO:
EXECUTE_TIME: 51722
*************************** 3. row ***************************
SVR_IP: 10.1.224.54
SVR_PORT: 2882
TENANT_ID: 1007
LS_ID: 1
TABLET_ID: 49402
TYPE: MINOR_MERGE
COMPACTION_SCN: 1766389820377032000
START_TIME: 2025-12-22 15:50:20.582123
FINISH_TIME: 2025-12-22 15:50:21.476281
TASK_ID: YB420A01E036-00064608F1791FF0-0-0
OCCUPY_SIZE: 72692321
MACRO_BLOCK_COUNT: 48
MULTIPLEXED_MACRO_BLOCK_COUNT: 45
NEW_MICRO_COUNT_IN_NEW_MACRO: 197
MULTIPLEXED_MICRO_COUNT_IN_NEW_MACRO: 0
TOTAL_ROW_COUNT: 737278
INCREMENTAL_ROW_COUNT: 32983
COMPRESSION_RATIO: 1
NEW_FLUSH_DATA_RATE: 4473
PROGRESSIVE_COMPACTION_ROUND: 0
PROGRESSIVE_COMPACTION_NUM: 0
PARALLEL_DEGREE: 1
PARALLEL_INFO: -
PARTICIPANT_TABLE: table_cnt=3,start_scn=1,end_scn=1766389820377032000;
MACRO_ID_LIST:
COMMENTS: comment="cost_mb=5;";
START_CG_ID: 0
END_CG_ID: 0
KEPT_SNAPSHOT:
MERGE_LEVEL: MACRO_BLOCK_LEVEL
EXEC_MODE: EXEC_MODE_LOCAL
IS_FULL_MERGE: FALSE
IO_COST_TIME_PERCENTAGE: 13
MERGE_REASON:
BASE_MAJOR_STATUS:
CO_MERGE_TYPE:
MDS_FILTER_INFO:
EXECUTE_TIME: 887294
*************************** 4. row ***************************
SVR_IP: 10.1.224.54
SVR_PORT: 2882
TENANT_ID: 1007
LS_ID: 1
TABLET_ID: 49402
TYPE: MINI_MERGE
COMPACTION_SCN: 1766389820377032000
START_TIME: 2025-12-22 15:50:20.514650
FINISH_TIME: 2025-12-22 15:50:20.578300
TASK_ID: YB420A01E036-00064608F1791FEF-0-0
OCCUPY_SIZE: 1299010
MACRO_BLOCK_COUNT: 1
MULTIPLEXED_MACRO_BLOCK_COUNT: 0
NEW_MICRO_COUNT_IN_NEW_MACRO: 27
MULTIPLEXED_MICRO_COUNT_IN_NEW_MACRO: 0
TOTAL_ROW_COUNT: 4427
INCREMENTAL_ROW_COUNT: 4427
COMPRESSION_RATIO: 1
NEW_FLUSH_DATA_RATE: 22117
PROGRESSIVE_COMPACTION_ROUND: 0
PROGRESSIVE_COMPACTION_NUM: 0
PARALLEL_DEGREE: 1
PARALLEL_INFO: -
PARTICIPANT_TABLE: table_cnt=1,start_scn=1766389518457846000,end_scn=1766389820377032000;
MACRO_ID_LIST: 10777
COMMENTS: comment="cost_mb=4;";
START_CG_ID: 0
END_CG_ID: 0
KEPT_SNAPSHOT:
MERGE_LEVEL: MACRO_BLOCK_LEVEL
EXEC_MODE: EXEC_MODE_LOCAL
IS_FULL_MERGE: FALSE
IO_COST_TIME_PERCENTAGE: 84
MERGE_REASON:
BASE_MAJOR_STATUS:
CO_MERGE_TYPE:
MDS_FILTER_INFO:
EXECUTE_TIME: 56643
*************************** 5. row ***************************
SVR_IP: 10.1.224.53
SVR_PORT: 2882
TENANT_ID: 1007
LS_ID: 1
TABLET_ID: 49402
TYPE: MINOR_MERGE
COMPACTION_SCN: 1766389667694039000
START_TIME: 2025-12-22 15:47:47.807329
FINISH_TIME: 2025-12-22 15:47:49.089983
TASK_ID: YB420A01E035-000646090545727C-0-0
OCCUPY_SIZE: 72460704
MACRO_BLOCK_COUNT: 47
MULTIPLEXED_MACRO_BLOCK_COUNT: 43
NEW_MICRO_COUNT_IN_NEW_MACRO: 197
MULTIPLEXED_MICRO_COUNT_IN_NEW_MACRO: 0
TOTAL_ROW_COUNT: 735042
INCREMENTAL_ROW_COUNT: 32970
COMPRESSION_RATIO: 1
NEW_FLUSH_DATA_RATE: 3100
PROGRESSIVE_COMPACTION_ROUND: 0
PROGRESSIVE_COMPACTION_NUM: 0
PARALLEL_DEGREE: 1
PARALLEL_INFO: -
PARTICIPANT_TABLE: table_cnt=3,start_scn=1,end_scn=1766389667694039000;
MACRO_ID_LIST:
COMMENTS: comment="cost_mb=5;";
START_CG_ID: 0
END_CG_ID: 0
KEPT_SNAPSHOT:
MERGE_LEVEL: MACRO_BLOCK_LEVEL
EXEC_MODE: EXEC_MODE_LOCAL
IS_FULL_MERGE: FALSE
IO_COST_TIME_PERCENTAGE: 10
MERGE_REASON:
BASE_MAJOR_STATUS:
CO_MERGE_TYPE:
MDS_FILTER_INFO:
EXECUTE_TIME: 1275582
*************************** 6. row ***************************
SVR_IP: 10.1.224.53
SVR_PORT: 2882
TENANT_ID: 1007
LS_ID: 1
TABLET_ID: 49402
TYPE: MINI_MERGE
COMPACTION_SCN: 1766389667694039000
START_TIME: 2025-12-22 15:47:47.742254
FINISH_TIME: 2025-12-22 15:47:47.803330
TASK_ID: YB420A01E035-000646090545727B-0-0
OCCUPY_SIZE: 1286904
MACRO_BLOCK_COUNT: 1
MULTIPLEXED_MACRO_BLOCK_COUNT: 0
NEW_MICRO_COUNT_IN_NEW_MACRO: 27
MULTIPLEXED_MICRO_COUNT_IN_NEW_MACRO: 0
TOTAL_ROW_COUNT: 4441
INCREMENTAL_ROW_COUNT: 4441
COMPRESSION_RATIO: 1
NEW_FLUSH_DATA_RATE: 36631
PROGRESSIVE_COMPACTION_ROUND: 0
PROGRESSIVE_COMPACTION_NUM: 0
PARALLEL_DEGREE: 1
PARALLEL_INFO: -
PARTICIPANT_TABLE: table_cnt=1,start_scn=1766389365671036000,end_scn=1766389667694039000;
MACRO_ID_LIST: 11228
COMMENTS: comment="cost_mb=4;";
START_CG_ID: 0
END_CG_ID: 0
KEPT_SNAPSHOT:
MERGE_LEVEL: MACRO_BLOCK_LEVEL
EXEC_MODE: EXEC_MODE_LOCAL
IS_FULL_MERGE: FALSE
IO_COST_TIME_PERCENTAGE: 76
MERGE_REASON:
BASE_MAJOR_STATUS:
CO_MERGE_TYPE:
MDS_FILTER_INFO:
EXECUTE_TIME: 33689
*************************** 7. row ***************************
SVR_IP: 10.1.224.55
SVR_PORT: 2882
TENANT_ID: 1007
LS_ID: 1
TABLET_ID: 49402
TYPE: MINOR_MERGE
COMPACTION_SCN: 1766389576601411000
START_TIME: 2025-12-22 15:46:17.050040
FINISH_TIME: 2025-12-22 15:46:19.279630
TASK_ID: YB420A01E037-00064608F5AC9A57-0-0
OCCUPY_SIZE: 72340453
MACRO_BLOCK_COUNT: 51
MULTIPLEXED_MACRO_BLOCK_COUNT: 46
NEW_MICRO_COUNT_IN_NEW_MACRO: 327
MULTIPLEXED_MICRO_COUNT_IN_NEW_MACRO: 0
TOTAL_ROW_COUNT: 733703
INCREMENTAL_ROW_COUNT: 54968
COMPRESSION_RATIO: 1
NEW_FLUSH_DATA_RATE: 2723
PROGRESSIVE_COMPACTION_ROUND: 0
PROGRESSIVE_COMPACTION_NUM: 0
PARALLEL_DEGREE: 1
PARALLEL_INFO: -
PARTICIPANT_TABLE: table_cnt=3,start_scn=1,end_scn=1766389576601411000;
MACRO_ID_LIST:
COMMENTS: comment="cost_mb=5;";
START_CG_ID: 0
END_CG_ID: 0
KEPT_SNAPSHOT:
MERGE_LEVEL: MACRO_BLOCK_LEVEL
EXEC_MODE: EXEC_MODE_LOCAL
IS_FULL_MERGE: FALSE
IO_COST_TIME_PERCENTAGE: 8
MERGE_REASON:
BASE_MAJOR_STATUS:
CO_MERGE_TYPE:
MDS_FILTER_INFO:
EXECUTE_TIME: 2220837
*************************** 8. row ***************************
SVR_IP: 10.1.224.55
SVR_PORT: 2882
TENANT_ID: 1007
LS_ID: 1
TABLET_ID: 49402
TYPE: MINI_MERGE
COMPACTION_SCN: 1766389576601411000
START_TIME: 2025-12-22 15:46:16.764451
FINISH_TIME: 2025-12-22 15:46:17.045829
TASK_ID: YB420A01E037-00064608F5AC9A56-0-0
OCCUPY_SIZE: 1295369
MACRO_BLOCK_COUNT: 1
MULTIPLEXED_MACRO_BLOCK_COUNT: 0
NEW_MICRO_COUNT_IN_NEW_MACRO: 27
MULTIPLEXED_MICRO_COUNT_IN_NEW_MACRO: 0
TOTAL_ROW_COUNT: 4434
INCREMENTAL_ROW_COUNT: 4434
COMPRESSION_RATIO: 1
NEW_FLUSH_DATA_RATE: 4657
PROGRESSIVE_COMPACTION_ROUND: 0
PROGRESSIVE_COMPACTION_NUM: 0
PARALLEL_DEGREE: 1
PARALLEL_INFO: -
PARTICIPANT_TABLE: table_cnt=1,start_scn=1766389274686661000,end_scn=1766389576601411000;
MACRO_ID_LIST: 908
COMMENTS: comment="cost_mb=4;";
START_CG_ID: 0
END_CG_ID: 0
KEPT_SNAPSHOT:
MERGE_LEVEL: MACRO_BLOCK_LEVEL
EXEC_MODE: EXEC_MODE_LOCAL
IS_FULL_MERGE: FALSE
IO_COST_TIME_PERCENTAGE: 96
MERGE_REASON:
BASE_MAJOR_STATUS:
CO_MERGE_TYPE:
MDS_FILTER_INFO:
EXECUTE_TIME: 270589
*************************** 9. row ***************************
SVR_IP: 10.1.224.54
SVR_PORT: 2882
TENANT_ID: 1007
LS_ID: 1
TABLET_ID: 49402
TYPE: MINI_MERGE
COMPACTION_SCN: 1766389518457846000
START_TIME: 2025-12-22 15:45:18.720563
FINISH_TIME: 2025-12-22 15:45:18.804510
TASK_ID: YB420A01E036-00064608F1791FEE-0-0
OCCUPY_SIZE: 1294616
MACRO_BLOCK_COUNT: 1
MULTIPLEXED_MACRO_BLOCK_COUNT: 0
NEW_MICRO_COUNT_IN_NEW_MACRO: 27
MULTIPLEXED_MICRO_COUNT_IN_NEW_MACRO: 0
TOTAL_ROW_COUNT: 4437
INCREMENTAL_ROW_COUNT: 4437
COMPRESSION_RATIO: 1
NEW_FLUSH_DATA_RATE: 21361
PROGRESSIVE_COMPACTION_ROUND: 0
PROGRESSIVE_COMPACTION_NUM: 0
PARALLEL_DEGREE: 1
PARALLEL_INFO: -
PARTICIPANT_TABLE: table_cnt=1,start_scn=1766389216414806002,end_scn=1766389518457846000;
MACRO_ID_LIST: 11502
COMMENTS: comment="cost_mb=4;";
START_CG_ID: 0
END_CG_ID: 0
KEPT_SNAPSHOT:
MERGE_LEVEL: MACRO_BLOCK_LEVEL
EXEC_MODE: EXEC_MODE_LOCAL
IS_FULL_MERGE: FALSE
IO_COST_TIME_PERCENTAGE: 85
MERGE_REASON:
BASE_MAJOR_STATUS:
CO_MERGE_TYPE:
MDS_FILTER_INFO:
EXECUTE_TIME: 58476
*************************** 10. row ***************************
SVR_IP: 10.1.224.53
SVR_PORT: 2882
TENANT_ID: 1007
LS_ID: 1
TABLET_ID: 49402
TYPE: MINI_MERGE
COMPACTION_SCN: 1766389365671036000
START_TIME: 2025-12-22 15:42:47.797239
FINISH_TIME: 2025-12-22 15:42:47.877061
TASK_ID: YB420A01E035-000646090545727A-0-0
OCCUPY_SIZE: 1281328
MACRO_BLOCK_COUNT: 1
MULTIPLEXED_MACRO_BLOCK_COUNT: 0
NEW_MICRO_COUNT_IN_NEW_MACRO: 27
MULTIPLEXED_MICRO_COUNT_IN_NEW_MACRO: 0
TOTAL_ROW_COUNT: 4439
INCREMENTAL_ROW_COUNT: 4439
COMPRESSION_RATIO: 1
NEW_FLUSH_DATA_RATE: 18859
PROGRESSIVE_COMPACTION_ROUND: 0
PROGRESSIVE_COMPACTION_NUM: 0
PARALLEL_DEGREE: 1
PARALLEL_INFO: -
PARTICIPANT_TABLE: table_cnt=1,start_scn=1766389063573273000,end_scn=1766389365671036000;
MACRO_ID_LIST: 9835
COMMENTS: comment="cost_mb=4;";
START_CG_ID: 0
END_CG_ID: 0
KEPT_SNAPSHOT:
MERGE_LEVEL: MACRO_BLOCK_LEVEL
EXEC_MODE: EXEC_MODE_LOCAL
IS_FULL_MERGE: FALSE
IO_COST_TIME_PERCENTAGE: 87
MERGE_REASON:
BASE_MAJOR_STATUS:
CO_MERGE_TYPE:
MDS_FILTER_INFO:
EXECUTE_TIME: 65616
10 rows in set (0.53 sec)
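The COMPACTION_SCN values in this output line up with the START_TIME column if the SCN is read as nanoseconds since the Unix epoch and the servers run in UTC+8. Both points are assumptions inferred from this output rather than stated in the thread, and the helper name `scn_to_local_time` is hypothetical. A minimal sketch of the conversion under those assumptions:

```python
from datetime import datetime, timedelta, timezone

# Assumption: the servers in this thread run in UTC+8.
LOCAL_TZ = timezone(timedelta(hours=8))


def scn_to_local_time(scn_ns):
    """Convert an SCN (assumed to be nanoseconds since the Unix epoch) to local time."""
    seconds, nanos = divmod(scn_ns, 1_000_000_000)
    moment = datetime.fromtimestamp(seconds, tz=LOCAL_TZ)
    return moment.replace(microsecond=nanos // 1000)


# COMPACTION_SCN and START_TIME from row 1 of the output above.
scn_time = scn_to_local_time(1766389967741750000)
start_time = datetime(2025, 12, 22, 15, 52, 47, 835730, tzinfo=LOCAL_TZ)

print(scn_time.strftime("%Y-%m-%d %H:%M:%S.%f"))
print((start_time - scn_time).total_seconds())
```

The same conversion can be applied to the start_scn and end_scn values in PARTICIPANT_TABLE to see which time range a given merge covered.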
辞霜
2025-12-22 16:04
#19
SELECT * FROM oceanbase.GV$OB_TABLET_COMPACTION_HISTORY WHERE TASK_ID LIKE '%000646090545715A%';
No matching record was found. Please try querying by trace ID.
烂笔头
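2025-12-22 16:07
#20
辞霜:
000646090545715A
Could it be that each individual partition merge is very short, and many small issues accumulated into a large one? Or are the records in GV$OB_TABLET_COMPACTION_HISTORY inaccurate?
The cluster no longer seems to return any records for tablet_id in ('49402').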
辞霜
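2025-12-22 16:18
#21
49402 is the tablet ID.
One minor SSTable plus 13 mini SSTables, 14 SSTables in total, were merged together. Judging from the output you just posted, this partition can currently merge successfully. I suspect a macro block was faulty at the time.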
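To check how many SSTables took part in each merge, the PARTICIPANT_TABLE column of the history output can be parsed from the vertical output that obclient prints. A minimal sketch, assuming the `KEY: value` layout and the `table_cnt=<n>` field shown in #18; the helper names `parse_vertical` and `table_count` are hypothetical.

```python
import re

ROW_SEPARATOR = re.compile(r"^\*+ \d+\. row \*+$")


def parse_vertical(text):
    """Split vertical obclient output into a list of {column: value} dicts."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if ROW_SEPARATOR.match(line):
            rows.append({})
        elif rows and ":" in line:
            # Split on the first colon only, so timestamps keep their own colons.
            key, _, value = line.partition(":")
            rows[-1][key.strip()] = value.strip()
    return rows


def table_count(row):
    """Extract table_cnt from the PARTICIPANT_TABLE column, or None if absent."""
    match = re.search(r"table_cnt=(\d+)", row.get("PARTICIPANT_TABLE", ""))
    return int(match.group(1)) if match else None


# Rows 1 and 3 of the output posted in #18, abridged to three columns each.
sample = """
*************************** 1. row ***************************
SVR_IP: 10.1.224.53
TYPE: MINI_MERGE
PARTICIPANT_TABLE: table_cnt=1,start_scn=1766389667694039000,end_scn=1766389967741750000;
*************************** 3. row ***************************
SVR_IP: 10.1.224.54
TYPE: MINOR_MERGE
PARTICIPANT_TABLE: table_cnt=3,start_scn=1,end_scn=1766389820377032000;
"""

rows = parse_vertical(sample)
for row in rows:
    print(row["SVR_IP"], row["TYPE"], table_count(row))
```

In the history posted in #18, the MINOR_MERGE rows report table_cnt=3, while the failed task is described here as involving 14 SSTables, so a table_cnt well above the usual value may be a useful signal when reviewing such output.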