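【 Environment 】Production
【 OB or other component 】OBCE-4.2.2
【 Version 】OBCE-4.2.2
【 Problem description 】After the NFS machine used for OB backups was replaced, the original backup job cannot finish, so no new backup job can be started.
【 Reproduction path 】Operations performed before and after the problem occurred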
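Based on your description, your OceanBase backup task cannot terminate normally after the NFS server was replaced.
We recommend collecting logs with the obdiag tool so that the problem can be located more precisely.
Log collection scenario: data backup problems
Log collection command: obdiag gather scene run --scene=observer.backup
Please run the command above and provide the resulting log archive; it will help us analyze the cause more accurately.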
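– Our company's previous NFS server failed and went down. After a new NFS server was configured, the original backup job cannot be stopped and no new job can be started. The relevant information is as follows: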
select * from oceanbase.CDB_OB_BACKUP_PARAMETER where tenant_id=1008;
I tried resetting the backup directory and switching the RS, but neither helped.
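Based on the information you provided, after you replaced the NFS server the existing backup task has remained in the CANCELING state, which prevents new backup tasks from being started. Let's troubleshoot this step by step.
First, confirm the current state of the backup task. You can query it with the following SQL statement: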
SELECT * FROM oceanbase.CDB_OB_BACKUP_TASKS WHERE tenant_id = 1008;
You have already provided the contents of the backup parameter view:
SELECT * FROM oceanbase.CDB_OB_BACKUP_PARAMETER WHERE tenant_id = 1008;
From the output we can see:
dest_id: 1001
path: file:///backup/catl_cloudDB_MES/12/tenant_incarnation_1/1008/clog
piece_switch_interval: 1d
state: ENABLE
You also provided the tenant information:
SELECT tenant_id, tenant_name, tenant_type, primary_zone, tenant_role, log_mode FROM oceanbase.DBA_OB_TENANTS WHERE tenant_id = 1008;
From the output we can see:
tenant_id: 1008
tenant_name: CM_c
tenant_type: USER
primary_zone: zone1,zone2,zone3
tenant_role: PRIMARY
log_mode: NOARCHIVELOG
You also provided the log archiving progress:
SELECT tenant_id, dest_id, ls_id, round_id, piece_id, incarnation, scn_to_timestamp(start_scn) start_scn_ts, scn_to_timestamp(checkpoint_scn) ckpt_scn_ts, status, file_id FROM oceanbase.CDB_OB_LS_LOG_ARCHIVE_PROGRESS WHERE tenant_id = 1008 ORDER BY start_scn DESC;
From the output, every record has the status DOING, which indicates that log archiving is still in progress.
You also provided the archive log information:
SELECT tenant_id, dest_id, round_id, incarnation, status, start_scn_display, checkpoint_scn_display, piece_switch_interval, base_piece_id, comment, path FROM oceanbase.CDB_OB_ARCHIVELOG WHERE tenant_id = 1008;
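From the output we can see:
status: unknown (no specific value was provided)
path: file:///backup/catl_cloudDB_MES/12/tenant_incarnation_1/1008/data
Suggested steps: confirm that the backup task is in the CANCELING state, then run the ALTER SYSTEM CANCEL BACKUP command to cancel the backup task and check the result of the command. If these steps do not resolve the problem, you can go on to the following steps. Query the backup scheduling tasks: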
SELECT * FROM oceanbase.__all_virtual_backup_schedule_task WHERE tenant_id = 1008;
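On the machine shown in the destination column, run the following command to search the log: grep 'trace_id' observer.log
Here trace_id is the trace_id value returned by the previous query.
Based on the information you provided, the backup task may be unable to terminate because log archiving is still in progress. Please follow the steps above and share what you find so that the problem can be located more accurately.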
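If grep is not convenient, for example when the log has already been copied off the server, the same trace_id search can be sketched in Python. This is only an illustration: the function name find_trace_lines and the log path in the usage comment are assumptions, and the match is a plain substring test, like grep -F.

```python
from typing import Iterable, List


def find_trace_lines(lines: Iterable[str], trace_id: str) -> List[str]:
    """Return the log lines that contain trace_id (plain substring match)."""
    return [line.rstrip("\n") for line in lines if trace_id in line]


# Usage sketch (the path is an assumption; use the observer.log on the
# machine shown in the destination column):
#
# with open("observer.log", encoding="utf-8", errors="replace") as f:
#     for hit in find_trace_lines(f, "<trace_id from the query>"):
#         print(hit)
```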
I manually ran the command to cancel the backup job:
MySQL [oceanbase]> ALTER SYSTEM CANCEL BACKUP tenant='CMAAS_C';
Query OK, 0 rows affected (0.00 sec)
Querying the backup job still shows CANCELING:
MySQL [oceanbase]> select tenant_id, job_id,incarnation,backup_set_id,PLUS_ARCHIVELOG ,backup_type,job_level,start_timestamp, status ,DESCRIPTION ,PATH
-> from oceanbase.CDB_OB_BACKUP_JOBS
-> where tenant_id=1008;
The query result is empty:
SELECT * FROM oceanbase.__all_virtual_backup_schedule_task WHERE tenant_id = 1008;
Please run this query and check the result:
select * from __all_virtual_backup_ls_task where tenant_id=1008 \G
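Empty.
__all_virtual_backup_ls_task has no records.
I have found no way to stop the log backup.
alter system noarchivelog tenant=xxxx;
The command does not report an error, but the DOING records for tenant_id=1008 are still present in cdb_ob_ls_log_archive_progress.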
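One way to check whether archiving has really stopped is to re-run the progress query and look for rows that are still DOING. The sketch below assumes the rows have already been fetched into dictionaries keyed by the column names used in the earlier CDB_OB_LS_LOG_ARCHIVE_PROGRESS query (ls_id, status). The function name is hypothetical, and fetching the rows is left out.

```python
from typing import Dict, Iterable, List


def log_streams_still_archiving(rows: Iterable[Dict[str, object]]) -> List[int]:
    """Return the sorted ls_id values whose archive status is still DOING."""
    return sorted(
        {int(row["ls_id"]) for row in rows if str(row["status"]).upper() == "DOING"}
    )
```

An empty list would suggest that no log stream is still reporting DOING. A non-empty list matches the situation described above, where the records remain after noarchivelog.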
select * from cdb_ob_backup_set_files where tenant_id=1008 order by START_TIMESTAMP desc limit 10;
*************************** 1. row ***************************
TENANT_ID: 1008
BACKUP_SET_ID: 91
DEST_ID: 1002
INCARNATION: 1
BACKUP_TYPE: INC
PREV_FULL_BACKUP_SET_ID: 90
PREV_INC_BACKUP_SET_ID: 90
START_TIMESTAMP: 2024-07-30 04:00:15.017285
END_TIMESTAMP: 2024-07-30 04:02:02.052950
STATUS: SUCCESS
FILE_STATUS: AVAILABLE
ELAPSED_SECONDES: 107
PLUS_ARCHIVELOG: OFF
START_REPLAY_SCN: 1722252760404237001
START_REPLAY_SCN_DISPLAY: 2024-07-29 19:32:40.404237
MIN_RESTORE_SCN: 1722283322044522000
MIN_RESTORE_SCN_DISPLAY: 2024-07-30 04:02:02.044522000
INPUT_BYTES: 2005809445
OUTPUT_BYTES: 135528522
OUTPUT_RATE_BYTES: 1266199.6541
EXTRA_META_BYTES: 0
TABLET_COUNT: 5740
FINISH_TABLET_COUNT: 5740
MACRO_BLOCK_COUNT: 963
FINISH_MACRO_BLOCK_COUNT: 952
FILE_COUNT: 0
META_TURN_ID: 1
DATA_TURN_ID: 0
RESULT: 0
COMMENT:
ENCRYPTION_MODE: NONE
PASSWD:
TENANT_COMPATIBLE: 4.2.2.1
BACKUP_COMPATIBLE: 3
PATH: file:///backup/catl_cloudDB_MES/12/tenant_incarnation_1/1008/data
CLUSTER_VERSION: 4.2.2.1
CONSISTENT_SCN: 1722283252211678000
MINOR_TURN_ID: 1
MAJOR_TURN_ID: 1
*************************** 2. row ***************************
TENANT_ID: 1008
BACKUP_SET_ID: 90
DEST_ID: 1002
INCARNATION: 1
BACKUP_TYPE: FULL
PREV_FULL_BACKUP_SET_ID: 0
PREV_INC_BACKUP_SET_ID: 0
START_TIMESTAMP: 2024-07-29 04:00:12.870714
END_TIMESTAMP: 2024-07-29 04:01:49.221648
STATUS: SUCCESS
FILE_STATUS: AVAILABLE
ELAPSED_SECONDES: 96
PLUS_ARCHIVELOG: OFF
START_REPLAY_SCN: 1722161459377621003
START_REPLAY_SCN_DISPLAY: 2024-07-28 18:10:59.377621
MIN_RESTORE_SCN: 1722196909214784000
MIN_RESTORE_SCN_DISPLAY: 2024-07-29 04:01:49.214784000
INPUT_BYTES: 2030975293
OUTPUT_BYTES: 150623511
OUTPUT_RATE_BYTES: 1563280.2376
EXTRA_META_BYTES: 0
TABLET_COUNT: 5740
FINISH_TABLET_COUNT: 5740
MACRO_BLOCK_COUNT: 964
FINISH_MACRO_BLOCK_COUNT: 964
FILE_COUNT: 0
META_TURN_ID: 1
DATA_TURN_ID: 0
RESULT: 0
COMMENT:
ENCRYPTION_MODE: NONE
PASSWD:
TENANT_COMPATIBLE: 4.2.2.1
BACKUP_COMPATIBLE: 3
PATH: file:///backup/catl_cloudDB_MES/12/tenant_incarnation_1/1008/data
CLUSTER_VERSION: 4.2.2.1
CONSISTENT_SCN: 1722196844167244000
MINOR_TURN_ID: 1
MAJOR_TURN_ID: 1
*************************** 3. row ***************************
TENANT_ID: 1008
BACKUP_SET_ID: 89
DEST_ID: 1002
INCARNATION: 1
BACKUP_TYPE: INC
PREV_FULL_BACKUP_SET_ID: 86
PREV_INC_BACKUP_SET_ID: 88
START_TIMESTAMP: 2024-07-28 04:00:08.738552
END_TIMESTAMP: 2024-07-28 04:02:11.558965
STATUS: SUCCESS
FILE_STATUS: AVAILABLE
ELAPSED_SECONDES: 123
PLUS_ARCHIVELOG: OFF
START_REPLAY_SCN: 1722059929661726001
START_REPLAY_SCN_DISPLAY: 2024-07-27 13:58:49.661726
MIN_RESTORE_SCN: 1722110531551671000
MIN_RESTORE_SCN_DISPLAY: 2024-07-28 04:02:11.551671000
INPUT_BYTES: 2003711874
OUTPUT_BYTES: 135626783
OUTPUT_RATE_BYTES: 1104269.0680
EXTRA_META_BYTES: 0
TABLET_COUNT: 5740
FINISH_TABLET_COUNT: 5740
MACRO_BLOCK_COUNT: 965
FINISH_MACRO_BLOCK_COUNT: 951
FILE_COUNT: 0
META_TURN_ID: 1
DATA_TURN_ID: 0
RESULT: 0
COMMENT:
ENCRYPTION_MODE: NONE
PASSWD:
TENANT_COMPATIBLE: 4.2.2.1
BACKUP_COMPATIBLE: 3
PATH: file:///backup/catl_cloudDB_MES/12/tenant_incarnation_1/1008/data
CLUSTER_VERSION: 4.2.2.1
CONSISTENT_SCN: 1722110462150823000
MINOR_TURN_ID: 1
MAJOR_TURN_ID: 1
*************************** 4. row ***************************
TENANT_ID: 1008
BACKUP_SET_ID: 88
DEST_ID: 1002
INCARNATION: 1
BACKUP_TYPE: INC
PREV_FULL_BACKUP_SET_ID: 86
PREV_INC_BACKUP_SET_ID: 87
START_TIMESTAMP: 2024-07-27 04:00:10.087291
END_TIMESTAMP: 2024-07-27 04:02:00.048654
STATUS: SUCCESS
FILE_STATUS: AVAILABLE
ELAPSED_SECONDES: 110
PLUS_ARCHIVELOG: OFF
START_REPLAY_SCN: 1721974789094222001
START_REPLAY_SCN_DISPLAY: 2024-07-26 14:19:49.094222
MIN_RESTORE_SCN: 1722024120041982000
MIN_RESTORE_SCN_DISPLAY: 2024-07-27 04:02:00.041982000
INPUT_BYTES: 2012101077
OUTPUT_BYTES: 141230108
OUTPUT_RATE_BYTES: 1284361.1988
EXTRA_META_BYTES: 0
TABLET_COUNT: 5740
FINISH_TABLET_COUNT: 5740
MACRO_BLOCK_COUNT: 966
FINISH_MACRO_BLOCK_COUNT: 955
FILE_COUNT: 0
META_TURN_ID: 1
DATA_TURN_ID: 0
RESULT: 0
COMMENT:
ENCRYPTION_MODE: NONE
PASSWD:
TENANT_COMPATIBLE: 4.2.2.1
BACKUP_COMPATIBLE: 3
PATH: file:///backup/catl_cloudDB_MES/12/tenant_incarnation_1/1008/data
CLUSTER_VERSION: 4.2.2.1
CONSISTENT_SCN: 1722024036910483000
MINOR_TURN_ID: 1
MAJOR_TURN_ID: 1
*************************** 5. row ***************************
TENANT_ID: 1008
BACKUP_SET_ID: 87
DEST_ID: 1002
INCARNATION: 1
BACKUP_TYPE: INC
PREV_FULL_BACKUP_SET_ID: 86
PREV_INC_BACKUP_SET_ID: 86
START_TIMESTAMP: 2024-07-26 04:00:08.919533
END_TIMESTAMP: 2024-07-26 04:01:50.835628
STATUS: SUCCESS
FILE_STATUS: AVAILABLE
ELAPSED_SECONDES: 102
PLUS_ARCHIVELOG: OFF
START_REPLAY_SCN: 1721888834732896003
START_REPLAY_SCN_DISPLAY: 2024-07-25 14:27:14.732896
MIN_RESTORE_SCN: 1721937710828107000
MIN_RESTORE_SCN_DISPLAY: 2024-07-26 04:01:50.828107000
INPUT_BYTES: 2012100991
OUTPUT_BYTES: 142487577
OUTPUT_RATE_BYTES: 1398087.0931
EXTRA_META_BYTES: 0
TABLET_COUNT: 5740
FINISH_TABLET_COUNT: 5740
MACRO_BLOCK_COUNT: 965
FINISH_MACRO_BLOCK_COUNT: 955
FILE_COUNT: 0
META_TURN_ID: 1
DATA_TURN_ID: 0
RESULT: 0
COMMENT:
ENCRYPTION_MODE: NONE
PASSWD:
TENANT_COMPATIBLE: 4.2.2.1
BACKUP_COMPATIBLE: 3
PATH: file:///backup/catl_cloudDB_MES/12/tenant_incarnation_1/1008/data
CLUSTER_VERSION: 4.2.2.1
CONSISTENT_SCN: 1721937633550247000
MINOR_TURN_ID: 1
MAJOR_TURN_ID: 1
*************************** 6. row ***************************
TENANT_ID: 1008
BACKUP_SET_ID: 86
DEST_ID: 1002
INCARNATION: 1
BACKUP_TYPE: FULL
PREV_FULL_BACKUP_SET_ID: 0
PREV_INC_BACKUP_SET_ID: 0
START_TIMESTAMP: 2024-07-25 04:00:09.332012
END_TIMESTAMP: 2024-07-25 04:02:05.703616
STATUS: SUCCESS
FILE_STATUS: AVAILABLE
ELAPSED_SECONDES: 116
PLUS_ARCHIVELOG: OFF
START_REPLAY_SCN: 1721808734620906000
START_REPLAY_SCN_DISPLAY: 2024-07-24 16:12:14.620906
MIN_RESTORE_SCN: 1721851325696715000
MIN_RESTORE_SCN_DISPLAY: 2024-07-25 04:02:05.696715000
INPUT_BYTES: 1959955261
OUTPUT_BYTES: 132953320
OUTPUT_RATE_BYTES: 1142489.3654
EXTRA_META_BYTES: 0
TABLET_COUNT: 4618
FINISH_TABLET_COUNT: 4618
MACRO_BLOCK_COUNT: 931
FINISH_MACRO_BLOCK_COUNT: 931
FILE_COUNT: 0
META_TURN_ID: 1
DATA_TURN_ID: 0
RESULT: 0
COMMENT:
ENCRYPTION_MODE: NONE
PASSWD:
TENANT_COMPATIBLE: 4.2.2.1
BACKUP_COMPATIBLE: 3
PATH: file:///backup/catl_cloudDB_MES/12/tenant_incarnation_1/1008/data
CLUSTER_VERSION: 4.2.2.1
CONSISTENT_SCN: 1721851254656875000
MINOR_TURN_ID: 1
MAJOR_TURN_ID: 1
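The six rows above (backup sets 86 to 91) all show STATUS: SUCCESS, and the most recent one finished on 2024-07-30. For longer outputs it can help to condense the vertical (\G style) listing into one line per backup set. The following is a minimal sketch, assuming the output has been saved as text; parse_vertical and summarize are hypothetical helper names, not part of any OceanBase tooling.

```python
import re
from typing import Dict, List, Tuple

# Matches separator lines such as "*************************** 1. row ***************************"
ROW_MARKER = re.compile(r"^\*+ \d+\. row \*+$")


def parse_vertical(text: str) -> List[Dict[str, str]]:
    """Parse vertical client output into one dict per row."""
    rows: List[Dict[str, str]] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if ROW_MARKER.match(line):
            rows.append({})
            continue
        if rows and ":" in line:
            # Split on the first colon only, so timestamps and file:// paths stay intact.
            key, _, value = line.partition(":")
            rows[-1][key.strip()] = value.strip()
    return rows


def summarize(rows: List[Dict[str, str]]) -> List[Tuple[int, str, str]]:
    """Return (BACKUP_SET_ID, BACKUP_TYPE, STATUS) for each parsed row."""
    return [(int(r["BACKUP_SET_ID"]), r["BACKUP_TYPE"], r["STATUS"]) for r in rows]
```

Calling summarize(parse_vertical(saved_output)) on the listing above would give one tuple per backup set, which makes it easy to spot any set that is not SUCCESS.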
show variables like 'version_comment';
Please paste the complete version string as text.
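This issue is now resolved. Summary:
[…] (cdb_archivelog_progress) have all been stopped.
If the backup mount directory is /backup, the recommended backup path for the OB cluster is /backup/obbackup. That way, if you later want to switch directories, you can change it to /backup/obbackup2 without being affected by the previous backup directory.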
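The path recommendation can be illustrated with a small helper that derives the destination URIs from a mount point and a dedicated subdirectory. This is a sketch under assumptions: the data and archive subdirectory names and the function name are hypothetical, and the generated ALTER SYSTEM SET DATA_BACKUP_DEST / LOG_ARCHIVE_DEST statements follow the OceanBase 4.x form as best understood here, so check them against the documentation for your version before running them.

```python
from posixpath import join
from typing import List


def backup_dest_statements(mount_point: str, subdir: str, tenant: str) -> List[str]:
    """Build statements that point a tenant's backups at mount_point/subdir."""
    base = join(mount_point, subdir)
    # "data" and "archive" are assumed names for the two destination directories.
    data_uri = "file://" + join(base, "data")
    log_uri = "file://" + join(base, "archive")
    return [
        f"ALTER SYSTEM SET LOG_ARCHIVE_DEST='LOCATION={log_uri}' TENANT = {tenant};",
        f"ALTER SYSTEM SET DATA_BACKUP_DEST='{data_uri}' TENANT = {tenant};",
    ]


# Later switching to a fresh directory only changes the subdir argument,
# e.g. backup_dest_statements("/backup", "obbackup2", "my_tenant").
```

Keeping the backups one level below the mount point is what makes such a switch possible without touching the old directory.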