Backup tasks succeed in OCP, but no source tenant is available when restoring

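I have restored several times on the current OCP version, so I don't know why the tenant list suddenly can't be retrieved this time. I don't think it has much to do with the version.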

Is there any option other than upgrading? Right now none of the running tenants are shown; only deleted tenants appear.

@旭辉 could you please take a look?

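There were a few power outages earlier, and the cluster returned to normal after a restart. I'm not sure whether this is related to metadata.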

The backup files appear to be there. Is there a checklist I can go through, or a way to restore manually?

@旭辉
I read this post of yours and ran:
/home/admin/oceanbase/bin/ob_admin dump_backup -d 'oss://xxx-dmas/main_data/1/tenant_incarnation_1/1004/data?host=oss-cn-hangzhou.aliyuncs.com'

/home/admin/oceanbase/bin/ob_admin dump_backup -d 'oss://xxx-dmas/main_data/1/tenant_incarnation_1/1004/clog?host=oss-cn-hangzhou.aliyuncs.com'

It still isn't restored, though. My symptoms look the same as what that poster ran into.

The output of the data command is:
succ to open, filename=/data/ob_log//ob_admin.log, fd=3, wf_fd=2
succ to open, filename=/data/ob_log//ob_admin_rs.log, fd=4, wf_fd=2

Of these, /data/ob_log//ob_admin.log contains an obvious error:

[2026-01-14 20:21:17.911085] INFO  main (main.cpp:122) [76745][][T0][Y0-0000000000000000-0-0] [lt=0] cmd: [/home/admin/oceanbase/bin/ob_admin dump_backup -d oss://xxx-dmas/main_data/1/tenant_incarnation_1/1004/data?host=oss-cn-hangzhou.aliyuncs.com ]
[2026-01-14 20:21:17.913803] INFO  [LIB] ObSliceAlloc (ob_slice_alloc.h:321) [76745][][T0][Y0-0000000000000000-0-0] [lt=44] ObSliceAlloc init finished(bsize_=7936, isize_=40, slice_limit_=7536, tmallocator_=NULL)
[2026-01-14 20:21:17.913887] INFO  [LIB] ObSliceAlloc (ob_slice_alloc.h:321) [76745][][T0][Y0-0000000000000000-0-0] [lt=37] ObSliceAlloc init finished(bsize_=7936, isize_=160, slice_limit_=7536, tmallocator_=NULL)
[2026-01-14 20:21:17.913996] WDIAG [STORAGE] set (ob_storage_info.cpp:104) [76745][][T500][Y0-0000000000000000-0-0] [lt=5][errcode=-9026] storage info is empty(ret=-9026, device_type=0)
[2026-01-14 20:21:17.914013] WDIAG [SHARE] set (ob_backup_struct.cpp:1502) [76745][][T500][Y0-0000000000000000-0-0] [lt=12][errcode=-9026] failed to set storage info(ret=-9026)
[2026-01-14 20:21:17.914021] WDIAG [STORAGE] check_tenant_backup_path_type_ (ob_admin_dump_backup_data_executor.cpp:2716) [76745][][T500][Y0-0000000000000000-0-0] [lt=5][errcode=-9026] fail to set backup dest(ret=-9026)
[2026-01-14 20:21:17.914039] WDIAG [STORAGE] execute (ob_admin_dump_backup_data_executor.cpp:546) [76745][][T500][Y0-0000000000000000-0-0] [lt=4][errcode=-9026] fail to check tenant backup path type(ret=-9026)
[2026-01-14 20:21:17.914043] WDIAG [COMMON] main (main.cpp:154) [76745][][T500][Y0-0000000000000000-0-0] [lt=4][errcode=-9026] Fail to executor cmd, (ret=-9026)

ob_admin_rs.log contains the following:

[2026-01-14 20:21:17.926483] INFO  [RS] destroy (ob_root_service.cpp:949) [76745][][T500][Y0-0000000000000000-0-0] [lt=3] [ROOTSERVICE_NOTICE] start to destroy rootservice
[2026-01-14 20:21:17.926511] INFO  [RS] destroy (ob_root_service.cpp:961) [76745][][T500][Y0-0000000000000000-0-0] [lt=16] lost replica checker destroy
[2026-01-14 20:21:17.926520] INFO  [RS] destroy (ob_rs_reentrant_thread.cpp:115) [76745][][T500][Y0-0000000000000000-0-0] [lt=6] rs_monitor_check : reentrant thread check unregister success(thread_name="", last_run_timestamp=0)
[2026-01-14 20:21:17.926530] INFO  [RS] destroy (ob_root_service.cpp:969) [76745][][T500][Y0-0000000000000000-0-0] [lt=10] root balance destroy
[2026-01-14 20:21:17.926533] INFO  [RS] destroy (ob_root_service.cpp:976) [76745][][T500][Y0-0000000000000000-0-0] [lt=3] empty server checker destroy
[2026-01-14 20:21:17.926538] INFO  [RS] destroy (ob_root_service.cpp:983) [76745][][T500][Y0-0000000000000000-0-0] [lt=5] rs_monitor_check : thread checker destroy
[2026-01-14 20:21:17.926541] INFO  [RS] destroy (ob_root_service.cpp:989) [76745][][T500][Y0-0000000000000000-0-0] [lt=3] schema history recycler destroy
[2026-01-14 20:21:17.926547] INFO  [RS] destroy (ob_root_service.cpp:993) [76745][][T500][Y0-0000000000000000-0-0] [lt=5] inner queue destroy
[2026-01-14 20:21:17.926550] INFO  [RS] destroy (ob_root_service.cpp:995) [76745][][T500][Y0-0000000000000000-0-0] [lt=3] inspect queue destroy
[2026-01-14 20:21:17.926555] INFO  [RS] destroy (ob_root_service.cpp:997) [76745][][T500][Y0-0000000000000000-0-0] [lt=3] ddl builder destroy
[2026-01-14 20:21:17.926559] INFO  [RS] destroy (ob_rs_reentrant_thread.cpp:115) [76745][][T500][Y0-0000000000000000-0-0] [lt=3] rs_monitor_check : reentrant thread check unregister success(thread_name="", last_run_timestamp=0)
[2026-01-14 20:21:17.926564] INFO  [RS] destroy (ob_root_service.cpp:1002) [76745][][T500][Y0-0000000000000000-0-0] [lt=3] heartbeat checker destroy
[2026-01-14 20:21:17.926575] INFO  [RS] destroy (ob_root_service.cpp:1006) [76745][][T500][Y0-0000000000000000-0-0] [lt=3] event table operator destroy
[2026-01-14 20:21:17.926605] WDIAG [RS] destroy (ob_dbms_job_master.cpp:96) [76745][][T500][Y0-0000000000000000-0-0] [lt=5][errcode=-4006] scheduler task not inited(ret=-4006, inited_=false)
[2026-01-14 20:21:17.926629] INFO  [RS] destroy (ob_root_service.cpp:1009) [76745][][T500][Y0-0000000000000000-0-0] [lt=21] ObDBMSJobMaster destroy
[2026-01-14 20:21:17.926634] INFO  [RS] destroy (ob_root_service.cpp:1012) [76745][][T500][Y0-0000000000000000-0-0] [lt=3] ddl task scheduler destroy
[2026-01-14 20:21:17.926637] INFO  [RS] destroy (ob_rs_reentrant_thread.cpp:115) [76745][][T500][Y0-0000000000000000-0-0] [lt=3] rs_monitor_check : reentrant thread check unregister success(thread_name="", last_run_timestamp=0)
[2026-01-14 20:21:17.926640] INFO  [RS] destroy (ob_root_service.cpp:1027) [76745][][T500][Y0-0000000000000000-0-0] [lt=3] disaster recovery task mgr destroy
[2026-01-14 20:21:17.926649] WDIAG [RS] destroy (ob_dbms_sched_job_master.cpp:95) [76745][][T500][Y0-0000000000000000-0-0] [lt=4][errcode=-4006] scheduler task not inited(ret=-4006, inited_=false)
[2026-01-14 20:21:17.926653] INFO  [RS] destroy (ob_root_service.cpp:1031) [76745][][T500][Y0-0000000000000000-0-0] [lt=3] ObDBMSSchedJobMaster destroy
[2026-01-14 20:21:17.926658] INFO  [RS] destroy (ob_root_service.cpp:1033) [76745][][T500][Y0-0000000000000000-0-0] [lt=5] global ctx timer destroyed
[2026-01-14 20:21:17.926663] INFO  [RS] destroy (ob_root_service.cpp:1042) [76745][][T500][Y0-0000000000000000-0-0] [lt=3] [ROOTSERVICE_NOTICE] destroy rootservice end(ret=0, ret="OB_SUCCESS")
[2026-01-14 20:21:17.928262] INFO  [RS] stop (ob_disaster_recovery_task_table_updater.cpp:188) [76745][][T500][Y0-0000000000000000-0-0] [lt=13] stop ObDRTaskTableUpdater success
[2026-01-14 20:21:17.928268] INFO  [RS] wait (ob_disaster_recovery_task_table_updater.cpp:194) [76745][][T500][Y0-0000000000000000-0-0] [lt=6] wait ObDRTaskTableUpdater
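
The -9026 "storage info is empty" warnings in ob_admin.log above suggest that no OSS credentials reached the tool. A minimal sketch of a dry-run helper follows, assuming that dump_backup takes the storage info through a separate -s option in the form host=...&access_id=...&access_key=... (this is an assumption recalled from the tool's help text; please confirm with `ob_admin dump_backup -h` on your build). The function name, the OB_ADMIN variable, and the credential placeholders are hypothetical.

```shell
#!/usr/bin/env bash
# Dry-run helper: prints the ob_admin command instead of running it,
# so the path and credentials can be reviewed before anything touches OSS.
# Assumption: dump_backup accepts storage info through -s; verify with
# `ob_admin dump_backup -h` before relying on this.
build_dump_backup_cmd() {
  local backup_uri="$1"   # e.g. oss://bucket/path
  local oss_host="$2"     # e.g. oss-cn-hangzhou.aliyuncs.com
  local access_id="$3"    # placeholder credential
  local access_key="$4"   # placeholder credential
  printf "%s dump_backup -d'%s' -s'host=%s&access_id=%s&access_key=%s'\n" \
    "${OB_ADMIN:-/home/admin/oceanbase/bin/ob_admin}" \
    "$backup_uri" "$oss_host" "$access_id" "$access_key"
}

# Example with the path from this thread and placeholder credentials.
build_dump_backup_cmd \
  "oss://xxx-dmas/main_data/1/tenant_incarnation_1/1004/data" \
  "oss-cn-hangzhou.aliyuncs.com" "<access_id>" "<access_key>"
```

Printing the command rather than executing it keeps the access key out of accidental runs; once the output looks right, it can be pasted into the shell.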

@辞霜 @旭辉
Hello, I looked up how to use ob_admin with OSS and have now run it. The first few steps look fine, but it reports errors later on.
ob_admin.log (186.6 KB)
ob_admin_rs.log (4.0 KB)

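I asked an AI about it, and it seems to say there is too much backup data and a buffer was exceeded. I checked, and I do have a lot of backups, all of them full backups.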

Which OB version is this?

show variables like '%version_comment%';

Was it working before, and then suddenly the source tenant could no longer be selected?

[2026-01-15 09:45:37.676056] WDIAG [STORAGE] dump_tenant_backup_set_infos_ (ob_admin_dump_backup_data_executor.cpp:2364) [116413][][T500][Y0-0000000000000000-0-0] [lt=34][errcode=-4019] fail to printf buf(ret=-4019, i=99)
[2026-01-15 09:45:37.676094] WDIAG [STORAGE] dump_tenant_backup_path_ (ob_admin_dump_backup_data_executor.cpp:997) [116413][][T500][Y0-0000000000000000-0-0] [lt=34][errcode=-4019] fail to dump tenant backup set infos(ret=-4019)
[2026-01-15 09:45:37.676126] WDIAG [STORAGE] execute (ob_admin_dump_backup_data_executor.cpp:550) [116413][][T500][Y0-0000000000000000-0-0] [lt=6][errcode=-4019] fail to dump tenant backup path
[2026-01-15 09:45:37.676133] WDIAG [COMMON] main (main.cpp:154) [116413][][T500][Y0-0000000000000000-0-0] [lt=5][errcode=-4019] Fail to executor cmd, (ret=-4019)

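This error is triggered when ob_admin dump_backup is used to print information for more than 99 backup sets (note i=99 in the first log line above).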
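
For a large file such as the 186.6 KB ob_admin.log attached earlier, it can help to reduce the log to just the warning lines. The sketch below assumes the line layout seen in the excerpts in this thread (timestamp, level, optional module, function, source location, then `[errcode=...]` and the message); the function name `summarize_wdiag` is hypothetical.

```shell
#!/usr/bin/env bash
# Print "errcode function message" for every WDIAG line in a log file.
# Lines at other levels (INFO, etc.) are dropped by the grep.
summarize_wdiag() {
  local log_file="$1"
  grep ' WDIAG ' "$log_file" |
    sed -E 's/^\[[^]]+\] WDIAG +(\[[A-Z_]+\] )?([A-Za-z_0-9]+) \([^)]*\).*\[errcode=(-?[0-9]+)\] (.*)$/\3 \2 \4/'
}

# Run against a file only when one is given on the command line.
if [ -n "${1:-}" ]; then
  summarize_wdiag "$1"
fi
```

Piping the result through `cut -d' ' -f1 | sort | uniq -c` gives a count per error code, which makes it easier to tell a single root failure (such as -4019 here) from unrelated noise.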

OB version: OceanBase_CE 4.2.2.1 (r101000012024030709-083a68a2907b6a1a12138c4a9e0994949166bfba) (Built Mar 7 2024 10:10:58)

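It was fine before; I had restored successfully from OCP several times. The servers did lose power at one point, but everything came back after a restart, and I didn't check at the time whether the backups were affected. Yesterday I suddenly found that I could no longer select any normal tenant. Currently none of the running tenants can be restored, and the list shows only tenants that have already been deleted.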

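The backup tasks are all scheduled normally and the backup files physically exist; the tenants just can't be selected when starting a restore.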

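Is there any way to work around this, for example by specifying a particular full backup? Or is cleaning up old backups my only option? If so, how do I go about cleaning them up?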

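Has a backup strategy been configured for the cluster in OCP? If not, configuring one will make backup cleanup run automatically.
Alternatively, set a cleanup policy from the command line: https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000004476467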

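A strategy is configured, with logs retained for the last 180 days. However, the cluster backup overview has more than 1,600 pages, and backups from 2024 are still being kept.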

SELECT POLICY_NAME, RECOVERY_WINDOW FROM oceanbase.DBA_OB_BACKUP_DELETE_POLICY;

No rows returned.

I don't see anything wrong with the backup sets.

Did you run that query from the sys tenant? Try the CDB view instead:
SELECT * FROM oceanbase.CDB_OB_BACKUP_DELETE_POLICY;

Yes, it returns data now.

Try triggering the cleanup task manually:
ALTER SYSTEM DELETE OBSOLETE BACKUP [TENANT [=] tenant_name];

ALTER SYSTEM DELETE OBSOLETE BACKUP TENANT data_office;
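
After triggering the cleanup it is worth confirming that a delete job actually started and that the number of backup sets per tenant drops. The sketch below only prints the statements for review and does not connect to anything. The view names `CDB_OB_BACKUP_DELETE_JOBS` and `CDB_OB_BACKUP_SET_FILES` are given from memory of the 4.x documentation and should be checked against your version; the function name `print_cleanup_sql` is hypothetical.

```shell
#!/usr/bin/env bash
# Emit the cleanup statement plus two follow-up checks for one tenant.
# Nothing is executed here; review the output, then run it from the sys tenant.
print_cleanup_sql() {
  local tenant="$1"
  # Accept only a plain identifier so the name can be placed into SQL safely.
  case "$tenant" in
    ''|[0-9]*|*[!A-Za-z0-9_]*)
      echo "invalid tenant name: $tenant" >&2
      return 1
      ;;
  esac
  cat <<EOF
ALTER SYSTEM DELETE OBSOLETE BACKUP TENANT ${tenant};
SELECT * FROM oceanbase.CDB_OB_BACKUP_DELETE_JOBS;
SELECT TENANT_ID, COUNT(*) AS backup_sets FROM oceanbase.CDB_OB_BACKUP_SET_FILES GROUP BY TENANT_ID;
EOF
}
```

Calling `print_cleanup_sql data_office` prints the same statement as above followed by the two checks. The per-tenant count from the last query can then be compared against the 99-backup-set limit mentioned earlier in the thread.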