OCP alert notification: log backup delay

[Environment] Production
[OB or other component]
[Version] OCP: 3.3.0; OB version 3.1.4
[Problem description] [OB log alert] obcluster-10.100.1.101 ERROR
ob_cluster=obcluster:svr_ip=10.100.1.101:server_type=rootservice:ob_error_code=-1

  • Summary: [OB log alert] obcluster-10.100.1.101 ERROR
  • Generated at: 2023-03-01T14:43:01+08:00
  • Details: [OB log alert] cluster=obcluster, host=10.100.1.101, log type=rootservice, error code=-1, error name=ERROR, error details=[2023-03-02 10:49:22.246497] ERROR [RS] do_schedule_ (ob_log_archive_scheduler.cpp:1187) [107849][672][YB420A640165-0005F21BBA25BD7D] [lt=22] [dc=0] [LOG_ARCHIVE] log archive status is interrupted, need manual process(sys_info={status:{tenant_id:1, copy_id:0, start_ts:1672811232754425, checkpoint_ts:1677639099526678, status:5, incarnation:1, round:16, status_str:"INTERRUPTED", is_mark_deleted:false, is_mount_file_created:true, compatible:1, backup_piece_id:77, start_piece_id:22}, backup_dest:"file:///data1/nfs/obbackup"}) BACKTRACE:0x9a99e6e 0x986e111 0x22a7974 0x22a745b 0x22a71c1 0x38b1b2c 0x70cbb75 0x70ca84b 0x6751a42 0x9a2ba8d 0x9a2b4be 0x340bbaf 0x2cac102 0x9821d75 0x9820762 0x981d21f
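
As a cross-check on the alert, the `start_ts` and `checkpoint_ts` fields can be converted to wall-clock time. The sketch below assumes these values are microseconds since the Unix epoch and that the cluster runs in UTC+8 (the offset shown in the alert's generation time). The helper name `ob_ts_to_local` is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the alert's timestamps are microseconds since the Unix epoch.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
# Assumption: UTC+8, taken from the "+08:00" in the alert's generation time.
CST = timezone(timedelta(hours=8))


def ob_ts_to_local(ts_us: int, tz: timezone = CST) -> datetime:
    """Convert a microsecond epoch timestamp to an aware datetime (hypothetical helper)."""
    # Integer timedelta arithmetic avoids float rounding on the microseconds.
    return (EPOCH + timedelta(microseconds=ts_us)).astimezone(tz)


start = ob_ts_to_local(1672811232754425)       # start_ts from the alert
checkpoint = ob_ts_to_local(1677639099526678)  # checkpoint_ts from the alert

print("archive round started:", start.isoformat())
print("last checkpoint:      ", checkpoint.isoformat())
```

Under those assumptions the last checkpoint falls at about 10:51 on 2023-03-01, which lines up with the modification time (Mar 1 10:52) of the newest clog piece directory, 16_77_20230228, in the listing in this post.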

NFS mount status on this node:
[root@ob-001 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 63G 0 63G 0% /dev
tmpfs 63G 0 63G 0% /dev/shm
tmpfs 63G 779M 62G 2% /run
tmpfs 63G 0 63G 0% /sys/fs/cgroup
/dev/mapper/centos_ob--001-root 50G 5.8G 45G 12% /
/dev/sda1 1014M 182M 833M 18% /boot
/dev/mapper/centos_ob--001-home 504G 83G 421G 17% /home
/dev/sdb1 2.8T 2.3T 576G 80% /data/oceanbase/redo
/dev/sdb2 12T 5.1T 6.2T 45% /data/oceanbase/data
10.100.1.20:/data1/nfs 5.5T 394G 4.8T 8% /data1/nfs

Contents of the mounted directory:

[root@ob-001 ~]# ls /data1/nfs/obbackup/obcluster/1/incarnation_1/1001/* -l
/data1/nfs/obbackup/obcluster/1/incarnation_1/1001/clog:
total 32
drwx------ 5 nfsnobody nfsnobody 4096 Feb 25 15:44 16_73_20230224
drwx------ 5 nfsnobody nfsnobody 4096 Feb 26 15:47 16_74_20230225
drwx------ 5 nfsnobody nfsnobody 4096 Feb 27 18:46 16_75_20230226
drwx------ 5 nfsnobody nfsnobody 4096 Feb 28 18:48 16_76_20230227
drwx------ 5 nfsnobody nfsnobody 4096 Mar 1 10:52 16_77_20230228
-rw------- 1 nfsnobody nfsnobody 6062 Mar 1 11:46 backup_piece_info
-rw------- 1 nfsnobody nfsnobody 87 Mar 1 11:46 tenant_clog_backup_info

/data1/nfs/obbackup/obcluster/1/incarnation_1/1001/data:
total 36
drwx------ 4 nfsnobody nfsnobody 4096 Feb 25 04:18 backup_set_75_full_20230225
drwx------ 4 nfsnobody nfsnobody 4096 Feb 26 04:02 backup_set_76_inc_20230226
drwx------ 4 nfsnobody nfsnobody 4096 Feb 27 04:02 backup_set_77_inc_20230227
drwx------ 4 nfsnobody nfsnobody 4096 Feb 28 04:02 backup_set_78_inc_20230228
drwx------ 4 nfsnobody nfsnobody 4096 Mar 1 04:08 backup_set_79_full_20230301
-rw------- 1 nfsnobody nfsnobody 10675 Mar 1 11:46 tenant_backup_set_file_info
-rw------- 1 nfsnobody nfsnobody 385 Mar 1 11:46 tenant_data_backup_info

According to the log, log archiving was interrupted, so it needs to be restarted manually:
alter system noarchivelog;
alter system archivelog;

Should these be run against the OB cluster as the root user of the sys tenant?

For version 3.1.4, yes.

What caused the log backup to be interrupted?

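See the document 【SOP 系列 10】物理备份恢复问题排查相关 ([SOP Series 10] Physical backup and restore troubleshooting) and follow its section on archive interruption to investigate.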

Is it OK to start the backup again from OCP?

The backup disk had never been mounted successfully. I don't know why there was no alert for that.
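
Since the failed mount went unnoticed, a small check on each OBServer host could catch it independently of OCP. The following is a minimal sketch, assuming Linux hosts where active mounts are listed in `/proc/mounts` format. The function names are illustrative and not part of OCP or OceanBase, and the sample text (including the `xfs` and `nfs4` filesystem types) is made-up test data modelled on the `df` output in this thread.

```python
def find_mount(mounts_text: str, mount_point: str):
    """Return (device, fstype) for mount_point, or None if it is not mounted.

    mounts_text is expected in /proc/mounts format:
    device mount_point fstype options dump pass
    """
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] == mount_point:
            found = (fields[0], fields[2])  # a later entry shadows an earlier one
    return found


def is_nfs_mounted(mounts_text: str, mount_point: str) -> bool:
    """True only if mount_point is mounted with an NFS filesystem type."""
    entry = find_mount(mounts_text, mount_point)
    return entry is not None and entry[1].startswith("nfs")


# Illustrative sample in /proc/mounts format (hypothetical data, not real output).
SAMPLE = """\
/dev/sdb2 /data/oceanbase/data xfs rw,relatime 0 0
10.100.1.20:/data1/nfs /data1/nfs nfs4 rw,relatime 0 0
"""

print(is_nfs_mounted(SAMPLE, "/data1/nfs"))
print(is_nfs_mounted(SAMPLE, "/mnt/missing"))
```

On a real host the text would come from reading `/proc/mounts`, and a scheduled job could exit non-zero or send a notification when `is_nfs_mounted` returns `False` for `/data1/nfs`. Checking the filesystem type rather than only the directory's existence matters here, because an unmounted mount point is still an ordinary local directory.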