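[Environment] Test environment
[OB or other components] ocp, ocp_agent, observer
[Versions] ocp4.3.2, oceanbase4.2.4
[Problem description] After a cluster backup run through OCP completes successfully, no recoverable time is displayed.
[Attachments and logs]
I want to create a standby tenant from the backup files, but OCP reports that the backup files are incomplete.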
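Looking into it.
I have found the cause: ob_admin was missing from the bin directory of oceanbase. I copied one back in and it works now. I still do not know why the ob_admin file went missing. I upgraded OCP once yesterday, and my ob_admin file was in the directory one level above bin rather than inside bin.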
2024-11-12 19:58:41.988 INFO 2794383 — [http-nio-0.0.0.0-8180-exec-15,676292de17af8530,128db42070726511] c.o.o.e.internal.template.HttpTemplate : POST request to agent, url:http://172.18.90.152:62888/api/v1/backup/file/exists, request body:CheckBackupFileExistsRequest(path=/data/obbackup/obd/1703148157/tenant_incarnation_1), params:null
2024-11-12 19:58:41.991 WARN 2794383 — [http-nio-0.0.0.0-8180-exec-15,676292de17af8530,128db42070726511] .o.o.b.i.d.f.c.RemoteNfsBackupFileParser : try access backup file failed, hostId=1000001, path=/data/obbackup/obd/1703148157/tenant_incarnation_1
2024-11-12 19:58:42.004 INFO 2794383 — [http-nio-0.0.0.0-8180-exec-15,676292de17af8530,128db42070726511] c.o.o.c.agent.HostAgentServiceImpl : Finding OCP agent: hostId=2000001
2024-11-12 19:58:42.020 INFO 2794383 — [http-nio-0.0.0.0-8180-exec-15,676292de17af8530,128db42070726511] c.o.o.c.a.p.HostAgentProcessServiceImpl : Getting all OCP agent processes on host 2000001
2024-11-12 19:58:42.111 INFO 2794383 — [http-nio-0.0.0.0-8180-exec-15,676292de17af8530,128db42070726511] c.o.o.e.internal.template.HttpTemplate : POST request to agent, url:http://172.18.90.153:62888/api/v1/backup/file/exists, request body:CheckBackupFileExistsRequest(path=/data/obbackup/obd/1703148157/tenant_incarnation_1), params:null
2024-11-12 19:58:42.112 WARN 2794383 — [http-nio-0.0.0.0-8180-exec-15,676292de17af8530,128db42070726511] .o.o.b.i.d.f.c.RemoteNfsBackupFileParser : try access backup file failed, hostId=2000001, path=/data/obbackup/obd/1703148157/tenant_incarnation_1
2024-11-12 19:58:42.115 INFO 2794383 — [http-nio-0.0.0.0-8180-exec-15,676292de17af8530,128db42070726511] c.o.o.c.agent.HostAgentServiceImpl : Finding OCP agent: hostId=2000003
2024-11-12 19:58:42.118 INFO 2794383 — [http-nio-0.0.0.0-8180-exec-15,676292de17af8530,128db42070726511] c.o.o.c.a.p.HostAgentProcessServiceImpl : Getting all OCP agent processes on host 2000003
2024-11-12 19:58:42.136 INFO 2794383 — [http-nio-0.0.0.0-8180-exec-15,676292de17af8530,128db42070726511] c.o.o.e.internal.template.HttpTemplate : POST request to agent, url:http://172.18.90.154:62888/api/v1/backup/file/exists, request body:CheckBackupFileExistsRequest(path=/data/obbackup/obd/1703148157/tenant_incarnation_1), params:null
2024-11-12 19:58:42.137 WARN 2794383 — [http-nio-0.0.0.0-8180-exec-15,676292de17af8530,128db42070726511] .o.o.b.i.d.f.c.RemoteNfsBackupFileParser : try access backup file failed, hostId=2000003, path=/data/obbackup/obd/1703148157/tenant_incarnation_1
2024-11-12 19:58:42.138 ERROR 2794383 — [http-nio-0.0.0.0-8180-exec-15,676292de17af8530,128db42070726511] c.o.o.b.i.d.info.BackupInfoServiceImpl : parse cluster backup info failed, caused by [OCP UnexpectedException]: status=500 INTERNAL_SERVER_ERROR, errorCode=BACKUP_FILE_ACCESS_FILE_ERROR, args=[1000001, 2000001, 2000003],/data/obbackup/obd/1703148157/tenant_incarnation_1
2024-11-12 19:58:42.139 INFO 2794383 — [http-nio-0.0.0.0-8180-exec-15,676292de17af8530,128db42070726511] c.o.o.s.c.trace.RequestTracingAspect : API OK: [GET /api/v2/ob/clusters/2000003/backup/info client=2.2.2.83, traceId=676292de17af8530, duration=490 ms]
This is the error log from ocp-server.
2024-11-12 19:51:27.909 INFO 2794383 — [http-nio-0.0.0.0-8180-exec-11,a214eb1bdf23a90d,8e24fe666bdc86b1] c.o.ocp.obsdk.connector.ConnectTemplate : [obsdk] sql: set ob_query_timeout = ?, args: [10000000]
2024-11-12 19:51:27.909 INFO 2794383 — [http-nio-0.0.0.0-8180-exec-6,cf7f73203435929e,688b4188384ce665] c.o.o.c.a.p.HostAgentProcessServiceImpl : Getting all OCP agent processes on host 1000001
2024-11-12 19:51:27.910 INFO 2794383 — [http-nio-0.0.0.0-8180-exec-11,a214eb1bdf23a90d,8e24fe666bdc86b1] c.o.ocp.obsdk.connector.ConnectTemplate : [obsdk] sql: SHOW VARIABLES LIKE 'system_time_zone'
2024-11-12 19:51:27.912 INFO 2794383 — [http-nio-0.0.0.0-8180-exec-11,a214eb1bdf23a90d,8e24fe666bdc86b1] c.o.b.c.o.o.c.MysqlClusterOperator : get system time zone: +08:00
2024-11-12 19:51:27.912 INFO 2794383 — [http-nio-0.0.0.0-8180-exec-11,a214eb1bdf23a90d,8e24fe666bdc86b1] c.o.o.obsdk.connector.ObConnectorHolder : [obsdk] no ob connector found in holder, key=ObConnectorKey(connectionMode=direct, clusterName=obd, obClusterId=1703148157, tenantName=sys, username=root, address=172.18.90.152, port=2881, database=oceanbase)
2024-11-12 19:51:27.913 INFO 2794383 — [http-nio-0.0.0.0-8180-exec-11,a214eb1bdf23a90d,8e24fe666bdc86b1] c.o.ocp.obsdk.connector.ConnectTemplate : [obsdk] sql: set ob_query_timeout = ?, args: [10000000]
2024-11-12 19:51:27.914 INFO 2794383 — [http-nio-0.0.0.0-8180-exec-11,a214eb1bdf23a90d,8e24fe666bdc86b1] c.o.ocp.obsdk.connector.ConnectTemplate : [obsdk] sql: SELECT tenant_id, name, value, gmt_create, gmt_modified FROM __all_virtual_sys_variable WHERE name = ?
2024-11-12 19:51:27.923 INFO 2794383 — [http-nio-0.0.0.0-8180-exec-2,18cf47e651097426,ba471d1b1a2c4fd8] c.o.o.s.c.trace.RequestTracingAspect : API OK: [GET /api/v2/ob/clusters/2000003/backup/alarms/history client=2.2.2.83, traceId=18cf47e651097426, duration=79 ms]
2024-11-12 19:51:27.925 INFO 2794383 — [http-nio-0.0.0.0-8180-exec-11,a214eb1bdf23a90d,8e24fe666bdc86b1] c.o.b.c.o.o.c.MysqlClusterOperator : get system time zone: +08:00
2024-11-12 19:51:27.925 INFO 2794383 — [http-nio-0.0.0.0-8180-exec-11,a214eb1bdf23a90d,8e24fe666bdc86b1] c.o.b.c.o.o.c.MysqlClusterOperator : get system time zone: +08:00
2024-11-12 19:51:27.925 INFO 2794383 — [http-nio-0.0.0.0-8180-exec-11,a214eb1bdf23a90d,8e24fe666bdc86b1] c.o.b.c.o.o.c.MysqlClusterOperator : get system time zone: +08:00
2024-11-12 19:51:27.925 INFO 2794383 — [http-nio-0.0.0.0-8180-exec-11,a214eb1bdf23a90d,8e24fe666bdc86b1] c.o.o.obsdk.connector.ObConnectorHolder : [obsdk] no ob connector found in holder, key=ObConnectorKey(connectionMode=direct, clusterName=obd, obClusterId=1703148157, tenantName=sys, username=root, address=172.18.90.152, port=2881, database=oceanbase)
2024-11-12 19:51:27.926 INFO 2794383 — [http-nio-0.0.0.0-8180-exec-11,a214eb1bdf23a90d,8e24fe666bdc86b1] c.o.ocp.obsdk.connector.ConnectTemplate : [obsdk] sql: set ob_query_timeout = ?, args: [10000000]
2024-11-12 19:51:27.944 INFO 2794383 — [http-nio-0.0.0.0-8180-exec-11,a214eb1bdf23a90d,8e24fe666bdc86b1] c.o.ocp.obsdk.connector.ConnectTemplate : [obsdk] sql: select * from (SELECT job_id, incarnation, job.tenant_id AS tenant_id, job.backup_set_id AS backup_set_id, backup_type, path AS backup_dest, start_timestamp AS start_time, end_timestamp AS end_time, now(6) AS check_time, task.data_progress AS data_progress, status, comment, description, 'TENANT' AS backup_level, file.min_restore_scn_display AS snapshot_version_time, output_bytes FROM ( SELECT job_id, incarnation, tenant_id, backup_set_id, backup_type, path, start_timestamp, NULL AS end_timestamp, status, comment, description FROM CDB_OB_BACKUP_JOBS UNION SELECT job_id, incarnation, tenant_id, backup_set_id, backup_type, path, start_timestamp, end_timestamp, status, comment, description FROM CDB_OB_BACKUP_JOB_HISTORY ) job LEFT JOIN (SELECT tenant_id AS tid, backup_set_id, min_restore_scn_display, output_bytes FROM CDB_OB_BACKUP_SET_FILES) file ON job.tenant_id = file.tid AND job.backup_set_id = file.backup_set_id LEFT JOIN (SELECT tenant_id AS tid, backup_set_id, data_progress FROM CDB_OB_BACKUP_TASKS) task ON job.tenant_id = task.tid AND job.backup_set_id = task.backup_set_id WHERE tenant_id != 1 ORDER BY start_time DESC)
2024-11-12 19:51:27.996 INFO 2794383 — [http-nio-0.0.0.0-8180-exec-11,a214eb1bdf23a90d,8e24fe666bdc86b1] c.o.o.s.c.trace.RequestTracingAspect : API OK: [GET /api/v2/ob/clusters/2000003/backup/task/dataBackupTasks client=2.2.2.83, traceId=a214eb1bdf23a90d, duration=127 ms]
2024-11-12 19:51:27.998 INFO 2794383 — [http-nio-0.0.0.0-8180-exec-6,cf7f73203435929e,688b4188384ce665] c.o.o.e.internal.template.HttpTemplate : POST request to agent, url:http://172.18.90.152:62888/api/v1/backup/file/exists, request body:CheckBackupFileExistsRequest(path=/data/obbackup/obd/1703148157/tenant_incarnation_1), params:null
2024-11-12 19:51:28.000 WARN 2794383 — [http-nio-0.0.0.0-8180-exec-6,cf7f73203435929e,688b4188384ce665] .o.o.b.i.d.f.c.RemoteNfsBackupFileParser : try access backup file failed, hostId=1000001, path=/data/obbackup/obd/1703148157/tenant_incarnation_1
2024-11-12 19:51:28.003 INFO 2794383 — [http-nio-0.0.0.0-8180-exec-6,cf7f73203435929e,688b4188384ce665] c.o.o.c.agent.HostAgentServiceImpl : Finding OCP agent: hostId=2000001
2024-11-12 19:51:28.004 INFO 2794383 — [http-nio-0.0.0.0-8180-exec-6,cf7f73203435929e,688b4188384ce665] c.o.o.c.a.p.HostAgentProcessServiceImpl : Getting all OCP agent processes on host 2000001
2024-11-12 19:51:28.019 INFO 2794383 — [http-nio-0.0.0.0-8180-exec-6,cf7f73203435929e,688b4188384ce665] c.o.o.e.internal.template.HttpTemplate : POST request to agent, url:http://172.18.90.153:62888/api/v1/backup/file/exists, request body:CheckBackupFileExistsRequest(path=/data/obbackup/obd/1703148157/tenant_incarnation_1), params:null
2024-11-12 19:51:28.023 WARN 2794383 — [http-nio-0.0.0.0-8180-exec-6,cf7f73203435929e,688b4188384ce665] .o.o.b.i.d.f.c.RemoteNfsBackupFileParser : try access backup file failed, hostId=2000001, path=/data/obbackup/obd/1703148157/tenant_incarnation_1
2024-11-12 19:51:28.025 INFO 2794383 — [http-nio-0.0.0.0-8180-exec-6,cf7f73203435929e,688b4188384ce665] c.o.o.c.agent.HostAgentServiceImpl : Finding OCP agent: hostId=2000003
2024-11-12 19:51:28.026 INFO 2794383 — [http-nio-0.0.0.0-8180-exec-6,cf7f73203435929e,688b4188384ce665] c.o.o.c.a.p.HostAgentProcessServiceImpl : Getting all OCP agent processes on host 2000003
2024-11-12 19:51:28.040 INFO 2794383 — [http-nio-0.0.0.0-8180-exec-6,cf7f73203435929e,688b4188384ce665] c.o.o.e.internal.template.HttpTemplate : POST request to agent, url:http://172.18.90.154:62888/api/v1/backup/file/exists, request body:CheckBackupFileExistsRequest(path=/data/obbackup/obd/1703148157/tenant_incarnation_1), params:null
2024-11-12 19:51:28.045 WARN 2794383 — [http-nio-0.0.0.0-8180-exec-6,cf7f73203435929e,688b4188384ce665] .o.o.b.i.d.f.c.RemoteNfsBackupFileParser : try access backup file failed, hostId=2000003, path=/data/obbackup/obd/1703148157/tenant_incarnation_1
2024-11-12 19:51:28.045 ERROR 2794383 — [http-nio-0.0.0.0-8180-exec-6,cf7f73203435929e,688b4188384ce665] c.o.o.b.i.d.info.BackupInfoServiceImpl : parse cluster backup info failed, caused by [OCP UnexpectedException]: status=500 INTERNAL_SERVER_ERROR, errorCode=BACKUP_FILE_ACCESS_FILE_ERROR, args=[1000001, 2000001, 2000003],/data/obbackup/obd/1703148157/tenant_incarnation_1
com.oceanbase.ocp.core.exception.UnexpectedException: [OCP UnexpectedException]: status=500 INTERNAL_SERVER_ERROR, errorCode=BACKUP_FILE_ACCESS_FILE_ERROR, args=[1000001, 2000001, 2000003],/data/obbackup/obd/1703148157/tenant_incarnation_1
at com.oceanbase.ocp.backup.internal.data.file.common.RemoteNfsBackupFileParser.accessBackupFileForOneHost(RemoteNfsBackupFileParser.java:102)
at com.oceanbase.ocp.backup.internal.data.file.common.RemoteNfsBackupFileParser.doesFileExist(RemoteNfsBackupFileParser.java:38)
at com.oceanbase.ocp.backup.internal.data.file.common.NfsBackupFileParser.preCheck(NfsBackupFileParser.java:10)
at com.oceanbase.ocp.backup.internal.data.file.common.RemoteNfsBackupFileParser.getFileListNoRecursively(RemoteNfsBackupFileParser.java:79)
at com.oceanbase.ocp.backup.internal.data.file.physical.PhysicalBackupFileManager.getFileList(PhysicalBackupFileManager.java:83)
at com.oceanbase.backup.core.data.info.PhysicalAdvancedBackupInfoManager.getTenantNameInfo(PhysicalAdvancedBackupInfoManager.java:216)
at com.oceanbase.ocp.backup.internal.data.info.BackupInfoServiceImpl.parseObClusterBackupInfo(BackupInfoServiceImpl.java:200)
at com.oceanbase.ocp.backup.internal.facadeimpl.proxy.ProxyBackupServiceImpl.parseObClusterBackupInfo(ProxyBackupServiceImpl.java:861)
at com.oceanbase.ocp.server.common.controller.ObClusterBackupController.parseBackupInfo(ObClusterBackupController.java:172)
at com.oceanbase.ocp.server.common.controller.ObClusterBackupController$$FastClassBySpringCGLIB$$94825ea9.invoke()
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:792)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:762)
at org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:89)
at com.oceanbase.ocp.server.common.aspect.OperationEventAspect.aroundAuditEvent(OperationEventAspect.java:128)
at sun.reflect.GeneratedMethodAccessor1313.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:634)
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:624)
at org.springframework.aop.aspectj.AspectJAroundAdvice.invoke(AspectJAroundAdvice.java:72)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:762)
at org.springframework.aop.aspectj.AspectJAfterThrowingAdvice.invoke(AspectJAfterThrowingAdvice.java:64)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
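For anyone hitting the same error, here is a minimal Python sketch for pulling the failing hosts out of a log like the one above. It only relies on the `try access backup file failed, hostId=..., path=...` wording that appears in the WARN lines; the function name `failed_backup_checks` is made up for this illustration and is not part of OCP.

```python
import re

# Matches the WARN lines written by RemoteNfsBackupFileParser in the log above.
FAILED_ACCESS = re.compile(
    r"try access backup file failed, hostId=(?P<host_id>\d+), path=(?P<path>\S+)"
)


def failed_backup_checks(log_text):
    """Return the unique (host_id, path) pairs whose backup path check failed, sorted."""
    pairs = set()
    for line in log_text.splitlines():
        match = FAILED_ACCESS.search(line)
        if match:
            pairs.add((int(match.group("host_id")), match.group("path")))
    return sorted(pairs)
```

Feeding it the text of the ocp-server log gives one entry per host and path, which makes it easy to see whether every host fails (as here, where the args list is 1000001, 2000001, 2000003) or only some of them.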
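OCP uses the ob_admin tool shipped with OBServer to parse the backup source files. Based on the parse result, it analyzes the data backup sets and the log archive intervals to compute the recoverable time. So ob_admin has to be present.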
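A quick way to check whether ob_admin is where it is expected to be is a small script like the following. This is only a sketch: the install directory passed in as `home_path` is an assumption (use whatever directory contains `bin` on your host), and the function name `find_ob_admin` and its return labels are invented for this example.

```python
import os


def find_ob_admin(home_path):
    """Report where ob_admin sits relative to an OBServer install directory.

    home_path is the (assumed) directory that contains the bin/ subdirectory.
    """
    in_bin = os.path.join(home_path, "bin", "ob_admin")
    in_home = os.path.join(home_path, "ob_admin")
    if os.path.isfile(in_bin):
        # Present in bin/; also confirm the current user may execute it.
        return "ok" if os.access(in_bin, os.X_OK) else "not_executable"
    if os.path.isfile(in_home):
        # Present one level above bin/, the situation reported in this thread.
        return "misplaced"
    return "missing"
```

Run it on every host that OCP asks to parse the backup, since the file can be present on some hosts and absent on others.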
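Was this OB cluster deployed with obd or with OCP?
What does the "obd" cluster refer to here? Is it a separate cluster? Was it deployed with obd?
I deployed the OB cluster through OCP; "obd" is just its name.
On the server that hosts the OCP cluster, ob_admin was missing. After I copied one over, the restore points became visible. (I do not know why the ob_admin file was missing, since the restore points had always been visible before. In between, OCP was upgraded once through the obd web UI, and I am not sure whether that is the reason.)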
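The servers that host the obd cluster (the business cluster created through OCP) do have the ob_admin file, but the restore points still did not show up. After some investigation, the cause turned out to be that the NFS share is served from a Windows server, and the user permissions configured on the Windows NFS server side were wrong.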
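To separate a permission problem like this from a missing directory, one option is to try listing the backup path directly on each host. The sketch below is only an illustration; `probe_backup_path` is a made-up helper, and it should be run as the same OS user that the OBServer and agent processes run as (which user that is depends on your deployment).

```python
import os


def probe_backup_path(path):
    """Try to list the directory as the current user and report the outcome."""
    try:
        entries = os.listdir(path)
    except FileNotFoundError:
        return "missing"
    except PermissionError:
        return "permission_denied"
    except OSError as exc:
        # Any other failure, for example a stale NFS handle or a non-directory path.
        return f"error: {exc.strerror}"
    return f"ok ({len(entries)} entries)"
```

For the case in this thread the path to probe would be `/data/obbackup/obd/1703148157/tenant_incarnation_1`, taken from the log. A `permission_denied` result on a mounted share points at the export or user mapping on the NFS server rather than at the backup itself.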
Was only OCP upgraded? From which version to which version? Was OB upgraded as well? How many nodes does this OCP cluster have?