OCP task gets stuck when restarting obcluster

[Environment] Production or test environment
[OB or other component] OCP
[Version] 3.3.0
[Problem description]
Restarting the cluster through OCP reports the following:
2022-07-11 19:05:09.915 WARN 45 --- [pool-manual-subtask-executor7,0766d5c77b9b4bff,9b616c25b5c5] c.a.o.c.t.engine.runner.RunnerFactory : Execute task failed, subtask=SubtaskInstanceEntity{id=49403, name=Sync cluster info, state=FAILED, operation=RETRY, className=com.alipay.ocp.service.task.business.cluster.SyncClusterInfoTask, seriesId=26, startTime=2022-07-11T19:04:49.390+08:00, endTime=2022-07-11T19:05:09.914+08:00}

java.lang.RuntimeException: com.alipay.ocp.core.exception.UnexpectedException: [OCP UnexpectedException]: status=500 INTERNAL_SERVER_ERROR, errorCode=OB_CLUSTER_CONNECT_FAILED, args=obcluster,ocp_monitor
    at com.alipay.ocp.common.pattern.Retry.executeUntilSuccessWithLimit(Retry.java:173) ~[ocp-common-3.3.0-20220427.jar!/:3.3.0-20220427]
    at com.alipay.ocp.common.pattern.Retry.executeUntilSuccessWithLimit(Retry.java:188) ~[ocp-common-3.3.0-20220427.jar!/:3.3.0-20220427]
    at com.alipay.ocp.service.task.business.cluster.SyncClusterInfoTask.run(SyncClusterInfoTask.java:44) ~[ocp-service-3.3.0-20220427.jar!/:3.3.0-20220427]
    at com.alipay.ocp.core.task.runtime.Subtask.retry(Subtask.java:49) ~[ocp-core-3.3.0-20220427.jar!/:3.3.0-20220427]
    at com.alipay.ocp.core.metadb.task.model.SubtaskInstanceEntity.retry(SubtaskInstanceEntity.java:233) ~[ocp-core-3.3.0-20220427.jar!/:3.3.0-20220427]
    at com.alipay.ocp.core.task.engine.runner.JavaTaskRunner.doExecute(JavaTaskRunner.java:30) ~[ocp-core-3.3.0-20220427.jar!/:3.3.0-20220427]
    at com.alipay.ocp.core.task.engine.runner.JavaTaskRunner.run(JavaTaskRunner.java:20) ~[ocp-core-3.3.0-20220427.jar!/:3.3.0-20220427]
    at com.alipay.ocp.core.task.engine.runner.RunnerFactory.doRun(RunnerFactory.java:113) ~[ocp-core-3.3.0-20220427.jar!/:3.3.0-20220427]
    at com.alipay.ocp.core.task.engine.runner.RunnerFactory.redirectOutputIfNotSysSchedule(RunnerFactory.java:185) ~[ocp-core-3.3.0-20220427.jar!/:3.3.0-20220427]
    at com.alipay.ocp.core.task.engine.runner.RunnerFactory.run(RunnerFactory.java:102) ~[ocp-core-3.3.0-20220427.jar!/:3.3.0-20220427]
    at com.alipay.ocp.core.task.engine.coordinator.worker.subtask.ReadySubtaskWorker.lambda$submitTask$3(ReadySubtaskWorker.java:123) ~[ocp-core-3.3.0-20220427.jar!/:3.3.0-20220427]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_312]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_312]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_312]
    at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_312]
Caused by: com.alipay.ocp.core.exception.UnexpectedException: [OCP UnexpectedException]: status=500 INTERNAL_SERVER_ERROR, errorCode=OB_CLUSTER_CONNECT_FAILED, args=obcluster,ocp_monitor
    at com.alipay.ocp.service.oceanbase.obsdk.factory.AbstractObOperatorFactory.getObOperator(AbstractObOperatorFactory.java:161) ~[ocp-service-3.3.0-20220427.jar!/:3.3.0-20220427]
    at com.alipay.ocp.service.oceanbase.obsdk.factory.AbstractObOperatorFactory.createObOperator(AbstractObOperatorFactory.java:121) ~[ocp-service-3.3.0-20220427.jar!/:3.3.0-20220427]
    at com.alipay.ocp.service.oceanbase.obsdk.factory.AbstractObOperatorFactory.createObOperator(AbstractObOperatorFactory.java:125) ~[ocp-service-3.3.0-20220427.jar!/:3.3.0-20220427]
    at com.alipay.ocp.service.operation.ob.cluster.ClusterSyncService.zoneEntitiesCurrent(ClusterSyncService.java:530) ~[ocp-service-3.3.0-20220427.jar!/:3.3.0-20220427]
    at com.alipay.ocp.service.operation.ob.cluster.ClusterSyncService.internalSyncCluster(ClusterSyncService.java:241) ~[ocp-service-3.3.0-20220427.jar!/:3.3.0-20220427]
    at com.alipay.ocp.service.operation.ob.cluster.ClusterSyncService.syncClusterInfo(ClusterSyncService.java:192) ~[ocp-service-3.3.0-20220427.jar!/:3.3.0-20220427]
    at com.alipay.ocp.service.task.business.cluster.SyncClusterInfoTask.lambda$run$0(SyncClusterInfoTask.java:45) ~[ocp-service-3.3.0-20220427.jar!/:3.3.0-20220427]
    at com.alipay.ocp.common.pattern.Retry.lambda$executeUntilSuccessWithLimit$0(Retry.java:189) ~[ocp-common-3.3.0-20220427.jar!/:3.3.0-20220427]
    at com.alipay.ocp.common.pattern.Retry.executeUntilSuccessWithLimit(Retry.java:169) ~[ocp-common-3.3.0-20220427.jar!/:3.3.0-20220427]
[Steps to reproduce]
In addition, I added the ocp_monitor user and its password to the password vault, but the same error is still reported.

[Symptoms and impact]
The cluster cannot be restarted, and cluster management and monitoring are affected.
[Attachment]
subtask_49403.log (61.5 KB)

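You can change the configuration item obsdk.ob.connection.mode to direct, so that OCP connects to the OB cluster directly. The ocp_monitor account is created automatically; if you have confirmed that it already exists in the password vault, the failure may have a different cause.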
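
To narrow down whether the failure is in the proxy path or in the account itself, it can help to try logging in by hand both ways. Below is a minimal Python sketch that only builds the two mysql client commands; it does not connect to anything. Several things here are assumptions, not taken from the log: the host address is a placeholder, ocp_monitor is assumed to live in the sys tenant, and the ports are the common defaults (2881 for an observer, 2883 for OBProxy), which may differ in your deployment. The names ObTarget and login_command are made up for this example.

```python
from dataclasses import dataclass


@dataclass
class ObTarget:
    host: str                    # placeholder address of an observer or OBProxy node
    cluster: str                 # cluster name, "obcluster" in the error above
    tenant: str = "sys"          # assumption: ocp_monitor belongs to the sys tenant
    user: str = "ocp_monitor"    # account named in the error args


def login_command(target: ObTarget, mode: str) -> str:
    """Build a mysql client command for a manual connectivity check."""
    if mode == "direct":
        # Direct connection to an observer: user@tenant, default SQL port 2881 (assumed).
        port = 2881
        account = f"{target.user}@{target.tenant}"
    elif mode == "proxy":
        # Connection through OBProxy: user@tenant#cluster, default port 2883 (assumed).
        port = 2883
        account = f"{target.user}@{target.tenant}#{target.cluster}"
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    # -p with no value makes the client prompt for the password interactively.
    return f"mysql -h{target.host} -P{port} -u{account} -p"


if __name__ == "__main__":
    target = ObTarget(host="10.0.0.1", cluster="obcluster")
    for mode in ("direct", "proxy"):
        print(mode, "->", login_command(target, mode))
```

If the direct login works with the password stored in the vault but the proxy login does not, that points at the proxy path, and switching obsdk.ob.connection.mode to direct is a reasonable workaround. If both fail, the account or its password is the more likely problem.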
