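【 Environment 】 Test environment
【 OB or other component 】 observer
【 Version 】 3.1
【Problem description】 Every time we stop the cluster through the OCP platform, the task gets stuck at the stop zone step, and the stop-cluster task has to be retried repeatedly before it finally goes through.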
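For reference, judging by the timestamps in the log in this post, each run of the subtask appears to issue `alter system stop zone` five times, waiting 15 seconds after every failure, and is then marked FAILED. A minimal Python sketch of that fixed-wait retry pattern follows. It only illustrates the pattern and is not OCP's code; `run_with_fixed_wait`, `stop_zone`, `max_attempts` and `wait_seconds` are hypothetical names.

```python
import time
from typing import Callable, List, Tuple


def run_with_fixed_wait(
    action: Callable[[], None],
    max_attempts: int = 5,       # assumption: five attempts, as counted in the log
    wait_seconds: float = 15.0,  # assumption: matches "wait for 15 seconds"
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[bool, List[str]]:
    """Call action until it succeeds or the attempts run out.

    Returns (succeeded, error messages collected from the failed attempts).
    """
    errors: List[str] = []
    for _ in range(max_attempts):
        try:
            action()
            return True, errors
        except RuntimeError as exc:
            errors.append(str(exc))
            sleep(wait_seconds)  # fixed wait between attempts, no backoff
    return False, errors


if __name__ == "__main__":
    # Hypothetical stand-in for "alter system stop zone":
    # fails twice, then succeeds.
    outcomes = iter(["log is not sync", "log is not sync", None])

    def stop_zone() -> None:
        error = next(outcomes)
        if error is not None:
            raise RuntimeError(error)

    # The sleep is stubbed out so the demo finishes immediately.
    print(run_with_fixed_wait(stop_zone, sleep=lambda seconds: None))
```

With a fixed budget like this, a run is marked as failed whenever the zone does not become stoppable within the retry window, which would be consistent with having to re-run the task by hand.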
【Relevant logs】
2022-08-23 11:46:36.760 INFO 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.c.m.t.model.SubtaskInstanceEntity : Run subtask, id=2090411, context=Context{parallelIdx=-1, stringMap={cluster_name=ymob_prod, target_server_status=RUNNING, prohibit_rollback=false, former_cluster_status=RUNNING, target_zone_status=RUNNING, task_instance_id=2079591, task_operation=execute, zone_name=zone2, ob_cluster_id=1660632206, cluster_id=1000002, ocpagent_service_name=agent, target_cluster_status=RUNNING, latest_execution_start_time=2022-08-23T11:46:36.755+08:00, sub_task_instance_id=2090411}, listMap={1660632206.ymob_prod.zone3.host_ids=[1000003], server_ids=[1000005], 1660632206.ymob_prod.zone1.server_ids=[1000004], 1660632206.ymob_prod.zone3.server_ids=[1000006], 1660632206.ymob_prod.zone2.server_ids=[1000005], 1660632206.ymob_prod.zone1.host_ids=[1000002], host_ids=[1000001], 1660632206.ymob_prod.zone2.host_ids=[1000001], zone_names=[zone1, zone2, zone3]}}, executor=10.221.22.221
2022-08-23 11:46:36.768 INFO 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.s.t.business.zone.StopObZoneTask : try to stop zone by ob cmd, clusterId=1000002, obClusterId=1660632206, clusterType=PRIMARY, zoneName=zone2
2022-08-23 11:46:36.768 INFO 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.s.t.b.zone.ObZoneTaskHandler : begin to stop zone, clusterId=1000002, zone=zone2
2022-08-23 11:46:36.780 INFO 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.c.obsdk.connector.ConnectTemplate : [obsdk] sql: set ob_query_timeout = ?, args: [10000000]
2022-08-23 11:46:36.782 INFO 464 --- [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.c.obsdk.connector.ConnectTemplate : [obsdk] sql: select max(value) value from oceanbase.__all_virtual_sys_parameter_stat where name = 'min_observer_version'
2022-08-23 11:46:36.803 INFO 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.c.obsdk.connector.ConnectTemplate : [obsdk] sql: set ob_query_timeout = ?, args: [10000000]
2022-08-23 11:46:36.804 INFO 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.c.obsdk.connector.ConnectTemplate : [obsdk] sql: SELECT zone
, MAX(CASE name
WHEN 'region' THEN info
ELSE '' END ) region
, MAX(CASE name
WHEN 'idc' THEN info
ELSE '' END ) idc
, MAX(CASE name
WHEN 'status' THEN info
ELSE '' END ) status
, MAX(CASE name
WHEN 'merge_status' THEN info
ELSE '' END ) merge_status
, MAX(CASE name
WHEN 'broadcast_version' THEN value
ELSE 0 END ) broadcast_version
, MAX(CASE name
WHEN 'all_merged_version' THEN value
ELSE 0 END ) all_merged_version
, MAX(CASE name
WHEN 'last_merged_version' THEN value
ELSE 0 END ) last_merged_version
, MAX(CASE name
WHEN 'merge_start_time' THEN value
ELSE 0 END ) merge_start_time
, MAX(CASE name
WHEN 'last_merged_time' THEN value
ELSE 0 END ) last_merged_time
, MAX(CASE name
WHEN 'is_merge_timeout' THEN value
ELSE 0 END ) merge_timeout
FROM oceanbase.__all_zone WHERE zone
<> '' GROUP BY zone
2022-08-23 11:46:36.806 INFO 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.c.obsdk.connector.ConnectTemplate : [obsdk] sql: set ob_query_timeout = ?, args: [10000000]
2022-08-23 11:46:36.808 INFO 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.c.obsdk.connector.ConnectTemplate : [obsdk] sql: set ob_query_timeout = ?, args: [1800000000]
2022-08-23 11:46:36.809 INFO 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.c.obsdk.connector.ConnectTemplate : [obsdk] sql: alter system stop zone ?, args: [zone2]
2022-08-23 11:46:36.823 ERROR 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.c.obsdk.connector.ConnectTemplate : [obsdk] update failed, sql:[alter system stop zone ?], error message:[PreparedStatementCallback; SQL [alter system stop zone ?]; (conn=196738) not enough member or quorum mismatch, stop zone not allowed; nested exception is java.sql.SQLTransientConnectionException: (conn=196738) not enough member or quorum mismatch, stop zone not allowed]
2022-08-23 11:46:36.824 INFO 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.c.obsdk.connector.ConnectTemplate : [obsdk] sql: set ob_query_timeout = ?, args: [10000000]
2022-08-23 11:46:36.825 WARN 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.s.t.b.zone.ObZoneTaskHandler : operate zone failed, exception msg=SQL [alter system stop zone ?; args:zone2]; SQL state [HY000]; error code [4179]; message [(conn=196738) not enough member or quorum mismatch, stop zone not allowed]
2022-08-23 11:46:36.826 INFO 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] com.alipay.ocp.common.pattern.Retry : wait for 15 seconds
2022-08-23 11:46:51.828 INFO 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.c.obsdk.connector.ConnectTemplate : [obsdk] sql: set ob_query_timeout = ?, args: [10000000]
2022-08-23 11:46:51.830 INFO 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.c.obsdk.connector.ConnectTemplate : [obsdk] sql: set ob_query_timeout = ?, args: [1800000000]
2022-08-23 11:46:51.831 INFO 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.c.obsdk.connector.ConnectTemplate : [obsdk] sql: alter system stop zone ?, args: [zone2]
2022-08-23 11:46:51.844 ERROR 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.c.obsdk.connector.ConnectTemplate : [obsdk] update failed, sql:[alter system stop zone ?], error message:[PreparedStatementCallback; SQL [alter system stop zone ?]; (conn=196738) not enough member or quorum mismatch, stop zone not allowed; nested exception is java.sql.SQLTransientConnectionException: (conn=196738) not enough member or quorum mismatch, stop zone not allowed]
2022-08-23 11:46:51.846 INFO 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.c.obsdk.connector.ConnectTemplate : [obsdk] sql: set ob_query_timeout = ?, args: [10000000]
2022-08-23 11:46:51.847 WARN 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.s.t.b.zone.ObZoneTaskHandler : operate zone failed, exception msg=SQL [alter system stop zone ?; args:zone2]; SQL state [HY000]; error code [4179]; message [(conn=196738) not enough member or quorum mismatch, stop zone not allowed]
2022-08-23 11:46:51.848 INFO 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] com.alipay.ocp.common.pattern.Retry : wait for 15 seconds
2022-08-23 11:47:06.849 INFO 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.c.obsdk.connector.ConnectTemplate : [obsdk] sql: set ob_query_timeout = ?, args: [10000000]
2022-08-23 11:47:06.851 INFO 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.c.obsdk.connector.ConnectTemplate : [obsdk] sql: set ob_query_timeout = ?, args: [1800000000]
2022-08-23 11:47:06.853 INFO 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.c.obsdk.connector.ConnectTemplate : [obsdk] sql: alter system stop zone ?, args: [zone2]
2022-08-23 11:47:06.963 ERROR 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.c.obsdk.connector.ConnectTemplate : [obsdk] update failed, sql:[alter system stop zone ?], error message:[PreparedStatementCallback; SQL [alter system stop zone ?]; (conn=196738) log is not sync, cannot stop zone not allowed; nested exception is java.sql.SQLTransientConnectionException: (conn=196738) log is not sync, cannot stop zone not allowed]
2022-08-23 11:47:06.965 INFO 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.c.obsdk.connector.ConnectTemplate : [obsdk] sql: set ob_query_timeout = ?, args: [10000000]
2022-08-23 11:47:06.966 WARN 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.s.t.b.zone.ObZoneTaskHandler : operate zone failed, exception msg=SQL [alter system stop zone ?; args:zone2]; SQL state [HY000]; error code [4179]; message [(conn=196738) log is not sync, cannot stop zone not allowed]
2022-08-23 11:47:06.967 INFO 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] com.alipay.ocp.common.pattern.Retry : wait for 15 seconds
2022-08-23 11:47:21.969 INFO 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.c.obsdk.connector.ConnectTemplate : [obsdk] sql: set ob_query_timeout = ?, args: [10000000]
2022-08-23 11:47:21.971 INFO 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.c.obsdk.connector.ConnectTemplate : [obsdk] sql: set ob_query_timeout = ?, args: [1800000000]
2022-08-23 11:47:21.973 INFO 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.c.obsdk.connector.ConnectTemplate : [obsdk] sql: alter system stop zone ?, args: [zone2]
2022-08-23 11:47:22.084 ERROR 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.c.obsdk.connector.ConnectTemplate : [obsdk] update failed, sql:[alter system stop zone ?], error message:[PreparedStatementCallback; SQL [alter system stop zone ?]; (conn=196738) log is not sync, cannot stop zone not allowed; nested exception is java.sql.SQLTransientConnectionException: (conn=196738) log is not sync, cannot stop zone not allowed]
2022-08-23 11:47:22.085 INFO 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.c.obsdk.connector.ConnectTemplate : [obsdk] sql: set ob_query_timeout = ?, args: [10000000]
2022-08-23 11:47:22.087 WARN 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.s.t.b.zone.ObZoneTaskHandler : operate zone failed, exception msg=SQL [alter system stop zone ?; args:zone2]; SQL state [HY000]; error code [4179]; message [(conn=196738) log is not sync, cannot stop zone not allowed]
2022-08-23 11:47:22.088 INFO 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] com.alipay.ocp.common.pattern.Retry : wait for 15 seconds
2022-08-23 11:47:37.090 INFO 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.c.obsdk.connector.ConnectTemplate : [obsdk] sql: set ob_query_timeout = ?, args: [10000000]
2022-08-23 11:47:37.091 INFO 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.c.obsdk.connector.ConnectTemplate : [obsdk] sql: set ob_query_timeout = ?, args: [1800000000]
2022-08-23 11:47:37.093 INFO 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.c.obsdk.connector.ConnectTemplate : [obsdk] sql: alter system stop zone ?, args: [zone2]
2022-08-23 11:47:37.203 ERROR 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.c.obsdk.connector.ConnectTemplate : [obsdk] update failed, sql:[alter system stop zone ?], error message:[PreparedStatementCallback; SQL [alter system stop zone ?]; (conn=196738) log is not sync, cannot stop zone not allowed; nested exception is java.sql.SQLTransientConnectionException: (conn=196738) log is not sync, cannot stop zone not allowed]
2022-08-23 11:47:37.204 INFO 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.c.obsdk.connector.ConnectTemplate : [obsdk] sql: set ob_query_timeout = ?, args: [10000000]
2022-08-23 11:47:37.206 WARN 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.s.t.b.zone.ObZoneTaskHandler : operate zone failed, exception msg=SQL [alter system stop zone ?; args:zone2]; SQL state [HY000]; error code [4179]; message [(conn=196738) log is not sync, cannot stop zone not allowed]
2022-08-23 11:47:37.207 INFO 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] com.alipay.ocp.common.pattern.Retry : wait for 15 seconds
2022-08-23 11:47:52.208 INFO 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.c.m.t.model.SubtaskInstanceEntity : Set state for subtask: 2090411, current state: RUNNING, new state: FAILED
2022-08-23 11:47:52.209 WARN 464 — [pool-subtask-executor-thread-1,17b22052ec594397,22d60764fdc6] c.a.o.c.t.engine.runner.RunnerFactory : Execute task failed, subtask=SubtaskInstanceEntity{id=2090411, name=Stop zone, state=FAILED, operation=EXECUTE, className=com.alipay.ocp.servi
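To make the sequence of failures in the log above easier to read, the `operate zone failed` WARN lines can be pulled out with a small script. The following is a minimal Python sketch: `summarize_failures` and `FAILURE_RE` are names made up for this illustration, the regular expression is written only against the line format shown in this post, and the two sample lines are abbreviated copies of lines from the log.

```python
import re
from collections import Counter

# Matches the WARN lines written by ObZoneTaskHandler in the log above.
# Assumption: the format is exactly as pasted in this post.
FAILURE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) WARN .*?"
    r"operate zone failed, .*?error code \[(?P<code>\d+)\]; "
    r"message \[\(conn=\d+\) (?P<msg>.*)\]\s*$"
)


def summarize_failures(lines):
    """Return (timestamp, error code, message) for each failed attempt."""
    failures = []
    for line in lines:
        match = FAILURE_RE.match(line.strip())
        if match:
            failures.append(
                (match.group("ts"), int(match.group("code")), match.group("msg"))
            )
    return failures


# Abbreviated sample lines (thread and class names shortened).
SAMPLE = [
    "2022-08-23 11:46:36.825 WARN 464 --- [t] ObZoneTaskHandler : operate zone "
    "failed, exception msg=SQL [alter system stop zone ?; args:zone2]; SQL state "
    "[HY000]; error code [4179]; message [(conn=196738) not enough member or "
    "quorum mismatch, stop zone not allowed]",
    "2022-08-23 11:46:36.826 INFO 464 --- [t] Retry : wait for 15 seconds",
    "2022-08-23 11:47:06.966 WARN 464 --- [t] ObZoneTaskHandler : operate zone "
    "failed, exception msg=SQL [alter system stop zone ?; args:zone2]; SQL state "
    "[HY000]; error code [4179]; message [(conn=196738) log is not sync, cannot "
    "stop zone not allowed]",
]

if __name__ == "__main__":
    failures = summarize_failures(SAMPLE)
    for ts, code, msg in failures:
        print(ts, code, msg)
    # Count how often each distinct message occurs.
    print(Counter(msg for _, _, msg in failures))
```

Run against the full log in this post, this would show five failed attempts with error code 4179: the first two report "not enough member or quorum mismatch", the last three report "log is not sync".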
【Symptoms and impact】
【Attachments】