Reinstalling an observer with OCP


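Delete both zone4 and the 240 node, then add 240 to zone1, and then delete 239. That also works.

Is there any other way besides this? I would like to understand the principle behind it: why does changing primary_zone not work either?

Thanks, I'll take a look.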


Is this related to the reinstall task that is currently running? Can I skip the current task node and continue with the next one?

2025-03-04 11:32:09.409 INFO 17347 — [pool-manual-subtask-executor16,4a4a825694ee4afe,ec0648bd687f] c.o.o.c.t.e.runner.JavaSubtaskRunner : Run subtask, id=60301, context=Context{parallelIdx=-1, stringMap={ob_log_disk_path=/data/log1, task_instance_id=60782, task_operation=execute, zone_name=zone4, ob_run_path=/home/admin/oceanbase, service_version=4.2.1.8, ob_install_path=/home/admin/oceanbase, ob_sql_port=2881, cluster_id=2, ob_disk_path_style=DEFAULT, ob_svr_port=2882, sub_task_instance_name=Stop zone, sub_task_instance_id=60301, cluster_name=ndcatl, target_server_status=DELETED, subtask_splitter=server_ids, service_name=ndcatl:1740736713, former_cluster_status=RUNNING, target_zone_status=DELETED, ob_cluster_id=1740736713, service_type=OB_CLUSTER, target_cluster_status=RUNNING, ob_run_user=admin, latest_execution_start_time=2025-03-04T11:32:09.275+08:00, ob_data_disk_path=/data/1}, listMap={server_ids=[6], host_ids=[3]}}, executor=10.38.36.237

2025-03-04 11:32:09.475 INFO 17347 — [pool-manual-subtask-executor16,4a4a825694ee4afe,ec0648bd687f] c.o.o.s.t.business.zone.StopObZoneTask : try to stop zone by ob cmd, clusterId=2, obClusterId=1740736713, clusterType=PRIMARY, zoneName=zone4

2025-03-04 11:32:09.480 INFO 17347 — [pool-manual-subtask-executor16,4a4a825694ee4afe,ec0648bd687f] c.o.o.s.t.b.zone.ObZoneTaskHandler : begin to stop zone, clusterId=2, zone=zone4

2025-03-04 11:32:09.544 INFO 17347 — [pool-manual-subtask-executor16,4a4a825694ee4afe,ec0648bd687f] c.o.ocp.obsdk.connector.ObConnectors : [obsdk]:connected server ip:10.38.36.243, sql port:2881

2025-03-04 11:32:09.551 INFO 17347 — [pool-manual-subtask-executor16,4a4a825694ee4afe,ec0648bd687f] c.o.ocp.obsdk.connector.ObConnectors : [obsdk]:connected server ip:10.38.36.243, sql port:2881

2025-03-04 11:32:09.555 INFO 17347 — [pool-manual-subtask-executor16,4a4a825694ee4afe,ec0648bd687f] c.o.ocp.obsdk.connector.ConnectTemplate : [obsdk] sql: set ob_query_timeout = ?, args: [10000000]

2025-03-04 11:32:09.562 INFO 17347 — [pool-manual-subtask-executor16,4a4a825694ee4afe,ec0648bd687f] c.o.ocp.obsdk.connector.ConnectTemplate : [obsdk] sql: SELECT zone, status, region, idc FROM DBA_OB_ZONES

2025-03-04 11:32:09.569 INFO 17347 — [pool-manual-subtask-executor16,4a4a825694ee4afe,ec0648bd687f] c.o.ocp.obsdk.connector.ObConnectors : [obsdk]:connected server ip:10.38.36.243, sql port:2881

2025-03-04 11:32:09.574 INFO 17347 — [pool-manual-subtask-executor16,4a4a825694ee4afe,ec0648bd687f] c.o.ocp.obsdk.connector.ConnectTemplate : [obsdk] sql: set ob_query_timeout = ?, args: [10000000]

2025-03-04 11:32:09.579 INFO 17347 — [pool-manual-subtask-executor16,4a4a825694ee4afe,ec0648bd687f] c.o.ocp.obsdk.connector.ConnectTemplate : [obsdk] sql: set ob_query_timeout = ?, args: [1800000000]

2025-03-04 11:32:09.584 INFO 17347 — [pool-manual-subtask-executor16,4a4a825694ee4afe,ec0648bd687f] c.o.ocp.obsdk.connector.ConnectTemplate : [obsdk] sql: alter system stop zone ?, args: [zone4]

2025-03-04 11:32:09.590 WARN 17347 — [pool-manual-subtask-executor16,4a4a825694ee4afe,ec0648bd687f] c.o.ocp.obsdk.connector.ConnectTemplate : [obsdk] update failed, sql:[alter system stop zone ?], error message:[PreparedStatementCallback; SQL [alter system stop zone ?]; (conn=3222379530) cannot stop server or stop zone in multiple zones; nested exception is java.sql.SQLTransientConnectionException: (conn=3222379530) cannot stop server or stop zone in multiple zones]

2025-03-04 11:32:09.594 INFO 17347 — [pool-manual-subtask-executor16,4a4a825694ee4afe,ec0648bd687f] c.o.ocp.obsdk.connector.ConnectTemplate : Last Trace Info:[YB420A2624F3-00062F6C3CE88FB9-0-0]

2025-03-04 11:32:09.599 INFO 17347 — [pool-manual-subtask-executor16,4a4a825694ee4afe,ec0648bd687f] c.o.ocp.obsdk.connector.ConnectTemplate : [obsdk] sql: set ob_query_timeout = ?, args: [10000000]

2025-03-04 11:32:09.604 ERROR 17347 — [pool-manual-subtask-executor16,4a4a825694ee4afe,ec0648bd687f] c.o.o.c.t.e.c.w.subtask.SubtaskExecutor : cannot stop server or stop zone in multiple zones

java.sql.SQLException: cannot stop server or stop zone in multiple zones
at com.oceanbase.jdbc.internal.protocol.AbstractQueryProtocol.readErrorPacket(AbstractQueryProtocol.java:2192)
at com.oceanbase.jdbc.internal.protocol.AbstractQueryProtocol.readPacket(AbstractQueryProtocol.java:2057)
at com.oceanbase.jdbc.internal.protocol.AbstractQueryProtocol.getResult(AbstractQueryProtocol.java:1951)
at com.oceanbase.jdbc.internal.protocol.AbstractQueryProtocol.executeQuery(AbstractQueryProtocol.java:370)
at com.oceanbase.jdbc.JDBC4PreparedStatement.executeInternal(JDBC4PreparedStatement.java:234)
at com.oceanbase.jdbc.JDBC4PreparedStatement.execute(JDBC4PreparedStatement.java:161)
at com.oceanbase.jdbc.JDBC4PreparedStatement.executeUpdate(JDBC4PreparedStatement.java:195)
at com.alibaba.druid.pool.DruidPooledPreparedStatement.executeUpdate(DruidPooledPreparedStatement.java:255)
at org.springframework.jdbc.core.JdbcTemplate.lambda$update$2(JdbcTemplate.java:973)
at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:656)
at org.springframework.jdbc.core.JdbcTemplate.update(JdbcTemplate.java:968)
at org.springframework.jdbc.core.JdbcTemplate.update(JdbcTemplate.java:1023)
at org.springframework.jdbc.core.JdbcTemplate.update(JdbcTemplate.java:1033)
at com.oceanbase.ocp.obsdk.connector.ConnectTemplate.updateInner(ConnectTemplate.java:294)
at com.oceanbase.ocp.obsdk.connector.ConnectTemplate.update(ConnectTemplate.java:283)
at com.oceanbase.ocp.obsdk.operator.cluster.MysqlClusterOperator.stopZone(MysqlClusterOperator.java:508)
at com.oceanbase.ocp.service.task.business.zone.ObZoneTaskHandler.lambda$tryOperateZone$2(ObZoneTaskHandler.java:82)
at com.oceanbase.ocp.common.lang.pattern.Retry.executeUntilWithLimit(Retry.java:74)
at com.oceanbase.ocp.service.task.business.zone.ObZoneTaskHandler.tryOperateZone(ObZoneTaskHandler.java:80)
at com.oceanbase.ocp.service.task.business.zone.ObZoneTaskHandler.stopZone(ObZoneTaskHandler.java:59)
at com.oceanbase.ocp.service.task.business.zone.StopObZoneTask.run(StopObZoneTask.java:61)
at com.oceanbase.ocp.core.task.engine.runner.JavaSubtaskRunner.execute(JavaSubtaskRunner.java:64)
at com.oceanbase.ocp.core.task.engine.runner.JavaSubtaskRunner.doRun(JavaSubtaskRunner.java:32)
at com.oceanbase.ocp.core.task.engine.runner.JavaSubtaskRunner.run(JavaSubtaskRunner.java:26)
at com.oceanbase.ocp.core.task.engine.runner.RunnerFactory.doRun(RunnerFactory.java:76)
at com.oceanbase.ocp.core.task.engine.coordinator.worker.subtask.SubtaskExecutor.doRun(SubtaskExecutor.java:203)
at com.oceanbase.ocp.core.task.engine.coordinator.worker.subtask.SubtaskExecutor.redirectConsoleOutput(SubtaskExecutor.java:197)
at com.oceanbase.ocp.core.task.engine.coordinator.worker.subtask.SubtaskExecutor.lambda$submit$2(SubtaskExecutor.java:134)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

Set state for subtask: 60301, operation:EXECUTE, state: FAILED

alter system start zone zone1; I don't think you can stop more than one zone at a time. Start zone1 and then try again.
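The error in the log ("cannot stop server or stop zone in multiple zones") suggests a guard that protects the replica majority. The sketch below is only a simplified model of such a guard, not OceanBase code. It assumes the rule is "refuse to stop a zone while any other zone already has a stopped or unreachable server". The function name and the input dictionary are hypothetical.

```python
from typing import Dict


def can_stop_zone(target_zone: str, unavailable_servers: Dict[str, int]) -> bool:
    """Simplified model (assumed rule): stopping `target_zone` is allowed only
    if no other zone already has a stopped or unreachable server."""
    return all(
        count == 0
        for zone, count in unavailable_servers.items()
        if zone != target_zone
    )


# Hypothetical snapshot matching the thread: the observer in zone1 is down.
cluster = {"zone1": 1, "zone2": 0, "zone3": 0, "zone4": 0}
print(can_stop_zone("zone4", cluster))  # another zone (zone1) is already degraded
print(can_stop_zone("zone1", cluster))  # only the target zone itself is affected
```

Under this model, the OCP task that stops zone4 is rejected for as long as the observer in zone1 remains down, which matches the failure in the log.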


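The observer directory on zone1 is corrupted and the observer will not start, but the zone status is shown as normal.

Please cancel all the tasks in OCP, and roll back whatever can be rolled back. Then log in with obclient and do the operations there. OCP probably has a number of restrictions that prevent you from doing this.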

-- After cancelling, run the following and check the output
select * from dba_ob_tenants;
select * from dba_ob_servers;
select * from dba_ob_zones;
select * from dba_ob_units;
select * from dba_ob_resource_pools; -- units live inside resource pools, so dropping a resource pool drops its units


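-- Change the tenant's replica distribution (locality). If you only have the sys tenant, change just sys; if there are other tenants, they all have to be changed.
alter tenant sys SET LOCALITY = 'FULL{1}@zone2, FULL{1}@zone3';

-- Change the tenant's primary_zone
alter tenant sys primary_zone='zone2,zone3'; -- leave out the old zone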

-- Check the unit distribution
select * from dba_ob_units;
Note: check whether there are still units on zone1.

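-- Split the resource pool
ALTER RESOURCE POOL sys_pool SPLIT INTO ('sys_pool13','sys_pool14','sys_pool15') ON ('zone1','zone2','zone3');
-- pool10/pool11/pool12 and uc1/uc2/uc3 appear to be example names; substitute your own split pools and unit configs
ALTER RESOURCE POOL pool10 UNIT='uc1';
ALTER RESOURCE POOL pool11 UNIT='uc2';
ALTER RESOURCE POOL pool12 UNIT='uc3';
Note: the units on zone1 are removed by dropping the tenant's resource pool for that zone. If there are none, this step can be skipped.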

-- Delete the server
alter system delete server 'xxxxx:2882' zone=zone1;

-- Delete the zone
alter system stop zone zone1;
alter system delete zone zone1;
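The order of the statements above matters: the tenant stops using the zone first, and only then are the server and the zone removed. As a rough sketch, the helper below generates those statements in that order for review before running them by hand in obclient. It does not connect to anything. The function name is hypothetical, and the resource pool step is left out because it depends on how the pools are laid out.

```python
from typing import List


def zone_removal_statements(tenant: str, keep_zones: List[str],
                            drop_zone: str, server: str) -> List[str]:
    """Return the statements from the reply above, in the order they were given."""
    # Build 'FULL{1}@zoneA, FULL{1}@zoneB' from the zones that stay.
    locality = ", ".join(f"FULL{{1}}@{zone}" for zone in keep_zones)
    primary = ",".join(keep_zones)
    return [
        f"alter tenant {tenant} locality = '{locality}';",
        f"alter tenant {tenant} primary_zone = '{primary}';",
        f"alter system delete server '{server}' zone = '{drop_zone}';",
        f"alter system stop zone {drop_zone};",
        f"alter system delete zone {drop_zone};",
    ]


# 'xxxxx:2882' is the placeholder address used in the thread.
statements = zone_removal_statements("sys", ["zone2", "zone3"], "zone1", "xxxxx:2882")
for statement in statements:
    print(statement)
```

Printing the statements rather than executing them keeps a human in the loop, which seems appropriate for an operation that deletes a zone.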

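alter system delete server '192.168.10.15:2882' zone='zone3';
ERROR 4734 (HY000): can not migrate out unit '3', no other available servers on zone 'zone3', delete server not allowed
Solution:
If you are sure that neither the zone nor the observer is needed any more:

  1. Split the resource pool
  2. Drop the resource pool that was allocated to that zone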

ALTER RESOURCE POOL sys_pool SPLIT INTO ('sys_pool13','sys_pool14','sys_pool15') ON ('zone1','zone2','zone3');
ALTER RESOURCE POOL pool10 UNIT='uc1';
ALTER RESOURCE POOL pool11 UNIT='uc2';
ALTER RESOURCE POOL pool12 UNIT='uc3';

drop resource pool sys_pool15;
ERROR 4626 (HY000): resource pool 'sys_pool15' has already been granted to a tenant
Solution:
alter tenant sys resource_pool_list=('sys_pool13','sys_pool14');
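To make the link between the split and the resource_pool_list change explicit, here is a small sketch. It assumes that SPLIT ... ON ... pairs each new pool with a zone by position, as the example above implies. Both function names are hypothetical. It only builds the statement text.

```python
from typing import Dict, List


def pools_after_split(new_pools: List[str], zones: List[str]) -> Dict[str, str]:
    """Pair each new pool with its zone, position by position (assumed SPLIT behavior)."""
    if len(new_pools) != len(zones):
        raise ValueError("need exactly one new pool name per zone")
    return dict(zip(new_pools, zones))


def resource_pool_list_stmt(tenant: str, pool_to_zone: Dict[str, str],
                            dropped_zone: str) -> str:
    """Build the ALTER TENANT statement that keeps every pool except the dropped zone's."""
    keep = [pool for pool, zone in pool_to_zone.items() if zone != dropped_zone]
    quoted = ",".join(f"'{pool}'" for pool in keep)
    return f"alter tenant {tenant} resource_pool_list=({quoted});"


mapping = pools_after_split(
    ["sys_pool13", "sys_pool14", "sys_pool15"],
    ["zone1", "zone2", "zone3"],
)
print(resource_pool_list_stmt("sys", mapping, "zone3"))
```

Once the tenant no longer references the pool for the dropped zone, the drop resource pool statement is no longer blocked by error 4626.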






obclient [oceanbase]> alter tenant sys locality='FULL{1}@zone1, FULL{1}@zone2, FULL{1}@zone3';
ERROR 1210 (HY000): Incorrect arguments to locality, zone name illegal
Solution: the zone must first be added to the tenant's resource pools before the locality can be extended.
obclient [oceanbase]> create resource pool sys_pool15 unit=sys_unit_config,unit_num=1,zone_list=('zone3');
Query OK, 0 rows affected (1.291 sec)

obclient [oceanbase]> alter tenant sys resource_pool_list=('sys_pool14','sys_pool13','sys_pool15');
Query OK, 0 rows affected (0.514 sec)
obclient [oceanbase]> alter tenant sys locality='FULL{1}@zone1, FULL{1}@zone2, FULL{1}@zone3';
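The precondition behind error 1210 in this transcript can be checked before altering the locality. This is a minimal sketch, assuming every zone named in the locality string must already be covered by one of the tenant's resource pools. The function name is hypothetical, and the zone list would in practice come from dba_ob_resource_pools.

```python
import re
from typing import Iterable, List


def zones_missing_from_pools(locality: str, pool_zones: Iterable[str]) -> List[str]:
    """Return the zones named in `locality` that no resource pool of the tenant covers."""
    wanted = re.findall(r"@\s*(\w+)", locality)  # zone names follow the '@'
    covered = set(pool_zones)
    return [zone for zone in wanted if zone not in covered]


locality = "FULL{1}@zone1, FULL{1}@zone2, FULL{1}@zone3"
# Before sys_pool15 is created and granted, only zone1 and zone2 are covered.
print(zones_missing_from_pools(locality, ["zone1", "zone2"]))
# After the tenant's resource_pool_list includes the pool on zone3.
print(zones_missing_from_pools(locality, ["zone1", "zone2", "zone3"]))
```

An empty result suggests the locality change is at least not blocked by a missing pool; other checks inside OceanBase may still apply.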

Try following my procedure. It all works without problems on my side.

Thanks, that worked.

If any part of this was useful to you, please mark it as the accepted answer. Thanks!

ALTER RESOURCE POOL sys_pool SPLIT INTO ('sys_pool13','sys_pool14','sys_pool15') ON ('zone1','zone2','zone3'); Why can't I find this syntax on the official website?

ALTER RESOURCE POOL - V4.3.5 - OceanBase Database Documentation - Distributed Database User Documentation

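GF wrote this up in the community earlier. This syntax is used when a node is completely dead and beyond recovery, and only the sys resource pool can be split.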

Good to know, I'll study this.