【 使用环境 】测试环境
【 OB or 其他组件 】ob
【 使用版本 】4.3.5.5
【问题描述】和上一个问题 alter tenant locality任务卡死 基本一样,只是从sys租户换成了业务租户出问题,但是同样的处理方法无法解决问题
OceanBase 社区已接收您的帖子,正在跟进中。
当在业务租户上执行 ALTER TENANT LOCALITY 任务时遇到卡死的情况,可以按照以下步骤排查和解决该问题:
-
检查前提条件:确保待变更的 Zone 上各节点的资源分配情况满足要求。如果 Zone 上的节点资源不足,会导致无法进行 Locality 变更。具体操作请参考 查看租户和资源信息。
-
查看资源池属性:确认资源池的配置是否正确。可以通过查询系统视图
DBA_OB_RESOURCE_POOLS来查看资源池信息,具体操作请参考 DBA_OB_RESOURCE_POOLS。 -
检查任务状态:使用系统视图
CDB_OB_LS来查看当前任务的状态,以确定任务是否卡在某个特定阶段。具体操作请参考 CDB_OB_LS。 -
修改资源池属性:如果资源池属性不正确或需要调整,可以使用
ALTER RESOURCE POOL或ALTER RESOURCE TENANT语句来修改资源池属性。具体操作请参考 ALTER RESOURCE POOL 和 ALTER RESOURCE TENANT。 -
重启 OBServer:如果以上步骤都无法解决问题,可以尝试重启 OBServer,以排除可能的临时性问题。
希望这些步骤能帮助您解决 ALTER TENANT LOCALITY 任务卡死的问题。如果问题仍然存在,建议联系技术支持团队获取进一步的帮助。
详情请参考:
感谢官方已解决,最终问题定位到是卡在:
select * from __all_virtual_ls_meta_table where tenant_id = 1001;
select * from GV$OB_LOG_STAT where tenant_id = 1001;
这个元数据租户里记录的member_list没改成功影响的inprogress=0
执行
ALTER SYSTEM REMOVE REPLICA LS = 1 SERVER = ‘10.0.22.46:2882’ PAXOS_REPLICA_NUM = 2 TENANT = ‘META$1002’;
后alter tenant locality任务成功结束
上一个问题中因为sys租户没有meta租户所以直接ok了, 没有暴露出这个问题。。
—背景:业务租户的表不均衡,查看是日志流缺少副本导致的。
1、用户alter tenant locality增加日志流的副本发现卡住,较少副本也不行,来回操作。
—以减少副本为基础的操作开始排查问题
1.查看租户的任务
SELECT * FROM oceanbase.DBA_OB_TENANT_JOBS WHERE JOB_TYPE = ‘ALTER_TENANT_LOCALITY’ order by 1 desc limit 2;
±-------±----------------------±-----------±------------±---------±---------------------------±---------------------------±----------±--------------------------------------------------------------------------------±----------------------------------------------------------------------------------------±-----------±------------+
| JOB_ID | JOB_TYPE | JOB_STATUS | RESULT_CODE | PROGRESS | START_TIME | MODIFY_TIME | TENANT_ID | SQL_TEXT | EXTRA_INFO | RS_SVR_IP | RS_SVR_PORT |
±-------±----------------------±-----------±------------±---------±---------------------------±---------------------------±----------±--------------------------------------------------------------------------------±----------------------------------------------------------------------------------------±-----------±------------+
| 15 | ALTER_TENANT_LOCALITY | INPROGRESS | NULL | 0 | 2026-04-23 10:21:37.128194 | 2026-04-23 10:21:37.128194 | 1002 | ALTER TENANT archive LOCALITY = ‘FULL{1}@zone1, FULL{1}@zone2’ | FROM: ‘FULL{1}@zone1, FULL{1}@zone2, FULL{1}@zone3’, TO: ‘FULL{1}@zone1, FULL{1}@zone2’ | 10.0.22.44 | 2882 |
2.查看业务租户的RS调度
–DBA_OB_ROOTSERVICE_EVENT_HISTORY任务成对出现,没有问题。
mysql> select TIMESTAMP,MODULE,EVENT,NAME1,VALUE1,NAME2,VALUE2,NAME6,VALUE6 from DBA_OB_ROOTSERVICE_EVENT_HISTORY where module like ‘%disas%’ order by TIMESTAMP desc limit 5;
±---------------------------±------------------±-------------------------------±----------±-------±------±-------±---------------±-----------------------------------------------------+
| TIMESTAMP | MODULE | EVENT | NAME1 | VALUE1 | NAME2 | VALUE2 | NAME6 | VALUE6 |
±---------------------------±------------------±-------------------------------±----------±-------±------±-------±---------------±-----------------------------------------------------+
| 2026-04-23 10:21:37.443219 | disaster_recovery | finish_remove_ls_paxos_replica | tenant_id | 1002 | ls_id | 1002 | execute_result | ret:0, OB_SUCCESS; elapsed:443215; |
| 2026-04-23 10:21:37.429014 | disaster_recovery | finish_remove_ls_paxos_replica | tenant_id | 1002 | ls_id | 1001 | execute_result | ret:0, OB_SUCCESS; elapsed:429010; |
| 2026-04-23 10:21:37.413738 | disaster_recovery | finish_remove_ls_paxos_replica | tenant_id | 1002 | ls_id | 1 | execute_result | ret:0, OB_SUCCESS; elapsed:413734; |
| 2026-04-23 10:21:37.361019 | disaster_recovery | start_remove_ls_paxos_replica | tenant_id | 1002 | ls_id | 1002 | comment | remove redundant paxos replica according to locality |
| 2026-04-23 10:21:37.350369 | disaster_recovery | start_remove_ls_paxos_replica | tenant_id | 1002 | ls_id | 1001 | comment | remove redundant paxos replica according to locality |
±---------------------------±------------------±-------------------------------±----------±-------±------±-------±---------------±-----------------------------------------------------+
5 rows in set (0.00 sec)
3.查看容灾任务的执行情况
CDB_OB_LS_REPLICA_TASK_HISTORY查看业务租户触发的locality 变更是是一致的。
4.查看业务日志流信息
—GV$OB_LOG_STAT业务租户的日志流副本也没有问题
select * from GV$OB_LOG_STAT where tenant_id = 1002;
±----------±------±-----------±---------±---------±------------±------------------------------------±------------±------------------------------------±------------------±--------±-------------±-------------±--------------------±-------------±--------------------±-------------±--------------------±-------------------±--------------±-------------+
| TENANT_ID | LS_ID | SVR_IP | SVR_PORT | ROLE | PROPOSAL_ID | CONFIG_VERSION | ACCESS_MODE | PAXOS_MEMBER_LIST | PAXOS_REPLICA_NUM | IN_SYNC | BASE_LSN | BEGIN_LSN | BEGIN_SCN | END_LSN | END_SCN | MAX_LSN | MAX_SCN | ARBITRATION_MEMBER | DEGRADED_LIST | LEARNER_LIST |
±----------±------±-----------±---------±---------±------------±------------------------------------±------------±------------------------------------±------------------±--------±-------------±-------------±--------------------±-------------±--------------------±-------------±--------------------±-------------------±--------------±-------------+
| 1002 | 1 | 10.0.22.44 | 2882 | LEADER | 2 | {proposal_id:2, config_seq:10} | APPEND | 10.0.22.44:2882:1,10.0.22.45:2882:1 | 2 | YES | 12951220224 | 9595981824 | 1775512884197494001 | 13078387994 | 1776913978289095005 | 13078387994 | 1776913978289095005 | | | |
| 1002 | 1002 | 10.0.22.44 | 2882 | FOLLOWER | 7 | {proposal_id:7, config_seq:2480380} | APPEND | 10.0.22.44:2882:1,10.0.22.45:2882:1 | 2 | YES | 62944272384 | 61199548416 | 1775491057178946005 | 62975558258 | 1776913978601867001 | 62975558380 | 1776913978789769000 | | | |
| 1002 | 1001 | 10.0.22.44 | 2882 | LEADER | 2 | {proposal_id:2, config_seq:45} | APPEND | 10.0.22.44:2882:1,10.0.22.45:2882:1 | 2 | YES | 159910662144 | 158233042944 | 1775502598237335002 | 159934239588 | 1776913978289095005 | 159934239588 | 1776913978289095005 | | | |
| 1002 | 1 | 10.0.22.45 | 2882 | FOLLOWER | 2 | {proposal_id:2, config_seq:10} | APPEND | 10.0.22.44:2882:1,10.0.22.45:2882:1 | 2 | YES | 12951220224 | 9595981824 | 1775512884197494001 | 13078387994 | 1776913978289095005 | 13078387994 | 1776913978289095005 | | | |
| 1002 | 1002 | 10.0.22.45 | 2882 | LEADER | 7 | {proposal_id:7, config_seq:2480380} | APPEND | 10.0.22.44:2882:1,10.0.22.45:2882:1 | 2 | YES | 62944272384 | 61199548416 | 1775491057178946005 | 62975558380 | 1776913978789769000 | 62975558380 | 1776913978789769000 | | | |
| 1002 | 1001 | 10.0.22.45 | 2882 | FOLLOWER | 2 | {proposal_id:2, config_seq:45} | APPEND | 10.0.22.44:2882:1,10.0.22.45:2882:1 | 2 | YES | 159910662144 | 158233042944 | 1775502598237335002 | 159934239588 | 1776913978289095005 | 159934239588 | 1776913978289095005 | | | |
±----------±------±-----------±---------±---------±------------±------------------------------------±------------±------------------------------------±------------------±--------±-------------±-------------±--------------------±-------------±--------------------±-------------±--------------------±-------------------±--------------±-------------+
6 rows in set (0.01 sec)
5.最后meta租户的日志信息
—日志流成员列表不一致。
mysql> select * from GV$OB_LOG_STAT where tenant_id = 1001;
±----------±------±-----------±---------±---------±------------±------------------------------±------------±------------------------------------------------------±------------------±--------±------------±------------±--------------------±------------±--------------------±------------±--------------------±-------------------±--------------±-------------+
| TENANT_ID | LS_ID | SVR_IP | SVR_PORT | ROLE | PROPOSAL_ID | CONFIG_VERSION | ACCESS_MODE | PAXOS_MEMBER_LIST | PAXOS_REPLICA_NUM | IN_SYNC | BASE_LSN | BEGIN_LSN | BEGIN_SCN | END_LSN | END_SCN | MAX_LSN | MAX_SCN | ARBITRATION_MEMBER | DEGRADED_LIST | LEARNER_LIST |
±----------±------±-----------±---------±---------±------------±------------------------------±------------±------------------------------------------------------±------------------±--------±------------±------------±--------------------±------------±--------------------±------------±--------------------±-------------------±--------------±-------------+
| 1001 | 1 | 10.0.22.44 | 2882 | LEADER | 2 | {proposal_id:2, config_seq:3} | APPEND | 10.0.22.44:2882:1,10.0.22.45:2882:1,10.0.22.46:2882:1 | 3 | YES | 92336160768 | 91665113088 | 1776872839694594002 | 92577092013 | 1776927939939242000 | 92577092013 | 1776927939939242000 | | | |
| 1001 | 1 | 10.0.22.45 | 2882 | FOLLOWER | 2 | {proposal_id:2, config_seq:3} | APPEND | 10.0.22.44:2882:1,10.0.22.45:2882:1,10.0.22.46:2882:1 | 3 | YES | 92336160768 | 91665113088 | 1776872839694594002 | 92577092013 | 1776927939939242000 | 92577092013 | 1776927939939242000 | | | |
±----------±------±-----------±---------±---------±------------±------------------------------±------------±------------------------------------------------------±------------------±--------±------------±------------±--------------------±------------±--------------------±------------±--------------------±-------------------±--------------±-------------+
2 rows in set (0.00 sec)
6.执行移除日志流的副本
—ALTER SYSTEM REMOVE REPLICA LS = 1 SERVER = ‘10.0.22.46:2882’ PAXOS_REPLICA_NUM = 2 TENANT = ‘META$1002’;
7.查看meta日志流
—成员列表一致了。
mysql> select * from GV$OB_LOG_STAT where tenant_id = 1001;
±----------±------±-----------±---------±---------±------------±------------------------------±------------±------------------------------------±------------------±--------±------------±------------±--------------------±------------±--------------------±------------±--------------------±-------------------±--------------±-------------+
| TENANT_ID | LS_ID | SVR_IP | SVR_PORT | ROLE | PROPOSAL_ID | CONFIG_VERSION | ACCESS_MODE | PAXOS_MEMBER_LIST | PAXOS_REPLICA_NUM | IN_SYNC | BASE_LSN | BEGIN_LSN | BEGIN_SCN | END_LSN | END_SCN | MAX_LSN | MAX_SCN | ARBITRATION_MEMBER | DEGRADED_LIST | LEARNER_LIST |
±----------±------±-----------±---------±---------±------------±------------------------------±------------±------------------------------------±------------------±--------±------------±------------±--------------------±------------±--------------------±------------±--------------------±-------------------±--------------±-------------+
| 1001 | 1 | 10.0.22.44 | 2882 | LEADER | 2 | {proposal_id:2, config_seq:4} | APPEND | 10.0.22.44:2882:1,10.0.22.45:2882:1 | 2 | YES | 92336160768 | 91665113088 | 1776872839694594002 | 92583039483 | 1776928597910714000 | 92583039483 | 1776928597910714000 | | | |
| 1001 | 1 | 10.0.22.45 | 2882 | FOLLOWER | 2 | {proposal_id:2, config_seq:4} | APPEND | 10.0.22.44:2882:1,10.0.22.45:2882:1 | 2 | YES | 92336160768 | 91665113088 | 1776872839694594002 | 92583038892 | 1776928597810026000 | 92583039483 | 1776928597910714000 | | | |
±----------±------±-----------±---------±---------±------------±------------------------------±------------±------------------------------------±------------------±--------±------------±------------±--------------------±------------±--------------------±------------±--------------------±-------------------±--------------±-------------+
2 rows in set (0.00 sec)
6.查看租户的任务
—DBA_OB_TENANT_JOBS的JOB_STATUS变成了SUCCESS了