OB 4.2.4_CE_HF1 reports the error "inner tables are unmatched"

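【Environment】 POC test environment
【OB or other component】 OB
【Version】 4.2.4_CE_HF1
【Problem description】 observer.log keeps reporting this error: [2024-09-14 15:17:33.739754] ERROR [RS] check_sys_table_schemas_ (ob_root_inspection.cpp:1213) [19231][RSAsyncTask3][T0][Y0-0000000000000000-0-0] [lt=53][errcode=-4754] root inspection is not passed(msg="inner tables are unmatched", ret=-4029, ret="OB_SCHEMA_ERROR", tenant_id=1004)
Database access is otherwise normal.
【Steps to reproduce】 The environment is being migrated from MySQL to OB with OMS and is currently in the real-time synchronization phase.
【Attachments and logs】 The observer and rootservice logs, filtered with grep Y0-0000000000000000-0-0, are attached: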
obyroot.log.gz (2.1 MB)
oby.log.gz (1.8 MB)
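
To see how often the error quoted in the problem description occurs and for which tenants, the log can be summarized with a short script. This is only a sketch: the regular expression is written against the single log line quoted in this thread, and the function name `count_inspection_failures` is made up for illustration.

```python
import re
from collections import Counter

# Pattern written against the one log line quoted in this thread;
# only fields visible in that line are extracted.
INSPECTION_RE = re.compile(
    r"\[errcode=(?P<errcode>-?\d+)\].*root inspection is not passed"
    r".*?ret=(?P<ret>-?\d+).*?tenant_id=(?P<tenant_id>\d+)"
)

def count_inspection_failures(lines):
    """Count matching lines, keyed by (tenant_id, errcode, ret)."""
    counts = Counter()
    for line in lines:
        m = INSPECTION_RE.search(line)
        if m:
            key = (int(m["tenant_id"]), int(m["errcode"]), int(m["ret"]))
            counts[key] += 1
    return counts

# The log line from the problem description, with straight quotes.
sample = (
    '[2024-09-14 15:17:33.739754] ERROR [RS] check_sys_table_schemas_ '
    '(ob_root_inspection.cpp:1213) [19231][RSAsyncTask3][T0]'
    '[Y0-0000000000000000-0-0] [lt=53][errcode=-4754] root inspection is not '
    'passed(msg="inner tables are unmatched", ret=-4029, '
    'ret="OB_SCHEMA_ERROR", tenant_id=1004)'
)
print(count_inspection_failures([sample]))
```

For a real log, pass an open file instead of the sample list, for example `count_inspection_failures(open("observer.log", errors="replace"))`.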

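Please check the OMS component monitoring and post a screenshot.

Also run these two queries on OB and post the results; remember to mask sensitive information.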
select * from __all_virtual_server_schema_info;
select * from __all_server;

±-------------±---------±----------±-------------------------±------------------------±-------------±------------±---------------------------+
| svr_ip | svr_port | tenant_id | refreshed_schema_version | received_schema_version | schema_count | schema_size | min_sstable_schema_version |
±-------------±---------±----------±-------------------------±------------------------±-------------±------------±---------------------------+
| 172.16.xx.xx | 2882 | 1004 | 1726284771831128 | 1726284771831128 | 2224 | 2529947 | -1 |
| 172.16.xx.xx | 2882 | 1004 | 1726284771831128 | 1726284771831128 | 2224 | 2529947 | -1 |
| 172.16.xx.xx | 2882 | 1004 | 1726284771831128 | 1726284771831128 | 2224 | 2529947 | -1 |
±-------------±---------±----------±-------------------------±------------------------±-------------±------------±---------------------------+

obclient [oceanbase]> select * from __all_server;
ERROR 1146 (42S02): Table 'oceanbase.__all_server' doesn't exist
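
One thing the `__all_virtual_server_schema_info` output makes easy to check is whether any server has a `refreshed_schema_version` that differs from its `received_schema_version`. The sketch below does that comparison on rows copied from the output above. The helper name `mismatched_servers` is hypothetical, and treating a mismatch as a sign that the schema has not caught up is an assumption, not something confirmed in this thread.

```python
def mismatched_servers(rows):
    """Return rows whose refreshed and received schema versions differ."""
    return [
        r for r in rows
        if r["refreshed_schema_version"] != r["received_schema_version"]
    ]

# Values copied from the query output above (IP addresses are masked there).
row = {
    "svr_ip": "172.16.xx.xx",
    "svr_port": 2882,
    "tenant_id": 1004,
    "refreshed_schema_version": 1726284771831128,
    "received_schema_version": 1726284771831128,
}
rows = [dict(row) for _ in range(3)]

# An empty result means every server reports identical versions.
print(mismatched_servers(rows))
```

With the values posted here the result is empty, so the three servers agree on the schema version for tenant 1004.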

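Query this view instead: select * from DBA_OB_SERVERS;

This is not an OMS problem; it is most likely an OB problem. Please provide the information first so that a colleague can investigate.

Run the queries in the sys tenant, and please run the following two as well: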

select * from dba_ob_servers;

select * from dba_ob_zones;

obclient [oceanbase]> select * from dba_ob_servers;
+--------------+----------+----+-------+----------+-----------------+--------+----------------------------+-----------+-----------------------+----------------------------+----------------------------+-------------------------------------------------------------------------------------------+-------------------+
| SVR_IP       | SVR_PORT | ID | ZONE  | SQL_PORT | WITH_ROOTSERVER | STATUS | START_SERVICE_TIME         | STOP_TIME | BLOCK_MIGRATE_IN_TIME | CREATE_TIME                | MODIFY_TIME                | BUILD_VERSION                                                                             | LAST_OFFLINE_TIME |
+--------------+----------+----+-------+----------+-----------------+--------+----------------------------+-----------+-----------------------+----------------------------+----------------------------+-------------------------------------------------------------------------------------------+-------------------+
| 172.16.xx.xx |     2882 |  1 | zone1 |     2881 | YES             | ACTIVE | 2024-09-13 17:29:19.454893 | NULL      | NULL                  | 2024-09-13 17:12:56.276053 | 2024-09-13 17:29:25.224687 | 4.2.4.0_100010022024091012-0e8ca8d9363eb5d5fbb56e9ed0159b949c21dc80(Sep 10 2024 13:57:03) | NULL              |
| 172.16.xx.xx |     2882 |  2 | zone2 |     2881 | NO              | ACTIVE | 2024-09-13 17:30:17.290526 | NULL      | NULL                  | 2024-09-13 17:12:56.307070 | 2024-09-13 17:30:19.399047 | 4.2.4.0_100010022024091012-0e8ca8d9363eb5d5fbb56e9ed0159b949c21dc80(Sep 10 2024 13:57:03) | NULL              |
| 172.16.xx.xx |     2882 |  3 | zone3 |     2881 | NO              | ACTIVE | 2024-09-13 17:31:08.623663 | NULL      | NULL                  | 2024-09-13 17:12:56.343110 | 2024-09-13 17:31:09.574727 | 4.2.4.0_100010022024091012-0e8ca8d9363eb5d5fbb56e9ed0159b949c21dc80(Sep 10 2024 13:57:03) | NULL              |
+--------------+----------+----+-------+----------+-----------------+--------+----------------------------+-----------+-----------------------+----------------------------+----------------------------+-------------------------------------------------------------------------------------------+-------------------+

obclient [oceanbase]> select * from dba_ob_zones;
+-------+----------------------------+----------------------------+--------+-----+------------+-----------+
| ZONE  | CREATE_TIME                | MODIFY_TIME                | STATUS | IDC | REGION     | TYPE      |
+-------+----------------------------+----------------------------+--------+-----+------------+-----------+
| zone1 | 2024-09-13 17:12:55.537426 | 2024-09-13 17:29:21.239638 | ACTIVE |     | sys_region | ReadWrite |
| zone2 | 2024-09-13 17:12:55.537426 | 2024-09-13 17:30:22.680200 | ACTIVE |     | sys_region | ReadWrite |
| zone3 | 2024-09-13 17:12:55.537426 | 2024-09-13 17:31:10.308711 | ACTIVE |     | sys_region | ReadWrite |
+-------+----------------------------+----------------------------+--------+-----+------------+-----------+
3 rows in set (0.227 sec)

We have asked the R&D team to investigate this issue and will reply as soon as there is progress.

Was this cluster upgraded from an earlier version?

Please check the cluster's upgrade history:
select * from DBA_OB_CLUSTER_EVENT_HISTORY where module like '%upgrade%';

The statement returned no rows:
obclient [oceanbase]> select * from DBA_OB_CLUSTER_EVENT_HISTORY where module like '%upgrade%';
Empty set (0.002 sec)

Mine should count as an upgrade. The package md5 could not be specified at deployment time,
so I upgraded like this:
obd cluster upgrade ob_poc -c oceanbase-ce -V 4.2.4.0 --usable=e6c0a15b9aba27db858c1d336b898c57fce93c0b

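The same version has two packages with different md5 values. The HF1 build was not selected at deployment, so I upgraded by specifying the md5.

Was the upgrade from OB 4.2.4_CE to OB 4.2.4_CE_HF1?

Yes.

This bug is what prompted the upgrade.

Please locate the upgrade logs: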

upgrade_checker.log
upgrade_cluster_health_checker.log
upgrade_post.log
upgrade_pre.log

upgrade_checker.log (7.1 KB)
upgrade_cluster_health_checker.log (2.6 KB)
I could only find these two logs.
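
To confirm which of the four requested upgrade logs are present in a directory, a small check like the following can help. It is a minimal sketch: the function name `missing_upgrade_logs` is made up, and the directory to scan is an assumption because this thread does not say where the logs are stored. The demo uses a temporary directory containing only the two files that were found.

```python
import tempfile
from pathlib import Path

# The four log names requested in this thread.
EXPECTED_LOGS = (
    "upgrade_checker.log",
    "upgrade_cluster_health_checker.log",
    "upgrade_post.log",
    "upgrade_pre.log",
)

def missing_upgrade_logs(log_dir):
    """Return the expected log names that are not files in log_dir."""
    base = Path(log_dir)
    return [name for name in EXPECTED_LOGS if not (base / name).is_file()]

# Demo: recreate the situation reported here, with only two logs present.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("upgrade_checker.log", "upgrade_cluster_health_checker.log"):
        (Path(tmp) / name).touch()
    print(missing_upgrade_logs(tmp))
```

In the demo the function reports `upgrade_post.log` and `upgrade_pre.log` as missing, which matches the two logs that could not be found.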