OBD upgrade of OceanBase 4.2.1 fails and stays stuck in "upgrading"; OCP cannot be opened

[Environment] Production
Internal production project, OceanBase_CE 4.2.1.0 (r100000102023092807-7b0f43693565654bb1d7343f728bc2013dfff959) (Built Sep 28 2023 07:25:28)

[OB or other components]
obd --version
OceanBase Deploy: 2.4.1
REVISION: 955d1eab27a5bd304669b6280c88dc4102c07bb4

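[Problem description]

  • Upgrading from 4.2.1 to 4.2.2.0 with OBD from the command line hangs with no response. It has been
    stuck since the upgrade was started on Feb 20 (the cluster can still read and write normally).
  • The OCP management platform cannot be opened (possibly related to the stuck upgrade):
    http://ob-ocp-xxxx.dmall.com/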
2024-02-20 15:59:03        obd cluster upgrade oceanbase41 -c oceanbase-ce -V 4.2.2.0 --usable=d687aabed34f610040c70cd8aa4f256f9a909564bcdb12e1bcbf83224c865fab

The earlier OBD upgrade from 4.1.0 to 4.2.1 went through without problems:

2023-10-25 20:37:27 obd cluster upgrade oceanbase41 -c oceanbase-ce -V 4.2.1.0 --usable=8f0cac8e81aaef587efb774b5de3cd98876dc196ccb8f2eca7bcd252f48ffb4a

obd cluster display oceanbase41
Deploy "oceanbase41" is upgrading
See https://www.oceanbase.com/product/ob-deployer/error-codes .
Trace ID: 91b2297e-d779-11ee-924c-525400b51421
If you want to view detailed obd logs, please run: obd display-trace 91b2297e-d779-11ee-924c-525400b51421

obd display-trace c1279bda-d779-11ee-a1db-525400b51421
[2024-03-01 11:14:02.668] [DEBUG] - cmd: ['oceanbase41']
[2024-03-01 11:14:02.668] [DEBUG] - opts: {'components': 'oceanbase-ce', 'style': 'cluster'}
[2024-03-01 11:14:02.668] [DEBUG] - mkdir /root/.obd/lock/
[2024-03-01 11:14:02.668] [DEBUG] - unknown lock mode 
[2024-03-01 11:14:02.668] [DEBUG] - try to get share lock /root/.obd/lock/global
[2024-03-01 11:14:02.668] [DEBUG] - share lock `/root/.obd/lock/global`, count 1
[2024-03-01 11:14:02.668] [DEBUG] - Get Deploy by name
[2024-03-01 11:14:02.668] [DEBUG] - mkdir /root/.obd/cluster/
[2024-03-01 11:14:02.669] [DEBUG] - mkdir /root/.obd/config_parser/
[2024-03-01 11:14:02.669] [DEBUG] - try to get exclusive lock /root/.obd/lock/deploy_oceanbase41
[2024-03-01 11:14:02.669] [DEBUG] - exclusive lock `/root/.obd/lock/deploy_oceanbase41`, count 1
[2024-03-01 11:14:02.675] [DEBUG] - Deploy config status judge
[2024-03-01 11:14:02.675] [ERROR] Deploy oceanbase41 need reload
[2024-03-01 11:14:02.675] [INFO] See https://www.oceanbase.com/product/ob-deployer/error-codes .
[2024-03-01 11:14:02.675] [INFO] Trace ID: c1279bda-d779-11ee-a1db-525400b51421
[2024-03-01 11:14:02.675] [INFO] If you want to view detailed obd logs, please run: obd display-trace c1279bda-d779-11ee-a1db-525400b51421
[2024-03-01 11:14:02.675] [DEBUG] - exclusive lock /root/.obd/lock/deploy_oceanbase41 release, count 0
[2024-03-01 11:14:02.675] [DEBUG] - unlock /root/.obd/lock/deploy_oceanbase41
[2024-03-01 11:14:02.675] [DEBUG] - share lock /root/.obd/lock/global release, count 0
[2024-03-01 11:14:02.675] [DEBUG] - unlock /root/.obd/lock/global

obd cluster reload oceanbase41

[ERROR] Deploy "oceanbase41" is upgrading. You could not reload an upgrading cluster.
See https://www.oceanbase.com/product/ob-deployer/error-codes .
Trace ID: f9efd25a-d780-11ee-aec4-525400b51421
If you want to view detailed obd logs, please run: obd display-trace f9efd25a-d780-11ee-aec4-525400b51421

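The xxx.210 machine went through hardware maintenance before the upgrade and was rebooted; observer and obproxy were then started by hand. (I strongly suggest that OceanBase ship an official systemd-based auto-start service; another database does exactly this.)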

The observer node on 210 shows a normal status when queried.

The 210 observer has no abnormal error logs:
grep ' ERROR ' /home/admin/oceanbase41/oceanbase/log/observer.log

ps -aux | egrep -w 'observer'
admin     70881  191 20.1 55220288 53195956 ?   Ssl  Feb19 28848:36 ./bin/observer

*************************** 3. row ***************************
           gmt_create: 2023-07-04 17:50:14.605122
         gmt_modified: 2024-02-19 23:47:37.050818
               svr_ip: xxx.210
             svr_port: 2882
                   id: 3
                 zone: zone3
           inner_port: 2881
      with_rootserver: 0
               status: ACTIVE
block_migrate_in_time: 0
        build_version: 4.2.1.0_100000102023092807-7b0f43693565654bb1d7343f728bc2013dfff959(Sep 28 2023 07:25:28)
            stop_time: 0
   start_service_time: 1708357655340830
         first_sessid: 0
       with_partition: 1
    last_offline_time: 0
3 rows in set (0.01 sec)
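
For anyone comparing node start times: `start_service_time` in the row above is a raw integer. Below is a minimal sketch for turning it into a readable time. It assumes the value is microseconds since the Unix epoch and that the cluster clock is UTC+8; both are assumptions on my part, although the result lines up with `gmt_modified` in the same row. The function name is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the cluster runs in UTC+8.
CLUSTER_TZ = timezone(timedelta(hours=8))
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def start_service_time_to_datetime(microseconds):
    """Convert a microsecond epoch value to an aware datetime in CLUSTER_TZ.

    Integer microseconds go through timedelta, so no float rounding is involved.
    """
    return (EPOCH + timedelta(microseconds=microseconds)).astimezone(CLUSTER_TZ)


# Value copied from the xxx.210 row above.
print(start_service_time_to_datetime(1708357655340830).isoformat())
```

Running the same conversion on the other two rows makes it easy to see which observers were restarted around the upgrade attempt.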

OCP logs inside Docker:
docker exec -it 0f6e907ac7f6 /bin/bash

A large number of error log entries:
grep 'Operation not allowed' logs/ocp-server.0.err | wc -l
30681

less logs/ocp-server.0.err

Caused by: java.lang.IllegalArgumentException: Cannot instantiate interface org.springframework.boot.SpringApplicationRunListener : com.oceanbase.ocp.bootstrap.spring.BootstrapRunListener
        at org.springframework.boot.SpringApplication.createSpringFactoriesInstances(SpringApplication.java:461)
        at org.springframework.boot.SpringApplication.getSpringFactoriesInstances(SpringApplication.java:443)
        at org.springframework.boot.SpringApplication.getRunListeners(SpringApplication.java:431)
        at org.springframework.boot.SpringApplication.run(SpringApplication.java:297)
        at org.springframework.boot.SpringApplication.run(SpringApplication.java:1317)
        at org.springframework.boot.SpringApplication.run(SpringApplication.java:1306)
        at com.oceanbase.ocp.OcpServerApplication.main(OcpServerApplication.java:21)
        ... 8 more
Caused by: org.springframework.beans.BeanInstantiationException: Failed to instantiate [com.oceanbase.ocp.bootstrap.spring.BootstrapRunListener]: Constructor threw exception; nested exception is java.lang.IllegalStateException: init distributed_lock table failed
        at org.springframework.beans.BeanUtils.instantiateClass(BeanUtils.java:224)
        at org.springframework.boot.SpringApplication.createSpringFactoriesInstances(SpringApplication.java:457)
        ... 14 more
Caused by: java.lang.IllegalStateException: init distributed_lock table failed
        at com.oceanbase.ocp.bootstrap.hooks.BootstrapLock.init(BootstrapLock.java:69)
        at com.oceanbase.ocp.bootstrap.hooks.BootstrapLock.tryLock(BootstrapLock.java:83)
        at com.oceanbase.ocp.bootstrap.hooks.OCPInitializer.initialize(OCPInitializer.java:58)
        at com.oceanbase.ocp.bootstrap.hooks.OCPInitializer.initialize(OCPInitializer.java:115)
        at com.oceanbase.ocp.bootstrap.spring.BootstrapRunListener.<init>(BootstrapRunListener.java:56)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.springframework.beans.BeanUtils.instantiateClass(BeanUtils.java:211)

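[Suggestions]

  • I strongly suggest that OceanBase ship an official systemd-based auto-start service. Without a standard, it is hard for users to build auto-start themselves.
  • The official documentation should say more about emergency recovery from the command line.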

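Try running the obd command for the 4.2.1 to 4.2.2 upgrade once more. obd can resume an upgrade from the point where it stopped, so this should show where it fails.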

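Also, your 4.2.0-to-4.2.1 upgrade and your 4.2.1-to-4.2.2 upgrade do not look like the same cluster. It looks as if the 4.2.2 upgrade was run against the obtest cluster, while the cluster that is stuck is oceanbase41?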

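It is the same cluster. The command I pasted first was from the test cluster, so I have re-pasted the correct commands. The cluster was first deployed last year on 4.1.0 and later upgraded to 4.2.1. Upgrade history query:
select * from oceanbase.DBA_OB_CLUSTER_EVENT_HISTORY where module='upgrade';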

Running the upgrade again reports an error:


obd display-trace 91f994e4-d797-11ee-8955-525400b51421

[2024-03-01 14:47:28.415] [DEBUG] - cmd: ['oceanbase41']
[2024-03-01 14:47:28.415] [DEBUG] - opts: {'component': 'oceanbase-ce', 'version': '4.2.2.0', 'skip_check': None, 'usable': 'd687aabed34f610040c70cd8aa4f256f9a909564bcdb12e1bcbf83224c865fab', 'disable': '', 'executer_path': '/usr/obd/lib/executer', 'script_query_timeout': '', 'ignore_standby': None}
[2024-03-01 14:47:28.416] [DEBUG] - mkdir /root/.obd/lock/
[2024-03-01 14:47:28.416] [DEBUG] - unknown lock mode
[2024-03-01 14:47:28.416] [DEBUG] - try to get share lock /root/.obd/lock/global
[2024-03-01 14:47:28.416] [DEBUG] - share lock /root/.obd/lock/global, count 1
[2024-03-01 14:47:28.417] [DEBUG] - Get Deploy by name
[2024-03-01 14:47:28.417] [DEBUG] - mkdir /root/.obd/cluster/
[2024-03-01 14:47:28.417] [DEBUG] - mkdir /root/.obd/config_parser/
[2024-03-01 14:47:28.417] [DEBUG] - try to get exclusive lock /root/.obd/lock/deploy_oceanbase41
[2024-03-01 14:47:28.417] [DEBUG] - exclusive lock /root/.obd/lock/deploy_oceanbase41, count 1
[2024-03-01 14:47:28.423] [DEBUG] - Deploy status judge
[2024-03-01 14:47:28.446] [DEBUG] - load config parser: /root/.obd/config_parser/oceanbase-ce/cluster_config_parser.py
[2024-03-01 14:47:28.446] [DEBUG] - import cluster_config_parser
[2024-03-01 14:47:28.448] [DEBUG] - add cluster_config_parser ref count to 1
[2024-03-01 14:47:28.454] [INFO] Get local repositories and plugins
[2024-03-01 14:47:28.454] [DEBUG] - mkdir /root/.obd/repository
[2024-03-01 14:47:28.454] [DEBUG] - Get local repository oceanbase-ce-4.2.1.0-a8b9979de1f2809d74de71b2a536cff8aab15bff
[2024-03-01 14:47:28.455] [DEBUG] - try to get share lock /root/.obd/lock/mirror_and_repo
[2024-03-01 14:47:28.455] [DEBUG] - share lock /root/.obd/lock/mirror_and_repo, count 1
[2024-03-01 14:47:28.456] [DEBUG] - Get local repository obproxy-ce-4.2.1.0-0aed4b782120e4248b749f67be3d2cc82cdcb70d
[2024-03-01 14:47:28.456] [DEBUG] - share lock /root/.obd/lock/mirror_and_repo, count 2
[2024-03-01 14:47:28.458] [DEBUG] - Get local repository obagent-1.3.1-ccfe93272a79ab1073e76f00580386c9c52e8324
[2024-03-01 14:47:28.458] [DEBUG] - share lock /root/.obd/lock/mirror_and_repo, count 3
[2024-03-01 14:47:28.461] [DEBUG] - Get local repository ocp-express-1.0.1-59eb8062858271a23080c824c98a72f9e5896235
[2024-03-01 14:47:28.461] [DEBUG] - share lock /root/.obd/lock/mirror_and_repo, count 4
[2024-03-01 14:47:28.464] [DEBUG] - Searching param plugin for components …
[2024-03-01 14:47:28.464] [DEBUG] - Search param plugin for oceanbase-ce
[2024-03-01 14:47:28.464] [DEBUG] - mkdir /root/.obd/plugins
[2024-03-01 14:47:28.468] [DEBUG] - Found for oceanbase-ce-param-4.2.0.0 for oceanbase-ce-4.2.1.0
[2024-03-01 14:47:28.468] [DEBUG] - Applying oceanbase-ce-param-4.2.0.0 for oceanbase-ce-4.2.1.0-100000102023092807.el7-a8b9979de1f2809d74de71b2a536cff8aab15bff
[2024-03-01 14:47:29.007] [DEBUG] - Search param plugin for obproxy-ce
[2024-03-01 14:47:29.008] [DEBUG] - Found for obproxy-ce-param-3.1.0 for obproxy-ce-4.2.1.0
[2024-03-01 14:47:29.008] [DEBUG] - Applying obproxy-ce-param-3.1.0 for obproxy-ce-4.2.1.0-11.el7-0aed4b782120e4248b749f67be3d2cc82cdcb70d
[2024-03-01 14:47:29.136] [DEBUG] - Search param plugin for obagent
[2024-03-01 14:47:29.140] [DEBUG] - Found for obagent-param-1.3.0 for obagent-1.3.1
[2024-03-01 14:47:29.140] [DEBUG] - Applying obagent-param-1.3.0 for obagent-1.3.1-5.el7-ccfe93272a79ab1073e76f00580386c9c52e8324
[2024-03-01 14:47:29.210] [DEBUG] - Search param plugin for ocp-express
[2024-03-01 14:47:29.225] [DEBUG] - Found for ocp-express-param-1.0.1 for ocp-express-1.0.1
[2024-03-01 14:47:29.225] [DEBUG] - Applying ocp-express-param-1.0.1 for ocp-express-1.0.1-100000072023051917.el7-59eb8062858271a23080c824c98a72f9e5896235
[2024-03-01 14:47:29.403] [DEBUG] - Searching get_standbys plugin for components …
[2024-03-01 14:47:29.403] [DEBUG] - Searching get_standbys plugin for oceanbase-ce-4.2.1.0-100000102023092807.el7-a8b9979de1f2809d74de71b2a536cff8aab15bff
[2024-03-01 14:47:29.404] [DEBUG] - Found for oceanbase-ce-py_script_get_standbys-4.2.0.0 for oceanbase-ce-4.2.1.0
[2024-03-01 14:47:29.404] [DEBUG] - Searching get_standbys plugin for obproxy-ce-4.2.1.0-11.el7-0aed4b782120e4248b749f67be3d2cc82cdcb70d
[2024-03-01 14:47:29.404] [DEBUG] - No such get_standbys plugin for obproxy-ce-4.2.1.0
[2024-03-01 14:47:29.404] [DEBUG] - Searching get_standbys plugin for obagent-1.3.1-5.el7-ccfe93272a79ab1073e76f00580386c9c52e8324
[2024-03-01 14:47:29.405] [DEBUG] - No such get_standbys plugin for obagent-1.3.1
[2024-03-01 14:47:29.405] [DEBUG] - Searching get_standbys plugin for ocp-express-1.0.1-100000072023051917.el7-59eb8062858271a23080c824c98a72f9e5896235
[2024-03-01 14:47:29.405] [DEBUG] - No such get_standbys plugin for ocp-express-1.0.1
[2024-03-01 14:47:29.405] [DEBUG] - Searching get_relation_tenants plugin for components …
[2024-03-01 14:47:29.405] [DEBUG] - Searching get_relation_tenants plugin for oceanbase-ce-4.2.1.0-100000102023092807.el7-a8b9979de1f2809d74de71b2a536cff8aab15bff
[2024-03-01 14:47:29.405] [DEBUG] - Found for oceanbase-ce-py_script_get_relation_tenants-4.2.0.0 for oceanbase-ce-4.2.1.0
[2024-03-01 14:47:29.405] [DEBUG] - Searching get_relation_tenants plugin for obproxy-ce-4.2.1.0-11.el7-0aed4b782120e4248b749f67be3d2cc82cdcb70d
[2024-03-01 14:47:29.405] [DEBUG] - No such get_relation_tenants plugin for obproxy-ce-4.2.1.0
[2024-03-01 14:47:29.405] [DEBUG] - Searching get_relation_tenants plugin for obagent-1.3.1-5.el7-ccfe93272a79ab1073e76f00580386c9c52e8324
[2024-03-01 14:47:29.406] [DEBUG] - No such get_relation_tenants plugin for obagent-1.3.1
[2024-03-01 14:47:29.406] [DEBUG] - Searching get_relation_tenants plugin for ocp-express-1.0.1-100000072023051917.el7-59eb8062858271a23080c824c98a72f9e5896235
[2024-03-01 14:47:29.406] [DEBUG] - No such get_relation_tenants plugin for ocp-express-1.0.1
[2024-03-01 14:47:29.406] [DEBUG] - Searching get_deployment_connections plugin for components …
[2024-03-01 14:47:29.406] [DEBUG] - Searching get_deployment_connections plugin for oceanbase-ce-4.2.1.0-100000102023092807.el7-a8b9979de1f2809d74de71b2a536cff8aab15bff
[2024-03-01 14:47:29.406] [DEBUG] - Found for oceanbase-ce-py_script_get_deployment_connections-4.2.0.0 for oceanbase-ce-4.2.1.0
[2024-03-01 14:47:29.406] [DEBUG] - Searching get_deployment_connections plugin for obproxy-ce-4.2.1.0-11.el7-0aed4b782120e4248b749f67be3d2cc82cdcb70d
[2024-03-01 14:47:29.406] [DEBUG] - No such get_deployment_connections plugin for obproxy-ce-4.2.1.0
[2024-03-01 14:47:29.406] [DEBUG] - Searching get_deployment_connections plugin for obagent-1.3.1-5.el7-ccfe93272a79ab1073e76f00580386c9c52e8324
[2024-03-01 14:47:29.407] [DEBUG] - No such get_deployment_connections plugin for obagent-1.3.1
[2024-03-01 14:47:29.407] [DEBUG] - Searching get_deployment_connections plugin for ocp-express-1.0.1-100000072023051917.el7-59eb8062858271a23080c824c98a72f9e5896235
[2024-03-01 14:47:29.407] [DEBUG] - No such get_deployment_connections plugin for ocp-express-1.0.1
[2024-03-01 14:47:29.407] [DEBUG] - Searching connect plugin for components …
[2024-03-01 14:47:29.407] [DEBUG] - Searching connect plugin for oceanbase-ce-4.2.1.0-100000102023092807.el7-a8b9979de1f2809d74de71b2a536cff8aab15bff
[2024-03-01 14:47:29.407] [DEBUG] - Found for oceanbase-ce-py_script_connect-3.1.0 for oceanbase-ce-4.2.1.0
[2024-03-01 14:47:29.407] [DEBUG] - Searching connect plugin for obproxy-ce-4.2.1.0-11.el7-0aed4b782120e4248b749f67be3d2cc82cdcb70d
[2024-03-01 14:47:29.407] [DEBUG] - Found for obproxy-ce-py_script_connect-3.1.0 for obproxy-ce-4.2.1.0
[2024-03-01 14:47:29.407] [DEBUG] - Searching connect plugin for obagent-1.3.1-5.el7-ccfe93272a79ab1073e76f00580386c9c52e8324
[2024-03-01 14:47:29.408] [DEBUG] - Found for obagent-py_script_connect-1.3.0 for obagent-1.3.1
[2024-03-01 14:47:29.408] [DEBUG] - Searching connect plugin for ocp-express-1.0.1-100000072023051917.el7-59eb8062858271a23080c824c98a72f9e5896235
[2024-03-01 14:47:29.408] [DEBUG] - Found for ocp-express-py_script_connect-1.0.1 for ocp-express-1.0.1
[2024-03-01 14:47:29.408] [INFO] Open ssh connection
[2024-03-01 14:47:29.409] [DEBUG] - host: xxxx143.209, port: 22, user: admin, password: admin
[2024-03-01 14:47:29.825] [DEBUG] - host: xxxx143.208, port: 22, user: admin, password: admin
[2024-03-01 14:47:30.246] [DEBUG] - host: xxxx143.210, port: 22, user: admin, password: admin
[2024-03-01 14:47:30.842] [DEBUG] - Call oceanbase-ce-py_script_get_relation_tenants-4.2.0.0 for oceanbase-ce-4.2.1.0-100000102023092807.el7-a8b9979de1f2809d74de71b2a536cff8aab15bff
[2024-03-01 14:47:30.842] [DEBUG] - import get_relation_tenants
[2024-03-01 14:47:30.845] [DEBUG] - add get_relation_tenants ref count to 1
[2024-03-01 14:47:30.845] [DEBUG] - exclusive lock /root/.obd/lock/deploy_oceanbase41, count 2
[2024-03-01 14:47:30.875] [DEBUG] - sub get_relation_tenants ref count to 0
[2024-03-01 14:47:30.875] [DEBUG] - export get_relation_tenants
[2024-03-01 14:47:30.875] [DEBUG] - Call oceanbase-ce-py_script_get_deployment_connections-4.2.0.0 for oceanbase-ce-4.2.1.0-100000102023092807.el7-a8b9979de1f2809d74de71b2a536cff8aab15bff
[2024-03-01 14:47:30.875] [DEBUG] - import get_deployment_connections
[2024-03-01 14:47:30.877] [DEBUG] - add get_deployment_connections ref count to 1
[2024-03-01 14:47:30.878] [INFO] Get deployment connections
[2024-03-01 14:47:31.660] [DEBUG] - sub get_deployment_connections ref count to 0
[2024-03-01 14:47:31.660] [DEBUG] - export get_deployment_connections
[2024-03-01 14:47:31.661] [DEBUG] - Call oceanbase-ce-py_script_get_standbys-4.2.0.0 for oceanbase-ce-4.2.1.0-100000102023092807.el7-a8b9979de1f2809d74de71b2a536cff8aab15bff
[2024-03-01 14:47:31.661] [DEBUG] - import get_standbys
[2024-03-01 14:47:31.663] [DEBUG] - add get_standbys ref count to 1
[2024-03-01 14:47:31.663] [INFO] Get standbys info
[2024-03-01 14:47:31.794] [DEBUG] - sub get_standbys ref count to 0
[2024-03-01 14:47:31.794] [DEBUG] - export get_standbys
[2024-03-01 14:47:31.824] [DEBUG] - Search repository oceanbase-ce version: 4.2.1.0, tag: a8b9979de1f2809d74de71b2a536cff8aab15bff, release: None, package_hash: None
[2024-03-01 14:47:31.824] [DEBUG] - share lock /root/.obd/lock/mirror_and_repo, count 5
[2024-03-01 14:47:31.824] [DEBUG] - mkdir /root/.obd/repository/oceanbase-ce
[2024-03-01 14:47:31.824] [DEBUG] - Found repository oceanbase-ce-4.2.1.0-100000102023092807.el7-a8b9979de1f2809d74de71b2a536cff8aab15bff
[2024-03-01 14:47:31.824] [DEBUG] - Search repository oceanbase-ce version: 4.2.2.0, tag: aa3053da7370a6685a2ef457cd202d50e5ab75d3, release: None, package_hash: None
[2024-03-01 14:47:31.931] [DEBUG] - Found repository oceanbase-ce-4.2.2.0-100000192024011915.el7-aa3053da7370a6685a2ef457cd202d50e5ab75d3
[2024-03-01 14:47:31.931] [DEBUG] - Searching install plugin for components …
[2024-03-01 14:47:31.931] [DEBUG] - Search install plugin for oceanbase-ce
[2024-03-01 14:47:31.931] [DEBUG] - Found for oceanbase-ce-install-4.0.0.0 for oceanbase-ce-4.2.1.0
[2024-03-01 14:47:31.931] [DEBUG] - Search install plugin for oceanbase-ce
[2024-03-01 14:47:31.932] [DEBUG] - Found for oceanbase-ce-install-4.0.0.0 for oceanbase-ce-4.2.2.0
[2024-03-01 14:47:31.932] [DEBUG] - Searching install plugin for components …
[2024-03-01 14:47:31.932] [DEBUG] - Searching upgrade plugin for components …
[2024-03-01 14:47:31.932] [DEBUG] - Searching upgrade plugin for oceanbase-ce-4.2.2.0-100000192024011915.el7-aa3053da7370a6685a2ef457cd202d50e5ab75d3
[2024-03-01 14:47:31.932] [DEBUG] - Found for oceanbase-ce-py_script_upgrade-4.1.0.0 for oceanbase-ce-4.2.2.0
[2024-03-01 14:47:31.932] [DEBUG] - Call oceanbase-ce-py_script_upgrade-4.1.0.0 for oceanbase-ce-4.2.2.0-100000192024011915.el7-aa3053da7370a6685a2ef457cd202d50e5ab75d3
[2024-03-01 14:47:31.932] [DEBUG] - import upgrade
[2024-03-01 14:47:31.938] [DEBUG] - add upgrade ref count to 1
[2024-03-01 14:47:31.940] [DEBUG] - Searching param plugin for components …
[2024-03-01 14:47:31.940] [DEBUG] - Search param plugin for oceanbase-ce
[2024-03-01 14:47:31.940] [DEBUG] - Found for oceanbase-ce-param-4.2.0.0 for oceanbase-ce-4.2.1.0
[2024-03-01 14:47:31.940] [DEBUG] - Applying oceanbase-ce-param-4.2.0.0 for oceanbase-ce-4.2.1.0-100000102023092807.el7-a8b9979de1f2809d74de71b2a536cff8aab15bff
[2024-03-01 14:47:31.940] [DEBUG] - Searching start plugin for components …
[2024-03-01 14:47:31.940] [DEBUG] - Searching start plugin for oceanbase-ce-4.2.2.0-100000192024011915.el7-aa3053da7370a6685a2ef457cd202d50e5ab75d3
[2024-03-01 14:47:31.941] [DEBUG] - Found for oceanbase-ce-py_script_start-4.2.0.0 for oceanbase-ce-4.2.2.0
[2024-03-01 14:47:31.941] [DEBUG] -- import start
[2024-03-01 14:47:31.943] [DEBUG] -- add start ref count to 1
[2024-03-01 14:47:31.943] [INFO] Start observer
[2024-03-01 14:47:31.944] [DEBUG] --- admin@xxxx143.209 execute: ls /data3/oceanbase41_data/clog/tenant_1/
[2024-03-01 14:47:31.967] [DEBUG] --- exited code 0
[2024-03-01 14:47:31.967] [DEBUG] --- admin@xxxx143.209 execute: cat /home/admin/oceanbase41/oceanbase/run/observer.pid
[2024-03-01 14:47:32.070] [DEBUG] --- exited code 0
[2024-03-01 14:47:32.070] [DEBUG] --- admin@xxxx143.209 execute: ls /proc/224367
[2024-03-01 14:47:32.134] [DEBUG] --- exited code 0
[2024-03-01 14:47:32.134] [DEBUG] --- admin@xxxx143.208 execute: ls /data3/oceanbase41_data/clog/tenant_1/
[2024-03-01 14:47:32.166] [DEBUG] --- exited code 0
[2024-03-01 14:47:32.167] [DEBUG] --- admin@xxxx143.208 execute: cat /home/admin/oceanbase41/oceanbase/run/observer.pid
[2024-03-01 14:47:32.266] [DEBUG] --- exited code 0
[2024-03-01 14:47:32.267] [DEBUG] --- admin@xxxx143.208 execute: ls /proc/126448
[2024-03-01 14:47:32.332] [DEBUG] --- exited code 0
[2024-03-01 14:47:32.332] [DEBUG] --- admin@xxxx143.210 execute: ls /data3/oceanbase41_data/clog/tenant_1/
[2024-03-01 14:47:32.385] [DEBUG] --- exited code 0
[2024-03-01 14:47:32.386] [DEBUG] --- admin@xxxx143.210 execute: cat /home/admin/oceanbase41/oceanbase/run/observer.pid
[2024-03-01 14:47:32.461] [DEBUG] --- exited code 0
[2024-03-01 14:47:32.461] [DEBUG] --- admin@xxxx143.210 execute: ls /proc/70881
[2024-03-01 14:47:32.529] [DEBUG] --- exited code 0
[2024-03-01 14:47:32.596] [INFO] observer program health check
[2024-03-01 14:47:35.600] [DEBUG] --- xxxx143.209 program health check
[2024-03-01 14:47:35.600] [DEBUG] --- admin@xxxx143.209 execute: cat /home/admin/oceanbase41/oceanbase/run/observer.pid
[2024-03-01 14:47:35.620] [DEBUG] --- exited code 0
[2024-03-01 14:47:35.620] [DEBUG] --- admin@xxxx143.209 execute: ls /proc/224367
[2024-03-01 14:47:35.685] [DEBUG] --- exited code 0
[2024-03-01 14:47:35.685] [DEBUG] --- xxxx143.209 observer[pid: 224367] started
[2024-03-01 14:47:35.685] [DEBUG] --- xxxx143.208 program health check
[2024-03-01 14:47:35.685] [DEBUG] --- admin@xxxx143.208 execute: cat /home/admin/oceanbase41/oceanbase/run/observer.pid
[2024-03-01 14:47:35.706] [DEBUG] --- exited code 0
[2024-03-01 14:47:35.706] [DEBUG] --- admin@xxxx143.208 execute: ls /proc/126448
[2024-03-01 14:47:35.769] [DEBUG] --- exited code 0
[2024-03-01 14:47:35.770] [DEBUG] --- xxxx143.208 observer[pid: 126448] started
[2024-03-01 14:47:35.770] [DEBUG] --- xxxx143.210 program health check
[2024-03-01 14:47:35.770] [DEBUG] --- admin@xxxx143.210 execute: cat /home/admin/oceanbase41/oceanbase/run/observer.pid
[2024-03-01 14:47:35.792] [DEBUG] --- exited code 0
[2024-03-01 14:47:35.793] [DEBUG] --- admin@xxxx143.210 execute: ls /proc/70881
[2024-03-01 14:47:35.861] [DEBUG] --- exited code 0
[2024-03-01 14:47:35.861] [DEBUG] --- xxxx143.210 observer[pid: 70881] started
[2024-03-01 14:47:35.986] [DEBUG] -- sub start ref count to 0
[2024-03-01 14:47:35.986] [DEBUG] -- export start
[2024-03-01 14:47:35.986] [DEBUG] - Searching connect plugin for components …
[2024-03-01 14:47:35.986] [DEBUG] - Searching connect plugin for oceanbase-ce-4.2.1.0-100000102023092807.el7-a8b9979de1f2809d74de71b2a536cff8aab15bff
[2024-03-01 14:47:35.987] [DEBUG] - Found for oceanbase-ce-py_script_connect-3.1.0 for oceanbase-ce-4.2.1.0
[2024-03-01 14:47:35.987] [INFO] Connect to observer
[2024-03-01 14:47:35.987] [DEBUG] — connect xxxx143.209 -P2881 -uroot -pxxxxx
[2024-03-01 14:47:36.002] [DEBUG] — execute sql: select 1. args: None
[2024-03-01 14:47:36.118] [DEBUG] — execute sql: use oceanbase. args: None
[2024-03-01 14:47:36.195] [DEBUG] — execute sql: set session ob_query_timeout=1000000000. args: None
[2024-03-01 14:47:36.207] [DEBUG] -- upgrade oceanbase-ce-4.2.2.0-100000192024011915.el7-aa3053da7370a6685a2ef457cd202d50e5ab75d3 to oceanbase-ce-4.2.2.0-100000192024011915.el7-aa3053da7370a6685a2ef457cd202d50e5ab75d3
[2024-03-01 14:47:36.208] [INFO] Exec upgrade_checker.py
[2024-03-01 14:47:36.208] [DEBUG] -- exec oceanbase-ce-4.2.2.0-100000192024011915.el7-aa3053da7370a6685a2ef457cd202d50e5ab75d3 upgrade_checker.py
[2024-03-01 14:47:36.208] [DEBUG] -- exec oceanbase-ce-4.2.2.0-100000192024011915.el7-aa3053da7370a6685a2ef457cd202d50e5ab75d3 upgrade_checker.py
[2024-03-01 14:47:36.208] [DEBUG] -- local execute: /usr/obd/lib/executer/executer27/bin/executer /tmp/xxxx143.209:2882_xxxx143.208:2882_xxxx143.210:2882/aa3053da7370a6685a2ef457cd202d50e5ab75d3/upgrade_checker.py -h xxxx143.209 -P 2881 -u root -p 'xxxx'
[2024-03-01 14:47:40.145] [DEBUG] -- exited code 255, error output:
[2024-03-01 14:47:40.146] [DEBUG] Character set '45' is not a compiled character set and is not specified in the '/usr/local/mysql/share/charsets/Index.xml' file
[2024-03-01 14:47:40.146] [DEBUG] Character set '45' is not a compiled character set and is not specified in the '/usr/local/mysql/share/charsets/Index.xml' file
[2024-03-01 14:47:40.146] [DEBUG] Traceback (most recent call last):
[2024-03-01 14:47:40.146] [DEBUG] File "executer27.py", line 51, in <module>
[2024-03-01 14:47:40.146] [DEBUG] File "/tmp/xxxx143.209:2882_xxxx143.208:2882_xxxx143.210:2882/aa3053da7370a6685a2ef457cd202d50e5ab75d3/upgrade_checker.py", line 741, in <module>
[2024-03-01 14:47:40.146] [DEBUG] raise e
[2024-03-01 14:47:40.146] [DEBUG] __main__.MyError: 'upgrade checker failed with 1 reasons: [servers build_version not match] '
[2024-03-01 14:47:40.146] [DEBUG] [13543] Failed to execute script executer27
[2024-03-01 14:47:40.146] [DEBUG]
[2024-03-01 14:47:40.247] [DEBUG] - sub upgrade ref count to 0
[2024-03-01 14:47:40.247] [DEBUG] - export upgrade
[2024-03-01 14:47:40.248] [DEBUG] - dump upgrade meta data to /root/.obd/cluster/oceanbase41/.upgrade
[2024-03-01 14:47:40.255] [INFO] See https://www.oceanbase.com/product/ob-deployer/error-codes .
[2024-03-01 14:47:40.255] [INFO] Trace ID: 91f994e4-d797-11ee-8955-525400b51421
[2024-03-01 14:47:40.255] [INFO] If you want to view detailed obd logs, please run: obd display-trace 91f994e4-d797-11ee-8955-525400b51421
[2024-03-01 14:47:40.256] [DEBUG] - unlock /root/.obd/lock/global
[2024-03-01 14:47:40.256] [DEBUG] - unlock /root/.obd/lock/deploy_oceanbase41
[2024-03-01 14:47:40.256] [DEBUG] - unlock /root/.obd/lock/mirror_and_repo

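Run the obd cluster upgrade xxx command again and check in the obd log which upgrade script it fails on. The directory where you run the command will also contain logs such as upgrade_pre.log and upgrade_post.log.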

It looks like upgrade_checker.py is what failed. Check upgrade_checker.log in the directory where the command was run.
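
If the trace is long, the failure summary can be pulled out with a few lines of Python. This is only a sketch: the regular expression is modelled on the single summary line in the trace above, and splitting several reasons on commas is my assumption, since only a one-reason case appears in this thread. The function name is made up for illustration.

```python
import re

# Modelled on the summary line raised by upgrade_checker.py in the trace, e.g.
# "upgrade checker failed with 1 reasons: [servers build_version not match]"
FAILURE_RE = re.compile(r"upgrade checker failed with (\d+) reasons?: \[(.*?)\]")


def checker_failures(trace_text):
    """Return the failure reasons found in obd trace text, or [] if none."""
    match = FAILURE_RE.search(trace_text)
    if match is None:
        return []
    # Assumption: multiple reasons are comma-separated inside the brackets.
    return [part.strip() for part in match.group(2).split(",") if part.strip()]


sample = (
    "[2024-03-01 14:47:40.146] [DEBUG] raise e\n"
    "[2024-03-01 14:47:40.146] [DEBUG] MyError: 'upgrade checker failed with "
    "1 reasons: [servers build_version not match] '\n"
)
print(checker_failures(sample))
```

The text to feed in would be the output of `obd display-trace <trace id>` saved to a file.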

The full upgrade_checker.log has been uploaded.

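One node may already have been upgraded and started. I noticed earlier that the start time of node 209 changed to the 20th, while the start time of node 208 is still in 2023.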


upgrade_checker.log (8.0 KB)

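Your environment is rather odd; I do not know why 4.2.1 and 4.2.2 are running side by side. In the normal upgrade flow, once upgrade_checker.py fails, the binaries are not swapped to 4.2.2 and the upgrade does not go any further. But the _all_server output you posted contains one server on 4.2.2, so something in the earlier upgrade attempt or in the environment must have gone wrong.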

https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000000510506#6-title-步骤五:执行脚本%20upgrade_checker.py

It is an internal project running in production, so it still has to be recovered. Thanks for the support.

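I suggest backing up the data before doing anything else. The cluster is already in a fairly abnormal state, so all we can do is try to recover it.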

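My previous reply seems to have been deleted by mistake. You can try a locality change to take the machine with the wrong version out of the cluster, replace it with one running the majority version, and then continue the upgrade. When you upgrade, make sure to pick the latest 422_HF1.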
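
To decide which machine is the odd one out, the build versions from the server list can be grouped and compared. A minimal sketch follows, assuming the `svr_ip` and `build_version` columns have already been fetched into a dict. The sample values are illustrative only: the version strings are shortened, and which node reports 4.2.2.0 is a guess based on the start times mentioned earlier, not something confirmed in this thread. The function name is made up for illustration.

```python
from collections import Counter


def split_by_majority(build_versions):
    """Given {svr_ip: build_version}, return (majority_version, outlier_ips).

    Only the part before the first "_" (e.g. "4.2.1.0") is compared.
    """
    short = {ip: version.split("_", 1)[0] for ip, version in build_versions.items()}
    majority, count = Counter(short.values()).most_common(1)[0]
    if count * 2 <= len(short):
        raise ValueError("no version is held by a strict majority of servers")
    outliers = sorted(ip for ip, version in short.items() if version != majority)
    return majority, outliers


# Illustrative data only; which node is on 4.2.2.0 is an assumption.
servers = {
    "xxx.208": "4.2.1.0_100000102023092807",
    "xxx.209": "4.2.2.0_100000192024011915",
    "xxx.210": "4.2.1.0_100000102023092807",
}
print(split_by_majority(servers))
```

The outlier list is the set of servers to take out through the locality change before retrying the upgrade.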

OK