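【Environment】Test environment
【OB or other component】OceanBase CE
【Version】4.1
【Problem description】Upgrading OceanBase CE 4.1 to 4.2 with OBD 2.2.0 fails, and the cluster status stays at upgrading.
【Steps to reproduce】Operations before and after the problem appeared
【Symptoms and impact】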
[lancoo@hg7185 ~]$ obd cluster upgrade LGOBD -c oceanbase-ce -V 4.2.0.0 --usable=5cc69b0ce9944adb57e36deb449bb70786d3ddc5
Get local repositories and plugins ok
Open ssh connection ok
Start observer ok
observer program health check ok
Connect to observer ok
Exec upgrade_post.py x
See https://www.oceanbase.com/product/ob-deployer/error-codes .
Trace ID: 7e197b7c-3811-11ee-8635-0024ecf2d6ef
If you want to view detailed obd logs, please run: obd display-trace 7e197b7c-3811-11ee-8635-0024ecf2d6ef
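After that, whenever I check the cluster status, it is still upgrading.
【Attachments】
樊安润
August 11, 2023, 16:43
#5
Run this command: obd display-trace 7e197b7c-3811-11ee-8635-0024ecf2d6ef
and take a look at the detailed log.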
[2023-08-11 14:37:07.626] [DEBUG] - cmd: ['LGOBD']
[2023-08-11 14:37:07.626] [DEBUG] - opts: {'component': 'oceanbase-ce', 'version': '4.2.0.0', 'skip_check': None, 'usable': '5cc69b0ce9944adb57e36deb449bb70786d3ddc5', 'disable': '', 'executer_path': '/home/lancoo/.oceanbase-all-in-one/obd/usr/obd/lib/executer', 'script_query_timeout': ''}
[2023-08-11 14:37:07.627] [DEBUG] - mkdir /home/lancoo/.obd/lock/
[2023-08-11 14:37:07.627] [DEBUG] - unknown lock mode
[2023-08-11 14:37:07.627] [DEBUG] - try to get share lock /home/lancoo/.obd/lock/global
[2023-08-11 14:37:07.627] [DEBUG] - share lock /home/lancoo/.obd/lock/global, count 1
[2023-08-11 14:37:07.627] [DEBUG] - Get Deploy by name
[2023-08-11 14:37:07.627] [DEBUG] - mkdir /home/lancoo/.obd/cluster/
[2023-08-11 14:37:07.627] [DEBUG] - mkdir /home/lancoo/.obd/config_parser/
[2023-08-11 14:37:07.628] [DEBUG] - try to get exclusive lock /home/lancoo/.obd/lock/deploy_LGOBD
[2023-08-11 14:37:07.628] [DEBUG] - exclusive lock /home/lancoo/.obd/lock/deploy_LGOBD, count 1
[2023-08-11 14:37:07.633] [DEBUG] - Deploy status judge
[2023-08-11 14:37:07.647] [INFO] Get local repositories and plugins
[2023-08-11 14:37:07.647] [DEBUG] - mkdir /home/lancoo/.obd/repository
[2023-08-11 14:37:07.648] [DEBUG] - Get local repository oceanbase-ce-4.1.0.0-21271468e0dee7aaf3d4eff4c4bf5e07421ef6fe
[2023-08-11 14:37:07.648] [DEBUG] - try to get share lock /home/lancoo/.obd/lock/mirror_and_repo
[2023-08-11 14:37:07.648] [DEBUG] - share lock /home/lancoo/.obd/lock/mirror_and_repo, count 1
[2023-08-11 14:37:07.651] [DEBUG] - Searching param plugin for components …
[2023-08-11 14:37:07.651] [DEBUG] - Search param plugin for oceanbase-ce
[2023-08-11 14:37:07.651] [DEBUG] - mkdir /home/lancoo/.obd/plugins
[2023-08-11 14:37:07.652] [DEBUG] - Found for oceanbase-ce-param-4.0.0.0 for oceanbase-ce-4.1.0.0
[2023-08-11 14:37:07.652] [DEBUG] - Applying oceanbase-ce-param-4.0.0.0 for oceanbase-ce-4.1.0.0-101010022023051821.el7-21271468e0dee7aaf3d4eff4c4bf5e07421ef6fe
[2023-08-11 14:37:08.367] [DEBUG] - Search repository oceanbase-ce version: 4.1.0.0, tag: 21271468e0dee7aaf3d4eff4c4bf5e07421ef6fe, release: None, package_hash: None
[2023-08-11 14:37:08.368] [DEBUG] - share lock /home/lancoo/.obd/lock/mirror_and_repo, count 2
[2023-08-11 14:37:08.368] [DEBUG] - mkdir /home/lancoo/.obd/repository/oceanbase-ce
[2023-08-11 14:37:08.368] [DEBUG] - Found repository oceanbase-ce-4.1.0.0-101010022023051821.el7-21271468e0dee7aaf3d4eff4c4bf5e07421ef6fe
[2023-08-11 14:37:08.368] [DEBUG] - Search repository oceanbase-ce version: 4.2.0.0, tag: 5cc69b0ce9944adb57e36deb449bb70786d3ddc5, release: None, package_hash: None
[2023-08-11 14:37:08.370] [DEBUG] - Found repository oceanbase-ce-4.2.0.0-100000152023080109.el7-5cc69b0ce9944adb57e36deb449bb70786d3ddc5
[2023-08-11 14:37:08.371] [INFO] Open ssh connection
[2023-08-11 14:37:08.502] [DEBUG] - Searching install plugin for components …
[2023-08-11 14:37:08.502] [DEBUG] - Search install plugin for oceanbase-ce
[2023-08-11 14:37:08.503] [DEBUG] - Found for oceanbase-ce-install-4.0.0.0 for oceanbase-ce-4.1.0.0
[2023-08-11 14:37:08.503] [DEBUG] - Search install plugin for oceanbase-ce
[2023-08-11 14:37:08.503] [DEBUG] - Found for oceanbase-ce-install-4.0.0.0 for oceanbase-ce-4.2.0.0
[2023-08-11 14:37:08.503] [DEBUG] - Searching install plugin for components …
[2023-08-11 14:37:08.503] [DEBUG] - Searching upgrade plugin for components …
[2023-08-11 14:37:08.503] [DEBUG] - Searching upgrade plugin for oceanbase-ce-4.2.0.0-100000152023080109.el7-5cc69b0ce9944adb57e36deb449bb70786d3ddc5
[2023-08-11 14:37:08.504] [DEBUG] - Found for oceanbase-ce-py_script_upgrade-4.1.0.0 for oceanbase-ce-4.2.0.0
[2023-08-11 14:37:08.504] [DEBUG] - Call oceanbase-ce-py_script_upgrade-4.1.0.0 for oceanbase-ce-4.2.0.0-100000152023080109.el7-5cc69b0ce9944adb57e36deb449bb70786d3ddc5
[2023-08-11 14:37:08.504] [DEBUG] - import upgrade
[2023-08-11 14:37:08.510] [DEBUG] - add upgrade ref count to 1
[2023-08-11 14:37:08.511] [DEBUG] - Searching param plugin for components …
[2023-08-11 14:37:08.511] [DEBUG] - Search param plugin for oceanbase-ce
[2023-08-11 14:37:08.511] [DEBUG] - Found for oceanbase-ce-param-4.0.0.0 for oceanbase-ce-4.1.0.0
[2023-08-11 14:37:08.511] [DEBUG] - Applying oceanbase-ce-param-4.0.0.0 for oceanbase-ce-4.1.0.0-101010022023051821.el7-21271468e0dee7aaf3d4eff4c4bf5e07421ef6fe
[2023-08-11 14:37:08.512] [DEBUG] - Searching start plugin for components …
[2023-08-11 14:37:08.512] [DEBUG] - Searching start plugin for oceanbase-ce-4.2.0.0-100000152023080109.el7-5cc69b0ce9944adb57e36deb449bb70786d3ddc5
[2023-08-11 14:37:08.512] [DEBUG] - Found for oceanbase-ce-py_script_start-4.0.0.0 for oceanbase-ce-4.2.0.0
[2023-08-11 14:37:08.512] [DEBUG] -- import start
[2023-08-11 14:37:08.515] [DEBUG] -- add start ref count to 1
[2023-08-11 14:37:08.515] [INFO] Start observer
[2023-08-11 14:37:08.516] [DEBUG] --- lancoo@192.168.122.74 execute: ls /data/LGOBD/oceanbase/store/clog/tenant_1/
[2023-08-11 14:37:08.697] [DEBUG] --- exited code 0
[2023-08-11 14:37:08.697] [DEBUG] --- lancoo@192.168.122.74 execute: cat /data/LGOBD/oceanbase/run/observer.pid
[2023-08-11 14:37:08.842] [DEBUG] --- exited code 1, error output:
[2023-08-11 14:37:08.842] [DEBUG] cat: /data/LGOBD/oceanbase/run/observer.pid: No such file or directory
[2023-08-11 14:37:08.842] [DEBUG]
[2023-08-11 14:37:08.843] [DEBUG] --- 192.168.122.74 start command construction
[2023-08-11 14:37:08.843] [DEBUG] --- starting 192.168.122.74 observer
[2023-08-11 14:37:08.844] [DEBUG] --- lancoo@192.168.122.74 set env LD_LIBRARY_PATH to '/data/LGOBD/oceanbase/lib:'
[2023-08-11 14:37:08.844] [DEBUG] --- lancoo@192.168.122.74 execute: cd /data/LGOBD/oceanbase; /data/LGOBD/oceanbase/bin/observer -r '192.168.122.74:2882:2881' -p 2881 -P 2882 -z 'zone1' -n 'LGOBD' -c 1 -d '/data/LGOBD/oceanbase/store' -i 'enp3s0f1' -o __min_full_resource_pool_memory=2147483648,datafile_size='200G',log_disk_size='64G',memory_limit='16G',system_memory='2G',enable_syslog_recycle=True,enable_syslog_wf=False,max_syslog_file_count=4,cpu_count=126
[2023-08-11 14:37:09.229] [DEBUG] --- exited code 0
[2023-08-11 14:37:09.230] [DEBUG] --- lancoo@192.168.122.74 delete env LD_LIBRARY_PATH
[2023-08-11 14:37:09.299] [INFO] observer program health check
[2023-08-11 14:37:12.303] [DEBUG] --- 192.168.122.74 program health check
[2023-08-11 14:37:12.303] [DEBUG] --- lancoo@192.168.122.74 execute: cat /data/LGOBD/oceanbase/run/observer.pid
[2023-08-11 14:37:12.416] [DEBUG] --- exited code 0
[2023-08-11 14:37:12.416] [DEBUG] --- lancoo@192.168.122.74 execute: ls /proc/11585
[2023-08-11 14:37:12.566] [DEBUG] --- exited code 0
[2023-08-11 14:37:12.567] [DEBUG] --- 192.168.122.74 observer[pid: 11585] started
[2023-08-11 14:37:12.688] [DEBUG] -- sub start ref count to 0
[2023-08-11 14:37:12.689] [DEBUG] -- export start
[2023-08-11 14:37:12.689] [DEBUG] - Searching connect plugin for components …
[2023-08-11 14:37:12.689] [DEBUG] - Searching connect plugin for oceanbase-ce-4.1.0.0-101010022023051821.el7-21271468e0dee7aaf3d4eff4c4bf5e07421ef6fe
[2023-08-11 14:37:12.690] [DEBUG] - Found for oceanbase-ce-py_script_connect-3.1.0 for oceanbase-ce-4.1.0.0
[2023-08-11 14:37:12.690] [DEBUG] -- import connect
[2023-08-11 14:37:12.730] [DEBUG] -- add connect ref count to 1
[2023-08-11 14:37:12.731] [INFO] Connect to observer
[2023-08-11 14:37:12.732] [DEBUG] --- connect 192.168.122.74 -P2881 -uroot -pLancooECPsys
[2023-08-11 14:37:15.926] [DEBUG] --- connect 192.168.122.74 -P2881 -uroot -p
[2023-08-11 14:37:15.930] [DEBUG] --- execute sql: select 1. args: None
[2023-08-11 14:37:15.991] [DEBUG] -- sub connect ref count to 0
[2023-08-11 14:37:15.991] [DEBUG] -- export connect
[2023-08-11 14:37:15.991] [DEBUG] --- execute sql: use oceanbase. args: None
[2023-08-11 14:37:15.994] [DEBUG] --- OBD-5000: use oceanbase execute failed
[2023-08-11 14:37:17.995] [DEBUG] --- execute sql: use oceanbase. args: None
[2023-08-11 14:37:17.996] [DEBUG] --- OBD-5000: use oceanbase execute failed
[2023-08-11 14:37:19.998] [DEBUG] --- execute sql: use oceanbase. args: None
[2023-08-11 14:37:19.999] [DEBUG] --- OBD-5000: use oceanbase execute failed
[2023-08-11 14:37:22.001] [DEBUG] --- execute sql: use oceanbase. args: None
[2023-08-11 14:37:22.002] [DEBUG] --- OBD-5000: use oceanbase execute failed
[2023-08-11 14:37:24.003] [DEBUG] --- execute sql: use oceanbase. args: None
[2023-08-11 14:37:24.004] [DEBUG] --- OBD-5000: use oceanbase execute failed
[2023-08-11 14:37:26.006] [DEBUG] --- execute sql: use oceanbase. args: None
[2023-08-11 14:37:26.007] [DEBUG] --- OBD-5000: use oceanbase execute failed
[2023-08-11 14:37:28.009] [DEBUG] --- execute sql: use oceanbase. args: None
[2023-08-11 14:37:28.010] [DEBUG] --- OBD-5000: use oceanbase execute failed
[2023-08-11 14:37:30.012] [DEBUG] --- execute sql: use oceanbase. args: None
[2023-08-11 14:37:30.020] [DEBUG] --- execute sql: set session ob_query_timeout=1000000000. args: None
[2023-08-11 14:37:30.023] [DEBUG] -- upgrade oceanbase-ce-4.2.0.0-100000152023080109.el7-5cc69b0ce9944adb57e36deb449bb70786d3ddc5 to oceanbase-ce-4.2.0.0-100000152023080109.el7-5cc69b0ce9944adb57e36deb449bb70786d3ddc5
[2023-08-11 14:37:30.023] [INFO] Exec upgrade_post.py
[2023-08-11 14:37:30.024] [DEBUG] -- exec oceanbase-ce-4.2.0.0-100000152023080109.el7-5cc69b0ce9944adb57e36deb449bb70786d3ddc5 upgrade_post.py
[2023-08-11 14:37:30.055] [DEBUG] -- exec oceanbase-ce-4.2.0.0-100000152023080109.el7-5cc69b0ce9944adb57e36deb449bb70786d3ddc5 upgrade_post.py
[2023-08-11 14:37:30.055] [DEBUG] -- local execute: /home/lancoo/.oceanbase-all-in-one/obd/usr/obd/lib/executer/executer27/bin/executer /tmp/192.168.122.74:2882/5cc69b0ce9944adb57e36deb449bb70786d3ddc5/upgrade_post.py -h 192.168.122.74 -P 2881 -u root -p 'LancooECPsys'
[2023-08-11 14:47:31.156] [DEBUG] -- exited code 255, error output:
[2023-08-11 14:47:31.157] [DEBUG] Traceback (most recent call last):
[2023-08-11 14:47:31.157] [DEBUG]   File "executer27.py", line 51, in <module>
[2023-08-11 14:47:31.157] [DEBUG]   File "/tmp/192.168.122.74:2882/5cc69b0ce9944adb57e36deb449bb70786d3ddc5/upgrade_post.py", line 2825, in <module>
[2023-08-11 14:47:31.157] [DEBUG]     do_upgrade_by_argv(sys.argv[1:])
[2023-08-11 14:47:31.157] [DEBUG]   File "/tmp/192.168.122.74:2882/5cc69b0ce9944adb57e36deb449bb70786d3ddc5/upgrade_post_extract_files_2023_08_11_14_37_30_188989_uettFB6O/do_upgrade_post.py", line 159, in do_upgrade_by_argv
[2023-08-11 14:47:31.157] [DEBUG]     raise e
[2023-08-11 14:47:31.157] [DEBUG] NameError: global name 'job_name' is not defined
[2023-08-11 14:47:31.157] [DEBUG] [13315] Failed to execute script executer27
[2023-08-11 14:47:31.157] [DEBUG]
[2023-08-11 14:47:31.161] [DEBUG] - sub upgrade ref count to 0
[2023-08-11 14:47:31.161] [DEBUG] - export upgrade
[2023-08-11 14:47:31.163] [DEBUG] - dump upgrade meta data to /home/lancoo/.obd/cluster/LGOBD/.upgrade
[2023-08-11 14:47:31.170] [INFO] See https://www.oceanbase.com/product/ob-deployer/error-codes .
[2023-08-11 14:47:31.171] [INFO] Trace ID: 7e197b7c-3811-11ee-8635-0024ecf2d6ef
[2023-08-11 14:47:31.171] [INFO] If you want to view detailed obd logs, please run: obd display-trace 7e197b7c-3811-11ee-8635-0024ecf2d6ef
[2023-08-11 14:47:31.171] [DEBUG] - unlock /home/lancoo/.obd/lock/global
[2023-08-11 14:47:31.172] [DEBUG] - unlock /home/lancoo/.obd/lock/deploy_LGOBD
[2023-08-11 14:47:31.172] [DEBUG] - unlock /home/lancoo/.obd/lock/mirror_and_repo
Database operations are working normally at the moment. Is it necessary to rebuild the cluster?
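For anyone reading a trace like the one above, a minimal Python sketch for pulling out the steps that exited non-zero, together with how long each one ran, may help. It assumes only the line layout visible in this trace (`[timestamp] [LEVEL] message` and the phrase `exited code N`); the function name `find_failed_steps` is made up for illustration and is not part of obd.

```python
import re
from datetime import datetime

# Layout assumed from the trace above: "[YYYY-MM-DD HH:MM:SS.mmm] [LEVEL] message"
LINE_RE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\] \[(\w+)\]\s*(.*)$"
)
EXIT_RE = re.compile(r"exited code (\d+)")


def find_failed_steps(lines):
    """Return (timestamp, exit_code, seconds_since_previous_line) for each
    'exited code N' line where N is not zero."""
    failures = []
    previous = None  # timestamp of the previous well-formed log line
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip continuation lines without a timestamp
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
        exit_match = EXIT_RE.search(match.group(3))
        if exit_match and int(exit_match.group(1)) != 0:
            waited = (stamp - previous).total_seconds() if previous else 0.0
            failures.append((stamp, int(exit_match.group(1)), waited))
        previous = stamp
    return failures


if __name__ == "__main__":
    sample = [
        "[2023-08-11 14:37:30.055] [DEBUG] -- local execute: executer upgrade_post.py",
        "[2023-08-11 14:47:31.156] [DEBUG] -- exited code 255, error output:",
    ]
    for stamp, code, waited in find_failed_steps(sample):
        print(stamp, "exit code", code, "after", round(waited), "seconds")
```

Run against this trace, it would flag two lines: the exit code 1 from `cat /data/LGOBD/oceanbase/run/observer.pid`, which is harmless because the observer had not been started yet, and the exit code 255 from `upgrade_post.py`, which arrives about 601 seconds after the script was launched. That gap suggests the script waited on some condition for roughly ten minutes before giving up, so the `NameError` is probably not the root cause.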
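君野
August 11, 2023, 17:40
#8
Hello, we tried upgrading oceanbase-ce-4.1.0.0-101010022023051821 to 4.2 internally and the upgrade completed normally. Could you post the upgrade_post.log file from the execution directory?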
upgrade_post.log (21.1 KB)
Shortly after 2 p.m. the SSH service went down, so I rebooted the server. After the reboot I ran the upgrade program again.
君野
August 11, 2023, 19:42
#10
[2023-08-11 14:47:31] INFO __init__.py:1611 succeed to execute query: select count(*) from oceanbase.__all_server where (start_service_time <= 0 or status='inactive'), rowcount = 1
[2023-08-11 14:47:31] INFO __init__.py:1611 value is 1, expected value is 0, not matched
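It looks like the upgrade failed at this step: the node had not come up. Once the machine is up, please check the oceanbase.__all_server table. Every node needs to be active.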
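The check in those two log lines can be reproduced by hand. Below is a minimal Python sketch of the same predicate (`start_service_time <= 0 or status = 'inactive'`) applied to rows already fetched from `oceanbase.__all_server`. The column names `svr_ip`, `svr_port`, `status`, and `start_service_time`, the sample row, and the case-insensitive comparison are assumptions for illustration, and the helper name `servers_blocking_upgrade` is hypothetical.

```python
def servers_blocking_upgrade(rows):
    """Return the rows that the logged query would count:
    start_service_time <= 0 or status = 'inactive'.

    The status comparison is case-insensitive here, on the assumption that
    the SQL comparison is too; adjust it if your collation differs.
    """
    return [
        row
        for row in rows
        if row["start_service_time"] <= 0 or row["status"].lower() == "inactive"
    ]


if __name__ == "__main__":
    # Hypothetical row, shaped like the result of:
    #   select svr_ip, svr_port, status, start_service_time
    #   from oceanbase.__all_server;
    sample_rows = [
        {
            "svr_ip": "192.168.122.74",
            "svr_port": 2882,
            "status": "ACTIVE",
            "start_service_time": 0,  # node is up but has not started serving
        },
    ]
    for row in servers_blocking_upgrade(sample_rows):
        print("not ready:", row["svr_ip"], row["svr_port"], row["status"])
```

An empty result corresponds to the `expected value is 0` in the log. Any row returned is a node that is either inactive or has not started serving yet.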