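【 Environment 】Test environment
【 OB 】oceanbase-ce:4.2.5.1
【 Version 】
【 Problem description 】OceanBase reports an error on startup (Server is initializing)
【 Reproduction steps 】
Base environment: Kylin V10 SP3
【 Attachments and logs 】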
----------------------- Container processes ---------------------
sh-4.4#
sh-4.4# ps -ef
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 04:01 ? 00:00:00 /bin/bash /root/boot/start.sh
root 8 1 0 04:01 ? 00:00:00 /usr/sbin/sshd
root 201 1 28 04:01 ? 00:34:35 /root/ob/bin/observer -p 2881 -P 2882 -z zone1 -n obcluster -c 1 -d /root/ob/store -l INFO -I 172.16.233.6 -o __min_full_resource_pool_memory=2147483648,memory_limit=8G
root 9091 1 0 06:02 ? 00:00:00 /usr/bin/coreutils --coreutils-prog-shebang=sleep /usr/bin/sleep 10
root 9096 0 0 06:02 pts/0 00:00:00 sh
root 9104 9096 0 06:02 pts/0 00:00:00 ps -ef
------------------------------ Port status --------------------------
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 51799/sshd
tcp 0 0 0.0.0.0:2881 0.0.0.0:* LISTEN 52157/observer
tcp 0 0 0.0.0.0:2882 0.0.0.0:* LISTEN 52157/observer
tcp 0 0 127.0.0.11:35587 0.0.0.0:* LISTEN 122527/dockerd
tcp 1 0 172.16.233.6:2881 172.16.233.1:59928 CLOSE_WAIT 52157/observer
tcp 1 0 172.16.233.6:2881 172.16.233.32:37978 CLOSE_WAIT 52157/observer
tcp 0 0 172.16.233.6:2881 172.16.233.1:48248 TIME_WAIT -
tcp6 0 0 :::22 :::* LISTEN 51799/sshd
tcp6 0 0 :::2882 :::* LISTEN 52157/observer
-----------------------obdiag log-------------------------
[2025-03-21 10:47:34.722] [ERROR] Traceback (most recent call last):
[2025-03-21 10:47:34.722] [ERROR] File "core.py", line 2094, in start_cluster
[2025-03-21 10:47:34.722] [ERROR] File "core.py", line 2141, in _start_cluster
[2025-03-21 10:47:34.722] [ERROR] File "core.py", line 233, in run_workflow
[2025-03-21 10:47:34.722] [ERROR] File "core.py", line 275, in run_plugin_template
[2025-03-21 10:47:34.723] [ERROR] File "core.py", line 320, in call_plugin
[2025-03-21 10:47:34.723] [ERROR] File "_plugin.py", line 352, in call
[2025-03-21 10:47:34.723] [ERROR] File "_plugin.py", line 309, in _new_func
[2025-03-21 10:47:34.723] [ERROR] File "/root/.obd/plugins/oceanbase-ce/3.1.0/connect.py", line 73, in connect
[2025-03-21 10:47:34.723] [ERROR] cursor = Cursor(ip=server.ip, port=server_config.get('mysql_port', 2881), tenant='', password=****** if password is not None else '', stdio=stdio)
[2025-03-21 10:47:34.723] [ERROR] File "_stdio.py", line 1017, in wrapper
[2025-03-21 10:47:34.723] [ERROR] File "_stdio.py", line 1004, in func_wrapper
[2025-03-21 10:47:34.723] [ERROR] File "tool.py", line 768, in __init__
[2025-03-21 10:47:34.723] [ERROR] File "tool.py", line 798, in _connect
[2025-03-21 10:47:34.723] [ERROR] File "pymysql/connections.py", line 353, in __init__
[2025-03-21 10:47:34.723] [ERROR] File "pymysql/connections.py", line 633, in connect
[2025-03-21 10:47:34.723] [ERROR] File "pymysql/connections.py", line 907, in _request_authentication
[2025-03-21 10:47:34.723] [ERROR] File "pymysql/connections.py", line 725, in _read_packet
[2025-03-21 10:47:34.723] [ERROR] File "pymysql/protocol.py", line 221, in raise_for_error
[2025-03-21 10:47:34.723] [ERROR] File "pymysql/err.py", line 143, in raise_mysql_exception
[2025-03-21 10:47:34.723] [ERROR] pymysql.err.OperationalError: (8001, 'Server is initializing\n[0.0.0.0:0] [2025-03-21 10:47:34.721303] [Y0-000630D7EA67FFA4-0-0]')
[2025-03-21 10:47:34.723] [ERROR]
[2025-03-21 10:47:37.726] [ERROR] OBD-1006: Failed to connect to oceanbase-ce
-----------------observer.log--------------------------
[2025-03-24 05:56:17.744580] WDIAG [RPC] call_rpc (ob_async_rpc_proxy.h:356) [579][T1_LSMetaCh][T1][YB42AC10E906-0006310EA906C730-0-0] [lt=6][errcode=-4122] call rpc func failed(server="172.16.233.9:2882", timeout=2000000, arg={addr:"172.16.233.9:2882", cluster_id:1}, cluster_id=1, tenant_id=1, group_id=9, ret=-4122, ret="OB_RPC_POST_ERROR")
[2025-03-24 05:56:17.744590] WDIAG [RPC] call (ob_async_rpc_proxy.h:290) [579][T1_LSMetaCh][T1][YB42AC10E906-0006310EA906C730-0-0] [lt=9][errcode=-4122] call rpc func failed(server="172.16.233.9:2882", timeout=2000000, cluster_id=1, tenant_id=1, arg={addr:"172.16.233.9:2882", cluster_id:1}, group_id=9, ret=-4122, ret="OB_RPC_POST_ERROR")
[2025-03-24 05:56:17.744604] WDIAG [SHARE.PT] do_detect_master_rs_ls_ (ob_rpc_ls_table.cpp:296) [579][T1_LSMetaCh][T1][YB42AC10E906-0006310EA906C730-0-0] [lt=4][errcode=0] fail to send rpc(tmp_ret=-4122, tmp_ret="OB_RPC_POST_ERROR", cluster_id=1, addr="172.16.233.9:2882", timeout=2000000, arg={addr:"172.16.233.9:2882", cluster_id:1})
[2025-03-24 05:56:18.746496] WDIAG [RPC] call_rpc (ob_async_rpc_proxy.h:356) [579][T1_LSMetaCh][T1][YB42AC10E906-0006310EA906C731-0-0] [lt=7][errcode=-4122] call rpc func failed(server="172.16.233.9:2882", timeout=2000000, arg={addr:"172.16.233.9:2882", cluster_id:1}, cluster_id=1, tenant_id=1, group_id=9, ret=-4122, ret="OB_RPC_POST_ERROR")
[2025-03-24 05:56:18.746510] WDIAG [RPC] call (ob_async_rpc_proxy.h:290) [579][T1_LSMetaCh][T1][YB42AC10E906-0006310EA906C731-0-0] [lt=13][errcode=-4122] call rpc func failed(server="172.16.233.9:2882", timeout=2000000, cluster_id=1, tenant_id=1, arg={addr:"172.16.233.9:2882", cluster_id:1}, group_id=9, ret=-4122, ret="OB_RPC_POST_ERROR")
[2025-03-24 05:56:18.746532] WDIAG [SHARE.PT] do_detect_master_rs_ls_ (ob_rpc_ls_table.cpp:296) [579][T1_LSMetaCh][T1][YB42AC10E906-0006310EA906C731-0-0] [lt=10][errcode=0] fail to send rpc(tmp_ret=-4122, tmp_ret="OB_RPC_POST_ERROR", cluster_id=1, addr="172.16.233.9:2882", timeout=2000000, arg={addr:"172.16.233.9:2882", cluster_id:1})
[2025-03-24 05:56:19.124184] WDIAG [RPC] call_rpc (ob_async_rpc_proxy.h:356) [201][observer][T0][YB42AC10E906-0006310EA7C6AC4C-0-0] [lt=16][errcode=-4122] call rpc func failed(server="172.16.233.9:2882", timeout=2000000, arg={addr:"172.16.233.9:2882", cluster_id:1}, cluster_id=1, tenant_id=1, group_id=9, ret=-4122, ret="OB_RPC_POST_ERROR")
[2025-03-24 05:56:19.124196] WDIAG [RPC] call (ob_async_rpc_proxy.h:290) [201][observer][T0][YB42AC10E906-0006310EA7C6AC4C-0-0] [lt=11][errcode=-4122] call rpc func failed(server="172.16.233.9:2882", timeout=2000000, cluster_id=1, tenant_id=1, arg={addr:"172.16.233.9:2882", cluster_id:1}, group_id=9, ret=-4122, ret="OB_RPC_POST_ERROR")
[2025-03-24 05:56:19.124219] WDIAG [SHARE.PT] do_detect_master_rs_ls_ (ob_rpc_ls_table.cpp:296) [201][observer][T0][YB42AC10E906-0006310EA7C6AC4C-0-0] [lt=8][errcode=0] fail to send rpc(tmp_ret=-4122, tmp_ret="OB_RPC_POST_ERROR", cluster_id=1, addr="172.16.233.9:2882", timeout=2000000, arg={addr:"172.16.233.9:2882", cluster_id:1})
[2025-03-24 05:56:19.124372] WDIAG [RPC] call_rpc (ob_async_rpc_proxy.h:356) [201][observer][T0][YB42AC10E906-0006310EA7C6AC4C-0-0] [lt=11][errcode=-4122] call rpc func failed(server="172.16.233.9:2882", timeout=2000000, arg={addr:"172.16.233.9:2882", cluster_id:1}, cluster_id=1, tenant_id=1, group_id=9, ret=-4122, ret="OB_RPC_POST_ERROR")
[2025-03-24 05:56:19.124381] WDIAG [RPC] call (ob_async_rpc_proxy.h:290) [201][observer][T0][YB42AC10E906-0006310EA7C6AC4C-0-0] [lt=8][errcode=-4122] call rpc func failed(server="172.16.233.9:2882", timeout=2000000, cluster_id=1, tenant_id=1, arg={addr:"172.16.233.9:2882", cluster_id:1}, group_id=9, ret=-4122, ret="OB_RPC_POST_ERROR")
[2025-03-24 05:56:19.124404] WDIAG [SHARE.PT] do_detect_master_rs_ls_ (ob_rpc_ls_table.cpp:296) [201][observer][T0][YB42AC10E906-0006310EA7C6AC4C-0-0] [lt=14][errcode=0] fail to send rpc(tmp_ret=-4122, tmp_ret="OB_RPC_POST_ERROR", cluster_id=1, addr="172.16.233.9:2882", timeout=2000000, arg={addr:"172.16.233.9:2882", cluster_id:1})
[2025-03-24 05:56:19.124582] WDIAG [SHARE.PT] update (ob_rpc_ls_table.cpp:193) [201][observer][T1][YB42AC10E906-0006310EA7C6AC4C-0-0] [lt=10][errcode=-4122] report sys_tenant's ls through rpc failed(replica={modify_time_us:0, create_time_us:0, tenant_id:1, ls_id:{id:1}, server:"172.16.233.6:2882", sql_port:2881, role:2, member_list:[{server:"172.16.233.9:2882", timestamp:1}], replica_type:0, proposal_id:1, replica_status:"NORMAL", restore_status:{status:0}, property:{memstore_percent_:100}, unit_id:1, zone:"zone1", paxos_replica_number:1, data_size:0, required_size:0, in_member_list:false, member_time_us:0, learner_list:{learner_num:0, learner_array:[]}, in_learner_list:false, rebuild:false}, rs_addr="172.16.233.9:2882", ret=-4122, ret="OB_RPC_POST_ERROR")
[2025-03-24 05:56:19.124607] INFO [SHARE.PT] update (ob_rpc_ls_table.cpp:196) [201][observer][T1][YB42AC10E906-0006310EA7C6AC4C-0-0] [lt=24] update sys_tenant's ls replica(ret=-4122, ret="OB_RPC_POST_ERROR", replica={modify_time_us:0, create_time_us:0, tenant_id:1, ls_id:{id:1}, server:"172.16.233.6:2882", sql_port:2881, role:2, member_list:[{server:"172.16.233.9:2882", timestamp:1}], replica_type:0, proposal_id:1, replica_status:"NORMAL", restore_status:{status:0}, property:{memstore_percent_:100}, unit_id:1, zone:"zone1", paxos_replica_number:1, data_size:0, required_size:0, in_member_list:false, member_time_us:0, learner_list:{learner_num:0, learner_array:[]}, in_learner_list:false, rebuild:false})
-------------------------- Container IP ---------------------------------
172.16.233.6
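Troubleshooting progress:
I deployed OceanBase as a Docker container whose address is 172.16.233.6, yet the observer errors show it trying to reach 172.16.233.9, and I do not know why.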
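One way to confirm the mismatch described above is to tally the RPC target addresses in observer.log and compare them with the container's own IP. The following is a minimal sketch, not part of any OceanBase tooling: the function names `rpc_targets` and `foreign_targets` are hypothetical, and it assumes failed RPC lines carry a `server="ip:port"` field and the string `OB_RPC_POST_ERROR`, as in the excerpt (straight or curly quotes are both accepted).

```python
import re
from collections import Counter

# Matches server="ip:port"; the quote may be straight or curly,
# because logs pasted into a forum often get "smart" quotes.
SERVER_RE = re.compile(r'server=["“”]?(\d{1,3}(?:\.\d{1,3}){3}):(\d+)')


def rpc_targets(lines):
    """Count how often each IP appears as the target of a failed RPC."""
    counts = Counter()
    for line in lines:
        if "OB_RPC_POST_ERROR" not in line:
            continue  # only look at RPC post failures
        for ip, _port in SERVER_RE.findall(line):
            counts[ip] += 1
    return counts


def foreign_targets(lines, local_ip):
    """Return the failed-RPC target IPs that differ from this node's IP."""
    return sorted(ip for ip in rpc_targets(lines) if ip != local_ip)


if __name__ == "__main__":
    # Hypothetical usage: the log path depends on home_path in the config.
    with open("/root/ob/log/observer.log", errors="replace") as fh:
        print(foreign_targets(fh, "172.16.233.6"))
```

If the returned list is non-empty, the observer is still addressing a peer (here presumably the RootService address) that is not the current container.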
The OB configuration is as follows:
cat /home/moresec/data/mid_res/oceanbase/data/obd/cluster/obcluster/config.yaml
oceanbase-ce:
  servers:
    - 172.16.233.6
  global:
    home_path: /root/ob
    mysql_port: 2881
    rpc_port: 2882
    zone: zone1
    cluster_id: 1
    appname: obcluster
    memory_limit: 8G
    system_memory: 3G
    datafile_size: 5G
    log_disk_size: 5G
    root_password: weckV9vEpm
    scenario: express_oltp
    obconfig_url:
    cpu_count: 16
    production_mode: false
    syslog_level: INFO
    enable_syslog_wf: false
    enable_syslog_recycle: true
    max_syslog_file_count: 4
    enable_rich_error_msg: true
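I manually changed the server IP in this file to 172.16.233.6. Before the change it was 172.16.233.9, which does not match the actual container address.
So far I have not been able to find the root cause.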
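The last two observer.log lines above record both the replica's own address and the member list it reports. As a rough diagnostic, those two fields can be pulled out and compared. This is only a sketch under assumptions: the helper name `replica_membership` is hypothetical, and the field layout (`ls_id:{id:N}, server:"ip:port"` followed later by `member_list:[...]`) is taken from the excerpt in this post rather than from any documented log format.

```python
import re

ADDR = r'(\d{1,3}(?:\.\d{1,3}){3}:\d+)'  # ip:port
QUOTE = '["“”]'                           # straight or curly quote


def replica_membership(line):
    """Extract a replica's own address and member_list from one log line.

    Returns None if the line does not look like a replica report.
    """
    own = re.search(r'ls_id:\{id:\d+\},\s*server:' + QUOTE + ADDR, line)
    block = re.search(r'member_list:\[(.*?)\]', line)
    if own is None or block is None:
        return None
    members = re.findall(r'server:' + QUOTE + ADDR, block.group(1))
    return {
        "server": own.group(1),
        "member_list": members,
        "in_member_list": own.group(1) in members,
    }


if __name__ == "__main__":
    # Shortened form of the replica report shown in the log excerpt.
    sample = ('replica={tenant_id:1, ls_id:{id:1}, server:"172.16.233.6:2882", '
              'sql_port:2881, member_list:[{server:"172.16.233.9:2882", timestamp:1}]}')
    print(replica_membership(sample))
```

For the excerpt in this post, the replica's own address is 172.16.233.6:2882 while the only member listed is 172.16.233.9:2882, which agrees with the `in_member_list:false` field in the same log line.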