OpenEuler 24.03: cluster deployment via the OBD WEB GUI fails

【Environment】Production, OpenEuler 24.03, x86
【OB or other components】OceanBase Database, OBProxy
【Version】4.3.4.0
【Problem description】Deploying OceanBase Database via the OBD WEB GUI fails.
The error log is below. Checking the directory on disk shows that the file /root/myoceanbase/oceanbase/run/obshell.pid actually exists.

[2024-12-06 16:40:35.450] [INFO] [WARN] OBD-1012: (192.168.124.46) clog and data use the same disk (/)
[2024-12-06 16:40:35.450] [INFO] [WARN] OBD-1012: (192.168.124.49) clog and data use the same disk (/)
[2024-12-06 16:40:35.450] [INFO] [WARN] OBD-1012: (192.168.124.32) clog and data use the same disk (/)
[2024-12-06 16:40:35.450] [INFO] 
[2024-12-06 16:40:35.451] [DEBUG] - plugin oceanbase-ce-py_script_start_check-4.3.0.0 result: True
[2024-12-06 16:40:35.451] [DEBUG] - Call oceanbase-ce-py_script_start-4.3.0.0 for oceanbase-ce-4.3.4.0-100000162024110717.el7-5d59e837a0ecff1a6baa20f72747c343ac7c8dce
[2024-12-06 16:40:35.451] [DEBUG] - import start
[2024-12-06 16:40:35.456] [DEBUG] - add start ref count to 1
[2024-12-06 16:40:35.457] [INFO] cluster scenario: htap
[2024-12-06 16:40:35.457] [INFO] Start observer
[2024-12-06 16:40:35.457] [DEBUG] -- root@192.168.124.46 execute: ls /root/myoceanbase/oceanbase/store/clog/tenant_1/ 
[2024-12-06 16:40:35.521] [DEBUG] -- exited code 2, error output:
[2024-12-06 16:40:35.522] [DEBUG] ls: cannot access '/root/myoceanbase/oceanbase/store/clog/tenant_1/': No such file or directory
[2024-12-06 16:40:35.522] [DEBUG] 
[2024-12-06 16:40:35.522] [DEBUG] -- root@192.168.124.46 execute: cat /root/myoceanbase/oceanbase/run/observer.pid 
[2024-12-06 16:40:35.625] [DEBUG] -- exited code 1, error output:
[2024-12-06 16:40:35.626] [DEBUG] cat: /root/myoceanbase/oceanbase/run/observer.pid: No such file or directory
[2024-12-06 16:40:35.626] [DEBUG] 
[2024-12-06 16:40:35.626] [DEBUG] -- 192.168.124.46 start command construction
[2024-12-06 16:40:35.626] [DEBUG] -- update large_query_threshold to 600s because of scenario
[2024-12-06 16:40:35.626] [DEBUG] -- update enable_record_trace_log to False because of scenario
[2024-12-06 16:40:35.626] [DEBUG] -- update enable_syslog_recycle to 1 because of scenario
[2024-12-06 16:40:35.626] [DEBUG] -- update max_syslog_file_count to 300 because of scenario
[2024-12-06 16:40:35.627] [DEBUG] -- root@192.168.124.49 execute: ls /root/myoceanbase/oceanbase/store/clog/tenant_1/ 
[2024-12-06 16:40:35.690] [DEBUG] -- exited code 2, error output:
[2024-12-06 16:40:35.691] [DEBUG] ls: cannot access '/root/myoceanbase/oceanbase/store/clog/tenant_1/': No such file or directory
[2024-12-06 16:40:35.691] [DEBUG] 
[2024-12-06 16:40:35.691] [DEBUG] -- root@192.168.124.49 execute: cat /root/myoceanbase/oceanbase/run/observer.pid 
[2024-12-06 16:40:35.793] [DEBUG] -- exited code 1, error output:
[2024-12-06 16:40:35.794] [DEBUG] cat: /root/myoceanbase/oceanbase/run/observer.pid: No such file or directory
[2024-12-06 16:40:35.794] [DEBUG] 
[2024-12-06 16:40:35.794] [DEBUG] -- 192.168.124.49 start command construction
[2024-12-06 16:40:35.794] [DEBUG] -- update large_query_threshold to 600s because of scenario
[2024-12-06 16:40:35.794] [DEBUG] -- update enable_record_trace_log to False because of scenario
[2024-12-06 16:40:35.794] [DEBUG] -- update enable_syslog_recycle to 1 because of scenario
[2024-12-06 16:40:35.794] [DEBUG] -- update max_syslog_file_count to 300 because of scenario
[2024-12-06 16:40:35.794] [DEBUG] -- root@192.168.124.32 execute: ls /root/myoceanbase/oceanbase/store/clog/tenant_1/ 
[2024-12-06 16:40:35.859] [DEBUG] -- exited code 2, error output:
[2024-12-06 16:40:35.860] [DEBUG] ls: cannot access '/root/myoceanbase/oceanbase/store/clog/tenant_1/': No such file or directory
[2024-12-06 16:40:35.860] [DEBUG] 
[2024-12-06 16:40:35.860] [DEBUG] -- root@192.168.124.32 execute: cat /root/myoceanbase/oceanbase/run/observer.pid 
[2024-12-06 16:40:35.963] [DEBUG] -- exited code 1, error output:
[2024-12-06 16:40:35.963] [DEBUG] cat: /root/myoceanbase/oceanbase/run/observer.pid: No such file or directory
[2024-12-06 16:40:35.963] [DEBUG] 
[2024-12-06 16:40:35.963] [DEBUG] -- 192.168.124.32 start command construction
[2024-12-06 16:40:35.963] [DEBUG] -- update large_query_threshold to 600s because of scenario
[2024-12-06 16:40:35.964] [DEBUG] -- update enable_record_trace_log to False because of scenario
[2024-12-06 16:40:35.964] [DEBUG] -- update enable_syslog_recycle to 1 because of scenario
[2024-12-06 16:40:35.964] [DEBUG] -- update max_syslog_file_count to 300 because of scenario
[2024-12-06 16:40:35.964] [DEBUG] -- starting 192.168.124.46 observer
[2024-12-06 16:40:35.964] [DEBUG] -- root@192.168.124.46 export LD_LIBRARY_PATH='/root/myoceanbase/oceanbase/lib:'
[2024-12-06 16:40:35.964] [DEBUG] -- root@192.168.124.46 execute: cd /root/myoceanbase/oceanbase; /root/myoceanbase/oceanbase/bin/observer -r '192.168.124.46:2882:2881;192.168.124.49:2882:2881;192.168.124.32:2882:2881' -p 2881 -P 2882 -z 'zone1' -n 'myoceanbase' -c 1733473891 -d '/root/myoceanbase/oceanbase/store' -I '192.168.124.46' -o __min_full_resource_pool_memory=2147483648,enable_syslog_wf=False,max_syslog_file_count=4,memory_limit='10G',datafile_size='25G',system_memory='3G',log_disk_size='25G',cpu_count=16,datafile_maxsize='64G',datafile_next='6G',large_query_threshold='600s',enable_record_trace_log=False,enable_syslog_recycle=1 
[2024-12-06 16:40:36.077] [DEBUG] -- exited code 0
[2024-12-06 16:40:36.078] [DEBUG] -- root@192.168.124.46 export LD_LIBRARY_PATH=''
[2024-12-06 16:40:36.078] [DEBUG] -- starting 192.168.124.49 observer
[2024-12-06 16:40:36.079] [DEBUG] -- root@192.168.124.49 export LD_LIBRARY_PATH='/root/myoceanbase/oceanbase/lib:'
[2024-12-06 16:40:36.079] [DEBUG] -- root@192.168.124.49 execute: cd /root/myoceanbase/oceanbase; /root/myoceanbase/oceanbase/bin/observer -r '192.168.124.46:2882:2881;192.168.124.49:2882:2881;192.168.124.32:2882:2881' -p 2881 -P 2882 -z 'zone2' -n 'myoceanbase' -c 1733473891 -d '/root/myoceanbase/oceanbase/store' -I '192.168.124.49' -o __min_full_resource_pool_memory=2147483648,enable_syslog_wf=False,max_syslog_file_count=4,memory_limit='10G',datafile_size='25G',system_memory='3G',log_disk_size='25G',cpu_count=16,datafile_maxsize='64G',datafile_next='6G',large_query_threshold='600s',enable_record_trace_log=False,enable_syslog_recycle=1 
[2024-12-06 16:40:36.197] [DEBUG] -- exited code 0
[2024-12-06 16:40:36.197] [DEBUG] -- root@192.168.124.49 export LD_LIBRARY_PATH=''
[2024-12-06 16:40:36.197] [DEBUG] -- starting 192.168.124.32 observer
[2024-12-06 16:40:36.198] [DEBUG] -- root@192.168.124.32 export LD_LIBRARY_PATH='/root/myoceanbase/oceanbase/lib:'
[2024-12-06 16:40:36.198] [DEBUG] -- root@192.168.124.32 execute: cd /root/myoceanbase/oceanbase; /root/myoceanbase/oceanbase/bin/observer -r '192.168.124.46:2882:2881;192.168.124.49:2882:2881;192.168.124.32:2882:2881' -p 2881 -P 2882 -z 'zone3' -n 'myoceanbase' -c 1733473891 -d '/root/myoceanbase/oceanbase/store' -I '192.168.124.32' -o __min_full_resource_pool_memory=2147483648,enable_syslog_wf=False,max_syslog_file_count=4,memory_limit='10G',datafile_size='25G',system_memory='3G',log_disk_size='25G',cpu_count=16,datafile_maxsize='26G',datafile_next='3G',large_query_threshold='600s',enable_record_trace_log=False,enable_syslog_recycle=1 
[2024-12-06 16:40:36.318] [DEBUG] -- exited code 0
[2024-12-06 16:40:36.319] [DEBUG] -- root@192.168.124.32 export LD_LIBRARY_PATH=''
[2024-12-06 16:40:36.319] [DEBUG] -- start_obshell: False
[2024-12-06 16:40:36.320] [INFO] observer program health check
[2024-12-06 16:40:39.323] [DEBUG] -- 192.168.124.46 program health check
[2024-12-06 16:40:39.323] [DEBUG] -- root@192.168.124.46 execute: cat /root/myoceanbase/oceanbase/run/observer.pid 
[2024-12-06 16:40:39.390] [DEBUG] -- exited code 0
[2024-12-06 16:40:39.391] [DEBUG] -- root@192.168.124.46 execute: ls /proc/39638 
[2024-12-06 16:40:39.508] [DEBUG] -- exited code 0
[2024-12-06 16:40:39.509] [DEBUG] -- 192.168.124.46 observer[pid: 39638] started
[2024-12-06 16:40:39.509] [DEBUG] -- 192.168.124.49 program health check
[2024-12-06 16:40:39.509] [DEBUG] -- root@192.168.124.49 execute: cat /root/myoceanbase/oceanbase/run/observer.pid 
[2024-12-06 16:40:39.593] [DEBUG] -- exited code 0
[2024-12-06 16:40:39.593] [DEBUG] -- root@192.168.124.49 execute: ls /proc/31146 
[2024-12-06 16:40:39.710] [DEBUG] -- exited code 0
[2024-12-06 16:40:39.710] [DEBUG] -- 192.168.124.49 observer[pid: 31146] started
[2024-12-06 16:40:39.711] [DEBUG] -- 192.168.124.32 program health check
[2024-12-06 16:40:39.711] [DEBUG] -- root@192.168.124.32 execute: cat /root/myoceanbase/oceanbase/run/observer.pid 
[2024-12-06 16:40:39.780] [DEBUG] -- exited code 0
[2024-12-06 16:40:39.780] [DEBUG] -- root@192.168.124.32 execute: ls /proc/23050 
[2024-12-06 16:40:39.889] [DEBUG] -- exited code 0
[2024-12-06 16:40:39.890] [DEBUG] -- 192.168.124.32 observer[pid: 23050] started
[2024-12-06 16:40:39.890] [DEBUG] -- need_bootstrap: True
[2024-12-06 16:40:39.890] [DEBUG] - sub start ref count to 0
[2024-12-06 16:40:39.891] [DEBUG] - export start
[2024-12-06 16:40:39.891] [DEBUG] - plugin oceanbase-ce-py_script_start-4.3.0.0 result: True
[2024-12-06 16:40:39.891] [DEBUG] - Call oceanbase-ce-py_script_connect-4.2.2.0 for oceanbase-ce-4.3.4.0-100000162024110717.el7-5d59e837a0ecff1a6baa20f72747c343ac7c8dce
[2024-12-06 16:40:39.891] [DEBUG] - import connect
[2024-12-06 16:40:39.901] [DEBUG] - add connect ref count to 1
[2024-12-06 16:40:39.902] [DEBUG] -- connect obshell (192.168.124.46:2886)
[2024-12-06 16:40:39.902] [DEBUG] -- connect obshell (192.168.124.49:2886)
[2024-12-06 16:40:39.902] [DEBUG] -- connect obshell (192.168.124.32:2886)
[2024-12-06 16:40:39.903] [INFO] Connect to observer
[2024-12-06 16:40:39.903] [DEBUG] -- connect 192.168.124.46 -P2881 -uroot -p******
[2024-12-06 16:40:39.907] [DEBUG] -- connect 192.168.124.49 -P2881 -uroot -p******
[2024-12-06 16:40:39.911] [DEBUG] -- connect 192.168.124.32 -P2881 -uroot -p******
[2024-12-06 16:41:55.196] [DEBUG] -- execute sql: select 1. args: None
[2024-12-06 16:41:55.200] [DEBUG] - sub connect ref count to 0
[2024-12-06 16:41:55.200] [DEBUG] - export connect
[2024-12-06 16:41:55.200] [DEBUG] - plugin oceanbase-ce-py_script_connect-4.2.2.0 result: True
[2024-12-06 16:41:55.200] [INFO] Initialize oceanbase-ce
[2024-12-06 16:41:55.201] [DEBUG] - Call oceanbase-ce-py_script_bootstrap-4.2.2.0 for oceanbase-ce-4.3.4.0-100000162024110717.el7-5d59e837a0ecff1a6baa20f72747c343ac7c8dce
[2024-12-06 16:41:55.201] [DEBUG] - import bootstrap
[2024-12-06 16:41:55.204] [DEBUG] - add bootstrap ref count to 1
[2024-12-06 16:41:55.205] [DEBUG] -- bootstrap for components: dict_keys(['oceanbase-ce', 'obproxy-ce'])
[2024-12-06 16:41:55.205] [DEBUG] -- execute sql: set session ob_query_timeout=1000000000
[2024-12-06 16:41:55.205] [DEBUG] -- execute sql: set session ob_query_timeout=1000000000. args: None
[2024-12-06 16:41:55.208] [DEBUG] -- execute sql: alter system bootstrap REGION "sys_region" ZONE "zone1" SERVER "192.168.124.46:2882",REGION "sys_region" ZONE "zone2" SERVER "192.168.124.49:2882",REGION "sys_region" ZONE "zone3" SERVER "192.168.124.32:2882". args: None
[2024-12-06 16:47:35.948] [DEBUG] -- execute sql: alter user "root" IDENTIFIED BY %s. args: ['******']
[2024-12-06 16:47:38.055] [DEBUG] -- execute sql: select * from oceanbase.__all_server. args: None
[2024-12-06 16:47:38.083] [DEBUG] -- root@192.168.124.46 execute: ls /root/myoceanbase/oceanbase/.meta 
[2024-12-06 16:47:38.175] [DEBUG] -- exited code 2, error output:
[2024-12-06 16:47:38.176] [DEBUG] ls: cannot access '/root/myoceanbase/oceanbase/.meta': No such file or directory
[2024-12-06 16:47:38.176] [DEBUG] 
[2024-12-06 16:47:38.176] [DEBUG] -- 
[2024-12-06 16:47:38.176] [DEBUG] -- ls: cannot access '/root/myoceanbase/oceanbase/.meta': No such file or directory
[2024-12-06 16:47:38.176] [DEBUG] 
[2024-12-06 16:47:38.176] [DEBUG] -- root@192.168.124.46 execute: cat /root/myoceanbase/oceanbase/run/obshell.pid 
[2024-12-06 16:47:38.306] [DEBUG] -- exited code 1, error output:
[2024-12-06 16:47:38.306] [DEBUG] cat: /root/myoceanbase/oceanbase/run/obshell.pid: No such file or directory
[2024-12-06 16:47:38.306] [DEBUG] 
[2024-12-06 16:47:38.307] [DEBUG] -- root@192.168.124.46 execute: strings /root/myoceanbase/oceanbase/etc/observer.conf.bin 
[2024-12-06 16:47:38.428] [DEBUG] -- exited code 127, error output:
[2024-12-06 16:47:38.428] [DEBUG] bash: line 1: strings: command not found
[2024-12-06 16:47:38.428] [DEBUG] 
[2024-12-06 16:47:38.428] [DEBUG] -- 
[2024-12-06 16:47:38.428] [DEBUG] -- bash: line 1: strings: command not found
[2024-12-06 16:47:38.428] [DEBUG] 
[2024-12-06 16:47:38.429] [DEBUG] -- root@192.168.124.49 execute: ls /root/myoceanbase/oceanbase/.meta 
[2024-12-06 16:47:38.523] [DEBUG] -- exited code 2, error output:
[2024-12-06 16:47:38.523] [DEBUG] ls: cannot access '/root/myoceanbase/oceanbase/.meta': No such file or directory
[2024-12-06 16:47:38.523] [DEBUG] 
[2024-12-06 16:47:38.523] [DEBUG] -- 
[2024-12-06 16:47:38.523] [DEBUG] -- ls: cannot access '/root/myoceanbase/oceanbase/.meta': No such file or directory
[2024-12-06 16:47:38.523] [DEBUG] 
[2024-12-06 16:47:38.523] [DEBUG] -- root@192.168.124.49 execute: cat /root/myoceanbase/oceanbase/run/obshell.pid 
[2024-12-06 16:47:38.654] [DEBUG] -- exited code 1, error output:
[2024-12-06 16:47:38.655] [DEBUG] cat: /root/myoceanbase/oceanbase/run/obshell.pid: No such file or directory
[2024-12-06 16:47:38.655] [DEBUG] 
[2024-12-06 16:47:38.655] [DEBUG] -- root@192.168.124.49 execute: strings /root/myoceanbase/oceanbase/etc/observer.conf.bin 
[2024-12-06 16:47:38.781] [DEBUG] -- exited code 127, error output:
[2024-12-06 16:47:38.782] [DEBUG] bash: line 1: strings: command not found
[2024-12-06 16:47:38.782] [DEBUG] 
[2024-12-06 16:47:38.782] [DEBUG] -- 
[2024-12-06 16:47:38.782] [DEBUG] -- bash: line 1: strings: command not found
[2024-12-06 16:47:38.782] [DEBUG] 
[2024-12-06 16:47:38.782] [DEBUG] -- root@192.168.124.32 execute: ls /root/myoceanbase/oceanbase/.meta 
[2024-12-06 16:47:38.866] [DEBUG] -- exited code 2, error output:
[2024-12-06 16:47:38.867] [DEBUG] ls: cannot access '/root/myoceanbase/oceanbase/.meta': No such file or directory
[2024-12-06 16:47:38.867] [DEBUG] 
[2024-12-06 16:47:38.867] [DEBUG] -- 
[2024-12-06 16:47:38.867] [DEBUG] -- ls: cannot access '/root/myoceanbase/oceanbase/.meta': No such file or directory
[2024-12-06 16:47:38.867] [DEBUG] 
[2024-12-06 16:47:38.867] [DEBUG] -- root@192.168.124.32 execute: cat /root/myoceanbase/oceanbase/run/obshell.pid 
[2024-12-06 16:47:38.984] [DEBUG] -- exited code 1, error output:
[2024-12-06 16:47:38.985] [DEBUG] cat: /root/myoceanbase/oceanbase/run/obshell.pid: No such file or directory
[2024-12-06 16:47:38.985] [DEBUG] 
[2024-12-06 16:47:38.985] [DEBUG] -- root@192.168.124.32 execute: strings /root/myoceanbase/oceanbase/etc/observer.conf.bin 
[2024-12-06 16:47:39.100] [DEBUG] -- exited code 127, error output:
[2024-12-06 16:47:39.101] [DEBUG] bash: line 1: strings: command not found
[2024-12-06 16:47:39.101] [DEBUG] 
[2024-12-06 16:47:39.101] [DEBUG] -- 
[2024-12-06 16:47:39.101] [DEBUG] -- bash: line 1: strings: command not found
[2024-12-06 16:47:39.101] [DEBUG] 
[2024-12-06 16:47:39.102] [DEBUG] -- root@192.168.124.46 execute: cat /root/myoceanbase/oceanbase/run/obshell.pid 
[2024-12-06 16:47:39.180] [DEBUG] -- exited code 1, error output:
[2024-12-06 16:47:39.180] [DEBUG] cat: /root/myoceanbase/oceanbase/run/obshell.pid: No such file or directory
[2024-12-06 16:47:39.180] [DEBUG] 
[2024-12-06 16:47:39.181] [DEBUG] -- root@192.168.124.46 export OB_ROOT_PASSWORD=''******''
[2024-12-06 16:47:39.181] [DEBUG] -- start obshell: cd /root/myoceanbase/oceanbase; /root/myoceanbase/oceanbase/bin/obshell admin start --ip 192.168.124.46 --port 2886
[2024-12-06 16:47:39.181] [DEBUG] -- root@192.168.124.46 execute: cd /root/myoceanbase/oceanbase; /root/myoceanbase/oceanbase/bin/obshell admin start --ip 192.168.124.46 --port 2886 
[2024-12-06 16:49:58.531] [DEBUG] -- exited code 0
[2024-12-06 16:49:58.531] [DEBUG] -- root@192.168.124.49 execute: cat /root/myoceanbase/oceanbase/run/obshell.pid 
[2024-12-06 16:49:58.629] [DEBUG] -- exited code 1, error output:
[2024-12-06 16:49:58.629] [DEBUG] cat: /root/myoceanbase/oceanbase/run/obshell.pid: No such file or directory
[2024-12-06 16:49:58.629] [DEBUG] 
[2024-12-06 16:49:58.629] [DEBUG] -- root@192.168.124.49 export OB_ROOT_PASSWORD=''******''
[2024-12-06 16:49:58.630] [DEBUG] -- start obshell: cd /root/myoceanbase/oceanbase; /root/myoceanbase/oceanbase/bin/obshell admin start --ip 192.168.124.49 --port 2886
[2024-12-06 16:49:58.630] [DEBUG] -- root@192.168.124.49 execute: cd /root/myoceanbase/oceanbase; /root/myoceanbase/oceanbase/bin/obshell admin start --ip 192.168.124.49 --port 2886 
[2024-12-06 16:50:44.920] [DEBUG] -- exited code 0
[2024-12-06 16:50:44.920] [DEBUG] -- root@192.168.124.32 execute: cat /root/myoceanbase/oceanbase/run/obshell.pid 
[2024-12-06 16:50:45.018] [DEBUG] -- exited code 1, error output:
[2024-12-06 16:50:45.019] [DEBUG] cat: /root/myoceanbase/oceanbase/run/obshell.pid: No such file or directory
[2024-12-06 16:50:45.019] [DEBUG] 
[2024-12-06 16:50:45.019] [DEBUG] -- root@192.168.124.32 export OB_ROOT_PASSWORD=''******''
[2024-12-06 16:50:45.020] [DEBUG] -- start obshell: cd /root/myoceanbase/oceanbase; /root/myoceanbase/oceanbase/bin/obshell admin start --ip 192.168.124.32 --port 2886
[2024-12-06 16:50:45.020] [DEBUG] -- root@192.168.124.32 execute: cd /root/myoceanbase/oceanbase; /root/myoceanbase/oceanbase/bin/obshell admin start --ip 192.168.124.32 --port 2886 
[2024-12-06 16:51:33.795] [DEBUG] -- exited code 0
[2024-12-06 16:51:36.799] [DEBUG] -- send request to obshell: method: GET, url: /api/v1/status, data: {}, headers: None, params: None
[2024-12-06 16:51:36.815] [DEBUG] -- send request to obshell: method: GET, url: /api/v1/status, data: {}, headers: None, params: None
[2024-12-06 16:51:36.830] [DEBUG] -- send request to obshell: method: GET, url: /api/v1/status, data: {}, headers: None, params: None
[2024-12-06 16:51:39.991] [DEBUG] -- send request to obshell: method: GET, url: /api/v1/task/dag/maintain/agent, data: rKXi5FNjZytmLV4SvfX7vQ==, headers: None, params: None
[2024-12-06 16:51:39.992] [DEBUG] -- send request to obshell: method: GET, url: /api/v1/secret, data: {}, headers: None, params: None
[2024-12-06 16:51:40.001] [DEBUG] -- request obshell failed: <Response [404]>
[2024-12-06 16:51:40.016] [DEBUG] -- send request to obshell: method: GET, url: /api/v1/task/dag/maintain/agent, data: lttgDDGmPPxGWFvlj6hlDA==, headers: None, params: None
[2024-12-06 16:51:40.017] [DEBUG] -- send request to obshell: method: GET, url: /api/v1/secret, data: {}, headers: None, params: None
[2024-12-06 16:51:40.026] [DEBUG] -- request obshell failed: <Response [404]>
[2024-12-06 16:51:40.035] [DEBUG] -- send request to obshell: method: GET, url: /api/v1/task/dag/maintain/agent, data: kqAO8qDCrS+QI2/Bb0BZbA==, headers: None, params: None
[2024-12-06 16:51:40.035] [DEBUG] -- send request to obshell: method: GET, url: /api/v1/secret, data: {}, headers: None, params: None
[2024-12-06 16:51:40.159] [DEBUG] -- send request to obshell: method: GET, url: /api/v1/task/dag/23232267296028861, data: kqAO8qDCrS+QI2/Bb0BZbA==, headers: None, params: None
[2024-12-06 16:51:40.159] [DEBUG] -- send request to obshell: method: GET, url: /api/v1/secret, data: {}, headers: None, params: None
[2024-12-06 16:51:41.171] [DEBUG] -- send request to obshell: method: GET, url: /api/v1/task/dag/23232267296028861, data: kqAO8qDCrS+QI2/Bb0BZbA==, headers: None, params: None
[2024-12-06 16:51:41.171] [DEBUG] -- send request to obshell: method: GET, url: /api/v1/secret, data: {}, headers: None, params: None
[2024-12-06 16:51:45.724] [DEBUG] -- send request to obshell: method: GET, url: /api/v1/task/dag/23232267296028861, data: kqAO8qDCrS+QI2/Bb0BZbA==, headers: None, params: None
[2024-12-06 16:51:45.725] [DEBUG] -- send request to obshell: method: GET, url: /api/v1/secret, data: {}, headers: None, params: None
[2024-12-06 16:51:47.877] [DEBUG] -- send request to obshell: method: GET, url: /api/v1/task/dag/23232267296028861, data: kqAO8qDCrS+QI2/Bb0BZbA==, headers: None, params: None
[2024-12-06 16:51:47.878] [DEBUG] -- send request to obshell: method: GET, url: /api/v1/secret, data: {}, headers: None, params: None
[2024-12-06 16:51:51.226] [DEBUG] -- send request to obshell: method: GET, url: /api/v1/task/dag/23232267296028861, data: kqAO8qDCrS+QI2/Bb0BZbA==, headers: None, params: None
[2024-12-06 16:51:51.226] [DEBUG] -- send request to obshell: method: GET, url: /api/v1/secret, data: {}, headers: None, params: None
[2024-12-06 16:51:56.249] [DEBUG] -- request obshell failed: <Response [404]>
[2024-12-06 16:51:56.249] [DEBUG] -- find take over dag failed, count: 600
[2024-12-06 16:51:57.253] [ERROR] obshell take over failed
[2024-12-06 16:51:57.253] [DEBUG] - sub bootstrap ref count to 0
[2024-12-06 16:51:57.253] [DEBUG] - export bootstrap
[2024-12-06 16:51:57.253] [DEBUG] - plugin oceanbase-ce-py_script_bootstrap-4.2.2.0 result: False
[2024-12-06 16:51:57.253] [INFO] [ERROR] obshell take over failed
[2024-12-06 16:51:57.253] [INFO] 
[2024-12-06 16:51:57.253] [ERROR] Cluster init failed

【Reproduction steps】The configuration is as follows:

user:
  username: root
  password: *****
  port: 22
oceanbase-ce:
  version: 4.3.4.0
  release: 100000162024110717.el7
  package_hash: 5d59e837a0ecff1a6baa20f72747c343ac7c8dce
  192.168.124.46:
    zone: zone1
    datafile_maxsize: 64G
    datafile_next: 6G
  192.168.124.49:
    zone: zone2
    datafile_maxsize: 64G
    datafile_next: 6G
  192.168.124.32:
    zone: zone3
    datafile_maxsize: 26G
    datafile_next: 3G
  servers:
  - 192.168.124.46
  - 192.168.124.49
  - 192.168.124.32
  global:
    appname: myoceanbase
    root_password: Yanfa2023@
    mysql_port: 2881
    rpc_port: 2882
    home_path: /root/myoceanbase/oceanbase
    scenario: htap
    cluster_id: 1733473891
    proxyro_password: iMoFmDyt3U
    enable_syslog_wf: false
    max_syslog_file_count: 4
    memory_limit: 10G
    datafile_size: 25G
    system_memory: 3G
    log_disk_size: 25G
    cpu_count: 16
    production_mode: false
obproxy-ce:
  version: 4.3.2.0
  package_hash: fd779e401be448715254165b1a4f7205c4c1bda5
  release: 26.el7
  servers:
  - 192.168.124.46
  global:
    prometheus_listen_port: 2884
    listen_port: 2883
    rpc_listen_port: 2885
    home_path: /root/myoceanbase/obproxy
    obproxy_sys_password: Bjn@W21dzSW~JP
    skip_proxy_sys_private_check: true
    enable_strict_kernel_release: false
    enable_cluster_checkout: false
  192.168.124.46:
    proxy_id: 4135
    client_session_id_version: 2
  depends:
  - oceanbase-ce

【Attachments and logs】We recommend collecting diagnostic information with obdiag, OceanBase's agile diagnostics tool; see the linked post for details (right-click to open):

【SOP series 22】Troubleshooting step one (self-service diagnosis and diagnostic-info collection)
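
For example, a minimal obdiag invocation to pull observer logs around the failure window (a sketch; this assumes obdiag is installed on the OBD host, and the flags may differ slightly between obdiag versions):

# Gather cluster logs covering the time range of the failed deployment above
obdiag gather log --from "2024-12-06 16:40:00" --to "2024-12-06 17:00:00"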

【Note】The forum assistant, built on an LLM with RAG over the open-source docs, is now in open testing. Type [@论坛小助手] in your post to summon it. Feel free to try it out!

Based on the information you provided, you hit a problem while deploying OceanBase via the OBD Web GUI. Specifically:

  1. When starting obshell, the file /root/myoceanbase/oceanbase/run/obshell.pid did not exist.
  2. After obshell started, the API requests for its status returned 404, which ultimately led to "obshell take over failed".

Analysis and suggested fixes

1. Check that obshell is installed correctly

Make sure obshell is correctly installed on every node and that the path is right. You can manually check whether /root/myoceanbase/oceanbase/bin/obshell exists:

ls /root/myoceanbase/oceanbase/bin/obshell

If the file does not exist, you need to reinstall obshell.

2. Check the obshell startup log

Inspect obshell's startup log for error messages. The log files are usually under /root/myoceanbase/oceanbase/log/.

cat /root/myoceanbase/oceanbase/log/obshell.log

3. Check the obshell configuration

Make sure obshell's configuration is correct. The configuration files are usually under /root/myoceanbase/oceanbase/etc/.

cat /root/myoceanbase/oceanbase/etc/obshell.conf

4. Start obshell manually

Try starting obshell by hand to confirm whether it can run normally.

cd /root/myoceanbase/oceanbase
./bin/obshell admin start --ip 192.168.124.46 --port 2886

5. Check the network and firewall

Make sure all nodes can reach each other over the network and that the firewall is not blocking the required ports (such as 2886).

telnet 192.168.124.46 2886
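
If telnet is not installed on the host, nc can run the same reachability check (assuming a netcat implementation such as the nmap-ncat package is available):

# Probe TCP port 2886 without sending data (-z: scan only, -v: verbose)
nc -zv 192.168.124.46 2886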

6. Check the strings command

The log shows strings: command not found, which means the strings utility is not installed. You can install the binutils package to fix this.

sudo yum install binutils

Summary of suggestions

  1. Reinstall obshell: make sure obshell is correctly installed on all nodes.
  2. Check the configuration files: make sure obshell's configuration is correct.
  3. Start obshell manually: try a manual start to confirm it can run normally.
  4. Check the network and firewall: make sure all nodes can communicate and no required port is blocked.
  5. Install the strings command: make sure strings is available.

If the steps above still do not resolve the issue, we recommend contacting the OceanBase technical support team for more detailed help. Hope this helps!

Retrieved documents:

  1. Configure a globally unique Client Session ID
  2. Deploy ODP with obd
  3. ODP support for an RPC service port
  4. Deploy ODP via the command line
  5. OceanBase installation and deployment tool V2.8.0
  6. Deploy an OceanBase cluster via the graphical interface
  7. Cluster diagnostics
  8. Install and configure obd
  9. ob_error
  10. Deploy an OceanBase cluster via the obd GUI
  11. Quick start with OceanBase Community Edition
  12. Deployment FAQ

As the assistant suggested, please post the obshell logs so we can take a look. It feels like obshell was killed.

The log files are attached:
obshell_192.168.124.46.log (489.5 KB)
obshell_192.168.124.49.log (220.3 KB)
obshell_192.168.124.32.log (262.1 KB)


The response returned when accessing port 2886 is shown in the attached screenshot.

Thanks for providing more information, in particular the obshell log files. Based on the logs and the response from port 2886, we can analyze further.

Analysis

  1. 404 error

    • Accessing http://192.168.124.46:2886/ returns 404, which suggests the obshell service did not start correctly or is not listening on that port.
  2. Log file analysis

    • Please check the obshell log files for any error messages. The log file for each node:
      • obshell_192.168.124.46.log
      • obshell_192.168.124.49.log
      • obshell_192.168.124.32.log

Log file analysis

1. obshell_192.168.124.46.log

cat obshell_192.168.124.46.log

2. obshell_192.168.124.49.log

cat obshell_192.168.124.49.log

3. obshell_192.168.124.32.log

cat obshell_192.168.124.32.log

Key points

  1. Check the obshell process

    • Confirm whether the obshell process is running. You can check it with:
      ps aux | grep obshell
      
  2. Check the port listener

    • Confirm whether obshell is listening on port 2886. You can check with the command below (see also the ss alternative after this list):
      netstat -tuln | grep 2886
      
  3. Check the error messages in the logs

    • Look through the log files for error messages, especially those emitted at startup. Common causes include configuration-file errors, permission problems, and missing dependency libraries.
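
As a supplement to item 2 above: on hosts without net-tools, the same check can be done with ss, which ships with iproute and is typically preinstalled:

# List listening TCP/UDP sockets and filter for obshell's port
ss -tuln | grep 2886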

Example analysis

Suppose obshell_192.168.124.46.log contained the following error message:

2024-12-06 16:47:38.428 [DEBUG] -- bash: line 1: strings: command not found

That would indicate the strings command is not installed, which installing the binutils package resolves:

sudo yum install binutils

Summary of suggestions

  1. Check the obshell process

    • Confirm whether the obshell process is running.
    • If it is not running, try starting obshell manually:
      cd /root/myoceanbase/oceanbase
      ./bin/obshell admin start --ip 192.168.124.46 --port 2886
      
  2. Check the port listener

    • Confirm whether obshell is listening on port 2886.
    • If the port is not being listened on, check obshell's configuration file and start command.
  3. Check the error messages in the logs

    • Look through the log files for error messages, especially those emitted at startup.
    • Fix whatever the logs point to, such as installing missing commands or adjusting configuration files.
  4. Reinstall obshell

    • If the steps above still do not help, consider reinstalling obshell.

Contact technical support

If the steps above still do not resolve the issue, we recommend contacting the OceanBase technical support team for more detailed help. Hope this helps!

For details, see the following documents:

  1. Query cluster information
  2. Query status
  3. Get the details of a task
  4. Get the details of a node
  5. Query an agent's unfinished tasks
  6. Query obshell's last maintenance task
  7. Query all unfinished tasks

(End of the assistant's reply. If it did not solve your problem, please continue asking in the thread and wait for other members to respond. Thanks!)

Analysis of the obshell logs shows that the error occurred when obd called an obshell API, so obd concluded that obshell had failed to take over the OB cluster.
In fact, obshell had already taken over the cluster normally. You can check the cluster status via /api/v1/ob/info, or with obshell cluster show; see this doc: https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000000821500.
For now, you can try bringing the cluster back up from the OBD command line with obd cluster start myoceanbase.
We will also optimize this in OBD 3.0 by retrying the obshell API requests multiple times.
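
For instance, a quick way to verify the take-over state described above (a sketch; the host, port, and deploy name come from this thread, and the info endpoint is assumed to be reachable without extra authentication):

# Ask the obshell agent on a node for the state of the cluster it manages
curl -s http://192.168.124.46:2886/api/v1/ob/info

# Or inspect it from a node's home_path with the obshell CLI
cd /root/myoceanbase/oceanbase && ./bin/obshell cluster show

# If the cluster looks healthy, retry the start from the OBD CLI
obd cluster start myoceanbase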

By the way, would you mind sharing your deployment environment? For example the disks: SSD, spinning disk, or cloud disk?


The document link cited above is no longer accessible.


Environment info after redeploying:

Hardware:

  • Virtualization: Proxmox VE (PVE)
  • OS: OpenEuler 24.03 LTS x86
  • Disk: 128 GB mechanical hard drive (HDD)
  • Memory: 16 GB
  • CPU: 6 cores

I can provide the obd cluster logs.

Details:

Node                   IP              Spec
OBD WEB deploy host    192.168.124.38  6 cores / 8 GB RAM / 128 GB disk
observer + OBProxy     192.168.124.46  6 cores / 16 GB RAM / 128 GB disk
observer               192.168.124.49  6 cores / 16 GB RAM / 128 GB disk
observer               192.168.124.52  6 cores / 16 GB RAM / 128 GB disk

Deployment procedure:

All-in-one installation

yum -y install vim tar net-tools binutils ntpdate

# 0. Temporarily enlarge the tmp filesystem
mount -o remount,size=16G /tmp
# 1. Download the all-in-one installer package
bash -c "$(curl -s https://obbusiness-private.oss-cn-shanghai.aliyuncs.com/download-center/opensource/oceanbase-all-in-one/installer.sh)"
# Load the environment variables
source ~/.oceanbase-all-in-one/bin/env.sh
# Install OceanBase with obd

# Run the obd web GUI to manage cluster deployment
obd web &
# Open the management page and deploy the cluster

The following tuning comes from the official recommendations and must be run on every machine!

Pre-installation preparation

yum -y install vim tar net-tools binutils ntpdate
1. NTP time sync
ntpdate time1.aliyun.com
ntpdate cn.pool.ntp.org
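
Note that ntpdate syncs the clock only once, while the cluster nodes need to stay closely aligned. One lightweight option is a periodic resync via cron (a sketch assuming cron is available; a daemon such as chronyd or ntpd is the more robust long-term choice):

# Append a resync job that runs every 10 minutes (hypothetical schedule; adjust as needed)
(crontab -l 2>/dev/null; echo "*/10 * * * * /usr/sbin/ntpdate time1.aliyun.com >/dev/null 2>&1") | crontab -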
2. Configure limits.conf
vim /etc/security/limits.conf

Append the following lines at the end of the file and save:

root soft nofile 655350
root hard nofile 655350
* soft nofile 655350
* hard nofile 655350
* soft stack unlimited
* hard stack unlimited
* soft nproc 655360
* hard nproc 655360
* soft core unlimited
* hard core unlimited

After the changes, open a new session or reboot; ulimit -a should then show:

real-time non-blocking time  (microseconds, -R) unlimited
core file size              (blocks, -c) unlimited
data seg size               (kbytes, -d) unlimited
scheduling priority                 (-e) 0
file size                   (blocks, -f) unlimited
pending signals                     (-i) 29446
max locked memory           (kbytes, -l) 65536
max memory size             (kbytes, -m) unlimited
open files                          (-n) 655350
pipe size                (512 bytes, -p) 8
POSIX message queues         (bytes, -q) 819200
real-time priority                  (-r) 0
stack size                  (kbytes, -s) unlimited
cpu time                   (seconds, -t) unlimited
max user processes                  (-u) 655360
virtual memory              (kbytes, -v) unlimited
file locks                          (-x) unlimited
3. Configure sysctl.conf
vim /etc/sysctl.conf

Add the following to the /etc/sysctl.conf configuration file:

#for oceanbase
# raise the kernel async I/O limit
fs.aio-max-nr = 1048576

# network tuning
net.core.somaxconn = 2048
net.core.netdev_max_backlog = 10000 
net.core.rmem_default = 16777216 
net.core.wmem_default = 16777216 
net.core.rmem_max = 16777216 
net.core.wmem_max = 16777216

net.ipv4.ip_forward = 0 
net.ipv4.conf.default.rp_filter = 1 
net.ipv4.conf.default.accept_source_route = 0 
net.ipv4.tcp_syncookies = 1 
net.ipv4.tcp_rmem = 4096 87380 16777216 
net.ipv4.tcp_wmem = 4096 65536 16777216 
net.ipv4.tcp_max_syn_backlog = 16384 
net.ipv4.tcp_fin_timeout = 15 
net.ipv4.tcp_slow_start_after_idle = 0

vm.swappiness = 0
vm.min_free_kbytes = 2097152
vm.overcommit_memory = 0

fs.file-max = 6573688
fs.pipe-user-pages-soft = 0

# raise the number of virtual memory areas (memory maps) a process may own
vm.max_map_count = 655360

# this is the OceanBase data directory
kernel.core_pattern = /data/core-%e-%p-%t

After updating the file, run the following command to load the configuration and make it take effect:

sysctl -p
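
Optionally, spot-check that the new values took effect (a quick verification; the keys are picked from the block above):

# Print the current values of a few of the tuned kernel parameters
sysctl fs.aio-max-nr vm.max_map_count net.core.somaxconn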
4. Disable the firewall
systemctl disable firewalld 
systemctl stop firewalld
systemctl status firewalld
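
If disabling the firewall entirely is not acceptable in your environment, an alternative is to keep firewalld running and open only the ports used by this deployment (a sketch; the port list is taken from the config above):

# Open the OceanBase/ODP/obshell ports (2881-2886) permanently, then reload
firewall-cmd --permanent --add-port={2881,2882,2883,2884,2885,2886}/tcp
firewall-cmd --reload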
5. Disable SELinux

Open the /etc/selinux/config configuration file:

vim /etc/selinux/config

In /etc/selinux/config, change the corresponding entry to:

SELINUX=disabled

Run the following command (or reboot the server) to make the change take effect:

setenforce 0

Check whether the change took effect:

sestatus

6. Install the cluster via the OBD WEB graphical interface

Use obd web as shown above.

The document should be this one; the original link had an extra period at the end: https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000000821500
Thanks for the feedback. For now you can try bringing the cluster back up by running obd cluster start myoceanbase from the OBD command line. Have you tried that?


I just reinstalled everything. After the obd GUI reported the error, and before running the obd cluster start myoceanbase command, the ps output on the nodes looks like this:
On node 192.168.124.46 and the other nodes:
the obshell processes are all running.

Please post the GUI deployment error so we can take a look, and run obd --version to check which obd version you have.
From the screenshot, OB seems to be up; try logging in from the command line.

The GUI deployment log is attached:
OceanBase-DataBase-Error.txt (51.3 KB)

obd version:

Also, how do I log in from the command line? :handshake:

Connected successfully with the Navicat client (screenshot attached).


Does this count as a successful deployment? And if I want to start using the database, is creating a tenant the next step?

Effectively, yes, it succeeded.
Did your obproxy come up? If not, use the single-component start feature:
obd cluster start xxxx -c obproxy-ce

I tried running the command obd cluster start myoceanbase -c obproxy-ce.
Response:

[ERROR] Another app is currently holding the obd lock.
Trace ID: 001db0ca-b601-11ef-8240-bc24112b4e4e
If you want to view detailed obd logs, please run: obd display-trace 001db0ca-b601-11ef-8240-bc24112b4e4e

Running obd display-trace 001db0ca-b601-11ef-8240-bc24112b4e4e
shows the following:

[2024-12-09 15:41:28.494] [DEBUG] - cmd: ['myoceanbase']
[2024-12-09 15:41:28.495] [DEBUG] - opts: {'servers': None, 'components': 'obproxy-ce', 'force_delete': None, 'strict_check': None, 'without_parameter': None}
[2024-12-09 15:41:28.495] [DEBUG] - mkdir /root/.obd/lock/
[2024-12-09 15:41:28.495] [DEBUG] - unknown lock mode
[2024-12-09 15:41:28.496] [DEBUG] - try to get share lock /root/.obd/lock/global
[2024-12-09 15:41:28.496] [DEBUG] - share lock `/root/.obd/lock/global`, count 1
[2024-12-09 15:41:28.496] [DEBUG] - Get Deploy by name
[2024-12-09 15:41:28.496] [DEBUG] - mkdir /root/.obd/cluster/
[2024-12-09 15:41:28.497] [DEBUG] - mkdir /root/.obd/config_parser/
[2024-12-09 15:41:28.497] [DEBUG] - try to get exclusive lock /root/.obd/lock/deploy_myoceanbase
[2024-12-09 15:41:28.499] [ERROR] Another app is currently holding the obd lock.
[2024-12-09 15:41:28.499] [ERROR] Traceback (most recent call last):
[2024-12-09 15:41:28.499] [ERROR]   File "_lock.py", line 64, in _ex_lock
[2024-12-09 15:41:28.500] [ERROR]   File "tool.py", line 500, in exclusive_lock_obj
[2024-12-09 15:41:28.500] [ERROR] BlockingIOError: [Errno 11] Resource temporarily unavailable
[2024-12-09 15:41:28.500] [ERROR]
[2024-12-09 15:41:28.500] [ERROR] During handling of the above exception, another exception occurred:
[2024-12-09 15:41:28.500] [ERROR]
[2024-12-09 15:41:28.500] [ERROR] Traceback (most recent call last):
[2024-12-09 15:41:28.500] [ERROR]   File "_lock.py", line 85, in ex_lock
[2024-12-09 15:41:28.500] [ERROR]   File "_lock.py", line 66, in _ex_lock
[2024-12-09 15:41:28.500] [ERROR] _errno.LockError: [Errno 11] Resource temporarily unavailable
[2024-12-09 15:41:28.500] [ERROR]
[2024-12-09 15:41:28.500] [ERROR] During handling of the above exception, another exception occurred:
[2024-12-09 15:41:28.500] [ERROR]
[2024-12-09 15:41:28.500] [ERROR] Traceback (most recent call last):
[2024-12-09 15:41:28.500] [ERROR]   File "obd.py", line 251, in do_command
[2024-12-09 15:41:28.500] [ERROR]   File "obd.py", line 941, in _do_command
[2024-12-09 15:41:28.500] [ERROR]   File "core.py", line 2079, in start_cluster
[2024-12-09 15:41:28.500] [ERROR]   File "_deploy.py", line 1864, in get_deploy_config
[2024-12-09 15:41:28.500] [ERROR]   File "_deploy.py", line 1851, in _lock
[2024-12-09 15:41:28.500] [ERROR]   File "_lock.py", line 283, in deploy_ex_lock
[2024-12-09 15:41:28.500] [ERROR]   File "_lock.py", line 262, in _ex_lock
[2024-12-09 15:41:28.500] [ERROR]   File "_lock.py", line 254, in _lock
[2024-12-09 15:41:28.500] [ERROR]   File "_lock.py", line 185, in lock
[2024-12-09 15:41:28.501] [ERROR]   File "_lock.py", line 90, in ex_lock
[2024-12-09 15:41:28.501] [ERROR] _errno.LockError: [Errno 11] Resource temporarily unavailable
[2024-12-09 15:41:28.501] [ERROR]
[2024-12-09 15:41:28.501] [DEBUG] - share lock /root/.obd/lock/global release, count 0
[2024-12-09 15:41:28.501] [DEBUG] - unlock /root/.obd/lock/global
[2024-12-09 15:41:28.501] [INFO] Trace ID: 001db0ca-b601-11ef-8240-bc24112b4e4e
[2024-12-09 15:41:28.501] [INFO] If you want to view detailed obd logs, please run: obd display-trace 001db0ca-b601-11ef-8240-bc24112b4e4e
[2024-12-09 15:41:28.501] [DEBUG] - unlock /root/.obd/lock/deploy_myoceanbase

It seems obd web cannot coexist with other obd commands. After I found the obd web process and killed it, rerunning obd cluster start myoceanbase -c obproxy-ce succeeded. The output is shown below:
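
For reference, a minimal way to do what is described above (a sketch; the grep pattern is an assumption about how the backgrounded process appears in ps):

# Find the backgrounded `obd web` process that is holding the obd lock
ps -ef | grep 'obd web' | grep -v grep
# Stop it (replace <pid> with the PID from the output above), then retry
kill <pid>
obd cluster start myoceanbase -c obproxy-ce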

Yes; the error in your earlier post was exactly [ERROR] Another app is currently holding the obd lock.


Yes, the cluster status is normal now.


Great, thanks.
A few more questions:

  1. Having installed only the OceanBase Database and OBProxy components, how do I add OCP afterwards? Does everything need to be reinstalled, or can OCP be installed incrementally?

  2. While trying to create a tenant, I noticed a problem:

    • Logging into the proxy on port 2883 with the (proxy admin) credentials shown in the GUI, I cannot run the USE oceanbase; statement:

      obclient -h192.168.124.46 -P2883 -uroot@proxysys -p'MEEV{6wDaeW,IMM-.hO7C2Ph' -Doceanbase -A 
      
    • Logging into port 2881 with the database account, USE oceanbase; works:

      obclient -h192.168.124.46 -P2881 -uroot@sys -p'Yanfa2023@' -Doceanbase -A
      

    So should I create tenants with the 2881 account, rather than with the 2883 proxy admin user?