Cluster deployment fails on Ubuntu 20.04


2024-03-14 09:46:32,660 INFO dispatch (idle_shutdown.py:36) [f91cbf23a9984af8afa4c92350d594ce] dispatch request and update last request time
2024-03-14 09:46:32,662 INFO dispatch (request_response_log.py:43) [f91cbf23a9984af8afa4c92350d594ce] app send response, code: 200
2024-03-14 09:46:33,102 INFO dispatch (request_response_log.py:40) [68be1530ed3c4580843d58542d6521ec] app receive request, method: GET, url: http://172.16.89.224:8680/api/v1/deployments/kfeyoceanbase/install, query_params: , body: , from: 172.16.94.250:2244
2024-03-14 09:46:33,104 INFO dispatch (idle_shutdown.py:36) [68be1530ed3c4580843d58542d6521ec] dispatch request and update last request time
2024-03-14 09:46:33,105 INFO dispatch (request_response_log.py:40) [fbbca6322ec242d8a726e53c52e8a6ee] app receive request, method: GET, url: http://172.16.89.224:8680/api/v1/deployments/kfeyoceanbase/install/log, query_params: , body: , from: 172.16.94.250:2243
2024-03-14 09:46:33,107 INFO dispatch (idle_shutdown.py:36) [fbbca6322ec242d8a726e53c52e8a6ee] dispatch request and update last request time
2024-03-14 09:46:33,108 INFO dispatch (request_response_log.py:43) [68be1530ed3c4580843d58542d6521ec] app send response, code: 200
2024-03-14 09:46:33,111 INFO dispatch (request_response_log.py:43) [fbbca6322ec242d8a726e53c52e8a6ee] app send response, code: 200
2024-03-14 09:46:34,146 INFO dispatch (request_response_log.py:40) [3da10a0afe024a5884b7a7ff9c6b8a37] app receive request, method: GET, url: http://172.16.89.224:8680/api/v1/deployments/kfeyoceanbase/install, query_params: , body: , from: 172.16.94.250:2242
2024-03-14 09:46:34,148 INFO dispatch (idle_shutdown.py:36) [3da10a0afe024a5884b7a7ff9c6b8a37] dispatch request and update last request time
2024-03-14 09:46:34,152 INFO dispatch (request_response_log.py:43) [3da10a0afe024a5884b7a7ff9c6b8a37] app send response, code: 200
2024-03-14 09:46:34,295 INFO dispatch (request_response_log.py:40) [a3531b7b58804785b5e2df98cdef6929] app receive request, method: GET, url: http://172.16.89.224:8680/api/v1/deployments/kfeyoceanbase/install/log, query_params: , body: , from: 172.16.94.250:2242
2024-03-14 09:46:34,297 INFO dispatch (idle_shutdown.py:36) [a3531b7b58804785b5e2df98cdef6929] dispatch request and update last request time
2024-03-14 09:46:34,301 INFO dispatch (request_response_log.py:43) [a3531b7b58804785b5e2df98cdef6929] app send response, code: 200
2024-03-14 09:46:34,897 WARNING _do_install (deployment_handler.py:303) [None] deploy kfeyoceanbase failed
2024-03-14 09:46:34,897 INFO _do_install (deployment_handler.py:304) [None] finish do deploy kfeyoceanbase
2024-03-14 09:46:34,897 INFO _do_install (deployment_handler.py:305) [None] start do start kfeyoceanbase
2024-03-14 09:46:34,898 ERROR wrapper (task.py:140) [598dc1b464c845e4b05a486eaa79a12d] task kfeyoceanbase got exception
Traceback (most recent call last):
  File "service/common/task.py", line 126, in wrapper
  File "concurrent/futures/_base.py", line 444, in result
  File "concurrent/futures/_base.py", line 389, in __get_result
  File "concurrent/futures/thread.py", line 57, in run
  File "service/handler/deployment_handler.py", line 331, in _do_install
Exception: task kfeyoceanbase deploy failed
2024-03-14 09:46:34,899 INFO wrapper (task.py:143) [598dc1b464c845e4b05a486eaa79a12d] task kfeyoceanbase finished failed
2024-03-14 09:46:35,204 INFO dispatch (request_response_log.py:40) [995fff96152448e0bab852ba62d64a2c] app receive request, method: GET, url: http://172.16.89.224:8680/api/v1/deployments/kfeyoceanbase/install, query_params: , body: , from: 172.16.94.250:2242
2024-03-14 09:46:35,205 INFO dispatch (idle_shutdown.py:36) [995fff96152448e0bab852ba62d64a2c] dispatch request and update last request time
2024-03-14 09:46:35,210 INFO dispatch (request_response_log.py:43) [995fff96152448e0bab852ba62d64a2c] app send response, code: 200
2024-03-14 09:46:35,367 INFO dispatch (request_response_log.py:40) [aeed2f7bb1834f98889e41d9aab9df6c] app receive request, method: GET, url: http://172.16.89.224:8680/api/v1/deployments/kfeyoceanbase/install/log, query_params: , body: , from: 172.16.94.250:2242
2024-03-14 09:46:35,368 INFO dispatch (idle_shutdown.py:36) [aeed2f7bb1834f98889e41d9aab9df6c] dispatch request and update last request time
2024-03-14 09:46:35,373 INFO dispatch (request_response_log.py:43) [aeed2f7bb1834f98889e41d9aab9df6c] app send response, code: 200
2024-03-14 09:46:37,299 INFO dispatch (request_response_log.py:40) [31d21ab32f8e4e2fbc3bf8e7c2b6948a] app receive request, method: GET, url: http://172.16.89.224:8680/api/v1/deployments/kfeyoceanbase/connection, query_params: , body: , from: 172.16.94.250:2244
2024-03-14 09:46:37,300 INFO dispatch (request_response_log.py:40) [f13e8f3fb82c46bc8812b99c19dec216] app receive request, method: GET, url: http://172.16.89.224:8680/assets/failed.png, query_params: , body: , from: 172.16.94.250:2242
2024-03-14 09:46:37,302 INFO dispatch (request_response_log.py:40) [5efd73e02cc24567b755bcad869c5cfc] app receive request, method: GET, url: http://172.16.89.224:8680/api/v1/deployments/kfeyoceanbase/report, query_params: , body: , from: 172.16.94.250:2243
2024-03-14 09:46:37,303 INFO dispatch (idle_shutdown.py:36) [31d21ab32f8e4e2fbc3bf8e7c2b6948a] dispatch request and update last request time
2024-03-14 09:46:37,303 INFO dispatch (idle_shutdown.py:36) [f13e8f3fb82c46bc8812b99c19dec216] dispatch request and update last request time
2024-03-14 09:46:37,303 WARNING list_connection_info (deployment_handler.py:417) [31d21ab32f8e4e2fbc3bf8e7c2b6948a] component oceanbase-ce start failed
2024-03-14 09:46:37,303 WARNING list_connection_info (deployment_handler.py:417) [31d21ab32f8e4e2fbc3bf8e7c2b6948a] component obagent start failed
2024-03-14 09:46:37,304 WARNING list_connection_info (deployment_handler.py:417) [31d21ab32f8e4e2fbc3bf8e7c2b6948a] component ocp-express start failed
2024-03-14 09:46:37,304 INFO dispatch (idle_shutdown.py:36) [5efd73e02cc24567b755bcad869c5cfc] dispatch request and update last request time
2024-03-14 09:46:37,307 INFO dispatch (request_response_log.py:43) [31d21ab32f8e4e2fbc3bf8e7c2b6948a] app send response, code: 200
2024-03-14 09:46:37,308 INFO dispatch (request_response_log.py:43) [5efd73e02cc24567b755bcad869c5cfc] app send response, code: 200
2024-03-14 09:46:37,313 INFO dispatch (request_response_log.py:43) [f13e8f3fb82c46bc8812b99c19dec216] app send response, code: 304
2024-03-14 10:16:38,178 INFO check_for_idle (idle_shutdown.py:48) [ba17fc67408c44cab71ff73bd1e1fcea] Shutting down due to inactivity.
2024-03-14 10:16:38,179 INFO check_for_idle (idle_shutdown.py:50) [ba17fc67408c44cab71ff73bd1e1fcea] shutdown pid 3847886
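
Most of the log above is polling traffic from the web UI; only a few WARNING and ERROR lines describe the failure. A minimal sketch for pulling those lines out, assuming the line layout seen above (timestamp, level, function, source location, request id, message). The names `LOG_LINE` and `important_lines` are made up for this illustration and are not part of obd.

```python
import re

# Matches lines such as:
# 2024-03-14 09:46:34,897 WARNING _do_install (deployment_handler.py:303) [None] deploy kfeyoceanbase failed
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) (?P<func>\S+) \((?P<src>[^)]+)\) "
    r"\[(?P<req>[^\]]*)\] (?P<msg>.*)$"
)


def important_lines(text, levels=("WARNING", "ERROR")):
    """Return (timestamp, level, message) for log lines at the given levels."""
    hits = []
    for line in text.splitlines():
        m = LOG_LINE.match(line)
        if m and m.group("level") in levels:
            hits.append((m.group("ts"), m.group("level"), m.group("msg")))
    return hits


# Three lines copied from the log above.
sample = """\
2024-03-14 09:46:34,301 INFO dispatch (request_response_log.py:43) [a3531b7b58804785b5e2df98cdef6929] app send response, code: 200
2024-03-14 09:46:34,897 WARNING _do_install (deployment_handler.py:303) [None] deploy kfeyoceanbase failed
2024-03-14 09:46:34,898 ERROR wrapper (task.py:140) [598dc1b464c845e4b05a486eaa79a12d] task kfeyoceanbase got exception
"""

for ts, level, msg in important_lines(sample):
    print(ts, level, msg)
```

Traceback lines do not match the pattern and are skipped, so this only gives a quick overview; the full log is still needed for diagnosis.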

Run cd ~/.obd/log/ and check whether there is an obd file in that directory.

Which version are you on? What does obd --version report?

I'll try switching to the admin user to start obd and deploy.

admin@xyh-NF5468M6:~/.obd/log$ ll
total 156
drwxr-xr-x 2 admin admin 4096 3月 14 10:30 ./
drwxrwxr-x 10 admin admin 4096 3月 14 10:33 ../
-rw-rw-r-- 1 admin admin 144664 3月 14 10:38 obd
admin@xyh-NF5468M6:~/.obd/log$

admin@xyh-NF5468M6:~/.obd/log$ obd --version
OceanBase Deploy: 2.6.2
REVISION: 01acdf3854db484d7443457f3e5bb210c9db6c0f
BUILD_BRANCH: HEAD
BUILD_TIME: Mar 01 2024 17:09:07
Copyright (C) 2021 OceanBase
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

It failed.
2024-03-14 10:36:46,765 INFO _do_install (deployment_handler.py:304) [None] finish do deploy myoceanbase
2024-03-14 10:36:46,765 INFO _do_install (deployment_handler.py:305) [None] start do start myoceanbase
2024-03-14 10:36:46,765 ERROR wrapper (task.py:140) [b9ebc6b43b3a4506a3dae62b4504b19f] task myoceanbase got exception
Traceback (most recent call last):
  File "service/common/task.py", line 126, in wrapper
  File "concurrent/futures/_base.py", line 444, in result
  File "concurrent/futures/_base.py", line 389, in __get_result
  File "concurrent/futures/thread.py", line 57, in run
  File "service/handler/deployment_handler.py", line 331, in _do_install
Exception: task myoceanbase deploy failed
2024-03-14 10:36:46,767 INFO wrapper (task.py:143) [b9ebc6b43b3a4506a3dae62b4504b19f] task myoceanbase finished failed
2024-03-14 10:36:47,358 INFO dispatch (request_response_log.py:40) [de6a238420aa49bfa642f1bd47cfe205] app receive request, method: GET, url: http://172.16.89.224:8680/api/v1/deployments/myoceanbase/install/log, query_params: , body: , from: 172.16.94.250:2336
2024-03-14 10:36:47,360 INFO dispatch (idle_shutdown.py:36) [de6a238420aa49bfa642f1bd47cfe205] dispatch request and update last request time
2024-03-14 10:36:47,362 INFO dispatch (request_response_log.py:40) [6f7591510ed54a61bc1925c7839fe5ec] app receive request, method: GET, url: http://172.16.89.224:8680/api/v1/deployments/myoceanbase/install, query_params: , body: , from: 172.16.94.250:2337
2024-03-14 10:36:47,364 INFO dispatch (idle_shutdown.py:36) [6f7591510ed54a61bc1925c7839fe5ec] dispatch request and update last request time
2024-03-14 10:36:47,367 INFO dispatch (request_response_log.py:43) [de6a238420aa49bfa642f1bd47cfe205] app send response, code: 200
2024-03-14 10:36:47,369 INFO dispatch (request_response_log.py:43) [6f7591510ed54a61bc1925c7839fe5ec] app send response, code: 200

Please provide it.

obd.txt (141.3 KB)

[ERROR] OBD-1002: Fail to init 172.16.89.224 sock path: create /tmp/obshell failed.
Check whether the permissions are correct.
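
One way to check the permissions is to look at who owns /tmp/obshell, if it already exists, and whether the current user can write to it or to its parent directory. The following is a minimal sketch; `describe_path` is a name made up for this illustration, and the thread does not say which exact check obd performs before reporting OBD-1002.

```python
import os
import stat


def describe_path(path):
    """Summarize ownership of `path` and whether the current user can write to it."""
    parent = os.path.dirname(path.rstrip("/")) or "/"
    info = {
        "path": path,
        "exists": os.path.lexists(path),
        # Creating an entry needs write and search permission on the parent.
        "parent_writable": os.access(parent, os.W_OK | os.X_OK),
    }
    if info["exists"]:
        st = os.lstat(path)
        info["owner_uid"] = st.st_uid
        info["mode"] = stat.filemode(st.st_mode)
        info["owned_by_me"] = st.st_uid == os.getuid()
        info["writable"] = os.access(path, os.W_OK)
    return info


if __name__ == "__main__":
    # /tmp/obshell is the path named in the OBD-1002 message.
    print(describe_path("/tmp/obshell"))
```

If the path exists but is owned by a different uid (for example one left behind by an earlier run under another account), the current user may not be able to recreate or write into it.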

Also check the available resources with df -h and free -h.
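
The same numbers can be collected in a script. This is a rough sketch, assuming a Linux host where /proc/meminfo is available; `parse_meminfo` and `gib` are helper names invented here, and /data4T is the data mount that appears in the logs later in this thread.

```python
import shutil


def parse_meminfo(text):
    """Parse /proc/meminfo content into a dict of {field: value in bytes}."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if not parts:
            continue
        amount = int(parts[0])
        # Most fields are reported in kB; a few (e.g. HugePages_Total) are plain counts.
        if len(parts) > 1 and parts[1] == "kB":
            amount *= 1024
        values[key.strip()] = amount
    return values


def gib(n):
    """Convert bytes to GiB."""
    return n / 1024 ** 3


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        mem = parse_meminfo(f.read())
    print(f"MemAvailable: {gib(mem['MemAvailable']):.1f} GiB")
    print(f"SwapFree:     {gib(mem['SwapFree']):.1f} GiB")
    for path in ("/", "/data4T"):
        try:
            usage = shutil.disk_usage(path)
        except FileNotFoundError:
            continue  # mount point not present on this machine
        print(f"{path}: {gib(usage.free):.1f} GiB free of {gib(usage.total):.1f} GiB")
```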

Now it reports this error again.
[2024-03-14 13:13:58.299] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [DEBUG] -- starting 172.16.89.224 observer
[2024-03-14 13:13:58.299] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [DEBUG] -- admin@172.16.89.224 set env LD_LIBRARY_PATH to '/data4T/oceanbase/myoceanbase/oceanbase/lib:'
[2024-03-14 13:13:58.299] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [DEBUG] -- admin@172.16.89.224 execute: cd /data4T/oceanbase/myoceanbase/oceanbase; /data4T/oceanbase/myoceanbase/oceanbase/bin/observer -r '172.16.89.224:2882:2881' -p 2881 -P 2882 -z 'zone1' -n 'myoceanbase' -c 1710393176 -d '/data4T/oceanbase/myoceanbase/oceanbase/store' -I '172.16.89.224' -o __min_full_resource_pool_memory=1073741824,enable_syslog_recycle=True,enable_syslog_wf=False,max_syslog_file_count=4,memory_limit='6G',datafile_size='2G',system_memory='1G',log_disk_size='14G',cpu_count=78,datafile_maxsize='8G',datafile_next='2G'
[2024-03-14 13:13:58.426] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [DEBUG] -- exited code 164, error output:
[2024-03-14 13:13:58.426] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [DEBUG] /data4T/oceanbase/myoceanbase/oceanbase/bin/observer -r 172.16.89.224:2882:2881 -p 2881 -P 2882 -z zone1 -n myoceanbase -c 1710393176 -d /data4T/oceanbase/myoceanbase/oceanbase/store -I 172.16.89.224 -o __min_full_resource_pool_memory=1073741824,enable_syslog_recycle=True,enable_syslog_wf=False,max_syslog_file_count=4,memory_limit=6G,datafile_size=2G,system_memory=1G,log_disk_size=14G,cpu_count=78,datafile_maxsize=8G,datafile_next=2G
[2024-03-14 13:13:58.427] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [DEBUG] rs list: 172.16.89.224:2882:2881
[2024-03-14 13:13:58.427] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [DEBUG] mysql port: 2881
[2024-03-14 13:13:58.427] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [DEBUG] rpc port: 2882
[2024-03-14 13:13:58.427] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [DEBUG] zone: zone1
[2024-03-14 13:13:58.427] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [DEBUG] appname: myoceanbase
[2024-03-14 13:13:58.427] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [DEBUG] cluster id: 1710393176
[2024-03-14 13:13:58.427] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [DEBUG] data_dir: /data4T/oceanbase/myoceanbase/oceanbase/store
[2024-03-14 13:13:58.427] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [DEBUG] local_ip: 172.16.89.224
[2024-03-14 13:13:58.427] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [DEBUG] optstr: __min_full_resource_pool_memory=1073741824,enable_syslog_recycle=True,enable_syslog_wf=False,max_syslog_file_count=4,memory_limit=6G,datafile_size=2G,system_memory=1G,log_disk_size=14G,cpu_count=78,datafile_maxsize=8G,datafile_next=2G
[2024-03-14 13:13:58.427] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [DEBUG] ERROR: current user(uid=1005) that starts observer is not the same with the original one(uid=0), observer starts failed!
[2024-03-14 13:13:58.428] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [DEBUG] Fail check_uid_before_start, please use the initial user to start observer!
[2024-03-14 13:13:58.428] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [DEBUG] ============= [AFTER_DESTROY] begin to show unstopped thread =============
[2024-03-14 13:13:58.428] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [DEBUG] [AFTER_DESTROY] detect unstopped thread, tid: 3872775, name: observer
[2024-03-14 13:13:58.428] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [DEBUG] ============= [AFTER_DESTROY] finish to show unstopped thread =============
[2024-03-14 13:13:58.428] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [DEBUG]
[2024-03-14 13:13:58.428] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [DEBUG] -- admin@172.16.89.224 set env LD_LIBRARY_PATH to ''
[2024-03-14 13:13:58.428] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [ERROR] OBD-2002: Failed to start 172.16.89.224 observer: /data4T/oceanbase/myoceanbase/oceanbase/bin/observer -r 172.16.89.224:2882:2881 -p 2881 -P 2882 -z zone1 -n myoceanbase -c 1710393176 -d /data4T/oceanbase/myoceanbase/oceanbase/store -I 172.16.89.224 -o __min_full_resource_pool_memory=1073741824,enable_syslog_recycle=True,enable_syslog_wf=False,max_syslog_file_count=4,memory_limit=6G,datafile_size=2G,system_memory=1G,log_disk_size=14G,cpu_count=78,datafile_maxsize=8G,datafile_next=2G
[2024-03-14 13:13:58.429] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [ERROR] rs list: 172.16.89.224:2882:2881
[2024-03-14 13:13:58.429] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [ERROR] mysql port: 2881
[2024-03-14 13:13:58.429] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [ERROR] rpc port: 2882
[2024-03-14 13:13:58.429] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [ERROR] zone: zone1
[2024-03-14 13:13:58.429] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [ERROR] appname: myoceanbase
[2024-03-14 13:13:58.429] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [ERROR] cluster id: 1710393176
[2024-03-14 13:13:58.429] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [ERROR] data_dir: /data4T/oceanbase/myoceanbase/oceanbase/store
[2024-03-14 13:13:58.429] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [ERROR] local_ip: 172.16.89.224
[2024-03-14 13:13:58.429] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [ERROR] optstr: __min_full_resource_pool_memory=1073741824,enable_syslog_recycle=True,enable_syslog_wf=False,max_syslog_file_count=4,memory_limit=6G,datafile_size=2G,system_memory=1G,log_disk_size=14G,cpu_count=78,datafile_maxsize=8G,datafile_next=2G
[2024-03-14 13:13:58.429] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [ERROR] ERROR: current user(uid=1005) that starts observer is not the same with the original one(uid=0), observer starts failed!
[2024-03-14 13:13:58.429] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [ERROR] Fail check_uid_before_start, please use the initial user to start observer!
[2024-03-14 13:13:58.430] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [ERROR] ============= [AFTER_DESTROY] begin to show unstopped thread =============
[2024-03-14 13:13:58.430] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [ERROR] [AFTER_DESTROY] detect unstopped thread, tid: 3872775, name: observer
[2024-03-14 13:13:58.430] [a8e52bb2-e1c1-11ee-beb7-b4055da9be0a] [ERROR] ============= [AFTER_DESTROY] finish to show unstopped thread =============

Run ps -ef | grep observer && free -h and share the output.

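memory_limit=6G is too small; the deployment is very likely to fail with it.
Change it to memory_limit=8G, and make sure the Ubuntu server has close to 8G of available memory (check with free -h).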
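
To see which memory_limit a given start attempt actually used, the -o option string from the observer command line can be split into key/value pairs. The sketch below is illustrative only: `parse_size` and `parse_optstr` are invented helpers, the option string is abbreviated from the log above, and treating a bare number as bytes is an assumption of this sketch rather than a statement about how observer reads it.

```python
import re

UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}


def parse_size(value):
    """Convert strings such as '6G', '24GB' or '1073741824' to a byte count.

    Assumption of this sketch: a bare number is taken as bytes.
    """
    m = re.fullmatch(r"\s*'?(\d+)\s*([KMGT]?)B?'?\s*", value, flags=re.IGNORECASE)
    if not m:
        raise ValueError(f"unrecognized size: {value!r}")
    return int(m.group(1)) * UNITS[m.group(2).upper()]


def parse_optstr(optstr):
    """Split an observer -o option string into a dict of raw string values."""
    return dict(item.split("=", 1) for item in optstr.split(",") if "=" in item)


# Abbreviated from the optstr line in the log above.
optstr = ("__min_full_resource_pool_memory=1073741824,enable_syslog_recycle=True,"
          "memory_limit=6G,datafile_size=2G,system_memory=1G,log_disk_size=14G")
opts = parse_optstr(optstr)
limit = parse_size(opts["memory_limit"])
suggested = parse_size("8G")  # the value suggested in the reply above
print("memory_limit below suggestion:", limit < suggested)
```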

Current usage:

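Deploying with obd should work fine. Can you check whether the resources are sufficient?
I suggest cleaning up the current environment completely and redeploying.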

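  1. Run obd cluster list to get the cluster status and path.
  2. Then run rm -rf /root/.obd/cluster/myoceanbase, and also remove the configured install path, which by default is usually /root/myoceanbase.
  3. Clear the memory cache: echo 3 > /proc/sys/vm/drop_caches
  4. Confirm that obd is on the latest 2.6.X version: obd --version
  5. Run obd web again; in general, keep the default options.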
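
Before running rm -rf, it can help to list which of the directories from step 2 actually exist for the account in use. The following dry-run sketch deletes nothing; `cleanup_targets` is a name made up for this illustration. Note that the logs in this thread show a non-default install path under /data4T, so the real path should be taken from the obd cluster list output in step 1 rather than from the defaults assumed here.

```python
import os


def cleanup_targets(deploy_name, home="/root"):
    """List the directories from step 2 that exist, without deleting anything."""
    candidates = [
        os.path.join(home, ".obd", "cluster", deploy_name),  # obd's record of the deployment
        os.path.join(home, deploy_name),                     # default install path mentioned above
    ]
    return [p for p in candidates if os.path.isdir(p)]


if __name__ == "__main__":
    # Uses the current user's home directory, which differs between root and admin.
    for path in cleanup_targets("myoceanbase", home=os.path.expanduser("~")):
        print("would remove:", path)
```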

I cleaned up following these steps, and it still reports the same error.
rpc port: 2882
zone: zone1
appname: myoceanbase
cluster id: 1710397716
data_dir: /data4T/oceanbase/myoceanbase/oceanbase/store
local_ip: 172.16.89.224
optstr: __min_full_resource_pool_memory=2147483648,datafile_size=500GB,datafile_maxsize=500GB,memory_limit=24GB,enable_syslog_recycle=True,enable_syslog_wf=False,max_syslog_file_count=4,system_memory=5G,log_disk_size=62G,cpu_count=78
ERROR: current user(uid=1005) that starts observer is not the same with the original one(uid=0), observer starts failed!
Fail check_uid_before_start, please use the initial user to start observer!
============= [AFTER_DESTROY] begin to show unstopped thread =============
[AFTER_DESTROY] detect unstopped thread, tid: 3882379, name: observer
============= [AFTER_DESTROY] finish to show unstopped thread =============

[ERROR] oceanbase-ce start failed
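
The check_uid_before_start message says observer is being started by uid=1005 while the original start was done by uid=0, which suggests that something created during the earlier root deployment is still present under the install path. The thread does not say which file observer inspects, so the sketch below simply scans the whole directory for entries not owned by the current user. `iter_paths` and `foreign_owned` are invented names, and the path is the one from the observer command line in the log.

```python
import os


def iter_paths(root):
    """Yield root itself, then every directory and file beneath it."""
    yield root
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            yield os.path.join(dirpath, name)


def foreign_owned(root, expected_uid, limit=20):
    """Return up to `limit` (path, uid) pairs whose owner differs from expected_uid."""
    found = []
    for path in iter_paths(root):
        if len(found) >= limit:
            break
        try:
            uid = os.lstat(path).st_uid
        except OSError:
            continue  # missing or unreadable entry; skip it
        if uid != expected_uid:
            found.append((path, uid))
    return found


if __name__ == "__main__":
    # Working directory taken from the observer command line in the log.
    home_path = "/data4T/oceanbase/myoceanbase/oceanbase"
    for path, uid in foreign_owned(home_path, os.getuid()):
        print(f"uid={uid} owns {path}")
```

Any entries reported with uid=0 would be leftovers from the root deployment; removing that directory as part of the cleanup, or deploying consistently as one user, are the options the error message points toward.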

Is there a problem with the user?

root@xyh-NF5468M6:~# free -mh
              total        used        free      shared  buff/cache   available
Mem:          251Gi        80Gi       165Gi       622Mi       5.8Gi       168Gi
Swap:         2.0Gi       1.9Gi        59Mi

Please provide the config.yaml configuration file and the complete obd log.