Docker oceanbase-ce 4.2.2 fails to start after a restart

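【 Environment 】 Test environment
【 OB or other component 】
【 Version 】 4.2.2
【Problem description】 I am running a single-instance oceanbase-ce under Docker, using the image oceanbase/oceanbase-ce:4.2.2. After docker stop followed by docker start, the instance fails to start properly. The same happens on k8s: once the oceanbase pod is deleted and the service is brought up again, the OceanBase database can no longer start normally.

【Steps to reproduce】 So far, changing the start command to obd cluster start powersmart has had no effect, and deleting the pid files under the run directory does not let the pod or container restart either. Three directories are already mounted: /root/ob, /root/.obd/cluster, and /root/obagent.
【Attachments and logs】 Attached are the deployment manifest and the error log captured after freshly deploying the pod and then scaling the replicas from 1 to 0 and from 0 back to 1.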
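
The scale-down and scale-up cycle described above can be scripted so the failure is easy to reproduce on demand. This is a minimal sketch, not the poster's own procedure: the Deployment name oceanbase-ce and the namespace default are placeholders (the thread does not give the release name), and the script only prints the kubectl commands unless APPLY=1 is set.

```shell
#!/bin/sh
# Reproduce the restart failure by scaling the Deployment 1 -> 0 -> 1.
# DEPLOY and NS are placeholders, not names taken from the thread.
DEPLOY="${DEPLOY:-oceanbase-ce}"
NS="${NS:-default}"

# Print the command unless APPLY=1, in which case execute it.
run() {
  if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

scale_cycle() {
  run kubectl -n "$NS" scale deployment "$DEPLOY" --replicas=0
  run kubectl -n "$NS" scale deployment "$DEPLOY" --replicas=1
  # Follow the new pod's boot log and look for "boot success!" or "boot failed!".
  run kubectl -n "$NS" logs -f "deployment/$DEPLOY"
}

scale_cycle
```

Running it with APPLY=1 DEPLOY=<your-release> NS=<your-namespace> executes the three steps against the cluster instead of printing them.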

oceanbase-ce.log (370 bytes)
oceanbase-ce-deployment.log (2.5 KB)

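Based on your description, your OceanBase database fails to start in a Docker environment. To pinpoint the problem and offer a workable fix, we recommend collecting the relevant logs with the obdiag tool.

Log collection scenario:

Cluster cannot be connected

Log collection command: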

obdiag gather scene run --scene=observer.cluster_down

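This command collects logs related to the cluster state, including but not limited to system logs and error logs, which help us analyze why the cluster failed to start.

More information needed:

To diagnose the problem further, please provide the following:

  1. Start command: How did you start the OceanBase instance? Please give the full command.
  2. Error logs: Did any errors or warnings appear when you tried to start OceanBase?
  3. Environment: Which Docker version are you using? Is any specific Docker configuration in place?
  4. Resource limits: What resource limits (such as CPU and memory) are set on the Docker container?

Please run the obdiag command above, compress the generated log files, and send them to us so that we can locate the problem faster.

For details, see the following documentation:

Attached: link to the usage help for the agile diagnostic tool obdiag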

This looks like a common problem to me; plenty of posts have raised it.
Can obdiag be installed directly in Docker? I am looking into that now.

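A few more questions to confirm:

  1. Which version of obdiag are you trying to use, and where did you obtain it?
  2. What specific error or problem did you run into when using obdiag?
  3. Have you already installed obdiag successfully in the Docker environment?
  4. If not, which steps or methods did you try in order to install it?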

Docker execution record

Docker startup

docker run -p 2881:2881 --name obstandalone -e MINI_MODE=0 -d oceanbase/oceanbase-ce:4.2.2

Docker status check

root@cxn:~# docker ps  | grep obst
a8d673ec9173   oceanbase/oceanbase-ce:4.2.2                                         "/bin/sh -c _boot"       About a minute ago   Up About a minute   0.0.0.0:2881->2881/tcp, :::2881->2881/tcp   obstandalone

Docker log output

root@cxn:/data/cxn/package# docker logs -f obstandalone 
///////////////////////         STEP 1: Config Generation         ///////////////////////
oceanbase-ce docker in mini mode
/////////////////////// phase end: 0.024 s ///////////////////////
///////////////////////         STEP 2: Ob-deploy mirror clone         ///////////////////////
name: jq
version: 1.6
release:14.an8
arch: x86_64
md5: aced79b4fa3df5b2ba6e060984732011024b2173
add /root/pkg/jq-1.6-14.an8.x86_64.rpm to local mirror
name: ob-configserver
version: 1.0.0
release:2.el7
arch: x86_64
md5: feca6b9c76e26ac49464f34bfa0780b5a8d3f4a0
add /root/pkg/ob-configserver-1.0.0-2.el7.x86_64.rpm to local mirror
name: obagent
version: 4.2.2
release:100000042024011120.el7
arch: x86_64
md5: 19739a07a12eab736aff86ecf357b1ae660b554e
add /root/pkg/obagent-4.2.2-100000042024011120.el7.x86_64.rpm to local mirror
name: oceanbase-ce
version: 4.2.2.0
release:100000192024011915.el7
arch: x86_64
md5: aa3053da7370a6685a2ef457cd202d50e5ab75d3
add /root/pkg/oceanbase-ce-4.2.2.0-100000192024011915.el7.x86_64.rpm to local mirror
name: oceanbase-ce-libs
version: 4.2.2.0
release:100000192024011915.el7
arch: x86_64
md5: 3ef68164e36c5a344b257e57575833134d34a27a
add /root/pkg/oceanbase-ce-libs-4.2.2.0-100000192024011915.el7.x86_64.rpm to local mirror
name: oniguruma
version: 6.8.2
release:2.0.1.an8
arch: x86_64
md5: 1bafad39df270d01e4494f5e6a5f6def972bf26d
add /root/pkg/oniguruma-6.8.2-2.0.1.an8.x86_64.rpm to local mirror
Trace ID: efddad96-5ddb-11ef-9295-0242ac110003
If you want to view detailed obd logs, please run: obd display-trace efddad96-5ddb-11ef-9295-0242ac110003
+----------------------------------------------------------------------------------------------------------+
|                                            local Package List                                            |
+-------------------+---------+------------------------+--------+------------------------------------------+
| name              | version | release                | arch   | md5                                      |
+-------------------+---------+------------------------+--------+------------------------------------------+
| jq                | 1.6     | 14.an8                 | x86_64 | aced79b4fa3df5b2ba6e060984732011024b2173 |
| ob-configserver   | 1.0.0   | 2.el7                  | x86_64 | feca6b9c76e26ac49464f34bfa0780b5a8d3f4a0 |
| obagent           | 4.2.2   | 100000042024011120.el7 | x86_64 | 19739a07a12eab736aff86ecf357b1ae660b554e |
| oceanbase-ce      | 4.2.2.0 | 100000192024011915.el7 | x86_64 | aa3053da7370a6685a2ef457cd202d50e5ab75d3 |
| oceanbase-ce-libs | 4.2.2.0 | 100000192024011915.el7 | x86_64 | 3ef68164e36c5a344b257e57575833134d34a27a |
| oniguruma         | 6.8.2   | 2.0.1.an8              | x86_64 | 1bafad39df270d01e4494f5e6a5f6def972bf26d |
+-------------------+---------+------------------------+--------+------------------------------------------+
Trace ID: f0464428-5ddb-11ef-8438-0242ac110003
If you want to view detailed obd logs, please run: obd display-trace f0464428-5ddb-11ef-8438-0242ac110003
/////////////////////// phase end: 1.322 s ///////////////////////
///////////////////////         STEP 3: Ob-deploy deploy         ///////////////////////
Local deploy is empty
Trace ID: f097382e-5ddb-11ef-92af-0242ac110003
If you want to view detailed obd logs, please run: obd display-trace f097382e-5ddb-11ef-92af-0242ac110003
///////////////////////         STEP 4: Ob-deploy autodeploy         ///////////////////////
Dev Mode: ON
Trace ID: f0f183d8-5ddb-11ef-9486-0242ac110003
If you want to view detailed obd logs, please run: obd display-trace f0f183d8-5ddb-11ef-9486-0242ac110003
[WARN] Use centos 7 remote mirror repository for anolis 8.8
Package oceanbase-ce-4.2.2.0-100000192024011915.el7 is available.
[WARN] Use centos 7 remote mirror repository for anolis 8.8
Package obagent-4.2.2-100000042024011120.el7 is available.
install oceanbase-ce-4.2.2.0 for local ok
install obagent-4.2.2 for local ok
Cluster param config check ok
Open ssh connection ok
Generate observer configuration ok
Generate obagent configuration ok
[WARN] Use centos 7 remote mirror repository for anolis 8.8
[WARN] Use centos 7 remote mirror repository for anolis 8.8
+--------------------------------------------------------------------------------------------+
|                                          Packages                                          |
+--------------+---------+------------------------+------------------------------------------+
| Repository   | Version | Release                | Md5                                      |
+--------------+---------+------------------------+------------------------------------------+
| oceanbase-ce | 4.2.2.0 | 100000192024011915.el7 | aa3053da7370a6685a2ef457cd202d50e5ab75d3 |
| obagent      | 4.2.2   | 100000042024011120.el7 | 19739a07a12eab736aff86ecf357b1ae660b554e |
+--------------+---------+------------------------+------------------------------------------+
Repository integrity check ok
Parameter check ok
Cluster status check ok
Initializes observer work home ok
Initializes obagent work home ok
Remote oceanbase-ce-4.2.2.0-100000192024011915.el7-aa3053da7370a6685a2ef457cd202d50e5ab75d3 repository install ok
Remote oceanbase-ce-4.2.2.0-100000192024011915.el7-aa3053da7370a6685a2ef457cd202d50e5ab75d3 repository lib check !!
Remote obagent-4.2.2-100000042024011120.el7-19739a07a12eab736aff86ecf357b1ae660b554e repository install ok
Remote obagent-4.2.2-100000042024011120.el7-19739a07a12eab736aff86ecf357b1ae660b554e repository lib check ok
Try to get lib-repository
[WARN] Use centos 7 remote mirror repository for anolis 8.8
Package oceanbase-ce-libs-4.2.2.0-100000192024011915.el7 is available.
install oceanbase-ce-libs-4.2.2.0 for local ok
Remote oceanbase-ce-libs-4.2.2.0-100000192024011915.el7-3ef68164e36c5a344b257e57575833134d34a27a repository install ok
Remote oceanbase-ce-4.2.2.0-100000192024011915.el7-aa3053da7370a6685a2ef457cd202d50e5ab75d3 repository lib check ok
obcluster deployed
Get local repositories ok
Search plugins ok
Load cluster param plugin ok
Open ssh connection ok
Check before start observer ok
[WARN] OBD-1011: (127.0.0.1) The recommended value of fs.aio-max-nr is 1048576 (Current value: 65536)
[WARN] OBD-1007: (127.0.0.1) The recommended number of stack size is unlimited (Current value: 8192)
[WARN] OBD-1017: (127.0.0.1) The value of the "vm.max_map_count" must be within [327600, 1310720] (Current value: 65530, Recommended value: 655360)
[WARN] OBD-1017: (127.0.0.1) The value of the "vm.overcommit_memory" must be 0 (Current value: 1, Recommended value: 0)
[WARN] OBD-1017: (127.0.0.1) The value of the "fs.file-max" must be greater than 6573688 (Current value: 65536, Recommended value: 6573688)
[WARN] OBD-1012: (127.0.0.1) clog and data use the same disk (/)

Check before start obagent ok
Start observer ok
observer program health check ok
Connect to observer ok
Initialize oceanbase-ce ok
Start obagent ok
obagent program health check ok
Connect to Obagent ok
Wait for observer init ok
+---------------------------------------------+
|                   observer                  |
+-----------+---------+------+-------+--------+
| ip        | version | port | zone  | status |
+-----------+---------+------+-------+--------+
| 127.0.0.1 | 4.2.2.0 | 2881 | zone1 | ACTIVE |
+-----------+---------+------+-------+--------+
obclient -h127.0.0.1 -P2881 -uroot -Doceanbase -A

+---------------------------------------------------------------+
|                            obagent                            |
+------------+--------------------+--------------------+--------+
| ip         | mgragent_http_port | monagent_http_port | status |
+------------+--------------------+--------------------+--------+
| 172.17.0.3 | 8089               | 8088               | active |
+------------+--------------------+--------------------+--------+
obcluster running
Trace ID: f14617ae-5ddb-11ef-acdf-0242ac110003
If you want to view detailed obd logs, please run: obd display-trace f14617ae-5ddb-11ef-acdf-0242ac110003
/////////////////////// phase end: 93.367 s ///////////////////////
///////////////////////         STEP 5: Ob-deploy Create Tenant         ///////////////////////
Get local repositories and plugins ok
Open ssh connection ok
Connect to observer ok
Create tenant test ok
Trace ID: 289a6b56-5ddc-11ef-8d3a-0242ac110003
If you want to view detailed obd logs, please run: obd display-trace 289a6b56-5ddc-11ef-8d3a-0242ac110003
/////////////////////// phase end: 23.803 s ///////////////////////
deploy success!
boot success!

Docker service restart

root@cxn:~# docker restart obstandalone 
obstandalone
root@cxn:~# docker ps | grep obst
a8d673ec9173   oceanbase/oceanbase-ce:4.2.2                                         "/bin/sh -c _boot"       3 minutes ago   Up 7 seconds   0.0.0.0:2881->2881/tcp, :::2881->2881/tcp   obstandalone
root@cxn:~# docker logs -f obstandalone 
///////////////////////         STEP 1: Config Generation         ///////////////////////
oceanbase-ce docker in mini mode
/////////////////////// phase end: 0.024 s ///////////////////////
///////////////////////         STEP 2: Ob-deploy mirror clone         ///////////////////////
name: jq
version: 1.6
release:14.an8
arch: x86_64
md5: aced79b4fa3df5b2ba6e060984732011024b2173
add /root/pkg/jq-1.6-14.an8.x86_64.rpm to local mirror
name: ob-configserver
version: 1.0.0
release:2.el7
arch: x86_64
md5: feca6b9c76e26ac49464f34bfa0780b5a8d3f4a0
add /root/pkg/ob-configserver-1.0.0-2.el7.x86_64.rpm to local mirror
name: obagent
version: 4.2.2
release:100000042024011120.el7
arch: x86_64
md5: 19739a07a12eab736aff86ecf357b1ae660b554e
add /root/pkg/obagent-4.2.2-100000042024011120.el7.x86_64.rpm to local mirror
name: oceanbase-ce
version: 4.2.2.0
release:100000192024011915.el7
arch: x86_64
md5: aa3053da7370a6685a2ef457cd202d50e5ab75d3
add /root/pkg/oceanbase-ce-4.2.2.0-100000192024011915.el7.x86_64.rpm to local mirror
name: oceanbase-ce-libs
version: 4.2.2.0
release:100000192024011915.el7
arch: x86_64
md5: 3ef68164e36c5a344b257e57575833134d34a27a
add /root/pkg/oceanbase-ce-libs-4.2.2.0-100000192024011915.el7.x86_64.rpm to local mirror
name: oniguruma
version: 6.8.2
release:2.0.1.an8
arch: x86_64
md5: 1bafad39df270d01e4494f5e6a5f6def972bf26d
add /root/pkg/oniguruma-6.8.2-2.0.1.an8.x86_64.rpm to local mirror
Trace ID: efddad96-5ddb-11ef-9295-0242ac110003
If you want to view detailed obd logs, please run: obd display-trace efddad96-5ddb-11ef-9295-0242ac110003
+----------------------------------------------------------------------------------------------------------+
|                                            local Package List                                            |
+-------------------+---------+------------------------+--------+------------------------------------------+
| name              | version | release                | arch   | md5                                      |
+-------------------+---------+------------------------+--------+------------------------------------------+
| jq                | 1.6     | 14.an8                 | x86_64 | aced79b4fa3df5b2ba6e060984732011024b2173 |
| ob-configserver   | 1.0.0   | 2.el7                  | x86_64 | feca6b9c76e26ac49464f34bfa0780b5a8d3f4a0 |
| obagent           | 4.2.2   | 100000042024011120.el7 | x86_64 | 19739a07a12eab736aff86ecf357b1ae660b554e |
| oceanbase-ce      | 4.2.2.0 | 100000192024011915.el7 | x86_64 | aa3053da7370a6685a2ef457cd202d50e5ab75d3 |
| oceanbase-ce-libs | 4.2.2.0 | 100000192024011915.el7 | x86_64 | 3ef68164e36c5a344b257e57575833134d34a27a |
| oniguruma         | 6.8.2   | 2.0.1.an8              | x86_64 | 1bafad39df270d01e4494f5e6a5f6def972bf26d |
+-------------------+---------+------------------------+--------+------------------------------------------+
Trace ID: f0464428-5ddb-11ef-8438-0242ac110003
If you want to view detailed obd logs, please run: obd display-trace f0464428-5ddb-11ef-8438-0242ac110003
/////////////////////// phase end: 1.322 s ///////////////////////
///////////////////////         STEP 3: Ob-deploy deploy         ///////////////////////
Local deploy is empty
Trace ID: f097382e-5ddb-11ef-92af-0242ac110003
If you want to view detailed obd logs, please run: obd display-trace f097382e-5ddb-11ef-92af-0242ac110003
///////////////////////         STEP 4: Ob-deploy autodeploy         ///////////////////////
Dev Mode: ON
Trace ID: f0f183d8-5ddb-11ef-9486-0242ac110003
If you want to view detailed obd logs, please run: obd display-trace f0f183d8-5ddb-11ef-9486-0242ac110003
[WARN] Use centos 7 remote mirror repository for anolis 8.8
Package oceanbase-ce-4.2.2.0-100000192024011915.el7 is available.
[WARN] Use centos 7 remote mirror repository for anolis 8.8
Package obagent-4.2.2-100000042024011120.el7 is available.
install oceanbase-ce-4.2.2.0 for local ok
install obagent-4.2.2 for local ok
Cluster param config check ok
Open ssh connection ok
Generate observer configuration ok
Generate obagent configuration ok
[WARN] Use centos 7 remote mirror repository for anolis 8.8
[WARN] Use centos 7 remote mirror repository for anolis 8.8
+--------------------------------------------------------------------------------------------+
|                                          Packages                                          |
+--------------+---------+------------------------+------------------------------------------+
| Repository   | Version | Release                | Md5                                      |
+--------------+---------+------------------------+------------------------------------------+
| oceanbase-ce | 4.2.2.0 | 100000192024011915.el7 | aa3053da7370a6685a2ef457cd202d50e5ab75d3 |
| obagent      | 4.2.2   | 100000042024011120.el7 | 19739a07a12eab736aff86ecf357b1ae660b554e |
+--------------+---------+------------------------+------------------------------------------+
Repository integrity check ok
Parameter check ok
Cluster status check ok
Initializes observer work home ok
Initializes obagent work home ok
Remote oceanbase-ce-4.2.2.0-100000192024011915.el7-aa3053da7370a6685a2ef457cd202d50e5ab75d3 repository install ok
Remote oceanbase-ce-4.2.2.0-100000192024011915.el7-aa3053da7370a6685a2ef457cd202d50e5ab75d3 repository lib check !!
Remote obagent-4.2.2-100000042024011120.el7-19739a07a12eab736aff86ecf357b1ae660b554e repository install ok
Remote obagent-4.2.2-100000042024011120.el7-19739a07a12eab736aff86ecf357b1ae660b554e repository lib check ok
Try to get lib-repository
[WARN] Use centos 7 remote mirror repository for anolis 8.8
Package oceanbase-ce-libs-4.2.2.0-100000192024011915.el7 is available.
install oceanbase-ce-libs-4.2.2.0 for local ok
Remote oceanbase-ce-libs-4.2.2.0-100000192024011915.el7-3ef68164e36c5a344b257e57575833134d34a27a repository install ok
Remote oceanbase-ce-4.2.2.0-100000192024011915.el7-aa3053da7370a6685a2ef457cd202d50e5ab75d3 repository lib check ok
obcluster deployed
Get local repositories ok
Search plugins ok
Load cluster param plugin ok
Open ssh connection ok
Check before start observer ok
[WARN] OBD-1011: (127.0.0.1) The recommended value of fs.aio-max-nr is 1048576 (Current value: 65536)
[WARN] OBD-1007: (127.0.0.1) The recommended number of stack size is unlimited (Current value: 8192)
[WARN] OBD-1017: (127.0.0.1) The value of the "vm.max_map_count" must be within [327600, 1310720] (Current value: 65530, Recommended value: 655360)
[WARN] OBD-1017: (127.0.0.1) The value of the "vm.overcommit_memory" must be 0 (Current value: 1, Recommended value: 0)
[WARN] OBD-1017: (127.0.0.1) The value of the "fs.file-max" must be greater than 6573688 (Current value: 65536, Recommended value: 6573688)
[WARN] OBD-1012: (127.0.0.1) clog and data use the same disk (/)

Check before start obagent ok
Start observer ok
observer program health check ok
Connect to observer ok
Initialize oceanbase-ce ok
Start obagent ok
obagent program health check ok
Connect to Obagent ok
Wait for observer init ok
+---------------------------------------------+
|                   observer                  |
+-----------+---------+------+-------+--------+
| ip        | version | port | zone  | status |
+-----------+---------+------+-------+--------+
| 127.0.0.1 | 4.2.2.0 | 2881 | zone1 | ACTIVE |
+-----------+---------+------+-------+--------+
obclient -h127.0.0.1 -P2881 -uroot -Doceanbase -A

+---------------------------------------------------------------+
|                            obagent                            |
+------------+--------------------+--------------------+--------+
| ip         | mgragent_http_port | monagent_http_port | status |
+------------+--------------------+--------------------+--------+
| 172.17.0.3 | 8089               | 8088               | active |
+------------+--------------------+--------------------+--------+
obcluster running
Trace ID: f14617ae-5ddb-11ef-acdf-0242ac110003
If you want to view detailed obd logs, please run: obd display-trace f14617ae-5ddb-11ef-acdf-0242ac110003
/////////////////////// phase end: 93.367 s ///////////////////////
///////////////////////         STEP 5: Ob-deploy Create Tenant         ///////////////////////
Get local repositories and plugins ok
Open ssh connection ok
Connect to observer ok
Create tenant test ok
Trace ID: 289a6b56-5ddc-11ef-8d3a-0242ac110003
If you want to view detailed obd logs, please run: obd display-trace 289a6b56-5ddc-11ef-8d3a-0242ac110003
/////////////////////// phase end: 23.803 s ///////////////////////
deploy success!
boot success!
find obd deploy information, skip configuring...
start ob cluster ...
Get local repositories ok
Search plugins ok
Load cluster param plugin ok
Cluster status check ok
Check before start observer ok
[WARN] OBD-1011: (127.0.0.1) The recommended value of fs.aio-max-nr is 1048576 (Current value: 65536)
[WARN] OBD-1007: (127.0.0.1) The recommended number of stack size is unlimited (Current value: 8192)
[WARN] OBD-1017: (127.0.0.1) The value of the "vm.max_map_count" must be within [327600, 1310720] (Current value: 65530, Recommended value: 655360)
[WARN] OBD-1017: (127.0.0.1) The value of the "vm.overcommit_memory" must be 0 (Current value: 1, Recommended value: 0)
[WARN] OBD-1017: (127.0.0.1) The value of the "fs.file-max" must be greater than 6573688 (Current value: 65536, Recommended value: 6573688)

Check before start obagent ok
Start observer ok
observer program health check ok
obshell program health check ok
Connect to observer ok
Start obagent ok
obagent program health check x
[WARN] failed to start 127.0.0.1 obagent
[ERROR] obagent start failed
Wait for observer init ok
+---------------------------------------------+
|                   observer                  |
+-----------+---------+------+-------+--------+
| ip        | version | port | zone  | status |
+-----------+---------+------+-------+--------+
| 127.0.0.1 | 4.2.2.0 | 2881 | zone1 | ACTIVE |
+-----------+---------+------+-------+--------+
obclient -h127.0.0.1 -P2881 -uroot -Doceanbase -A

See https://www.oceanbase.com/product/ob-deployer/error-codes .
Trace ID: 5948f45c-5ddc-11ef-ba0f-0242ac110003
If you want to view detailed obd logs, please run: obd display-trace 5948f45c-5ddc-11ef-ba0f-0242ac110003
boot failed!
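
Because docker logs replays the whole history, the output above contains the first boot (ending in "boot success!") followed by the restart (ending in "boot failed!"), so only the last marker reflects the current state. A small sketch for picking out that last marker; the function name last_boot_status is made up for illustration, and the two marker strings are the ones printed in this log.

```shell
#!/bin/sh
# Report the outcome of the most recent boot recorded in the container log.
# Reads log text on stdin; prints "success", "failed", or "unknown".
last_boot_status() {
  awk '
    /^boot success!/ { status = "success" }
    /^boot failed!/  { status = "failed" }
    END { print (status == "" ? "unknown" : status) }
  '
}

# Usage (container name taken from the thread):
#   docker logs obstandalone 2>&1 | last_boot_status
```

Later markers overwrite earlier ones, so a log that holds a successful first boot and a failed restart reports "failed".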

Output of obdiag gather scene run:

root@cxn:/data/cxn/package# obdiag gather scene run --scene=observer.cluster_down
gather_scenes_run start ...
[ERROR] connect OB: 127.0.0.1:2881 with user root@sys failed, error:(2003, "Can't connect to MySQL server on '127.0.0.1' ([Errno 111] Connection refused)")
[ERROR] connect OB: 127.0.0.1:2881 with user root@sys failed, error:(2003, "Can't connect to MySQL server on '127.0.0.1' ([Errno 111] Connection refused)")
gather from_time: 2024-08-19 11:09:40, to_time: 2024-08-19 11:40:40
execute tasks: observer.cluster_down
[ERROR] can't get version, Exception: [Errno None] Unable to connect to port 22 on 127.0.0.1
[ERROR] __execute_yaml_task_one Exception : can't get version, Exception: [Errno None] Unable to connect to port 22 on 127.0.0.1

Gather scene results stored in this directory: /data/cxn/package/obdiag_gather_pack_20240819113940

Trace ID: a9f29278-5ddc-11ef-83e3-000c2923e77e
If you want to view detailed obdiag logs, please run: obdiag display-trace a9f29278-5ddc-11ef-83e3-000c2923e77e

Output of obdiag display-trace:

root@cxn:/data/cxn/package# obdiag display-trace a9f29278-5ddc-11ef-83e3-000c2923e77e
[2024-08-19 11:39:39.721] [DEBUG] - cmd: []
[2024-08-19 11:39:39.721] [DEBUG] - opts: {'scene': 'observer.cluster_down', 'from': None, 'to': None, 'since': '30m', 'env': None, 'store_dir': './', 'c': '/root/.obdiag/config.yml'}
[2024-08-19 11:39:39.721] [DEBUG] - mkdir /usr/local/oceanbase-diagnostic-tool/conf/inner_config.yml
[2024-08-19 11:39:39.724] [DEBUG] - mkdir /root/.obdiag/config.yml
[2024-08-19 11:39:39.726] [INFO] gather_scenes_run start ...
[2024-08-19 11:39:40.060] [INFO] gather from_time: 2024-08-19 11:09:40, to_time: 2024-08-19 11:40:40
[2024-08-19 11:39:40.060] [DEBUG] - gather scene variables: {'observer_data_dir': '/root/observer', 'obproxy_data_dir': '/root/obproxy', 'from_time': '2024-08-19 11:09:40', 'to_time': '2024-08-19 11:40:40'}
[2024-08-19 11:39:40.060] [DEBUG] - Use /data/cxn/package/obdiag_gather_pack_20240819113940 as pack dir.
[2024-08-19 11:39:40.060] [DEBUG] - mkdir /data/cxn/package/obdiag_gather_pack_20240819113940
[2024-08-19 11:39:40.063] [DEBUG] - execute_tasks. the number of tasks is 1 ,tasks is dict_keys(['observer.cluster_down'])
[2024-08-19 11:39:40.063] [INFO] execute tasks: observer.cluster_down
[2024-08-19 11:39:40.065] [ERROR] can't get version, Exception: [Errno None] Unable to connect to port 22 on 127.0.0.1
[2024-08-19 11:39:40.065] [ERROR] Traceback (most recent call last):
[2024-08-19 11:39:40.065] [ERROR]   File "cmd.py", line 222, in do_command
[2024-08-19 11:39:40.065] [ERROR]   File "cmd.py", line 553, in _do_command
[2024-08-19 11:39:40.065] [ERROR]   File "core.py", line 227, in gather_function
[2024-08-19 11:39:40.065] [ERROR]   File "handler/gather/gather_scenes.py", line 74, in handle
[2024-08-19 11:39:40.065] [ERROR]   File "handler/gather/gather_scenes.py", line 81, in execute
[2024-08-19 11:39:40.065] [ERROR]   File "handler/gather/gather_scenes.py", line 92, in __execute_yaml_task_one
[2024-08-19 11:39:40.065] [ERROR]   File "common/scene.py", line 96, in get_obproxy_and_ob_version
[2024-08-19 11:39:40.065] [ERROR]   File "common/ssh.py", line 879, in __init__
[2024-08-19 11:39:40.065] [ERROR]   File "paramiko/client.py", line 409, in connect
[2024-08-19 11:39:40.065] [ERROR] paramiko.ssh_exception.NoValidConnectionsError: [Errno None] Unable to connect to port 22 on 127.0.0.1
[2024-08-19 11:39:40.065] [ERROR] 
[2024-08-19 11:39:40.065] [ERROR] __execute_yaml_task_one Exception : can't get version, Exception: [Errno None] Unable to connect to port 22 on 127.0.0.1
[2024-08-19 11:39:40.065] [INFO] 
[2024-08-19 11:39:40.065] [INFO] Gather scene results stored in this directory: /data/cxn/package/obdiag_gather_pack_20240819113940
[2024-08-19 11:39:40.065] [INFO] 
[2024-08-19 11:39:40.329] [INFO] Trace ID: a9f29278-5ddc-11ef-83e3-000c2923e77e
[2024-08-19 11:39:40.329] [INFO] If you want to view detailed obdiag logs, please run: obdiag display-trace a9f29278-5ddc-11ef-83e3-000c2923e77e

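According to the logs you provided, the OceanBase instance ran into a problem during the restart, chiefly that obagent failed to start. In addition, obdiag hit connection failures while trying to collect diagnostic information. Both problems are analyzed below.

obagent failed to start

The restart log shows obagent failing to start: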

obagent program health check x
[WARN] failed to start 127.0.0.1 obagent
[ERROR] obagent start failed

This shows that obagent did not pass its health check, which caused the cluster start as a whole to be reported as failed.
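
To see at a glance which component tripped the start, the failed health checks can be filtered out of the OBD output. This is a sketch that relies only on the line format visible in the log above (a trailing "x" marks a failed check); failed_health_checks is a hypothetical helper name.

```shell
#!/bin/sh
# List the components whose OBD health check failed.
# OBD marks a failed check with a trailing "x", as in the excerpt above.
failed_health_checks() {
  sed -n 's/^\(.*\) program health check x[[:space:]]*$/\1/p'
}

# Usage (container name taken from the thread):
#   docker logs obstandalone 2>&1 | failed_health_checks
```

In the restart log from this thread the only match would be obagent, while the observer is still listed as ACTIVE. If that is the case for you too, one thing worth trying is starting just that component from inside the container, for example with obd cluster start obcluster -c obagent; both the -c option and the deployment name obcluster (taken from the "obcluster deployed" line) are assumptions to confirm with obd cluster start --help and obd cluster list.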

obdiag connection failure

Collecting diagnostic information with obdiag failed with a connection error:

[ERROR] connect OB: 127.0.0.1:2881 with user root@sys failed, error:(2003, "Can't connect to MySQL server on '127.0.0.1' ([Errno 111] Connection refused)")

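This shows that obdiag could not connect to the OceanBase instance, possibly because the instance had not fully started or because of a network configuration problem.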
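
Judging by the shell prompt, obdiag was run on the Docker host, where it tried 127.0.0.1:2881 (SQL) and 127.0.0.1:22 (SSH). Before re-running it, a quick reachability check of those two ports can show whether the refusal comes from the observer not listening or from SSH not being available at that address. A minimal sketch using bash's /dev/tcp; port_open is an illustrative helper, not part of obdiag.

```shell
#!/usr/bin/env bash
# Return 0 if a TCP connection to host:port succeeds, non-zero otherwise.
# Relies on bash's /dev/tcp pseudo-device, so it needs bash rather than plain sh.
port_open() {
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# The two endpoints that appear in the obdiag errors above.
for port in 2881 22; do
  if port_open 127.0.0.1 "$port"; then
    echo "127.0.0.1:$port reachable"
  else
    echo "127.0.0.1:$port refused or unreachable"
  fi
done
```

If 2881 is refused even though docker ps shows the port published, the observer inside the container is probably not accepting connections at that moment; if only 22 is refused, obdiag's SSH-based collection likely needs a node configuration that matches where the observer files actually live.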

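Solutions

1. Check the obagent logs

  • Look at the obagent log files, usually under the /var/log/obagent/ directory, to find out why the start failed.
  • Check the obagent configuration files and make sure they are correct.

2. Check system resource limits

  • According to the warnings in the log, your system has several resource limit issues, for example fs.aio-max-nr, vm.max_map_count, and fs.file-max. These may cause obagent to fail to start.
  • Adjust these system parameters to the recommended values, then restart obagent.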
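
The limits quoted in the OBD warnings can be checked mechanically. The sketch below hard-codes the thresholds exactly as they appear in the log (treating the recommended fs.file-max value itself as acceptable is an assumption), and param_ok is a made-up helper name. Note that the same warnings were also printed during the first boot, which succeeded, so they may well not be what breaks the restart.

```shell
#!/bin/sh
# Check a kernel parameter value against the limits quoted in the OBD warnings.
# Returns 0 when the value satisfies the rule, 1 when it does not,
# and 2 for parameters not covered here.
param_ok() {
  name=$1
  value=$2
  case "$name" in
    fs.aio-max-nr)        [ "$value" -ge 1048576 ] ;;
    vm.max_map_count)     [ "$value" -ge 327600 ] && [ "$value" -le 1310720 ] ;;
    vm.overcommit_memory) [ "$value" -eq 0 ] ;;
    # Assumption: the recommended value 6573688 itself counts as sufficient.
    fs.file-max)          [ "$value" -ge 6573688 ] ;;
    *) return 2 ;;
  esac
}

# Report the current values on this machine, where /proc/sys is readable.
for p in fs.aio-max-nr vm.max_map_count vm.overcommit_memory fs.file-max; do
  f="/proc/sys/$(echo "$p" | tr . /)"
  if [ -r "$f" ]; then
    v=$(cat "$f")
    if param_ok "$p" "$v"; then
      echo "$p=$v ok"
    else
      echo "$p=$v outside the recommendation"
    fi
  fi
done
```

These four settings are kernel-wide rather than per-container, so they generally have to be changed on the Docker host (for example with sysctl -w vm.max_map_count=655360) rather than inside the container.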

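3. Check the obagent dependencies

  • Make sure all of obagent's dependencies are installed correctly.
  • You may need to reinstall obagent or update its dependencies.

4. Check the network configuration

  • Confirm that obdiag can reach the network where the OceanBase instance lives.
  • If you are using Docker containers, make sure the networking between containers is configured correctly.

5. Collect logs with obdiag

  • Once obagent starts successfully, try collecting diagnostic information with obdiag again to investigate further.

Summary

  • First check the obagent logs and configuration files to find the specific cause of the start failure.
  • Then adjust the system resource limits so that obagent can start normally.
  • Finally, confirm that obdiag can connect to the OceanBase instance so that more diagnostic information can be collected.

If the problem persists after these steps, please provide more detailed logs so that we can analyze further.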


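Didn't I already demonstrate this above?
After docker restart, OceanBase simply cannot start any more. That in itself is a problem; what kind of service does not support being restarted?
And this is without changing any parameters or configuration.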

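Hello, could you tell us what the specific use case is? The oceanbase-ce image is aimed mainly at testing scenarios. If you need to run on K8s, please refer to the ob-operator documentation at https://oceanbase.github.io/ob-operator/ and manage the OceanBase cluster in K8s through ob-operator. If you are running the standalone image under Docker, it is best to mount the OceanBase files out of the container; the directories that need to be mounted are /root/ob and /root/.obd/cluster.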
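
For the plain Docker case, the suggestion above amounts to adding two bind mounts to the docker run command from earlier in the thread. A minimal sketch: DATA_ROOT is a hypothetical host directory, everything else is taken from the thread, and the command is only printed unless APPLY=1 is set. The poster reports that mounting these directories did not help in the Kubernetes setup, so treat this as something to try rather than a confirmed fix.

```shell
#!/bin/sh
# Start the container with the two directories named in the reply mounted from the host.
# DATA_ROOT is a hypothetical host path; pick any directory with enough space.
DATA_ROOT="${DATA_ROOT:-$PWD/ob-data}"

# Print the command unless APPLY=1, in which case execute it.
run() {
  if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

start_ob() {
  run docker run -d -p 2881:2881 --name obstandalone \
    -e MINI_MODE=0 \
    -v "$DATA_ROOT/ob:/root/ob" \
    -v "$DATA_ROOT/obd-cluster:/root/.obd/cluster" \
    oceanbase/oceanbase-ce:4.2.2
}

start_ob
```

With the data and the OBD deployment metadata kept on the host, docker restart obstandalone followed by docker logs obstandalone shows whether the second boot now ends in "boot success!".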

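I am only testing locally, but the deployment is on k8s: I wrote manifests around the Docker image and mounted three directories, and it still does not work.
In short, the OceanBase-ce image does not support service restarts; after a restart it never runs properly.
As for ob-operator, I am looking into it now. I am 陈新宁 in the group chat. Thanks for replying to my post.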

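Back then I had essentially turned the standalone Docker image into a simple chart and found it unusable, so I wondered whether there was a workaround. I have switched to ob-operator now, so I will set the OceanBase-ce image aside for the time being.

My earlier manifests were as follows: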
deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "oceanbase-ce.fullname" . }}
  labels:
    {{- include "oceanbase-ce.labels" . | nindent 4 }}
  namespace: {{ .Release.Namespace }}
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      {{- include "oceanbase-ce.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      {{- with .Values.podAnnotations }}
      annotations:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      labels:
        {{- include "oceanbase-ce.selectorLabels" . | nindent 8 }}
    spec:
      containers:
      - name: {{ .Chart.Name }}
        image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}-{{ .Values.image.imageArch }}"
        imagePullPolicy: {{ .Values.image.pullPolicy }}
        ports:
        - name: http
          containerPort: 2881
          protocol: TCP
        env:
        - name: MODE
          value: {{ .Values.mode | quote }}
        resources:
          {{- toYaml .Values.resources | nindent 12 }}
        volumeMounts:
        - name: ob-data
          mountPath: /root/ob
        - name: ob-cluster
          mountPath: /root/.obd/cluster
        - name: obagent
          mountPath: /root/obagent
        - name: localtime
          mountPath: /etc/localtime
      {{- with .Values.nodeSelector }}
      nodeSelector:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.affinity }}
      affinity:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.tolerations }}
      tolerations:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      volumes:
      - name: ob-data
        persistentVolumeClaim:
          claimName: {{ include "oceanbase-ce.fullname" . }}-ob-data
      - name: ob-cluster
        persistentVolumeClaim:
          claimName: {{ include "oceanbase-ce.fullname" . }}-ob-cluster
      - name: obagent
        persistentVolumeClaim:
          claimName: {{ include "oceanbase-ce.fullname" . }}-obagent
      - hostPath:
          path: /etc/localtime
          type: ""
        name: localtime

pvc.yaml

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: {{ include "oceanbase-ce.fullname" . }}-ob-data
  namespace: {{ .Release.Namespace }}
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: {{ .Values.pvc.obData.size }}
  storageClassName: {{ .Values.pvc.obData.storageClass }}
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: {{ include "oceanbase-ce.fullname" . }}-ob-cluster
  namespace: {{ .Release.Namespace }}
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: {{ .Values.pvc.obCluster.size }}
  storageClassName: {{ .Values.pvc.obCluster.storageClass }}
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: {{ include "oceanbase-ce.fullname" . }}-obagent
  namespace: {{ .Release.Namespace }}
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: {{ .Values.pvc.obAgent.size }}
  storageClassName: {{ .Values.pvc.obAgent.storageClass }}

svc.yaml

apiVersion: v1
kind: Service
metadata:
  name: {{ include "oceanbase-ce.fullname" . }}
  labels:
    {{- include "oceanbase-ce.labels" . | nindent 4 }}
  namespace: {{ .Release.Namespace }}
spec:
  type: {{ .Values.service.type }}
  ports:
    - port: {{ .Values.service.port }}
      targetPort: http
      protocol: TCP
  selector:
    {{- include "oceanbase-ce.selectorLabels" . | nindent 4 }}

values.yaml

replicaCount: 1

# Default settings for OceanBase deployment
mode: "normal"
# MODE values:
# - "mini": Indicates that the container will use the least amount of resources.
# - "normal": Indicates that the container will utilize as much of the available resources as possible.
# - "slim": Indicates that the container will only start the observer and use the fastboot mode.

image:
  repository: xxxx/oceanbase/oceanbase-ce
  tag: 4.2.2
  imageArch: amd64
  pullPolicy: IfNotPresent

service:
  type: ClusterIP
  port: 2881

pvc:
  obData:
    storageClass: nfs-client
    size: 10Gi
  obCluster:
    storageClass: nfs-client
    size: 5Gi
  obAgent:
    storageClass: nfs-client  # use the storage class defined earlier
    size: 5Gi

resources:
  limits:
    cpu: 4
    memory: 8Gi
  requests:
    cpu: 2
    memory: 4Gi

nodeSelector: {}

tolerations: []

affinity: {}

OK, feel free to keep the conversation going in the group.

Destroy the demo deployment that shows up in obd cluster list.
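
A sketch of what that could look like from the Docker host, assuming the container is still named obstandalone as earlier in the thread. The deployment name demo comes from the reply above, while the boot log in this thread shows a deployment called obcluster, so check the list output before destroying anything. The commands are only printed unless APPLY=1 is set, because destroy removes the deployment and its data.

```shell
#!/bin/sh
CONTAINER="${CONTAINER:-obstandalone}"   # container name used earlier in the thread
CLUSTER="${CLUSTER:-demo}"               # name from the reply; confirm it in the list output

# Print the command unless APPLY=1, in which case execute it.
run() {
  if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

cleanup_demo() {
  # Show the deployments OBD knows about inside the container.
  run docker exec "$CONTAINER" obd cluster list
  # Destructive: removes the named deployment and its data.
  run docker exec "$CONTAINER" obd cluster destroy "$CLUSTER"
}

cleanup_demo
```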