Database startup hangs after a server-room power outage

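【 Environment 】Test environment
【 OB or other component 】
【 Version 】4.1
【 Problem description 】A power outage in the server room left the database hanging on restart: ERROR 8001 (08004): Server is initializing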
[admin@localhost ~]$ obd cluster restart dscpcolony -c oceanbase-ce
Get local repositories and plugins ok
Load cluster param plugin ok
Open ssh connection ok
Cluster status check ok
Search plugins ok
Load cluster param plugin ok
Check before start observer ok
Check before start obproxy ok
Check before start obagent ok
Check before start ocp-express ok
Start observer ok
observer program health check ok
Connect to observer ok
Start obproxy ok
obproxy program health check ok
Connect to obproxy ok
Start obagent ok
obagent program health check ok
Connect to Obagent ok
Start ocp-express x
[ERROR] 127.0.0.1: failed to connect meta db

[ERROR] ocp-express start failed
Wait for observer init \

observer.7z (766.4 KB)

rootservice.7z (1.9 MB)

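It looks like ocp-express failed because it could not connect to the meta db.
1. Please send the obd log. By default it is stored under the home directory of the user who installed obd: cd ~/.obd/log/
2. Run obd cluster list to look up the cluster name, then run
obd cluster edit-config {cluster name}, save the contents to a text file, and share it.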
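
For sharing the log directory mentioned above, one option is to pack it into a single archive first. This is only a minimal sketch: `pack_dir` is a hypothetical helper name, and the output path in the usage note is an arbitrary choice, not something obd prescribes.

```shell
# pack_dir SRC_DIR OUT_TGZ
# Hypothetical helper: archive one directory into a gzip-compressed tarball
# so it can be attached to a forum post.
pack_dir() {
  src="$1"
  out="$2"
  if [ ! -d "$src" ]; then
    echo "no such directory: $src" >&2
    return 1
  fi
  # -C switches to the parent directory so the archive holds relative paths
  tar -czf "$out" -C "$(dirname "$src")" "$(basename "$src")"
}
```

For example, `pack_dir ~/.obd/log /tmp/obd-log.tgz` would bundle the obd log directory into /tmp/obd-log.tgz.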

obd.7z (185.1 KB)
[admin@localhost log]$ obd cluster list
+--------------------------------------------------------------------+
|                            Cluster List                            |
+------------+-------------------------------------+-----------------+
| Name       | Configuration Path                  | Status (Cached) |
+------------+-------------------------------------+-----------------+
| dscpcolony | /home/admin/.obd/cluster/dscpcolony | stopped         |
+------------+-------------------------------------+-----------------+
config.txt (2.0 KB)

Also check: ps -ef | grep observer | grep -v grep

[admin@localhost log]$ ps -ef | grep observer | grep -v grep
admin 26676 1 48 08:53 ? 00:49:22 /home/admin/dscpcolony/oceanbase/bin/observer -p 2881

Then try starting ocp-express again:
obd cluster start xxx -s ip -c ocp-express
Has the proxyro password ever been changed?

[admin@localhost log]$ obd cluster start dscpcolony -s 127.0.0.1 -c ocp-expresss
[ERROR] Deploy need restart.
Use obd cluster restart dscpcolony --wp to make changes take effect.
If you still need to start the cluster, use the obd cluster start dscpcolony --wop option to start the cluster without loading parameters.
See https://www.oceanbase.com/product/ob-deployer/error-codes .
Trace ID: 265dea22-0924-11f0-982a-005056ab789a
If you want to view detailed obd logs, please run: obd display-trace 265dea22-0924-11f0-982a-005056ab789a
[admin@localhost log]$ obd display-trace 265dea22-0924-11f0-982a-005056ab789a
[2025-03-25 10:52:11.565] [DEBUG] - cmd: ['dscpcolony']
[2025-03-25 10:52:11.565] [DEBUG] - opts: {'servers': '127.0.0.1', 'components': 'ocp-expresss', 'force_delete': None, 'strict_check': None, 'without_parameter': None}
[2025-03-25 10:52:11.565] [DEBUG] - mkdir /home/admin/.obd/lock/
[2025-03-25 10:52:11.565] [DEBUG] - unknown lock mode
[2025-03-25 10:52:11.566] [DEBUG] - try to get share lock /home/admin/.obd/lock/global
[2025-03-25 10:52:11.566] [DEBUG] - share lock /home/admin/.obd/lock/global, count 1
[2025-03-25 10:52:11.566] [DEBUG] - Get Deploy by name
[2025-03-25 10:52:11.566] [DEBUG] - mkdir /home/admin/.obd/cluster/
[2025-03-25 10:52:11.566] [DEBUG] - mkdir /home/admin/.obd/config_parser/
[2025-03-25 10:52:11.567] [DEBUG] - try to get exclusive lock /home/admin/.obd/lock/deploy_dscpcolony
[2025-03-25 10:52:11.567] [DEBUG] - exclusive lock /home/admin/.obd/lock/deploy_dscpcolony, count 1
[2025-03-25 10:52:11.577] [DEBUG] - Deploy status judge
[2025-03-25 10:52:11.577] [ERROR] Deploy need restart.
[2025-03-25 10:52:11.577] [ERROR] Use obd cluster restart dscpcolony --wp to make changes take effect.
[2025-03-25 10:52:11.577] [ERROR] If you still need to start the cluster, use the obd cluster start dscpcolony --wop option to start the cluster without loading parameters.
[2025-03-25 10:52:11.579] [INFO] See https://www.oceanbase.com/product/ob-deployer/error-codes .
[2025-03-25 10:52:11.580] [INFO] Trace ID: 265dea22-0924-11f0-982a-005056ab789a
[2025-03-25 10:52:11.580] [INFO] If you want to view detailed obd logs, please run: obd display-trace 265dea22-0924-11f0-982a-005056ab789a
[2025-03-25 10:52:11.580] [DEBUG] - exclusive lock /home/admin/.obd/lock/deploy_dscpcolony release, count 0
[2025-03-25 10:52:11.580] [DEBUG] - unlock /home/admin/.obd/lock/deploy_dscpcolony
[2025-03-25 10:52:11.580] [DEBUG] - share lock /home/admin/.obd/lock/global release, count 0
[2025-03-25 10:52:11.580] [DEBUG] - unlock /home/admin/.obd/lock/global

Try connecting like this: obclient -h127.0.0.1 -P2883 -umeta@ocp -potz5tWb1On

[admin@localhost log]$ obclient -h127.0.0.1 -P2883 -umeta@ocp -potz5tWb1On
ERROR 2013 (HY000): Lost connection to MySQL server at 'reading authorization packet', system error: 11
[admin@localhost log]$ obclient -h127.0.0.1 -P2881 -umeta@ocp -potz5tWb1On
ERROR 8001 (08004): Server is initializing
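
The probes in this thread fail in different ways: through obproxy on port 2883 the connection is lost during authentication (ERROR 2013), a direct connection to the observer on port 2881 reports that the server is still initializing (ERROR 8001), and a wrong password shows up as ERROR 1045. As a rough triage aid, captured client output could be sorted like this. The function name and category labels are hypothetical, chosen only for illustration.

```shell
# classify_ob_error MESSAGE
# Hypothetical helper: map a captured client error line to a coarse category.
# The labels are illustrative, not OceanBase terminology.
classify_ob_error() {
  case "$1" in
    *"ERROR 8001"*) echo "observer-initializing" ;;
    *"ERROR 1045"*) echo "access-denied" ;;
    *"ERROR 2013"*) echo "connection-lost" ;;
    *)              echo "unknown" ;;
  esac
}
```

Feeding it the stderr of an obclient or mysql attempt makes it easier to tell whether the observer, the proxy, or a credential is the part to look at next.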

Please provide the latest obproxy.log. So far the passwords appear to match, but validation of the proxy-related user may be failing.

obproxy.7z (3.8 MB)

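message:"Access denied for user 'proxyro'@'xxx.xxx.xxx.xxx' (using password: YES)"}
It looks like the password used for the proxyro@sys connection is incorrect.
The password in the configuration is HlRLi1Jj6N.
Try connecting manually to see whether it works:
mysql -h127.0.0.1 -uproxyro@sys -P2881 -pHlRLi1Jj6N -A (use port 2881 only, and do not append #cluster-name to the user name)

If that also fails, the problem is almost certainly the proxyro user's password inside OceanBase.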

[admin@localhost log]$ mysql -h127.0.0.1 -uproxyro@sys -P2881 -pHlRLi1Jj6N -A
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1045 (42000): Access denied for user 'proxyro'@'xxx.xxx.xxx.xxx' (using password: YES)

Can I skip the proxy? I only need to bring the database itself up; proxyro, ocp and the rest are not essential.

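For a single-node deployment you can do without obproxy. ocp-express is no longer recommended; deploying OCP is recommended instead. You are on version 4.1, which is also no longer recommended; use OceanBase 4.2.5. Either rebuild the cluster on 4.2.5 or, if you still remember the original proxyro password, change it back.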

Is there any way to recover the database? I want to back up the data.

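The output shows that your observer is stuck in the initializing state, so OceanBase itself currently has a problem as well.
Try restarting only the observer without loading parameters: obd cluster restart {cluster name} -c oceanbase-ce --wop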

From what I can see, the observer process is currently running. Try logging in like this:
mysql -h127.0.0.1 -uroot@sys -P2881 -pHlRLi1Jj6N -A
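
Since the observer kept answering "Server is initializing", it may be more convenient to poll a login probe than to retry by hand. The following is a minimal sketch, not an obd feature: `wait_until_ready` is a hypothetical helper, and the retry count and delay are arbitrary.

```shell
# wait_until_ready MAX_TRIES DELAY_SECONDS CMD [ARGS...]
# Hypothetical helper: re-run CMD until it exits 0 or MAX_TRIES attempts
# have been made. Returns 0 on success, 1 if every attempt failed.
wait_until_ready() {
  max_tries="$1"
  delay="$2"
  shift 2
  try=1
  while [ "$try" -le "$max_tries" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Sleep only between attempts, not after the last one
    if [ "$try" -lt "$max_tries" ]; then
      sleep "$delay"
    fi
    try=$((try + 1))
  done
  return 1
}
```

A call such as `wait_until_ready 30 10 mysql -h127.0.0.1 -uroot@sys -P2881 -p"$OB_PASS" -e "select 1"` would retry the login every 10 seconds for up to 30 attempts. `OB_PASS` is an assumed environment variable holding the password. If the probe never succeeds, the observer log is the next place to look.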