Single-node online install of the OceanBase Community Edition LTS demo: startup reports "failed", but the related processes are already running

[Environment] Test environment
[OB or other component]
[Version]
observer version: OceanBase_CE 4.2.1.4, revision: 104000052024022918-3246b00b20dbd0cf1e1bef4b67fd31ff8fe7088f, sysname: Linux, os release: 3.10.0-1160.el7.x86_64, machine: x86_64, tz GMT offset: 08:00"
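[Problem description] Describe the problem clearly and precisely
Installed the OceanBase Community Edition LTS demo through the online installer.
Start command: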
systemctl start oceanbase
Job for oceanbase.service failed because a timeout was exceeded. See "systemctl status oceanbase.service" and "journalctl -xe" for details.

[root@OB log]# journalctl -xe
-- Unit oceanbase.service has begun starting up.
5月 29 15:54:36 OB bash[13648]: oceanbase service started at 2024-05-29 15:54:36
5月 29 15:54:37 OB bash[13648]: daemon process with PID 2277 is not running.
5月 29 15:54:39 OB bash[13648]: Failed to notify init system: Permission denied
5月 29 15:54:39 OB bash[13648]: Failed to notify init system: Permission denied
5月 29 15:54:39 OB bash[13648]: The observer is already bootstrap, please start it immediately
5月 29 15:54:39 OB bash[13648]: the start observer trace id is 23232235888028867
5月 29 15:54:39 OB bash[13648]: the response state is RUNNING
5月 29 15:54:39 OB bash[13648]: wait 6s and the retry
5月 29 15:54:45 OB bash[13648]: the response state is RUNNING
5月 29 15:54:45 OB bash[13648]: wait 6s and the retry
5月 29 15:54:51 OB bash[13648]: the response state is RUNNING
5月 29 15:54:51 OB bash[13648]: wait 6s and the retry
5月 29 15:54:57 OB bash[13648]: the response state is RUNNING
5月 29 15:54:57 OB bash[13648]: wait 6s and the retry
5月 29 15:55:03 OB bash[13648]: the response state is RUNNING
5月 29 15:55:03 OB bash[13648]: wait 6s and the retry
5月 29 15:55:09 OB bash[13648]: the response state is RUNNING
5月 29 15:55:09 OB bash[13648]: wait 6s and the retry
5月 29 15:55:15 OB bash[13648]: the response state is SUCCEED
5月 29 15:55:15 OB bash[13648]: request successfully
5月 29 15:55:16 OB bash[13648]: Failed to notify init system: Permission denied
5月 29 15:55:16 OB bash[13648]: Observer process with PID 14250 is still running.
5月 29 15:55:46 OB bash[13648]: Observer process with PID 14250 is still running.
5月 29 15:56:06 OB systemd[1]: oceanbase.service start operation timed out. Terminating.
5月 29 15:56:06 OB systemd[1]: Failed to start oceanbase.
-- Subject: Unit oceanbase.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

-- Unit oceanbase.service has failed.

-- The result is failed.
5月 29 15:56:06 OB systemd[1]: Unit oceanbase.service entered failed state.
5月 29 15:56:06 OB systemd[1]: oceanbase.service failed.
5月 29 15:56:06 OB polkitd[660]: Unregistered Authentication Agent for unix-process:13642:930659 (system bus name :1.29, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale zh_CN.UTF-8) (disconnected from bus
5月 29 15:56:16 OB bash[13648]: Observer process with PID 14250 is still running.
5月 29 15:56:46 OB bash[13648]: Observer process with PID 14250 is still running.
5月 29 15:57:16 OB bash[13648]: Observer process with PID 14250 is still running.

Checking the processes shows that they are in fact running:
root 14191 1 0 15:54 ? 00:00:00 /home/admin/oceanbase/bin/obshell daemon --ip 192.168.1.112 --port 2886
root 14203 14191 0 15:54 ? 00:00:01 /home/admin/oceanbase/bin/obshell server --ip 192.168.1.112 --port 2886
root 14250 1 6 15:54 ? 00:01:56 /home/admin/oceanbase/bin/observer

[Steps to reproduce] Operations performed before and after the problem occurred

[Attachments and logs]
observer_failed.log (1.4 MB)


Please provide the complete, detailed log. The log you attached looks like it has already been rolled over.

How did you install and deploy this? The error looks like a configuration problem. Do you have a configuration file?

Quick install using the script below:
sudo bash -c "$(curl -s https://obbusiness-private.oss-cn-shanghai.aliyuncs.com/download-center/opensource/service/installer.sh)"
I also modified the configuration file under /etc.

The log rolls over quickly, so I compressed it into the attachment. The error appears during the startup phase.
observer.zip (5.4 MB)

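Judging from the screenshot in the question, the observer appears to have started successfully, but systemd timed out afterwards. This may be related to how systemd works and is not necessarily caused by an observer failure.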
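The timestamps in the journalctl output posted in the question are consistent with this reading. A minimal sketch of the arithmetic follows; `elapsed_seconds` is a hypothetical helper written for this illustration, and the three timestamps are copied from the log. The gap between service start and the systemd timeout comes out to 90 seconds, which matches systemd's stock default start timeout. Whether this unit overrides that value is not shown in the thread, so treat the match as suggestive rather than conclusive.

```python
from datetime import datetime


def elapsed_seconds(start: str, end: str) -> int:
    """Seconds between two HH:MM:SS timestamps on the same day."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


# Timestamps copied from the journalctl output in the question.
started = "15:54:36"    # "oceanbase service started at 2024-05-29 15:54:36"
succeeded = "15:55:15"  # "the response state is SUCCEED"
timed_out = "15:56:06"  # "oceanbase.service start operation timed out. Terminating."

print(elapsed_seconds(started, succeeded))  # 39
print(elapsed_seconds(started, timed_out))  # 90
```

In other words, the observer reported SUCCEED well inside the window, and systemd still gave up at the 90-second mark, which points at the readiness notification (the repeated "Failed to notify init system: Permission denied" lines) rather than at the observer itself.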

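I have run into this while adding compatibility for some other systems: systemd has a timeout mechanism.
Here are the steps to work through first: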

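  1. Use ps to check whether the observer process actually exists.
  2. If it exists, the problem comes from systemd compatibility, not from an observer failure. If it does not exist, the observer failed to start. A failure after it has come up almost always means it crashed (core dumped), so check the observer logs and the core dump file.
  3. I have since developed a systemd forking mechanism to replace the MySQL-style notify approach, but it will not be released until early next month. The systemd experience should be better by then.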
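Steps 1 and 2 can be sketched as a small script. This is only an illustration under stated assumptions: `observer_pids` and `diagnose` are hypothetical helpers, not part of OceanBase or obshell, and the parser assumes the `ps -ef` column layout shown in the question (UID, PID, PPID, C, STIME, TTY, TIME, CMD).

```python
from typing import List


def observer_pids(ps_output: str) -> List[int]:
    """Return PIDs whose executable name is exactly 'observer' in `ps -ef` text."""
    pids = []
    for line in ps_output.splitlines():
        # Assumed layout: UID PID PPID C STIME TTY TIME CMD...
        fields = line.split(None, 7)
        if len(fields) < 8 or not fields[1].isdigit():
            continue  # skips the header row and malformed lines
        executable = fields[7].split()[0].rsplit("/", 1)[-1]
        if executable == "observer":
            pids.append(int(fields[1]))
    return pids


def diagnose(ps_output: str, journal: str) -> str:
    """Apply steps 1 and 2: is observer alive, and did systemd time out?"""
    running = bool(observer_pids(ps_output))
    timed_out = "start operation timed out" in journal
    if running and timed_out:
        return "observer is running; the failure is likely on the systemd side"
    if running:
        return "observer is running"
    return "observer is not running; check the observer log and core dumps"


# Sample input copied from the question.
SAMPLE_PS = """\
root 14191 1 0 15:54 ? 00:00:00 /home/admin/oceanbase/bin/obshell daemon --ip 192.168.1.112 --port 2886
root 14203 14191 0 15:54 ? 00:00:01 /home/admin/oceanbase/bin/obshell server --ip 192.168.1.112 --port 2886
root 14250 1 6 15:54 ? 00:01:56 /home/admin/oceanbase/bin/observer
"""
SAMPLE_JOURNAL = "OB systemd[1]: oceanbase.service start operation timed out. Terminating."

print(observer_pids(SAMPLE_PS))
print(diagnose(SAMPLE_PS, SAMPLE_JOURNAL))
```

On a live host the two inputs would come from `ps -ef` and `journalctl -u oceanbase.service`. Matching on the executable's base name keeps the obshell processes from being mistaken for the observer.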