Command inside the OMS Docker container fails with "http://127.0.0.1:8084 refused connection"

AntTech_4FCPLP · March 23, 2023, 12:28

[Environment] Test environment
[OB or other component] OMS 4.0
[Version] 4.0

Running a command inside the OMS Docker container fails:

[root@whdrcsrv403 admin]# supervisorctl status
http://127.0.0.1:8084 refused connection

The OMS web console is currently unreachable as well.

Restarting the Docker container did not fix the problem.

镜水 · March 23, 2023, 14:05

I have contacted the OMS support staff for you. Please wait a moment.

AntTech_4FCPLP · March 23, 2023, 14:16

Adding some log output (view legacy.log):

[2023-03-23 09:49:00.016][INFO][SUPERVISOR_TASK_9][HostStatusUtils:192][] fileSystem:mounted_on_path /dev/mapper/ol-logs:/u01/ds/store
[2023-03-23 09:49:00.016][INFO][SUPERVISOR_TASK_9][HostStatusUtils:218][] diskSizeInGB 399, diskUsedPercent 0.81954885
[2023-03-23 09:49:00.053][INFO][SUPERVISOR_TASK_7][CheckpointServiceImpl:288][] checker 10.25.15.84-9000:90216:0000000002 skip report heartbeat for heartbeat not update, lastModifiedMs 1679475348759.
[2023-03-23 09:49:08.974][ERROR][SUPERVISOR_TASK_7][CheckpointServiceImpl:930][] reportHeartBeat cost too much time(ms), threshold: 5000, errMsg: Read timed out, URL(POST): http://10.25.15.84:8088/service → 8010
[2023-03-23 09:49:10.051][INFO][SUPERVISOR_TASK_8][CheckpointServiceImpl:288][] checker 10.25.15.84-9000:90216:0000000002 skip report heartbeat for heartbeat not update, lastModifiedMs 1679475348759.
[2023-03-23 09:49:18.465][ERROR][SUPERVISOR_TASK_8][CheckpointServiceImpl:930][] reportHeartBeat cost too much time(ms), threshold: 5000, errMsg: Read timed out, URL(POST): http://10.25.15.84:8088/service → 8009
[2023-03-23 09:49:20.055][INFO][SUPERVISOR_TASK_4][CheckpointServiceImpl:288][] checker 10.25.15.84-9000:90216:0000000002 skip report heartbeat for heartbeat not update, lastModifiedMs 1679475348759.

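AntTech_4FCPLP · March 23, 2023, 14:42

It looks like the service on port 8084 cannot start.

镜水 · March 23, 2023, 14:50

Is this a single-node or a multi-node environment?

AntTech_4FCPLP · March 23, 2023, 15:13

Single-node deployment.

wzqiang · March 23, 2023, 15:22

The OMS container has no port mapping, right?

AntTech_4FCPLP · March 23, 2023, 15:26

Port 8084 is not listening inside the container either:

[root@whdrcsrv403 ~]# docker exec -it 52dcc0486b89 /bin/bash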

[root@whdrcsrv403 supervisor]# supervisorctl status
http://127.0.0.1:8084 refused connection
[root@whdrcsrv403 supervisor]# netstat -nltp | grep 8084
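
A plain grep for 8084 would also match unrelated ports such as 18084. As a minimal sketch, the check above could match the port exactly in the local-address column. The helper name port_listening is hypothetical, and the sketch assumes the usual netstat -nlt layout in which the local address is the fourth column.

```shell
# Hypothetical helper: reads `netstat -nlt` output on stdin and succeeds
# only if some local address (4th column) ends in ":<port>".
port_listening() {
    awk -v port="$1" '$4 ~ (":" port "$") { found = 1 } END { exit !found }'
}

# Usage inside the container:
#   netstat -nlt | port_listening 8084 || echo "nothing is listening on 8084"
```

An empty result here, as in the transcript above, means no process has bound the port, which is consistent with supervisorctl reporting a refused connection.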

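wzqiang · March 23, 2023, 15:30

Which OMS version are you using?

wzqiang · March 23, 2023, 15:35

Run ps -aux | grep supervisor to check whether the process is still there.

AntTech_4FCPLP · March 23, 2023, 15:38

There is no such process. I suspect it never started at all.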

[root@whdrcsrv403 supervisor]# ps -aux |grep supervisor
root 361 0.0 0.0 10704 2276 pts/1 S+ 15:37 0:00 grep --color=auto supervisor
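
The only line in this output is the grep command itself, which is easy to misread as a running process. A small sketch that filters it out is shown here; the helper name supervisor_procs is hypothetical.

```shell
# Hypothetical helper: reads `ps aux` output on stdin and prints only the
# supervisor-related processes, dropping the grep process itself.
supervisor_procs() {
    grep 'supervisor' | grep -v 'grep'
}

# Usage inside the container:
#   ps aux | supervisor_procs || echo "supervisord is not running"
```

With the output shown above, the filter prints nothing, which confirms that no supervisor process is running in the container.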

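AntTech_4FCPLP · March 23, 2023, 15:38

Image version: reg.docker.alibaba-inc.com/oceanbase/oms:feature_4.0.0-ce_bp1

AntTech_4FCPLP · March 23, 2023, 15:41

I suspect something is wrong with OB itself…

I just found that all the tenants have become unavailable.

wzqiang · March 23, 2023, 16:02

Right, it probably never started. Does re-running docker_init.sh not start OMS?

wzqiang · March 23, 2023, 16:36

@AntTech_4FCPLP Are you still there?

AntTech_4FCPLP · March 23, 2023, 16:47

OK, thanks!

Re-running docker_init.sh looks like it re-initializes the OMS metadata database. My OMS instance had started successfully before and had already run migration tasks.

So it reports an error:

AntTech_4FCPLP · March 23, 2023, 16:48

Re-running docker_init.sh did bring the services up: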

[root@whdrcsrv403 ~]# supervisorctl status
nginx RUNNING pid 3949, uptime 0:04:42
oms_console RUNNING pid 3957, uptime 0:04:32
oms_drc_cm RUNNING pid 4028, uptime 0:04:22
oms_drc_supervisor RUNNING pid 4297, uptime 0:04:12
sshd RUNNING pid 4542, uptime 0:04:02
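
As a rough sketch, the same status output could be checked automatically instead of by eye. The helper name not_running is hypothetical; it only assumes that each status line starts with the program name followed by its state, as in the output above.

```shell
# Hypothetical helper: reads `supervisorctl status` output on stdin, prints
# the first field of every line whose state is not RUNNING, and returns
# non-zero if any such line is found.
not_running() {
    awk 'NF == 0 { next }
         $2 != "RUNNING" { print $1; bad = 1 }
         END { exit bad }'
}

# Usage inside the container:
#   supervisorctl status | not_running && echo "all OMS services are RUNNING"
```

A "refused connection" message is also reported as a failure by this check, because its second field is not RUNNING.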

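AntTech_4FCPLP · March 23, 2023, 16:49

I checked, and the earlier migration tasks are all still there:

wzqiang · March 23, 2023, 16:53

That error is harmless. The tables OMS tries to create already exist, so the script reports an error.

AntTech_4FCPLP · March 23, 2023, 16:55

Thank you! The solution is to run docker_init.sh.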