Notes on resolving "Error 503 - Service Unavailable" when installing and deploying OMS-4.2.7 Community Edition

Problem description

While deploying OMS-4.2.7, the OceanBase data migration tool, on a single node, I ran:

bash docker_remote_deploy.sh -o /home/admin/oms -c /root/dbs/repo/oms_config.yaml -i 19.xxx.xx.xx -d 238392012d5e

When the script reached [Step 5], initializing the OMS resource labels and resource groups, the deployment console kept printing:

9 post url : http://19.xxx.xx.xx:8088/resource/host/add, data : {'status': 'ONLINE', 'ip': '19.xxx.xx.xx', 'storeLimit': 32, 'jdbcWriterLimit': 40, 'groupName': '100'} error,retry later ......
2024-12-27 15:39:02,232-urllib3.connectionpool-DEBUG connectionpool._new_conn.250 :Starting new HTTP connection (1): 19.xxx.xx.xx:8088
2024-12-27 15:39:02,234-urllib3.connectionpool-DEBUG connectionpool._make_request.484 :http://19.xxx.xx.xx:8088 "POST /resource/host/add HTTP/1.1" 503 305
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
<title>Error 503 </title>
</head>
<body>
<h2>HTTP ERROR: 503</h2>
<p>Problem accessing /resource/host/add. Reason:
<pre>    Service Unavailable</pre></p>
<hr /><i><small>Powered by Jetty://</small></i>
</body>
</html>

Troubleshooting

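Looking through the deployment log, I found that oms_console kept restarting: in the output below it is RUNNING at the first status check but back in STARTING a minute later. I then started searching the oms_console logs inside the OMS container.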

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
# [Step 4] Restart all OMS components (takes about two minutes, please be patient)
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
supervisorctl restart nginx oms_console oms_drc_cm oms_drc_supervisor sshd
nginx: stopped
oms_console: stopped
oms_drc_cm: stopped
oms_drc_supervisor: stopped
sshd: stopped
nginx: started
oms_console: started
oms_drc_cm: started
oms_drc_supervisor: started
sshd: started
supervisorctl status nginx oms_console oms_drc_cm oms_drc_supervisor sshd
nginx                            RUNNING   pid 51452, uptime 0:00:51
oms_console                      RUNNING   pid 51461, uptime 0:00:41
oms_drc_cm                       RUNNING   pid 51554, uptime 0:00:31
oms_drc_supervisor               STARTING  
sshd                             RUNNING   pid 52165, uptime 0:00:10
supervisorctl status nginx oms_console oms_drc_cm oms_drc_supervisor sshd
nginx                            RUNNING   pid 51452, uptime 0:01:51
oms_console                      STARTING  
oms_drc_cm                       RUNNING   pid 51554, uptime 0:01:31
oms_drc_supervisor               RUNNING   pid 52433, uptime 0:01:09
sshd                             RUNNING   pid 52165, uptime 0:01:10
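
The restart loop is easy to miss in the output above, because every component shows RUNNING in at least one of the two snapshots. As a rough sketch, the helper below filters `supervisorctl status` output and prints only the programs whose state is not RUNNING. The name `not_running` is hypothetical and not part of OMS, and the filter assumes the "name, state, details" column layout shown in the log.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of OMS): read `supervisorctl status`
# output on stdin and print "name state" for every program whose
# state column is anything other than RUNNING.
not_running() {
    awk 'NF >= 2 && $2 != "RUNNING" { print $1, $2 }'
}

# Usage inside the OMS container; run it a few times, since a program
# stuck in a restart loop alternates between STARTING and RUNNING:
#   supervisorctl status | not_running
```

A program that keeps reappearing in this output over several minutes is a good candidate for a closer look at its own log files.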

I eventually found the key error in /home/admin/logs/ghana/Ghana/common-error.log:

Caused by: java.net.UnknownHostException: sy6042603: sy6042603: Name or service not known
        at java.net.InetAddress.getLocalHost(InetAddress.java:1506)
        at com.alipay.dss.core.operation.util.HostnameUtil.getCurrentHostname(HostnameUtil.java:29)
        ... 49 common frames omitted
Caused by: java.net.UnknownHostException: sy6042603: Name or service not known
        at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
        at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929)
        at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324)
        at java.net.InetAddress.getLocalHost(InetAddress.java:1501)
        ... 50 common frames omitted

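Here sy6042603 is the hostname of the host machine, so the exception means that this hostname could not be resolved to an IP address inside the container.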
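
The same lookup can be checked from a shell before digging into Java logs. This is a minimal sketch, assuming `getent` is available in the container (it is on glibc-based images); `resolves` is a hypothetical helper name. It asks the system resolver for the machine's own hostname, which is the lookup that `InetAddress.getLocalHost()` fails on in the stack trace above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if NAME resolves through the system
# resolver (/etc/hosts, DNS, ...), fail otherwise.
resolves() {
    getent hosts "$1" > /dev/null
}

# Check the machine's own hostname (uname -n prints the node name).
name="$(uname -n)"
if resolves "$name"; then
    echo "$name resolves"
else
    echo "$name does NOT resolve; Java may throw UnknownHostException"
fi
```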

Solution

Inside the OMS container, add a line to /etc/hosts that maps the host machine's IP address to its hostname:

[root@sy6042603 Ghana]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
19.xxx.xx.xx sy6042603 
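
If the entry has to be added more than once (for example after several deployment attempts), appending blindly leaves duplicate lines. The sketch below is one way to make the edit idempotent; `add_hosts_entry` is a hypothetical helper, not an OMS command, and it only checks whether the hostname is already mapped, not whether the existing address is correct.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append "IP NAME" to a hosts file unless NAME is
# already mapped there. Arguments: hosts file, IP address, hostname.
add_hosts_entry() {
    _file="$1"; _ip="$2"; _name="$3"
    # Ignore comment lines and look for NAME in any field after the
    # address; awk exits 0 only when a match was found.
    if awk -v n="$_name" '
            /^[[:space:]]*#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }' "$_file"; then
        return 0
    fi
    printf '%s %s\n' "$_ip" "$_name" >> "$_file"
}

# Usage inside the OMS container, as root (replace <host-ip> with the
# real address of the host machine):
#   add_hosts_entry /etc/hosts <host-ip> "$(uname -n)"
```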

After redeploying, OMS could be accessed normally.

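I originally planned to publish this post in the blog section, but because it is so short, the official staff suggested posting it in the Q&A section instead.
When I run into problems in practice, I often search the community site by keyword, so I hope that recording how this issue was diagnosed and resolved gives other community users a useful reference.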