Problem Description
During a single-node deployment of OMS-4.2.7, the OceanBase data migration tool, I ran:
bash docker_remote_deploy.sh -o /home/admin/oms -c /root/dbs/repo/oms_config.yaml -i 19.xxx.xx.xx -d 238392012d5e
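When the script reached [Step 5] Initialize OMS resource tags and resource groups (【步骤五】初始化 OMS 资源标签和资源组), the deployment output repeatedly printed: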
9 post url : http://19.xxx.xx.xx:8088/resource/host/add, data : {'status': 'ONLINE', 'ip': '19.xxx.xx.xx', 'storeLimit': 32, 'jdbcWriterLimit': 40, 'groupName': '100'} error,retry later ......
2024-12-27 15:39:02,232-urllib3.connectionpool-DEBUG connectionpool._new_conn.250 :Starting new HTTP connection (1): 19.xxx.xx.xx:8088
2024-12-27 15:39:02,234-urllib3.connectionpool-DEBUG connectionpool._make_request.484 :http://19.xxx.xx.xx:8088 "POST /resource/host/add HTTP/1.1" 503 305
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
<title>Error 503 </title>
</head>
<body>
<h2>HTTP ERROR: 503</h2>
<p>Problem accessing /resource/host/add. Reason:
<pre> Service Unavailable</pre></p>
<hr /><i><small>Powered by Jetty://</small></i>
</body>
</html>
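Troubleshooting
Reading through the deployment log, I found that oms_console kept restarting, so I started looking for oms_console-related logs inside the OMS container.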
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
# [Step 4] Restart all OMS components (takes about two minutes, please be patient)
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
supervisorctl restart nginx oms_console oms_drc_cm oms_drc_supervisor sshd
nginx: stopped
oms_console: stopped
oms_drc_cm: stopped
oms_drc_supervisor: stopped
sshd: stopped
nginx: started
oms_console: started
oms_drc_cm: started
oms_drc_supervisor: started
sshd: started
supervisorctl status nginx oms_console oms_drc_cm oms_drc_supervisor sshd
nginx RUNNING pid 51452, uptime 0:00:51
oms_console RUNNING pid 51461, uptime 0:00:41
oms_drc_cm RUNNING pid 51554, uptime 0:00:31
oms_drc_supervisor STARTING
sshd RUNNING pid 52165, uptime 0:00:10
supervisorctl status nginx oms_console oms_drc_cm oms_drc_supervisor sshd
nginx RUNNING pid 51452, uptime 0:01:51
oms_console STARTING
oms_drc_cm RUNNING pid 51554, uptime 0:01:31
oms_drc_supervisor RUNNING pid 52433, uptime 0:01:09
sshd RUNNING pid 52165, uptime 0:01:10
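The two status snapshots above show oms_console going from RUNNING back to STARTING, which is what a restart loop looks like. A minimal sketch for sampling that state over time follows; the helper name `proc_state` is hypothetical and not part of OMS or Supervisor, and it assumes the `name STATE ...` column layout seen in the output above.

```shell
#!/usr/bin/env bash
# proc_state NAME
# Hypothetical helper: reads `supervisorctl status` output on stdin and
# prints the state column (e.g. RUNNING, STARTING) for process NAME.
proc_state() {
    awk -v name="$1" '$1 == name { print $2 }'
}

# Example (inside the OMS container): sample the state a few times.
# A process that keeps returning to STARTING is being restarted.
#   for i in 1 2 3 4 5; do
#       supervisorctl status | proc_state oms_console
#       sleep 30
#   done
```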
I eventually found the key error in /home/admin/logs/ghana/Ghana/common-error.log:
Caused by: java.net.UnknownHostException: sy6042603: sy6042603: Name or service not known
at java.net.InetAddress.getLocalHost(InetAddress.java:1506)
at com.alipay.dss.core.operation.util.HostnameUtil.getCurrentHostname(HostnameUtil.java:29)
... 49 common frames omitted
Caused by: java.net.UnknownHostException: sy6042603: Name or service not known
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324)
at java.net.InetAddress.getLocalHost(InetAddress.java:1501)
... 50 common frames omitted
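Here sy6042603 is the hostname of the host machine, so the failure is that this hostname cannot be resolved from inside the container.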
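One quick way to confirm this kind of failure from a shell inside the container is to check whether the hostname appears in /etc/hosts at all. The following is a minimal sketch under that assumption; the helper name `hosts_has_name` is hypothetical (not part of OMS), and it only inspects a hosts-format file, not DNS.

```shell
#!/usr/bin/env bash
# hosts_has_name FILE NAME
# Hypothetical helper (not part of OMS): succeeds if NAME appears as a
# hostname or alias on any non-comment line of the hosts-format FILE.
hosts_has_name() {
    local file=$1 name=$2
    awk -v name="$name" '
        { sub(/#.*/, "") }    # drop comments before looking at fields
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit !found }   # exit 0 only when the name was seen
    ' "$file"
}

# Example (inside the OMS container):
#   hosts_has_name /etc/hosts "$(hostname)" || echo "hostname missing from /etc/hosts"
```

Field 1 of each line is the IP address, so the loop starts at field 2 and matches only hostnames and aliases.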
Solution
Inside the OMS container, add a line to /etc/hosts that maps the host machine's IP to its hostname:
[root@sy6042603 Ghana]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
19.xxx.xx.xx sy6042603
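The same edit can be scripted so that running it twice does not duplicate the entry. This is a sketch only: the helper name `add_hosts_entry` is hypothetical (not part of OMS), `HOST_IP` is a placeholder for the host machine's real IP, and the file is assumed to end with a newline.

```shell
#!/usr/bin/env bash
# add_hosts_entry FILE IP NAME
# Hypothetical helper (not part of OMS): appends "IP NAME" to FILE unless
# NAME is already present as a hostname on a non-comment line.
add_hosts_entry() {
    local file=$1 ip=$2 name=$3
    if awk -v name="$name" '
            { sub(/#.*/, "") }    # ignore comments
            { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
            END { exit !found }
        ' "$file"; then
        echo "$name already present in $file"
    else
        printf '%s %s\n' "$ip" "$name" >> "$file"
        echo "added $ip $name to $file"
    fi
}

# Example (inside the OMS container, as root; HOST_IP is a placeholder):
#   add_hosts_entry /etc/hosts "$HOST_IP" sy6042603
```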
After redeploying, OMS was accessible as normal.