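Thanks for the help. Issue 2 was most likely caused by the repeated retries.
Today I redeployed an OAT-420 and then deployed OMS422 again, and it looks like I have hit a bug. Whether I deploy OMS422 from OAT420 or from OAT410, the step "start_first_batch_oms_container" fails with the error below on port 8090. It reproduces reliably, and deploying other OMS versions does not trigger it.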
[2024-07-09T15:54:39.815+0800] INFO - {'taskParam': '{"password": "MjAyNC0wNy0wOQ==", "user": "RM_oms"}'}
[2024-07-09T15:54:39.825+0800] INFO - 2024-07-09 15:54:36,753-urllib3.connectionpool-DEBUG connectionpool._new_conn.250 :Starting new HTTP connection (1): localhost:8090
[2024-07-09T15:54:39.827+0800] ERROR - Traceback (most recent call last):
[2024-07-09T15:54:39.828+0800] ERROR - File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
[2024-07-09T15:54:39.828+0800] ERROR - "__main__", fname, loader, pkg_name)
[2024-07-09T15:54:39.829+0800] ERROR - File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
[2024-07-09T15:54:39.829+0800] ERROR - exec code in run_globals
[2024-07-09T15:54:39.831+0800] ERROR - File "/root/omsflow/scripts/units/oms_cluster_manager.py", line 196, in <module>
[2024-07-09T15:54:39.832+0800] ERROR - main()
[2024-07-09T15:54:39.832+0800] ERROR - File "/root/omsflow/scripts/units/oms_cluster_manager.py", line 192, in main
[2024-07-09T15:54:39.832+0800] ERROR - o.add_resource()
[2024-07-09T15:54:39.833+0800] ERROR - File "/root/omsflow/scripts/units/oms_cluster_manager.py", line 175, in add_resource
[2024-07-09T15:54:39.833+0800] ERROR - self.add_resource_nodes(self.role)
[2024-07-09T15:54:39.833+0800] ERROR - File "/root/omsflow/scripts/units/oms_cluster_manager.py", line 168, in add_resource_nodes
[2024-07-09T15:54:39.834+0800] ERROR - self.add_rm_cluster(cm_endpoint)
[2024-07-09T15:54:39.834+0800] ERROR - File "/root/omsflow/scripts/units/oms_cluster_manager.py", line 108, in add_rm_cluster
[2024-07-09T15:54:39.834+0800] ERROR - token = self._auth(rm_endpoint)
[2024-07-09T15:54:39.836+0800] ERROR - File "/root/omsflow/scripts/units/oms_cluster_manager.py", line 84, in _auth
[2024-07-09T15:54:39.836+0800] ERROR - auth_ret = requests.post(url, data=params)
[2024-07-09T15:54:39.837+0800] ERROR - File "/usr/lib/python2.7/site-packages/requests/api.py", line 117, in post
[2024-07-09T15:54:39.837+0800] ERROR - return request('post', url, data=data, json=json, **kwargs)
[2024-07-09T15:54:39.838+0800] ERROR - File "/usr/lib/python2.7/site-packages/requests/api.py", line 61, in request
[2024-07-09T15:54:39.838+0800] ERROR - return session.request(method=method, url=url, **kwargs)
[2024-07-09T15:54:39.838+0800] ERROR - File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 542, in request
[2024-07-09T15:54:39.838+0800] ERROR - resp = self.send(prep, **send_kwargs)
[2024-07-09T15:54:39.838+0800] ERROR - File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 655, in send
[2024-07-09T15:54:39.839+0800] ERROR - r = adapter.send(request, **kwargs)
[2024-07-09T15:54:39.839+0800] ERROR - File "/usr/lib/python2.7/site-packages/requests/adapters.py", line 516, in send
[2024-07-09T15:54:39.839+0800] ERROR - raise ConnectionError(e, request=request)
[2024-07-09T15:54:39.839+0800] ERROR - requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=8090): Max retries exceeded with url: /api/auth (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f2077bfebd0>: Failed to establish a new connection: [Errno 111] Connection refused',))
[2024-07-09T15:54:39.874+0800] INFO -
[2024-07-09T15:54:39.876+0800] INFO -
[2024-07-09T15:54:39.876+0800] INFO - # [END] Initialization failed, current command: python -m omsflow.scripts.units.oms_cluster_manager add_resource
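The traceback shows that the `requests.post` call to `localhost:8090` (`/api/auth`) was refused with Errno 111, meaning nothing was accepting connections on that port at that moment. One possibility, which I have not confirmed, is that the service behind port 8090 had not finished starting when the script ran. To help narrow this down, here is a minimal sketch of a port check that could be run on the OMS host while the step executes. The function name `wait_for_port` and the timeout values are my own assumptions, and it is not part of omsflow.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=2.0):
    """Poll host:port until a TCP connection succeeds or `timeout` seconds pass.

    Hypothetical helper for diagnosis only; returns True if the port
    accepted a connection, False if it kept refusing until the deadline.
    """
    deadline = time.time() + timeout
    while True:
        try:
            # create_connection raises socket.error on refusal or timeout
            sock = socket.create_connection((host, port), timeout=interval)
            sock.close()
            return True
        except socket.error:
            if time.time() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("localhost", 8090, timeout=60)` returning False would suggest the service never came up, while True after some delay would point to a startup timing issue instead.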
The steps I followed are in the PDF:
OMS报错-HTTPConnectionPool(host=‘localhost’, port=8090.pdf (2.8 MB)
Log:
OMS报错-subtask_start_first_batch_oms_container@0 (4).log (185.2 KB)