OMS resource specification adjustment fails - Consumer already exists:drc

[Environment] Test environment
[Product] OMS
[Version] 4.2.2
[Problem description] While using OAT 4.1.0 to increase the memory of OMS 4.2.2 from 10G to 20G, the errors below occurred.

[Steps to reproduce] Increase the resource specification

The log is attached:
OMS-subtask_operate_container@0 (1).log (430.6 KB)

Error 1

I tried the following steps to resolve it:
MySQL [oms_cm]> select * from location_cm;
+----+----------+-----------------------+--------+---------------------+---------------------+
| id | location | cm_url                | status | gmt_created         | gmt_modified        |
+----+----------+-----------------------+--------+---------------------+---------------------+
|  1 |      100 | http://127.0.0.1:8088 |      0 | 2024-07-08 13:46:59 | 2024-07-08 13:46:59 |
+----+----------+-----------------------+--------+---------------------+---------------------+
1 row in set (0.01 sec)
MySQL [oms_cm]>
MySQL [oms_cm]> truncate table location_cm;
Query OK, 0 rows affected (0.24 sec)
MySQL [oms_cm]>
MySQL [oms_cm]> select * from location_cm;
Empty set (0.01 sec)

Error 2. Please help analyze it, thanks.
[2024-07-08T16:02:45.746+0800] INFO - 2024-07-08 16:03:39,956-urllib3.connectionpool-DEBUG connectionpool._new_conn.250 :Starting new HTTP connection (1): 127.0.0.1:8088

[2024-07-08T16:02:45.945+0800] INFO - 2024-07-08 16:03:40,101-urllib3.connectionpool-DEBUG connectionpool._make_request.483 :http://127.0.0.1:8088 "POST /consumer/create HTTP/1.1" 200 426

[2024-07-08T16:02:45.946+0800] INFO - {"errMsg":"Consumer already exists:drc reason: Consumer already exists:drc","errorDetail":{"code":"CM-RESOIE000003","extraContext":null,"level":"ERROR","message":"Consumer already exists:drc","messageMcmsContext":null,"messageMcmsKey":null,"proposal":null,"proposalMcmsContext":null,"proposalMcmsKey":null,"reason":null,"reasonMcmsContext":null,"reasonMcmsKey":null,"upstreamErrorDetail":null},"errCode":302,"isSuccess":false}

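Hello, this is an OMS issue. The operation was performed through OAT, but the error is reported by OMS. Could you tell me what is wrong with the title?
My main difficulty is that I do not know how to troubleshoot Error 2. The community edition of OMS uses the same components, and the error says that the drc consumer already exists. I would appreciate a troubleshooting direction or a concrete method. With limited personal resources, it is hard for me to deploy both the enterprise and community editions at the same time.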

OK, we will look into the issue and reply to you shortly.


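The "Consumer already exists" message in Error 2 is returned by the /consumer/create endpoint of the cm service in OMS. As for why the consumer is being created twice, one possibility is that an earlier step was supposed to clean up the existing consumer and the cleanup did not succeed; there may also be other causes. OAT is an enterprise-edition tool, so I am not familiar with the details of what it does. Please contact your support contact so that the OAT team can look into this together.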
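Note that in the log for Error 2 the HTTP status of POST /consumer/create is 200, while the failure is only visible in the JSON body ("isSuccess": false). A minimal Python sketch of how a caller could tell "consumer already exists" apart from other failures is shown below. It is based only on the response body pasted in this thread; the function name `classify_cm_response` is hypothetical, and using the code `CM-RESOIE000003` as the "already exists" marker is an assumption, not documented cm behavior.

```python
import json

# Error code copied from the log in this thread. Treating it as the
# "already exists" marker is an assumption, not documented behavior.
ALREADY_EXISTS_CODE = "CM-RESOIE000003"


def classify_cm_response(body):
    """Return 'ok', 'already_exists', or 'error' for a cm JSON response body."""
    payload = json.loads(body)
    if payload.get("isSuccess"):
        return "ok"
    # errorDetail may be null in the response, so fall back to an empty dict.
    detail = payload.get("errorDetail") or {}
    message = payload.get("errMsg") or ""
    if detail.get("code") == ALREADY_EXISTS_CODE or "already exists" in message:
        return "already_exists"
    return "error"


# Shortened copy of the response body from Error 2.
sample = (
    '{"errMsg":"Consumer already exists:drc reason: Consumer already exists:drc",'
    '"errorDetail":{"code":"CM-RESOIE000003","level":"ERROR",'
    '"message":"Consumer already exists:drc"},'
    '"errCode":302,"isSuccess":false}'
)
print(classify_cm_response(sample))  # prints: already_exists
```

Whether a retried task can safely treat "already_exists" as success depends on how OAT manages the consumer, which this thread does not establish.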

Thank you. Error 2 was most likely caused by repeated retries.

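Today I set up a fresh OAT 4.2.0 and deployed OMS 4.2.2 again, and it looks like I have hit a bug. Whether I use OAT 4.2.0 or OAT 4.1.0 to deploy OMS 4.2.2, the step "start_first_batch_oms_container" fails with the error on port 8090 shown below. The problem reproduces consistently, and deploying other OMS versions does not trigger it.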

[2024-07-09T15:54:39.815+0800] INFO - {'taskParam': '{"password": "MjAyNC0wNy0wOQ==", "user": "RM_oms"}'}
[2024-07-09T15:54:39.825+0800] INFO - 2024-07-09 15:54:36,753-urllib3.connectionpool-DEBUG connectionpool._new_conn.250 :Starting new HTTP connection (1): localhost:8090
[2024-07-09T15:54:39.827+0800] ERROR - Traceback (most recent call last):
[2024-07-09T15:54:39.828+0800] ERROR - File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
[2024-07-09T15:54:39.828+0800] ERROR - "__main__", fname, loader, pkg_name)
[2024-07-09T15:54:39.829+0800] ERROR - File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
[2024-07-09T15:54:39.829+0800] ERROR - exec code in run_globals
[2024-07-09T15:54:39.831+0800] ERROR - File "/root/omsflow/scripts/units/oms_cluster_manager.py", line 196, in <module>
[2024-07-09T15:54:39.832+0800] ERROR - main()
[2024-07-09T15:54:39.832+0800] ERROR - File "/root/omsflow/scripts/units/oms_cluster_manager.py", line 192, in main
[2024-07-09T15:54:39.832+0800] ERROR - o.add_resource()
[2024-07-09T15:54:39.833+0800] ERROR - File "/root/omsflow/scripts/units/oms_cluster_manager.py", line 175, in add_resource
[2024-07-09T15:54:39.833+0800] ERROR - self.add_resource_nodes(self.role)
[2024-07-09T15:54:39.833+0800] ERROR - File "/root/omsflow/scripts/units/oms_cluster_manager.py", line 168, in add_resource_nodes
[2024-07-09T15:54:39.834+0800] ERROR - self.add_rm_cluster(cm_endpoint)
[2024-07-09T15:54:39.834+0800] ERROR - File "/root/omsflow/scripts/units/oms_cluster_manager.py", line 108, in add_rm_cluster
[2024-07-09T15:54:39.834+0800] ERROR - token = self._auth(rm_endpoint)
[2024-07-09T15:54:39.836+0800] ERROR - File "/root/omsflow/scripts/units/oms_cluster_manager.py", line 84, in _auth
[2024-07-09T15:54:39.836+0800] ERROR - auth_ret = requests.post(url, data=params)
[2024-07-09T15:54:39.837+0800] ERROR - File "/usr/lib/python2.7/site-packages/requests/api.py", line 117, in post
[2024-07-09T15:54:39.837+0800] ERROR - return request('post', url, data=data, json=json, **kwargs)
[2024-07-09T15:54:39.838+0800] ERROR - File "/usr/lib/python2.7/site-packages/requests/api.py", line 61, in request
[2024-07-09T15:54:39.838+0800] ERROR - return session.request(method=method, url=url, **kwargs)
[2024-07-09T15:54:39.838+0800] ERROR - File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 542, in request
[2024-07-09T15:54:39.838+0800] ERROR - resp = self.send(prep, **send_kwargs)
[2024-07-09T15:54:39.838+0800] ERROR - File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 655, in send
[2024-07-09T15:54:39.839+0800] ERROR - r = adapter.send(request, **kwargs)
[2024-07-09T15:54:39.839+0800] ERROR - File "/usr/lib/python2.7/site-packages/requests/adapters.py", line 516, in send
[2024-07-09T15:54:39.839+0800] ERROR - raise ConnectionError(e, request=request)
[2024-07-09T15:54:39.839+0800] ERROR - requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=8090): Max retries exceeded with url: /api/auth (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f2077bfebd0>: Failed to establish a new connection: [Errno 111] Connection refused',))
[2024-07-09T15:54:39.874+0800] INFO -
[2024-07-09T15:54:39.876+0800] INFO -
[2024-07-09T15:54:39.876+0800] INFO - # [END] Initialization failed, current command: python -m omsflow.scripts.units.oms_cluster_manager add_resource
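
"[Errno 111] Connection refused" in the traceback above means that nothing was accepting TCP connections on localhost:8090 at the moment the script posted to /api/auth. One way to narrow this down is to probe the port separately and see whether the service never comes up or only comes up late. The following is a minimal sketch using only the Python standard library; the helper name `wait_for_port` and the timeout values are illustrative assumptions and are not part of the omsflow scripts.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=2.0):
    """Return True once a TCP connection to host:port succeeds.

    Retries every `interval` seconds and returns False if no connection
    could be made before `timeout` seconds have passed.
    """
    deadline = time.time() + timeout
    while True:
        try:
            sock = socket.create_connection((host, port), timeout=interval)
            sock.close()
            return True
        except (socket.error, OSError):
            # Connection refused or timed out: give up after the deadline.
            if time.time() >= deadline:
                return False
            time.sleep(interval)


# Single probe of the endpoint from the traceback (timeout=0 means no retry).
reachable = wait_for_port("localhost", 8090, timeout=0.0, interval=1.0)
print("localhost:8090 reachable: %s" % reachable)
```

If the port stays unreachable for as long as the container is up, the logs of the service expected to listen on 8090 inside the OMS container are the next place to look. If it becomes reachable after a delay, the failure may be a timing issue in the deployment step. Both readings are hypotheses that this thread does not confirm.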

The operation steps are in the attached PDF:
OMS报错-HTTPConnectionPool(host=‘localhost’, port=8090.pdf (2.8 MB)

Log:
OMS报错-subtask_start_first_batch_oms_container@0 (4).log (185.2 KB)