Error when creating a server in the OAT admin platform

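[Environment] Production or test environment
[OB or other component]
[Version] 3.2.4
[Problem description] Describe the problem clearly
[Steps to reproduce] Operations performed before and after the problem occurred
[Symptoms and impact]
[image]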

[Attachments]
dagrun_43.zip (9.4 KB)

???

Please wait a moment. A ticket has been filed to consult the relevant colleagues.

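A couple more questions:
1. OAT can install and deploy a three-node cluster directly, right? I don't need to install the cluster manually?
2. Can the error above be ignored so that I can go straight to the later steps?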

Please send the details of the config_deps alert, along with the OAT version number.

The alert details should be in the attachment, which is the log I downloaded. The OAT version is 4.1.0.

Received, please wait.

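You can skip this step for now. There is a precheck step later on; any missing packages will be reported there as well, and you can install them manually.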
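Before running the precheck, it can help to see which commands are missing on the target host. The following is a minimal sketch, not part of OAT: `missing_commands` is a hypothetical helper, and the command list is only an illustration (`nc` is the one command the precheck log later in this thread reports as missing).

```python
import shutil


def missing_commands(commands):
    """Return the commands that cannot be found on PATH."""
    return [cmd for cmd in commands if shutil.which(cmd) is None]


if __name__ == "__main__":
    # Hypothetical list; the real requirements come from the OAT precheck output.
    for cmd in missing_commands(["nc", "systemctl"]):
        print(f"command not found: {cmd}")
```

Run it on the server being added, not on the OAT host, since the precheck inspects the target machine.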

dagrun_43.zip (15.8 KB)
[image]
The precheck reported errors, and I can't tell what is wrong.


Received. Please wait for the relevant colleague to take a look.


Please manually install the packages marked in the red box.

dagrun_43.zip (24.2 KB)


I installed them, but it still fails.

[2023-06-27T14:17:46.888+0800] INFO - ### SUMMARY OF ISSUES IN PRE-CHECK ###
[2023-06-27T14:17:46.889+0800] INFO - check CPU count: 16 < 32 … EXPECT >= 32 … FAIL
[2023-06-27T14:17:46.889+0800] INFO - TIPS: replace another machine with more CPU
[2023-06-27T14:17:46.889+0800] INFO - check total MEM: 62 GB < 128 GB … EXPECT >= 128 GB … FAIL
[2023-06-27T14:17:46.889+0800] INFO - TIPS: replace another machine with more MEM
[2023-06-27T14:17:46.889+0800] INFO - check /data/1, NOT mounted … EXPECT mounted as individual disk … FAIL
[2023-06-27T14:17:46.890+0800] INFO - TIPS: re-part disk to mount /data/1
[2023-06-27T14:17:46.890+0800] INFO - check /data/log1, NOT mounted … EXPECT mounted as individual disk … FAIL
[2023-06-27T14:17:46.890+0800] INFO - TIPS: re-part disk to mount /data/log1
[2023-06-27T14:17:46.890+0800] INFO - can not find command nc … FAIL
[2023-06-27T14:17:46.890+0800] INFO - check irqbalance status: active != inactive … EXPECT inactive … FAIL
[2023-06-27T14:17:46.890+0800] INFO - TIPS: stop irqbalance service on PHY
[2023-06-27T14:17:46.890+0800] INFO - systemctl stop irqbalance
[2023-06-27T14:17:46.892+0800] INFO - execute command on 10.30.37.119:
rm -f /tmp/precheck.shEHYon8Rk
[2023-06-27T14:17:46.961+0800] ERROR - Task failed with exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/airflow/decorators/base.py", line 217, in execute
    return_value = super().execute(context)
  File "/usr/local/lib/python3.9/site-packages/airflow/operators/python.py", line 175, in execute
    return_value = self.execute_callable()
  File "/usr/local/lib/python3.9/site-packages/airflow/operators/python.py", line 192, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
  File "/oat/task_engine/dags/init_server_with_tag.py", line 79, in precheck
    common.server_precheck(ctx, logger=logger)
  File "/oat/task_engine/plugins/common.py", line 1542, in server_precheck
    raise RuntimeError('server precheck failed, please see the summary info above for details')
RuntimeError: server precheck failed, please see the summary info above for details
[2023-06-27T14:17:46.968+0800] INFO - Marking task as FAILED. dag_id=init_server_with_tag, task_id=precheck, execution_date=20230627T024116, start_date=20230627T061733, end_date=20230627T061746
[2023-06-27T14:17:46.970+0800] INFO - Running statement: update oat_audit set status='failed', update_time=utc_timestamp(), failed_reason=%s where id=%s, parameters: ['failed task instance is init_server_with_tag__precheck__20230627 and exception information is server precheck failed, please see the summary info above for details', 50]
[2023-06-27T14:17:46.971+0800] INFO - Rows affected: 1
[2023-06-27T14:17:46.984+0800] ERROR - Failed to execute job 409 for task precheck (server precheck failed, please see the summary info above for details; 8417)
[2023-06-27T14:17:47.009+0800] INFO - Task exited with return code 1
[2023-06-27T14:17:47.033+0800] INFO - 0 downstream tasks scheduled from follow-on schedule check
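The traceback only says that the precheck failed. The actionable items are the summary lines ending in FAIL. A minimal sketch for pulling those lines out of a downloaded log follows, assuming the line layout shown above. `failed_checks` is a hypothetical helper, not an OAT tool, and the pattern accepts either `…` or `...` as the separator because the forum may have converted one into the other.

```python
import re

# A summary line looks like: "[timestamp] INFO - <message> … FAIL"
FAIL_LINE = re.compile(r"INFO - (?P<msg>.*?)\s*(?:…|\.\.\.)\s*FAIL\s*$")


def failed_checks(log_text):
    """Return the message part of every log line that ends in FAIL."""
    failures = []
    for line in log_text.splitlines():
        match = FAIL_LINE.search(line)
        if match:
            failures.append(match.group("msg"))
    return failures


# Three lines copied from the log above; the TIPS line must be skipped.
sample = """\
[2023-06-27T14:17:46.889+0800] INFO - check CPU count: 16 < 32 … EXPECT >= 32 … FAIL
[2023-06-27T14:17:46.889+0800] INFO - TIPS: replace another machine with more CPU
[2023-06-27T14:17:46.890+0800] INFO - can not find command nc … FAIL
"""

if __name__ == "__main__":
    for item in failed_checks(sample):
        print(item)
```

Applied to the full log above, this would list seven failed items: CPU count, total memory, the two mount points, the missing `nc` command, and the irqbalance status.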

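Can these be ignored? It says the disks are not mounted, but I only have one disk, data, and it is already mounted. Do the directories have to be mounted separately?
[image]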
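The summary expects /data/1 and /data/log1 to each be "mounted as individual disk", so a directory that merely exists under an already mounted /data would not pass. A quick way to see the difference is to test whether each path is itself a mount point. This is a sketch using Python's standard library; `unmounted` is a hypothetical helper, and how OAT performs its own check is not confirmed here.

```python
import os


def unmounted(paths):
    """Return the paths that are not mount points (missing paths count too)."""
    return [p for p in paths if not os.path.ismount(p)]


if __name__ == "__main__":
    # The two directories named in the pre-check summary.
    for path in unmounted(["/data/1", "/data/log1"]):
        print(f"{path} is not a separate mount point")
```

If /data is the only mounted filesystem, both paths would be reported, which matches the two mount-related FAIL lines in the log.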

Hello, the colleague handling the ticket is looking into it. Please wait.

Are you able to connect normally now?

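No. I skipped the server creation step, since it probably doesn't matter. I'm now at the metadb step: two nodes were created successfully, but the third one can't be created... it just won't connect.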

Could you send the logs?

dagrun_78.zip (13.2 KB)

Received. I'm checking with the colleague handling the ticket.


There is also this: the scan has been running for a long time and is stuck here, and I don't see any logs for it.