【 Environment 】Production or test environment
【 OB or other component 】
【 Version 】oat 4.0.2
【 Symptoms and impact 】
[2023-02-23T11:06:07.033+0800] INFO - usermod: user admin is currently used by process 785132
[2023-02-23T11:06:07.034+0800] INFO - change user admin's uid to 500 error!
[2023-02-23T11:06:07.035+0800] ERROR - Task failed with exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/airflow/decorators/base.py", line 188, in execute
    return_value = super().execute(context)
  File "/usr/local/lib/python3.9/site-packages/airflow/operators/python.py", line 175, in execute
    return_value = self.execute_callable()
  File "/usr/local/lib/python3.9/site-packages/airflow/operators/python.py", line 193, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
  File "/oat/task_engine/dags/init_server_with_tag.py", line 50, in create_admin_user
    common.config_os_admin(ctx, logger)
  File "/oat/task_engine/plugins/common.py", line 1354, in config_os_admin
    raise RuntimeError('config os admin failed')
RuntimeError: config os admin failed
[2023-02-23T11:06:07.041+0800] INFO - Marking task as FAILED. dag_id=init_server_with_tag, task_id=create_admin_user, execution_date=20230223T025316, start_date=20230223T030606, end_date=20230223T030607
[2023-02-23T11:06:07.049+0800] ERROR - Failed to execute job 423 for task create_admin_user (config os admin failed; 120910)
[2023-02-23T11:06:07.093+0800] INFO - Task exited with return code 1
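Does this server already have an admin user? Run "id admin" to check.
OAT creates an admin user with uid 500. Judging from this error, that step is probably where it got stuck; you may already have a user with that name.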
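The check suggested above can be sketched in Python. This is only an illustration, not part of OAT: the function names uid_status and blocking_process are made up for this example, and the regular expression is taken from the usermod message in the log above. The first function reports whether a user exists and has the expected uid; the second pulls the user name and process id out of the "currently used by process" line, which suggests usermod refused the change because admin still owned a running process.

```python
import pwd
import re

# Pattern copied from the usermod message in the log above (hypothetical helper).
_IN_USE = re.compile(r"user (\S+) is currently used by process (\d+)")


def uid_status(name, expected_uid, lookup=pwd.getpwnam):
    """Return "missing", "uid_mismatch", or "ok" for the given user name.

    `lookup` defaults to pwd.getpwnam and can be replaced for testing.
    """
    try:
        entry = lookup(name)
    except KeyError:
        # pwd.getpwnam raises KeyError when the user does not exist.
        return "missing"
    return "ok" if entry.pw_uid == expected_uid else "uid_mismatch"


def blocking_process(log_line):
    """Return (user, pid) if the line reports a user held by a process, else None."""
    match = _IN_USE.search(log_line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))
```

On the target server, uid_status("admin", 500) returning "uid_mismatch" would match the situation described here: an admin user exists but with a different uid. If blocking_process finds a pid, that process likely has to end before the uid can be changed.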
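I had created this admin user manually and added the server using the public-key method. After I manually changed admin's uid to 500, that stage succeeded. However, the precheck stage now fails: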
[2023-02-23T12:24:56.476+0800] INFO - execute command on x.x.x.x:
rm -f /tmp/precheck.shhJ3mleYX
[2023-02-23T12:24:56.549+0800] INFO -
[2023-02-23T12:24:56.550+0800] ERROR - Task failed with exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/airflow/decorators/base.py", line 188, in execute
    return_value = super().execute(context)
  File "/usr/local/lib/python3.9/site-packages/airflow/operators/python.py", line 175, in execute
    return_value = self.execute_callable()
  File "/usr/local/lib/python3.9/site-packages/airflow/operators/python.py", line 193, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
  File "/oat/task_engine/dags/init_server_with_tag.py", line 82, in precheck
    common.server_precheck(ctx, logger=logger)
  File "/oat/task_engine/plugins/common.py", line 1486, in server_precheck
    raise RuntimeError('server precheck failed, please see the summary info above for details')
RuntimeError: server precheck failed, please see the summary info above for details
[2023-02-23T12:24:56.555+0800] INFO - Marking task as FAILED. dag_id=init_server_with_tag, task_id=precheck, execution_date=20230223T025316, start_date=20230223T042447, end_date=20230223T042456
[2023-02-23T12:24:56.564+0800] ERROR - Failed to execute job 433 for task precheck (server precheck failed, please see the summary info above for details; 137551)
[2023-02-23T12:24:56.601+0800] INFO - Task exited with return code 1
execute command on x.x.x.x:
rm -f /tmp/precheck.shhJ3mleYX
Did this command run successfully? Can the /tmp directory be read and written normally?
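One way to answer the /tmp question is a small probe that creates, writes, reads back, and removes a file in the directory. The sketch below is an assumption-laden illustration and not what OAT's precheck does internally; the function name dir_is_read_write and the "precheck." file prefix are chosen only for this example.

```python
import os
import tempfile


def dir_is_read_write(path):
    """Return True if a file can be created, written, read back, and removed in `path`."""
    try:
        # mkstemp raises OSError if the directory is missing or not writable.
        fd, name = tempfile.mkstemp(prefix="precheck.", dir=path)
    except OSError:
        return False
    try:
        with os.fdopen(fd, "w+") as handle:
            handle.write("probe")
            handle.seek(0)
            ok = handle.read() == "probe"
    except OSError:
        ok = False
    finally:
        try:
            os.remove(name)
        except OSError:
            # Failing to delete the probe file also counts as a problem.
            ok = False
    return ok
```

Running dir_is_read_write("/tmp") as the admin user on the target server would show whether that account can use /tmp. It does not explain the precheck failure by itself, since the summary output that the error message points to is not included in this thread.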
It succeeds, but after deleting the file and retrying I still hit the same problem, which is strange.
OAT is a commercial tool. For this issue you can submit a ticket on the official website to get support.