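【Environment】Test environment
【OB or other component】OCP deployment
【Version】4.3.2
【Problem description】The create-product task fails with the error: machine resource is not enough to hold a new unit
【Reproduction steps】Relevant operations before and after the problem occurred
1. The variable and the global parameter ob_create_table_strict_mode are both OFF
2. SELECT * FROM __all_server; result: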
2024-08-14 18:14:26.465723 2024-08-14 18:15:47.306766 192.168.31.31 2882 1 META_ZONE_1 2881 1 active 0 2.2.77_116010032023022813-4f4fbb6de5d75b4db00ee05d44d56d8c1500c21c(Feb 28 2023 13:48:43) 0 1723630544320002 0 1 0
3. SELECT tenant_id,tenant_name,locality FROM __all_tenant; result:
1 sys FULL{1}@META_ZONE_1
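The error text suggests that the single server in META_ZONE_1 does not have enough unassigned CPU or memory for the requested unit (MAX_CPU 2, MAX_MEMORY '3G' in the log below). A minimal sketch of that comparison follows. The `ServerResources` fields, the `unit_fits` helper, and the numbers are all hypothetical: real totals and assigned amounts would have to come from the cluster's per-server resource statistics, which this post does not include, and OceanBase's actual admission check may apply further rules.

```python
from dataclasses import dataclass

UNIT_BYTES = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text: str) -> int:
    """Convert a size literal such as '3G' or '512M' to bytes."""
    text = text.strip().upper()
    if text and text[-1] in UNIT_BYTES:
        return int(float(text[:-1]) * UNIT_BYTES[text[-1]])
    return int(text)


@dataclass
class ServerResources:
    # Hypothetical fields; real values would come from the cluster's
    # per-server resource statistics, which are not shown in this post.
    cpu_total: float
    cpu_assigned: float
    mem_total: int
    mem_assigned: int


def unit_fits(server: ServerResources, max_cpu: float, max_memory: str) -> bool:
    """Return True if the server still has room for one more unit."""
    cpu_free = server.cpu_total - server.cpu_assigned
    mem_free = server.mem_total - server.mem_assigned
    return cpu_free >= max_cpu and mem_free >= parse_size(max_memory)


# Made-up example: 8 GiB of memory with 6 GiB already assigned.
busy = ServerResources(
    cpu_total=8,
    cpu_assigned=5,
    mem_total=parse_size("8G"),
    mem_assigned=parse_size("6G"),
)
print(unit_fits(busy, max_cpu=2, max_memory="3G"))  # False: only 2 GiB left
```

In this made-up case the CPU check passes but the memory check does not, which is the kind of shortfall that would have to be ruled out on the real server.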
【Attachments and logs】
############{1}{2024-08-14T19:44:30+08:00}############
[2024-08-14T19:44:30.627+0800] INFO - Dependencies all met for <TaskInstance: init_ocp.create_tenant manual__2024-08-14T11:44:28.523156+00:00 [queued]>
[2024-08-14T19:44:30.637+0800] INFO - Dependencies all met for <TaskInstance: init_ocp.create_tenant manual__2024-08-14T11:44:28.523156+00:00 [queued]>
[2024-08-14T19:44:30.638+0800] INFO -
[2024-08-14T19:44:30.638+0800] INFO - Starting attempt 1 of 1
[2024-08-14T19:44:30.638+0800] INFO -
[2024-08-14T19:44:30.657+0800] INFO - Executing <Task(_PythonDecoratedOperator): create_tenant> on 2024-08-14 11:44:28.523156+00:00
[2024-08-14T19:44:30.660+0800] INFO - Started process 6806 to run task
[2024-08-14T19:44:30.663+0800] INFO - Running: ['airflow', 'tasks', 'run', 'init_ocp', 'create_tenant', 'manual__2024-08-14T11:44:28.523156+00:00', '--job-id', '205', '--raw', '--subdir', 'DAGS_FOLDER/init_ocp.py', '--cfg-path', '/tmp/tmpd_l7nfz5']
[2024-08-14T19:44:30.665+0800] INFO - Job 205: Subtask create_tenant
[2024-08-14T19:44:30.727+0800] INFO - Running <TaskInstance: init_ocp.create_tenant manual__2024-08-14T11:44:28.523156+00:00 [running]> on host master01
[2024-08-14T19:44:30.796+0800] INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_OWNER=airflow
AIRFLOW_CTX_DAG_ID=init_ocp
AIRFLOW_CTX_TASK_ID=create_tenant
AIRFLOW_CTX_EXECUTION_DATE=2024-08-14T11:44:28.523156+00:00
AIRFLOW_CTX_TRY_NUMBER=1
AIRFLOW_CTX_DAG_RUN_ID=manual__2024-08-14T11:44:28.523156+00:00
[2024-08-14T19:44:30.797+0800] INFO - use metadb connection
[2024-08-14T19:44:30.797+0800] INFO - Running statement: select a.id, a.ip, a.hardware, b.name as idc, b.region from oat_server a, oat_idc b where a.idc_id=b.id and a.id in (%s), parameters: [1]
[2024-08-14T19:44:30.798+0800] INFO - Rows affected: 1
[2024-08-14T19:44:30.799+0800] INFO - Execute query: select distinct(zone) from __all_server order by zone, args: None
[2024-08-14T19:44:30.800+0800] INFO - Execute rows: 1
[2024-08-14T19:44:30.800+0800] INFO - Execute query: select info from __all_zone where zone=%s and name='region', args: ('META_ZONE_1',)
[2024-08-14T19:44:30.801+0800] INFO - Execute rows: 1
[2024-08-14T19:44:30.801+0800] INFO - Execute query: select tenant_id from __all_resource_pool where name=%s, args: ('jbfmeta_resource_pool',)
[2024-08-14T19:44:30.801+0800] INFO - Execute rows: 0
[2024-08-14T19:44:30.801+0800] INFO - Execute query: select unit_config_id from __all_unit_config where name=%s, args: ('jbfmeta_unit',)
[2024-08-14T19:44:30.802+0800] INFO - Execute rows: 0
[2024-08-14T19:44:30.802+0800] INFO - Execute query: CREATE RESOURCE UNIT IF NOT EXISTS jbfmeta_unit MAX_CPU 2, MAX_MEMORY '3G', MAX_IOPS 128 ,MAX_DISK_SIZE '1G', MAX_SESSION_NUM 10000, args: None
[2024-08-14T19:44:31.052+0800] INFO - Execute rows: 0
[2024-08-14T19:44:31.052+0800] INFO - Execute query: CREATE RESOURCE POOL IF NOT EXISTS jbfmeta_resource_pool UNIT='jbfmeta_unit', UNIT_NUM=1, ZONE_LIST=('META_ZONE_1'), args: None
[2024-08-14T19:44:31.178+0800] ERROR - Task failed with exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/airflow/decorators/base.py", line 217, in execute
    return_value = super().execute(context)
  File "/usr/local/lib/python3.9/site-packages/airflow/operators/python.py", line 175, in execute
    return_value = self.execute_callable()
  File "/usr/local/lib/python3.9/site-packages/airflow/operators/python.py", line 192, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
  File "/oat/task_engine/dags/init_ocp.py", line 54, in create_tenant
    common.create_tenant(ctx, logger, product='ocp')
  File "/oat/task_engine/plugins/common.py", line 731, in create_tenant
    cur.execute(sql)
  File "/oat/task_engine/plugins/utils.py", line 1612, in execute
    res = super().execute(query, args)
  File "/usr/local/lib/python3.9/site-packages/pymysql/cursors.py", line 148, in execute
    result = self._query(query)
  File "/usr/local/lib/python3.9/site-packages/pymysql/cursors.py", line 310, in _query
    conn.query(q)
  File "/usr/local/lib/python3.9/site-packages/pymysql/connections.py", line 548, in query
    self._affected_rows = self._read_query_result(unbuffered=unbuffered)
  File "/usr/local/lib/python3.9/site-packages/pymysql/connections.py", line 775, in _read_query_result
    result.read()
  File "/usr/local/lib/python3.9/site-packages/pymysql/connections.py", line 1156, in read
    first_packet = self.connection._read_packet()
  File "/usr/local/lib/python3.9/site-packages/pymysql/connections.py", line 725, in _read_packet
    packet.raise_for_error()
  File "/usr/local/lib/python3.9/site-packages/pymysql/protocol.py", line 221, in raise_for_error
    err.raise_mysql_exception(self._data)
  File "/usr/local/lib/python3.9/site-packages/pymysql/err.py", line 143, in raise_mysql_exception
    raise errorclass(errno, errval)
pymysql.err.OperationalError: (4624, 'machine resource is not enough to hold a new unit')
[2024-08-14T19:44:31.185+0800] INFO - Marking task as FAILED. dag_id=init_ocp, task_id=create_tenant, execution_date=20240814T114428, start_date=20240814T114430, end_date=20240814T114431
[2024-08-14T19:44:31.186+0800] INFO - Running statement: update oat_audit set status='failed', update_time=utc_timestamp(), failed_reason=%s where id=%s, parameters: ["failed task instance is init_ocp__create_tenant__20240814 and exception information is (4624, 'machine resource is not enough to hold a new unit')", 58]
[2024-08-14T19:44:31.187+0800] INFO - Rows affected: 1
[2024-08-14T19:44:31.222+0800] ERROR - Failed to execute job 205 for task create_tenant ((4624, 'machine resource is not enough to hold a new unit'); 6806)
[2024-08-14T19:44:31.235+0800] INFO - Task exited with return code 1
[2024-08-14T19:44:31.272+0800] INFO - 0 downstream tasks scheduled from follow-on schedule check
############{2}{2024-08-14T19:54:56+08:00}############
[2024-08-14T19:54:56.522+0800] INFO - Dependencies all met for <TaskInstance: init_ocp.create_tenant manual__2024-08-14T11:44:28.523156+00:00 [queued]>
[2024-08-14T19:54:56.533+0800] INFO - Dependencies all met for <TaskInstance: init_ocp.create_tenant manual__2024-08-14T11:44:28.523156+00:00 [queued]>
[2024-08-14T19:54:56.533+0800] INFO -
[2024-08-14T19:54:56.533+0800] INFO - Starting attempt 2 of 2
[2024-08-14T19:54:56.533+0800] INFO -
[2024-08-14T19:54:56.553+0800] INFO - Executing <Task(_PythonDecoratedOperator): create_tenant> on 2024-08-14 11:44:28.523156+00:00
[2024-08-14T19:54:56.556+0800] INFO - Started process 9807 to run task
[2024-08-14T19:54:56.559+0800] INFO - Running: ['airflow', 'tasks', 'run', 'init_ocp', 'create_tenant', 'manual__2024-08-14T11:44:28.523156+00:00', '--job-id', '206', '--raw', '--subdir', 'DAGS_FOLDER/init_ocp.py', '--cfg-path', '/tmp/tmplbzo9dvo']
[2024-08-14T19:54:56.561+0800] INFO - Job 206: Subtask create_tenant
[2024-08-14T19:54:56.626+0800] INFO - Running <TaskInstance: init_ocp.create_tenant manual__2024-08-14T11:44:28.523156+00:00 [running]> on host master01
[2024-08-14T19:54:56.692+0800] INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_OWNER=airflow
AIRFLOW_CTX_DAG_ID=init_ocp
AIRFLOW_CTX_TASK_ID=create_tenant
AIRFLOW_CTX_EXECUTION_DATE=2024-08-14T11:44:28.523156+00:00
AIRFLOW_CTX_TRY_NUMBER=2
AIRFLOW_CTX_DAG_RUN_ID=manual__2024-08-14T11:44:28.523156+00:00
[2024-08-14T19:54:56.693+0800] INFO - use metadb connection
[2024-08-14T19:54:56.694+0800] INFO - Running statement: select a.id, a.ip, a.hardware, b.name as idc, b.region from oat_server a, oat_idc b where a.idc_id=b.id and a.id in (%s), parameters: [1]
[2024-08-14T19:54:56.695+0800] INFO - Rows affected: 1
[2024-08-14T19:54:56.696+0800] INFO - Execute query: select distinct(zone) from __all_server order by zone, args: None
[2024-08-14T19:54:56.696+0800] INFO - Execute rows: 1
[2024-08-14T19:54:56.697+0800] INFO - Execute query: select info from __all_zone where zone=%s and name='region', args: ('META_ZONE_1',)
[2024-08-14T19:54:56.697+0800] INFO - Execute rows: 1
[2024-08-14T19:54:56.697+0800] INFO - Execute query: select tenant_id from __all_resource_pool where name=%s, args: ('jbfmeta_resource_pool',)
[2024-08-14T19:54:56.698+0800] INFO - Execute rows: 0
[2024-08-14T19:54:56.698+0800] INFO - Execute query: select unit_config_id from __all_unit_config where name=%s, args: ('jbfmeta_unit',)
[2024-08-14T19:54:56.698+0800] INFO - Execute rows: 1
[2024-08-14T19:54:56.698+0800] INFO - Execute query: select name from __all_resource_pool where unit_config_id=%s, args: (1002,)
[2024-08-14T19:54:56.700+0800] INFO - Execute rows: 0
[2024-08-14T19:54:56.700+0800] INFO - Execute query: drop resource unit jbfmeta_unit, args: None
[2024-08-14T19:54:56.737+0800] INFO - Execute rows: 0
[2024-08-14T19:54:56.737+0800] INFO - Execute query: CREATE RESOURCE UNIT IF NOT EXISTS jbfmeta_unit MAX_CPU 2, MAX_MEMORY '3G', MAX_IOPS 128 ,MAX_DISK_SIZE '1G', MAX_SESSION_NUM 10000, args: None
[2024-08-14T19:54:56.871+0800] INFO - Execute rows: 0
[2024-08-14T19:54:56.871+0800] INFO - Execute query: CREATE RESOURCE POOL IF NOT EXISTS jbfmeta_resource_pool UNIT='jbfmeta_unit', UNIT_NUM=1, ZONE_LIST=('META_ZONE_1'), args: None
[2024-08-14T19:54:56.938+0800] ERROR - Task failed with exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/airflow/decorators/base.py", line 217, in execute
    return_value = super().execute(context)
  File "/usr/local/lib/python3.9/site-packages/airflow/operators/python.py", line 175, in execute
    return_value = self.execute_callable()
  File "/usr/local/lib/python3.9/site-packages/airflow/operators/python.py", line 192, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
  File "/oat/task_engine/dags/init_ocp.py", line 54, in create_tenant
    common.create_tenant(ctx, logger, product='ocp')
  File "/oat/task_engine/plugins/common.py", line 731, in create_tenant
    cur.execute(sql)
  File "/oat/task_engine/plugins/utils.py", line 1612, in execute
    res = super().execute(query, args)
  File "/usr/local/lib/python3.9/site-packages/pymysql/cursors.py", line 148, in execute
    result = self._query(query)
  File "/usr/local/lib/python3.9/site-packages/pymysql/cursors.py", line 310, in _query
    conn.query(q)
  File "/usr/local/lib/python3.9/site-packages/pymysql/connections.py", line 548, in query
    self._affected_rows = self._read_query_result(unbuffered=unbuffered)
  File "/usr/local/lib/python3.9/site-packages/pymysql/connections.py", line 775, in _read_query_result
    result.read()
  File "/usr/local/lib/python3.9/site-packages/pymysql/connections.py", line 1156, in read
    first_packet = self.connection._read_packet()
  File "/usr/local/lib/python3.9/site-packages/pymysql/connections.py", line 725, in _read_packet
    packet.raise_for_error()
  File "/usr/local/lib/python3.9/site-packages/pymysql/protocol.py", line 221, in raise_for_error
    err.raise_mysql_exception(self._data)
  File "/usr/local/lib/python3.9/site-packages/pymysql/err.py", line 143, in raise_mysql_exception
    raise errorclass(errno, errval)
pymysql.err.OperationalError: (4624, 'machine resource is not enough to hold a new unit')
[2024-08-14T19:54:56.946+0800] INFO - Marking task as FAILED. dag_id=init_ocp, task_id=create_tenant, execution_date=20240814T114428, start_date=20240814T115456, end_date=20240814T115456
[2024-08-14T19:54:56.947+0800] INFO - Running statement: update oat_audit set status='failed', update_time=utc_timestamp(), failed_reason=%s where id=%s, parameters: ["failed task instance is init_ocp__create_tenant__20240814 and exception information is (4624, 'machine resource is not enough to hold a new unit')", 58]
[2024-08-14T19:54:56.947+0800] INFO - Rows affected: 1
[2024-08-14T19:54:56.969+0800] ERROR - Failed to execute job 206 for task create_tenant ((4624, 'machine resource is not enough to hold a new unit'); 9807)
[2024-08-14T19:54:57.011+0800] INFO - Task exited with return code 1
[2024-08-14T19:54:57.046+0800] INFO - 0 downstream tasks scheduled from follow-on schedule check
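Unlike the first attempt, this retry finds a leftover jbfmeta_unit config, checks that no resource pool references it, drops it, and creates it again before the pool creation fails with the same 4624 error. A rough reconstruction of that sequence is sketched here. It is inferred only from the logged statements and is not the actual code in /oat/task_engine/plugins/common.py; `recreate_unit` and `FakeCursor` are hypothetical names, and the fake cursor only stands in for a real DB-API cursor so the sketch runs without a database.

```python
class FakeCursor:
    """Stand-in for a DB-API cursor: records SQL and replays canned rows."""

    def __init__(self, canned):
        self.canned = canned  # maps a SQL prefix to the row to return
        self.statements = []
        self._row = None

    def execute(self, sql, args=None):
        self.statements.append(sql)
        self._row = next(
            (row for prefix, row in self.canned.items() if sql.startswith(prefix)),
            None,
        )

    def fetchone(self):
        return self._row


def recreate_unit(cur, unit_name, create_sql):
    """Drop a leftover, unreferenced unit config, then create it again."""
    cur.execute(
        "select unit_config_id from __all_unit_config where name=%s", (unit_name,)
    )
    row = cur.fetchone()
    if row is not None:
        cur.execute(
            "select name from __all_resource_pool where unit_config_id=%s", (row[0],)
        )
        if cur.fetchone() is None:
            # No pool references the old config, so the sketch drops it.
            cur.execute(f"drop resource unit {unit_name}")
    cur.execute(create_sql)


# Replay the situation of this attempt: the unit exists (id 1002), no pool uses it.
cur = FakeCursor({"select unit_config_id": (1002,)})
recreate_unit(
    cur,
    "jbfmeta_unit",
    "CREATE RESOURCE UNIT IF NOT EXISTS jbfmeta_unit MAX_CPU 2, MAX_MEMORY '3G', "
    "MAX_IOPS 128, MAX_DISK_SIZE '1G', MAX_SESSION_NUM 10000",
)
for statement in cur.statements:
    print(statement)
```

The cleanup only resets the unit config. It does not free any server resources, so under this reading a retry cannot succeed until capacity changes or a smaller unit is requested.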
############{3}{2024-08-14T20:10:32+08:00}############
[2024-08-14T20:10:32.808+0800] INFO - Dependencies all met for <TaskInstance: init_ocp.create_tenant manual__2024-08-14T11:44:28.523156+00:00 [queued]>
[2024-08-14T20:10:32.820+0800] INFO - Dependencies all met for <TaskInstance: init_ocp.create_tenant manual__2024-08-14T11:44:28.523156+00:00 [queued]>
[2024-08-14T20:10:32.820+0800] INFO -
[2024-08-14T20:10:32.820+0800] INFO - Starting attempt 3 of 3
[2024-08-14T20:10:32.821+0800] INFO -
[2024-08-14T20:10:32.841+0800] INFO - Executing <Task(_PythonDecoratedOperator): create_tenant> on 2024-08-14 11:44:28.523156+00:00
[2024-08-14T20:10:32.844+0800] INFO - Started process 14497 to run task
[2024-08-14T20:10:32.847+0800] INFO - Running: ['airflow', 'tasks', 'run', 'init_ocp', 'create_tenant', 'manual__2024-08-14T11:44:28.523156+00:00', '--job-id', '209', '--raw', '--subdir', 'DAGS_FOLDER/init_ocp.py', '--cfg-path', '/tmp/tmpofcrjsjc']
[2024-08-14T20:10:32.849+0800] INFO - Job 209: Subtask create_tenant
[2024-08-14T20:10:32.912+0800] INFO - Running <TaskInstance: init_ocp.create_tenant manual__2024-08-14T11:44:28.523156+00:00 [running]> on host master01
[2024-08-14T20:10:32.978+0800] INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_OWNER=airflow
AIRFLOW_CTX_DAG_ID=init_ocp
AIRFLOW_CTX_TASK_ID=create_tenant
AIRFLOW_CTX_EXECUTION_DATE=2024-08-14T11:44:28.523156+00:00
AIRFLOW_CTX_TRY_NUMBER=3
AIRFLOW_CTX_DAG_RUN_ID=manual__2024-08-14T11:44:28.523156+00:00
[2024-08-14T20:10:32.979+0800] INFO - use metadb connection
[2024-08-14T20:10:32.980+0800] INFO - Running statement: select a.id, a.ip, a.hardware, b.name as idc, b.region from oat_server a, oat_idc b where a.idc_id=b.id and a.id in (%s), parameters: [1]
[2024-08-14T20:10:32.980+0800] INFO - Rows affected: 1
[2024-08-14T20:10:32.982+0800] INFO - Execute query: select distinct(zone) from __all_server order by zone, args: None
[2024-08-14T20:10:32.982+0800] INFO - Execute rows: 1
[2024-08-14T20:10:32.982+0800] INFO - Execute query: select info from __all_zone where zone=%s and name='region', args: ('META_ZONE_1',)
[2024-08-14T20:10:32.983+0800] INFO - Execute rows: 1
[2024-08-14T20:10:32.983+0800] INFO - Execute query: select tenant_id from __all_resource_pool where name=%s, args: ('jbfmeta_resource_pool',)
[2024-08-14T20:10:32.983+0800] INFO - Execute rows: 0
[2024-08-14T20:10:32.984+0800] INFO - Execute query: select unit_config_id from __all_unit_config where name=%s, args: ('jbfmeta_unit',)
[2024-08-14T20:10:32.984+0800] INFO - Execute rows: 1
[2024-08-14T20:10:32.984+0800] INFO - Execute query: select name from __all_resource_pool where unit_config_id=%s, args: (1003,)
[2024-08-14T20:10:32.985+0800] INFO - Execute rows: 0
[2024-08-14T20:10:32.985+0800] INFO - Execute query: drop resource unit jbfmeta_unit, args: None
[2024-08-14T20:10:33.030+0800] INFO - Execute rows: 0
[2024-08-14T20:10:33.031+0800] INFO - Execute query: CREATE RESOURCE UNIT IF NOT EXISTS jbfmeta_unit MAX_CPU 2, MAX_MEMORY '3G', MAX_IOPS 128 ,MAX_DISK_SIZE '1G', MAX_SESSION_NUM 10000, args: None
[2024-08-14T20:10:33.264+0800] INFO - Execute rows: 0
[2024-08-14T20:10:33.265+0800] INFO - Execute query: CREATE RESOURCE POOL IF NOT EXISTS jbfmeta_resource_pool UNIT='jbfmeta_unit', UNIT_NUM=1, ZONE_LIST=('META_ZONE_1'), args: None
[2024-08-14T20:10:33.332+0800] ERROR - Task failed with exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/airflow/decorators/base.py", line 217, in execute
    return_value = super().execute(context)
  File "/usr/local/lib/python3.9/site-packages/airflow/operators/python.py", line 175, in execute
    return_value = self.execute_callable()
  File "/usr/local/lib/python3.9/site-packages/airflow/operators/python.py", line 192, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
  File "/oat/task_engine/dags/init_ocp.py", line 54, in create_tenant
    common.create_tenant(ctx, logger, product='ocp')
  File "/oat/task_engine/plugins/common.py", line 731, in create_tenant
    cur.execute(sql)
  File "/oat/task_engine/plugins/utils.py", line 1612, in execute
    res = super().execute(query, args)
  File "/usr/local/lib/python3.9/site-packages/pymysql/cursors.py", line 148, in execute
    result = self._query(query)
  File "/usr/local/lib/python3.9/site-packages/pymysql/cursors.py", line 310, in _query
    conn.query(q)
  File "/usr/local/lib/python3.9/site-packages/pymysql/connections.py", line 548, in query
    self._affected_rows = self._read_query_result(unbuffered=unbuffered)
  File "/usr/local/lib/python3.9/site-packages/pymysql/connections.py", line 775, in _read_query_result
    result.read()
  File "/usr/local/lib/python3.9/site-packages/pymysql/connections.py", line 1156, in read
    first_packet = self.connection._read_packet()
  File "/usr/local/lib/python3.9/site-packages/pymysql/connections.py", line 725, in _read_packet
    packet.raise_for_error()
  File "/usr/local/lib/python3.9/site-packages/pymysql/protocol.py", line 221, in raise_for_error
    err.raise_mysql_exception(self._data)
  File "/usr/local/lib/python3.9/site-packages/pymysql/err.py", line 143, in raise_mysql_exception
    raise errorclass(errno, errval)
pymysql.err.OperationalError: (4624, 'machine resource is not enough to hold a new unit')
[2024-08-14T20:10:33.340+0800] INFO - Marking task as FAILED. dag_id=init_ocp, task_id=create_tenant, execution_date=20240814T114428, start_date=20240814T121032, end_date=20240814T121033
[2024-08-14T20:10:33.340+0800] INFO - Running statement: update oat_audit set status='failed', update_time=utc_timestamp(), failed_reason=%s where id=%s, parameters: ["failed task instance is init_ocp__create_tenant__20240814 and exception information is (4624, 'machine resource is not enough to hold a new unit')", 58]
[2024-08-14T20:10:33.341+0800] INFO - Rows affected: 1
[2024-08-14T20:10:33.360+0800] ERROR - Failed to execute job 209 for task create_tenant ((4624, 'machine resource is not enough to hold a new unit'); 14497)
[2024-08-14T20:10:33.379+0800] INFO - Task exited with return code 1
[2024-08-14T20:10:33.416+0800] INFO - 0 downstream tasks scheduled from follow-on schedule check
############{4}{2024-08-14T20:11:38+08:00}############
[2024-08-14T20:11:38.642+0800] INFO - Dependencies all met for <TaskInstance: init_ocp.create_tenant manual__2024-08-14T11:44:28.523156+00:00 [queued]>
[2024-08-14T20:11:38.652+0800] INFO - Dependencies all met for <TaskInstance: init_ocp.create_tenant manual__2024-08-14T11:44:28.523156+00:00 [queued]>
[2024-08-14T20:11:38.652+0800] INFO -
[2024-08-14T20:11:38.653+0800] INFO - Starting attempt 4 of 4
[2024-08-14T20:11:38.653+0800] INFO -
[2024-08-14T20:11:38.672+0800] INFO - Executing <Task(_PythonDecoratedOperator): create_tenant> on 2024-08-14 11:44:28.523156+00:00
[2024-08-14T20:11:38.675+0800] INFO - Started process 14796 to run task
[2024-08-14T20:11:38.678+0800] INFO - Running: ['airflow', 'tasks', 'run', 'init_ocp', 'create_tenant', 'manual__2024-08-14T11:44:28.523156+00:00', '--job-id', '210', '--raw', '--subdir', 'DAGS_FOLDER/init_ocp.py', '--cfg-path', '/tmp/tmp6hf5d2rp']
[2024-08-14T20:11:38.680+0800] INFO - Job 210: Subtask create_tenant
[2024-08-14T20:11:38.745+0800] INFO - Running <TaskInstance: init_ocp.create_tenant manual__2024-08-14T11:44:28.523156+00:00 [running]> on host master01
[2024-08-14T20:11:38.811+0800] INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_OWNER=airflow
AIRFLOW_CTX_DAG_ID=init_ocp
AIRFLOW_CTX_TASK_ID=create_tenant
AIRFLOW_CTX_EXECUTION_DATE=2024-08-14T11:44:28.523156+00:00
AIRFLOW_CTX_TRY_NUMBER=4
AIRFLOW_CTX_DAG_RUN_ID=manual__2024-08-14T11:44:28.523156+00:00
[2024-08-14T20:11:38.812+0800] INFO - use metadb connection
[2024-08-14T20:11:38.813+0800] INFO - Running statement: select a.id, a.ip, a.hardware, b.name as idc, b.region from oat_server a, oat_idc b where a.idc_id=b.id and a.id in (%s), parameters: [1]
[2024-08-14T20:11:38.813+0800] INFO - Rows affected: 1
[2024-08-14T20:11:38.815+0800] INFO - Execute query: select distinct(zone) from __all_server order by zone, args: None
[2024-08-14T20:11:38.815+0800] INFO - Execute rows: 1
[2024-08-14T20:11:38.816+0800] INFO - Execute query: select info from __all_zone where zone=%s and name='region', args: ('META_ZONE_1',)
[2024-08-14T20:11:38.816+0800] INFO - Execute rows: 1
[2024-08-14T20:11:38.816+0800] INFO - Execute query: select tenant_id from __all_resource_pool where name=%s, args: ('jbfmeta_resource_pool',)
[2024-08-14T20:11:38.817+0800] INFO - Execute rows: 0
[2024-08-14T20:11:38.817+0800] INFO - Execute query: select unit_config_id from __all_unit_config where name=%s, args: ('jbfmeta_unit',)
[2024-08-14T20:11:38.817+0800] INFO - Execute rows: 1
[2024-08-14T20:11:38.817+0800] INFO - Execute query: select name from __all_resource_pool where unit_config_id=%s, args: (1004,)
[2024-08-14T20:11:38.818+0800] INFO - Execute rows: 0
[2024-08-14T20:11:38.818+0800] INFO - Execute query: drop resource unit jbfmeta_unit, args: None
[2024-08-14T20:11:38.884+0800] INFO - Execute rows: 0
[2024-08-14T20:11:38.884+0800] INFO - Execute query: CREATE RESOURCE UNIT IF NOT EXISTS jbfmeta_unit MAX_CPU 2, MAX_MEMORY '3G', MAX_IOPS 128 ,MAX_DISK_SIZE '1G', MAX_SESSION_NUM 10000, args: None
[2024-08-14T20:11:39.017+0800] INFO - Execute rows: 0
[2024-08-14T20:11:39.018+0800] INFO - Execute query: CREATE RESOURCE POOL IF NOT EXISTS jbfmeta_resource_pool UNIT='jbfmeta_unit', UNIT_NUM=1, ZONE_LIST=('META_ZONE_1'), args: None
[2024-08-14T20:11:39.087+0800] ERROR - Task failed with exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/airflow/decorators/base.py", line 217, in execute
    return_value = super().execute(context)
  File "/usr/local/lib/python3.9/site-packages/airflow/operators/python.py", line 175, in execute
    return_value = self.execute_callable()
  File "/usr/local/lib/python3.9/site-packages/airflow/operators/python.py", line 192, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
  File "/oat/task_engine/dags/init_ocp.py", line 54, in create_tenant
    common.create_tenant(ctx, logger, product='ocp')
  File "/oat/task_engine/plugins/common.py", line 731, in create_tenant
    cur.execute(sql)
  File "/oat/task_engine/plugins/utils.py", line 1612, in execute
    res = super().execute(query, args)
  File "/usr/local/lib/python3.9/site-packages/pymysql/cursors.py", line 148, in execute
    result = self._query(query)
  File "/usr/local/lib/python3.9/site-packages/pymysql/cursors.py", line 310, in _query
    conn.query(q)
  File "/usr/local/lib/python3.9/site-packages/pymysql/connections.py", line 548, in query
    self._affected_rows = self._read_query_result(unbuffered=unbuffered)
  File "/usr/local/lib/python3.9/site-packages/pymysql/connections.py", line 775, in _read_query_result
    result.read()
  File "/usr/local/lib/python3.9/site-packages/pymysql/connections.py", line 1156, in read
    first_packet = self.connection._read_packet()
  File "/usr/local/lib/python3.9/site-packages/pymysql/connections.py", line 725, in _read_packet
    packet.raise_for_error()
  File "/usr/local/lib/python3.9/site-packages/pymysql/protocol.py", line 221, in raise_for_error
    err.raise_mysql_exception(self._data)
  File "/usr/local/lib/python3.9/site-packages/pymysql/err.py", line 143, in raise_mysql_exception
    raise errorclass(errno, errval)
pymysql.err.OperationalError: (4624, 'machine resource is not enough to hold a new unit')
[2024-08-14T20:11:39.094+0800] INFO - Marking task as FAILED. dag_id=init_ocp, task_id=create_tenant, execution_date=20240814T114428, start_date=20240814T121138, end_date=20240814T121139
[2024-08-14T20:11:39.095+0800] INFO - Running statement: update oat_audit set status='failed', update_time=utc_timestamp(), failed_reason=%s where id=%s, parameters: ["failed task instance is init_ocp__create_tenant__20240814 and exception information is (4624, 'machine resource is not enough to hold a new unit')", 58]
[2024-08-14T20:11:39.096+0800] INFO - Rows affected: 1
[2024-08-14T20:11:39.122+0800] ERROR - Failed to execute job 210 for task create_tenant ((4624, 'machine resource is not enough to hold a new unit'); 14796)
[2024-08-14T20:11:39.129+0800] INFO - Task exited with return code 1
[2024-08-14T20:11:39.165+0800] INFO - 0 downstream tasks scheduled from follow-on schedule check
############{5}{2024-08-14T20:11:59+08:00}############
[2024-08-14T20:11:59.800+0800] INFO - Dependencies all met for <TaskInstance: init_ocp.create_tenant manual__2024-08-14T11:44:28.523156+00:00 [queued]>
[2024-08-14T20:11:59.810+0800] INFO - Dependencies all met for <TaskInstance: init_ocp.create_tenant manual__2024-08-14T11:44:28.523156+00:00 [queued]>
[2024-08-14T20:11:59.811+0800] INFO -
[2024-08-14T20:11:59.811+0800] INFO - Starting attempt 5 of 5
[2024-08-14T20:11:59.811+0800] INFO -
[2024-08-14T20:11:59.830+0800] INFO - Executing <Task(_PythonDecoratedOperator): create_tenant> on 2024-08-14 11:44:28.523156+00:00
[2024-08-14T20:11:59.833+0800] INFO - Started process 14803 to run task
[2024-08-14T20:11:59.836+0800] INFO - Running: ['airflow', 'tasks', 'run', 'init_ocp', 'create_tenant', 'manual__2024-08-14T11:44:28.523156+00:00', '--job-id', '211', '--raw', '--subdir', 'DAGS_FOLDER/init_ocp.py', '--cfg-path', '/tmp/tmpz4ujrbzx']
[2024-08-14T20:11:59.838+0800] INFO - Job 211: Subtask create_tenant
[2024-08-14T20:11:59.904+0800] INFO - Running <TaskInstance: init_ocp.create_tenant manual__2024-08-14T11:44:28.523156+00:00 [running]> on host master01
[2024-08-14T20:11:59.971+0800] INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_OWNER=airflow
AIRFLOW_CTX_DAG_ID=init_ocp
AIRFLOW_CTX_TASK_ID=create_tenant
AIRFLOW_CTX_EXECUTION_DATE=2024-08-14T11:44:28.523156+00:00
AIRFLOW_CTX_TRY_NUMBER=5
AIRFLOW_CTX_DAG_RUN_ID=manual__2024-08-14T11:44:28.523156+00:00
[2024-08-14T20:11:59.972+0800] INFO - use metadb connection
[2024-08-14T20:11:59.973+0800] INFO - Running statement: select a.id, a.ip, a.hardware, b.name as idc, b.region from oat_server a, oat_idc b where a.idc_id=b.id and a.id in (%s), parameters: [1]
[2024-08-14T20:11:59.973+0800] INFO - Rows affected: 1
[2024-08-14T20:11:59.975+0800] INFO - Execute query: select distinct(zone) from __all_server order by zone, args: None
[2024-08-14T20:11:59.975+0800] INFO - Execute rows: 1
[2024-08-14T20:11:59.976+0800] INFO - Execute query: select info from __all_zone where zone=%s and name='region', args: ('META_ZONE_1',)
[2024-08-14T20:11:59.976+0800] INFO - Execute rows: 1
[2024-08-14T20:11:59.976+0800] INFO - Execute query: select tenant_id from __all_resource_pool where name=%s, args: ('jbfmeta_resource_pool',)
[2024-08-14T20:11:59.977+0800] INFO - Execute rows: 0
[2024-08-14T20:11:59.977+0800] INFO - Execute query: select unit_config_id from __all_unit_config where name=%s, args: ('jbfmeta_unit',)
[2024-08-14T20:11:59.977+0800] INFO - Execute rows: 1
[2024-08-14T20:11:59.978+0800] INFO - Execute query: select name from __all_resource_pool where unit_config_id=%s, args: (1005,)
[2024-08-14T20:11:59.978+0800] INFO - Execute rows: 0
[2024-08-14T20:11:59.978+0800] INFO - Execute query: drop resource unit jbfmeta_unit, args: None
[2024-08-14T20:12:00.070+0800] INFO - Execute rows: 0
[2024-08-14T20:12:00.070+0800] INFO - Execute query: CREATE RESOURCE UNIT IF NOT EXISTS jbfmeta_unit MAX_CPU 2, MAX_MEMORY '3G', MAX_IOPS 128 ,MAX_DISK_SIZE '1G', MAX_SESSION_NUM 10000, args: None
[2024-08-14T20:12:00.220+0800] INFO - Execute rows: 0
[2024-08-14T20:12:00.220+0800] INFO - Execute query: CREATE RESOURCE POOL IF NOT EXISTS jbfmeta_resource_pool UNIT='jbfmeta_unit', UNIT_NUM=1, ZONE_LIST=('META_ZONE_1'), args: None
[2024-08-14T20:12:00.287+0800] ERROR - Task failed with exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/airflow/decorators/base.py", line 217, in execute
    return_value = super().execute(context)
  File "/usr/local/lib/python3.9/site-packages/airflow/operators/python.py", line 175, in execute
    return_value = self.execute_callable()
  File "/usr/local/lib/python3.9/site-packages/airflow/operators/python.py", line 192, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
  File "/oat/task_engine/dags/init_ocp.py", line 54, in create_tenant
    common.create_tenant(ctx, logger, product='ocp')
  File "/oat/task_engine/plugins/common.py", line 731, in create_tenant
    cur.execute(sql)
  File "/oat/task_engine/plugins/utils.py", line 1612, in execute
    res = super().execute(query, args)
  File "/usr/local/lib/python3.9/site-packages/pymysql/cursors.py", line 148, in execute
    result = self._query(query)
  File "/usr/local/lib/python3.9/site-packages/pymysql/cursors.py", line 310, in _query
    conn.query(q)
  File "/usr/local/lib/python3.9/site-packages/pymysql/connections.py", line 548, in query
    self._affected_rows = self._read_query_result(unbuffered=unbuffered)
  File "/usr/local/lib/python3.9/site-packages/pymysql/connections.py", line 775, in _read_query_result
    result.read()
  File "/usr/local/lib/python3.9/site-packages/pymysql/connections.py", line 1156, in read
    first_packet = self.connection._read_packet()
  File "/usr/local/lib/python3.9/site-packages/pymysql/connections.py", line 725, in _read_packet
    packet.raise_for_error()
  File "/usr/local/lib/python3.9/site-packages/pymysql/protocol.py", line 221, in raise_for_error
    err.raise_mysql_exception(self._data)
  File "/usr/local/lib/python3.9/site-packages/pymysql/err.py", line 143, in raise_mysql_exception
    raise errorclass(errno, errval)
pymysql.err.OperationalError: (4624, 'machine resource is not enough to hold a new unit')
[2024-08-14T20:12:00.295+0800] INFO - Marking task as FAILED. dag_id=init_ocp, task_id=create_tenant, execution_date=20240814T114428, start_date=20240814T121159, end_date=20240814T121200
[2024-08-14T20:12:00.296+0800] INFO - Running statement: update oat_audit set status='failed', update_time=utc_timestamp(), failed_reason=%s where id=%s, parameters: ["failed task instance is init_ocp__create_tenant__20240814 and exception information is (4624, 'machine resource is not enough to hold a new unit')", 58]
[2024-08-14T20:12:00.296+0800] INFO - Rows affected: 1
[2024-08-14T20:12:00.325+0800] ERROR - Failed to execute job 211 for task create_tenant ((4624, 'machine resource is not enough to hold a new unit'); 14803)
[2024-08-14T20:12:00.368+0800] INFO - Task exited with return code 1
[2024-08-14T20:12:00.405+0800] INFO - 0 downstream tasks scheduled from follow-on schedule check
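All five attempts in the log end with the same OperationalError. For a long task log like this one, a small script can pull out each attempt number together with its error code and message. This is only a sketch that assumes the line shapes seen above ("Starting attempt N of M" and "OperationalError: (code, 'message')"); `summarize` is a hypothetical helper, not part of Airflow or OAT.

```python
import re

ATTEMPT_RE = re.compile(r"Starting attempt (\d+) of (\d+)")
# Accept straight or typographic single quotes around the message.
ERROR_RE = re.compile(r"OperationalError: \((\d+), ['‘’](.*?)['‘’]\)")


def summarize(log_text: str) -> list[tuple[int, int, str]]:
    """Return (attempt, error_code, message) for each failed attempt."""
    results = []
    attempt = None
    for line in log_text.splitlines():
        match = ATTEMPT_RE.search(line)
        if match:
            attempt = int(match.group(1))
            continue
        match = ERROR_RE.search(line)
        if match and attempt is not None:
            results.append((attempt, int(match.group(1)), match.group(2)))
            attempt = None  # record each attempt at most once
    return results


sample = """\
[2024-08-14T19:44:30.638+0800] INFO - Starting attempt 1 of 1
pymysql.err.OperationalError: (4624, 'machine resource is not enough to hold a new unit')
[2024-08-14T19:54:56.533+0800] INFO - Starting attempt 2 of 2
pymysql.err.OperationalError: (4624, 'machine resource is not enough to hold a new unit')
"""
for row in summarize(sample):
    print(row)
```

Running it over the full attachment should give one row per attempt, which makes it easy to confirm that the error code never changes between retries.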