OAT 4.1.0 - error when deploying the metadb component - 32c250G

[2023-11-02T15:16:49.043+0800] INFO - Dependencies all met for <TaskInstance: init_metadb.start_metadb manual__2023-11-02T07:16:31.987060+00:00 map_index=0 [queued]>
[2023-11-02T15:16:49.055+0800] INFO - Dependencies all met for <TaskInstance: init_metadb.start_metadb manual__2023-11-02T07:16:31.987060+00:00 map_index=0 [queued]>
[2023-11-02T15:16:49.055+0800] INFO -

[2023-11-02T15:16:49.056+0800] INFO - Starting attempt 1 of 1
[2023-11-02T15:16:49.056+0800] INFO -

[2023-11-02T15:16:49.071+0800] INFO - Executing <Mapped(_PythonDecoratedOperator): start_metadb> on 2023-11-02 07:16:31.987060+00:00
[2023-11-02T15:16:49.075+0800] INFO - Started process 8262 to run task
[2023-11-02T15:16:49.079+0800] INFO - Running: ['airflow', 'tasks', 'run', 'init_metadb', 'start_metadb', 'manual__2023-11-02T07:16:31.987060+00:00', '--job-id', '45', '--raw', '--subdir', 'DAGS_FOLDER/init_metadb.py', '--cfg-path', '/tmp/tmphpj_78q6', '--map-index', '0']
[2023-11-02T15:16:49.082+0800] INFO - Job 45: Subtask start_metadb
[2023-11-02T15:16:49.155+0800] INFO - Running <TaskInstance: init_metadb.start_metadb manual__2023-11-02T07:16:31.987060+00:00 map_index=0 [running]> on host SRV00400743
[2023-11-02T15:16:49.249+0800] INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_OWNER=airflow
AIRFLOW_CTX_DAG_ID=init_metadb
AIRFLOW_CTX_TASK_ID=start_metadb
AIRFLOW_CTX_EXECUTION_DATE=2023-11-02T07:16:31.987060+00:00
AIRFLOW_CTX_TRY_NUMBER=1
AIRFLOW_CTX_DAG_RUN_ID=manual__2023-11-02T07:16:31.987060+00:00
[2023-11-02T15:16:49.251+0800] INFO - Running statement: select id, ip from oat_server where id in (%s), parameters: [1]
[2023-11-02T15:16:49.251+0800] INFO - Rows affected: 1
[2023-11-02T15:16:49.253+0800] INFO - Running statement: select * from oat_image where id=%s, parameters: [1]
[2023-11-02T15:16:49.253+0800] INFO - Rows affected: 1
[2023-11-02T15:16:49.255+0800] INFO - Running statement: select oat_server.id, oat_credential.id as credential_id, ip, ssh_port, username, password, auth_type, key_data, passphrase from oat_server, oat_credential where oat_server.credential_id=oat_credential.id and oat_server.id=%s, parameters: [1]
[2023-11-02T15:16:49.256+0800] INFO - Rows affected: 1
[2023-11-02T15:16:49.257+0800] INFO - Running statement: select rollback_status from oat_operationdagrun where id=%s, parameters: [11]
[2023-11-02T15:16:49.258+0800] INFO - Rows affected: 1
[2023-11-02T15:16:49.259+0800] INFO - Running statement: update oat_operationdagrun set rollback_status=%s where id=%s, parameters: ['can_rollback', 11]
[2023-11-02T15:16:49.260+0800] INFO - Rows affected: 1
[2023-11-02T15:16:49.276+0800] INFO - Connected (version 2.0, client OpenSSH_7.4)
[2023-11-02T15:16:49.331+0800] INFO - Auth banner: b'This is a private network server, in monitoring state.\nIt is strictly prohibited to unauthorized access and used.\n'
[2023-11-02T15:16:49.332+0800] INFO - Authentication (publickey) successful!
[2023-11-02T15:16:49.332+0800] INFO - execute command on 192.168.10.144:
dev_name=$(ip a | grep -w 192.168.10.144 | awk '{print $NF}')
[ -z "$dev_name" ] && { echo 'can not get nic dev name!'; exit 1; }
for d in "/home/admin/oceanbase" "/data/1" "/data/log1";
do
if [ -e "$d" ]; then
[ "$(ls -A $d | grep -vw lost+found)" ] && { echo "$d is not empty. Please clean it and retry!"; exit 2; }
else
mkdir -p "$d"
fi
done
used_percentage=$(df --output=pcent "/data/1"| tail -n 1 | sed 's/%//')
if [[ $(($used_percentage + 90)) -gt 99 ]]; then
echo "/data/1 has no enough disk space, used percentage is $used_percentage, datafile_disk_percentage you set is 90"; exit 1
fi
docker run -d -it --cap-add SYS_RESOURCE --name metadb --net=host \
-e OBCLUSTER_NAME=metadb2ocp \
-e DEV_NAME=$dev_name \
-e ROOTSERVICE_LIST=192.168.10.144:2882:2881 \
-e DATAFILE_DISK_PERCENTAGE=90 \
-e CLUSTER_ID=1698909391 \
-e ZONE_NAME=META_ZONE_1 \
-e OBPROXY_PORT=2883 \
-e MYSQL_PORT=2881 \
-e RPC_PORT=2882 \
-e app.password_root='_HGCEnHjiMd_84#Qiq' \
-e OBPROXY_OPTSTR=obproxy_sys_password=da29de07ec7fd86ffb785d2b761d9fca6ae89358,observer_sys_password=99374caae90934d2b76379c2d6b97373b07e1aa9,automatic_match_work_thread=false,enable_strict_kernel_release=false,work_thread_num=16,proxy_mem_limited=4G,client_max_connections=16384,log_dir_size_threshold=10G \
-e OPTSTR=cpu_count=8,memory_limit=13G,system_memory=6G,__min_full_resource_pool_memory=1073741824,memory_limit_percentage=90 \
-e SSHD_PORT=2022 \
--cpu-period 100000 \
--cpu-quota 800000 \
--cpuset-cpus "0-7" \
--memory 16G \
-v /home/admin/oceanbase:/home/admin/oceanbase \
-v /data/log1:/data/log1 \
-v /data/1:/data/1 \
--restart on-failure:5 \
reg.docker.alibaba-inc.com/antman/ob-docker:OB2277_OBP329_x86_20230330 || exit 1

[2023-11-02T15:16:49.405+0800] INFO - /data/1 is not empty. Please clean it and retry!
[2023-11-02T15:16:49.406+0800] ERROR - Task failed with exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/airflow/decorators/base.py", line 217, in execute
    return_value = super().execute(context)
  File "/usr/local/lib/python3.9/site-packages/airflow/operators/python.py", line 175, in execute
    return_value = self.execute_callable()
  File "/usr/local/lib/python3.9/site-packages/airflow/operators/python.py", line 192, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
  File "/oat/task_engine/dags/init_metadb.py", line 109, in start_metadb
    raise RuntimeError(f'start metadb on {server_ip} failed')
RuntimeError: start metadb on 192.168.10.144 failed
[2023-11-02T15:16:49.413+0800] INFO - Marking task as FAILED. dag_id=init_metadb, task_id=start_metadb, map_index=0, execution_date=20231102T071631, start_date=20231102T071649, end_date=20231102T071649
[2023-11-02T15:16:49.415+0800] INFO - Running statement: update oat_audit set status='failed', update_time=utc_timestamp(), failed_reason=%s where id=%s, parameters: ['failed task instance is init_metadb__start_metadb__20231102 and exception information is start metadb on 192.168.10.144 failed', 12]
[2023-11-02T15:16:49.416+0800] INFO - Rows affected: 1
[2023-11-02T15:16:49.430+0800] ERROR - Failed to execute job 45 for task start_metadb (start metadb on 192.168.10.144 failed; 8262)
[2023-11-02T15:16:49.452+0800] INFO - Task exited with return code 1
[2023-11-02T15:16:49.483+0800] INFO - 0 downstream tasks scheduled from follow-on schedule check

[2023-11-02T15:16:49.405+0800] INFO - /data/1 is not empty. Please clean it and retry!

The /data/1 directory needs to be empty.
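
For reference, the check that failed is the `ls -A ... | grep -vw lost+found` test in the logged script. The check after it requires the used percentage of the filesystem plus `DATAFILE_DISK_PERCENTAGE` (90 here) to stay at or below 99. Below is a minimal sketch that mirrors those two checks so they can be run by hand before retrying. The function names `dir_is_clean` and `has_disk_headroom` are made up for illustration and are not part of OAT.

```shell
#!/usr/bin/env bash
# Sketch only: standalone versions of the two pre-flight checks seen in the log.
# Function names are hypothetical; the logic follows the logged script.

# Returns 0 if the directory does not exist yet, or contains nothing
# besides an optional lost+found entry. Hidden files count as content.
dir_is_clean() {
    local d="$1"
    [ -e "$d" ] || return 0
    # ls -A includes dotfiles; grep -vw drops the lost+found entry.
    [ -z "$(ls -A "$d" | grep -vw 'lost+found')" ]
}

# Returns 0 if used% + datafile% is at most 99, i.e. the inverse of
# the logged failure condition: [[ $((used + 90)) -gt 99 ]]
has_disk_headroom() {
    local used="$1"
    local datafile_pct="${2:-90}"
    [ $((used + datafile_pct)) -le 99 ]
}
```

For example, `dir_is_clean /data/1 || ls -A /data/1` lists whatever is still in the directory, and `has_disk_headroom "$(df --output=pcent /data/1 | tail -n 1 | sed 's/%//')"` checks the disk-space condition. Make sure nothing in there is still needed before cleaning it out.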


OK, I'll give it a try. Thanks!