Creating metadb fails with no specific error message, only a timeout

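【 Environment 】Production or test environment
【 OB or other component 】
【 Version 】
【 Problem description 】Describe the problem clearly and precisely
【 Steps to reproduce 】Relevant operations before and after the problem occurred
【 Symptoms and impact 】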

【 Attachment 】
[2023-07-03T14:58:14.021+0800] INFO - Dependencies all met for <TaskInstance: init_metadb.start_metadb manual__2023-07-03T06:57:46.622879+00:00 map_index=0 [queued]>
[2023-07-03T14:58:14.033+0800] INFO - Dependencies all met for <TaskInstance: init_metadb.start_metadb manual__2023-07-03T06:57:46.622879+00:00 map_index=0 [queued]>
[2023-07-03T14:58:14.033+0800] INFO -

[2023-07-03T14:58:14.033+0800] INFO - Starting attempt 1 of 1
[2023-07-03T14:58:14.033+0800] INFO -

[2023-07-03T14:58:14.048+0800] INFO - Executing <Mapped(_PythonDecoratedOperator): start_metadb> on 2023-07-03 06:57:46.622879+00:00
[2023-07-03T14:58:14.053+0800] INFO - Started process 26268 to run task
[2023-07-03T14:58:14.057+0800] INFO - Running: ['airflow', 'tasks', 'run', 'init_metadb', 'start_metadb', 'manual__2023-07-03T06:57:46.622879+00:00', '--job-id', '715', '--raw', '--subdir', 'DAGS_FOLDER/init_metadb.py', '--cfg-path', '/tmp/tmpuh9i1v6f', '--map-index', '0']
[2023-07-03T14:58:14.060+0800] INFO - Job 715: Subtask start_metadb
[2023-07-03T14:58:14.134+0800] INFO - Running <TaskInstance: init_metadb.start_metadb manual__2023-07-03T06:57:46.622879+00:00 map_index=0 [running]> on host pekphis542894
[2023-07-03T14:58:14.232+0800] INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_OWNER=airflow
AIRFLOW_CTX_DAG_ID=init_metadb
AIRFLOW_CTX_TASK_ID=start_metadb
AIRFLOW_CTX_EXECUTION_DATE=2023-07-03T06:57:46.622879+00:00
AIRFLOW_CTX_TRY_NUMBER=1
AIRFLOW_CTX_DAG_RUN_ID=manual__2023-07-03T06:57:46.622879+00:00
[2023-07-03T14:58:14.234+0800] INFO - Running statement: select id, ip from oat_server where id in (%s), parameters: [12]
[2023-07-03T14:58:14.235+0800] INFO - Rows affected: 1
[2023-07-03T14:58:14.236+0800] INFO - Running statement: select * from oat_image where id=%s, parameters: [2]
[2023-07-03T14:58:14.237+0800] INFO - Rows affected: 1
[2023-07-03T14:58:14.238+0800] INFO - Running statement: select oat_server.id, oat_credential.id as credential_id, ip, ssh_port, username, password, auth_type, key_data, passphrase from oat_server, oat_credential where oat_server.credential_id=oat_credential.id and oat_server.id=%s, parameters: [12]
[2023-07-03T14:58:14.239+0800] INFO - Rows affected: 1
[2023-07-03T14:58:14.240+0800] INFO - Running statement: select rollback_status from oat_operationdagrun where id=%s, parameters: [94]
[2023-07-03T14:58:14.240+0800] INFO - Rows affected: 1
[2023-07-03T14:58:14.241+0800] INFO - Running statement: update oat_operationdagrun set rollback_status=%s where id=%s, parameters: ['can_rollback', 94]
[2023-07-03T14:58:14.241+0800] INFO - Rows affected: 1
[2023-07-03T14:58:14.342+0800] INFO - Connected (version 2.0, client OpenSSH_7.4)
[2023-07-03T14:58:14.659+0800] INFO - Auth banner: b’\nAuthorized users only. All activities may be monitored and reported.\n’
[2023-07-03T14:58:14.659+0800] INFO - Authentication (password) successful!
[2023-07-03T14:58:14.660+0800] INFO - execute command on 10.50.157.39:
dev_name=$(ip a | grep -w 10.50.157.39 | awk '{print $NF}')
[ -z "$dev_name" ] && { echo 'can not get nic dev name!'; exit 1; }
for d in "/home/admin/oceanbase" "/data/1" "/data/log1";
do
    if [ -e "$d" ]; then
        [ "$(ls -A $d | grep -vw lost+found)" ] && { echo "$d is not empty. Please clean it and retry!"; exit 2; }
    else
        mkdir -p "$d"
    fi
done
used_percentage=$(df --output=pcent "/data/1" | tail -n 1 | sed 's/%//')
if [[ $(($used_percentage + 60)) -gt 99 ]]; then
    echo "/data/1 has no enough disk space, used percentage is $used_percentage, datafile_disk_percentage you set is 60"; exit 1
fi
docker run -d -it --cap-add SYS_RESOURCE --name metadb --net=host \
    -e OBCLUSTER_NAME=testdb1 \
    -e DEV_NAME=$dev_name \
    -e ROOTSERVICE_LIST=10.50.157.39:2882:2881 \
    -e DATAFILE_DISK_PERCENTAGE=60 \
    -e CLUSTER_ID=1688367466 \
    -e ZONE_NAME=META_ZONE_1 \
    -e OBPROXY_PORT=2883 \
    -e MYSQL_PORT=2881 \
    -e RPC_PORT=2882 \
    -e app.password_root=Gauss_234 \
    -e OBPROXY_OPTSTR=obproxy_sys_password=5dcf2a7d1933a9bf45756797a503d9d970099c1b,observer_sys_password=242fe1853bcbd3a56875452fbe27566387b618dc,automatic_match_work_thread=false,enable_strict_kernel_release=false,work_thread_num=64,proxy_mem_limited=4G,client_max_connections=16384,log_dir_size_threshold=10G \
    -e OPTSTR=cpu_count=32,memory_limit=87G,system_memory=30G,__min_full_resource_pool_memory=1073741824,memory_limit_percentage=80 \
    -e SSHD_PORT=2022 \
    --cpu-period 100000 \
    --cpu-quota 3200000 \
    --cpuset-cpus "0-31" \
    --memory 90G \
    -v /home/admin/oceanbase:/home/admin/oceanbase \
    -v /data/log1:/data/log1 \
    -v /data/1:/data/1 \
    --restart on-failure:5 \
    reg.docker.alibaba-inc.com/antman/ob-docker:OB2277_OBP329_x86_20230330 || exit 1

[2023-07-03T14:58:15.073+0800] INFO - 762512c7f71d67650b7c5f208150991e62934408b09e36fd3c3318619b4e8ef1
[2023-07-03T14:58:15.226+0800] INFO - Depends on disk’s performance, do io bench and bootstrap observer may take 5~30 minutes, Please wait.
Start test connection 10.50.157.39:2881
[2023-07-03T14:58:45.297+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T14:59:15.356+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T14:59:45.396+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:00:15.465+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:00:45.534+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:01:15.604+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:01:45.673+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:02:15.713+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:02:45.784+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:03:15.852+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:03:45.921+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:04:15.991+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:04:46.061+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:05:16.131+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:05:46.189+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:06:16.237+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:06:46.282+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:07:16.352+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:07:46.423+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:08:16.482+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:08:46.552+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:09:16.622+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:09:46.667+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:10:16.715+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:10:46.783+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:11:16.852+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:11:46.897+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:12:16.938+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:12:46.991+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:13:17.035+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:13:47.081+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:14:17.134+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:14:47.197+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:15:17.265+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:15:47.316+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:16:17.372+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:16:47.432+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:17:17.476+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:17:47.523+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:18:17.569+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:18:47.608+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:19:17.676+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:19:47.723+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:20:17.790+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:20:47.833+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:21:17.902+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:21:47.971+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:22:18.041+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:22:48.109+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:23:18.176+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:23:48.216+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:24:18.266+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:24:48.308+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:25:18.370+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:25:48.437+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:26:18.479+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:26:48.550+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:27:18.609+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:27:48.676+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:28:18.746+0800] INFO - (2003, “Can’t connect to MySQL server on ‘10.50.157.39’ ([Errno 111] Connection refused)”)
observer not ready, sleep 30s to try again…
[2023-07-03T15:28:18.746+0800] ERROR - observer not ready in 30min, exit…
[2023-07-03T15:28:18.746+0800] ERROR - Task failed with exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/airflow/decorators/base.py", line 217, in execute
    return_value = super().execute(context)
  File "/usr/local/lib/python3.9/site-packages/airflow/operators/python.py", line 175, in execute
    return_value = self.execute_callable()
  File "/usr/local/lib/python3.9/site-packages/airflow/operators/python.py", line 192, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
  File "/oat/task_engine/dags/init_metadb.py", line 124, in start_metadb
    raise RuntimeError('observer not ready in 30min')
RuntimeError: observer not ready in 30min
[2023-07-03T15:28:18.754+0800] INFO - Marking task as FAILED. dag_id=init_metadb, task_id=start_metadb, map_index=0, execution_date=20230703T065746, start_date=20230703T065814, end_date=20230703T072818
[2023-07-03T15:28:18.755+0800] INFO - Running statement: update oat_audit set status='failed', update_time=utc_timestamp(), failed_reason=%s where id=%s, parameters: ['failed task instance is init_metadb__start_metadb__20230703 and exception information is observer not ready in 30min', 105]
[2023-07-03T15:28:18.756+0800] INFO - Rows affected: 1
[2023-07-03T15:28:18.772+0800] ERROR - Failed to execute job 715 for task start_metadb (observer not ready in 30min; 26268)
[2023-07-03T15:28:18.793+0800] INFO - Task exited with return code 1
[2023-07-03T15:28:18.821+0800] INFO - 0 downstream tasks scheduled from follow-on schedule check
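
For anyone reading the log above: the task starts the container, then polls 10.50.157.39:2881 every 30 seconds and gives up after 30 minutes with `RuntimeError('observer not ready in 30min')`. Below is a rough Python sketch of that kind of wait loop, just to make the behavior concrete. It is not OAT's actual code: the function name `wait_for_port` and its parameters are made up for illustration, and it only tests a plain TCP connection, whereas the real task appears to use a MySQL client (the 2003 error in the log).

```python
import socket
import time


def wait_for_port(host, port, timeout_s=1800, interval_s=30,
                  connect=socket.create_connection,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll host:port until a TCP connect succeeds or timeout_s has elapsed.

    Hypothetical helper for illustration only. Returns the number of
    attempts made; raises RuntimeError once the deadline has passed.
    """
    deadline = clock() + timeout_s
    attempts = 0
    while True:
        attempts += 1
        try:
            conn = connect((host, port), timeout=5)
            conn.close()
            return attempts
        except OSError as exc:
            # ConnectionRefusedError (errno 111 on Linux) is a subclass of OSError.
            print(exc)
            print(f"observer not ready, sleep {interval_s}s to try again...")
        if clock() + interval_s > deadline:
            raise RuntimeError(f"observer not ready in {timeout_s // 60}min")
        sleep(interval_s)


if __name__ == "__main__":
    # Address taken from the log above; adjust for your own host.
    wait_for_port("10.50.157.39", 2881)
```

Since every attempt in the log ends in "Connection refused" (errno 111), nothing was listening on port 2881 for the whole 30 minutes. The loop itself therefore cannot say why the observer did not come up; the container's own output (for example `docker logs metadb`, using the container name from the `docker run` command above) and the observer logs under the mounted directories are the likelier places to find the real error.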

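Are you using OAT or OCP?

OAT

How about contacting your dedicated technical support engineer directly and having them resolve this for you?

Oh, all right.

Could you share how you solved it? I've run into the same problem.

Hello, has your problem been resolved? Could you share how you fixed it? I've run into a similar problem and would appreciate your advice.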