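[Environment] Production or test environment
[OB or other component] OAT 4.1.0 all-in-one
[Version] 3.34
[Problem description] An error occurs during initialization when adding a server
[Steps to reproduce]
The server being added runs Anolis OS 8.9
[Attachments and logs]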
[2025-03-28T08:50:21.761+0800] INFO - Dependencies all met for <TaskInstance: init_server_with_tag.precheck manual__2025-03-28T00:31:49.971857+00:00 [queued]>
[2025-03-28T08:50:21.773+0800] INFO - Dependencies all met for <TaskInstance: init_server_with_tag.precheck manual__2025-03-28T00:31:49.971857+00:00 [queued]>
[2025-03-28T08:50:21.773+0800] INFO -
[2025-03-28T08:50:21.774+0800] INFO - Starting attempt 2 of 2
[2025-03-28T08:50:21.774+0800] INFO -
[2025-03-28T08:50:21.826+0800] INFO - Executing <Task(_PythonDecoratedOperator): precheck> on 2025-03-28 00:31:49.971857+00:00
[2025-03-28T08:50:21.830+0800] INFO - Started process 8405 to run task
[2025-03-28T08:50:21.834+0800] INFO - Running: ['airflow', 'tasks', 'run', 'init_server_with_tag', 'precheck', 'manual__2025-03-28T00:31:49.971857+00:00', '--job-id', '24', '--raw', '--subdir', 'DAGS_FOLDER/init_server_with_tag.py', '--cfg-path', '/tmp/tmp37e0s0yc']
[2025-03-28T08:50:21.836+0800] INFO - Job 24: Subtask precheck
[2025-03-28T08:50:21.909+0800] INFO - Running <TaskInstance: init_server_with_tag.precheck manual__2025-03-28T00:31:49.971857+00:00 [running]> on host anlios-16
[2025-03-28T08:50:21.983+0800] INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_OWNER=airflow
AIRFLOW_CTX_DAG_ID=init_server_with_tag
AIRFLOW_CTX_TASK_ID=precheck
AIRFLOW_CTX_EXECUTION_DATE=2025-03-28T00:31:49.971857+00:00
AIRFLOW_CTX_TRY_NUMBER=2
AIRFLOW_CTX_DAG_RUN_ID=manual__2025-03-28T00:31:49.971857+00:00
[2025-03-28T08:50:21.985+0800] INFO - Running statement: select oat_server.id, oat_credential.id as credential_id, ip, ssh_port, username, password, auth_type, key_data, passphrase from oat_server, oat_credential where oat_server.credential_id=oat_credential.id and oat_server.id=%s, parameters: [1]
[2025-03-28T08:50:21.986+0800] INFO - Rows affected: 1
[2025-03-28T08:50:21.998+0800] INFO - Connected (version 2.0, client OpenSSH_8.0)
[2025-03-28T08:50:22.074+0800] INFO - Authentication (password) successful!
[2025-03-28T08:50:22.186+0800] INFO - execute command on 192.168.2.16:
/tmp/precheck.shSVF0AQwp -m both
[2025-03-28T08:50:22.215+0800] INFO - Machine Role: both
[2025-03-28T08:50:22.217+0800] INFO - Peer IP List:
[2025-03-28T08:50:22.219+0800] INFO - Machine Type: PHY
[2025-03-28T08:50:22.219+0800] INFO - Inspect Mode: FALSE
[2025-03-28T08:50:22.220+0800] INFO -
[2025-03-28T08:50:22.222+0800] INFO - check CPU count: 8 < 32 … EXPECT >= 32 … FAIL
[2025-03-28T08:50:22.223+0800] INFO - TIPS: replace another machine with more CPU
[2025-03-28T08:50:22.226+0800] INFO - check total MEM: 15 GB < 128 GB … EXPECT >= 128 GB … FAIL
[2025-03-28T08:50:22.227+0800] INFO - TIPS: replace another machine with more MEM
[2025-03-28T08:50:22.229+0800] INFO - check SELinux status: Disabled … PASS
[2025-03-28T08:50:22.232+0800] INFO - check /data/1, NOT mounted … EXPECT mounted as individual disk … FAIL
[2025-03-28T08:50:22.233+0800] INFO - TIPS: re-part disk to mount /data/1
[2025-03-28T08:50:22.236+0800] INFO - check /data/log1, NOT mounted … EXPECT mounted as individual disk … FAIL
[2025-03-28T08:50:22.237+0800] INFO - TIPS: re-part disk to mount /data/log1
[2025-03-28T08:50:22.238+0800] INFO - check /home/admin, exist … PASS
[2025-03-28T08:50:22.241+0800] INFO - check /home/admin owner: admin … PASS
[2025-03-28T08:50:22.246+0800] INFO - check /home/admin disk usage, total: 221G, used: 54G, use%: 25% < 50% … PASS
[2025-03-28T08:50:22.252+0800] INFO - check account [admin] and home dir, exist … PASS
[2025-03-28T08:50:22.285+0800] INFO - check clock sync service: chronyd, chrony server: 203.107.6.88 … PASS
[2025-03-28T08:50:27.314+0800] INFO - check chrony clock offset: 2.036ms <= 50ms … PASS
[2025-03-28T08:50:27.325+0800] INFO - sysctl /proc/sys/net/core/somaxconn = 2048, correct … PASS
[2025-03-28T08:50:27.332+0800] INFO - sysctl /proc/sys/net/core/netdev_max_backlog = 10000, correct … PASS
[2025-03-28T08:50:27.340+0800] INFO - sysctl /proc/sys/net/core/rmem_default = 16777216, correct … PASS
[2025-03-28T08:50:27.348+0800] INFO - sysctl /proc/sys/net/core/wmem_default = 16777216, correct … PASS
[2025-03-28T08:50:27.355+0800] INFO - sysctl /proc/sys/net/core/rmem_max = 16777216, correct … PASS
[2025-03-28T08:50:27.363+0800] INFO - sysctl /proc/sys/net/core/wmem_max = 16777216, correct … PASS
[2025-03-28T08:50:27.372+0800] INFO - sysctl /proc/sys/net/ipv4/conf/default/rp_filter = 1, correct … PASS
[2025-03-28T08:50:27.379+0800] INFO - sysctl /proc/sys/net/ipv4/conf/default/accept_source_route = 0, correct … PASS
[2025-03-28T08:50:27.387+0800] INFO - sysctl /proc/sys/net/ipv4/tcp_syncookies = 1, correct … PASS
[2025-03-28T08:50:27.395+0800] INFO - sysctl /proc/sys/net/ipv4/tcp_rmem = 4096 87380 16777216, correct … PASS
[2025-03-28T08:50:27.402+0800] INFO - sysctl /proc/sys/net/ipv4/tcp_wmem = 4096 65536 16777216, correct … PASS
[2025-03-28T08:50:27.410+0800] INFO - sysctl /proc/sys/net/ipv4/tcp_max_syn_backlog = 16384, correct … PASS
[2025-03-28T08:50:27.418+0800] INFO - sysctl /proc/sys/net/ipv4/tcp_fin_timeout = 15, correct … PASS
[2025-03-28T08:50:27.425+0800] INFO - sysctl /proc/sys/net/ipv4/tcp_max_syn_backlog = 16384, correct … PASS
[2025-03-28T08:50:27.433+0800] INFO - sysctl /proc/sys/net/ipv4/tcp_tw_reuse = 1, correct … PASS
[2025-03-28T08:50:27.440+0800] INFO - sysctl /proc/sys/net/ipv4/tcp_slow_start_after_idle = 0, correct … PASS
[2025-03-28T08:50:27.449+0800] INFO - sysctl /proc/sys/vm/swappiness = 0, correct … PASS
[2025-03-28T08:50:27.457+0800] INFO - sysctl /proc/sys/kernel/core_pattern = /data/1/core-%e-%p-%t, correct … PASS
[2025-03-28T08:50:27.465+0800] INFO - sysctl /proc/sys/vm/min_free_kbytes = 2097152, correct … PASS
[2025-03-28T08:50:27.472+0800] INFO - sysctl /proc/sys/vm/max_map_count = 655360, correct … PASS
[2025-03-28T08:50:27.480+0800] INFO - sysctl /proc/sys/fs/aio-max-nr = 1048576, correct … PASS
[2025-03-28T08:50:27.488+0800] INFO - sysctl /proc/sys/vm/overcommit_memory = 0, correct … PASS
[2025-03-28T08:50:27.496+0800] INFO - sysctl /proc/sys/vm/nr_hugepages = 0, correct … PASS
[2025-03-28T08:50:27.503+0800] INFO - sysctl /proc/sys/net/ipv4/ip_forward = 1, correct … PASS
[2025-03-28T08:50:27.511+0800] INFO - sysctl /proc/sys/net/ipv4/ip_local_port_range = 10000 65535, correct … PASS
[2025-03-28T08:50:27.800+0800] INFO - check service [crond]: enabled … PASS
[2025-03-28T08:50:28.382+0800] INFO - check service [sshd]: enabled … PASS
[2025-03-28T08:50:28.671+0800] INFO - check service [firewalld]: inactive … PASS
[2025-03-28T08:50:28.679+0800] INFO - check service [firewalld]: disabled … PASS
[2025-03-28T08:50:28.683+0800] INFO - check sshd_config PubkeyAuthentication: yes … PASS
[2025-03-28T08:50:28.685+0800] INFO - check sshd_config UseDNS: no … PASS
[2025-03-28T08:50:28.688+0800] INFO - check sshd_config ClientAliveInterval: 60 … PASS
[2025-03-28T08:50:28.691+0800] INFO - check sshd_config ClientAliveCountMax: 10 … PASS
[2025-03-28T08:50:28.692+0800] INFO - check hugepage: disabled … PASS
[2025-03-28T08:50:28.693+0800] INFO - check oceanbase_limits.conf, exist … PASS
[2025-03-28T08:50:28.739+0800] INFO - check hard limit of new session open_files (ulimit -H -n): 655360 … PASS
[2025-03-28T08:50:28.739+0800] INFO - check hard limit of open_files (ulimit -H -n): 655360 … PASS
[2025-03-28T08:50:28.775+0800] INFO - check soft limit of new session open_files (ulimit -S -n): 655360 … PASS
[2025-03-28T08:50:28.775+0800] INFO - check soft limit of open_files (ulimit -S -n): 655360 … PASS
[2025-03-28T08:50:28.821+0800] INFO - check hard limit of new session max_user_processes (ulimit -H -u): 655360 … PASS
[2025-03-28T08:50:28.821+0800] INFO - check hard limit of max_user_processes (ulimit -H -u): 655360 … PASS
[2025-03-28T08:50:28.857+0800] INFO - check soft limit of new session max_user_processes (ulimit -S -u): 655360 … PASS
[2025-03-28T08:50:28.858+0800] INFO - check soft limit of max_user_processes (ulimit -S -u): 655360 … PASS
[2025-03-28T08:50:28.905+0800] INFO - check hard limit of new session stack_size (ulimit -H -s): 10240 … PASS
[2025-03-28T08:50:28.906+0800] INFO - check hard limit of stack_size (ulimit -H -s): 10240 … PASS
[2025-03-28T08:50:28.962+0800] INFO - check soft limit of new session stack_size (ulimit -S -s): 10240 … PASS
[2025-03-28T08:50:28.962+0800] INFO - check soft limit of stack_size (ulimit -S -s): 10240 … PASS
[2025-03-28T08:50:29.011+0800] INFO - check hard limit of new session core_file_size (ulimit -H -c): unlimited … PASS
[2025-03-28T08:50:29.012+0800] INFO - check hard limit of core_file_size (ulimit -H -c): unlimited … PASS
[2025-03-28T08:50:29.046+0800] INFO - check soft limit of new session core_file_size (ulimit -S -c): unlimited … PASS
[2025-03-28T08:50:29.047+0800] INFO - check soft limit of core_file_size (ulimit -S -c): unlimited … PASS
[2025-03-28T08:50:29.095+0800] INFO - check hard limit of new session cpu_time (ulimit -H -t): unlimited … PASS
[2025-03-28T08:50:29.095+0800] INFO - check hard limit of cpu_time (ulimit -H -t): unlimited … PASS
[2025-03-28T08:50:29.134+0800] INFO - check soft limit of new session cpu_time (ulimit -S -t): unlimited … PASS
[2025-03-28T08:50:29.135+0800] INFO - check soft limit of cpu_time (ulimit -S -t): unlimited … PASS
[2025-03-28T08:50:29.138+0800] INFO - check numa stat, pass … PASS
[2025-03-28T08:50:29.143+0800] INFO - check elevator policy: deadline … PASS
[2025-03-28T08:50:29.144+0800] INFO - check current_clocksource: tsc … PASS
[2025-03-28T08:50:29.149+0800] INFO - check logical sector size of /dev/sda: 512 … PASS
[2025-03-28T08:50:29.814+0800] INFO - check RPM: mariadb-common-10.3.39-1.0.1.module+an8.8.0+11133+62929fd4.x86_64 mariadb-connector-c-config-3.2.6-1.an8.noarch mariadb-connector-c-3.2.6-1.an8.x86_64 mariadb-10.3.39-1.0.1.module+an8.8.0+11133+62929fd4.x86_64 is installed … PASS
[2025-03-28T08:50:30.492+0800] INFO - check RPM: python2-devel-2.7.18-17.0.1.module+an8.9.0+11214+2a3a4a9e.x86_64 is installed … PASS
[2025-03-28T08:50:31.145+0800] INFO - check RPM: net-tools-2.0-0.52.20160912git.an8.x86_64 is installed … PASS
[2025-03-28T08:50:31.825+0800] INFO - check RPM: mtr-0.92-3.el8.x86_64 is installed … PASS
[2025-03-28T08:50:32.515+0800] INFO - check RPM: selinux-policy-targeted-3.14.3-139.0.1.an8.1.noarch keentune-target-2.4.1-1.an8.noarch tar-1.30-9.0.1.an8.x86_64 is installed … PASS
[2025-03-28T08:50:33.171+0800] INFO - check RPM: binutils-2.30-125.0.1.an8.x86_64 is installed … PASS
[2025-03-28T08:50:33.832+0800] INFO - check RPM: bind-utils-9.11.36-16.0.1.an8.4.x86_64 is installed … PASS
[2025-03-28T08:50:34.481+0800] INFO - check RPM: libaio-0.3.112-1.0.1.an8.x86_64 is installed … PASS
[2025-03-28T08:50:35.138+0800] INFO - check RPM: libcurl-7.61.1-35.0.2.an8.3.x86_64 curl-7.61.1-35.0.2.an8.3.x86_64 python3-pycurl-7.43.0.2-4.el8.x86_64 is installed … PASS
[2025-03-28T08:50:35.799+0800] INFO - check RPM: libatomic-8.5.0-23.0.1.an8.x86_64 is installed … PASS
[2025-03-28T08:50:36.451+0800] INFO - check RPM: irqbalance-1.9.2-1.0.1.an8.x86_64 ncurses-libs-6.1-10.20180224.0.1.an8.x86_64 ncurses-6.1-10.20180224.0.1.an8.x86_64 rsync-3.1.3-20.0.1.an8.x86_64 ncurses-base-6.1-10.20180224.0.1.an8.noarch nmap-ncat-7.92-3.0.1.an8.x86_64 perl-Encode-2.97-3.0.1.an8.x86_64 is installed … PASS
[2025-03-28T08:50:37.110+0800] INFO - check RPM: iproute-6.2.0-6.0.1.an8.x86_64 is installed … PASS
[2025-03-28T08:50:37.119+0800] INFO - check mysql client, working … PASS
[2025-03-28T08:50:37.127+0800] INFO - checking irq affinity …
[2025-03-28T08:50:37.135+0800] INFO - checking ens192 …
[2025-03-28T08:50:37.136+0800] INFO - netlink error: Operation not supported
[2025-03-28T08:50:37.140+0800] INFO - netlink error: Operation not supported
[2025-03-28T08:50:37.142+0800] INFO - check irq channels, NIC: ens192, Channel Combined: … PASS
[2025-03-28T08:50:37.156+0800] INFO - check irq affinity, NIC: ens192, smp_affinity count: 4 … PASS
[2025-03-28T08:50:37.176+0800] INFO - check irqbalance status: inactive … PASS
[2025-03-28T08:50:37.177+0800] INFO - check irqbalance service: disabled … PASS
[2025-03-28T08:50:37.178+0800] INFO - df: /data/1: No such file or directory
[2025-03-28T08:50:37.185+0800] INFO - /tmp/precheck.shSVF0AQwp: line 1058: python: command not found
[2025-03-28T08:50:37.192+0800] INFO - python command must be version python 2.x … FAIL
[2025-03-28T08:50:37.193+0800] INFO -
[2025-03-28T08:50:37.193+0800] INFO -
[2025-03-28T08:50:37.193+0800] INFO - ### SUMMARY OF ISSUES IN PRE-CHECK ###
[2025-03-28T08:50:37.194+0800] INFO - check CPU count: 8 < 32 … EXPECT >= 32 … FAIL
[2025-03-28T08:50:37.194+0800] INFO - TIPS: replace another machine with more CPU
[2025-03-28T08:50:37.194+0800] INFO - check total MEM: 15 GB < 128 GB … EXPECT >= 128 GB … FAIL
[2025-03-28T08:50:37.194+0800] INFO - TIPS: replace another machine with more MEM
[2025-03-28T08:50:37.194+0800] INFO - check /data/1, NOT mounted … EXPECT mounted as individual disk … FAIL
[2025-03-28T08:50:37.194+0800] INFO - TIPS: re-part disk to mount /data/1
[2025-03-28T08:50:37.195+0800] INFO - check /data/log1, NOT mounted … EXPECT mounted as individual disk … FAIL
[2025-03-28T08:50:37.195+0800] INFO - TIPS: re-part disk to mount /data/log1
[2025-03-28T08:50:37.195+0800] INFO - python command must be version python 2.x … FAIL
[2025-03-28T08:50:37.195+0800] INFO - execute command on 192.168.2.16:
rm -f /tmp/precheck.shSVF0AQwp
[2025-03-28T08:50:37.213+0800] ERROR - Task failed with exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/airflow/decorators/base.py", line 217, in execute
    return_value = super().execute(context)
  File "/usr/local/lib/python3.9/site-packages/airflow/operators/python.py", line 175, in execute
    return_value = self.execute_callable()
  File "/usr/local/lib/python3.9/site-packages/airflow/operators/python.py", line 192, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
  File "/oat/task_engine/dags/init_server_with_tag.py", line 79, in precheck
    common.server_precheck(ctx, logger=logger)
  File "/oat/task_engine/plugins/common.py", line 1542, in server_precheck
    raise RuntimeError('server precheck failed, please see the summary info above for details')
RuntimeError: server precheck failed, please see the summary info above for details
[2025-03-28T08:50:37.222+0800] INFO - Marking task as FAILED. dag_id=init_server_with_tag, task_id=precheck, execution_date=20250328T003149, start_date=20250328T005021, end_date=20250328T005037
[2025-03-28T08:50:37.222+0800] INFO - Running statement: update oat_audit set status='failed', update_time=utc_timestamp(), failed_reason=%s where id=%s, parameters: ['failed task instance is init_server_with_tag__precheck__20250328 and exception information is server precheck failed, please see the summary info above for details', 4]
[2025-03-28T08:50:37.223+0800] INFO - Rows affected: 1
[2025-03-28T08:50:37.325+0800] ERROR - Failed to execute job 24 for task precheck (server precheck failed, please see the summary info above for details; 8405)
[2025-03-28T08:50:37.360+0800] INFO - Task exited with return code 1
[2025-03-28T08:50:37.383+0800]
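
To make the failing items in a long log like this easier to spot, here is a minimal Python sketch that pulls out the lines ending in FAIL. It is not part of OAT. The function name `failed_checks` and the regular expression are my own assumptions, based only on the line format visible in the log above.

```python
import re

# Assumed line shape (taken from the log above):
#   "[timestamp] INFO - <check message> … FAIL"
# The separator may appear as "…" or as "..." depending on how the log was copied.
FAIL_LINE = re.compile(r"INFO - (?P<msg>.+?)\s*(?:…|\.\.\.)\s*FAIL\s*$")


def failed_checks(log_text):
    """Return the distinct failed-check messages in first-seen order."""
    seen = []
    for line in log_text.splitlines():
        match = FAIL_LINE.search(line)
        # The summary section repeats each failure, so skip duplicates.
        if match and match.group("msg") not in seen:
            seen.append(match.group("msg"))
    return seen


if __name__ == "__main__":
    import sys

    # Usage: python failed_checks.py < precheck.log
    for message in failed_checks(sys.stdin.read()):
        print(message)
```

Run against the log above, this should list the CPU, memory, /data/1, /data/log1 and python checks, which are the five items in the SUMMARY OF ISSUES section.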