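OAT: error when adding a server

[Environment] Production or test environment
[OB or other component] OAT 4.1.0 all-in-one
[Version] 3.34
[Problem description] An error occurs during initialization when adding a server.

[Steps to reproduce]
The server being added runs Anolis 8.9.
[Attachments and logs]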

[2025-03-28T08:50:21.761+0800] INFO - Dependencies all met for <TaskInstance: init_server_with_tag.precheck manual__2025-03-28T00:31:49.971857+00:00 [queued]>
[2025-03-28T08:50:21.773+0800] INFO - Dependencies all met for <TaskInstance: init_server_with_tag.precheck manual__2025-03-28T00:31:49.971857+00:00 [queued]>
[2025-03-28T08:50:21.773+0800] INFO -
[2025-03-28T08:50:21.774+0800] INFO - Starting attempt 2 of 2
[2025-03-28T08:50:21.774+0800] INFO -
[2025-03-28T08:50:21.826+0800] INFO - Executing <Task(_PythonDecoratedOperator): precheck> on 2025-03-28 00:31:49.971857+00:00
[2025-03-28T08:50:21.830+0800] INFO - Started process 8405 to run task
[2025-03-28T08:50:21.834+0800] INFO - Running: ['airflow', 'tasks', 'run', 'init_server_with_tag', 'precheck', 'manual__2025-03-28T00:31:49.971857+00:00', '--job-id', '24', '--raw', '--subdir', 'DAGS_FOLDER/init_server_with_tag.py', '--cfg-path', '/tmp/tmp37e0s0yc']
[2025-03-28T08:50:21.836+0800] INFO - Job 24: Subtask precheck
[2025-03-28T08:50:21.909+0800] INFO - Running <TaskInstance: init_server_with_tag.precheck manual__2025-03-28T00:31:49.971857+00:00 [running]> on host anlios-16
[2025-03-28T08:50:21.983+0800] INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_OWNER=airflow
AIRFLOW_CTX_DAG_ID=init_server_with_tag
AIRFLOW_CTX_TASK_ID=precheck
AIRFLOW_CTX_EXECUTION_DATE=2025-03-28T00:31:49.971857+00:00
AIRFLOW_CTX_TRY_NUMBER=2
AIRFLOW_CTX_DAG_RUN_ID=manual__2025-03-28T00:31:49.971857+00:00
[2025-03-28T08:50:21.985+0800] INFO - Running statement: select oat_server.id, oat_credential.id as credential_id, ip, ssh_port, username, password, auth_type, key_data, passphrase from oat_server, oat_credential where oat_server.credential_id=oat_credential.id and oat_server.id=%s, parameters: [1]
[2025-03-28T08:50:21.986+0800] INFO - Rows affected: 1
[2025-03-28T08:50:21.998+0800] INFO - Connected (version 2.0, client OpenSSH_8.0)
[2025-03-28T08:50:22.074+0800] INFO - Authentication (password) successful!
[2025-03-28T08:50:22.186+0800] INFO - execute command on 192.168.2.16:
/tmp/precheck.shSVF0AQwp -m both
[2025-03-28T08:50:22.215+0800] INFO - Machine Role: both
[2025-03-28T08:50:22.217+0800] INFO - Peer IP List:
[2025-03-28T08:50:22.219+0800] INFO - Machine Type: PHY
[2025-03-28T08:50:22.219+0800] INFO - Inspect Mode: FALSE
[2025-03-28T08:50:22.220+0800] INFO -
[2025-03-28T08:50:22.222+0800] INFO - check CPU count: 8 < 32 … EXPECT >= 32 … FAIL
[2025-03-28T08:50:22.223+0800] INFO - TIPS: replace another machine with more CPU
[2025-03-28T08:50:22.226+0800] INFO - check total MEM: 15 GB < 128 GB … EXPECT >= 128 GB … FAIL
[2025-03-28T08:50:22.227+0800] INFO - TIPS: replace another machine with more MEM
[2025-03-28T08:50:22.229+0800] INFO - check SELinux status: Disabled … PASS
[2025-03-28T08:50:22.232+0800] INFO - check /data/1, NOT mounted … EXPECT mounted as individual disk … FAIL
[2025-03-28T08:50:22.233+0800] INFO - TIPS: re-part disk to mount /data/1
[2025-03-28T08:50:22.236+0800] INFO - check /data/log1, NOT mounted … EXPECT mounted as individual disk … FAIL
[2025-03-28T08:50:22.237+0800] INFO - TIPS: re-part disk to mount /data/log1
[2025-03-28T08:50:22.238+0800] INFO - check /home/admin, exist … PASS
[2025-03-28T08:50:22.241+0800] INFO - check /home/admin owner: admin … PASS
[2025-03-28T08:50:22.246+0800] INFO - check /home/admin disk usage, total: 221G, used: 54G, use%: 25% < 50% … PASS
[2025-03-28T08:50:22.252+0800] INFO - check account [admin] and home dir, exist … PASS
[2025-03-28T08:50:22.285+0800] INFO - check clock sync service: chronyd, chrony server: 203.107.6.88 … PASS
[2025-03-28T08:50:27.314+0800] INFO - check chrony clock offset: 2.036ms <= 50ms … PASS
[2025-03-28T08:50:27.325+0800] INFO - sysctl /proc/sys/net/core/somaxconn = 2048, correct … PASS
[2025-03-28T08:50:27.332+0800] INFO - sysctl /proc/sys/net/core/netdev_max_backlog = 10000, correct … PASS
[2025-03-28T08:50:27.340+0800] INFO - sysctl /proc/sys/net/core/rmem_default = 16777216, correct … PASS
[2025-03-28T08:50:27.348+0800] INFO - sysctl /proc/sys/net/core/wmem_default = 16777216, correct … PASS
[2025-03-28T08:50:27.355+0800] INFO - sysctl /proc/sys/net/core/rmem_max = 16777216, correct … PASS
[2025-03-28T08:50:27.363+0800] INFO - sysctl /proc/sys/net/core/wmem_max = 16777216, correct … PASS
[2025-03-28T08:50:27.372+0800] INFO - sysctl /proc/sys/net/ipv4/conf/default/rp_filter = 1, correct … PASS
[2025-03-28T08:50:27.379+0800] INFO - sysctl /proc/sys/net/ipv4/conf/default/accept_source_route = 0, correct … PASS
[2025-03-28T08:50:27.387+0800] INFO - sysctl /proc/sys/net/ipv4/tcp_syncookies = 1, correct … PASS
[2025-03-28T08:50:27.395+0800] INFO - sysctl /proc/sys/net/ipv4/tcp_rmem = 4096 87380 16777216, correct … PASS
[2025-03-28T08:50:27.402+0800] INFO - sysctl /proc/sys/net/ipv4/tcp_wmem = 4096 65536 16777216, correct … PASS
[2025-03-28T08:50:27.410+0800] INFO - sysctl /proc/sys/net/ipv4/tcp_max_syn_backlog = 16384, correct … PASS
[2025-03-28T08:50:27.418+0800] INFO - sysctl /proc/sys/net/ipv4/tcp_fin_timeout = 15, correct … PASS
[2025-03-28T08:50:27.425+0800] INFO - sysctl /proc/sys/net/ipv4/tcp_max_syn_backlog = 16384, correct … PASS
[2025-03-28T08:50:27.433+0800] INFO - sysctl /proc/sys/net/ipv4/tcp_tw_reuse = 1, correct … PASS
[2025-03-28T08:50:27.440+0800] INFO - sysctl /proc/sys/net/ipv4/tcp_slow_start_after_idle = 0, correct … PASS
[2025-03-28T08:50:27.449+0800] INFO - sysctl /proc/sys/vm/swappiness = 0, correct … PASS
[2025-03-28T08:50:27.457+0800] INFO - sysctl /proc/sys/kernel/core_pattern = /data/1/core-%e-%p-%t, correct … PASS
[2025-03-28T08:50:27.465+0800] INFO - sysctl /proc/sys/vm/min_free_kbytes = 2097152, correct … PASS
[2025-03-28T08:50:27.472+0800] INFO - sysctl /proc/sys/vm/max_map_count = 655360, correct … PASS
[2025-03-28T08:50:27.480+0800] INFO - sysctl /proc/sys/fs/aio-max-nr = 1048576, correct … PASS
[2025-03-28T08:50:27.488+0800] INFO - sysctl /proc/sys/vm/overcommit_memory = 0, correct … PASS
[2025-03-28T08:50:27.496+0800] INFO - sysctl /proc/sys/vm/nr_hugepages = 0, correct … PASS
[2025-03-28T08:50:27.503+0800] INFO - sysctl /proc/sys/net/ipv4/ip_forward = 1, correct … PASS
[2025-03-28T08:50:27.511+0800] INFO - sysctl /proc/sys/net/ipv4/ip_local_port_range = 10000 65535, correct … PASS
[2025-03-28T08:50:27.800+0800] INFO - check service [crond]: enabled … PASS
[2025-03-28T08:50:28.382+0800] INFO - check service [sshd]: enabled … PASS
[2025-03-28T08:50:28.671+0800] INFO - check service [firewalld]: inactive … PASS
[2025-03-28T08:50:28.679+0800] INFO - check service [firewalld]: disabled … PASS
[2025-03-28T08:50:28.683+0800] INFO - check sshd_config PubkeyAuthentication: yes … PASS
[2025-03-28T08:50:28.685+0800] INFO - check sshd_config UseDNS: no … PASS
[2025-03-28T08:50:28.688+0800] INFO - check sshd_config ClientAliveInterval: 60 … PASS
[2025-03-28T08:50:28.691+0800] INFO - check sshd_config ClientAliveCountMax: 10 … PASS
[2025-03-28T08:50:28.692+0800] INFO - check hugepage: disabled … PASS
[2025-03-28T08:50:28.693+0800] INFO - check oceanbase_limits.conf, exist … PASS
[2025-03-28T08:50:28.739+0800] INFO - check hard limit of new session open_files (ulimit -H -n): 655360 … PASS
[2025-03-28T08:50:28.739+0800] INFO - check hard limit of open_files (ulimit -H -n): 655360 … PASS
[2025-03-28T08:50:28.775+0800] INFO - check soft limit of new session open_files (ulimit -S -n): 655360 … PASS
[2025-03-28T08:50:28.775+0800] INFO - check soft limit of open_files (ulimit -S -n): 655360 … PASS
[2025-03-28T08:50:28.821+0800] INFO - check hard limit of new session max_user_processes (ulimit -H -u): 655360 … PASS
[2025-03-28T08:50:28.821+0800] INFO - check hard limit of max_user_processes (ulimit -H -u): 655360 … PASS
[2025-03-28T08:50:28.857+0800] INFO - check soft limit of new session max_user_processes (ulimit -S -u): 655360 … PASS
[2025-03-28T08:50:28.858+0800] INFO - check soft limit of max_user_processes (ulimit -S -u): 655360 … PASS
[2025-03-28T08:50:28.905+0800] INFO - check hard limit of new session stack_size (ulimit -H -s): 10240 … PASS
[2025-03-28T08:50:28.906+0800] INFO - check hard limit of stack_size (ulimit -H -s): 10240 … PASS
[2025-03-28T08:50:28.962+0800] INFO - check soft limit of new session stack_size (ulimit -S -s): 10240 … PASS
[2025-03-28T08:50:28.962+0800] INFO - check soft limit of stack_size (ulimit -S -s): 10240 … PASS
[2025-03-28T08:50:29.011+0800] INFO - check hard limit of new session core_file_size (ulimit -H -c): unlimited … PASS
[2025-03-28T08:50:29.012+0800] INFO - check hard limit of core_file_size (ulimit -H -c): unlimited … PASS
[2025-03-28T08:50:29.046+0800] INFO - check soft limit of new session core_file_size (ulimit -S -c): unlimited … PASS
[2025-03-28T08:50:29.047+0800] INFO - check soft limit of core_file_size (ulimit -S -c): unlimited … PASS
[2025-03-28T08:50:29.095+0800] INFO - check hard limit of new session cpu_time (ulimit -H -t): unlimited … PASS
[2025-03-28T08:50:29.095+0800] INFO - check hard limit of cpu_time (ulimit -H -t): unlimited … PASS
[2025-03-28T08:50:29.134+0800] INFO - check soft limit of new session cpu_time (ulimit -S -t): unlimited … PASS
[2025-03-28T08:50:29.135+0800] INFO - check soft limit of cpu_time (ulimit -S -t): unlimited … PASS
[2025-03-28T08:50:29.138+0800] INFO - check numa stat, pass … PASS
[2025-03-28T08:50:29.143+0800] INFO - check elevator policy: deadline … PASS
[2025-03-28T08:50:29.144+0800] INFO - check current_clocksource: tsc … PASS
[2025-03-28T08:50:29.149+0800] INFO - check logical sector size of /dev/sda: 512 … PASS
[2025-03-28T08:50:29.814+0800] INFO - check RPM: mariadb-common-10.3.39-1.0.1.module+an8.8.0+11133+62929fd4.x86_64 mariadb-connector-c-config-3.2.6-1.an8.noarch mariadb-connector-c-3.2.6-1.an8.x86_64 mariadb-10.3.39-1.0.1.module+an8.8.0+11133+62929fd4.x86_64 is installed … PASS
[2025-03-28T08:50:30.492+0800] INFO - check RPM: python2-devel-2.7.18-17.0.1.module+an8.9.0+11214+2a3a4a9e.x86_64 is installed … PASS
[2025-03-28T08:50:31.145+0800] INFO - check RPM: net-tools-2.0-0.52.20160912git.an8.x86_64 is installed … PASS
[2025-03-28T08:50:31.825+0800] INFO - check RPM: mtr-0.92-3.el8.x86_64 is installed … PASS
[2025-03-28T08:50:32.515+0800] INFO - check RPM: selinux-policy-targeted-3.14.3-139.0.1.an8.1.noarch keentune-target-2.4.1-1.an8.noarch tar-1.30-9.0.1.an8.x86_64 is installed … PASS
[2025-03-28T08:50:33.171+0800] INFO - check RPM: binutils-2.30-125.0.1.an8.x86_64 is installed … PASS
[2025-03-28T08:50:33.832+0800] INFO - check RPM: bind-utils-9.11.36-16.0.1.an8.4.x86_64 is installed … PASS
[2025-03-28T08:50:34.481+0800] INFO - check RPM: libaio-0.3.112-1.0.1.an8.x86_64 is installed … PASS
[2025-03-28T08:50:35.138+0800] INFO - check RPM: libcurl-7.61.1-35.0.2.an8.3.x86_64 curl-7.61.1-35.0.2.an8.3.x86_64 python3-pycurl-7.43.0.2-4.el8.x86_64 is installed … PASS
[2025-03-28T08:50:35.799+0800] INFO - check RPM: libatomic-8.5.0-23.0.1.an8.x86_64 is installed … PASS
[2025-03-28T08:50:36.451+0800] INFO - check RPM: irqbalance-1.9.2-1.0.1.an8.x86_64 ncurses-libs-6.1-10.20180224.0.1.an8.x86_64 ncurses-6.1-10.20180224.0.1.an8.x86_64 rsync-3.1.3-20.0.1.an8.x86_64 ncurses-base-6.1-10.20180224.0.1.an8.noarch nmap-ncat-7.92-3.0.1.an8.x86_64 perl-Encode-2.97-3.0.1.an8.x86_64 is installed … PASS
[2025-03-28T08:50:37.110+0800] INFO - check RPM: iproute-6.2.0-6.0.1.an8.x86_64 is installed … PASS
[2025-03-28T08:50:37.119+0800] INFO - check mysql client, working … PASS
[2025-03-28T08:50:37.127+0800] INFO - checking irq affinity …
[2025-03-28T08:50:37.135+0800] INFO - checking ens192 …
[2025-03-28T08:50:37.136+0800] INFO - netlink error: Operation not supported
[2025-03-28T08:50:37.140+0800] INFO - netlink error: Operation not supported
[2025-03-28T08:50:37.142+0800] INFO - check irq channels, NIC: ens192, Channel Combined: … PASS
[2025-03-28T08:50:37.156+0800] INFO - check irq affinity, NIC: ens192, smp_affinity count: 4 … PASS
[2025-03-28T08:50:37.176+0800] INFO - check irqbalance status: inactive … PASS
[2025-03-28T08:50:37.177+0800] INFO - check irqbalance service: disabled … PASS
[2025-03-28T08:50:37.178+0800] INFO - df: /data/1: No such file or directory
[2025-03-28T08:50:37.185+0800] INFO - /tmp/precheck.shSVF0AQwp: line 1058: python: command not found
[2025-03-28T08:50:37.192+0800] INFO - python command must be version python 2.x … FAIL
[2025-03-28T08:50:37.193+0800] INFO -
[2025-03-28T08:50:37.193+0800] INFO -
[2025-03-28T08:50:37.193+0800] INFO - ### SUMMARY OF ISSUES IN PRE-CHECK ###
[2025-03-28T08:50:37.194+0800] INFO - check CPU count: 8 < 32 … EXPECT >= 32 … FAIL
[2025-03-28T08:50:37.194+0800] INFO - TIPS: replace another machine with more CPU
[2025-03-28T08:50:37.194+0800] INFO - check total MEM: 15 GB < 128 GB … EXPECT >= 128 GB … FAIL
[2025-03-28T08:50:37.194+0800] INFO - TIPS: replace another machine with more MEM
[2025-03-28T08:50:37.194+0800] INFO - check /data/1, NOT mounted … EXPECT mounted as individual disk … FAIL
[2025-03-28T08:50:37.194+0800] INFO - TIPS: re-part disk to mount /data/1
[2025-03-28T08:50:37.195+0800] INFO - check /data/log1, NOT mounted … EXPECT mounted as individual disk … FAIL
[2025-03-28T08:50:37.195+0800] INFO - TIPS: re-part disk to mount /data/log1
[2025-03-28T08:50:37.195+0800] INFO - python command must be version python 2.x … FAIL
[2025-03-28T08:50:37.195+0800] INFO - execute command on 192.168.2.16:
rm -f /tmp/precheck.shSVF0AQwp
[2025-03-28T08:50:37.213+0800] ERROR - Task failed with exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/airflow/decorators/base.py", line 217, in execute
    return_value = super().execute(context)
  File "/usr/local/lib/python3.9/site-packages/airflow/operators/python.py", line 175, in execute
    return_value = self.execute_callable()
  File "/usr/local/lib/python3.9/site-packages/airflow/operators/python.py", line 192, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
  File "/oat/task_engine/dags/init_server_with_tag.py", line 79, in precheck
    common.server_precheck(ctx, logger=logger)
  File "/oat/task_engine/plugins/common.py", line 1542, in server_precheck
    raise RuntimeError('server precheck failed, please see the summary info above for details')
RuntimeError: server precheck failed, please see the summary info above for details
[2025-03-28T08:50:37.222+0800] INFO - Marking task as FAILED. dag_id=init_server_with_tag, task_id=precheck, execution_date=20250328T003149, start_date=20250328T005021, end_date=20250328T005037
[2025-03-28T08:50:37.222+0800] INFO - Running statement: update oat_audit set status='failed', update_time=utc_timestamp(), failed_reason=%s where id=%s, parameters: ['failed task instance is init_server_with_tag__precheck__20250328 and exception information is server precheck failed, please see the summary info above for details', 4]
[2025-03-28T08:50:37.223+0800] INFO - Rows affected: 1
[2025-03-28T08:50:37.325+0800] ERROR - Failed to execute job 24 for task precheck (server precheck failed, please see the summary info above for details; 8405)
[2025-03-28T08:50:37.360+0800] INFO - Task exited with return code 1
[2025-03-28T08:50:37.383+0800]

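[Note] A forum assistant based on an LLM and RAG over the open-source documentation is now open for testing. Type [@论坛小助手] when posting to summon it. You are welcome to try it!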

Based on the log you provided, the initialization precheck that OAT runs when adding a server failed. The specific causes and fixes are as follows:

Problem analysis

The log shows that the following checks did not pass:

  1. Insufficient CPU cores

    check CPU count: 8 < 32 ... EXPECT >= 32 ... FAIL
    TIPS: replace another machine with more CPU
    

    The server has 8 CPU cores, while the precheck expects at least 32.

  2. Insufficient memory

    check total MEM: 15 GB < 128 GB ... EXPECT >= 128 GB ... FAIL
    TIPS: replace another machine with more MEM
    

    The server has 15 GB of total memory, while the precheck expects at least 128 GB.

  3. Missing disk mount points

    check /data/1, NOT mounted ... EXPECT mounted as individual disk ... FAIL
    TIPS: re-part disk to mount /data/1
    
    check /data/log1, NOT mounted ... EXPECT mounted as individual disk ... FAIL
    TIPS: re-part disk to mount /data/log1
    

    The /data/1 and /data/log1 mount points are not configured correctly.

  4. Python version problem

    /tmp/precheck.shSVF0AQwp: line 1058: python: command not found
    python command must be version python 2.x ... FAIL
    

    Either Python is not installed on the server, or no python command that resolves to Python 2.x can be found.
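
The failed checks can also be pulled out of a saved copy of the task log instead of being read by eye. The following is a minimal sketch and not part of OAT: the function name list_precheck_failures and the file name precheck.log are made up for illustration, and it assumes the log format shown in this thread, where every failed check ends with FAIL and carries a "[timestamp] INFO - " prefix.

```shell
# Print each failed precheck item once, in order of first appearance.
# $1 = path to a saved task log (hypothetical file, e.g. precheck.log).
list_precheck_failures() {
    grep -E 'FAIL[[:space:]]*$' "$1" \
        | sed 's/^\[[^]]*] INFO - //' \
        | awk '!seen[$0]++'
}

# Hypothetical usage:
#   list_precheck_failures precheck.log
```

The precheck prints each failure twice, once inline and once under "SUMMARY OF ISSUES IN PRE-CHECK", so the awk filter drops the repeats.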


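Solutions

1. Increase the number of CPU cores

  • Make sure the server has 32 or more CPU cores.
  • If the current hardware cannot meet this requirement, switch to a higher-spec server.

2. Increase memory

  • Make sure the server has 128 GB or more of total memory.
  • If the current hardware cannot meet this requirement, switch to a higher-spec server.

3. Configure the disk mount points

  • Check that /data/1 and /data/log1 are each mounted as a separate disk.
  • Example steps (on Linux):
    # Create the directories
    sudo mkdir -p /data/1
    sudo mkdir -p /data/log1
    
    # Format the disks (assuming the devices are /dev/sdb and /dev/sdc; this erases any data on them)
    sudo mkfs.ext4 /dev/sdb
    sudo mkfs.ext4 /dev/sdc
    
    # Mount the disks
    sudo mount /dev/sdb /data/1
    sudo mount /dev/sdc /data/log1
    
    # Write the mounts to /etc/fstab so that they are mounted automatically at boot
    echo "/dev/sdb /data/1 ext4 defaults 0 0" | sudo tee -a /etc/fstab
    echo "/dev/sdc /data/log1 ext4 defaults 0 0" | sudo tee -a /etc/fstab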
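
Before retrying, it can help to confirm that both directories really are mount points. The following is a minimal sketch, assuming a Linux host where /proc/mounts lists the mounted file systems. The helper names is_mount_point and report_ob_mounts are hypothetical, and this is not how OAT's own precheck is implemented (the thread does not show that logic).

```shell
# Succeed if DIR appears as a mount point in a mounts table.
# $1 = directory, $2 = mounts file (defaults to /proc/mounts).
is_mount_point() {
    awk -v dir="$1" '$2 == dir { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Report the state of the two directories named in the precheck output.
# $1 = mounts file (defaults to /proc/mounts).
report_ob_mounts() {
    for dir in /data/1 /data/log1; do
        if is_mount_point "$dir" "${1:-/proc/mounts}"; then
            echo "$dir: mounted"
        else
            echo "$dir: NOT mounted"
        fi
    done
}

# Hypothetical usage on the target server:
#   report_ob_mounts
```

A directory that merely exists on the root file system is reported as NOT mounted, which matches the distinction the precheck draws with "EXPECT mounted as individual disk".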
    

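4. Install Python 2.x

  • Make sure Python 2.x is installed on the server and that the directory containing it is on PATH.
  • Example steps:
    # Install Python 2.x (if the system repositories do not provide it, download and build it manually)
    sudo yum install -y python2
    
    # Check the Python version
    python2 --version
    
    # If needed, add the directory that contains python2 to PATH
    export PATH=/usr/bin:$PATH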
    

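Summary

According to the documentation library and the precheck output, the server must meet fairly demanding hardware and software requirements. Work through the solutions above one by one, then try adding the server again. If the problem persists, contact the OceanBase technical support team for further help.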

python command must be version python 2.x … FAIL

Please try installing Python 2.7 on the server you are adding.

python2 is already installed.
[screenshot]

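According to the information you provided, Python 2 is installed on the server, and which python2 finds it at /usr/bin/python2. However, the earlier log shows that OAT's initialization check script could not find the python command when it ran:

/tmp/precheck.shSVF0AQwp: line 1058: python: command not found
python command must be version python 2.x ... FAIL

This suggests that the problem lies in one of the following:

Possible causes

  1. The default Python version is not set to Python 2

    • Although Python 2 exists on the system, the default python command may point to another version (for example Python 3) or may not be configured at all.
  2. PATH environment variable problem

    • The environment in which the check script runs may not include /usr/bin in PATH, so the interpreter cannot be found.
  3. How the script invokes Python

    • The check script may call python directly rather than python2. Confirm whether the script supports specifying the Python version.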

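Solutions

1. Confirm the default Python version

  • Check the system's current default Python version:
    python --version
    
  • If the default is not Python 2, create a symbolic link or change the default command:
    # Create a symbolic link (use with care; this may affect other software that depends on python)
    sudo ln -sf /usr/bin/python2 /usr/bin/python
    
    # Or manage the default version through alternatives
    sudo alternatives --install /usr/bin/python python /usr/bin/python2 1
    sudo alternatives --config python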
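
Because the failing log line is "python: command not found", what matters is what the bare python command resolves to in the session the precheck uses, not whether python2 exists. The following is a minimal sketch for checking that. python_major is a hypothetical helper, and the SSH line in the comment assumes the root login and the address 192.168.2.16 seen in the log.

```shell
# Print the major version reported by an interpreter command, or "missing"
# if the command is not on PATH. Python 2 prints its version to stderr,
# so stderr is merged into stdout before parsing.
python_major() {
    cmd="${1:-python}"
    if ! command -v "$cmd" >/dev/null 2>&1; then
        echo "missing"
        return 1
    fi
    "$cmd" --version 2>&1 | sed -n 's/^Python \([0-9][0-9]*\)\..*/\1/p'
}

# Hypothetical check over a non-interactive SSH session, which is closer to
# how the precheck is run than an interactive login shell:
#   ssh root@192.168.2.16 'command -v python; python --version'
```

If python_major prints 2, the bare python command should satisfy the "must be version python 2.x" check. If it prints missing or 3, that check would be expected to fail even though python2 is installed.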
    

2. Check the PATH environment variable

  • Make sure /usr/bin is included in PATH:
    echo $PATH
    
  • If /usr/bin is missing, add it manually:
    export PATH=/usr/bin:$PATH

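3. How the check script calls Python

  • The log line "line 1058: python: command not found" shows that the script calls python directly rather than python2. Having python2 installed is therefore not enough: the python command itself must resolve to Python 2.x (see step 1).
  • Editing the uploaded script on the target server is unlikely to help. The log shows that the script is uploaded under a different temporary name on each run (for example /tmp/precheck.shSVF0AQwp) and removed with rm -f when the task finishes. It is also a shell script, so it should not be run with the python2 interpreter.

4. Verify the fix

  • Retry the add-server task in OAT so that the precheck runs again, and confirm that the Python check no longer appears in the summary. For reference, the log shows the script being invoked as:
    /tmp/precheck.shSVF0AQwp -m both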
    

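Summary

Based on the documentation library and your description, the main cause is that the initialization check script could not find a python command that resolves to Python 2. Work through the steps above one by one, then try adding the server again. If the problem persists, contact the OceanBase technical support team for further help.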


(End of the assistant's reply. If your problem is not solved, please keep asking and wait for replies from other members. Thank you!)

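You can follow the assistant's suggestions: check and set the default Python version, set the environment variable, and then try again.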

an8.noarch nmap-ncat-7.92-3.0.1.an8.x86_64 perl-Encode-2.97-3.0.1.an8.x86_64 is installed … PASS
[2025-03-28T13:45:15.700+0800] INFO - check RPM: iproute-6.2.0-6.0.1.an8.x86_64 is installed … PASS
[2025-03-28T13:45:15.708+0800] INFO - check mysql client, working … PASS
[2025-03-28T13:45:15.715+0800] INFO - checking irq affinity …
[2025-03-28T13:45:15.723+0800] INFO - checking ens192 …
[2025-03-28T13:45:15.724+0800] INFO - netlink error: Operation not supported
[2025-03-28T13:45:15.728+0800] INFO - netlink error: Operation not supported
[2025-03-28T13:45:15.730+0800] INFO - check irq channels, NIC: ens192, Channel Combined: … PASS
[2025-03-28T13:45:15.742+0800] INFO - check irq affinity, NIC: ens192, smp_affinity count: 4 … PASS
[2025-03-28T13:45:15.759+0800] INFO - check irqbalance status: inactive … PASS
[2025-03-28T13:45:15.760+0800] INFO - check irqbalance service: disabled … PASS
[2025-03-28T13:45:15.761+0800] INFO - df: /data/1: No such file or directory
[2025-03-28T13:45:15.782+0800] INFO -
[2025-03-28T13:45:15.783+0800] INFO -
[2025-03-28T13:45:15.783+0800] INFO - ### SUMMARY OF ISSUES IN PRE-CHECK ###
[2025-03-28T13:45:15.783+0800] INFO - check CPU count: 8 < 32 … EXPECT >= 32 … FAIL
[2025-03-28T13:45:15.783+0800] INFO - TIPS: replace another machine with more CPU
[2025-03-28T13:45:15.783+0800] INFO - check total MEM: 15 GB < 128 GB … EXPECT >= 128 GB … FAIL
[2025-03-28T13:45:15.783+0800] INFO - TIPS: replace another machine with more MEM
[2025-03-28T13:45:15.784+0800] INFO - check /data/1, NOT mounted … EXPECT mounted as individual disk … FAIL
[2025-03-28T13:45:15.784+0800] INFO - TIPS: re-part disk to mount /data/1
[2025-03-28T13:45:15.784+0800] INFO - check /data/log1, NOT mounted … EXPECT mounted as individual disk … FAIL
[2025-03-28T13:45:15.784+0800] INFO - TIPS: re-part disk to mount /data/log1
[2025-03-28T13:45:15.785+0800] INFO - execute command on 192.168.2.16:
rm -f /tmp/precheck.shlHtJOQaP
[2025-03-28T13:45:15.801+0800] ERROR - Task failed with exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/airflow/decorators/base.py", line 217, in execute
    return_value = super().execute(context)
  File "/usr/local/lib/python3.9/site-packages/airflow/operators/python.py", line 175, in execute
    return_value = self.execute_callable()
  File "/usr/local/lib/python3.9/site-packages/airflow/operators/python.py", line 192, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
  File "/oat/task_engine/dags/init_server_with_tag.py", line 79, in precheck
    common.server_precheck(ctx, logger=logger)
  File "/oat/task_engine/plugins/common.py", line 1542, in server_precheck
    raise RuntimeError('server precheck failed, please see the summary info above for details')
RuntimeError: server precheck failed, please see the summary info above for details
[2025-03-28T13:45:15.808+0800] INFO - Marking task as FAILED. dag_id=init_server_with_tag, task_id=precheck, execution_date=20250328T003149, start_date=20250328T054500, end_date=20250328T054515
[2025-03-28T13:45:15.809+0800] INFO - Running statement: update oat_audit set status='failed', update_time=utc_timestamp(), failed_reason=%s where id=%s, parameters: ['failed task instance is init_server_with_tag__precheck__20250328 and exception information is server precheck failed, please see the summary info above for details', 4]
[2025-03-28T13:45:15.809+0800] INFO - Rows affected: 1
[2025-03-28T13:45:15.829+0800] ERROR - Failed to execute job 59 for task precheck (server precheck failed, please see the summary info above for details; 94782)
[2025-03-28T13:45:15.857+0800] INFO - Task exited with return code 1
[2025-03-28T13:45:15.879+0800] INFO - 0 downstream tasks scheduled from follow-on schedule check

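The default is already 2.7.

I don't understand why it keeps running from

/usr/local/lib/python3.9/

I don't have that directory at all.

Which version of OCP are you using?
When you took over the host, was the host credential you selected the root user?

Could you confirm whether you are using the Community Edition of OCP?

I deployed the latest version of the Enterprise Edition, which I just downloaded from the page.

Also, I tried another machine running CentOS 7, and it throws the same error.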

[2025-03-28T13:55:17.474+0800] INFO - check mysql client, version is 8.x ... EXPECT client using mysql_native_password auth ... FAIL
[2025-03-28T13:55:17.475+0800] INFO - execute command on 192.168.2.101:
rm -f /tmp/precheck.shMd9wRqQf
[2025-03-28T13:55:17.488+0800] ERROR - Task failed with exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/airflow/decorators/base.py", line 217, in execute
    return_value = super().execute(context)
  File "/usr/local/lib/python3.9/site-packages/airflow/operators/python.py", line 175, in execute
    return_value = self.execute_callable()
  File "/usr/local/lib/python3.9/site-packages/airflow/operators/python.py", line 192, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
  File "/oat/task_engine/dags/init_server_with_tag.py", line 79, in precheck
    common.server_precheck(ctx, logger=logger)
  File "/oat/task_engine/plugins/common.py", line 1542, in server_precheck
    raise RuntimeError('server precheck failed, please see the summary info above for details')
RuntimeError: server precheck failed, please see the summary info above for details
[2025-03-28T13:55:17.494+0800] INFO - Marking task as FAILED. dag_id=init_server_with_tag, task_id=precheck, execution_date=20250328T052716, start_date=20250328T055359, end_date=20250328T055517
[2025-03-28T13:55:17.495+0800]

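Yes, it is the root user.

To be precise, the error occurs when I add a server after deploying OAT.
OceanBase Admin Toolkit


That is where I downloaded it.

We recommend using the Community Edition. For the Enterprise Edition, you need to contact enterprise support.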