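【 Environment 】Test environment
【 OB or other component 】 OCP
【 Version 】
【Problem description】Installing OCP with Enterprise Edition 3.2.3 fails with an error, but I cannot find any log that shows the specific cause.
【Steps to reproduce】Test install with bash install.sh -i 1-8; afterwards I want to install OB and obproxy through OCP.
【Symptoms and impact】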
##Installation process
[root@pdmpcdp07 t-oceanbase-antman]# bash install.sh -i 1-8
run install.sh with DEBUG=FALSE, INSTALL_STEPS=1 2 3 4 5 6 7 8 CLEAR_STEPS= CONFIG_FILE=/root/t-oceanbase-antman/obcluster.conf
[2022-10-08 20:29:34.915545] INFO [check conf file /root/t-oceanbase-antman/obcluster.conf format …]
[2022-10-08 20:29:34.923176] INFO [conf file is upper case format.]
[2022-10-08 20:29:37.038294] INFO [start antman API service]
[2022-10-08 20:29:37.091022] INFO [SSH_AUTH=password SSH_USER=root SSH_PORT=22 SSH_PASSWORD=Ci*** SSH_KEY_FILE=/root/.ssh/id_rsa]
LB_MODE=none
[2022-10-08 20:29:37.363422] INFO [use inner proxy, OBPROXY settings is ignore]
[2022-10-08 20:29:37.476420] INFO [step1: check ssh authorization, logfile: /root/t-oceanbase-antman/logs/ssh_auth.log]
[2022-10-08 20:29:37.693177] INFO [step1: ssh authorization done]
[2022-10-08 20:29:37.705342] INFO [step2: no action is required when LB_MODE=none]
OCP_VERSION is 3.2.3
[2022-10-08 20:29:38.222613] INFO [step3: check whether OBSERVER port 2881,2882 are in use or not on 10.249.240.7]
[2022-10-08 20:29:39.078827] INFO [step3: OBSERVER port 2881,2882, 2022 are idle on 10.249.240.7]
[2022-10-08 20:29:39.082853] INFO [step3: installing ob cluster, logfile: /root/t-oceanbase-antman/logs/install_ob.log]
[2022-10-08 20:29:47.381391] INFO [load docker image: docker load -i /root/t-oceanbase-antman/metaob_OB2277_OBP320_x86_20220429.tgz]
Loaded image: reg.docker.alibaba-inc.com/antman/ob-docker:OB2277_OBP320_x86_20220429
data dir /data/1 avail space less than 90%
[2022-10-08 20:30:10.979000] INFO [installing OB docker and starting OB server on 10.249.240.7, pid: 18659, log: /root/t-oceanbase-antman/logs/install_OB_docker.log and /home/admin/logs/ob-server/ inside docker]
[2022-10-08 20:30:11.458260] ERROR [install_OB_docker.sh finished but reg.docker.alibaba-inc.com/antman/ob-docker:OB2277_OBP320_x86_20220429 NOT started on 10.249.240.7]
[2022-10-08 20:30:11.462201] ERROR [ANTMAN-303: OB docker on 10.249.240.7 is NOT started]
[2022-10-08 20:30:11.474268] ERROR [ANTMAN-314: ERROR occurred in install_ob, install.sh exit]
[root@pdmpcdp07 t-oceanbase-antman]#
##precheck result
SUMMARY OF ISSUES IN PRE-CHECK
check /data/1, NOT mounted … EXPECT mounted as individual disk … FAIL
TIPS: re-part disk to mount /data/1
clone.sh -p
check /data/log1, NOT mounted … EXPECT mounted as individual disk … FAIL
TIPS: re-part disk to mount /data/log1
clone.sh -p
check numa stat, node number is 4 … EXPECT X86 off / ARM on … WARN
TIPS: Please turn off numa in BIOS or UEFI, with BIOS/UEFI settings done, below system steps could be ignored.
The system steps to turn off numa:
1, add "numa=off" at the end of kernel command line(GRUB_CMDLINE_LINUX) in boot loader configuration file(/etc/default/grub), for example:
GRUB_CMDLINE_LINUX="rd.lvm.lv=rhel_vm-210/root rd.lvm.lv=rhel_vm-210/swap vconsole.font=latarcyrheb-sun16 crashkernel=auto vconsole.keymap=us rhgb quiet numa=off"
2, grub2-mkconfig -o /boot/efi/EFI/centos/grub.cfg
3, reboot the system
[root@pdmpcdp07 clonescripts]#
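For the NUMA warning, whether the numa=off setting from the tip has taken effect can be checked after the reboot. A minimal sketch, assuming a Linux host where /proc/cmdline is readable; the helper name has_numa_off is only illustrative and is not part of antman:

```shell
#!/usr/bin/env bash
# has_numa_off is a hypothetical helper: it succeeds if the given kernel
# command line contains numa=off as a whole, space-separated parameter.
has_numa_off() {
    case " $1 " in
        *" numa=off "*) return 0 ;;
        *) return 1 ;;
    esac
}

# /proc/cmdline holds the command line of the currently running kernel.
if [ -r /proc/cmdline ]; then
    if has_numa_off "$(cat /proc/cmdline)"; then
        echo "numa=off is active"
    else
        echo "numa=off is NOT on the running kernel command line"
    fi
fi
```

This only inspects the boot parameter; it does not cover the case where NUMA was disabled in BIOS/UEFI instead.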
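On this server, /data/log1 and /data/1 are not separately partitioned disks; they are just directories created under the root directory for testing.
I don't know whether these two directories are what caused the installation to fail. There is also no obvious log explaining why the error occurred.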
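For reference, a minimal sketch of the checks that could narrow this down. It assumes the log paths printed by install.sh above and the mountpoint utility from util-linux; the helper name check_mount is only illustrative and is not part of antman:

```shell
#!/usr/bin/env bash
# check_mount is a hypothetical helper: for each directory given,
# report whether it is its own mount point.
check_mount() {
    local dir
    for dir in "$@"; do
        if mountpoint -q "$dir"; then
            echo "$dir: mounted"
        else
            echo "$dir: NOT mounted"
        fi
    done
}

# The two directories flagged by the precheck.
check_mount /data/1 /data/log1

# Show the tail of the log files that install.sh itself pointed to.
for f in /root/t-oceanbase-antman/logs/install_ob.log \
         /root/t-oceanbase-antman/logs/install_OB_docker.log; do
    if [ -r "$f" ]; then
        echo "== $f"
        tail -n 50 "$f"
    fi
done
```

Running docker ps -a afterwards would additionally show whether the OB container was created and exited, or was never created at all.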