OceanBase 4.0 all-in-one: three-node deployment on three servers fails at the end with [ERROR] Cluster init failed

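【Environment】Test environment
【OB or other component】all-in-one
【Version】4.0
【Problem description】Deploying OceanBase 4.0 with all-in-one as three nodes across three servers fails at the final step with: Initialize cluster x [ERROR] Cluster init failed
【Steps to reproduce】Configured the system following https://www.oceanbase.com/docs/community-observer-cn-10000000000900490
【Symptoms and impact】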

System information:
CentOS Linux release 7.9.2009 (Core)
Linux 213 3.10.0-1160.76.1.el7.x86_64 #1 SMP Wed Aug 10 16:21:17 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

obd cluster deploy zzw -c myconfig.yaml
install oceanbase-ce-4.0.0.0 for local ok
install obproxy-ce-4.0.0 for local ok
install obagent-1.2.0 for local ok
install prometheus-2.37.1 for local ok
install grafana-7.5.17 for local ok
+--------------------------------------------------------------------------------------------+
|                                          Packages                                          |
+--------------+---------+------------------------+------------------------------------------+
| Repository   | Version | Release                | Md5                                      |
+--------------+---------+------------------------+------------------------------------------+
| oceanbase-ce | 4.0.0.0 | 103000022023011215.el7 | 1d56dc742f5f05a2d15797d291b51a94019e728d |
| obproxy-ce   | 4.0.0   | 5.el7                  | de53232a951184fad75b15884458d85e31d2f6c3 |
| obagent      | 1.2.0   | 4.el7                  | 0e8f5ee68c337ea28514c9f3f820ea546227fa7e |
| prometheus   | 2.37.1  | 10000102022110211.el7  | 58913c7606f05feb01bc1c6410346e5fc31cf263 |
| grafana      | 7.5.17  | 1                      | 1bf1f338d3a3445d8599dc6902e7aeed4de4e0d6 |
+--------------+---------+------------------------+------------------------------------------+
Repository integrity check ok
Parameter check ok
Open ssh connection ok
Cluster status check ok
Initializes observer work home ok
Initializes obproxy work home ok
Initializes obagent work home ok
Initializes prometheus work home ok
Initializes grafana work home ok
Remote oceanbase-ce-4.0.0.0-103000022023011215.el7-1d56dc742f5f05a2d15797d291b51a94019e728d repository install ok
Remote oceanbase-ce-4.0.0.0-103000022023011215.el7-1d56dc742f5f05a2d15797d291b51a94019e728d repository lib check !!
Remote obproxy-ce-4.0.0-5.el7-de53232a951184fad75b15884458d85e31d2f6c3 repository install ok
Remote obproxy-ce-4.0.0-5.el7-de53232a951184fad75b15884458d85e31d2f6c3 repository lib check ok
Remote obagent-1.2.0-4.el7-0e8f5ee68c337ea28514c9f3f820ea546227fa7e repository install ok
Remote obagent-1.2.0-4.el7-0e8f5ee68c337ea28514c9f3f820ea546227fa7e repository lib check ok
Remote prometheus-2.37.1-10000102022110211.el7-58913c7606f05feb01bc1c6410346e5fc31cf263 repository install ok
Remote prometheus-2.37.1-10000102022110211.el7-58913c7606f05feb01bc1c6410346e5fc31cf263 repository lib check ok
Remote grafana-7.5.17-1-1bf1f338d3a3445d8599dc6902e7aeed4de4e0d6 repository install ok
Remote grafana-7.5.17-1-1bf1f338d3a3445d8599dc6902e7aeed4de4e0d6 repository lib check ok
Try to get lib-repository
install oceanbase-ce-libs-4.0.0.0 for local ok
Remote oceanbase-ce-libs-4.0.0.0-103000022023011215.el7-ef48cff7633e3dbc39f5c0abdcd72348213e09a2 repository install ok
Remote oceanbase-ce-4.0.0.0-103000022023011215.el7-1d56dc742f5f05a2d15797d291b51a94019e728d repository lib check ok
zzw deployed
[root@213 conf]# obd cluster list
+-------------------------------------------------+
|                   Cluster List                  |
+------+------------------------+-----------------+
| Name | Configuration Path     | Status (Cached) |
+------+------------------------+-----------------+
| zzw  | /root/.obd/cluster/zzw | deployed        |
+------+------------------------+-----------------+
[root@213 conf]# obd cluster start zzw
Get local repositories ok
Search plugins ok
Open ssh connection ok
Load cluster param plugin ok
Check before start observer ok
[WARN] (10.0.0.211) clog and data use the same disk (/home)
[WARN] (10.0.0.212) clog and data use the same disk (/hotstor)
[WARN] (10.0.0.213) clog and data use the same disk (/home)

Check before start obproxy ok
Check before start obagent ok
Check before start prometheus ok
Check before start grafana ok
Start observer ok
observer program health check ok
Connect to observer ok
Initialize cluster x
[ERROR] Cluster init failed
See https://www.oceanbase.com/product/ob-deployer/error-codes .

The configuration is as follows:

## Only need to configure when remote login is required
user:
  username: root
  password: 123456
#   key_file: your ssh-key file path if need
#   port: your ssh port, default 22
#   timeout: ssh connection timeout (second), default 30
oceanbase-ce:
  servers:
    - name: server1
      # Please don't use hostname, only IP can be supported
      ip: 10.0.0.211
    - name: server2
      ip: 10.0.0.212
    - name: server3
      ip: 10.0.0.213
  global:
    # if current hardware's memory capacity is smaller than 50G, please use the setting of "mini-single-example.yaml" and do a small adjustment.
    memory_limit: 32G # The maximum running memory for an observer
    # The reserved system memory. system_memory is reserved for general tenants. The default value is 30G.
    system_memory: 10G
    datafile_size: 192G # Size of the data file. 
    log_disk_size: 192G # The size of disk space used by the clog files.
    syslog_level: INFO # System log level. The default value is INFO.
    enable_syslog_wf: false # Print system logs whose levels are higher than WARNING to a separate log file. The default value is true.
    enable_syslog_recycle: true # Enable auto system log recycling or not. The default value is false.
    max_syslog_file_count: 4 # The maximum number of reserved log files before enabling auto recycling. The default value is 0.
    skip_proxy_sys_private_check: true
    enable_strict_kernel_release: false
    # root_password: # root user password
  # In this example , support multiple ob process in single node, so different process use different ports.
  # If deploy ob cluster in multiple nodes, the port and path setting can be same. 
  server1:
    # Please set devname as the network adaptor's name whose ip is in the setting of servers.
    # if set servers as "127.0.0.1", please set devname as "lo"
    # if current ip is 192.168.1.10, and the ip's network adaptor's name is "eth0", please use "eth0"
    devname: enp7s0
    mysql_port: 2881 # External port for OceanBase Database. The default value is 2881. DO NOT change this value after the cluster is started.
    rpc_port: 2882 # Internal port for OceanBase Database. The default value is 2882. DO NOT change this value after the cluster is started.
    # The working directory for OceanBase Database. OceanBase Database is started under this directory. This is a required field.
    home_path: /home/oceanbase/admin/oceanbase
    # The directory for data storage. The default value is $home_path/store.
    data_dir: /home/oceanbase/data
    # The directory for clog, ilog, and slog. The default value is the same as the data_dir value.
    redo_dir: /home/oceanbase/redo
    zone: zone1
  server2:
    # Please set devname as the network adaptor's name whose ip is in the setting of servers.
    # if set servers as "127.0.0.1", please set devname as "lo"
    # if current ip is 192.168.1.10, and the ip's network adaptor's name is "eth0", please use "eth0"
    devname: enp6s0
    mysql_port: 2881 # External port for OceanBase Database. The default value is 2881. DO NOT change this value after the cluster is started.
    rpc_port: 2882 # Internal port for OceanBase Database. The default value is 2882. DO NOT change this value after the cluster is started.
    #  The working directory for OceanBase Database. OceanBase Database is started under this directory. This is a required field.
    home_path: /hotstor/oceanbase/admin/oceanbase
    # The directory for data storage. The default value is $home_path/store.
    data_dir: /hotstor/oceanbase/data
    # The directory for clog, ilog, and slog. The default value is the same as the data_dir value.
    redo_dir: /hotstor/oceanbase/redo
    zone: zone2
  server3:
    # Please set devname as the network adaptor's name whose ip is in the setting of servers.
    # if set servers as "127.0.0.1", please set devname as "lo"
    # if current ip is 192.168.1.10, and the ip's network adaptor's name is "eth0", please use "eth0"
    devname: enp3s0
    mysql_port: 2881 # External port for OceanBase Database. The default value is 2881. DO NOT change this value after the cluster is started.
    rpc_port: 2882 # Internal port for OceanBase Database. The default value is 2882. DO NOT change this value after the cluster is started.
    #  The working directory for OceanBase Database. OceanBase Database is started under this directory. This is a required field.
    home_path: /home/oceanbase/admin/oceanbase
    # The directory for data storage. The default value is $home_path/store.
    data_dir: /home/oceanbase/data
    # The directory for clog, ilog, and slog. The default value is the same as the data_dir value.
    redo_dir: /home/oceanbase/redo
    zone: zone3
obproxy-ce:
  # Set dependent components for the component.
  # When the associated configurations are not done, OBD will automatically get these configurations from the dependent components.
  depends:
    - oceanbase-ce
  servers:
    - 10.0.0.213
  global:
    listen_port: 2883 # External port. The default value is 2883.
    prometheus_listen_port: 2884 # The Prometheus port. The default value is 2884.
    home_path: /home/oceanbase/admin/obproxy
    # oceanbase root server list
    # format: ip:mysql_port;ip:mysql_port. When a depends exists, OBD gets this value from the oceanbase-ce of the depends.
    # rs_list: 192.168.1.2:2881;192.168.1.3:2881;192.168.1.4:2881
    enable_cluster_checkout: false
    # observer cluster name, consistent with oceanbase-ce's appname. When a depends exists, OBD gets this value from the oceanbase-ce of the depends.
    # cluster_name: obcluster
    skip_proxy_sys_private_check: true
    enable_strict_kernel_release: false
    # obproxy_sys_password: # obproxy sys user password, can be empty. When a depends exists, OBD gets this value from the oceanbase-ce of the depends.
    # observer_sys_password: # proxyro user password, consistent with oceanbase-ce's proxyro_password, can be empty. When a depends exists, OBD gets this value from the oceanbase-ce of the depends.
obagent:
  depends:
    - oceanbase-ce
  servers:
    - name: server1
      # Please don't use hostname, only IP can be supported
      ip: 10.0.0.211
    - name: server2
      ip: 10.0.0.212
    - name: server3
      ip: 10.0.0.213
  global:
    home_path: /home/oceanbase/admin/obagent
    ob_monitor_status: active
prometheus:
  depends:
    - obagent
  servers:
    - 10.0.0.213
  global:
    home_path: /home/oceanbase/admin/prometheus
grafana:
  depends:
    - prometheus
  servers:
    - 10.0.0.213
  global:
    home_path: /home/oceanbase/admin/grafana
    login_password: oceanbase

Update: I just found today that after running obd cluster start zzw a few more times, the cluster came up.

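Judging from the startup log, nothing is reported other than the warnings. This symptom usually means the machines are underpowered: initializing the cluster on the first start takes some time, and a slow machine cannot finish within the allotted time, so the step is reported as failed. The fix is to run the start command again. OBD commands are re-entrant, so running start again goes straight to the stage that waits for initialization, and after a few more attempts it will succeed.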
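
If you would rather not re-run the command by hand, the retry can be wrapped in a small shell helper. This is only a minimal sketch: the retry function, the attempt count and the wait time are my own illustrative choices and not part of OBD, and it assumes that obd exits with a non-zero status when the start fails.

```shell
#!/usr/bin/env bash
# retry MAX DELAY CMD...: hypothetical helper, not part of OBD.
# Runs CMD until it exits with status 0 or MAX attempts have been used,
# sleeping DELAY seconds between attempts.
retry() {
  local max="$1" delay="$2"
  shift 2
  local attempt=1
  while true; do
    echo "attempt ${attempt}/${max}: $*"
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after ${max} attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Example (assumed values: up to 5 attempts, 60 seconds apart):
# retry 5 60 obd cluster start zzw
```

If every attempt fails, it is probably not just a slow first initialization, and the observer logs are the next place to look.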

OBD-5000: select * from oceanbase.__all_server
That explanation doesn't hold up: other databases all run fine on these machines.