During OBD scale-out, running obd cluster start obce-3zones does not add the new observers to the original OB cluster

【 Environment 】
Test environment
【 OB or other component 】
OBD scale-out
【 Version 】
OBServer: 3.1.4
【 Problem description 】
Original cluster:

obclient [oceanbase]> select svr_ip,id,zone,status from __all_server;
+----------------+----+-------+--------+
| svr_ip         | id | zone  | status |
+----------------+----+-------+--------+
| 172.118.81.151 |  1 | zone1 | active |
| 172.118.81.152 |  2 | zone2 | active |
| 172.118.81.153 |  3 | zone3 | active |
+----------------+----+-------+--------+

The YAML file used by obd to deploy the original cluster:

$ grep -v '#' obce-3zones.yml 
user:
  username: admin
  key_file: /home/admin/.ssh/id_rsa.pub
oceanbase-ce:
  version: 3.1.4
  servers:
    - name: server151
      ip: 172.118.81.151
    - name: server152
      ip: 172.118.81.152
    - name: server153
      ip: 172.118.81.153
  global:
    devname: ens192
    cluster_id: 20230720
    cpu_count: 12
    memory_limit: 30G
    system_memory: 6G
    datafile_size: 150G
    log_disk_size: 100G
    syslog_level: WARN
    enable_syslog_wf: false
    enable_syslog_recycle: true
    max_syslog_file_count: 20
    appname: obce-3zones
    root_password: cloudadmin
    proxyro_password: proxyro
  server151:
    mysql_port: 2881
    rpc_port: 2882
    home_path: /home/admin/oceanbase-ce
    data_dir: /data/obce
    redo_dir: /redo/obce
    zone: zone1
  server152:
    mysql_port: 2881
    rpc_port: 2882
    home_path: /home/admin/oceanbase-ce
    data_dir: /data/obce
    redo_dir: /redo/obce
    zone: zone2
  server153:
    mysql_port: 2881
    rpc_port: 2882
    home_path: /home/admin/oceanbase-ce
    data_dir: /data/obce
    redo_dir: /redo/obce
    zone: zone3
obproxy-ce:
  version: 3.2.3
  servers:
    - 172.118.81.151
    - 172.118.81.156
  depends:
    - oceanbase-ce
  global:
    listen_port: 2883
    prometheus_listen_port: 2884
    home_path: /home/admin/obproxy
    enable_cluster_checkout: false
    skip_proxy_sys_private_check: true
    enable_strict_kernel_release: false

The YAML file for the three new nodes:

$ grep -v '#' obce-3zones_expansion.yml 
user:
  username: admin
  key_file: /home/admin/.ssh/id_rsa.pub
oceanbase-ce:
  version: 3.1.4
  servers:
    - name: server31
      ip: 10.192.45.31
    - name: server34
      ip: 10.192.45.34
    - name: server35
      ip: 10.192.45.35
  global:
    devname: ens192
    cluster_id: 20230720
    cpu_count: 12
    memory_limit: 30G
    system_memory: 6G
    syslog_level: WARN
    enable_syslog_wf: false
    enable_syslog_recycle: true
    max_syslog_file_count: 20
    appname: obce-3zones
    root_password: cloudadmin
  server31:
    mysql_port: 2881
    rpc_port: 2882
    home_path: /home/admin/oceanbase-ce
    data_dir: /data/obce
    redo_dir: /redo/obce
    zone: zone1
    datafile_size: 100G
    log_disk_size: 100G
  server34:
    mysql_port: 2881
    rpc_port: 2882
    home_path: /home/admin/oceanbase-ce
    data_dir: /data/obce
    redo_dir: /redo/obce
    zone: zone2
    datafile_size: 100G
    log_disk_size: 100G
  server35:
    mysql_port: 2881
    rpc_port: 2882
    home_path: /home/admin/oceanbase-ce
    data_dir: /data/obce
    redo_dir: /redo/obce
    zone: zone3
    datafile_size: 100G
    log_disk_size: 100G

Result of obd cluster deploy for the new nodes:

$ obd cluster deploy obce-3zones_2 -c obce-3zones_expansion.yml
+--------------------------------------------------------------------------------------------+
|                                          Packages                                          |
+--------------+---------+------------------------+------------------------------------------+
| Repository   | Version | Release                | Md5                                      |
+--------------+---------+------------------------+------------------------------------------+
| oceanbase-ce | 3.1.4   | 103000102023020719.el7 | 95c5c0f24db6b4b4f9b5a75f364c59c78a46b500 |
+--------------+---------+------------------------+------------------------------------------+
Repository integrity check ok
Parameter check ok
Cluster status check ok
Initializes observer work home ok
Remote oceanbase-ce-3.1.4-103000102023020719.el7-95c5c0f24db6b4b4f9b5a75f364c59c78a46b500 repository install ok
Remote oceanbase-ce-3.1.4-103000102023020719.el7-95c5c0f24db6b4b4f9b5a75f364c59c78a46b500 repository lib check !!
Try to get lib-repository
Remote oceanbase-ce-libs-4.2.0.0-100000152023080109.el7-6368f1d3c05f9add8c11d0c9c3b87a2fac2055b1 repository install ok
Remote oceanbase-ce-3.1.4-103000102023020719.el7-95c5c0f24db6b4b4f9b5a75f364c59c78a46b500 repository lib check ok
obce-3zones_2 deployed
Trace ID: 783523b2-31bf-11ee-a887-0050568d699e
If you want to view detailed obd logs, please run: obd display-trace 783523b2-31bf-11ee-a887-0050568d699e

Check the cluster list:

$ obd cluster list
+--------------------------------------------------------------------------+
|                               Cluster List                               |
+---------------+----------------------------------------+-----------------+
| Name          | Configuration Path                     | Status (Cached) |
+---------------+----------------------------------------+-----------------+
| obce-3zones   | /home/admin/.obd/cluster/obce-3zones   | running         |
| obagent       | /home/admin/.obd/cluster/obagent       | running         |
| obce-3zones_2 | /home/admin/.obd/cluster/obce-3zones_2 | deployed        |
+---------------+----------------------------------------+-----------------+

I then edited the original cluster's config.yaml to include the three new servers:

$ grep -v '#' .obd/cluster/obce-3zones/config.yaml 
user:
  username: admin
  key_file: /home/admin/.ssh/id_rsa.pub
oceanbase-ce:
  version: 3.1.4
  servers:
  - name: server151
    ip: 172.118.81.151
  - name: server152
    ip: 172.118.81.152
  - name: server153
    ip: 172.118.81.153
  - name: server31
    ip: 10.192.45.31
  - name: server34
    ip: 10.192.45.34
  - name: server35
    ip: 10.192.45.35
  global:
    devname: ens192
    cluster_id: 20230720
    cpu_count: 12
    memory_limit: 30G
    system_memory: 6G
    syslog_level: WARN
    enable_syslog_wf: false
    enable_syslog_recycle: true
    max_syslog_file_count: 20
    appname: obce-3zones
    root_password: cloudadmin
    proxyro_password: proxyro
  server151:
    mysql_port: 2881
    rpc_port: 2882
    home_path: /home/admin/oceanbase-ce
    data_dir: /data/obce
    redo_dir: /redo/obce
    zone: zone1
    datafile_size: 150G
    log_disk_size: 100G
  server152:
    mysql_port: 2881
    rpc_port: 2882
    home_path: /home/admin/oceanbase-ce
    data_dir: /data/obce
    redo_dir: /redo/obce
    zone: zone2
    datafile_size: 150G
    log_disk_size: 100G
  server153:
    mysql_port: 2881
    rpc_port: 2882
    home_path: /home/admin/oceanbase-ce
    data_dir: /data/obce
    redo_dir: /redo/obce
    zone: zone3
    datafile_size: 150G
    log_disk_size: 100G
  server31:
    mysql_port: 2881
    rpc_port: 2882
    home_path: /home/admin/oceanbase-ce
    data_dir: /data/obce
    redo_dir: /redo/obce
    zone: zone1
    datafile_size: 100G
    log_disk_size: 100G
  server34:
    mysql_port: 2881
    rpc_port: 2882
    home_path: /home/admin/oceanbase-ce
    data_dir: /data/obce
    redo_dir: /redo/obce
    zone: zone2
    datafile_size: 100G
    log_disk_size: 100G
  server35:
    mysql_port: 2881
    rpc_port: 2882
    home_path: /home/admin/oceanbase-ce
    data_dir: /data/obce
    redo_dir: /redo/obce
    zone: zone3
    datafile_size: 100G
    log_disk_size: 100G
obproxy-ce:
  version: 3.2.3
  servers:
  - 172.118.81.151
  - 172.118.81.156
  depends:
  - oceanbase-ce
  global:
    listen_port: 2883
    prometheus_listen_port: 2884
    home_path: /home/admin/obproxy
    enable_cluster_checkout: false
    skip_proxy_sys_private_check: true
    enable_strict_kernel_release: false
    obproxy_sys_password: alPxVNMayB

After running obd cluster start obce-3zones, the three new observer nodes still had not joined the cluster:

$ obd cluster start obce-3zones
Get local repositories ok
Search plugins ok
Open ssh connection ok
Load cluster param plugin ok
Cluster status check ok
Check before start observer ok
[WARN] OBD-1011: (10.192.45.31) The recommended value of fs.aio-max-nr is 1048576 (Current value: 65536)
[WARN] OBD-1011: (10.192.45.34) The recommended value of fs.aio-max-nr is 1048576 (Current value: 65536)
[WARN] OBD-1011: (10.192.45.35) The recommended value of fs.aio-max-nr is 1048576 (Current value: 65536)

Check before start obproxy ok
Start observer ok
observer program health check ok
Connect to observer ok
Start obproxy ok
obproxy program health check ok
Connect to obproxy ok
Initialize obproxy-ce ok
Wait for observer init ok
+--------------------------------------------------+
|                     observer                     |            # observer still lists only the three nodes 151~153; it should list 6 nodes, including 31, 34 and 35
+----------------+---------+------+-------+--------+
| ip             | version | port | zone  | status |
+----------------+---------+------+-------+--------+
| 172.118.81.151 | 3.1.4   | 2881 | zone1 | active |
| 172.118.81.152 | 3.1.4   | 2881 | zone2 | active |
| 172.118.81.153 | 3.1.4   | 2881 | zone3 | active |
+----------------+---------+------+-------+--------+
obclient -h172.118.81.151 -P2881 -uroot -p'cloudadmin' -Doceanbase -A

+--------------------------------------------------+
|                     obproxy                      |
+----------------+------+-----------------+--------+
| ip             | port | prometheus_port | status |
+----------------+------+-----------------+--------+
| 172.118.81.151 | 2883 | 2884            | active |
| 172.118.81.156 | 2883 | 2884            | active |
+----------------+------+-----------------+--------+
obclient -h172.118.81.151 -P2883 -uroot -p'cloudadmin' -Doceanbase -A
obce-3zones running
Trace ID: b1bc2ad2-31c3-11ee-ae18-0050568d699e
If you want to view detailed obd logs, please run: obd display-trace b1bc2ad2-31c3-11ee-ae18-0050568d699e

PS: Because the machines have different hardware, I moved the datafile_size and log_disk_size settings into the per-server sections.

【 Steps to reproduce 】
【 Symptoms and impact 】

【 Attachments 】

The video "5.5 使用 OBD 扩容" (scaling out with OBD) at https://www.oceanbase.com/video/5900009 covers OBD scale-out. Is my problem caused by the obd or observer version I am using?

One moment, let me look into it and get back to you.

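The video leaves out some steps and does not explain the configuration file path. You can follow this document for the deployment; I just deployed successfully by following it: https://www.oceanbase.com/docs/community-observer-cn-10000000001879774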

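It works now, thanks @谐云. After running "obd cluster start obce-3zones", which only starts the new observer processes, you still need to connect to OB and run the following SQL statements to add them to the cluster: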

alter system add server '10.192.45.31:2882' zone 'zone1';
alter system add server '10.192.45.34:2882' zone 'zone2';
alter system add server '10.192.45.35:2882' zone 'zone3';
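
The three statements above follow one pattern: the new server's rpc address (ip:rpc_port) plus its zone. Below is a minimal sketch of generating them from a list, which may help when more nodes are added. `gen_add_server_sql` is a hypothetical helper and is not part of obd or obclient. It only prints the SQL, so the output can be reviewed before it is pasted into an obclient session.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of obd or obclient): prints one
# "alter system add server" statement per "ip:rpc_port zone" input line.
gen_add_server_sql() {
  local addr zone
  while read -r addr zone; do
    # Skip blank lines so stray empty lines in the input are harmless.
    [ -z "$addr" ] && continue
    printf "alter system add server '%s' zone '%s';\n" "$addr" "$zone"
  done
}

# The addresses and zones below are the ones used in this thread.
gen_add_server_sql <<'EOF'
10.192.45.31:2882 zone1
10.192.45.34:2882 zone2
10.192.45.35:2882 zone3
EOF
```

The port here is the rpc_port (2882) from the yaml, not the mysql_port (2881).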

After running the SQL, check the __all_server table:

obclient [oceanbase]> select svr_ip,id,zone,status from __all_server order by zone;
+----------------+----+-------+--------+
| svr_ip         | id | zone  | status |
+----------------+----+-------+--------+
| 10.192.45.31   |  4 | zone1 | active |
| 172.118.81.151 |  1 | zone1 | active |
| 10.192.45.34   |  5 | zone2 | active |
| 172.118.81.152 |  2 | zone2 | active |
| 10.192.45.35   |  6 | zone3 | active |
| 172.118.81.153 |  3 | zone3 | active |
+----------------+----+-------+--------+
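
The check above can also be scripted. This is a rough sketch, assuming a hypothetical helper `check_servers_active` that reads "svr_ip status" rows on stdin and fails if any expected IP is missing or not active. The obclient flags in the comment (-N, -B, -e) are assumed to behave as they do in the mysql client; they are not taken from this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads "svr_ip<whitespace>status" rows on stdin and
# returns non-zero if any expected IP (given as arguments) is missing or
# not in the "active" state.
check_servers_active() {
  local rows ip rc=0
  rows="$(cat)"
  for ip in "$@"; do
    if ! printf '%s\n' "$rows" |
      awk -v ip="$ip" '$1 == ip && $2 == "active" { found = 1 } END { exit !found }'; then
      echo "not active or missing: $ip" >&2
      rc=1
    fi
  done
  return "$rc"
}

# Example using the rows from the query result above. In practice the rows
# might come from something like (flags assumed to match the mysql client):
#   obclient -h172.118.81.151 -P2881 -uroot -p -Doceanbase -N -B \
#     -e "select svr_ip, status from __all_server"
check_servers_active 10.192.45.31 10.192.45.34 10.192.45.35 <<'EOF'
10.192.45.31 active
172.118.81.151 active
10.192.45.34 active
172.118.81.152 active
10.192.45.35 active
172.118.81.153 active
EOF
```

The function prints nothing on success and uses the exit status to signal the result, so it can be chained after the add-server step in a script.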