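Problem scaling out nodes with obd

[Product name] OceanBase Community Edition

[Product version] ob-deploy-1.2.1, oceanbase-ce-3.1.2

[Problem description] Scaling the cluster out from 1-1-1 to 2-2-2 with obd fails.

The cluster obce-3zones was deployed with obd on three servers in 1-1-1 mode. I am trying to scale it out to 2-2-2 with obd on the same three servers, starting the additional observer processes on different ports.

1. Edit the configuration file to add the new observers, each with its own directories and ports.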

[admin@obd soft]$ obd cluster deploy obce-3zones-2 -c obce-3zone-addserver.yaml

Update OceanBase-community-stable-el7 ok

Update OceanBase-development-kit-el7 ok

oceanbase-ce-3.1.2 already installed.

+-------------------------------------------------------------------------------------------+
|                                         Packages                                          |
+--------------+---------+-----------------------+------------------------------------------+
| Repository   | Version | Release               | Md5                                      |
+--------------+---------+-----------------------+------------------------------------------+
| oceanbase-ce | 3.1.2   | 10000392021123010.el7 | 7fafba0fac1e90cbd1b5b7ae5fa129b64dc63aed |
+--------------+---------+-----------------------+------------------------------------------+

Repository integrity check ok

Parameter check ok

Open ssh connection ok

Remote oceanbase-ce-3.1.2-7fafba0fac1e90cbd1b5b7ae5fa129b64dc63aed repository install ok

Remote oceanbase-ce-3.1.2-7fafba0fac1e90cbd1b5b7ae5fa129b64dc63aed repository lib check ok

Cluster status check ok

Initializes observer work home ok

obce-3zones-2 deployed

[admin@obd soft]$ obd cluster list

+--------------------------------------------------------------------------+
|                               Cluster List                               |
+---------------+----------------------------------------+-----------------+
| Name          | Configuration Path                     | Status (Cached) |
+---------------+----------------------------------------+-----------------+
| obce-3zones   | /home/admin/.obd/cluster/obce-3zones   | running         |
| obagent-only  | /home/admin/.obd/cluster/obagent-only  | running         |
| obce-3zones-2 | /home/admin/.obd/cluster/obce-3zones-2 | deployed        |
+---------------+----------------------------------------+-----------------+

2. Add the newly deployed observers to the existing cluster

Edit the configuration under /home/admin/.obd/cluster/obce-3zones:

vi config.yaml

Restart the cluster:

[admin@obd ~]$ obd cluster restart obce-3zones

Get local repositories and plugins ok

Open ssh connection ok

Stop observer ok

Stop obproxy ok

obce-3zones stopped

Get local repositories and plugins ok

Open ssh connection ok

Load cluster param plugin ok

Check before start observer ok

Check before start obproxy ok

Start observer ok

observer program health check ok

Connect to observer ok

Wait for observer init ok

+--------------------------------------------------+
|                     observer                     |
+----------------+---------+------+-------+--------+
| ip             | version | port | zone  | status |
+----------------+---------+------+-------+--------+
| 172.18.153.211 | 3.1.2   | 2881 | zone1 | active |
| 172.18.153.212 | 3.1.2   | 2881 | zone2 | active |
| 172.18.153.213 | 3.1.2   | 2881 | zone3 | active |
+----------------+---------+------+-------+--------+

Start obproxy ok

obproxy program health check ok

Connect to obproxy ok

+--------------------------------------------------+
|                     obproxy                      |
+----------------+------+-----------------+--------+
| ip             | port | prometheus_port | status |
+----------------+------+-----------------+--------+
| 172.18.153.211 | 2883 | 2884            | active |
| 172.18.153.212 | 2883 | 2884            | active |
| 172.18.153.213 | 2883 | 2884            | active |
+----------------+------+-----------------+--------+

obce-3zones running

observer process information:

[root@observer01 data]# ps -ef|grep obser

admin 12811 1 98 16:15 ? 00:05:09 /home/admin/oceanbase-ce/bin/observer -r 172.18.153.211:2882:2881;172.18.153.212:2882:2881;172.18.153.213:2882:2881 -o __min_full_resource_pool_memory=268435456,memory_limit=8G,system_memory=3G,stack_size=512K,cpu_count=16,cache_wash_threshold=1G,workers_per_cpu_quota=10,schema_history_expire_time=1d,net_thread_count=4,major_freeze_duty_time=Disable,minor_freeze_times=10,enable_separate_sys_clog=0,enable_merge_by_turn=False,datafile_size=20G,enable_syslog_wf=False,enable_syslog_recycle=True,max_syslog_file_count=10 -z zone1 -p 2881 -P 2882 -n obce-3zones -c 2 -d /data/ob01 -i ens192 -l error

admin 12955 1 53 16:15 ? 00:02:48 /home/admin/oceanbase-ce02/bin/observer -r 172.18.153.211:2882:2881;172.18.153.212:2882:2881;172.18.153.213:2882:2881 -o __min_full_resource_pool_memory=268435456,memory_limit=8G,system_memory=3G,stack_size=512K,cpu_count=16,cache_wash_threshold=1G,workers_per_cpu_quota=10,schema_history_expire_time=1d,net_thread_count=4,major_freeze_duty_time=Disable,minor_freeze_times=10,enable_separate_sys_clog=0,enable_merge_by_turn=False,datafile_size=20G,enable_syslog_wf=False,enable_syslog_recycle=True,max_syslog_file_count=10 -z zone1 -p 3881 -P 3882 -n obce-3zones -c 2 -d /data/ob04 -i ens192 -l error

root 14887 24460 0 16:20 pts/0 00:00:00 grep --color=auto obser

[root@observer01 data]# netstat -ntlp

Active Internet connections (only servers)

Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name

tcp 0 0 0.0.0.0:3881 0.0.0.0:* LISTEN 12955/observer

tcp 0 0 0.0.0.0:3882 0.0.0.0:* LISTEN 12955/observer

tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 6915/sshd

tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN 7441/master

tcp 0 0 0.0.0.0:2881 0.0.0.0:* LISTEN 12811/observer

tcp 0 0 0.0.0.0:2882 0.0.0.0:* LISTEN 12811/observer

tcp 0 0 0.0.0.0:2883 0.0.0.0:* LISTEN 14229/obproxy

tcp 0 0 0.0.0.0:2884 0.0.0.0:* LISTEN 14229/obproxy

tcp6 0 0 :::22 :::* LISTEN 6915/sshd

tcp6 0 0 :::8088 :::* LISTEN 9636/monagent

tcp6 0 0 ::1:25 :::* LISTEN 7441/master

tcp6 0 0 :::8089 :::* LISTEN 9636/monagent
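
The netstat listing above can also be checked mechanically. The following is only a minimal sketch: check_listening is a hypothetical helper name, it assumes the column layout of `netstat -ntl` shown above (local address in column 4, state in column 6), and the port numbers in the usage comment are the ones that appear in this post.

```shell
# Sketch: report which of the expected ports show up as LISTEN.
# Reads `netstat -ntl` style output on stdin; ports are given as arguments.
# check_listening is a hypothetical helper, not part of obd or OceanBase.
check_listening() {
  local out port
  out="$(cat)"
  for port in "$@"; do
    # Column 4 is the local address; the port is the part after the last ':'.
    if printf '%s\n' "$out" | awk -v p="$port" '
        $6 == "LISTEN" { n = split($4, a, ":"); if (a[n] == p) found = 1 }
        END { exit !found }'; then
      echo "$port listening"
    else
      echo "$port NOT listening"
    fi
  done
}

# Usage on an observer host (ports taken from this post):
#   netstat -ntl | check_listening 2881 2882 3881 3882
```

A listening port only shows that the process is up. It does not show that the observer has joined the cluster.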

Attachment: oceanbase-yaml.zip (3451 KB)

MySQL [oceanbase]> select svr_ip,svr_port,id,zone,status from __all_server;

+----------------+----------+----+-------+--------+
| svr_ip         | svr_port | id | zone  | status |
+----------------+----------+----+-------+--------+
| 172.18.153.211 |     2882 |  1 | zone1 | active |
| 172.18.153.212 |     2882 |  2 | zone2 | active |
| 172.18.153.213 |     2882 |  3 | zone3 | active |
+----------------+----------+----+-------+--------+

3 rows in set (0.002 sec)


MySQL [oceanbase]> alter system add server '172.18.152.211:3882' zone 'zone1';

ERROR 4012 (HY000): Timeout

MySQL [oceanbase]> 

MySQL [oceanbase]> alter system add server '172.18.153.211:3882' zone 'zone1';

ERROR 4179 (HY000): add non-empty server not allowed

MySQL [oceanbase]> alter system add server '172.18.153.212:3882' zone 'zone2';

ERROR 4179 (HY000): add non-empty server not allowed

After logging in to the cluster, I cannot add the new observers to their zones. Why?

alter system add server 'ip:rpc_port' zone 'zone_name';

The port is wrong: you used the mysql_port.
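
To keep the two ports apart, the statement can be generated from named values. This is a minimal sketch: add_server_sql is a hypothetical helper name, and the usage comment assumes that obclient reads SQL from standard input the way the mysql client does.

```shell
# Sketch: build the ALTER SYSTEM statement from ip, rpc_port and zone.
# add_server_sql is a hypothetical helper, not part of obd or obclient.
add_server_sql() {
  # $1: observer IP, $2: rpc_port (the -P value of the observer process), $3: zone name
  printf "alter system add server '%s:%s' zone '%s';\n" "$1" "$2" "$3"
}

# Usage (IP, ports and zone are the ones from this post):
#   add_server_sql 172.18.153.211 3882 zone1 \
#     | obclient -h172.18.153.211 -P2881 -uroot@sys -p -A oceanbase
```

In the ps output above the second observer runs with `-p 3881 -P 3882`, so 3881 is its mysql_port and 3882 is its rpc_port.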

MySQL [oceanbase]> alter system add server '172.18.153.212:3881' zone 'zone2';

ERROR 4012 (HY000): Timeout


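I redeployed everything from scratch because the previous directories looked wrong, but it still fails. The new YAML configuration is attached.

Updated YAML configuration: config.zip (1869 KB)

Also, the rpc_port should indeed be 3882.

Please collect the logs on the RS (RootService) node.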

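Use

select * from __all_server where with_rootserver = 1;

to find the RS.

Use grep 'alter system add server ' to locate the statement in the log and extract its trace ID, then use the trace ID to pull all log lines for that SQL statement.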
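
For readers new to the log layout, the grep steps above can be sketched in shell. This is a rough sketch under several assumptions, not an official procedure. The log directory /home/admin/oceanbase-ce/log is inferred from the observer path in the ps output. The file names rootservice.log and observer.log are assumed. The trace-ID pattern (an uppercase Y followed by two hexadecimal groups joined by a hyphen) is also assumed. The helper names find_trace_ids and lines_for_trace are made up for illustration.

```shell
# Sketch: find the trace ID of a statement in the RS logs, then pull every
# log line that carries that ID. LOG_DIR and the trace-ID pattern are
# assumptions; adjust them to your deployment.
LOG_DIR="${LOG_DIR:-/home/admin/oceanbase-ce/log}"

# Print the distinct trace IDs found on log lines that mention the given text.
# $1: text to search for (case-insensitive, literal); remaining args: log files
find_trace_ids() {
  local needle="$1"
  shift
  grep -h -i -F -- "$needle" "$@" | grep -o -E 'Y[0-9A-F]+-[0-9A-F]+' | sort -u
}

# Print every log line that carries the given trace ID.
# $1: trace ID; remaining args: log files
lines_for_trace() {
  local trace_id="$1"
  shift
  grep -h -F -- "$trace_id" "$@"
}

# Usage on the RS node (file names are assumptions):
#   find_trace_ids 'alter system add server' "$LOG_DIR"/rootservice.log* "$LOG_DIR"/observer.log*
#   lines_for_trace '<trace id printed above>' "$LOG_DIR"/rootservice.log* "$LOG_DIR"/observer.log*
```

If the first command prints several IDs, each one most likely corresponds to one attempt of the statement. Run the second command once per ID.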

MySQL [oceanbase]> select * from __all_server where with_rootserver = 1;

+----------------------------+----------------------------+----------------+----------+----+-------+------------+-----------------+--------+-----------------------+----------------------------------------------------------------------------------------+-----------+--------------------+--------------+----------------+-------------------+

| gmt_create         | gmt_modified        | svr_ip     | svr_port | id | zone | inner_port | with_rootserver | status | block_migrate_in_time | build_version                                     | stop_time | start_service_time | first_sessid | with_partition | last_offline_time |

+----------------------------+----------------------------+----------------+----------+----+-------+------------+-----------------+--------+-----------------------+----------------------------------------------------------------------------------------+-----------+--------------------+--------------+----------------+-------------------+

| 2022-02-08 17:25:04.095346 | 2022-02-08 18:01:35.941497 | 172.18.153.211 |   2882 | 1 | zone1 |    2881 |        1 | active |           0 | 3.1.2_10000392021123010-d4ace121deae5b81d8f0b40afbc4c02705b7fc1d(Dec 30 2021 02:47:29) |     0 |  1644314487457117 |      0 |       1 |         0 |

+----------------------------+----------------------------+----------------+----------+----+-------+------------+-----------------+--------+-----------------------+----------------------------------------------------------------------------------------+-----------+--------------------+--------------+----------------+-------------------+

1 row in set (0.005 sec)

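Could you be more specific? I am new to OceanBase and not familiar with its logging system. How do I view the logs? Thanks.

When I connect directly to the observer on port 3881, it reports that the server is still initializing: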

[admin@observer01 log]$ obclient -h172.18.153.211 -uroot@sys -P3881 -p0EI5N08d -c -A oceanbase

ERROR 8001 (08004): Server is initializing