OB 租户扩容副本数

本节主要分享 OB 3.2 下的租户副本扩容运维过程,这依然是 OB 弹性伸缩最有趣的功能。

问题现象

租户的扩容既可以通过扩大副本的资源单元规格,也可以通过扩大副本数来实现。
这里目的是将租户从单副本扩容到三副本。

默认的租户资源规格的内存都比较大。 需要自定义一个小内存的资源规格。

但是 OCP 这个页面显然是有问题的,新增过的资源单元规格没有展示出来。
反复重试多次都是这样。

问题解决

直接采取在命令行下手动对租户扩容。

  • 查看集群可用资源
select a.zone,concat(a.svr_ip,':',a.svr_port) observer, cpu_total, (cpu_total-cpu_assigned) cpu_free
, round(mem_total/1024/1024/1024) mem_total_gb, round((mem_total-mem_assigned)/1024/1024/1024,2) mem_free_gb
, round(disk_total/1024/1024/1024) disk_total_gb, round((disk_total-disk_assigned)/1024/1024/1024) disk_free_gb
from __all_virtual_server_stat a join __all_server b on (a.svr_ip=b.svr_ip and a.svr_port=b.svr_port)
order by a.zone, a.svr_ip
;
zone observer cpu_total cpu_free mem_total_gb mem_free_gb disk_total_gb disk_free_gb
zone1 10.0.0.61:2882 14 2 8 0 100 -1,000
ZONE2 10.0.0.62:2882 14 14 8 8 100 100
ZONE3 10.0.0.63:2882 14 14 8 8 100 100
  • 查看现有租户资源分布。
select t1.name resource_pool_name,  t2.`name` unit_config_name, t2.max_cpu, t2.min_cpu
, round(t2.max_memory/1024/1024/1024) max_mem_gb, round(t2.min_memory/1024/1024/1024) min_mem_gb
, round(t2.max_disk_size/1024/1024/1024) max_disk_size , t4.tenant_name
from __all_resource_pool t1 join __all_unit_config t2 on (t1.unit_config_id=t2.unit_config_id)
    join __all_unit t3 on (t1.`resource_pool_id` = t3.`resource_pool_id`)
    left join __all_tenant t4 on (t1.tenant_id=t4.tenant_id)
order by t1.`resource_pool_id`, t2.`unit_config_id`, t3.unit_id
;
resource_pool_name unit_config_name max_cpu min_cpu max_mem_gb min_mem_gb max_disk_size tenant_name
sys_pool sys_unit_config 5 5 4 4 100 sys
pool_oboracle_zone1_neq config_oboracle_zone1_S1_neq 4 4 1 1 500 oboracle
pool_obmysql_zone1_uvw config_obmysql_zone1_S1_uvw 3 3 3 3 500 obmysql
  • 在新节点上创建资源单元
create resource unit unit_3c3g max_cpu=3,min_cpu=3,max_memory='3G',min_memory='3G',max_iops=10000,min_iops=1000,max_session_num=100000,max_disk_size='200G';
create resource pool pool_obmysql_zone2 unit='unit_3c3g' , unit_num=1, zone_list=('ZONE2');
create resource pool pool_obmysql_zone3 unit='unit_3c3g' , unit_num=1, zone_list=('ZONE3');
  • 租户 obmysql 扩大资源池

注意,租户资源池的扩容必须一个个的加。

ALTER tenant obmysql resource_pool_list=('pool_obmysql_zone1_uvw','pool_obmysql_zone2');
ALTER tenant obmysql resource_pool_list=('pool_obmysql_zone1_uvw','pool_obmysql_zone2','pool_obmysql_zone3');

查看集群最近的变更事件。

select gmt_create , module , event, name1 , value1 , name2, value2, name3, value3 , rs_svr_ip 
from `__all_rootservice_event_history` areh 
WHERE 1=1
order by gmt_create  desc limit 10;
gmt_create module event name1 value1 name2 value2 name3 value3 rs_svr_ip
2023-06-08 13:10:11.162 root_service create_resource_pool ret 0 arg pool_name:pool_obmysql_zone3, unit:unit_3c3g, unit_num:1, zone_list:[ZONE3], replica_type:0, if_not_exist:false, is_tenant_sys_pool:false 10.0.0.61
2023-06-08 13:10:11.162 unit create_resource_pool name pool_obmysql_zone3 unit unit_3c3g zone_list [ZONE3] 10.0.0.61
2023-06-08 13:10:11.161 unit create_unit unit_id 1005 server 10.0.0.63:2882 10.0.0.61
2023-06-08 13:10:08.455 root_service create_resource_pool ret 0 arg pool_name:pool_obmysql_zone2, unit:unit_3c3g, unit_num:1, zone_list:[ZONE2], replica_type:0, if_not_exist:false, is_tenant_sys_pool:false 10.0.0.61
2023-06-08 13:10:08.455 unit create_resource_pool name pool_obmysql_zone2 unit unit_3c3g zone_list [ZONE2] 10.0.0.61
2023-06-08 13:10:08.453 unit create_unit unit_id 1004 server 10.0.0.62:2882 10.0.0.61
2023-06-08 13:08:47.096 root_service create_resource_unit ret 0 arg unit_name:unit_3c3g, min_cpu:3.000000000000000000e+00, min_iops:1000, min_memory:3221225472, max_cpu:3.000000000000000000e+00, max_memory:3221225472, max_iops:10000, max_disk_size:214748364800, max_session_num:100000, if_not_exist:false 10.0.0.61
2023-06-08 13:08:47.095 unit create_resource_unit name unit_3c3g 10.0.0.61
2023-06-08 12:53:07.465 root_service alter_resource_unit ret 0 arg unit_name:config_obmysql_zone1_S1_uvw, min_cpu:3.000000000000000000e+00, min_iops:0, min_memory:0, max_cpu:3.000000000000000000e+00, max_memory:0, max_iops:0, max_disk_size:0, max_session_num:0 10.0.0.61
2023-06-08 12:53:07.465 unit alter_resource_unit name config_obmysql_zone1_S1_uvw old_config {unit_config_id:1002, name:config_obmysql_zone1_S1_uvw, max_cpu:3.000000000000000000e+00, min_cpu:3.000000000000000000e+00, max_memory:3221225472, min_memory:3221225472, max_disk_size:536870912000, max_iops:1250, min_iops:1250, max_session_num:375} new_config {unit_config_id:1002, name:config_obmysql_zone1_S1_uvw, max_cpu:3.000000000000000000e+00, min_cpu:3.000000000000000000e+00, max_memory:3221225472, min_memory:3221225472, max_disk_size:536870912000, max_iops:1250, min_iops:1250, max_session_num:375} 10.0.0.61

查看 OCP 里租户资源池的变化。

此时还只是租户资源池发生变化,租户的数据分布并没有变,副本数也没有变。

  • 扩容租户的副本数
ALTER tenant obmysql locality='FULL{1}@zone1, FULL{1}@ZONE2';

此时,租户开始从单副本扩容到 2 副本,查看集群最近的事件,可以看到有很多的增加分区的事件(start_add_replicafinish_add_replica

只有扩容完毕后才能继续扩容到 3 副本。否则会报错:SQL 错误 [4179] [HY000]: alter tenant locality when previous operation is in progress not allowed

查看扩容结果。

SELECT job_id, gmt_create ,gmt_modified ,job_type ,job_status ,return_code ,progress ,tenant_id ,tenant_name ,rs_svr_ip ,sql_text  FROM `__all_rootservice_job` arj ;
job_id gmt_create gmt_modified job_type job_status return_code progress tenant_id tenant_name rs_svr_ip sql_text
1 2023-06-08 13:19:07.838 2023-06-08 13:22:02.912 ALTER_TENANT_LOCALITY SUCCESS 0 100 1,002 obmysql 10.0.0.61 /* ApplicationName=DBeaver 23.1.0 - SQLEditor <Script-3.sql> */ ALTER tenant obmysql locality=‘FULL{1}@zone1, FULL{1}@ZONE2
  • 继续从 2副本 扩容到 3副本。
ALTER tenant obmysql locality='FULL{1}@zone1, FULL{1}@ZONE2, FULL{1}@ZONE3';

报错了:SQL 错误 [1235] [0A000]: tenant primary zone span regions when GTS is on not supported
这个原因是默认建的租户 obmysql 的 PRIMARY_ZONE 是 RANDOM 。设置为 ZONE1 即可。

ALTER tenant obmysql PRIMARY_ZONE 'zone1';

继续执行,成功。

job_id gmt_create gmt_modified job_type job_status return_code progress tenant_id tenant_name rs_svr_ip sql_text
1 2023-06-08 13:19:07.838 2023-06-08 13:22:02.912 ALTER_TENANT_LOCALITY SUCCESS 0 100 1,002 obmysql 10.0.0.61 /* ApplicationName=DBeaver 23.1.0 - SQLEditor <Script-3.sql> */ ALTER tenant obmysql locality=‘FULL{1}@zone1, FULL{1}@ZONE2
2 2023-06-08 13:32:07.008 2023-06-08 13:35:14.477 ALTER_TENANT_LOCALITY SUCCESS 0 100 1,002 obmysql 10.0.0.61 /* ApplicationName=DBeaver 23.1.0 - SQLEditor <Script-3.sql> */ ALTER tenant obmysql locality=‘FULL{1}@zone1, FULL{1}@ZONE2, FULL{1}@ZONE3

  • 查看租户表分区分布
SELECT 
	t2.zone, ROLE,
	concat(t2.svr_ip, ':', t2.svr_port) observer,
	count(*) cnt, round(sum(t2.data_size) / 1024 / 1024) data_size_mb
FROM
	gv$table t1
	JOIN gv$partition t2 ON	(t1.tenant_id = t2.tenant_id AND t1.table_id = t2.table_id)
	LEFT JOIN __all_virtual_tablegroup t3 ON	(t1.tenant_id = t3.tenant_id	AND t1.tablegroup_id = t3.tablegroup_id)
 	JOIN gv$database t4 ON	(t1.tenant_Id = t4.tenant_id	AND t1.database_id = t4.database_id)
 	JOIN gv$tenant t5 ON	(t1.tenant_id = t5.tenant_id)
WHERE
	t5.tenant_id = 1002
	AND t4.database_name = 'TPCCDB'
GROUP BY ZONE, observer, role
ORDER BY ZONE, observer;
zone ROLE observer cnt data_size_mb
zone1 1 10.0.0.61:2882 199 4,121
ZONE2 2 10.0.0.62:2882 199 4,121
ZONE3 2 10.0.0.63:2882 199 4,121

调整一下租户 primary zone 的分布策略,再看看分区的分布。

zone ROLE observer cnt data_size_mb
zone1 1 10.0.0.61:2882 96 305
zone1 2 10.0.0.61:2882 103 3,815
ZONE2 1 10.0.0.62:2882 103 3,815
ZONE2 2 10.0.0.62:2882 96 305
ZONE3 2 10.0.0.63:2882 199 4,121

总结

租户扩容副本(单副本到三副本)主要分为租户资源池的扩容以及租户副本 locality 的扩容。租户内的表自动从单副本扩容到三副本。
如果是从三副本扩容到五副本,还需要对显示的将业务表的副本数从三副本扩容到五副本。详情可以参考: OceanBase 独立部署高级玩法三:副本数调整 (qq.com)

冷知识: 登录metadb 修改 config_properties里的ocp.operation.ob.tenant.allow-small-unit 为 ‘true’ 三分钟后就能展示新建的小规格unit。

2 个赞

验证了一下,确实如此。 :+1:

UPDATE ocp.config_properties SET value = 'true' WHERE `KEY` = 'ocp.operation.ob.tenant.allow-small-unit';

SELECT `KEY`, value, application ,profile ,description ,update_time  FROM ocp.config_properties cp 
WHERE `KEY` = 'ocp.operation.ob.tenant.allow-small-unit';
COMMIT;
  1. 修改 ocp 元数据库里配置信息
KEY value application profile description update_time
ocp.operation.ob.tenant.allow-small-unit true ocp default OB的版本大于等于2.0.0时,创建租户是否允许使用小Unit(小Unit指的是资源规格小于5G的Unit),默认为false,在特殊情况下,比如demo演示等非生产环境,可以将其改为true 2023-06-08 15:44:13.000

2. 重启 OCP 容器

docker restart ocp

  1. 新建租户,选择租户资源规格,看到了之前创建的小资源规格。

2 个赞