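Single-replica, multi-node cluster: a newly created tenant can only be reached through one node; connecting through any other node fails with a "tenant does not exist" error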

  1. Installed the cluster with OBD: 1 zone, 4 nodes, maximum resource utilization mode.
    Process startup arguments: /opt/ob/myoceanbase6/oceanbase/bin/observer -r 10.1.1.82:26001:26000 -p 26000 -P 26001 -z zone1 -n myoceanbase6 -c 1710482739 -d /opt/ob/myoceanbase6/oceanbase/store -I 10.1.1.82 -o __min_full_resource_pool_memory=2147483648,datafile_size=60GB,log_disk_size=60GB,enable_syslog_recycle=True,memory_limit=128GB,cpu_count=16,enable_syslog_wf=False,max_syslog_file_count=4,system_memory=24G


  2. Created a tenant with OCP Express. The tenant information is as follows:

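  3. Connected to the nodes with obclient. Only the 156 node accepts the connection; the other nodes return an error. The error message is as follows: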


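This behavior is expected. The tenant you created has unit_num set to 1. If you want the business tenant to be present on all 4 observer nodes of a single zone, set the tenant's unit count to 4.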
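
As a rough sketch of how the unit count could be raised on an existing tenant, the statements below first look up the tenant's resource pool and then change its UNIT_NUM. The tenant name `tenant1` and pool name `pool_tenant1` are hypothetical placeholders, and the statements are assumed to be run from the sys tenant. Every node still needs enough free CPU, memory, and log disk to hold one unit.

```sql
-- Run as root@sys.
-- Step 1: find the resource pool that belongs to the tenant.
-- 'tenant1' is a hypothetical tenant name; replace it with your own.
SELECT p.name AS resource_pool_name, p.unit_count, p.zone_list, t.tenant_name
FROM oceanbase.DBA_OB_RESOURCE_POOLS p
JOIN oceanbase.DBA_OB_TENANTS t ON p.tenant_id = t.tenant_id
WHERE t.tenant_name = 'tenant1';

-- Step 2: raise the number of units per zone from 1 to 4.
-- 'pool_tenant1' is a hypothetical pool name taken from step 1.
ALTER RESOURCE POOL pool_tenant1 UNIT_NUM = 4;
```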

In other words, the other nodes hold no data for this tenant.

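Tenant routing requires that the OBServer you connect to holds data (a resource unit) for the tenant.
Even with a single replica on 4 nodes, you still need to deploy an OBProxy. The application connects to port 2883 of the OBProxy, and the OBProxy routes each request to the correct OBServer node.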
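
To see which OBServer nodes actually hold a unit for a tenant, and therefore which nodes accept a direct connection for it, a query along these lines could be used. It is a minimal sketch that assumes the sys tenant and a hypothetical tenant name `tenant1`.

```sql
-- Run as root@sys. 'tenant1' is a hypothetical tenant name.
-- svr_port here is the RPC port (26001 in this cluster),
-- not the SQL port that obclient connects to (26000).
SELECT t.tenant_name, u.unit_id, u.zone, u.svr_ip, u.svr_port, u.status
FROM oceanbase.DBA_OB_UNITS u
JOIN oceanbase.DBA_OB_TENANTS t ON u.tenant_id = t.tenant_id
WHERE t.tenant_name = 'tenant1'
ORDER BY u.svr_ip, u.svr_port;
```

A node that does not appear in the result has no unit for the tenant, so a direct connection to it reports that the tenant does not exist.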

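Each observer currently has 16 cores and 128G of memory in total, with a 60G log disk and a 60G data disk. There is 1 zone with 4 nodes. When a single unit is configured with 12 cores and 15G of memory, with 4 units in total, tenant creation fails with an insufficient-resource error, shown below. Could resources already allocated to other tenants cause this shortage?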

2024-03-18 06:38:39.123  INFO 3408 --- [pool-manual-subtask-executor14,6b1790fd781e4524,cb409c782051] c.o.o.t.engine.runner.JavaSubtaskRunner  : Run subtask, id=41, context=Context{parallelIdx=-1, stringMap={tenant_name=tenant4, ob_tenant_parameter_map=, prohibit_rollback=false, task_instance_id=13, task_operation=execute, whitelist=%, target_tenant_status=NORMAL, old_password=******, new_password=******, tenant_mode=MYSQL, system_variable_map=, create_tenant_param_json={"charset":"utf8mb4","collation":"utf8mb4_general_ci","mode":"MYSQL","name":"tenant4","parameters":[],"primaryZone":"zone1","rootPassword":"******","whitelist":"%","zones":[{"name":"zone1","replicaType":"FULL","resourcePool":{"unitCount":4,"unitSpec":{"cpuCore":12.00,"memoryBytes":16106127360,"memorySize":15}}}]}, latest_execution_start_time=2024-03-18T06:38:39.115Z, sub_task_instance_id=41}, listMap={}}, executor=10.1.1.82

2024-03-18 06:38:39.125  INFO 3408 --- [pool-manual-subtask-executor14,6b1790fd781e4524,cb409c782051] c.o.o.o.i.tenant.task.CreateTenantTask   : begin create tenant, param=CreateTenantParam(name=tenant4, mode=MYSQL, primaryZone=zone1, charset=utf8mb4, collation=utf8mb4_general_ci, description=null, whitelist=%, timeZone=null, rootPassword=******, zones=[CreateTenantParam.ZoneParam(name=zone1, replicaType=FULL, resourcePool=CreateTenantParam.PoolParam(unitSpec=UnitSpecParam(cpuCore=12.0, memorySize=15), unitCount=4))], parameters=[])

2024-03-18 06:38:39.205  WARN 3408 --- [pool-manual-subtask-executor14,6b1790fd781e4524,cb409c782051] c.o.o.task.engine.runner.RunnerFactory   : Execute task failed, subtask=SubtaskInstanceOverview{id=41, name=Create ob tenant, state=FAILED, operation=EXECUTE, className=com.oceanbase.ocp.obops.internal.tenant.task.CreateTenantTask, seriesId=1, startTime=2024-03-18T06:38:39.116Z, endTime=null}

com.oceanbase.ocp.obsdk.exception.OceanBaseException: (conn=3221518068) zone 'zone1' resource not enough to hold 4 unit. You can check resource info by views: DBA_OB_UNITS, GV$OB_UNITS, GV$OB_SERVERS.
server '"10.1.1.156:26001"' CPU resource not enough
server '"10.1.1.82:26001"' LOG_DISK resource not enough
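
The error message points at DBA_OB_UNITS. One way to check whether units already allocated to other tenants are consuming the CPU and log disk on the servers it names is to list every unit per server, as in this sketch (assumed to be run from the sys tenant, where the view covers all tenants):

```sql
-- Run as root@sys. Lists every unit on every server with the
-- CPU, memory, and log disk it reserves, so that the tenants
-- occupying 10.1.1.156 and 10.1.1.82 become visible.
SELECT u.svr_ip, u.svr_port, t.tenant_name,
       u.max_cpu, u.min_cpu,
       ROUND(u.memory_size/1024/1024/1024, 2)   AS memory_gb,
       ROUND(u.log_disk_size/1024/1024/1024, 2) AS log_disk_gb
FROM oceanbase.DBA_OB_UNITS u
LEFT JOIN oceanbase.DBA_OB_TENANTS t ON u.tenant_id = t.tenant_id
ORDER BY u.svr_ip, u.svr_port, t.tenant_name;
```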

After reinstalling the cluster, creating a tenant with 4 units (12 cores and 12G per unit) works, and that tenant can be reached through all 4 nodes.
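
One possible contributor to the LOG_DISK message, offered as an assumption rather than something confirmed in this thread: when LOG_DISK_SIZE is omitted from a unit config, the OceanBase 4.x documentation gives the default as three times MEMORY_SIZE. Under that assumption a 15G unit would reserve 45G of the 60G log disk, while a 12G unit reserves 36G. A minimal sketch of setting the value explicitly, with a hypothetical unit name and an illustrative 30G figure:

```sql
-- Run as root@sys. 'unit_12c15g' and the 30G value are hypothetical;
-- choose a value that fits the free log disk reported by GV$OB_SERVERS.
CREATE RESOURCE UNIT unit_12c15g
  MAX_CPU 12, MIN_CPU 12,
  MEMORY_SIZE '15G',
  LOG_DISK_SIZE '30G';
```

This would not help with the CPU shortage reported for 10.1.1.156, which depends on how many cores are already assigned on that node.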

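The way to analyze an "insufficient resources" problem is to ask:

  • How much resource do the cluster nodes have available in total?
  • How much resource are the cluster and its tenants actually using right now?

On version 4.2 you can run the following SQL: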

select zone,concat(SVR_IP,':',SVR_PORT) observer,
	cpu_capacity_max cpu_total,cpu_assigned_max cpu_assigned,
	cpu_capacity_max-cpu_assigned_max as cpu_free,
	round(memory_limit/1024/1024/1024,2) as memory_total,
	round((memory_limit-mem_capacity)/1024/1024/1024,2) as system_memory,
	round(mem_assigned/1024/1024/1024,2) as mem_assigned,
	round((mem_capacity-mem_assigned)/1024/1024/1024,2) as memory_free,
	round(log_disk_capacity/1024/1024/1024,2) as log_disk_capacity,
	round(log_disk_assigned/1024/1024/1024,2) as log_disk_assigned,
	round((log_disk_capacity-log_disk_assigned)/1024/1024/1024,2) as log_disk_free,
	round((data_disk_capacity/1024/1024/1024),2) as data_disk,
	round((data_disk_in_use/1024/1024/1024),2) as data_disk_used,
	round((data_disk_capacity-data_disk_in_use)/1024/1024/1024,2) as data_disk_free
from gv$ob_servers;

select t1.name resource_pool_name, t2.`name` unit_config_name, 
	t2.max_cpu, t2.min_cpu, 
	round(t2.memory_size/1024/1024/1024,2) mem_size_gb,
	round(t2.log_disk_size/1024/1024/1024,2) log_disk_size_gb, t2.max_iops, 
	t2.min_iops, t3.unit_id, t3.zone, concat(t3.svr_ip,':',t3.`svr_port`) observer,
	t4.tenant_id, t4.tenant_name
from __all_resource_pool t1
	join __all_unit_config t2 on (t1.unit_config_id=t2.unit_config_id)
	join __all_unit t3 on (t1.`resource_pool_id` = t3.`resource_pool_id`)
	left join __all_tenant t4 on (t1.tenant_id=t4.tenant_id)
order by t1.`resource_pool_id`, t2.`unit_config_id`, t3.unit_id;

OK, thank you very much.