为啥__all_virtual_unit 和 GV$OB_SERVERS 查出来的结果不一样?

【 使用环境 】测试环境
【 OB or 其他组件 】observer
【 使用版本 】4.3.0
【问题描述】

obclient [oceanbase]> select svr_ip,SVR_port,cast(data_disk_in_use/1024/1024/1024 as DECIMAL(15,2)) from GV$OB_SERVERS;
+---------------+----------+--------------------------------------------------------+
| svr_ip        | SVR_port | cast(data_disk_in_use/1024/1024/1024 as DECIMAL(15,2)) |
+---------------+----------+--------------------------------------------------------+
| 172.16.130.38 |     2882 |                                                  13.50 |
| 172.16.130.37 |     2882 |                                                  13.27 |
+---------------+----------+--------------------------------------------------------+
2 rows in set (0.004 sec)


select a.svr_ip,a.svr_port,a.tenant_id,b.tenant_name, CAST(a.data_disk_in_use/1024/1024/1024 as DECIMAL(15,2)) data_disk_use_G, CAST(a.log_disk_size/1024/1024/1024 as DECIMAL(15,2)) log_disk_size, CAST(a.log_disk_in_use/1024/1024/1024 as DECIMAL(15,2)) log_disk_use_G, sum(CAST(a.data_disk_in_use/1024/1024/1024 as DECIMAL(15,2))) data_disk_in_use_G,1 from __all_virtual_unit a,dba_ob_tenants b where a.tenant_id=b.tenant_id and tenant_type='USER'  group by svr_ip,svr_port;
+---------------+----------+-----------+-------------+-----------------+---------------+----------------+--------------------+---+
| svr_ip        | svr_port | tenant_id | tenant_name | data_disk_use_G | log_disk_size | log_disk_use_G | data_disk_in_use_G | 1 |
+---------------+----------+-----------+-------------+-----------------+---------------+----------------+--------------------+---+
| 172.16.130.37 |     2882 |      1002 | pyhmy       |            8.48 |          5.40 |           3.14 |              16.15 | 1 |
| 172.16.130.38 |     2882 |      1002 | pyhmy       |            8.48 |          5.40 |           3.14 |              16.14 | 1 |
+---------------+----------+-----------+-------------+-----------------+---------------+----------------+--------------------+---+
2 rows in set (0.008 sec)

如题所示,通过GV$OB_SERVER 查出data_in_use 是13.50G,然后我通过__all_virtual_unit 把所有的租户数据相加就有16G,

就算加上meta 租户的数据后就有17G

obclient [oceanbase]> select SVR_IP,SVR_PORT,sum(cast(DATA_DISK_IN_USE/1024/1024/1024 as DECIMAL(15,2))) data_disk_in_use_G from GV$OB_UNITS group by SVR_IP,SVR_PORT;
+---------------+----------+--------------------+
| SVR_IP        | SVR_PORT | data_disk_in_use_G |
+---------------+----------+--------------------+
| 172.16.130.37 |     2882 |              17.50 |
| 172.16.130.38 |     2882 |              17.53 |
+---------------+----------+--------------------+
2 rows in set (0.003 sec)

这个对比是哪里不一样?有大佬知道的吗?

1 个赞

GV$OB_SERVER是动态视图,__all_virtual_unit是静态表

大佬,那这个有可能会对齐不?或者有什么规则会触发对齐操作的?收集统计信息可以不?

show create table GV$OB_SERVERS\G
*************************** 1. row ***************************
View: GV$OB_SERVERS
Create View: CREATE VIEW GV$OB_SERVERS AS SELECT SVR_IP, SVR_PORT, ZONE, SQL_PORT, CPU_CAPACITY, CPU_CAPACITY_MAX, CPU_ASSIGNED, CPU_ASSIGNED_MAX, MEM_CAPACITY, MEM_ASSIGNED, LOG_DISK_CAPACITY, LOG_DISK_ASSIGNED, LOG_DISK_IN_USE, DATA_DISK_CAPACITY, DATA_DISK_ASSIGNED, DATA_DISK_IN_USE, DATA_DISK_HEALTH_STATUS, MEMORY_LIMIT, DATA_DISK_ALLOCATED, (CASE WHEN data_disk_abnormal_time > 0 THEN usec_to_time(data_disk_abnormal_time) ELSE NULL END) AS DATA_DISK_ABNORMAL_TIME, (CASE WHEN ssl_cert_expired_time > 0 THEN usec_to_time(ssl_cert_expired_time) ELSE NULL END) AS SSL_CERT_EXPIRED_TIME FROM oceanbase.__all_virtual_server
character_set_client: utf8mb4
collation_connection: utf8mb4_general_ci
1 row in set (0.003 sec)

数据来源仍然是一个静态表啊 __all_virtual_server

obclient [oceanbase]> show create table GV$OB_UNITS\G
*************************** 1. row ***************************
View: GV$OB_UNITS
Create View: CREATE VIEW GV$OB_UNITS AS SELECT SVR_IP, SVR_PORT, UNIT_ID, TENANT_ID, ZONE, ZONE_TYPE, REGION, MAX_CPU, MIN_CPU, MEMORY_SIZE, MAX_IOPS, MIN_IOPS, IOPS_WEIGHT, MAX_NET_BANDWIDTH, NET_BANDWIDTH_WEIGHT, LOG_DISK_SIZE, LOG_DISK_IN_USE, DATA_DISK_SIZE, DATA_DISK_IN_USE, STATUS, usec_to_time(create_time) AS CREATE_TIME FROM oceanbase.__all_virtual_unit
character_set_client: utf8mb4
collation_connection: utf8mb4_general_ci
1 row in set (0.005 sec)

我对您的第三个图 17G的那个有点疑问 , 您分析下视图 数据来源都是表 __all_virtual_unit ,差别怎么那么大呢?? gv$ob_units;

您看到在仔细看下。

这个我是加了租户类型过滤,把meta 租户的数据过滤掉了,下面的GV$OB_UNITS磁盘使用量是加上租户的meta 数据的,所以__all_virtual_unit 和 gv$ob_units 之间有点差异,其实我比较好奇的是GV$OB_SERVERS 和 __all_virtual_unit 或者gv$ob_units 之间为啥有差异,按理来说租户的磁盘使用总和应该是等于集群server 的磁盘使用总和吧

__all_virtual_unit中去掉meta租户后求和确实比__all_virtual_server小点,我查看了这两个虚拟视图的源码,这里的data_disk_in_use字段没看出区别,我咨询下内核的老师,有进展回复你

好的好的,感谢

我这边查询出来的unit 表某个节点上的data_disk_in_use 总量是比 observer 视图查出来的data_disk_in_use 总量要大,不管是包括还是不包括meta信息,这个很奇怪,两个理解起来应该是对等的

内核这边分析代码给出结论:unit里的统计没有考虑宏块重用,如果一个宏块在多个sstable间重用了,会被统计多次,server查出来的更准确。

1 个赞

:+1:感谢大佬

对了,这块会有优化计划不?

还没有计划

好吧,多谢