Memory is never released

[Environment] Production or test environment
[OB or other component]
observer.log (18.7 MB)

[Version] 4.3.4
[Problem description]
Memory is not being released.


When you hit a problem and have no leads, first pull an inspection report with the obdiag agile diagnostic tool: https://www.oceanbase.com/docs/common-obdiag-cn-1000000002968718


What does `free -g` show?


https://open.oceanbase.com/blog/19735872560


obdiag_check_report_observer_2025-06-04-10-18-05.txt (55.8 KB)


Here is the diagnostic report, please help take a look.


Please help check whether anything is wrong here; the memory usage never seems to drop:
+-----------+---------------+-------------------------------+----------------+----------------+
| TENANT_ID | SVR_IP | CTX_NAME | HOLD_GB | USED_GB |
+-----------+---------------+-------------------------------+----------------+----------------+
| 1004 | 192.168.0.175 | DEFAULT_CTX_ID | 8.049874112010 | 8.028451883233 |
| 1004 | 192.168.0.175 | DO_NOT_USE_ME | 0.000000000000 | 0.000000000000 |
| 1004 | 192.168.0.175 | MEMSTORE_CTX_ID | 0.133712768555 | 0.133651591837 |
| 1004 | 192.168.0.175 | EXECUTE_CTX_ID | 0.000000000000 | 0.000000000000 |
| 1004 | 192.168.0.175 | TRANS_CTX_MGR_ID | 0.001358032227 | 0.001347303390 |
| 1004 | 192.168.0.175 | PLAN_CACHE_CTX_ID | 0.313062489032 | 0.292100946418 |
| 1004 | 192.168.0.175 | WORK_AREA | 0.000000000000 | 0.000000000000 |
| 1004 | 192.168.0.175 | GLIBC | 0.000000000000 | 0.000000000000 |
| 1004 | 192.168.0.175 | CO_STACK | 0.106224060059 | 0.106026470661 |
| 1004 | 192.168.0.175 | LIBEASY | 0.000000000000 | 0.000000000000 |
| 1004 | 192.168.0.175 | LOGGER_CTX_ID | 0.000000000000 | 0.000000000000 |
| 1004 | 192.168.0.175 | KVSTORE_CACHE_ID | 0.044921875000 | 0.000000000000 |
| 1004 | 192.168.0.175 | META_OBJ_CTX_ID | 0.150337219238 | 0.135130271315 |
| 1004 | 192.168.0.175 | TX_CALLBACK_CTX_ID | 0.015147149563 | 0.014789342880 |
| 1004 | 192.168.0.175 | LOB_CTX_ID | 0.000007569789 | 0.000007390976 |
| 1004 | 192.168.0.175 | PS_CACHE_CTX_ID | 0.001343593001 | 0.001154348254 |
| 1004 | 192.168.0.175 | RPC_CTX_ID | 0.000000000000 | 0.000000000000 |
| 1004 | 192.168.0.175 | PKT_NIO | 0.000000000000 | 0.000000000000 |
| 1004 | 192.168.0.175 | TX_DATA_TABLE | 0.005609691143 | 0.005539774895 |
| 1004 | 192.168.0.175 | STORAGE_LONG_TERM_META_CTX_ID | 0.000000000000 | 0.000000000000 |
| 1004 | 192.168.0.175 | MDS_DATA_ID | 0.010945916176 | 0.010687351227 |
| 1004 | 192.168.0.175 | MDS_CTX_ID | 0.000000000000 | 0.000000000000 |
| 1004 | 192.168.0.175 | SCHEMA_SERVICE | 0.000000000000 | 0.000000000000 |
| 1004 | 192.168.0.175 | UNEXPECTED_IN_500 | 0.000000000000 | 0.000000000000 |
| 1004 | 192.168.0.175 | MERGE_RESERVE_CTX_ID | 0.000000000000 | 0.000000000000 |
| 1004 | 192.168.0.175 | MERGE_NORMAL_CTX_ID | 0.000000000000 | 0.000000000000 |
+-----------+---------------+-------------------------------+----------------+----------------+


That is basically resource usage from normal database operation; you can run the second SQL and check. The observer process holding 20% to 30% of total memory is not high, and the rest of the system memory is currently being used as OS-level cache.



I don't see anything abnormal on the memory side in the obdiag inspection report you sent back, but there are a couple of other high-risk findings worth flagging:

  1. AVX: it looks like your machine may not support it? The server needs to support the AVX instruction set; run `lscpu | grep Flags | grep avx` to check.

  2. ip: 192.168.0.175, data_dir and log_dir are on the same disk. Putting the data disk and the log disk on the same physical disk is not a recommended deployment and will hurt performance.


Analyze the memory with the diagnostic tool; it generates an HTML report of the memory-usage analysis:
obdiag analyze memory
https://www.oceanbase.com/docs/common-obdiag-cn-1000000002968721


remote_192_168_0_175.rar (9.3 MB)
Please help take a look at the report generated by `obdiag analyze memory`.


show parameters where name in ('memory_limit','memory_limit_percentage','system_memory','log_disk_size','log_disk_percentage','datafile_size','datafile_disk_percentage');

select zone,concat(SVR_IP,':',SVR_PORT) observer,
cpu_capacity_max cpu_total,cpu_assigned_max cpu_assigned,
cpu_capacity-cpu_assigned_max as cpu_free,
round(memory_limit/1024/1024/1024,2) as memory_total,
round((memory_limit-mem_capacity)/1024/1024/1024,2) as system_memory,
round(mem_assigned/1024/1024/1024,2) as mem_assigned,
round((mem_capacity-mem_assigned)/1024/1024/1024,2) as memory_free,
round(log_disk_capacity/1024/1024/1024,2) as log_disk_capacity,
round(log_disk_assigned/1024/1024/1024,2) as log_disk_assigned,
round((log_disk_capacity-log_disk_assigned)/1024/1024/1024,2) as log_disk_free,
round((data_disk_capacity/1024/1024/1024),2) as data_disk,
round((data_disk_in_use/1024/1024/1024),2) as data_disk_used,
round((data_disk_capacity-data_disk_in_use)/1024/1024/1024,2) as data_disk_free
from oceanbase.gv$ob_servers;

select a.zone,a.svr_ip,b.tenant_name,b.tenant_type, a.max_cpu, a.min_cpu,
round(a.memory_size/1024/1024/1024,2) memory_size_gb,
round(a.log_disk_size/1024/1024/1024,2) log_disk_size,
round(a.log_disk_in_use/1024/1024/1024,2) log_disk_in_use,
round(a.data_disk_in_use/1024/1024/1024,2) data_disk_in_use
from oceanbase.gv$ob_units a join oceanbase.dba_ob_tenants b on a.tenant_id=b.tenant_id order by b.tenant_name;
Run the queries above and check what your configuration looks like.


From the memory analysis report you sent back, it's the IoControl module under the tenant's DEFAULT_CTX_ID context that is using somewhat more memory.


Run `grep MEMORY observer.log | grep IoControl`. If the value never drops, a leak may have occurred; the IoControl module's memory cap is currently the tenant's system_memory.
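As a minimal sketch of what to look for in those grep hits (the `hold=` field name and the sample line below are assumptions for illustration; the exact layout of observer.log memory dumps varies by version), you can strip the thousands separators and extract the hold value so the trend is easy to eyeball:

```shell
# Extract the hold value from a [MEMORY] dump line mentioning IoControl.
# The sample line and its field layout are illustrative, not the exact
# observer.log format; adapt the sed pattern to your logs.
sample='[2025-06-04 10:00:00.123] [MEMORY] tenant_id=1004 mod=IoControl hold= 14,025,866,272 used= 13,900,000,000'
hold=$(printf '%s\n' "$sample" \
  | grep MEMORY | grep IoControl \
  | sed 's/.*hold= *\([0-9,]*\).*/\1/' \
  | tr -d ',')
echo "$hold"   # if this number never decreases across dumps, suspect a leak
```

Plotting or just eyeballing that number across successive dump timestamps tells you whether the module is merely large or actually growing without bound.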


select * from oceanbase.__all_virtual_malloc_sample_info where mod_name='xxx' order by alloc_bytes desc limit 10;
-- grab the back_trace column from the result
addr2line -pCfe bin/observer <back_trace addresses>
-- symbolize the stack with addr2line and analyze it to see whether anything looks wrong
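For instance, a back_trace value copied out of that query can be turned into an addr2line invocation like this (a sketch: `bin/observer` is an assumed install path, the address list is shortened for illustration, and the trailing `0x0` frames are just padding that can be dropped):

```shell
# Build the addr2line command from one back_trace value. bin/observer must be
# the exact binary (same build) that produced the trace, or the symbols will
# be wrong.
bt='0x1be3362c 0x113f8f94 0x11458af8 0x0 0x0'
addrs=$(printf '%s\n' $bt | grep -v '^0x0$' | xargs)   # drop 0x0 padding frames
cmd="addr2line -pCfe bin/observer $addrs"
echo "$cmd"   # run this on the server that produced the trace
```

The `-p` flag gives one readable line per frame, `-C` demangles C++ names, `-f` prints the function name, and `-e` names the binary to resolve against.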


obclient [(none)]> select * from oceanbase.__all_virtual_malloc_sample_info where mod_name='IoControl' order by alloc_bytes desc limit 10;
+---------------+----------+-----------+--------+-----------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------+-------------+-------------+
| svr_ip | svr_port | tenant_id | ctx_id | mod_name | back_trace | ctx_name | alloc_count | alloc_bytes |
+---------------+----------+-----------+--------+-----------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------+-------------+-------------+
| 192.168.0.175 | 2882 | 1004 | 0 | IoControl | 0x1be3362c 0x113f8f94 0x11458af8 0x1143f930 0x117cd000 0x117cd1dc 0x16ca0d20 0x16c927d0 0x16c91f4c 0x16983380 0x16982de4 0xb45d7c8 0xb45b704 0x1b956a10 0x1b954c24 0xffff9e1b00e4 | DEFAULT_CTX_ID | 209 | 14025866272 |
| 192.168.0.175 | 2882 | 1004 | 0 | IoControl | 0x1be3362c 0x113f8f94 0x11458af8 0x1143f930 0x117cd000 0x117cd1dc 0x16ca0d20 0x16c927d0 0x16c91f4c 0x16983380 0x16983d44 0xb45d87c 0xb45b704 0x1b956a10 0x1b954c24 0xffff9e1b00e4 | DEFAULT_CTX_ID | 57 | 1341144336 |
| 192.168.0.175 | 2882 | 1004 | 0 | IoControl | 0x1be3362c 0x113f8f94 0x11458af8 0x1143f930 0x117cd000 0x16c8f77c 0x16c99ae8 0x1714ab84 0xba6fb24 0x14bd9e08 0x14bd95e8 0xc5f0f38 0x88d86f8 0x0 0x0 0x0 | DEFAULT_CTX_ID | 1 | 15600544 |
| 192.168.0.175 | 2882 | 1003 | 0 | IoControl | 0x1be3362c 0x113f8f94 0x11458af8 0x1143f930 0x117cd000 0x16c8f77c 0x16c99ae8 0x1714ab84 0xba6fb24 0x14bd9e08 0x14bd95e8 0xc5f0f38 0x88d86f8 0x0 0x0 0x0 | DEFAULT_CTX_ID | 1 | 15600544 |
| 192.168.0.175 | 2882 | 1 | 0 | IoControl | 0x1be3362c 0x113f8f94 0x11458af8 0x1143f930 0x117cd000 0x16c8f77c 0x16c99ae8 0x1714ab84 0xba6fb24 0x14bd9e08 0x14bd95e8 0xc5f0f38 0x88d86f8 0x0 0x0 0x0 | DEFAULT_CTX_ID | 1 | 15600544 |
| 192.168.0.175 | 2882 | 1002 | 0 | IoControl | 0x1be3362c 0x113f8f94 0x11458af8 0x1143f930 0x117cd000 0x16c8f77c 0x16c99ae8 0x1714ab84 0xba6fb24 0x14bd9e08 0x14bd95e8 0xc5f0f38 0x88d86f8 0x0 0x0 0x0 | DEFAULT_CTX_ID | 1 | 15600544 |
| 192.168.0.175 | 2882 | 500 | 0 | IoControl | 0x1be3362c 0x113f8f94 0x11458af8 0x1143f930 0x117cd000 0x16c8f77c 0x16c8d41c 0xc5d63ac 0x88d86e4 0x0 0x0 0x0 0x0 0x0 0x0 0x0 | DEFAULT_CTX_ID | 1 | 15600544 |
| 192.168.0.175 | 2882 | 1001 | 0 | IoControl | 0x1be3362c 0x113f8f94 0x11458af8 0x1143f930 0x117cd000 0x16c8f77c 0x16c99ae8 0x1714ab84 0xba6fb24 0x14bd9e08 0x14bd95e8 0xc5f0f38 0x88d86f8 0x0 0x0 0x0 | DEFAULT_CTX_ID | 1 | 15600544 |
| 192.168.0.175 | 2882 | 500 | 0 | IoControl | 0x1be3362c 0x113f8f94 0x11458af8 0x1143f930 0x117cd000 0x16c8f59c 0x16c8d41c 0xc5d63ac 0x88d86e4 0x0 0x0 0x0 0x0 0x0 0x0 0x0 | DEFAULT_CTX_ID | 1 | 9600544 |
| 192.168.0.175 | 2882 | 1004 | 0 | IoControl | 0x1be3362c 0x113f8f94 0x11458af8 0x1143f930 0x117cd000 0x16c8f59c 0x16c99ae8 0x1714ab84 0xba6fb24 0x14bd9e08 0x14bd95e8 0xc5f0f38 0x88d86f8 0x0 0x0 0x0 | DEFAULT_CTX_ID | 1 | 9600544 |
+---------------+----------+-----------+--------+-----------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------+-------------+-------------+
10 rows in set (0.023 sec)
Why are there so many back_trace entries? What should I do with them?

addr2line -pCfe bin/observer 0x1be3362c 0x113f8f94 0x11458af8 0x1143f930 0x117cd000 0x117cd1dc 0x16ca0d20 0x16c927d0 0x16c91f4c 0x16983380 0x16982de4 0xb45d7c8 0xb45b704 0x1b956a10 0x1b954c24 0xffff9e1b00e4

Pick one and symbolize it to take a look.
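If you'd rather cover all of them instead of picking one, note that most of the rows in your result share only a few distinct stacks, so it's enough to deduplicate first and symbolize each unique trace once. A sketch, where `traces.txt` is a hypothetical file you paste the back_trace column into (the three sample traces stand in for the real data):

```shell
# Count distinct stacks so the dominant allocation site is obvious.
printf '%s\n' '0xa 0xb 0xc' '0xa 0xb 0xc' '0xd 0xe 0xf' > traces.txt
sort traces.txt | uniq -c | sort -rn    # most frequent stack first
```

Each unique line from the output can then be fed to `addr2line -pCfe bin/observer ...` as shown above; the most frequent stack is usually the one worth reading first.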