OceanBase becomes abnormal after running for a while

[Environment] Production

[OB or other component] Service deployed and started with obd.

[Version] oceanbase-all-in-one-4.3.5

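[Problem description] After running for a while, OceanBase becomes abnormal and the "obclient" command can no longer connect to the database. It works normally again after OceanBase is restarted.

The OceanBase database ran normally until 2026-05-08 15:13. After that the problem appeared and the database could not be connected to. The OceanBase log shows an ERROR.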

[Steps to reproduce]

[Logs] Errors returned:

The following is from the log under oceanbase-ce:

[2026-05-08 15:12:32.696144] INFO [RPC.OBRPC] do_server_loop (ob_net_keepalive.cpp:498) [5048][KeepAliveServer][T0][Y0-0000000000000000-0-0] [lt=16] socket need_disconn(n=-1, errno=104)

[2026-05-08 15:12:32.696166] INFO [RPC.OBRPC] do_server_loop (ob_net_keepalive.cpp:528) [5048][KeepAliveServer][T0][Y0-0000000000000000-0-0] [lt=21] server connection closed, fd: 130, addr: "127.0.0.1:42448"

[2026-05-08 15:12:32.696219] INFO [RPC.OBRPC] on_disconnect (ob_rpc_net_handler.cpp:338) [5035][RpcIO][T0][Y0-0000000000000000-0-0] [lt=27] connection disconnect(easy_connection_str(c)=0.0.0.0_127.0.0.1:48480_134_0x7fd1513ed840 tp=0 t=1777129340570219-1778224352603664 s=0 r=0 io=703565984/467003115 sq=467002747)

[2026-05-08 15:12:32.696241] INFO [RPC.OBRPC] on_disconnect (ob_rpc_net_handler.cpp:338) [5036][RpcIO][T0][Y0-0000000000000000-0-0] [lt=23] connection disconnect(easy_connection_str(c)=0.0.0.0_127.0.0.1:48496_136_0x7fd12ce04e40 tp=0 t=1777129340572408-1778224352090646 s=0 r=0 io=703935665/467038454 sq=467038454)

CRASH ERROR!!! IP=7fd160e15f77, RBP=7fd1270cde70, sig=11, sig_code=1, sig_addr=0x0, RLIMIT_CORE=0, timestamp=1778224352696599, tid=5310, tname=T1_L0_G0, trace_id=YB427F000001-0006504CC4CDB0EC-0-0, lbt=0x1f96b218 0x1f1b698d 0x7fd160e9141f 0x7fd160e15f77 0x157bf728 0x1587d19f 0x7bbae34 0x7bb7c23 0x7bb2b2d 0x118a7bd0 0x118a7763 0x7bb3747 0x79816f4 0x79215fb 0x1013b824 0x78c484e 0x78b88e3 0x78b1adf 0x78aefec 0x789e118 0xfc77118 0x1f95655d 0x7fd160e85608 0x7fd160db36c2, SQL_ID=, SQL_STRING=replace into t_historyloc0 values('6002',0,1778224351,0,'11392736','1',2257472,0,0.0,0,'0',1,0)

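The log shows CRASH ERROR!!!, so the database process most likely crashed.
On the server, check the directory where core files are generated and see whether a core file was produced:
sysctl -a | grep pattern

addr2line -pCfe bin/observer $lbt
If there is no core file, collect a core dump as described here. To resolve the backtrace addresses from the crash line, run:
addr2line -pCfe bin/observer 0x1f96b218 0x1f1b698d 0x7fd160e9141f 0x7fd160e15f77 0x157bf728 0x1587d19f 0x7bbae34 0x7bb7c23 0x7bb2b2d 0x118a7bd0 0x118a7763 0x7bb3747 0x79816f4 0x79215fb 0x1013b824 0x78c484e 0x78b88e3 0x78b1adf 0x78aefec 0x789e118 0xfc77118 0x1f95655d 0x7fd160e85608 0x7fd160db36c2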
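Copying the address list by hand is error-prone, so the lbt field can be pulled straight out of the log instead. The following is a minimal sketch, assuming the crash line has the same `lbt=<addresses>, SQL_ID=` layout as the one quoted above. The function name `extract_lbt` and the placeholder paths are hypothetical and must be adapted to the actual deployment.

```shell
#!/usr/bin/env bash
# Print the space-separated addresses of the lbt= field for every
# "CRASH ERROR" line read on stdin. Other lines are ignored.
extract_lbt() {
    sed -n '/CRASH ERROR/s/.*lbt=\([^,]*\),.*/\1/p'
}

# Hypothetical usage: both paths are placeholders, not taken from the thread.
# OBSERVER_BIN=/path/to/oceanbase/bin/observer
# LOG=/path/to/oceanbase/log/observer.log
# addr2line -pCfe "$OBSERVER_BIN" $(extract_lbt < "$LOG")
```

The `$(...)` in the usage line is left unquoted on purpose so that each address becomes a separate argument to addr2line.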

CRASH ERROR!!!

[Gota@localhost 04 share]$ sysctl -a | grep pattern

sysctl: permission denied on key 'fs.protected_fifos'

sysctl: permission denied on key 'fs.protected_hardlinks'

sysctl: permission denied on key 'fs.protected_regular'

sysctl: permission denied on key 'fs.protected_symlinks'

sysctl: permission denied on key 'kernel.cad_pid'

kernel.core_pattern=|/usr/lib/systemd/systemd-coredump %p %u %g %s %t %c %h

sysctl: permission denied on key 'kernel.usermodehelper.bset'

sysctl: permission denied on key 'kernel.usermodehelper.inheritable'

sysctl: permission denied on key 'net.core.bpf_jit_harden'

sysctl: permission denied on key 'net.core.bpf_jit_kallsyms'

sysctl: permission denied on key 'net.core.bpf_jit_limit'

sysctl: permission denied on key 'net.ipv4.tcp_fastopen_key'

sysctl: permission denied on key 'net.ipv6.conf.all.stable_secret'

sysctl: permission denied on key 'net.ipv6.conf.default.stable_secret'

sysctl: permission denied on key 'net.ipv6.conf.em1.stable_secret'

sysctl: permission denied on key 'net.ipv6.conf.lo.stable_secret'

sysctl: permission denied on key 'vm.mmap_rnd_bits'

sysctl: permission denied on key 'vm.mmap_rnd_compat_bits'

sysctl: permission denied on key 'vm.stat_refresh'

[Gota@localhost 04 share]$ addr2line -pCfe bin/observer $lbt

addr2line: bin/observer: No such file

[Gota@localhost 04 share]$ addr2line -pCfe bin/observer 0x1f96b218 0x1f1b698d 0x7fd160e9141f 0x7fd160e15f77 0x157bf728 0x1587d19f 0x7bbae34 0x7bb7c23 0x7bb2b2d 0x118a7bd0 0x118a7763 0x7bb3747 0x79816f4 0x79215fb 0x1013b824 0x78c484e 0x78b88e3 0x78b1adf 0x78aefec 0x789e118 0xfc77118 0x1f95655d 0x7fd160e85608 0x7fd160db36c2

addr2line: bin/observer: No such file

[Gota@localhost 04 share]$

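Following along to learn.

Reading through the logs is rather tedious.

Check with root privileges.

When the problem occurred, had the server perhaps already crashed? Was the process still there?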

kernel.core_pattern=|/usr/lib/systemd/systemd-coredump %p %u %g %s %t %c %h
Check whether a core dump file exists at the location this setting points to.
addr2line -pCfe bin/observer 0x1f96b218 0x1f1b698d 0x7fd160e9141f 0x7fd160e15f77 0x157bf728 0x1587d19f 0x7bbae34 0x7bb7c23 0x7bb2b2d 0x118a7bd0 0x118a7763 0x7bb3747 0x79816f4 0x79215fb 0x1013b824 0x78c484e 0x78b88e3 0x78b1adf 0x78aefec 0x789e118 0xfc77118 0x1f95655d 0x7fd160e85608 0x7fd160db36c2
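A core_pattern that starts with `|` is not a directory: the kernel pipes the core to the named helper program. When that helper is systemd-coredump, stored dumps are normally listed with `coredumpctl` and kept under /var/lib/systemd/coredump by default. The sketch below only classifies the setting and prints the current core size limit. The function name `core_pattern_kind` is hypothetical. Note also that the crash line reports RLIMIT_CORE=0, so it is worth verifying whether a core could have been saved for this crash at all.

```shell
#!/usr/bin/env bash
# Classify a kernel.core_pattern value passed as the first argument.
core_pattern_kind() {
    case "$1" in
        '|'*systemd-coredump*) echo "systemd-coredump" ;;  # piped to systemd-coredump
        '|'*)                  echo "pipe" ;;              # piped to some other helper
        /*)                    echo "absolute-path" ;;     # core file written at this path
        *)                     echo "relative-to-cwd" ;;   # written relative to the process cwd
    esac
}

# Reading this file does not require root, unlike a full "sysctl -a".
pattern=$(cat /proc/sys/kernel/core_pattern 2>/dev/null || true)
echo "core_pattern kind: $(core_pattern_kind "$pattern")"

# If systemd-coredump is the handler, these usually show stored dumps:
#   coredumpctl list observer
#   ls -l /var/lib/systemd/coredump/

# Soft core size limit of the current shell (0 disables core files).
ulimit -c
```

The limit that matters is the one in effect for the observer process itself, which can be read from /proc/<pid>/limits while it is running.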

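When you run this command, bin/observer must point to the actual observer binary. Check which directory the observer process is running from: ps -ef | grep observer | grep -v grep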
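The "No such file" error above comes from running addr2line in a directory that has no bin/observer. As a minimal sketch, assuming a Linux host where observer is currently running and the command is run as the same user or as root, the full binary path can be read from /proc and passed to addr2line instead of the relative path. The helper name `exe_of_pid` is hypothetical.

```shell
#!/usr/bin/env bash
# Resolve the executable path of a PID through /proc (Linux only).
exe_of_pid() {
    readlink -f "/proc/$1/exe"
}

# pgrep -x matches the exact process name; the loop body is skipped
# if no observer process is running.
for pid in $(pgrep -x observer 2>/dev/null); do
    echo "observer pid=$pid exe=$(exe_of_pid "$pid") cwd=$(readlink -f "/proc/$pid/cwd")"
done
```

The printed exe path can then replace bin/observer in the addr2line command. It should be the same binary that produced the crash, otherwise the addresses will not resolve to the right functions.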