【 Environment 】
Production
【 OB or other component 】
OBServer
【 Version 】
ob4.3.3.1
【Problem description】
- OCP raised an OBServer crash alert. The alert details are as follows:
Alert details: [OBServer crash] cluster: xxx, host: xxx, log type: observer, log file: /home/admin/oceanbase/log/observer.log, log level: INFO, keyword=CRASH ERROR!!!, error code=-1, log details=[2025-03-10 02:20:00.566981] INFO [DETECT] record_summary_info_and_logout_when_necessary_ (ob_lcl_batch_sender_thread.cpp:202) [1182150][T1013_LCLSender][T1013][Y0-0000000000000000-0-0] [lt=82] ObLCLBatchSenderThread periodic report summary info(duty_ratio_percentage=0, total_constructed_detector=0, total_destructed_detector=0, total_alived_detector=0, lcl_op_interval=30000, lcl_msg_map.count()=0, *this={this:0x7fbc014362b0, is_inited:true, is_running:true, total_record_time:5010000, over_night_times:0})CRASH ERROR!!! IP=7fbff026d46b, RBP=7fbf9a74fa80, sig=6, sig_code=-6, sig_addr=0x1f400004d3e, RLIMIT_CORE=unlimited, timestamp=1741544400567265, tid=20124, tname=MultiTenant, trace_id=Y0-0000000000000000-0-0, extra_info=(), lbt=0x1f666c38 0x1eef714d 0x7fbff04014bf 0x7fbff026d46b 0x7fbff026e790 0x1f1c1bc0 0x78910f8 0x789ad2b 0x788cfab 0x1f64fbc6 0x1f64f951 0x1f6524af 0x1f64b865 0xf0031e7 0xf00387f 0x7c793ee 0x7c77ed8 0x7c77bbd 0x1f652857 0x1f6509f6 0x7fbff03f6f1a 0x7fbff032c1bf, SQL_ID=, SQL_STRING=
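The fields on the CRASH ERROR line can be decoded mechanically. Below is a minimal Python sketch, not part of OceanBase or OCP tooling; the helper name `decode_crash_line` is hypothetical. It assumes Linux x86-64 signal numbering, assumes the host clock is UTC+8 (which matches the 02:20:00 log timestamp), and assumes that for a signal sent via tkill/tgkill the reported sig_addr overlays the sender's pid (low 32 bits) and uid (high 32 bits), as in the Linux siginfo_t layout.

```python
import re
from datetime import datetime, timedelta, timezone

# Shortened copy of the crash line from the alert above.
CRASH_LINE = (
    "CRASH ERROR!!! IP=7fbff026d46b, RBP=7fbf9a74fa80, sig=6, sig_code=-6, "
    "sig_addr=0x1f400004d3e, RLIMIT_CORE=unlimited, "
    "timestamp=1741544400567265, tid=20124, tname=MultiTenant"
)

# Assumption: Linux x86-64 numbering; only a few common signals are listed.
SIGNALS = {6: "SIGABRT", 7: "SIGBUS", 8: "SIGFPE", 11: "SIGSEGV"}
# Assumption: Linux si_code values for user-sent signals.
SI_CODES = {0: "SI_USER", -6: "SI_TKILL"}
# Assumption: the host runs in UTC+8.
LOCAL_TZ = timezone(timedelta(hours=8))


def decode_crash_line(line):
    """Hypothetical helper: turn key=value pairs into readable values."""
    fields = dict(re.findall(r"(\w+)=([^,\s]+)", line))
    sig = int(fields["sig"])
    sig_code = int(fields["sig_code"])
    sig_addr = int(fields["sig_addr"], 16)
    # The timestamp is in microseconds since the Unix epoch.
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    when = epoch + timedelta(microseconds=int(fields["timestamp"]))
    return {
        "signal": SIGNALS.get(sig, str(sig)),
        "si_code": SI_CODES.get(sig_code, str(sig_code)),
        # Only meaningful when the signal was sent by a process (kill/tkill).
        "sender_pid": sig_addr & 0xFFFFFFFF,
        "sender_uid": sig_addr >> 32,
        "time": when.astimezone(LOCAL_TZ),
        "thread": fields["tname"],
    }


info = decode_crash_line(CRASH_LINE)
for key, value in info.items():
    print(f"{key}: {value}")
```

Under these assumptions the line decodes to SIGABRT with SI_TKILL, sent by pid 19774 running as uid 500, at 2025-03-10 02:20:00.567265 local time, in the MultiTenant thread. A self-sent SIGABRT would be consistent with the process calling abort on itself rather than being killed externally.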
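【Reproduction path】
- Host monitoring shows that the system load was fairly high around the time of the issue, but CPU utilization was not high
- The operating system logs show nothing abnormal
- Data disk space used by the cluster: 28 GB
- Daily cluster merge (major compaction) start time: 2:00
- Daily backup start time: 4:00
- The OBServer logs from around the CRASH ERROR alert have already been rotated out
- Used the addr2line tool to locate the code position where the OBServer process crashed: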
[root@observer1 opt]# addr2line -pCfe ./usr/lib/debug/home/admin/oceanbase/bin/observer.debug 0x1f666c38 0x1eef714d 0x7fbff04014bf 0x7fbff026d46b 0x7fbff026e790 0x1f1c1bc0 0x78910f8 0x789ad2b 0x788cfab 0x1f64fbc6 0x1f64f951 0x1f6524af 0x1f64b865 0xf0031e7 0xf00387f 0x7c793ee 0x7c77ed8 0x7c77bbd 0x1f652857 0x1f6509f6 0x7fbff03f6f1a 0x7fbff032c1bf
safe_backtrace at ./build_rpm/deps/oblib/src/lib/./deps/oblib/src/lib/signal/ob_libunwind.c:30
oceanbase::common::coredump_cb(int, int, void*, void*) at ./build_rpm/deps/oblib/src/lib/./deps/oblib/src/lib/signal/ob_signal_handlers.cpp:210
?? ??:0
?? ??:0
?? ??:0
ob_abort() at ./build_rpm/deps/oblib/src/lib/./deps/oblib/src/lib/ob_abort.cpp:21
?? ??:0
oceanbase::lib::SubObjectMgr::alloc_object(unsigned long, oceanbase::lib::ObMemAttr const&) at ./build_rpm/deps/oblib/src/lib/./deps/oblib/src/lib/alloc/object_mgr.h:50
oceanbase::lib::ObTenantCtxAllocator::alloc(long, oceanbase::lib::ObMemAttr const&) at ./build_rpm/deps/oblib/src/lib/./deps/oblib/src/lib/alloc/ob_tenant_ctx_allocator.cpp:38
oceanbase::common::ob_malloc(long, oceanbase::lib::ObMemAttr const&) at ./build_rpm/deps/oblib/src/lib/./deps/oblib/src/lib/allocator/ob_malloc.h:37
oceanbase::lib::ProtectedStackAllocator::_alloc(unsigned long, unsigned long, long, bool) at ./build_rpm/deps/oblib/src/lib/./deps/oblib/src/lib/thread/protected_stack_allocator.cpp:83 (discriminator 2)
oceanbase::lib::ProtectedStackAllocator::alloc(unsigned long, long) at ./build_rpm/deps/oblib/src/lib/./deps/oblib/src/lib/thread/protected_stack_allocator.cpp:62
oceanbase::lib::Threads::start() at ./build_rpm/deps/oblib/src/lib/./deps/oblib/src/lib/thread/threads.cpp:181
oceanbase::omt::create_worker(oceanbase::omt::ObThWorker*&, oceanbase::omt::ObTenant*, unsigned long, int, bool, oceanbase::omt::ObResourceGroup*) at ./build_rpm/src/observer/./src/observer/omt/ob_th_worker.cpp:71
oceanbase::omt::ObResourceGroup::acquire_more_worker(long, long&, bool) at ./build_rpm/src/observer/./src/observer/omt/ob_tenant.cpp:425
oceanbase::omt::ObResourceGroup::check_worker_count() at ./build_rpm/src/observer/./src/observer/omt/ob_tenant.cpp:508
oceanbase::omt::ObTenant::check_group_worker_count() at ./build_rpm/src/observer/./src/observer/omt/ob_tenant.cpp:1786
oceanbase::omt::ObMultiTenant::run1() at ./build_rpm/src/observer/./src/observer/omt/ob_multi_tenant.cpp:2523
oceanbase::lib::Threads::run(long) at ./build_rpm/deps/oblib/src/lib/./deps/oblib/src/lib/thread/threads.cpp:203
oceanbase::lib::Thread::run() at ./build_rpm/deps/oblib/src/lib/./deps/oblib/src/lib/thread/thread.cpp:177
?? ??:0
?? ??:0
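Several frames above resolve to `?? ??:0`. A rough way to see why is to split the lbt addresses by address range. The sketch below is illustrative only: the threshold `SHARED_LIB_FLOOR` and the helper `split_frames` are assumptions, based on the common x86-64 Linux layout in which shared libraries (libc, libpthread and so on) are mapped near 0x7f..., while the observer binary itself is loaded at low addresses. Addresses in the high range cannot be resolved against observer.debug because they do not belong to that binary.

```python
# Backtrace addresses copied from the lbt= field of the crash line.
LBT = (
    "0x1f666c38 0x1eef714d 0x7fbff04014bf 0x7fbff026d46b 0x7fbff026e790 "
    "0x1f1c1bc0 0x78910f8 0x789ad2b 0x788cfab 0x1f64fbc6 0x1f64f951 "
    "0x1f6524af 0x1f64b865 0xf0031e7 0xf00387f 0x7c793ee 0x7c77ed8 "
    "0x7c77bbd 0x1f652857 0x1f6509f6 0x7fbff03f6f1a 0x7fbff032c1bf"
)

# Assumption: anything at or above this address lives in a shared library
# mapping rather than in the observer executable.
SHARED_LIB_FLOOR = 0x7F0000000000


def split_frames(lbt):
    """Hypothetical helper: separate binary frames from shared-library frames."""
    addrs = [int(token, 16) for token in lbt.split()]
    in_binary = [a for a in addrs if a < SHARED_LIB_FLOOR]
    in_libs = [a for a in addrs if a >= SHARED_LIB_FLOOR]
    return in_binary, in_libs


in_binary, in_libs = split_frames(LBT)
print("resolvable against observer.debug:", " ".join(hex(a) for a in in_binary))
print("shared-library frames:", " ".join(hex(a) for a in in_libs))
```

With this split, five of the 22 addresses fall in the shared-library range, and they line up with five of the six `?? ??:0` lines in the addr2line output. The remaining unresolved frame, 0x78910f8, is in the binary's range but still has no symbol. The IP value on the crash line (7fbff026d46b) is also one of the shared-library addresses.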