[2025-02-12 16:36:01.720610] WDIAG [COMMON] init_from_os (ob_cpu_topology.cpp:97) [32496][observer][T0][Y0-0000000000000001-0-0] [lt=31][errcode=0] cpu flag is not found(CPU_FLAG_CMDS[i]="grep -E ' avx( |$)' /proc/cpuinfo")
[2025-02-12 16:36:01.724033] WDIAG [COMMON] init_from_os (ob_cpu_topology.cpp:97) [32496][observer][T0][Y0-0000000000000001-0-0] [lt=39][errcode=0] cpu flag is not found(CPU_FLAG_CMDS[i]="grep -E ' avx2( |$)' /proc/cpuinfo")
[2025-02-12 16:36:01.727475] WDIAG [COMMON] init_from_os (ob_cpu_topology.cpp:97) [32496][observer][T0][Y0-0000000000000001-0-0] [lt=36][errcode=0] cpu flag is not found(CPU_FLAG_CMDS[i]="grep -E ' avx512bw( |$)' /proc/cpuinfo")
[2025-02-12 16:36:01.727500] INFO [COMMON] CpuFlagSet (ob_cpu_topology.cpp:63) [32496][observer][T0][Y0-0000000000000001-0-0] [lt=24] #flag is supported
[2025-02-12 16:36:01.727511] WDIAG [COMMON] CpuFlagSet (ob_cpu_topology.cpp:64) [32496][observer][T0][Y0-0000000000000001-0-0] [lt=10][errcode=0] #flag is not supported
[2025-02-12 16:36:01.727531] WDIAG [COMMON] CpuFlagSet (ob_cpu_topology.cpp:65) [32496][observer][T0][Y0-0000000000000001-0-0] [lt=20][errcode=0] #flag is not supported
[2025-02-12 16:36:01.727540] WDIAG [COMMON] CpuFlagSet (ob_cpu_topology.cpp:66) [32496][observer][T0][Y0-0000000000000001-0-0] [lt=8][errcode=0] #flag is not supported
……
raise_exception: ; preds = %normal_raise_block, %ob_fail, %ob_fail, %ob_fail
%raise_exception91 = call i32 @_Unwind_RaiseException(%unwind_exception* %create_exception)
unreachable
normal_raise_block: ; preds = %ob_fail
%get_exception_class = call i64 @eh_classify_exception(i8* %load_sql_state)
%get_exception_class.off = add i64 %get_exception_class, -3
%switch = icmp ult i64 %get_exception_class.off, 2
br i1 %switch, label %reset_ret_block, label %raise_exception
reset_ret_block: ; preds = %normal_raise_block
store i32 0, i32* %int_alloca, align 4
br label %ob_success
}
")
[2025-02-12 16:37:43.384657] INFO [COMMON] try_inc_thread_count (ob_dynamic_thread_pool.cpp:504) [32502][qth_mgr][T0][Y0-0000000000000000-0-0] [lt=10] try inc thread count(*this={name:TimerWK, this:0x7f67c63eb590, min_thread_cnt:4, max_thread_cnt:128, running_thread_cnt:4, threads_idle_time:239981740, tenant_id:1}, cur_thread_count=7, cnt=-1, new_thread_count=6)
[2025-02-12 16:37:43.384715] INFO [LIB] do_thread_recycle (threads.cpp:163) [32502][qth_mgr][T0][Y0-0000000000000000-0-0] [lt=32] recycle one thread(this=0x7f67c63eb590, total=7, remain=6)
[2025-02-12 16:37:43.384729] INFO [COMMON] try_inc_thread_count (ob_dynamic_thread_pool.cpp:509) [32502][qth_mgr][T0][Y0-0000000000000000-0-0] [lt=13] inc thread count(*this={name:TimerWK, this:0x7f67c63eb590, min_thread_cnt:4, max_thread_cnt:128, running_thread_cnt:4, threads_idle_time:239981740, tenant_id:1}, cur_thread_count=7, cnt=-1, new_thread_count=6)
[2025-02-12 16:37:43.392501] INFO [RPC.OBRPC] do_server_loop (ob_net_keepalive.cpp:498) [32637][KeepAliveServer][T0][Y0-0000000000000000-0-0] [lt=27] socket need_disconn(n=-1, errno=9)
[2025-02-12 16:37:43.392552] INFO [RPC.OBRPC] do_server_loop (ob_net_keepalive.cpp:528) [32637][KeepAliveServer][T0][Y0-0000000000000000-0-0] [lt=39] server connection closed, fd: 88, addr: "192.168.2.112:54768"
CRASH ERROR!!! IP=5566302e13a0, RBP=7f67548499c0, sig=4, sig_code=2, sig_addr=0x5566302e13a0, RLIMIT_CORE=unlimited, timestamp=1739349463393031, tid=33103, tname=T1_L0_G28, trace_id=YB42C0A80270-00062DEDDA7834BD-0-0, lbt=0x1f96b218 0x1f1b698d 0x7f67cd43e72f 0x8bb63a0 0x9be8a9c 0x9c0812c 0x9c08505 0x9be51fd 0x9a466c5 0xa5f92d9 0xa5fa810 0xa5f85ef 0x924c3cf 0x924cafc 0x9253edc 0x9253edc 0x924ed8d 0x92176d1 0x92177b1 0x9217ad3 0x9224c5c 0x9226bd3 0x9226edb 0x9237427 0x1ef0327a 0x1eee447d 0xebf34b2 0xebf0fc5 0xebe4b4b 0xec2577e 0xec5339b 0xec473c9 0xeaa14cd 0xea7362e 0x14a77faa 0x11bba464 0x7c4fe26 0x7923e9c 0x792151d 0x7c4d9c4 0x7c4cd09 0x7c482ee 0x7cf5030 0x7cf4929 0xf8dbc1a 0xf8f4a69 0x81cfe74 0x78b043b 0x789e118 0xfc77118, SQL_ID=E9E2014C8CE705871C555597A6A32456, SQL_STRING=CALL DBMS_STATS.ASYNC_GATHER_STATS_JOB_PROC(600000000);
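Cause: the CPU in use does not support AVX instructions, while the OB kernel uses them. The startup log above already warns that the avx, avx2 and avx512bw flags were not found, and the crash line reports sig=4, which on Linux is SIGILL (illegal instruction). You can run the lscpu command and check the CPU's instruction-set flags to confirm.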
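As a rough way to script the same check, here is a minimal Python sketch. It assumes an x86 Linux host where /proc/cpuinfo exposes a `flags` line, and it looks for the three flags named in the log above. The function names `cpu_flags` and `missing_flags` are made up for this illustration and are not part of OceanBase or obdiag.

```python
from pathlib import Path

# Flags that the observer log above probes for in /proc/cpuinfo.
CHECKED_FLAGS = ("avx", "avx2", "avx512bw")


def cpu_flags(cpuinfo_text):
    """Collect every whitespace-separated token from the 'flags' lines."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return flags


def missing_flags(cpuinfo_text, wanted=CHECKED_FLAGS):
    """Return the wanted flags that are absent, in the order given."""
    present = cpu_flags(cpuinfo_text)
    return [flag for flag in wanted if flag not in present]


def main():
    path = Path("/proc/cpuinfo")
    if not path.exists():
        print("no /proc/cpuinfo on this system, nothing to check")
        return
    missing = missing_flags(path.read_text())
    if missing:
        print("missing CPU flags:", ", ".join(missing))
    else:
        print("all checked CPU flags are present")


if __name__ == "__main__":
    main()
```

Matching whole tokens rather than substrings keeps `avx2` from being counted as `avx`, which is the same intent as the `' avx( |$)'` pattern in the log.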
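Solution:
Switch to a CPU model that supports AVX instructions.
An obdiag inspection can also detect this problem.
Also, starting with the release after OB 4.3.5, the observer will refuse to start on machines without the AVX instruction set.
This is the same issue as the one in this post.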