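How to locate and troubleshoot errors raised when executing SQL

[Environment] Production or test environment
[OB or other component]
[Version]
OceanBase_CE 4.0.0.0
[Problem description] Describe the problem clearly and precisely
[Steps to reproduce] Operations performed before and after the problem appeared
[Symptoms and impact]
I hit the following two errors when executing SQL. How do I find the cause of each?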
Cause: java.sql.SQLException: System error

Cause: java.sql.SQLException: Entry already exist

[Attachments]

This is the JDBC-side error. You need to look at the error returned by the observer; check the logs under the deployment directory.

Entry already exist
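I remember hitting this one in earlier testing. It was a metadata inconsistency, probably caused by a failed CREATE TABLE or a similar operation.
That said, what I was testing was a different database, a heavily modified fork of OB, but the symptom was the same.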

The observer.log entry for System error is:
[2023-03-17 22:22:48.590507] ERROR [STORAGE] nonext_ext_compare (ob_datum_row.cpp:41) [247481][T1002_TNT_L0_G0][T1002][YB420A650079-0005F6EDAB55EF67-0-0] [lt=69] Unexpected datum in rowkey to compare(ret=-4015, right={len: 16, flag: 2, null: 0, ptr: 0x7fb50cc52720, hex: 19070000000000000300000000000000}) BACKTRACE:0xb61bbbb 0xb60d4f6 0x3c12f81 0x3c12c84 0x3c12a99 0x3bf0efb 0x8fbf786 0x8fbf584 0x900a571 0x90108ae 0x9010511 0x39c19eb 0x39c1705 0x396baf6 0x396b9e0 0x396b7eb 0x395a93f 0x3924839 0x872a23e 0x8740325 0x71ab4b9 0x71c0b00 0x6b5c343 0x6b1e679 0x73e0ebd 0x7499d9d 0x73e0ebd 0x75c4d99 0x75c45c1 0x75c6953 0x73e0ebd 0x7499d9d 0x73e0ebd 0x73e3d83 0x73e0530 0x3922302 0x70fc16d 0x3922119 0x39215e0 0x394797b 0x3945b41 0x57d62fb 0x3911ef6 0x390eb08 0x390c66a 0x390ab41 0x45eb379 0x3902f24 0x45eb917 0xb5ffbb7 0xb5fa7ea 0x7fb672866ea5 0x7fb67258fb0d
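For anyone following along, the three pieces of an entry like the one above that matter for troubleshooting are the error code (ret=...), the trace id (the bracketed field starting with Y), and the BACKTRACE addresses. Below is a minimal Python sketch that pulls them out. The function name parse_observer_entry and the regular expressions are illustrative assumptions based only on this one log line, not an official description of the observer log format.

```python
import re

# Copy of the entry above with the backtrace cut to three frames for brevity.
SAMPLE = (
    "[2023-03-17 22:22:48.590507] ERROR [STORAGE] nonext_ext_compare "
    "(ob_datum_row.cpp:41) [247481][T1002_TNT_L0_G0][T1002]"
    "[YB420A650079-0005F6EDAB55EF67-0-0] [lt=69] "
    "Unexpected datum in rowkey to compare(ret=-4015, right={len: 16, flag: 2, "
    "null: 0, ptr: 0x7fb50cc52720, hex: 19070000000000000300000000000000}) "
    "BACKTRACE:0xb61bbbb 0xb60d4f6 0x3c12f81"
)

# Assumed shapes, inferred from the sample line only.
TRACE_ID_RE = re.compile(r"\[(Y[0-9A-F]+-[0-9A-F]+-\d+-\d+)\]")
RET_RE = re.compile(r"\bret=(-?\d+)")
ADDR_RE = re.compile(r"0x[0-9a-fA-F]+")


def parse_observer_entry(entry):
    """Extract trace id, error code and backtrace addresses from one log entry."""
    trace = TRACE_ID_RE.search(entry)
    ret = RET_RE.search(entry)
    # Only look for addresses after the BACKTRACE: marker, so that values
    # such as "ptr: 0x7fb50cc52720" in the message body are not picked up.
    _, _, tail = entry.partition("BACKTRACE:")
    return {
        "trace_id": trace.group(1) if trace else None,
        "ret": int(ret.group(1)) if ret else None,
        "backtrace": ADDR_RE.findall(tail),
    }


if __name__ == "__main__":
    print(parse_observer_entry(SAMPLE))
```

The extracted backtrace list is what you would hand to addr2line, and the trace id is what you would grep for in the other log files.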

In this case, the SQLException that JDBC returns is the exception returned by the OB side.

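What I mean is: to find the cause of the error, you need to capture the relevant observer error logs. First determine the error code and the trace id of the failing SQL, then use the trace id to pull that SQL's error stack out of the logs.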

OK. After a SQL statement fails, where can I find its trace id? I can't seem to find it in GV$OB_SQL_AUDIT.

Is this the right way to view the SQL error stack, using the addr2line tool?
addr2line -pCfe /data/oceanbase/bin/observer 0xb61bbbb 0xb60d4f6 0x3c12f81 0x3c12c84 0x3c12a99 0x3bf0efb 0x8fbf786 0x8fbf584 0x900a571 0x90108ae 0x9010511 0x39c19eb 0x39c1705 0x396baf6 0x396b9e0 0x396b7eb 0x395a93f 0x3924839 0x872a23e 0x8740325 0x71ab4b9 0x71c0b00 0x6b5c343 0x6b1e679 0x73e0ebd 0x7499d9d 0x73e0ebd 0x75c4d99 0x75c45c1 0x75c6953 0x73e0ebd 0x7499d9d 0x73e0ebd 0x73e3d83 0x73e0530 0x3922302 0x70fc16d 0x3922119 0x39215e0 0x394797b 0x3945b41 0x57d62fb 0x3911ef6 0x390eb08 0x390c66a 0x390ab41 0x45eb379 0x3902f24 0x45eb917 0xb5ffbb7 0xb5fa7ea 0x7fb672866ea5 0x7fb67258fb0d
oceanbase::common::lbt() at ??:?
oceanbase::common::ObLogger::backtrace_if_needed(oceanbase::common::ObPLogItem&, bool) at ??:?
void oceanbase::common::ObLogger::do_log_message<void oceanbase::common::ObLogger::log_message_kv<oceanbase::common::ObILogKV, oceanbase::common::ObILogKV>(char const*, int, char const*, int, char const*, unsigned long, char const*, oceanbase::common::ObILogKV const&&, oceanbase::common::ObILogKV const&&)::{lambda(char*, long, long&)#1}>(bool, char const*, int, char const*, int, char const*, bool, unsigned long, void oceanbase::common::ObLogger::log_message_kv<oceanbase::common::ObILogKV, oceanbase::common::ObILogKV>(char const*, int, char const*, int, char const*, unsigned long, char const*, oceanbase::common::ObILogKV const&&, oceanbase::common::ObILogKV const&&)::{lambda(char*, long, long&)#1}&) at 0_cxx.cxx:?
void oceanbase::common::ObLogger::log_it<void oceanbase::common::ObLogger::log_message_kv<oceanbase::common::ObILogKV, oceanbase::common::ObILogKV>(char const*, int, char const*, int, char const*, unsigned long, char const*, oceanbase::common::ObILogKV const&&, oceanbase::common::ObILogKV const&&)::{lambda(char*, long, long&)#1}&>(char const*, int, char const*, int, char const*, unsigned long, void oceanbase::common::ObLogger::log_message_kv<oceanbase::common::ObILogKV, oceanbase::common::ObILogKV>(char const*, int, char const*, int, char const*, unsigned long, char const*, oceanbase::common::ObILogKV const&&, oceanbase::common::ObILogKV const&&)::{lambda(char*, long, long&)#1}&) at 0_cxx.cxx:?
void oceanbase::common::ObLogger::log_message_kv<oceanbase::common::ObILogKV, oceanbase::common::ObILogKV>(char const*, int, char const*, int, char const*, unsigned long, char const*, oceanbase::common::ObILogKV const&&, oceanbase::common::ObILogKV const&&) at 0_cxx.cxx:?
void oceanbase::common::OB_PRINT<oceanbase::common::ObILogKV, oceanbase::common::ObILogKV>(char const*, int, char const*, int, char const*, unsigned long, char const*, char const*, oceanbase::common::ObILogKV const&&, oceanbase::common::ObILogKV const&&) at ??:?
_ZZN9oceanbase12blocksstableL18nonext_ext_compareERKNS0_14ObStorageDatumES3_RKNS_6common9ObCmpFuncERiENK6$_1196clEPKc.3e237e6f6aac22fd747ba8f7a6591ee7 at 1_cxx.cxx:?
_ZN9oceanbase12blocksstableL18nonext_ext_compareERKNS0_14ObStorageDatumES3_RKNS_6common9ObCmpFuncERi.3e237e6f6aac22fd747ba8f7a6591ee7 at 1_cxx.cxx:?
oceanbase::blocksstable::ObColumnDecoder::quick_compare(oceanbase::blocksstable::ObStorageDatum const&, oceanbase::blocksstable::ObStorageDatumCmpFunc const&, long, oceanbase::blocksstable::ObBitStream const&, char const*, long, int&) at ??:?
oceanbase::blocksstable::ObEncodeBlockGetReader::locate_row(oceanbase::blocksstable::ObDatumRowkey const&, oceanbase::blocksstable::ObStorageDatumUtils const&, char const*&, long&, long&, bool&, oceanbase::blocksstable::ObDatumRow&) at ??:?
oceanbase::blocksstable::ObEncodeBlockGetReader::get_row(oceanbase::blocksstable::ObMicroBlockData const&, oceanbase::blocksstable::ObDatumRowkey const&, oceanbase::storage::ObTableReadInfo const&, oceanbase::blocksstable::ObDatumRow&) at ??:?
oceanbase::blocksstable::ObMicroBlockRowGetter::inner_get_row(oceanbase::blocksstable::MacroBlockId const&, oceanbase::blocksstable::ObDatumRowkey const&, oceanbase::blocksstable::ObMicroBlockData const&, oceanbase::blocksstable::ObDatumRow const*&) at ??:?
oceanbase::blocksstable::ObMicroBlockRowGetter::get_block_row(oceanbase::storage::ObSSTableReadHandle&, oceanbase::blocksstable::ObMacroBlockReader&, oceanbase::blocksstable::ObDatumRow const*&) at ??:?
oceanbase::blocksstable::ObMicroBlockRowGetter::get_row(oceanbase::storage::ObSSTableReadHandle&, oceanbase::blocksstable::ObDatumRow const*&, oceanbase::blocksstable::ObMacroBlockReader&) at ??:?
oceanbase::storage::ObSSTableRowGetter::fetch_row(oceanbase::storage::ObSSTableReadHandle&, oceanbase::blocksstable::ObDatumRow const*&) at ??:?
oceanbase::storage::ObSSTableRowGetter::inner_get_next_row(oceanbase::blocksstable::ObDatumRow const*&) at ??:?
oceanbase::storage::ObSingleMerge::get_table_row(long, oceanbase::common::ObIArray<oceanbase::storage::ObITable*> const&, oceanbase::blocksstable::ObDatumRow&, bool&, bool&) at ??:?
oceanbase::storage::ObMultipleMerge::get_next_row(oceanbase::blocksstable::ObDatumRow*&) at ??:?
oceanbase::storage::ObMultipleMerge::get_next_rows(long&, long) at ??:?
oceanbase::storage::ObTableScanIterator::get_next_rows(long&, long) at ??:?
oceanbase::sql::ObLocalIndexLookupOp::get_next_rows(long&, long) at ??:?
oceanbase::sql::DASOpResultIter::get_next_rows(long&, long) at ??:?
oceanbase::sql::ObTableScanOp::get_next_batch_with_das(long&, long) at ??:?
oceanbase::sql::ObTableScanOp::inner_get_next_batch(long) at ??:?
oceanbase::sql::ObOperator::get_next_batch(long, oceanbase::sql::ObBatchRows const*&) at ??:?
oceanbase::sql::ObSubPlanScanOp::inner_get_next_batch(long) at ??:?
oceanbase::sql::ObOperator::get_next_batch(long, oceanbase::sql::ObBatchRows const*&) at ??:?
oceanbase::sql::ObNestedLoopJoinOp::calc_right_batch_matched_result(long, bool&, oceanbase::sql::ObEvalCtx::BatchInfoScopeGuard&) at ??:?
oceanbase::sql::ObNestedLoopJoinOp::process_left_batch() at ??:?
oceanbase::sql::ObNestedLoopJoinOp::inner_get_next_batch(long) at ??:?
oceanbase::sql::ObOperator::get_next_batch(long, oceanbase::sql::ObBatchRows const*&) at ??:?
oceanbase::sql::ObSubPlanScanOp::inner_get_next_batch(long) at ??:?
oceanbase::sql::ObOperator::get_next_batch(long, oceanbase::sql::ObBatchRows const*&) at ??:?
oceanbase::sql::ObBatchRowIter::get_next_row(oceanbase::sql::ObEvalCtx&, oceanbase::sql::ObOpSpec const&) at ??:?
oceanbase::sql::ObOperator::get_next_row_vectorizely() at ??:?
oceanbase::sql::ObOperator::get_next_row() at ??:?
oceanbase::sql::ObTableModifyOp::inner_get_next_row() at ??:?
oceanbase::sql::ObOperator::get_next_row() at ??:?
oceanbase::sql::ObExecuteResult::get_next_row(oceanbase::sql::ObExecContext&, oceanbase::common::ObNewRow const*&) at ??:?
oceanbase::sql::ObResultSet::open_result() at ??:?
oceanbase::sql::ObResultSet::open() at ??:?
oceanbase::observer::ObAsyncPlanDriver::response_result(oceanbase::observer::ObMySQLResultSet&) at ??:?
oceanbase::observer::ObMPQuery::process_single_stmt(oceanbase::sql::ObMultiStmtItem const&, oceanbase::sql::ObSQLSessionInfo&, bool, bool, bool&, bool&) at ??:?
oceanbase::observer::ObMPQuery::process() at ??:?
oceanbase::rpc::frame::ObSqlProcessor::run() at ??:?
oceanbase::omt::ObWorkerProcessor::process(oceanbase::rpc::ObRequest&) at ??:?
oceanbase::omt::ObThWorker::process_request(oceanbase::rpc::ObRequest&) at ??:?
oceanbase::omt::ObThWorker::worker(long&, long&, int&) at ??:?
non-virtual thunk to oceanbase::omt::ObThWorker::run(long) at ??:?
_ZNSt17_Function_handlerIFvvEZN9oceanbase3lib7Threads5startEvE5$_144E9_M_invokeERKSt9_Any_data.c88c60a6397cb895dbc33a47bd3cf19d at 0_cxx.cxx:?
oceanbase::lib::thread::__th_start(void*) at ??:?
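In the output above, the first several frames belong to the logger itself (lbt, ObLogger::..., OB_PRINT); the frame where the error was actually raised comes right after them. The following Python sketch builds the same addr2line command and trims those leading logger frames. The helper names and the LOGGING_MARKERS list are assumptions derived from this one stack, not an exhaustive list of logger symbols.

```python
import shlex
import shutil
import subprocess

# Substrings that identify logger frames in the stack shown above (assumed list).
LOGGING_MARKERS = (
    "oceanbase::common::lbt()",
    "oceanbase::common::ObLogger::",
    "oceanbase::common::OB_PRINT",
)


def addr2line_argv(binary, addresses):
    """Build the argv for: addr2line -pCfe <binary> <addr> ..."""
    return ["addr2line", "-pCfe", binary, *addresses]


def symbolize(binary, addresses):
    """Run addr2line and return one output line per resolved frame."""
    if shutil.which("addr2line") is None:
        raise RuntimeError("addr2line not found on PATH")
    result = subprocess.run(
        addr2line_argv(binary, addresses),
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


def drop_logging_frames(frames):
    """Skip the leading frames that belong to the logger, keep the rest."""
    i = 0
    while i < len(frames) and any(m in frames[i] for m in LOGGING_MARKERS):
        i += 1
    return frames[i:]


if __name__ == "__main__":
    argv = addr2line_argv("/data/oceanbase/bin/observer", ["0xb61bbbb", "0xb60d4f6"])
    print(shlex.join(argv))
```

Applied to the stack above, drop_logging_frames would leave the nonext_ext_compare frames at the top, which matches the function name printed in the log entry.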

select last_trace_id();
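A caveat worth spelling out: the query above is only useful if it runs on the same session as the failed statement, immediately after it. A minimal Python sketch of that pattern follows, assuming a PEP 249 (DB-API) connection from whatever MySQL-protocol driver you use, and assuming last_trace_id() reports the previous statement on that session. The function name fetch_last_trace_id is made up for illustration.

```python
def fetch_last_trace_id(conn):
    """Return the trace id of the previous statement on this session, or None.

    `conn` is assumed to be a PEP 249 connection; it must be the same
    connection on which the failing statement was executed.
    """
    cur = conn.cursor()
    try:
        cur.execute("select last_trace_id()")
        row = cur.fetchone()
    finally:
        cur.close()
    return row[0] if row else None


def run_and_report(conn, sql):
    """Execute sql; on failure, return (error, trace id) for log lookup."""
    cur = conn.cursor()
    try:
        cur.execute(sql)
        return None, None
    except Exception as exc:  # driver-specific error classes vary
        return exc, fetch_last_trace_id(conn)
    finally:
        cur.close()
```

With a connection pool or a proxy in between, the follow-up query may not land on the same session, in which case the proxy or observer logs are the more reliable source for the trace id.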

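If you connect to the database through a proxy, check obproxy_error.log in the proxy log directory. It contains the trace id of the corresponding SQL (the long string starting with Y). Then go to the observer log directory on the node that executed the SQL and search for the related error logs by that trace id.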

Thanks, found them.
1. System error has stack information, which is the stack provided above.
2. The Entry already exist error has no stack, only the log below.

grep '<trace-id>' observer.log*
Grep across all the log files, replacing <trace-id> with the actual trace id.
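If you would rather collect the matches programmatically (for example, to attach them to a post), here is a rough Python equivalent of that grep. It is only a sketch: the function name grep_trace_id is made up, and the observer.log* pattern simply mirrors the command above.

```python
import glob
import os


def grep_trace_id(log_dir, trace_id, pattern="observer.log*"):
    """Yield (file name, line) for every log line that contains trace_id.

    Plain substring match, like `grep -F`; files are visited in name order.
    """
    for path in sorted(glob.glob(os.path.join(log_dir, pattern))):
        # errors="replace" keeps the scan going if a log has odd bytes in it.
        with open(path, encoding="utf-8", errors="replace") as fh:
            for line in fh:
                if trace_id in line:
                    yield os.path.basename(path), line.rstrip("\n")


if __name__ == "__main__":
    # Hypothetical log directory; adjust to your deployment.
    for name, line in grep_trace_id("/data/oceanbase/log",
                                    "YB420A650079-0005F6EDB3E604EE-0-0"):
        print(f"{name}: {line}")
```

Run it on the node that executed the SQL, since each observer only logs the statements it handled.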

1. Trace id for System error: YB420A650079-0005F6EDAB55EF67-0-0

2. Trace id for Entry already exist: YB420A650079-0005F6EDB3E604EE-0-0
The matching log lines are in the attachment below.
trace.log (183.6 KB)

Hi, could you share the SQL that reproduces this? We will try to reproduce it internally.

The table schema and the SQL are sensitive. How can I contact you privately to provide them for troubleshooting?

I've sent you my DingTalk ID in a private message.