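【Environment】Production or test environment
【OB or other component】
【Version】
【Problem description】The service was deployed with obd on a Kylin V10 virtual machine. It reports the errors below, after which queries time out.
【Steps to reproduce】Operations performed before and after the problem appeared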
【Attachments and logs】
[2024-09-29 18:02:22.243731] ERROR detect_palf_hang_failure_ (ob_failure_detector.cpp:356) [634204][T1_Occam][T1][Y0-0000000000000000-0-0] [lt=19][errcode=-4392] disk is hung(msg=“clog disk may be hung, add failure event”, clog_disk_hang_event={type:PROCESS HANG, module:LOG, info:clog disk hang, sen: 0, level:FATAL})
[2024-09-29 18:02:22.370330] ERROR issue_dba_error (ob_log.cpp:1875) [634187][OmtNodeBalancer][T1][YB42AC106E7B-0006233EC1F9FF57-0-0] [lt=40][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-4009, file=“ob_tx_data_functor.cpp”, line_no=391, info=“unexpected io error”)
[2024-09-29 18:02:22.370505] WDIAG [STORAGE.TRANS] check_with_tx_data (ob_trans_part_ctx.cpp:5951) [634187][OmtNodeBalancer][T1][YB42AC106E7B-0006233EC1F9FF57-0-0] [lt=110][errcode=-4009] do data check function fail.(ret=-4009, ret=“OB_IO_ERROR”, *this={this:0x14cb3726b250, trans_id:{txid:12246}, tenant_id:1, is_exiting:false, trans_expired_time:1727604171773576, cluster_version:17180000520, trans_need_wait_wrap:{receive_gts_ts_:[mts=0], need_wait_interval_us:0}, stc:[mts=1727604141774837], ctx_create_time:1727604141773591})
[2024-09-29 18:02:22.370538] WDIAG [STORAGE.TRANS] check_with_tx_data (ob_trans_ctx_mgr_v4.cpp:1378) [634187][OmtNodeBalancer][T1][YB42AC106E7B-0006233EC1F9FF57-0-0] [lt=32][errcode=-4009] failed to check tx status(ret=-4009, ret=“OB_IO_ERROR”)
[2024-09-29 18:02:22.370551] WDIAG [STORAGE.TRANS] check_with_tx_data (ob_tx_ctx_table.cpp:352) [634187][OmtNodeBalancer][T1][YB42AC106E7B-0006233EC1F9FF57-0-0] [lt=12][errcode=-4009] check with tx data failed(ret=-4009, ret=“OB_IO_ERROR”, tx_id={txid:12246})
[2024-09-29 18:02:22.370563] WDIAG [STORAGE] check_with_tx_data (ob_tx_table.cpp:702) [634187][OmtNodeBalancer][T1][YB42AC106E7B-0006233EC1F9FF57-0-0] [lt=11][errcode=-4009] check tx data in tables failed(ret=-4009, ret=“OB_IO_ERROR”, ls_id_={id:1}, read_tx_data_arg={tx_id:{txid:12246}, read_epoch:0, tx_data_mini_cache:})
[2024-09-29 18:02:22.371549] WDIAG [SERVER] after_func (ob_query_retry_ctrl.cpp:947) [634187][OmtNodeBalancer][T1][YB42AC106E7B-0006233EC1F9FF57-0-0] [lt=15][errcode=-4009] [RETRY] check if need retry(v={force_local_retry:true, stmt_retry_times:0, local_retry_times:0, err_:-4009, err_:“OB_IO_ERROR”, retry_type:0, client_ret:-4009}, need_retry=false, THIS_WORKER.can_retry()=false, v.ctx_.multi_stmt_item_={is_part_of_multi_stmt:false, seq_num:0, sql:"", batched_queries:NULL, is_ps_mode:false, ab_cnt:0})
[2024-09-29 18:02:23.487042] ERROR issue_dba_error (ob_log.cpp:1875) [634577][TimezoneMgr][T1][YB42AC106E7B-0006233EC5E9FF4C-0-0] [lt=92][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-4009, file=“ob_tx_data_functor.cpp”, line_no=391, info=“unexpected io error”)
[2024-09-29 18:02:23.487197] WDIAG [STORAGE.TRANS] check_with_tx_data (ob_trans_part_ctx.cpp:5951) [634577][TimezoneMgr][T1][YB42AC106E7B-0006233EC5E9FF4C-0-0] [lt=105][errcode=-4009] do data check function fail.(ret=-4009, ret=“OB_IO_ERROR”, *this={this:0x14cb3726b250, trans_id:{txid:12246}, tenant_id:1, is_exiting:false, trans_expired_time:1727604171773576, cluster_version:17180000520, trans_need_wait_wrap:{receive_gts_ts_:[mts=0], need_wait_interval_us:0}, stc:[mts=1727604141774837], ctx_create_time:1727604141773591})
[2024-09-29 18:02:23.487234] WDIAG [STORAGE.TRANS] check_with_tx_data (ob_trans_ctx_mgr_v4.cpp:1378) [634577][TimezoneMgr][T1][YB42AC106E7B-0006233EC5E9FF4C-0-0] [lt=36][errcode=-4009] failed to check tx status(ret=-4009, ret=“OB_IO_ERROR”)
[2024-09-29 18:02:23.487246] WDIAG [STORAGE.TRANS] check_with_tx_data (ob_tx_ctx_table.cpp:352) [634577][TimezoneMgr][T1][YB42AC106E7B-0006233EC5E9FF4C-0-0] [lt=10][errcode=-4009] check with tx data failed(ret=-4009, ret=“OB_IO_ERROR”, tx_id={txid:12246})
[2024-09-29 18:02:23.487257] WDIAG [STORAGE] check_with_tx_data (ob_tx_table.cpp:702) [634577][TimezoneMgr][T1][YB42AC106E7B-0006233EC5E9FF4C-0-0] [lt=10][errcode=-4009] check tx data in tables failed(ret=-4009, ret=“OB_IO_ERROR”, ls_id_={id:1}, read_tx_data_arg={tx_id:{txid:12246}, read_epoch:0, tx_data_mini_cache:})
[2024-09-29 18:02:23.488113] WDIAG [SERVER] after_func (ob_query_retry_ctrl.cpp:947) [634577][TimezoneMgr][T1][YB42AC106E7B-0006233EC5E9FF4C-0-0] [lt=15][errcode=-4009] [RETRY] check if need retry(v={force_local_retry:true, stmt_retry_times:0, local_retry_times:0, err_:-4009, err_:“OB_IO_ERROR”, retry_type:0, client_ret:-4009}, need_retry=false, THIS_WORKER.can_retry()=false, v.ctx_.multi_stmt_item_={is_part_of_multi_stmt:false, seq_num:0, sql:"", batched_queries:NULL, is_ps_mode:false, ab_cnt:0})