OB集群节点其中一台报错update logic clock fail

【 使用环境 】生产环境
【 OB or 其他组件 】OB节点
【 使用版本 】4.2.1.0
【问题描述】
该集群其中一个节点报错update logic clock fail
具体该时间段的observer.log日志如下
[2025-11-11 05:15:31.057114] WDIAG [RPC] handle_resp (ob_poc_rpc_proxy.cpp:143) [3403][pnio1][T0][YB42AC154B1F-0006415535DF20A8-0-0] [lt=0][errcode=0] ucb.process fail(tmp_ret=-4002)
[2025-11-11 05:15:31.066242] INFO [PALF] inner_append_log (palf_handle_impl.cpp:2126) [4074][T1012_IOWorker][T1012][Y0-0000000000000000-0-0] [lt=10] [PALF STAT INNER APPEND LOG SIZE](this={palf_id:1005, self:“172.21.75.35:2882”, has_set_deleted:false}, accum_size=6616371)
[2025-11-11 05:15:31.070665] INFO [STORAGE.TRANS] rollback (ob_memtable_context.cpp:848) [1230192][T1012_L0_G0][T1012][YB42AC154B21-000630D6A1ECE7B2-0-0] [lt=36] memtable handle rollback to successfuly(from_seq_no=4039380184692263, to_seq_no=4039380184691770, *this={ObIMvccCtx={alloc_type=0 ctx_descriptor=0 min_table_version=1715669542593752 max_table_version=1715669542593752 trans_version={val:4611686018427387903, v:0} commit_version={val:0, v:0} lock_wait_start_ts=0 replay_compact_version={val:0, v:0}} end_code=0 tx_status=0 is_readonly=false ref=0 trans_id={txid:569993490} ls_id=1005 callback_alloc_count=2242 callback_free_count=0 checksum=0 tmp_checksum=2803171025 checksum_scn={val:1762809330853622282, v:0} redo_filled_count=2237 redo_sync_succ_count=2237 redo_sync_fail_count=0 main_list_length=2243 unsynced_cnt=6 unsubmitted_cnt_=6 cb_statistics:[main=2244, slave=0, merge=0, tx_end=0, rollback_to=1, fast_commit=0, remove_memtable=0]})
[2025-11-11 05:15:31.070936] WDIAG [RPC] handle_resp (ob_poc_rpc_proxy.cpp:143) [3407][pnio1][T0][YB42AC154B21-000630D6A1ECE7B2-0-0] [lt=0][errcode=0] ucb.process fail(tmp_ret=-4002)
[2025-11-11 05:15:31.073891] INFO [STORAGE] scheduler_ls_ha_handler_ (ob_storage_ha_service.cpp:188) [4032][T1011_HAService][T1011][Y0-0000000000000000-0-0] [lt=19] start do ls ha handler(ls_id_array_=[])
[2025-11-11 05:15:31.074702] INFO [SQL.ENG] load_batch_insert_up_rows (ob_table_insert_up_op.cpp:929) [663566][T1012_L0_G0][T1012][YB42AC154B23-00063CB04876D177-0-0] [lt=68] insert up rows used memory over limit(ret=0, row_cnt=96, insert_rows=96)
[2025-11-11 05:15:31.074945] INFO [STORAGE.TRANS] rollback (ob_memtable_context.cpp:848) [2259612][T1012_L0_G0][T1012][YB42AC154B22-00063114AD1E2B66-0-0] [lt=36] memtable handle rollback to successfuly(from_seq_no=4039380184692302, to_seq_no=4039380184692179, *this={ObIMvccCtx={alloc_type=0 ctx_descriptor=0 min_table_version=1727228603663776 max_table_version=1727228603663776 trans_version={val:4611686018427387903, v:0} commit_version={val:0, v:0} lock_wait_start_ts=0 replay_compact_version={val:0, v:0}} end_code=0 tx_status=0 is_readonly=false ref=0 trans_id={txid:570454829} ls_id=1005 callback_alloc_count=6666 callback_free_count=1744 checksum=0 tmp_checksum=1664515129 checksum_scn={val:1762809330919788641, v:0} redo_filled_count=6616 redo_sync_succ_count=6616 redo_sync_fail_count=0 main_list_length=4922 unsynced_cnt=10 unsubmitted_cnt_=10 cb_statistics:[main=6678, slave=0, merge=0, tx_end=0, rollback_to=57, fast_commit=1699, remove_memtable=0]})
[2025-11-11 05:15:31.075226] WDIAG [RPC] handle_resp (ob_poc_rpc_proxy.cpp:143) [3404][pnio1][T0][YB42AC154B22-00063114AD1E2B66-0-0] [lt=0][errcode=0] ucb.process fail(tmp_ret=-4002)
[2025-11-11 05:15:31.075438] INFO [COMMON] replace_fragment_node (ob_kvcache_map.cpp:697) [3370][KVCacheRep][T0][Y0-0000000000000000-0-0] [lt=15] Cache replace map node details(ret=0, replace_node_count=1370, replace_time=8780, replace_start_pos=9800000, replace_num=100000)
[2025-11-11 05:15:31.075626] INFO [COMMON] replace_map (ob_kv_storecache.cpp:743) [3370][KVCacheRep][T0][Y0-0000000000000000-0-0] [lt=18] replace map num details(ret=0, replace_node_count=1370, map_once_replace_num_=100000, map_replace_skip_count_=0)
[2025-11-11 05:15:31.075768] INFO [STORAGE.TRANS] rollback (ob_memtable_context.cpp:848) [2667333][T1012_L0_G0][T1012][YB42AC154B24-000641A0318FB02B-0-0] [lt=13] memtable handle rollback to successfuly(from_seq_no=4039380184692310, to_seq_no=4039380184691870, *this={ObIMvccCtx={alloc_type=0 ctx_descriptor=0 min_table_version=1715669542593752 max_table_version=1715669542593752 trans_version={val:4611686018427387903, v:0} commit_version={val:0, v:0} lock_wait_start_ts=0 replay_compact_version={val:0, v:0}} end_code=0 tx_status=0 is_readonly=false ref=0 trans_id={txid:570169721} ls_id=1005 callback_alloc_count=1286 callback_free_count=0 checksum=0 tmp_checksum=4198144802 checksum_scn={val:1762809330853622285, v:0} redo_filled_count=1280 redo_sync_succ_count=1280 redo_sync_fail_count=0 main_list_length=1287 unsynced_cnt=7 unsubmitted_cnt_=7 cb_statistics:[main=1288, slave=0, merge=0, tx_end=0, rollback_to=1, fast_commit=0, remove_memtable=0]})
[2025-11-11 05:15:31.076066] WDIAG [RPC] handle_resp (ob_poc_rpc_proxy.cpp:143) [3407][pnio1][T0][YB42AC154B24-000641A0318FB02B-0-0] [lt=1][errcode=0] ucb.process fail(tmp_ret=-4002)
[2025-11-11 05:15:31.079261] INFO [SQL.ENG] load_batch_insert_up_rows (ob_table_insert_up_op.cpp:929) [86745][T1012_L0_G0][T1012][YB42AC154B23-00063CB04B09816A-0-0] [lt=68] insert up rows used memory over limit(ret=0, row_cnt=99, insert_rows=99)
[2025-11-11 05:15:31.082824] WDIAG [STORAGE.TRANS] update_logic_clock_ (ob_tx_free_route.cpp:486) [2259612][T1012_L0_G0][T1012][YB42AC154B23-00063CB051275C7C-0-0] [lt=34][errcode=-4016] logic clock is fast more than 1 day(ret=-4016, ret=“OB_ERR_UNEXPECTED”, logic_clock=4039380184692343)
[2025-11-11 05:15:31.082846] ERROR issue_dba_error (ob_log.cpp:1841) [2259612][T1012_L0_G0][T1012][YB42AC154B23-00063CB051275C7C-0-0] [lt=21][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-4016, file=“ob_tx_free_route.cpp”, line_no=532, info=“update logic clock fail”)
[2025-11-11 05:15:31.082871] EDIAG [STORAGE.TRANS] txn_free_route__update_dynamic_state (ob_tx_free_route.cpp:532) [2259612][T1012_L0_G0][T1012][YB42AC154B23-00063CB051275C7C-0-0] [lt=25][errcode=-4016] update logic clock fail(ret=-4016) BACKTRACE:0x11652425 0x68ee8ce 0x4c86804 0x68ee624 0x68e7601 0xe612ea4 0xe612449 0xa689cef 0xa10b2a5 0x98fc43d 0x4886261 0x487e51b 0x48799c4 0x8ea6044 0x116685bf 0x7f446a3bd1ca 0x7f446a029e73
[2025-11-11 05:15:31.083045] WDIAG [STORAGE.TRANS] txn_free_route__update_dynamic_state (ob_tx_free_route.cpp:551) [2259612][T1012_L0_G0][T1012][YB42AC154B23-00063CB051275C7C-0-0] [lt=172][errcode=-4016] [tx-free-route::update_state](ret=-4016, header={compat_ver:1, tx_id:{txid:570005290}, global_version:6611, flag:{v:-128}, backend_sess_id:3221710787}, logic_clock=4039380184692343, session_id=3222642888, ctx={tx_id:{txid:570384055}, txn_addr:“172.21.75.35:2882”, is_proxy_support:true, in_txn_before_handle_request:true, can_free_route:false, is_fallbacked:false, static_changed:true, dynamic_changed:true, parts_changed:true, extra_changed:true, calculated:true, flag:{v:1}, local_version:1619, global_version:6566, global_version_water_mark:6566, state_sync_infos:{[{last_backend_sess_id:3221710787, last_version:6566}, {last_backend_sess_id:3221710787, last_version:5757}, {last_backend_sess_id:3221710787, last_version:5757}, {last_backend_sess_id:3221710787, last_version:5757}, ]}, audit_record:268436736}, tx=0x7f435ee7a9e0)
[2025-11-11 05:15:31.083094] WDIAG [SQL.EXE] update_txn_dynamic_state (ob_sql_trans_control.cpp:1461) [2259612][T1012_L0_G0][T1012][YB42AC154B23-00063CB051275C7C-0-0] [lt=48][errcode=-4016] fail to exec txs->txn_free_route__update_dynamic_state(session.get_sessid(), tx_desc, session.get_txn_free_route_ctx(), buf, len, pos)(ret=-4016, session={this:0x7f43db266c18, id:3222642888, deser:false, tenant:“dps”, tenant_id:1012, effective_tenant:“dps”, effective_tenant_id:1012, database:“tdp”, user:“dtstdp@%”, consistency_level:3, session_state:1, autocommit:false, tx:0x7f435ee7a9e0})
[2025-11-11 05:15:31.083121] WDIAG [SQL.EXE] update_txn_dynamic_state (ob_sql_trans_control.cpp:1461) [2259612][T1012_L0_G0][T1012][YB42AC154B23-00063CB051275C7C-0-0] [lt=27][errcode=-4016] update txn state fail(ret=-4016, state=“dynamic”, session.get_txn_free_route_ctx()={tx_id:{txid:570384055}, txn_addr:“172.21.75.35:2882”, is_proxy_support:true, in_txn_before_handle_request:true, can_free_route:false, is_fallbacked:false, static_changed:true, dynamic_changed:true, parts_changed:true, extra_changed:true, calculated:true, flag:{v:1}, local_version:1619, global_version:6566, global_version_water_mark:6566, state_sync_infos:{[{last_backend_sess_id:3221710787, last_version:6566}, {last_backend_sess_id:3221710787, last_version:5757}, {last_backend_sess_id:3221710787, last_version:5757}, {last_backend_sess_id:3221710787, last_version:5757}, ]}, audit_record:268436736}, session={this:0x7f43db266c18, id:3222642888, deser:false, tenant:“dps”, tenant_id:1012, effective_tenant:“dps”, effective_tenant_id:1012, database:“tdp”, user:“dtstdp@%”, consistency_level:3, session_state:1, autocommit:false, tx:0x7f435ee7a9e0}, prev_tx_id={txid:570005290}, tx_desc={this:0x7f435ee7a9e0, tx_id:{txid:570005290}, state:1, addr:“172.21.75.31:2882”, tenant_id:1012, session_id:3221710787, assoc_session_id:3222642888, xid:NULL, xa_mode:"", xa_start_addr:“0.0.0.0:0”, access_mode:0, tx_consistency_type:0, isolation:1, snapshot_version:{val:18446744073709551615, v:3}, snapshot_scn:0, active_scn:0, op_sn:1, alloc_ts:1762809297368869, active_ts:-1, commit_ts:-1, finish_ts:-1, timeout_us:86400000000, lock_timeout_us:-1, expire_ts:1762895729008055, coord_id:{id:-1}, parts:[], exec_info_reap_ts:0, commit_version:{val:18446744073709551615, v:3}, commit_times:0, commit_cb:null, cluster_id:1, cluster_version:17180000512, flags_.SHADOW:false, flags_.INTERRUPTED:false, flags_.BLOCK:false, flags_.REPLICA:true, can_elr:false, cflict_txs:[], abort_cause:0, commit_expire_ts:-1, commit_task_.is_registered():false, ref:2})
[2025-11-11 05:15:31.083199] WDIAG [SQL.SESSION] update_sess_sync_info (ob_sql_session_info.cpp:3013) [2259612][T1012_L0_G0][T1012][YB42AC154B23-00063CB051275C7C-0-0] [lt=75][errcode=-4016] failed to deserialize sess sync info(ret=-4016, sess_sync_info_type=6, buf=“data_size:171, data:0500690000000000000021F9972A00000000000019A680001801C00767C3F407018082888040018B80808000049F96D5E00A000000C216018580808000AAAEE68F020185808080000000000001010186808080000400000000000100C3CF9D800C80C0DDEEC102B7BBD0AD82EB90030600360000000000000021F9972A00000000000019D380001801C00767C3000E59CB7D057677B9010300B7FBF2BEC0E89003DBED94E8B7B996070001”, length=171, pos=149)
[2025-11-11 05:15:31.083214] WDIAG [SERVER] sync_session_info (obmp_utils.cpp:206) [2259612][T1012_L0_G0][T1012][YB42AC154B23-00063CB051275C7C-0-0] [lt=16][errcode=-4016] failed to update txn session sync info(ret=-4016, info_type=6, sess.get_sessid()=3222642888, succ_info_types=[32], info.pos=149, info.len=54)
[2025-11-11 05:15:31.083229] WDIAG [SERVER] process_extra_info (obmp_base.cpp:595) [2259612][T1012_L0_G0][T1012][YB42AC154B23-00063CB051275C7C-0-0] [lt=14][errcode=-4016] fail to update sess info(ret=-4016)
[2025-11-11 05:15:31.083239] WDIAG [SERVER] process (obmp_query.cpp:189) [2259612][T1012_L0_G0][T1012][YB42AC154B23-00063CB051275C7C-0-0] [lt=9][errcode=-4016] fail get process extra info(ret=-4016)
[2025-11-11 05:15:31.083298] WDIAG [SERVER] force_disconnect (obmp_packet_sender.cpp:761) [2259612][T1012_L0_G0][T1012][YB42AC154B23-00063CB051275C7C-0-0] [lt=8][errcode=-4000] force disconnect(lbt()=“0x1101b73c 0x91da81f 0x91a5688 0x4887213 0x487e51b 0x48799c4 0x8ea6044 0x116685bf 0x7f446a3bd1ca 0x7f446a029e73”)
[2025-11-11 05:15:31.083360] WDIAG [SERVER] disconnect (obmp_packet_sender.cpp:775) [2259612][T1012_L0_G0][T1012][YB42AC154B23-00063CB051275C7C-0-0] [lt=12][errcode=0] server close connection(sessid=3222642888, proxy_sessid=12399900239660821775, stack=“0x1101b73c 0x91da9a0 0x91a5482 0x91a5690 0x4887213 0x487e51b 0x48799c4 0x8ea6044 0x116685bf 0x7f446a3bd1ca 0x7f446a029e73”)
[2025-11-11 05:15:31.083374] WDIAG [SERVER] process (obmp_query.cpp:385) [2259612][T1012_L0_G0][T1012][YB42AC154B23-00063CB051275C7C-0-0] [lt=12][errcode=-4016] disconnect connection(ret=-4016, ret=“OB_ERR_UNEXPECTED”)
[2025-11-11 05:15:31.083402] WDIAG [RPC.OBMYSQL] push_close_req (ob_sql_nio.cpp:748) [3559][sql_nio0][T0][Y0-0000000000000000-0-0] [lt=52][errcode=-4015] close sql sock by user req(*s={this:0x7f433b614d70, session_id:3222642888, trace_id:Y0-0000000000000000-0-0, sql_handling_stage:256, sql_initiative_shutdown:true, fd:213, err:5, last_decode_time:1762809331082772, last_write_time:1762809331083396, pending_write_task:{buf:null, sz:0}, need_epoll_trigger_write:false, consume_size:40374954, pending_flag:1, may_handling_flag:true, handler_close_flag:false})
[2025-11-11 05:15:31.083443] INFO [RPC.OBMYSQL] on_disconnect (obsm_conn_callback.cpp:268) [3559][sql_nio0][T0][Y0-0000000000000000-0-0] [lt=32] kill and revert session(conn.sessid_=3222642888, proxy_sessid=12399900239660821775, server_id=5, ret=0)
[2025-11-11 05:15:31.083459] INFO [RPC.OBMYSQL] handle_pending_destroy_list (ob_sql_nio.cpp:838) [3559][sql_nio0][T0][Y0-0000000000000000-0-0] [lt=16] can close safely, do destroy(*s={this:0x7f433b614d70, session_id:3222642888, trace_id:Y0-0000000000000000-0-0, sql_handling_stage:256, sql_initiative_shutdown:true, fd:213, err:5, last_decode_time:1762809331082772, last_write_time:1762809331083396, pending_write_task:{buf:null, sz:0}, need_epoll_trigger_write:false, consume_size:40374954, pending_flag:1, may_handling_flag:false, handler_close_flag:false})
[2025-11-11 05:15:31.083483] INFO [RPC.OBMYSQL] sm_conn_unlock_tenant (obsm_conn_callback.cpp:159) [3559][sql_nio0][T0][Y0-0000000000000000-0-0] [lt=16] unlock session of tenant(conn.sessid_=3222642888, proxy_sessid=12399900239660821775, conn.tenant_id_=1012)

[

4 个赞

日志太多,用obdiag工具排查下

3 个赞

麻烦检查下该节点的时钟是否与其他节点不一致。

2 个赞

太复杂了,不用工具不好看

好工具

可能原因
主机 NTP 时间同步不稳定或跳变。
系统时钟被手动调整,导致逻辑时钟连续性被破坏。
建议你优先考虑升级一下集群吧,421.0太低了

用obdiag分析一下看看吧

学习了