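observer version: 3.1.4 Community Edition
Cluster architecture: 20-20-20 x86 cluster
Environment: production
Problem description:
The application reports an error when connecting to the cluster through obproxy. It looks like a connection timeout. The error is as follows:
Judging from the error, a clog timeout appears to have dropped the connection between obproxy and the observer leader node, after which the connection between obproxy and the application was dropped as well. Is there a way to resolve this error?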
We recommend upgrading OB to the latest version. Please also post the obproxy.log and obproxy_digest.log logs.
This error shouldn't have anything to do with the obproxy version, should it? It was caused by a leader switch.
[2022-12-31 16:32:43.112435] ERROR [ELECT] leader_revoke (ob_election.cpp:2151) [42767][1849][Y0-0000000000000000] [lt=13] [dc=0] leader_revoke, please attention!(revoke reason=“clog sliding_window timeout”, election={partition:{tid:1102810162659554,
In addition, there are these:
observer.log.20250826052938:[2025-08-26 05:29:24.978530] ERROR [CLOG] is_reconfirm_role_change_or_sync_timeout_ (ob_log_state_mgr.cpp:2050) [114103][4043][Y0-0000000000000000] [lt=62] [dc=0] is_reconfirm_role_change_or_sync_timeout_(partition_key={tid:1099511627910, partition_id:10, part_cnt:0}, role=1, now=1756157364978525, last_check_start_id_time_=1756157354972813, max_log_id=24862, start_id=24862, is_wait_replay=false) BACKTRACE:0x9ab149e 0x98853c1 0x229fa40 0x229f65b 0x229f2a4 0x784a17e 0x776d290 0x776b2f5 0x77b1ee4 0x75d1ec4 0x8ba5d47 0x2c9f684 0x2ca1fb2 0x9839025 0x9837a12 0x98344cf
observer.log.20250826052938:[2025-08-26 05:29:33.516595] WARN [CLOG] need_update_leader_ (ob_log_state_mgr.cpp:2499) [114103][4043][Y0-0000000000000000] [lt=18] [dc=0] get_elect_leader_ failed, leader_ is valid, need update(ret=-7006, partition_key={tid:1099511627910, partition_id:10, part_cnt:0}, self=“19.112.31.51:2882”, leader_={server:“19.112.31.51:2882”, cluster_id:1}, bool_ret=true)
observer.log.20250826052952:[2025-08-26 05:29:40.041591] WARN [CLOG] ack_log (ob_partition_log_service.cpp:1990) [15010][3684][YB4213701F25-0005F949EFDB0D93] [lt=11] [dc=0] can not ack log(partition_key={tid:1099511627910, partition_id:10, part_cnt:0}, role=1, state=3, current_proposal_id={time_to_usec:1756157375973282, server:“19.112.31.51:2882”}, server=“19.112.31.37:2882”, self=“19.112.31.51:2882”, log_id=24862, proposal_id={time_to_usec:1681239099421961, server:“19.112.31.51:2882”})
observer.log.20250826052952:[2025-08-26 05:29:40.188891] WARN [CLOG] get_log_with_cursor_ (ob_partition_log_service.cpp:3061) [13708][1358][Y0-0000000000000000] [lt=10] [dc=0] in reconfirm state, not allow send unconfirmed log with old proposal_id(partition_key={tid:1099511627910, partition_id:10, part_cnt:0}, server=“19.112.31.37:2882”, state_mgr proposal_id={time_to_usec:1756157375973282, server:“19.112.31.51:2882”}, reconfirm proposal_id={time_to_usec:1756157375973282, server:“19.112.31.51:2882”}, log_proposal_id={time_to_usec:1681239099421961, server:“19.112.31.51:2882”})
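What you are seeing is a CLOG sliding window timeout. The partition was left without a leader, which caused the application connections to be dropped.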
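
To put numbers on the timeout, the timestamp fields in the log lines above can be decoded. The following is a minimal Python sketch, not an OceanBase tool. It assumes that `now`, `last_check_start_id_time_` and `time_to_usec` are microseconds since the Unix epoch, and that the server clock is UTC+8 (an assumption, chosen because it makes `now` line up with the log line's own timestamp). The helper name `us_to_local` is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Values copied from the is_reconfirm_role_change_or_sync_timeout_ line above.
now_us = 1756157364978525
last_check_start_id_time_us = 1756157354972813

# proposal_id.time_to_usec values copied from the ack_log line above.
current_proposal_us = 1756157375973282
log_proposal_us = 1681239099421961

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def us_to_local(us, utc_offset_hours=8):
    """Convert epoch microseconds to a datetime (UTC+8 is an assumption)."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    # Integer timedelta arithmetic keeps microsecond precision exact.
    return (EPOCH + timedelta(microseconds=us)).astimezone(tz)


# How long start_id had stayed at 24862 when the ERROR was logged.
stalled_us = now_us - last_check_start_id_time_us
print(f"start_id unchanged for {stalled_us / 1_000_000:.3f} s")

# When each proposal_id was generated.
print("error logged at     :", us_to_local(now_us))
print("current proposal_id :", us_to_local(current_proposal_us))
print("log's proposal_id   :", us_to_local(log_proposal_us))
```

Under these assumptions, `start_id` had been stuck at 24862 for roughly 10 seconds when the ERROR was logged, and the log entry that could not be acked still carries a proposal_id generated in April 2023, while the current proposal_id was generated about 11 seconds after the ERROR.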