[Environment] Test environment
[OB or other component]
[Version] 5.7.25-OceanBase_CE-v4.3.1.0
In my test environment, create tenant fails with a timeout.
1. The create tenant statement I issued:
create tenant obuser primary_zone='zone1', resource_pool_list=('obuser_pool');
2. Look up the session and trace_id of the running statement:
select id,user,tenant,state,trace_id from v$ob_session where state='active'
Result:
id=3221488449
trace_id=YB42AC136BFB-000623CC0A7BE7D9-0-0
info:create tenant obuser primary_zone='zone1', resource_pool_list=('obuser_pool')
3. Using the returned id, query the wait event:
select sid,event,p1,p2,p3,wait_class,state,wait_time_micro from v$session_wait where sid=3221488449 \G
*************************** 1. row ***************************
sid: 3221488449
event: sync rpc
p1: 518
p2: 491
p3: 0
wait_class: NETWORK
state: WAITED KNOWN TIME
wait_time_micro: 258401
1 row in set (0.003 sec)
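Steps 2 and 3 can be chained from the shell so the session id does not have to be retyped by hand. This is a minimal sketch, assuming a MySQL-compatible client such as `mysql` or `obclient` is installed; the helper name `wait_event_sql` and the connection placeholders in the usage line are mine, not taken from my environment.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the step-3 wait-event query for one session id.
# Anything that is not a plain unsigned integer is rejected, so the id can be
# pasted straight from the v$ob_session output without quoting concerns.
wait_event_sql() {
  local sid="$1"
  case "$sid" in
    ''|*[!0-9]*)
      echo "session id must be numeric: '$sid'" >&2
      return 1
      ;;
  esac
  # \\G in the printf format yields a literal \G (vertical output in the client).
  printf 'select sid,event,p1,p2,p3,wait_class,state,wait_time_micro from v$session_wait where sid=%s\\G\n' "$sid"
}
```

Usage, with placeholder connection details: `mysql -h <host> -P <port> -u <user> -p -e "$(wait_event_sql 3221488449)"`.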
4. Search the syslog for the trace_id:
tail -f ./observer.log|grep "YB42AC136BFB-000623CC0A7BE7D9-0-0"
Excerpt of the output:
[2024-10-06 19:20:05.903005] INFO [RPC.OBMYSQL] print_session_info (ob_sql_nio.cpp:1160) [6970][sql_nio0][T0][Y0-0000000000000000-0-0] [lt=64] [sql nio session](*s={this:0x7dc146ffc030, session_id:3221488449, trace_id:YB42AC136BFB-000623CC0A7BE7D9-0-0, sql_handling_stage:11, sql_initiative_shutdown:false, reader:{fd:167}, err:0, last_decode_time:1728216219286256, pending_write_task:{buf:null, sz:0}, need_epoll_trigger_write:false, consume_size:1508, pending_flag:1, may_handling_flag:true, handler_close_flag:false})
[2024-10-06 19:20:19.286874] WDIAG [RPC] send (ob_poc_rpc_proxy.h:168) [7232][T1_L0_G0][T1][YB42AC136BFB-000623CC0A7BE7D9-0-0] [lt=127][errcode=-4012] sync rpc execute fail(ret=-4012, addr="172.19.107.251:2882", pcode=518, timeout=999999721)
[2024-10-06 19:20:19.286919] WDIAG [SQL.ENG] execute (ob_tenant_executor.cpp:94) [7232][T1_L0_G0][T1][YB42AC136BFB-000623CC0A7BE7D9-0-0] [lt=28][errcode=-4012] rpc proxy create tenant failed(ret=-4012)
[2024-10-06 19:20:19.286933] INFO [SQL.ENG] execute (ob_tenant_executor.cpp:107) [7232][T1_L0_G0][T1][YB42AC136BFB-000623CC0A7BE7D9-0-0] [lt=12] [CREATE TENANT] create tenant(ret=-4012, ret="OB_TIMEOUT", create_tenant_arg=tenant_schema:{tenant_id:18446744073709551615, schema_version:1, tenant_name:"obuser", zone_list:[cnt:0], primary_zone:"zone1", charset_type:2, locked:false, comment:"", name_case_mode:-1, read_only:false, locality_str:"", zone_replica_attr_array:[cnt:0], primary_zone_array:[], previous_locality_str:"", default_tablegroup_id:18446744073709551615, default_tablegroup_name:"", compatibility_mode:-1, drop_tenant_time:-1, status:0, in_recyclebin:false, arbitration_service_status:{status:3}}, pool_list:["obuser_pool"], if_not_exist:false, sys_var_list:[], name_case_mode:-1, is_restore:false, palf_base_info:{prev_log_info:{log_id:-1, lsn:{lsn:18446744073709551615}, scn:{val:18446744073709551615, v:3}, log_proposal_id:9223372036854775807, accum_checksum:-1}, curr_lsn:{lsn:18446744073709551615}}, recovery_until_scn:{val:0, v:0}, compatible_version:0, is_creating_standby:false, log_restore_source:"", is_tmp_tenant_for_recover:false, source_tenant_id:0, cost=999999522)
[2024-10-06 19:20:19.287063] INFO [SHARE] add_event (ob_event_history_table_operator.h:261) [7232][T1_L0_G0][T1][YB42AC136BFB-000623CC0A7BE7D9-0-0] [lt=52] event table add task(ret=0, event_table_name="_all_server_event_history", sql=INSERT INTO all_server_event_history (gmt_create, module, event, name1, value1, name2, value2, name3, value3, name4, value4, value5, value6, svr_ip, svr_port) VALUES (usec_to_time(1728217219286989), 'sql', 'execute_cmd', 'cmd_type', 8, 'sql_text', X'6372656174652074656E616E74206F6275736572207072696D6172795F7A6F6E653D277A6F6E6531272C207265736F757263655F706F6F6C5F6C6973743D28276F62757365725F706F6F6C2729', 'return_code', -4012, 'tenant_id', 1, '', '', '172.19.107.251', 2882))
[2024-10-06 19:20:19.287084] WDIAG [SQL] open_cmd (ob_result_set.cpp:102) [7232][T1_L0_G0][T1][YB42AC136BFB-000623CC0A7BE7D9-0-0] [lt=13][errcode=-4012] execute cmd failed(ret=-4012)
[2024-10-06 19:20:19.287096] WDIAG [SQL] open (ob_result_set.cpp:161) [7232][T1_L0_G0][T1][YB42AC136BFB-000623CC0A7BE7D9-0-0] [lt=10][errcode=-4012] execute plan failed(ret=-4012)
[2024-10-06 19:20:19.287111] WDIAG [SERVER] response_result (ob_sync_cmd_driver.cpp:143) [7232][T1_L0_G0][T1][YB42AC136BFB-000623CC0A7BE7D9-0-0] [lt=10][errcode=-4012] close result set fail(cret=-4012)
[2024-10-06 19:20:19.287124] WDIAG [SERVER] after_func (ob_query_retry_ctrl.cpp:986) [7232][T1_L0_G0][T1][YB42AC136BFB-000623CC0A7BE7D9-0-0] [lt=10][errcode=-4012] [RETRY] check if need retry(v={force_local_retry:false, stmt_retry_times:0, local_retry_times:0, err:-4012, err:"OB_TIMEOUT", retry_type:0, client_ret:-4012}, need_retry=false)
[2024-10-06 19:20:19.287142] WDIAG [SERVER] response_result (ob_sync_cmd_driver.cpp:149) [7232][T1_L0_G0][T1][YB42AC136BFB-000623CC0A7BE7D9-0-0] [lt=17][errcode=-4012] result set open failed, check if need retry(ret=-4012, cli_ret=-4012, retry_ctrl.need_retry()=0)
[2024-10-06 19:20:19.287384] INFO [SERVER] send_error_packet (obmp_packet_sender.cpp:378) [7232][T1_L0_G0][T1][YB42AC136BFB-000623CC0A7BE7D9-0-0] [lt=12] REACH SYSLOG RATE LIMIT [bandwidth]
[2024-10-06 19:20:19.287422] WDIAG [SERVER] do_process (obmp_query.cpp:818) [7232][T1_L0_G0][T1][YB42AC136BFB-000623CC0A7BE7D9-0-0] [lt=17][errcode=-4012] execute query fail(ret=-4012, timeout_timestamp=1728217219287156)
[2024-10-06 19:20:19.287487] WDIAG [SERVER.OMT] process_one (ob_worker_processor.cpp:89) [7232][T1_L0_G0][T1][YB42AC136BFB-000623CC0A7BE7D9-0-0] [lt=12][errcode=-4012] process request fail(ret=-4012)
[2024-10-06 19:20:19.287501] WDIAG [SERVER.OMT] process (ob_worker_processor.cpp:157) [7232][T1_L0_G0][T1][YB42AC136BFB-000623CC0A7BE7D9-0-0] [lt=12][errcode=-4012] process request fail(ret=-4012)
The log above is only an excerpt (the full output is too long to post).
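Since the full output is too long to read by eye, the warning lines for one trace can be condensed into "error code plus source location" pairs. This is a rough sketch that assumes the log line layout shown in the excerpt (a `(file:line)` group and an `[errcode=...]` tag on each WDIAG line); the function name `summarize_trace` is my own. Note that `tail -f` only prints the last few lines and then follows new output, so for a statement that has already failed it is usually better to search the file directly, as done here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: for one trace_id, print "<errcode> <file:line>" for
# every WDIAG line in the given observer log.
summarize_trace() {
  local trace_id="$1" log_file="$2"
  awk -v tid="$trace_id" '
    # index() does a literal substring match, so the trace_id is not a regex.
    index($0, tid) && /WDIAG/ {
      loc = "?"; code = "?"
      # Source location, e.g. (ob_poc_rpc_proxy.h:168)
      if (match($0, /\([a-z_0-9]+\.(cpp|h):[0-9]+\)/))
        loc = substr($0, RSTART + 1, RLENGTH - 2)
      # Error code tag, e.g. errcode=-4012 ("errcode=" is 8 characters)
      if (match($0, /errcode=-?[0-9]+/))
        code = substr($0, RSTART + 8, RLENGTH - 8)
      print code, loc
    }' "$log_file"
}
```

Usage: `summarize_trace YB42AC136BFB-000623CC0A7BE7D9-0-0 ./observer.log | sort | uniq -c`. On the excerpt above, the first pair printed would be `-4012 ob_poc_rpc_proxy.h:168`, which is the sync rpc failure that the later warnings follow from.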
Could you help analyze the cause? Thanks.
Also, the create tenant statement seems to return the timeout after roughly 15 minutes. Which parameter controls this timeout?
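To put a number on that impression, the values in the log can be converted directly. This is a small sketch, assuming `timeout`, `last_decode_time` and `timeout_timestamp` in the excerpt are all in microseconds, and that `last_decode_time` is roughly when the statement was received; the helper name `us_to_min_sec` is mine.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a duration in microseconds to "<min>m<sec>s",
# truncating to whole seconds.
us_to_min_sec() {
  local us="$1"
  local total_s=$(( us / 1000000 ))
  printf '%dm%02ds\n' $(( total_s / 60 )) $(( total_s % 60 ))
}

# RPC timeout reported on the "sync rpc execute fail" line.
us_to_min_sec 999999721                                    # prints 16m39s

# timeout_timestamp minus last_decode_time, both from the excerpt.
us_to_min_sec $(( 1728217219287156 - 1728216219286256 ))   # prints 16m40s
```

Under those assumptions the statement appears to get a budget of about 1000 seconds (16 min 40 s) rather than exactly 15 minutes, which may help narrow down which timeout setting is involved.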