【 使用环境 】测试环境,单机部署
【 OB 】
【 使用版本 】5.7.25-OceanBase_CE-v4.2.1.2
【问题描述】执行查询 select count(1) from table where id in (超过3万个) 会报4013异常;
执行以下语句也没用 set global ob_sql_work_area_percentage = 10;
服务器内存仍有10G空闲,observer目前只占用2G左右。
in 语句有限制长度?有时又可执行成功。
【复现路径】问题出现前后相关操作
【附件及日志】推荐使用OceanBase敏捷诊断工具obdiag收集诊断信息,详情参见链接(右键跳转查看):
执行失败后执行select last_trace_id(); 得到trace_id,然后到observer.log*里找一下这个trace_id相应的日志,发一下4013报错日志的上下文
trace_id YB427F000001-00060DF0D11D4C27-0-0
完整的日志如下
observer.log (3.2 MB)
[2024-01-24 17:44:12.770458] WDIAG [SQL.PC] common_free (ob_lib_cache_object_manager.cpp:141) [23618][T1_L0_G0][T1][YB427F000001-00060DF0D11D4C26-0-0] [lt=19][errcode=0] set logical del time(cache_obj->get_logical_del_time()=5472054039224, cache_obj->added_lc()=false, cache_obj->get_object_id()=43986, cache_obj->get_tenant_id()=1, lbt()=“0x1178639c 0x9e959f0 0x499dc84 0x9e0e27a 0x499aacd 0x4996b83 0x499127d 0x498788a 0x4982d1b 0x9570ea4 0x11a51fbf 0x7f7cd0d29ea5 0x7f7cd0a52b0d”)
[2024-01-24 17:44:12.772111] INFO [SQL.JO] compute_table_location (ob_join_order.cpp:245) [23618][T1_L0_G0][T1][YB427F000001-00060DF0D11D4C27-0-0] [lt=27] succeed to calculate base table sharding info(table_id=18446744073709551614, ref_table_id=12008, is_global_index=false)
[2024-01-24 17:44:12.772146] INFO [SQL.JO] init_est_sel_info_for_access_path (ob_join_order.cpp:12024) [23618][T1_L0_G0][T1][YB427F000001-00060DF0D11D4C27-0-0] [lt=25] total rowcount, use statistics(table_meta_info_.table_row_count_=2, table_meta_info_.average_row_size_=4.000000000000000000e+01, table_meta_info_.micro_block_count_=0, table_meta_info_.part_size_=8.000000000000000000e+01)
[2024-01-24 17:44:12.772592] WDIAG [SQL.PC] common_free (ob_lib_cache_object_manager.cpp:141) [23618][T1_L0_G0][T1][YB427F000001-00060DF0D11D4C27-0-0] [lt=37][errcode=0] set logical del time(cache_obj->get_logical_del_time()=5472054041359, cache_obj->added_lc()=false, cache_obj->get_object_id()=43987, cache_obj->get_tenant_id()=1, lbt()=“0x1178639c 0x9e959f0 0x499dc84 0x9e0e27a 0x499aacd 0x4996b83 0x499127d 0x498788a 0x4982d1b 0x9570ea4 0x11a51fbf 0x7f7cd0d29ea5 0x7f7cd0a52b0d”)
[2024-01-24 17:44:12.775707] INFO [SQL.JO] compute_table_location (ob_join_order.cpp:245) [23618][T1_L0_G0][T1][YB427F000001-00060DF0D11D4C29-0-0] [lt=25] succeed to calculate base table sharding info(table_id=18446744073709551614, ref_table_id=12008, is_global_index=false)
[2024-01-24 17:44:12.775746] INFO [SQL.JO] init_est_sel_info_for_access_path (ob_join_order.cpp:12024) [23618][T1_L0_G0][T1][YB427F000001-00060DF0D11D4C29-0-0] [lt=31] total rowcount, use statistics(table_meta_info_.table_row_count_=2, table_meta_info_.average_row_size_=4.000000000000000000e+01, table_meta_info_.micro_block_count_=0, table_meta_info_.part_size_=8.000000000000000000e+01)
[2024-01-24 17:44:12.776067] WDIAG [SQL.PC] common_free (ob_lib_cache_object_manager.cpp:141) [23618][T1_L0_G0][T1][YB427F000001-00060DF0D11D4C29-0-0] [lt=16][errcode=0] set logical del time(cache_obj->get_logical_del_time()=5472054044836, cache_obj->added_lc()=false, cache_obj->get_object_id()=43988, cache_obj->get_tenant_id()=1, lbt()=“0x1178639c 0x9e959f0 0x499dc84 0x9e0e27a 0x499aacd 0x4996b83 0x499127d 0x498788a 0x4982d1b 0x9570ea4 0x11a51fbf 0x7f7cd0d29ea5 0x7f7cd0a52b0d”)
[2024-01-24 17:44:12.776887] INFO [SHARE.SCHEMA] get_table_id (ob_schema_getter_guard.cpp:1868) [23618][T1_L0_G0][T1][YB427F000001-00060DF0D11D4C2A-0-0] [lt=26] table not exist(tenant_id=1, tenant_id=1, database_id=201002, session_id=3221731051, table_name=PROFILING, is_index=false, schema_version=1706078392694528, schema_mgr_tenant_id=1)
这似乎不是全部日志,没有找到几条trace_id对应的日志,你执行grep “trace_id” observer.log.* observer.log 把对应trace_id的日志捞一下吧
select last_trace_id();我看这个变得很快即使没有新得查询。不知道捞得日志对不对
我把日志清理 然后执行报错语句的日志都传上来;YB427F000001-00060DF0D11D513F-0-0 是清理后的日志。
YB427F000001-00060DF0D11D4C27-0-0.zip (3.2 MB)
YB427F000001-00060DF0D11D513F-0-0.zip (1.7 MB)
grep “YB427F000001-00060DF0D11D4C27-0-0” observer.log.* observer.log
observer.log.20240124174836070:[2024-01-24 17:44:12.772111] INFO [SQL.JO] compute_table_location (ob_join_order.cpp:245) [23618][T1_L0_G0][T1][YB427F000001-00060DF0D11D4C27-0-0] [lt=27] succeed to calculate base table sharding info(table_id=18446744073709551614, ref_table_id=12008, is_global_index=false)
observer.log.20240124174836070:[2024-01-24 17:44:12.772146] INFO [SQL.JO] init_est_sel_info_for_access_path (ob_join_order.cpp:12024) [23618][T1_L0_G0][T1][YB427F000001-00060DF0D11D4C27-0-0] [lt=25] total rowcount, use statistics(table_meta_info_.table_row_count_=2, table_meta_info_.average_row_size_=4.000000000000000000e+01, table_meta_info_.micro_block_count_=0, table_meta_info_.part_size_=8.000000000000000000e+01)
observer.log.20240124174836070:[2024-01-24 17:44:12.772592] WDIAG [SQL.PC] common_free (ob_lib_cache_object_manager.cpp:141) [23618][T1_L0_G0][T1][YB427F000001-00060DF0D11D4C27-0-0] [lt=37][errcode=0] set logical del time(cache_obj->get_logical_del_time()=5472054041359, cache_obj->added_lc()=false, cache_obj->get_object_id()=43987, cache_obj->get_tenant_id()=1, lbt()=“0x1178639c 0x9e959f0 0x499dc84 0x9e0e27a 0x499aacd 0x4996b83 0x499127d 0x498788a 0x4982d1b 0x9570ea4 0x11a51fbf 0x7f7cd0d29ea5 0x7f7cd0a52b0d”)
[fire@XF_CLOUD_TEST log]$ grep “YB427F000001-00060DF0D11D5029-0-0” observer.log.* observer.log
observer.log:[2024-01-24 18:07:42.164148] INFO [SQL.JO] compute_table_location (ob_join_order.cpp:245) [23618][T1_L0_G0][T1][YB427F000001-00060DF0D11D5029-0-0] [lt=26] succeed to calculate base table sharding info(table_id=18446744073709551614, ref_table_id=12008, is_global_index=false)
observer.log:[2024-01-24 18:07:42.164184] INFO [SQL.JO] init_est_sel_info_for_access_path (ob_join_order.cpp:12024) [23618][T1_L0_G0][T1][YB427F000001-00060DF0D11D5029-0-0] [lt=27] total rowcount, use statistics(table_meta_info_.table_row_count_=2, table_meta_info_.average_row_size_=4.000000000000000000e+01, table_meta_info_.micro_block_count_=0, table_meta_info_.part_size_=8.000000000000000000e+01)
observer.log:[2024-01-24 18:07:42.164565] WDIAG [SQL.PC] common_free (ob_lib_cache_object_manager.cpp:141) [23618][T1_L0_G0][T1][YB427F000001-00060DF0D11D5029-0-0] [lt=12][errcode=0] set logical del time(cache_obj->get_logical_del_time()=5473463433336, cache_obj->added_lc()=false, cache_obj->get_object_id()=43996, cache_obj->get_tenant_id()=1, lbt()=“0x1178639c 0x9e959f0 0x499dc84 0x9e0e27a 0x499aacd 0x4996b83 0x499127d 0x498788a 0x4982d1b 0x9570ea4 0x11a51fbf 0x7f7cd0d29ea5 0x7f7cd0a52b0d”)
[fire@XF_CLOUD_TEST log]$ grep “YB427F000001-00060DF0D11D4C27-0-0” observer.log.* observer.log
observer.log.20240124174836070:[2024-01-24 17:44:12.772111] INFO [SQL.JO] compute_table_location (ob_join_order.cpp:245) [23618][T1_L0_G0][T1][YB427F000001-00060DF0D11D4C27-0-0] [lt=27] succeed to calculate base table sharding info(table_id=18446744073709551614, ref_table_id=12008, is_global_index=false)
observer.log.20240124174836070:[2024-01-24 17:44:12.772146] INFO [SQL.JO] init_est_sel_info_for_access_path (ob_join_order.cpp:12024) [23618][T1_L0_G0][T1][YB427F000001-00060DF0D11D4C27-0-0] [lt=25] total rowcount, use statistics(table_meta_info_.table_row_count_=2, table_meta_info_.average_row_size_=4.000000000000000000e+01, table_meta_info_.micro_block_count_=0, table_meta_info_.part_size_=8.000000000000000000e+01)
observer.log.20240124174836070:[2024-01-24 17:44:12.772592] WDIAG [SQL.PC] common_free (ob_lib_cache_object_manager.cpp:141) [23618][T1_L0_G0][T1][YB427F000001-00060DF0D11D4C27-0-0] [lt=37][errcode=0] set logical del time(cache_obj->get_logical_del_time()=5472054041359, cache_obj->added_lc()=false, cache_obj->get_object_id()=43987, cache_obj->get_tenant_id()=1, lbt()=“0x1178639c 0x9e959f0 0x499dc84 0x9e0e27a 0x499aacd 0x4996b83 0x499127d 0x498788a 0x4982d1b 0x9570ea4 0x11a51fbf 0x7f7cd0d29ea5 0x7f7cd0a52b0d”)
你是在系统租户上测试的吧,租户只有1g的内存,内存不足了。你建一个测试租户测试看看吧
系统租户内存不能直接调整吗?
系统租户没有扩展能力,不建议使用系统租户来执行测试,创建一个用户租户是推荐的用法