ob-ce 3.1.5 installation failed

[admin@observer-01 ~]$ obd cluster start obcp-cluster
Get local repositories ok
Search plugins ok
Open ssh connection ok
Load cluster param plugin ok
Check before start observer ok
[WARN] OBD-2010: (server1(192.168.122.148)): system_memory too large. system_memory should be less than 0.7 * memory_limit/memory_limit_percentage.
[WARN] OBD-1012: (192.168.122.148) clog and data use the same disk (/)
[WARN] OBD-2010: (server2(192.168.122.149)): system_memory too large. system_memory should be less than 0.7 * memory_limit/memory_limit_percentage.
[WARN] OBD-1012: (192.168.122.149) clog and data use the same disk (/)
[WARN] OBD-2010: (server3(192.168.122.150)): system_memory too large. system_memory should be less than 0.7 * memory_limit/memory_limit_percentage.
[WARN] OBD-1012: (192.168.122.150) clog and data use the same disk (/)

Start observer ok
observer program health check ok
Connect to observer ok
Initialize oceanbase-ce x
[ERROR] OBD-5000: set session ob_query_timeout=1000000000 execute failed
[ERROR] OBD-5000: select * from oceanbase.__all_rootservice_event_history where module = "bootstrap" and event = "bootstrap_succeed" execute failed
[ERROR] oceanbase-ce-py_script_bootstrap-3.1.0 RuntimeError: (2013, 'Lost connection to MySQL server during query')

[ERROR] Cluster init failed

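1. Check the machine configuration: free -h and df -Th
2. Was the cluster deployed manually or through a configuration file?
Please share the deployment method or the yaml file.
3. For a new environment, configure it as described here
4. Please provide the obd log and observer.log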
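
For anyone who wants to collect the same information from a script instead of running free -h and df -Th by hand, here is a minimal Python sketch. It is not part of OBD. The function names are made up for illustration, and total_memory_gib assumes a Linux host. The same_device check mirrors what the OBD-1012 warning complains about (clog and data on the same disk), but it is only an approximation based on the filesystem device id.

```python
import os
import shutil

GIB = 1024 ** 3


def total_memory_gib():
    """Total physical memory in GiB (Linux only, uses sysconf)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / GIB


def disk_summary(path):
    """Total and free space in GiB for the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    return {"total_gib": usage.total / GIB, "free_gib": usage.free / GIB}


def same_device(path_a, path_b):
    """True if both paths live on the same filesystem device."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev
```

For example, same_device(data_dir, clog_dir) with your own data and clog directories tells you whether the two would share one filesystem, and disk_summary("/") gives roughly what df reports for the root filesystem.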

system_memory too large. system_memory should be less than 0.7 * memory_limit/memory_limit_percentage.

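The parameter settings are not reasonable. Could you paste the configuration file?
When setting these two parameters, the following rule is recommended:
memory_limit/3 ≤ system_memory ≤ memory_limit/2
Also, memory_limit must not be lower than 8G.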
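
The rule above is easy to check before deploying. Below is a rough Python sketch of it. It only encodes the rule of thumb from this reply (memory_limit/3 ≤ system_memory ≤ memory_limit/2, memory_limit at least 8G) and is not the check OBD itself performs. The helper names parse_size and check_memory_settings are hypothetical, and the size strings are assumed to look like "8G" or "512M".

```python
GIB = 1024 ** 3
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}


def parse_size(text):
    """Convert a size string such as '8G', '512M' or '8GB' to bytes."""
    text = text.strip().upper().rstrip("B")
    if text[-1] in UNITS:
        return int(float(text[:-1]) * UNITS[text[-1]])
    return int(text)  # plain number, already in bytes


def check_memory_settings(memory_limit, system_memory):
    """Return a list of problems; an empty list means the rule is satisfied."""
    limit = parse_size(memory_limit)
    system = parse_size(system_memory)
    problems = []
    if limit < 8 * GIB:
        problems.append("memory_limit is below 8G")
    if system * 3 < limit:
        problems.append("system_memory is below memory_limit/3")
    if system * 2 > limit:
        problems.append("system_memory is above memory_limit/2")
    return problems
```

For instance, check_memory_settings("64G", "30G") returns an empty list, while check_memory_settings("6G", "4G") reports both a too-small memory_limit and a too-large system_memory.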

obd_log-observer-log.zip (2.6 MB)

[2023-11-17 14:40:14.772336] ERROR [SERVER.OMT] alloc (ob_worker_pool.cpp:93) [8459][454][Y0-0000000000000000] [lt=71] [dc=0] worker cnt larger than max cnt(worker_cnt_=256, max_cnt_=256) BACKTRACE:0x9b8f29e

You can take a look at this answer: worker cnt larger than max cnt


Why is this error reported?

image
Please provide this log and the observer.log

[2023-12-07 10:11:27.975915] WDIAG [STORAGE.TRANS] do_cluster_heartbeat_ (ob_tenant_weak_read_service.cpp:874) [2861588][T1_TenantWeakRe][T1][Y0-0000000000000000-0-0] [lt=13][errcode=-4076] tenant weak read service do cluster heartbeat fail(ret=-4076, ret=“OB_NEED_WAIT”, tenant_id_=1, last_post_cluster_heartbeat_tstamp_=1701915087875871, cluster_heartbeat_interval_=1000000, cluster_service_tablet_id={id:226}, cluster_service_master=“0.0.0.0:0”)
[2023-12-07 10:11:27.975927] WDIAG [STORAGE.TRANS] generate_min_weak_read_version (ob_weak_read_util.cpp:82) [2861588][T1_TenantWeakRe][T1][Y0-0000000000000000-0-0] [lt=6][errcode=-4023] get gts cache error(ret=-4023, tenant_id=1)
[2023-12-07 10:11:27.975934] WDIAG [STORAGE.TRANS] generate_server_version (ob_tenant_weak_read_service.cpp:314) [2861588][T1_TenantWeakRe][T1][Y0-0000000000000000-0-0] [lt=7][errcode=-4023] generate min weak read version error(ret=-4023, tenant_id=1)
[2023-12-07 10:11:27.975937] WDIAG [STORAGE.TRANS] generate_tenant_weak_read_timestamp_ (ob_tenant_weak_read_service.cpp:593) [2861588][T1_TenantWeakRe][T1][Y0-0000000000000000-0-0] [lt=4][errcode=-4023] generate server version for tenant fail(ret=-4023, ret=“OB_EAGAIN”, tenant_id=1, index=0x4000215daea0, server_version_epoch_tstamp_=1701915087975920)
[2023-12-07 10:11:27.990746] INFO [COMMON] compute_tenant_wash_size (ob_kvcache_store.cpp:1156) [2861084][KVCacheWash][T0][Y0-0000000000000000-0-0] [lt=14] Wash compute wash size(is_wash_valid=false, sys_total_wash_size=-636274290688, global_cache_size=0, tenant_max_wash_size=0, tenant_min_wash_size=0, tenant_ids_=[500, 508, 509, 1])
[2023-12-07 10:11:27.995263] WDIAG [SERVER] fill_ls_replica (ob_service.cpp:2570) [2861644][T1_L0_G9][T1][YB421414DD7C-00060BE1F8819CA7-0-0] [lt=6][errcode=-4719] get ls handle failed(ret=-4719, ret=“OB_LS_NOT_EXIST”)
[2023-12-07 10:11:27.995343] WDIAG [SHARE.PT] find_leader (ob_ls_info.cpp:847) [2861538][T1_FreInfoReloa][T1][YB421414DD7C-00060BE1F8819CA7-0-0] [lt=8][errcode=-4018] fail to get leader replica(ret=-4018, ret=“OB_ENTRY_NOT_EXIST”, *this={tenant_id:1, ls_id:{id:1}, replicas:[]}, replica count=0)
[2023-12-07 10:11:27.995354] WDIAG [SHARE.PT] find_leader (ob_ls_info.cpp:847) [2861538][T1_FreInfoReloa][T1][YB421414DD7C-00060BE1F8819CA7-0-0] [lt=11][errcode=-4018] fail to get leader replica(ret=-4018, ret=“OB_ENTRY_NOT_EXIST”, *this={tenant_id:1, ls_id:{id:1}, replicas:[]}, replica count=0)
[2023-12-07 10:11:27.995361] INFO [SHARE.PT] get_ls_info_ (ob_rpc_ls_table.cpp:140) [2861538][T1_FreInfoReloa][T1][YB421414DD7C-00060BE1F8819CA7-0-0] [lt=7] leader doesn’t exist, try use all_server_list(tmp_ret=-4018, tmp_ret=“OB_ENTRY_NOT_EXIST”, ls_info={tenant_id:1, ls_id:{id:1}, replicas:[]})
[2023-12-07 10:11:27.995369] INFO [SHARE.PT] get_ls_info_ (ob_rpc_ls_table.cpp:151) [2861538][T1_FreInfoReloa][T1][YB421414DD7C-00060BE1F8819CA7-0-0] [lt=6] server_list is empty, do nothing(ret=0, ret=“OB_SUCCESS”, server_list=[])
[2023-12-07 10:11:27.995376] INFO [SHARE.LOCATION] batch_update_caches_ (ob_ls_location_service.cpp:943) [2861538][T1_FreInfoReloa][T1][YB421414DD7C-00060BE1F8819CA7-0-0] [lt=4] [LS_LOCATION]ls location cache has changed(ret=0, ret=“OB_SUCCESS”, old_location={cache_key:{tenant_id:0, ls_id:{id:-1}, cluster_id:-1}, renew_time:0, replica_locations:[]}, new_location={cache_key:{tenant_id:1, ls_id:{id:1}, cluster_id:1698041970}, renew_time:1701915087995375, replica_locations:[]})
[2023-12-07 10:11:27.995387] INFO [SHARE.LOCATION] batch_renew_tablet_locations (ob_location_service.cpp:440) [2861538][T1_FreInfoReloa][T1][YB421414DD7C-00060BE1F8819CA7-0-0] [lt=10] [TABLET_LOCATION] batch renew tablet locations finished(ret=0, ret=“OB_SUCCESS”, tenant_id=1, renew_type=0, is_nonblock=false, tablet_list=[{id:1}], ls_ids=[{id:1}], error_code=-4721)
[2023-12-07 10:11:27.995400] WDIAG [SQL] do_close_plan (ob_result_set.cpp:749) [2861538][T1_FreInfoReloa][T1][YB421414DD7C-00060BE1F8819CA7-0-0] [lt=1][errcode=-4006] exec result is null(ret=-4006)
[2023-12-07 10:11:27.995406] WDIAG [SQL] close (ob_result_set.cpp:844) [2861538][T1_FreInfoReloa][T1][YB421414DD7C-00060BE1F8819CA7-0-0] [lt=6][errcode=0] fail close main query(ret=0, do_close_plan_ret=-4006)
[2023-12-07 10:11:27.995464] WDIAG [SQL.PC] common_free (ob_lib_cache_object_manager.cpp:141) [2861538][T1_FreInfoReloa][T1][YB421414DD7C-00060BE1F8819CA7-0-0] [lt=0][errcode=0] set logical del time(cache_obj->get_logical_del_time()=4734891821688, cache_obj->added_lc()=false, cache_obj->get_object_id()=1002, cache_obj->get_tenant_id()=1, lbt()=“0x10d5bfb0 0x8fde6b4 0xad1a480 0xad172e4 0xb039d18 0xb0155c0 0xb0106ac 0xb00ffdc 0xb00fe80 0xb00fce0 0xf5ad334 0xf8c88a8 0xf8c84c8 0xb10cc30 0xddccc50 0xaf2e7c0 0x10fe0efc 0x10fdd66c 0x40002148e7b0 0x40002163122c”)
[2023-12-07 10:11:27.996276] WDIAG [SERVER] fill_ls_replica (ob_service.cpp:2570) [2861644][T1_L0_G9][T1][YB421414DD7C-00060BE1F8819CA7-0-0] [lt=10][errcode=-4719] get ls handle failed(ret=-4719, ret=“OB_LS_NOT_EXIST”)
[2023-12-07 10:11:27.996383] WDIAG [SHARE.PT] find_leader (ob_ls_info.cpp:847) [2861538][T1_FreInfoReloa][T1][YB421414DD7C-00060BE1F8819CA7-0-0] [lt=1][errcode=-4018] fail to get leader replica(ret=-4018, ret=“OB_ENTRY_NOT_EXIST”, *this={tenant_id:1, ls_id:{id:1}, replicas:[]}, replica count=0)
[2023-12-07 10:11:27.996396] WDIAG [SHARE.PT] find_leader (ob_ls_info.cpp:847) [2861538][T1_FreInfoReloa][T1][YB421414DD7C-00060BE1F8819CA7-0-0] [lt=13][errcode=-4018] fail to get leader replica(ret=-4018, ret=“OB_ENTRY_NOT_EXIST”, *this={tenant_id:1, ls_id:{id:1}, replicas:[]}, replica count=0)
[2023-12-07 10:11:27.996403] INFO [SHARE.PT] get_ls_info_ (ob_rpc_ls_table.cpp:140) [2861538][T1_FreInfoReloa][T1][YB421414DD7C-00060BE1F8819CA7-0-0] [lt=6] leader doesn’t exist, try use all_server_list(tmp_ret=-4018, tmp_ret=“OB_ENTRY_NOT_EXIST”, ls_info={tenant_id:1, ls_id:{id:1}, replicas:[]})
[2023-12-07 10:11:27.996411] INFO [SHARE.PT] get_ls_info_ (ob_rpc_ls_table.cpp:151) [2861538][T1_FreInfoReloa][T1][YB421414DD7C-00060BE1F8819CA7-0-0] [lt=6] server_list is empty, do nothing(ret=0, ret=“OB_SUCCESS”, server_list=[])
[2023-12-07 10:11:27.996418] INFO [SHARE.LOCATION] batch_update_caches_ (ob_ls_location_service.cpp:943) [2861538][T1_FreInfoReloa][T1][YB421414DD7C-00060BE1F8819CA7-0-0] [lt=4] [LS_LOCATION]ls location cache has changed(ret=0, ret=“OB_SUCCESS”, old_location={cache_key:{tenant_id:0, ls_id:{id:-1}, cluster_id:-1}, renew_time:0, replica_locations:[]}, new_location={cache_key:{tenant_id:1, ls_id:{id:1}, cluster_id:1698041970}, renew_time:1701915087996417, replica_locations:[]})
[2023-12-07 10:11:27.996454] INFO [SERVER] sleep_before_local_retry (ob_query_retry_ctrl.cpp:91) [2861538][T1_FreInfoReloa][T1][YB421414DD7C-00060BE1F8819CA7-0-0] [lt=0] will sleep(sleep_us=42000, remain_us=1078997, base_sleep_us=1000, retry_sleep_type=1, v.stmt_retry_times_=42, v.err_=-4721, timeout_timestamp=1701915089075450)
[2023-12-07 10:11:28.003855] WDIAG [SERVER] fill_ls_replica (ob_service.cpp:2570) [2861644][T1_L0_G9][T1][YB421414DD7E-00060BE1F8735512-0-0] [lt=7][errcode=-4719] get ls handle failed(ret=-4719, ret=“OB_LS_NOT_EXIST”)
[2023-12-07 10:11:28.004843] WDIAG [SERVER] fill_ls_replica (ob_service.cpp:2570) [2861644][T1_L0_G9][T1][YB421414DD7E-00060BE1F8735512-0-0] [lt=8][errcode=-4719] get ls handle failed(ret=-4719, ret=“OB_LS_NOT_EXIST”)
[2023-12-07 10:11:28.008913] WDIAG [SERVER] fill_ls_replica (ob_service.cpp:2570) [2861644][T1_L0_G9][T1][YB421414DD80-00060BE1F885027C-0-0] [lt=6][errcode=-4719] get ls handle failed(ret=-4719, ret=“OB_LS_NOT_EXIST”)
[2023-12-07 10:11:28.009812] WDIAG [SERVER] fill_ls_replica (ob_service.cpp:2570) [2861644][T1_L0_G9][T1][YB421414DD80-00060BE1F885027C-0-0] [lt=7][errcode=-4719] get ls handle failed(ret=-4719, ret=“OB_LS_NOT_EXIST”)

Do you need the complete obd log and observer.log? Please package them up and send them over.

I don't know why, but I can't upload my log file.

How about compressing it first?

That doesn't work either. I already tried.

[2023-12-07 10:27:01.051] [0c3a2a76-94a8-11ee-8804-b04fa62047bf] [INFO] Initialize oceanbase
[2023-12-07 10:27:01.052] [0c3a2a76-94a8-11ee-8804-b04fa62047bf] [DEBUG] - Call oceanbase-py_script_bootstrap-4.0.0.0 for oceanbase-4.2.1.0-100000182023092722.el7-8874d247a31ec1a46ca59e9221c4eecd6ef57504
[2023-12-07 10:27:01.052] [0c3a2a76-94a8-11ee-8804-b04fa62047bf] [DEBUG] - import bootstrap
[2023-12-07 10:27:01.056] [0c3a2a76-94a8-11ee-8804-b04fa62047bf] [DEBUG] - add bootstrap ref count to 1
[2023-12-07 10:27:01.057] [0c3a2a76-94a8-11ee-8804-b04fa62047bf] [DEBUG] – execute sql: set session ob_query_timeout=1000000000
[2023-12-07 10:27:01.057] [0c3a2a76-94a8-11ee-8804-b04fa62047bf] [DEBUG] – execute sql: set session ob_query_timeout=1000000000. args: None
[2023-12-07 10:27:01.059] [0c3a2a76-94a8-11ee-8804-b04fa62047bf] [DEBUG] – execute sql: alter system bootstrap REGION “sys_region” ZONE “zone1” SERVER “20.20.221.124:2882”,REGION “sys_region” ZONE “zone2” SERVER “20.20.221.126:2882”,REGION “sys_region” ZONE “zone3” SERVER “20.20.221.128:2882”. args: None
[2023-12-07 10:27:01.063] [0c3a2a76-94a8-11ee-8804-b04fa62047bf] [DEBUG] – OBD-5000: alter system bootstrap REGION “sys_region” ZONE “zone1” SERVER “20.20.221.124:2882”,REGION “sys_region” ZONE “zone2” SERVER “20.20.221.126:2882”,REGION “sys_region” ZONE “zone3” SERVER “20.20.221.128:2882” execute failed
[2023-12-07 10:27:01.064] [0c3a2a76-94a8-11ee-8804-b04fa62047bf] [DEBUG] – execute sql: create user “proxyro” IDENTIFIED BY %s. args: [‘gzXpjZKL8o’]
[2023-12-07 10:43:41.132] [0c3a2a76-94a8-11ee-8804-b04fa62047bf] [ERROR] OBD-5000: create user “proxyro” IDENTIFIED BY %s execute failed
[2023-12-07 10:43:41.132] [0c3a2a76-94a8-11ee-8804-b04fa62047bf] [DEBUG] – execute sql: grant select on oceanbase.* to proxyro IDENTIFIED BY %s. args: [‘gzXpjZKL8o’]
[2023-12-07 10:43:41.134] [0c3a2a76-94a8-11ee-8804-b04fa62047bf] [ERROR] OBD-5000: grant select on oceanbase.* to proxyro IDENTIFIED BY %s execute failed
[2023-12-07 10:43:41.135] [0c3a2a76-94a8-11ee-8804-b04fa62047bf] [DEBUG] – execute sql: alter user “root” IDENTIFIED BY %s. args: [‘zfNk20j9G3sm3A7Ewoyp’]
[2023-12-07 11:00:21.200] [0c3a2a76-94a8-11ee-8804-b04fa62047bf] [ERROR] OBD-5000: alter user “root” IDENTIFIED BY %s execute failed
[2023-12-07 11:00:21.201] [0c3a2a76-94a8-11ee-8804-b04fa62047bf] [DEBUG] – execute sql: select * from oceanbase.__all_server. args: None
[2023-12-07 11:00:21.201] [0c3a2a76-94a8-11ee-8804-b04fa62047bf] [DEBUG] – OBD-5000: select * from oceanbase.__all_server execute failed
[2023-12-07 11:00:22.202] [0c3a2a76-94a8-11ee-8804-b04fa62047bf] [DEBUG] – execute sql: select * from oceanbase.__all_server. args: None
[2023-12-07 11:00:22.203] [0c3a2a76-94a8-11ee-8804-b04fa62047bf] [DEBUG] – OBD-5000: select * from oceanbase.__all_server execute failed
[2023-12-07 11:00:23.204] [0c3a2a76-94a8-11ee-8804-b04fa62047bf] [DEBUG] – execute sql: select * from oceanbase.__all_server. args: None
[2023-12-07 11:00:23.204] [0c3a2a76-94a8-11ee-8804-b04fa62047bf] [DEBUG] – OBD-5000: select * from oceanbase.__all_server execute failed
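
When the full log cannot be uploaded, a short summary of which statements failed and how long each one ran can still help. The Python sketch below pulls the OBD-5000 "execute failed" entries out of an obd log in the format shown above and pairs each with the most recent "execute sql:" line. The regular expressions are guesses based only on the lines pasted in this thread, and failed_statements is a made-up name. In the log above, the create user and alter user statements each fail about 1000 seconds after they were issued, which would be consistent with ob_query_timeout=1000000000 if that value is in microseconds.

```python
import re
from datetime import datetime

# [timestamp] [trace id] [LEVEL] message
LINE_RE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\] "
    r"\[[^\]]*\] \[(?P<level>\w+)\]\s*(?P<msg>.*)$"
)
FAIL_RE = re.compile(r"OBD-5000: (?P<sql>.*) execute failed$")


def parse_ts(text):
    """Parse an obd log timestamp such as '2023-12-07 10:27:01.064'."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")


def failed_statements(lines):
    """Return (sql, seconds since the last 'execute sql:' line) per failure."""
    results = []
    last_exec = None
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # not an obd log line in the assumed format
        ts = parse_ts(match.group("ts"))
        msg = match.group("msg")
        if "execute sql:" in msg:
            last_exec = ts
            continue
        fail = FAIL_RE.search(msg)
        if fail:
            elapsed = (ts - last_exec).total_seconds() if last_exec else None
            results.append((fail.group("sql"), elapsed))
    return results
```

To use it, read the obd log file into a list of lines and pass that list to failed_statements. Statements that fail within milliseconds and statements that fail only after a long wait usually point to different causes, so the elapsed time is worth including when you post the summary.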

It works now. The yaml file was probably misconfigured.