Cluster creation fails after deploying OCP

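I deployed OCP in a test environment. After the deployment finished, I added three hosts in OCP, then went to Clusters > Create Cluster and selected the three new hosts. Once deployment started, it failed at the step "Create resource manager for default user".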


Create_resource_manager_for_default_user_100_FAILED.log (63.6 KB)

Please post the OCP version and the OB version of the cluster you are trying to create.

Check whether OCP's meta cluster is healthy.

2025-11-14 13:22:35.572  WARN 214857 --- [manual-subtask-executor15,a9d38ad3f7d90a79,d60f2015eab12cbd] com.alibaba.druid.pool.DruidDataSource   : get connection timeout retry : 1
2025-11-14 13:22:40.587  WARN 214857 --- [manual-subtask-executor15,a9d38ad3f7d90a79,d60f2015eab12cbd] o.s.jdbc.support.SQLErrorCodesFactory    : Error while extracting database name

org.springframework.jdbc.support.MetaDataAccessException: Could not get Connection for extracting meta-data; nested exception is org.springframework.jdbc.CannotGetJdbcConnectionException: Failed to obtain JDBC Connection; nested exception is com.alibaba.druid.pool.GetConnectionTimeoutException: wait millis 5000, active 0, maxActive 20, creating 0, createErrorCount 3
	at org.springframework.jdbc.support.JdbcUtils.extractDatabaseMetaData(JdbcUtils.java:363)
	at org.springframework.jdbc.support.SQLErrorCodesFactory.resolveErrorCodes(SQLErrorCodesFactory.java:235)
	at org.springframework.jdbc.support.SQLErrorCodeSQLExceptionTranslator.lambda$setDataSource$0(SQLErrorCodeSQLExceptionTranslator.java:138)
	at org.springframework.util.function.SingletonSupplier.get(SingletonSupplier.java:97)
	at org.springframework.jdbc.support.SQLErrorCodeSQLExceptionTranslator.setDataSource(SQLErrorCodeSQLExceptionTranslator.java:139)
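
The counters at the end of the Druid exception above (`active 0`, `creating 0`, `createErrorCount 3`) are worth reading closely. A minimal Python sketch for pulling them out of a log line follows. The function names and the "backend down" heuristic are illustrative assumptions, not part of OCP or Druid; the heuristic only encodes the guess that zero active connections plus creation errors points to an unreachable database rather than an exhausted pool.

```python
import re

# Matches the pool counters that appear in the GetConnectionTimeoutException
# message quoted in this thread. createErrorCount is treated as optional.
_DRUID_FIELDS = re.compile(
    r"wait millis (?P<wait_millis>\d+), active (?P<active>\d+), "
    r"maxActive (?P<max_active>\d+), creating (?P<creating>\d+)"
    r"(?:, createErrorCount (?P<create_error_count>\d+))?"
)


def parse_druid_timeout(message: str) -> dict:
    """Return the pool counters found in the message, or {} if none match."""
    match = _DRUID_FIELDS.search(message)
    if match is None:
        return {}
    return {key: int(val) for key, val in match.groupdict().items() if val is not None}


def looks_like_backend_down(stats: dict) -> bool:
    """Heuristic (assumption): nothing is active and connection creation has failed."""
    return stats.get("active", 1) == 0 and stats.get("create_error_count", 0) > 0


sample = (
    "com.alibaba.druid.pool.GetConnectionTimeoutException: "
    "wait millis 5000, active 0, maxActive 20, creating 0, createErrorCount 3"
)
stats = parse_druid_timeout(sample)
print(stats)
print(looks_like_backend_down(stats))
```

If the heuristic returns true for a log like the one above, the next thing to check is whether the database behind the pool (here, OCP's meta cluster) is actually running.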

OCP version: 4.3.5-20250319105844
OceanBase version: 4.3.5.0

Of the three nodes, only one still has an observer process running; the other two have no observer process at all.
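
A quick way to repeat this check on each node is to look for a process whose command name is `observer`. The sketch below is a hedged illustration: the helper names are hypothetical, and it assumes a Linux host where `ps -eo pid,comm` is available and the server process is named exactly `observer`, as it is in the log lines in this thread.

```python
import subprocess
from typing import List


def observer_pids(ps_output: str, name: str = "observer") -> List[int]:
    """Extract PIDs from `ps -eo pid,comm` output whose command equals `name`."""
    pids = []
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        # The header line ("PID COMMAND") is skipped because "PID" is not numeric.
        if len(parts) == 2 and parts[0].isdigit() and parts[1].strip() == name:
            pids.append(int(parts[0]))
    return pids


def local_observer_pids() -> List[int]:
    """Run ps on the local host and return the PIDs of observer processes."""
    result = subprocess.run(
        ["ps", "-eo", "pid,comm"], capture_output=True, text=True, check=True
    )
    return observer_pids(result.stdout)


if __name__ == "__main__":
    pids = local_observer_pids()
    print("observer running, pids:", pids if pids else "none")
```

An empty list on a node that should host part of the meta cluster matches the symptom described here.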

Then OCP's meta cluster is down.

To find out why it went down, grab observer.log, compress it, and upload it here.

observer.log.tar.gz (22.3 MB)

It looks like the CPU does not support the AVX instruction set.

[2025-11-14 13:21:49.213377] INFO  [SERVER] inner_main (main.cpp:597) [275853][observer][T0][Y0-0000000000000001-0-0] [lt=12] observer starts(observer_version="OceanBase_CE 4.3.5.0")
[2025-11-14 13:21:49.213400] INFO  [SERVER] init (ob_server.cpp:297) [275853][observer][T0][Y0-0000000000000001-0-0] [lt=14] [OBSERVER_NOTICE] start to init observer
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx lm constant_tsc rep_good nopl cpuid tsc_known_freq pni ssse3 cx16 pcid sse4_2 x2apic hypervisor lahf_lm pti
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx lm constant_tsc rep_good nopl cpuid tsc_known_freq pni ssse3 cx16 pcid sse4_2 x2apic hypervisor lahf_lm pti
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx lm constant_tsc rep_good nopl cpuid tsc_known_freq pni ssse3 cx16 pcid sse4_2 x2apic hypervisor lahf_lm pti
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx lm constant_tsc rep_good nopl cpuid tsc_known_freq pni ssse3 cx16 pcid sse4_2 x2apic hypervisor lahf_lm pti
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx lm constant_tsc rep_good nopl cpuid tsc_known_freq pni ssse3 cx16 pcid sse4_2 x2apic hypervisor lahf_lm pti
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx lm constant_tsc rep_good nopl cpuid tsc_known_freq pni ssse3 cx16 pcid sse4_2 x2apic hypervisor lahf_lm pti
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx lm constant_tsc rep_good nopl cpuid tsc_known_freq pni ssse3 cx16 pcid sse4_2 x2apic hypervisor lahf_lm pti
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx lm constant_tsc rep_good nopl cpuid tsc_known_freq pni ssse3 cx16 pcid sse4_2 x2apic hypervisor lahf_lm pti
[2025-11-14 13:21:49.218152] WDIAG [COMMON] init_from_os (ob_cpu_topology.cpp:97) [275853][observer][T0][Y0-0000000000000001-0-0] [lt=19][errcode=0] cpu flag is not found(CPU_FLAG_CMDS[i]="grep -E ' avx( |$)' /proc/cpuinfo")
[2025-11-14 13:21:49.220143] WDIAG [COMMON] init_from_os (ob_cpu_topology.cpp:97) [275853][observer][T0][Y0-0000000000000001-0-0] [lt=44][errcode=0] cpu flag is not found(CPU_FLAG_CMDS[i]="grep -E ' avx2( |$)' /proc/cpuinfo")
[2025-11-14 13:21:49.222140] WDIAG [COMMON] init_from_os (ob_cpu_topology.cpp:97) [275853][observer][T0][Y0-0000000000000001-0-0] [lt=46][errcode=0] cpu flag is not found(CPU_FLAG_CMDS[i]="grep -E ' avx512bw( |$)' /proc/cpuinfo")
[2025-11-14 13:27:11.090942] INFO  [PL] dump_module (ob_llvm_helper.cpp:644) [276894][T1_L0_G0][T1][YB420A643CF2-000643872F893DC4-0-0] [lt=38] Dump LLVM Compile Module!
(s.str().c_str()="; ModuleID = 'PL/SQL'
source_filename = "PL/SQL"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"

%pl_exec_context = type { i64, i64, i64, %seg_param_store*, %obj*, i32*, i64, i8, i64 }
%seg_param_store = type { i64, i32, %wrapper_allocator, %seg_pointer_array, i64, i64 }
%wrapper_allocator = type { i64, i64 }
%seg_pointer_array = type { i64, [1 x i64]*, i64, [1 x i64], i64, i64, i64, i32, %wrapper_allocator, i8, %memory_context }
%memory_context = type { i64, i64, i64 }
%obj = type { %obj_meta, i32, i64 }
%obj_meta = type { i8, i8, i8, i8 }
%objparam = type { %obj, i64, i32, %obj.0, i32, i32, %obj_meta }
%obj.0 = type { i64, i8 }
%data_type = type { %obj_meta, i64, i32, i8, i8 }
%unwind_exception = type { i64 }
%pl_condition_value = type { i64, i64, i8*, i64, i64, i8 }

declare i32 @spi_calc_expr_at_idx(%pl_exec_context*, i64, i64, %objparam*)

declare i32 @spi_calc_package_expr(%pl_exec_context*, i64, i64, %objparam*)

declare i32 @spi_convert_objparam(%pl_exec_context*, %objparam*, i64, %objparam*, i8)

declare i32 @spi_set_variable_to_expr(%pl_exec_context*, i64, %objparam*, i8, i8)

declare i32 @spi_query_into_expr_idx(%pl_exec_context*, i8*, i64, i64*, i64, %data_type*, i64, i8*, i64*, i8, i8, i8)

declare i32 @spi_end_trans(%pl_exec_context*, i8*, i8)

declare i32 @spi_update_location(%pl_exec_context*, i64)

declare i32 @spi_execute_with_expr_idx(%pl_exec_context*, i8*, i64, i64*, i64, i64*, i64, %data_type*, i64, i8*, i64*, i8, i8, i8, i8)

declare i32 @spi_execute_immediate(%pl_exec_context*, i64, i64, i64*, i64, i64*, i64, %data_type*, i64, i8*, i64*, i8, i8, i8)

declare i32 @spi_alloc_complex_var(%pl_exec_context*, i8, i64, i64, i32, i64*, i64)

declare i32 @spi_construct_collection(%pl_exec_context*, i64, %objparam*)

declare i32 @spi_clear_diagnostic_area(%pl_exec_context*)

declare i32 @spi_extend_collection(%pl_exec_context*, i64, i64, i64, i64, i64)

declare i32 @spi_delete_collection(%pl_exec_context*, i64, i64, i64, i64)

declare i32 @spi_trim_collection(%pl_exec_context*, i64, i64, i64)

declare i32 @spi_cursor_init(%pl_exec_context*, i64)

declare i32 @spi_cursor_open_with_param_idx(%pl_exec_context*, i8*, i8*, i64, i8, i8, i64*, i64, i64, i64, i64, i64*, i64*, i64, i8)

declare i32 @spi_dynamic_open(%pl_exec_context*, i64, i64*, i64, i64, i64, i64)

declare i32 @spi_cursor_fetch(%pl_exec_context*, i64, i64, i64, i64*, i64, %data_type*, i64, i8*, i64*, i8, i64, %data_type*, i64, i8)

declare i32 @spi_cursor_close(%pl_exec_context*, i64, i64, i64, i8)

declare i32 @spi_destruct_collection(%pl_exec_context*, i64)

declare i32 @spi_reset_composite(i64, i8, i32)

declare i32 @spi_sub_nestedtable(%pl_exec_context*, i64, i64, i32, i32)

declare i32 @spi_copy_datum(%pl_exec_context*, i64, %obj*, %obj*, %data_type*, i64)

declare i32 @spi_destruct_obj(%pl_exec_context*, %obj*)

declare i32 @spi_set_pl_exception_code(%pl_exec_context*, i64, i8, i32)

declare i32 @spi_get_pl_exception_code(%pl_exec_context*, i64*)

declare i32 @spi_raise_application_error(%pl_exec_context*, i64, i64)

declare i32 @spi_check_early_exit(%pl_exec_context*)

declare i32 @spi_pipe_row_to_result(%pl_exec_context*, %objparam*)

declare i32 @spi_check_exception_handler_legal(%pl_exec_context*, i64)

declare i32 @spi_interface_impl(%pl_exec_context*, i8*)

declare i32 @spi_process_nocopy_params(%pl_exec_context*, i64, i8)

declare i32 @spi_add_ref_cursor_refcount(%pl_exec_context*, %obj*, i64)

declare i32 @spi_handle_ref_cursor_refcount(%pl_exec_context*, i64, i64, i64, i64)

declare i32 @spi_update_package_change_info(%pl_exec_context*, i64, i64)

declare i32 @spi_check_composite_not_null(%objparam*)

declare i32 @spi_process_resignal(%pl_exec_context*, i64, i64, i8*, i64*, i8*, i8)

declare i32 @spi_check_autonomous_trans(%pl_exec_context*)

declare i32 @spi_opaque_assign_null(i64)

declare i32 @spi_pl_profiler_before_record(%pl_exec_context*, i64, i64)

declare i32 @spi_init_composite(i64, i64, i8, i8)

declare i32 @spi_get_parent_allocator(i64, i64*)

declare i32 @spi_get_current_expr_allocator(%pl_exec_context*, i64*)

declare i32 @spi_pl_profiler_after_record(%pl_exec_context*, i64, i64)

declare %unwind_exception* @eh_create_exception(i64, i64, i64, i64, %pl_condition_value*)

declare i32 @_Unwind_RaiseException(%unwind_exception*)

declare void @_Unwind_Resume(%unwind_exception*)

declare i32 @eh_personality(i32, i32, i64, i8, i8)

declare i32 @eh_convert_exception(i8, i32, i64*, i64*, i8**, i64*)

declare i64 @eh_classify_exception(i8*)

declare void @eh_debug_int64(i8*, i64, i64)

declare void @eh_debug_int64ptr(i8*, i64, i64*)

declare void @eh_debug_int32(i8*, i64, i32)

declare void @eh_debug_int32ptr(i8*, i64, i32*)

declare void @eh_debug_int8(i8*, i64, i8)

declare void @eh_debug_int8ptr(i8*, i64, i8*)

declare void @eh_debug_obj(i8*, i64, %obj*)

declare void @eh_debug_objparam(i8*, i64, %objparam*)

declare i32 @pl_execute(%pl_exec_context*, i64, i64, i64*, i64, i64, i64, i64, i64*, i64)

declare i32 @set_user_type_var(%pl_exec_context*, i64, i64, i32)

declare i32 @set_implicit_cursor_in_forall(%pl_exec_context*, i8)

declare i32 @unset_implicit_cursor_in_forall(%pl_exec_context*)
")
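
The three WDIAG lines in the log above show observer probing `/proc/cpuinfo` for the `avx`, `avx2`, and `avx512bw` flags. The same check can be reproduced before deployment with a small Python sketch. The function name is hypothetical, and the list of flags is taken only from those log lines, not from any official requirements list.

```python
# Flags that the observer log in this thread reports probing for.
PROBED_FLAGS = ("avx", "avx2", "avx512bw")


def missing_cpu_flags(cpuinfo_text, wanted=PROBED_FLAGS):
    """Return the wanted flags that appear on no 'flags' line of cpuinfo_text."""
    present = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            # Everything after the first colon is a space-separated flag list.
            _, _, value = line.partition(":")
            present.update(value.split())
    return [flag for flag in wanted if flag not in present]


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo", encoding="utf-8") as handle:
            missing = missing_cpu_flags(handle.read())
    except OSError as exc:
        print("could not read /proc/cpuinfo:", exc)
    else:
        print("missing flags:", ", ".join(missing) if missing else "none")
```

Splitting the flag list into whole words mirrors the `' avx( |$)'` pattern in the log, so `avx` is not mistaken for a match when only `avx2` is present.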

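The CPU indeed has no AVX instruction set. The servers in a previously deployed environment are the same as these, and that deployment succeeded. For this one, I just shut the cluster down and started it again, and now it seems to be working.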


However, the alert list still shows one alert with severity Critical.

So the good old restart fixed it?

It appears to be fixed. The only thing is that a failed task is still sitting in the Task Center, which doesn't look great.

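There will still be problems. This version is unstable without AVX and will crash intermittently. You can deploy one of the following versions instead.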

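I installed OCP from the ocp-all-in-one-4.3.5-20250319105844.el7.x86_64.tar.gz package and then installed the OceanBase database directly from the OCP console. If I want a newer database version, is it enough to download the latest OCP installation package from the official website?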

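For that you need to download a standalone OB package and deploy it separately, then choose that existing OB cluster as the meta cluster when deploying OCP.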

Alternatively, if these are virtual machines, you can probably configure them to expose AVX to the guest.

I see that version 4.4.1.0 is already available on the official website. Would downloading that one work too?

Yes, that works.