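Problem
The pre-production OCP fails to restart. OBD has retried starting ocp-server several times and fails every time. Version: 4.4.0-20251114143405
OCP host OS: el8, AlmaLinux release 8.10 (Cerulean Leopard)
This OCP was upgraded from the older OCP 4.3.6 using ocp-all-in-one-4.4.0-20251114143405.el7.x86_64.tar.gz. For several days after the upgrade, restarts worked normally. Yesterday afternoon OCP failed to start. Some changes had been made.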
OBD log when starting OCP
obd cluster start myocp
Get local repositories ok
Load cluster param plugin ok
Cluster status check ok
Check before start ocp-server-ce ok
Start ocp-server-ce ok
[ERROR] failed to start xxxxx.0.43 ocp-server-ce
Trace ID: ea67d2ca-c99f-11f0-88d8-00163e61ea7c
If you want to view detailed obd logs, please run: obd display-trace ea67d2ca-c99f-11f0-88d8-00163e61ea7c
obd.txt (88.6 KB)
For a short time while OBD is starting OCP, the ocp_meta tenant shows connections from ocp-server. After that, all ocp-server connections are dropped.
obclient -hxxxxx.3.138 -P2881 -uroot@ocp_meta -p -A -Dmeta_database -e"show processlist"
Enter password:
+------------+------+-------------------+---------------+---------+------+--------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+------------+------+-------------------+---------------+---------+------+--------+------------------+
| 3221641345 | root | xxxxx.0.43:50178 | meta_database | Sleep | 5 | SLEEP | NULL |
| 3221641358 | root | xxxxx.0.43:50234 | meta_database | Sleep | 5 | SLEEP | NULL |
| 3221641331 | root | xxxxx.0.43:50236 | meta_database | Sleep | 5 | SLEEP | NULL |
| 3221641365 | root | xxxxx.0.43:50154 | meta_database | Sleep | 3 | SLEEP | NULL |
| 3221641335 | root | xxxxx.0.43:50164 | meta_database | Sleep | 5 | SLEEP | NULL |
| 3221599521 | root | xxxxx.0.43:56986 | oceanbase | Sleep | 10 | SLEEP | NULL |
| 3221640571 | root | xxxxx.0.43:50256 | meta_database | Query | 0 | ACTIVE | show processlist |
| 3221641356 | root | xxxxx.0.43:50246 | meta_database | Sleep | 5 | SLEEP | NULL |
| 3221641348 | root | xxxxx.0.43:50206 | meta_database | Sleep | 5 | SLEEP | NULL |
| 3221641363 | root | xxxxx.0.43:50184 | meta_database | Sleep | 5 | SLEEP | NULL |
| 3221641309 | root | xxxxx.0.43:57294 | meta_database | Sleep | 15 | SLEEP | NULL |
| 3221641362 | root | xxxxx.0.43:50218 | meta_database | Sleep | 5 | SLEEP | NULL |
| 3221641334 | root | xxxxx.0.43:50200 | meta_database | Sleep | 5 | SLEEP | NULL |
+------------+------+-------------------+---------------+---------+------+--------+------------------+
ocp-server log
egrep 'Caused by' ocp-server.log
Caused by: org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'obTableSchemaServiceImpl': Unsatisfied dependency expressed through field 'sqlStatContextService': Error creating bean with name 'sqlStatContextServiceImpl': Invocation of init method failed
Caused by: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'sqlStatContextServiceImpl': Invocation of init method failed
Caused by: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'scopedTarget.sqlStatProperties': Invocation of init method failed
Caused by: java.util.IllegalFormatConversionException: d != java.time.Duration
Caused by: org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'obTableSchemaServiceImpl': Unsatisfied dependency expressed through field 'sqlStatContextService': Error creating bean with name 'sqlStatContextServiceImpl': Invocation of init method failed
Caused by: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'sqlStatContextServiceImpl': Invocation of init method failed
Caused by: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'scopedTarget.sqlStatProperties': Invocation of init method failed
Caused by: java.util.IllegalFormatConversionException: d != java.time.Duration
ocp-server.log (102.3 KB)
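The `Caused by` lines above read from the outermost exception to the innermost one, and the same chain appears twice. The innermost message, `d != java.time.Duration`, is what Java's formatter reports when a `%d` conversion is given a `java.time.Duration` argument. As a minimal sketch for reducing such output to its distinct causes, the Python below deduplicates the lines and takes the last one as the innermost cause. The function name `unique_causes` and the shortened sample are illustrative assumptions, not part of OCP or OBD.

```python
import re

# Matches lines such as "Caused by: some.pkg.SomeException: message".
CAUSED_BY = re.compile(r"^Caused by: (?P<exc>[\w.$]+)(?:: (?P<msg>.*))?$")

def unique_causes(lines):
    """Return (exception class, message) pairs in first-seen order, without repeats."""
    seen = []
    for line in lines:
        m = CAUSED_BY.match(line.strip())
        if m is None:
            continue  # not a "Caused by" line
        cause = (m.group("exc"), m.group("msg") or "")
        if cause not in seen:
            seen.append(cause)
    return seen

# Shortened sample taken from the egrep output above (hypothetical subset).
sample = """\
Caused by: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'sqlStatContextServiceImpl': Invocation of init method failed
Caused by: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'scopedTarget.sqlStatProperties': Invocation of init method failed
Caused by: java.util.IllegalFormatConversionException: d != java.time.Duration
Caused by: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'sqlStatContextServiceImpl': Invocation of init method failed
Caused by: java.util.IllegalFormatConversionException: d != java.time.Duration
""".splitlines()

causes = unique_causes(sample)
innermost = causes[-1]  # last distinct cause is treated as the innermost one
print(innermost)
```

This only works on already-filtered `Caused by` lines. With a full log, chain boundaries would have to be detected from the surrounding stack-trace lines instead.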
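Operations performed on the OCP platform yesterday
The OCP MetaDB host has a low-spec configuration and was short of disk space.
- Changed the monitoring data retention period, through the following OCP system parameters:
  ocp.perf.sql.sql-hist-level2-granularity 10m
  ocp.perf.sql.sql-hist-level2-query-interval 48h
  ocp.perf.sql.sql-hist-level2-retention 15d
- Manually cleaned up the local file backups of the ocp_meta tenant.
  Removed backup_set_1_full and backup_set_2_full, the historical full backup sets.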
ll /data/ocp_metabak/myocp_meta/1753959742/tenant_incarnation_1/1002/data/
drwx------ 6 admin admin 218 Nov 24 15:36 backup_set_3_full
drwx------ 2 admin admin 198 Nov 25 04:59 backup_sets
drwx------ 2 admin admin 53 Nov 21 18:49 check_file
-rw------- 1 admin admin 192 Nov 21 18:49 format.obbak
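The three parameters changed yesterday hold duration values, and the innermost exception in the ocp-server log concerns `java.time.Duration` while initializing `sqlStatProperties`. Whether those changes actually trigger the failure is not established here. As a first sanity check, the Python sketch below parses the values into `timedelta` objects so they can be compared and printed in a uniform form. The suffix convention (`s`, `m`, `h`, `d`) is an assumption inferred only from the values shown in this post, not taken from OCP documentation, and `parse_duration` is a hypothetical helper.

```python
import re
from datetime import timedelta

# Assumed unit suffixes, inferred only from the values shown in the post.
UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}

def parse_duration(text):
    """Convert a value such as '10m' or '15d' into a timedelta."""
    m = re.fullmatch(r"(\d+)([smhd])", text.strip())
    if m is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    return timedelta(**{UNITS[m.group(2)]: int(m.group(1))})

# Parameter names and values copied from the list above.
params = {
    "ocp.perf.sql.sql-hist-level2-granularity": "10m",
    "ocp.perf.sql.sql-hist-level2-query-interval": "48h",
    "ocp.perf.sql.sql-hist-level2-retention": "15d",
}

parsed = {name: parse_duration(value) for name, value in params.items()}
for name, value in parsed.items():
    print(f"{name} = {value}")
```

A value that does not match the assumed pattern raises `ValueError`, which makes a mistyped setting easy to spot before comparing it with the value stored in the MetaDB.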