ocp-ce 4.3.1 restart fails

[Environment] Test environment
[OB or other component] ocp-ce
[Version] ocp-ce 4.3.1
[Problem description] Restarting ocp-ce 4.3.1 through obd fails
[Steps to reproduce] The error is shown below

Restart command:
[admin@mxt-master01 ~]$ obd cluster restart upgrade --wop  --skip-create-tenant 
[WARN] no such option: --wop
[WARN] no such option: --skip-create-tenant
Get local repositories and plugins ok
Load cluster param plugin ok
Open ssh connection ok
Cluster status check ok
Check before start ocp-server ok
Stop ocp-server ok
Check before start ocp-server ok
Start ocp-server ok
ocp-server program health check ok
Start ocp-server ok
ocp-server program health check -

ocp-server.log shows the following errors:
2025-01-02 17:45:20.087 ERROR 54071 --- [prometheus-minute-schedule-1,eaea8678b7e8df43,01fcac3191f9e06d] c.o.ocp.monitor.OcpMonitorManager        : Collect single exporter failed, exporter address = http://10.59.12.14:62889/metrics/stat

java.lang.RuntimeException: Agent cache does not contain host id, hostId=44
at com.oceanbase.ocp.monitor.helper.ExporterRequestHelper.get(ExporterRequestHelper.java:159)
at com.oceanbase.ocp.monitor.service.OcpMetricCollectServiceImpl.asyncCollect(OcpMetricCollectServiceImpl.java:152)
at com.oceanbase.ocp.monitor.service.OcpMetricCollectServiceImpl.collectIntervalMetricData(OcpMetricCollectServiceImpl.java:135)
at com.oceanbase.ocp.monitor.OcpMonitorManager$IntervalCollectScheduleTask.run(OcpMonitorManager.java:284)
at com.oceanbase.ocp.common.trace.TraceDecorator.lambda$decorate$0(TraceDecorator.java:33)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

2025-01-02 17:45:20.087 ERROR 54071 --- [prometheus-minute-schedule-1,eaea8678b7e8df43,01fcac3191f9e06d] c.o.ocp.monitor.OcpMonitorManager        : Collect single exporter failed, exporter address = http://10.59.12.15:62888/metrics/stat

java.lang.RuntimeException: Agent cache does not contain host id, hostId=45
at com.oceanbase.ocp.monitor.helper.ExporterRequestHelper.get(ExporterRequestHelper.java:159)
at com.oceanbase.ocp.monitor.service.OcpMetricCollectServiceImpl.asyncCollect(OcpMetricCollectServiceImpl.java:152)
at com.oceanbase.ocp.monitor.service.OcpMetricCollectServiceImpl.collectIntervalMetricData(OcpMetricCollectServiceImpl.java:135)
at com.oceanbase.ocp.monitor.OcpMonitorManager$IntervalCollectScheduleTask.run(OcpMonitorManager.java:284)
at com.oceanbase.ocp.common.trace.TraceDecorator.lambda$decorate$0(TraceDecorator.java:33)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

2025-01-02 17:45:20.087 ERROR 54071 --- [prometheus-minute-schedule-1,eaea8678b7e8df43,01fcac3191f9e06d] c.o.ocp.monitor.OcpMonitorManager        : Collect single exporter failed, exporter address = http://10.59.12.15:62889/metrics/stat

java.lang.RuntimeException: Agent cache does not contain host id, hostId=45
at com.oceanbase.ocp.monitor.helper.ExporterRequestHelper.get(ExporterRequestHelper.java:159)
at com.oceanbase.ocp.monitor.service.OcpMetricCollectServiceImpl.asyncCollect(OcpMetricCollectServiceImpl.java:152)
at com.oceanbase.ocp.monitor.service.OcpMetricCollectServiceImpl.collectIntervalMetricData(OcpMetricCollectServiceImpl.java:135)
at com.oceanbase.ocp.monitor.OcpMonitorManager$IntervalCollectScheduleTask.run(OcpMonitorManager.java:284)
at com.oceanbase.ocp.common.trace.TraceDecorator.lambda$decorate$0(TraceDecorator.java:33)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)

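Based on the information you provided, the command obd cluster restart upgrade --wop --skip-create-tenant uses two options that obd does not recognize, --wop and --skip-create-tenant (obd printed "[WARN] no such option" for both). This may be one of the reasons the restart failed. In addition, the log shows that after startup the OCP server ran into a problem while collecting monitoring data, reported as the error Agent cache does not contain host id.

Analysis and suggestions

  1. Fix the command arguments

    • The restart command you used contains the invalid options --wop and --skip-create-tenant. Refer to the official documentation or run obd cluster restart --help to see the correct command format.
    • The correct restart command should be:
      obd cluster restart <deploy_name>

  2. Resolve the Agent cache problem

    • The errors in the log indicate that OCP could not find cache entries for certain hosts (IDs 44 and 45) when it tried to fetch their monitoring data. This may be because the OBAgent on those hosts is not registered correctly or is misconfigured.
    • You can try the following steps:
      • Confirm that OBAgent is running on every node and can communicate with OCP.
      • Check the OBAgent configuration files and make sure they point to the correct OCP address and contain no errors.
      • If the problem persists, consider redeploying OBAgent or updating it to the latest version to ensure compatibility and stability.
  3. Check network connectivity

    • Make sure the network between all machines involved is working, in particular the IP addresses mentioned in the log (10.59.12.14 and 10.59.12.15), and that the OCP server can reach the service ports on those addresses (for example 62889).
  4. Review detailed logs

    • Use obdiag, the OceanBase agile diagnostic tool, to collect more detailed logs for further root-cause analysis. You can follow the link provided: OceanBase agile diagnostic tool obdiag

In summary, first adjust the restart command and check the OBAgent status and the network. If the problem still cannot be resolved, contact the OceanBase technical support team for further help.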
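
The connectivity check suggested in step 3 can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the helper name can_connect and the 3-second timeout are hypothetical choices, and the addresses and ports in the usage comment are the exporter endpoints that appear in the ocp-server.log excerpt above.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it only tests TCP reachability,
    not whether the exporter behind the port answers correctly.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Example usage, run from the OCP server (endpoints taken from the log above):
#   for host, port in [("10.59.12.14", 62889),
#                      ("10.59.12.15", 62888),
#                      ("10.59.12.15", 62889)]:
#       print(host, port, can_connect(host, port))
```

A True result only shows that the port accepts TCP connections; it does not confirm that OCP has the host registered in its agent cache.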

Please post the obd.log and the yaml file.

[admin@mxt-master01 ocp-server]$ obd cluster list

+----------------------------------------------------------------------------+
|                                Cluster List                                |
+----------------+-----------------------------------------+-----------------+
| Name           | Configuration Path                      | Status (Cached) |
+----------------+-----------------------------------------+-----------------+
| monitor        | /home/admin/.obd/cluster/monitor        | stopped         |
| CatlCloudDBtst | /home/admin/.obd/cluster/CatlCloudDBtst | running         |
| upgrade        | /home/admin/.obd/cluster/upgrade        | stopped         |

[admin@mxt-master01 ocp-server]$ obd cluster  edit-config   upgrade  
user:
  username: admin
  password: xxxx
  port: 22
ocp-server-ce:
  servers:
  - 10.38.14.21
  global:
    home_path: /home/admin/ocp-server
    port: 8080
    admin_password: '********'
    memory_size: 32g
    jdbc_url: jdbc:oceanbase://10.38.xx.xx:2881/meta_database
    jdbc_username: root@ocp
    jdbc_password: xxxx
    ocp_meta_tenant:
tenant_name: ocp
max_cpu: 4
memory_size: 8G
    ocp_meta_username: root
    ocp_meta_password: xxxx
    ocp_meta_db: meta_database
    ocp_monitor_tenant:
tenant_name: monitor
max_cpu: 4
memory_size: 16G
    ocp_monitor_username: root
    ocp_monitor_password: xxxx
    ocp_monitor_db: monitor_database
    soft_dir: /home/admin/ocp-server/data/files/
    log_dir: /home/admin/ocp-server/log

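Based on the output of obd cluster list and obd cluster edit-config upgrade that you provided, we can analyze further and try to resolve the OCP-CE 4.3.1 restart failure.

Analysis and suggestions

1. Confirm the cluster status

The obd cluster list output shows:

  • The upgrade cluster is in the stopped state.
  • The CatlCloudDBtst cluster is running.
  • The monitor cluster is also in the stopped state.

This indicates that the upgrade cluster has indeed stopped, so you can try to start it again. Before doing so, make sure that other services it depends on (such as monitor) are also working properly.

2. Check the configuration file

The obd cluster edit-config upgrade output shows a potential problem in the configuration file:

  • Tenant configuration format error: under ocp_meta_tenant and ocp_monitor_tenant, the tenant_name, max_cpu, and memory_size entries are not formatted correctly. They should be nested under their tenant key rather than defined at the same level. See the corrected format below: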
ocp-server-ce:
  servers:
    - 10.38.14.21
  global:
    home_path: /home/admin/ocp-server
    port: 8080
    admin_password: '********'
    memory_size: 32g
    jdbc_url: jdbc:oceanbase://10.38.xx.xx:2881/meta_database
    jdbc_username: root@ocp
    jdbc_password: xxxx
    ocp_meta_tenant:
      tenant_name: ocp
      max_cpu: 4
      memory_size: 8G
    ocp_meta_username: root
    ocp_meta_password: xxxx
    ocp_meta_db: meta_database
    ocp_monitor_tenant:
      tenant_name: monitor
      max_cpu: 4
      memory_size: 16G
    ocp_monitor_username: root
    ocp_monitor_password: xxxx
    ocp_monitor_db: monitor_database
    soft_dir: /home/admin/ocp-server/data/files/
    log_dir: /home/admin/ocp-server/log

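3. Restart command

Use the correct command to restart the upgrade cluster:

obd cluster restart upgrade

4. Log analysis

If the same error still occurs after the restart, keep checking the detailed entries in ocp-server.log. Pay particular attention to the following:

  • Confirm that OBAgent is running on every node and can communicate with OCP.
  • Check the OBAgent configuration files and make sure they point to the correct OCP address and contain no errors.
  • If the problem persists, consider redeploying OBAgent or updating it to the latest version to ensure compatibility and stability.

5. Network connectivity

Make sure the network between all machines involved is working, in particular the IP addresses mentioned in the log (10.59.12.14 and 10.59.12.15), and that the OCP server can reach the service ports on those addresses (for example 62889).

6. Collect diagnostic information with obdiag

Use obdiag, the OceanBase agile diagnostic tool, to collect more detailed logs for further root-cause analysis. You can follow the link provided: OceanBase agile diagnostic tool obdiag

Summary

First correct the tenant configuration format in the configuration file, then restart the upgrade cluster with the correct command. If the problem still cannot be resolved, continue checking the OBAgent status and the network, and use obdiag to collect more diagnostic information. If you need further help, contact the OceanBase technical support team.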

I hope these suggestions help you resolve the problem. If you have any questions or need further help, feel free to ask.

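(The forum assistant's reply ends here. If it did not solve your problem, please continue asking and wait for replies from other members. Thank you!)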

obd.log:

[2025-01-02 19:00:08.783] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- cp rpm & pos
[2025-01-02 19:00:08.868] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] - current remote_transporter RemoteTransporter.CLIENT
[2025-01-02 19:00:08.868] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: mkdir -p /home/admin/ocp-server/ocp-server/lib && rm -fr /home/admin/ocp-server/ocp-server/lib/ocp-agent-ce
-4.2.1-20231208144448.el7.x86_64.rpm 
[2025-01-02 19:00:08.920] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:00:08.921] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- send /home/admin/.obd/mirror/local/ocp-agent-ce-4.2.1-20231208144448.el7.x86_64.rpm to /home/admin/ocp-server/ocp-server/lib/ocp-agent
-ce-4.2.1-20231208144448.el7.x86_64.rpm
[2025-01-02 19:00:13.107] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] - admin@10.38.14.21 execute: chmod 644 /home/admin/ocp-server/ocp-server/lib/ocp-agent-ce-4.2.1-20231208144448.el7.x86_64.rpm 
[2025-01-02 19:00:13.184] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] - exited code 0
[2025-01-02 19:00:13.185] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: mkdir -p /home/admin/ocp-server/data/files && rm -fr /home/admin/ocp-server/data/files/ocp-agent-ce-4.2.1-2
0231208144448.el7.x86_64.rpm 
[2025-01-02 19:00:13.241] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:00:13.242] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- send /home/admin/.obd/mirror/local/ocp-agent-ce-4.2.1-20231208144448.el7.x86_64.rpm to /home/admin/ocp-server/data/files/ocp-agent-ce-
4.2.1-20231208144448.el7.x86_64.rpm
[2025-01-02 19:00:17.146] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] - admin@10.38.14.21 execute: chmod 644 /home/admin/ocp-server/data/files/ocp-agent-ce-4.2.1-20231208144448.el7.x86_64.rpm 
[2025-01-02 19:00:17.184] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] - exited code 0
[2025-01-02 19:00:17.184] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: mkdir -p /home/admin/ocp-server/ocp-server/lib && rm -fr /home/admin/ocp-server/ocp-server/lib/ocp-agent-ce
-4.2.1-20231208144448.el7.aarch64.rpm 
[2025-01-02 19:00:17.269] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:00:17.270] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- send /home/admin/.obd/mirror/local/ocp-agent-ce-4.2.1-20231208144448.el7.aarch64.rpm to /home/admin/ocp-server/ocp-server/lib/ocp-agen
t-ce-4.2.1-20231208144448.el7.aarch64.rpm
[2025-01-02 19:00:19.626] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] - admin@10.38.14.21 execute: chmod 644 /home/admin/ocp-server/ocp-server/lib/ocp-agent-ce-4.2.1-20231208144448.el7.aarch64.rpm 
[2025-01-02 19:00:22.357] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] - exited code 0
[2025-01-02 19:00:22.359] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: mkdir -p /home/admin/ocp-server/data/files && rm -fr /home/admin/ocp-server/data/files/ocp-agent-ce-4.2.1-2
0231208144448.el7.aarch64.rpm 
[2025-01-02 19:00:22.406] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:00:22.407] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- send /home/admin/.obd/mirror/local/ocp-agent-ce-4.2.1-20231208144448.el7.aarch64.rpm to /home/admin/ocp-server/data/files/ocp-agent-ce
-4.2.1-20231208144448.el7.aarch64.rpm
[2025-01-02 19:00:24.962] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] - admin@10.38.14.21 execute: chmod 644 /home/admin/ocp-server/data/files/ocp-agent-ce-4.2.1-20231208144448.el7.aarch64.rpm 
[2025-01-02 19:00:24.998] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] - exited code 0
[2025-01-02 19:00:24.998] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: cat /home/admin/ocp-server/run/ocp-server.pid 
[2025-01-02 19:00:25.069] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:00:25.069] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: ls /proc/80971 
[2025-01-02 19:00:25.140] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 2, error output:
[2025-01-02 19:00:25.140] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] ls: cannot access /proc/80971: No such file or directory
[2025-01-02 19:00:25.140] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] 
[2025-01-02 19:00:25.141] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: cd /home/admin/ocp-server; export JDBC_URL=jdbc:oceanbase://10.38.14.21:2881/meta_database; export JDBC_USE
RNAME=root@ocp;export JDBC_PASSWORD='ABcd__1324'; export JDBC_PUBLIC_KEY=;java -jar -Xms16g -Xmx16g -Docp.iam.encrypted-system-password=oceanbase /home/admin/ocp-server/lib/ocp-server.jar --bootstrap --progress-
log=/home/admin/ocp-server/log/bootstrap.log --with-property=server.port:8080 --with-property=logging.file.max-size:100MB --with-property=logging.file.total-size-cap:1GB --with-property=ocp.monitordb.host:10.38.
14.21 --with-property=ocp.monitordb.username:root@monitor --with-property=ocp.monitordb.port:2881 --with-property=ocp.monitordb.password:'ABcd__1324' --with-property=ocp.monitordb.database:monitor_database --wit
h-property=logging.file.name:/home/admin/ocp-server/log/ocp-server.log --with-property=ocp.site.url:http://10.38.14.21:8080 --with-property=obsdk.ob.connection.mode:direct --with-property=ocp.file.local.built-in
.dir:/home/admin/ocp-server/ocp-server/lib --with-property=ocp.log.download.tmp.dir:/home/admin/ocp-server/logs/ocp --with-property=ocp.file.local.dir:/home/admin/ocp-server/data/files/ > /dev/null 2>&1 & 
[2025-01-02 19:00:25.217] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:01:25.278] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: ps -aux | grep -F 'java -jar -Xms16g -Xmx16g -Docp.iam.encrypted-system-password=oceanbase /home/admin/ocp-
server/lib/ocp-server.jar --bootstrap' | grep -v grep | awk '{print $2}'  
[2025-01-02 19:01:25.450] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:01:25.451] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- write 117682 to admin@10.38.14.21:22: /home/admin/ocp-server/run/ocp-server.pid
[2025-01-02 19:01:25.454] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: mkdir -p /home/admin/ocp-server/run && rm -fr /home/admin/ocp-server/run/ocp-server.pid 
[2025-01-02 19:01:25.526] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:01:25.527] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- send /tmp/tmpezokkpfo to /home/admin/ocp-server/run/ocp-server.pid
[2025-01-02 19:01:25.570] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] - admin@10.38.14.21 execute: chmod 600 /home/admin/ocp-server/run/ocp-server.pid 
[2025-01-02 19:01:25.599] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] - exited code 0
[2025-01-02 19:01:25.724] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [INFO] ocp-server program health check
[2025-01-02 19:01:25.725] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- 10.38.14.21 program health check
[2025-01-02 19:01:25.726] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: ls /proc/117682 
[2025-01-02 19:01:25.752] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:01:25.752] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: bash -c 'cat /proc/net/{tcp*,udp*}' | awk -F' ' '{print $2,$10}' | grep '00000000:1F90' | awk -F' ' '{print
 $2}' | uniq 
[2025-01-02 19:01:25.841] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:01:25.841] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- failed to start 10.38.14.21 ocp-server, remaining retries: 39
[2025-01-02 19:01:40.857] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- 10.38.14.21 program health check
[2025-01-02 19:01:40.857] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: ls /proc/117682 
[2025-01-02 19:01:40.907] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:01:40.908] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: bash -c 'cat /proc/net/{tcp*,udp*}' | awk -F' ' '{print $2,$10}' | grep '00000000:1F90' | awk -F' ' '{print
 $2}' | uniq 
[2025-01-02 19:01:41.003] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:01:41.004] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- failed to start 10.38.14.21 ocp-server, remaining retries: 38
[2025-01-02 19:01:56.020] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- 10.38.14.21 program health check
[2025-01-02 19:01:56.020] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: ls /proc/117682 
[2025-01-02 19:01:56.107] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:01:56.108] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: bash -c 'cat /proc/net/{tcp*,udp*}' | awk -F' ' '{print $2,$10}' | grep '00000000:1F90' | awk -F' ' '{print
 $2}' | uniq 
[2025-01-02 19:01:56.196] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:01:56.197] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- failed to start 10.38.14.21 ocp-server, remaining retries: 37
[2025-01-02 19:02:11.212] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- 10.38.14.21 program health check
[2025-01-02 19:02:11.212] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: ls /proc/117682 
[2025-01-02 19:02:11.260] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:02:11.261] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: bash -c 'cat /proc/net/{tcp*,udp*}' | awk -F' ' '{print $2,$10}' | grep '00000000:1F90' | awk -F' ' '{print
 $2}' | uniq 
[2025-01-02 19:02:11.353] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:02:11.353] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- failed to start 10.38.14.21 ocp-server, remaining retries: 36
[2025-01-02 19:02:26.370] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- 10.38.14.21 program health check
[2025-01-02 19:02:26.371] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: ls /proc/117682 
[2025-01-02 19:02:26.447] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:02:26.448] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: bash -c 'cat /proc/net/{tcp*,udp*}' | awk -F' ' '{print $2,$10}' | grep '00000000:1F90' | awk -F' ' '{print
 $2}' | uniq 
[2025-01-02 19:02:28.941] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:02:28.942] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- failed to start 10.38.14.21 ocp-server, remaining retries: 35
[2025-01-02 19:02:43.958] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- 10.38.14.21 program health check
[2025-01-02 19:02:43.958] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: ls /proc/117682 
[2025-01-02 19:02:44.009] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:02:44.010] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: bash -c 'cat /proc/net/{tcp*,udp*}' | awk -F' ' '{print $2,$10}' | grep '00000000:1F90' | awk -F' ' '{print
 $2}' | uniq 
[2025-01-02 19:02:44.091] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:02:44.091] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- failed to start 10.38.14.21 ocp-server, remaining retries: 34
[2025-01-02 19:02:59.106] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- 10.38.14.21 program health check
[2025-01-02 19:02:59.106] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: ls /proc/117682 
[2025-01-02 19:03:00.359] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:03:00.360] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: bash -c 'cat /proc/net/{tcp*,udp*}' | awk -F' ' '{print $2,$10}' | grep '00000000:1F90' | awk -F' ' '{print
 $2}' | uniq 
[2025-01-02 19:03:01.439] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:03:01.439] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- failed to start 10.38.14.21 ocp-server, remaining retries: 33
[2025-01-02 19:03:16.455] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- 10.38.14.21 program health check
[2025-01-02 19:03:16.455] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: ls /proc/117682 
[2025-01-02 19:03:16.483] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:03:16.484] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: bash -c 'cat /proc/net/{tcp*,udp*}' | awk -F' ' '{print $2,$10}' | grep '00000000:1F90' | awk -F' ' '{print
 $2}' | uniq 
[2025-01-02 19:03:16.582] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:03:16.582] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- failed to start 10.38.14.21 ocp-server, remaining retries: 32
[2025-01-02 19:03:31.597] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- 10.38.14.21 program health check
[2025-01-02 19:03:31.597] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: ls /proc/117682 
[2025-01-02 19:03:35.597] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:03:35.598] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: bash -c 'cat /proc/net/{tcp*,udp*}' | awk -F' ' '{print $2,$10}' | grep '00000000:1F90' | awk -F' ' '{print
 $2}' | uniq 
[2025-01-02 19:03:35.690] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:03:35.691] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- failed to start 10.38.14.21 ocp-server, remaining retries: 31
[2025-01-02 19:03:50.706] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- 10.38.14.21 program health check
[2025-01-02 19:03:50.706] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: ls /proc/117682 
[2025-01-02 19:03:50.750] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:03:50.750] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: bash -c 'cat /proc/net/{tcp*,udp*}' | awk -F' ' '{print $2,$10}' | grep '00000000:1F90' | awk -F' ' '{print
 $2}' | uniq 
[2025-01-02 19:03:50.837] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:03:50.837] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- failed to start 10.38.14.21 ocp-server, remaining retries: 30
[2025-01-02 19:04:05.852] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- 10.38.14.21 program health check
[2025-01-02 19:04:05.852] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: ls /proc/117682 
[2025-01-02 19:04:07.087] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:04:07.087] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: bash -c 'cat /proc/net/{tcp*,udp*}' | awk -F' ' '{print $2,$10}' | grep '00000000:1F90' | awk -F' ' '{print
 $2}' | uniq 
[2025-01-02 19:04:07.134] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:04:07.135] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- failed to start 10.38.14.21 ocp-server, remaining retries: 29
[2025-01-02 19:04:22.150] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- 10.38.14.21 program health check
[2025-01-02 19:04:22.150] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: ls /proc/117682 
[2025-01-02 19:04:22.183] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:04:22.184] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: bash -c 'cat /proc/net/{tcp*,udp*}' | awk -F' ' '{print $2,$10}' | grep '00000000:1F90' | awk -F' ' '{print
 $2}' | uniq 
[2025-01-02 19:04:22.286] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:04:22.286] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- failed to start 10.38.14.21 ocp-server, remaining retries: 28
[2025-01-02 19:04:37.301] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- 10.38.14.21 program health check
[2025-01-02 19:04:37.301] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: ls /proc/117682 
[2025-01-02 19:04:37.340] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:04:37.340] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: bash -c 'cat /proc/net/{tcp*,udp*}' | awk -F' ' '{print $2,$10}' | grep '00000000:1F90' | awk -F' ' '{print
 $2}' | uniq 
[2025-01-02 19:04:37.463] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:04:37.464] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- failed to start 10.38.14.21 ocp-server, remaining retries: 27
[2025-01-02 19:04:52.479] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- 10.38.14.21 program health check
[2025-01-02 19:04:52.479] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: ls /proc/117682 
[2025-01-02 19:04:52.542] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:04:52.545] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: bash -c 'cat /proc/net/{tcp*,udp*}' | awk -F' ' '{print $2,$10}' | grep '00000000:1F90' | awk -F' ' '{print
 $2}' | uniq 
[2025-01-02 19:04:52.623] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:04:52.624] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- failed to start 10.38.14.21 ocp-server, remaining retries: 26
[2025-01-02 19:05:07.639] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- 10.38.14.21 program health check
[2025-01-02 19:05:07.640] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: ls /proc/117682 
[2025-01-02 19:05:07.693] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:05:07.693] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: bash -c 'cat /proc/net/{tcp*,udp*}' | awk -F' ' '{print $2,$10}' | grep '00000000:1F90' | awk -F' ' '{print
 $2}' | uniq 
[2025-01-02 19:05:07.818] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:05:07.818] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- failed to start 10.38.14.21 ocp-server, remaining retries: 25
[2025-01-02 19:05:22.832] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- 10.38.14.21 program health check
[2025-01-02 19:05:22.833] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: ls /proc/117682 
[2025-01-02 19:05:22.871] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:05:22.871] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: bash -c 'cat /proc/net/{tcp*,udp*}' | awk -F' ' '{print $2,$10}' | grep '00000000:1F90' | awk -F' ' '{print
 $2}' | uniq 
[2025-01-02 19:05:22.964] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:05:22.964] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- failed to start 10.38.14.21 ocp-server, remaining retries: 24
[2025-01-02 19:05:37.980] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- 10.38.14.21 program health check
[2025-01-02 19:05:37.980] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: ls /proc/117682 
[2025-01-02 19:05:39.243] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:05:39.243] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: bash -c 'cat /proc/net/{tcp*,udp*}' | awk -F' ' '{print $2,$10}' | grep '00000000:1F90' | awk -F' ' '{print
 $2}' | uniq 
[2025-01-02 19:05:39.318] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:05:39.319] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- failed to start 10.38.14.21 ocp-server, remaining retries: 23
[2025-01-02 19:05:54.334] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- 10.38.14.21 program health check
[2025-01-02 19:05:54.334] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: ls /proc/117682 
[2025-01-02 19:05:54.422] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:05:54.423] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: bash -c 'cat /proc/net/{tcp*,udp*}' | awk -F' ' '{print $2,$10}' | grep '00000000:1F90' | awk -F' ' '{print
 $2}' | uniq 
[2025-01-02 19:05:54.471] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:05:54.471] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- failed to start 10.38.14.21 ocp-server, remaining retries: 22
[2025-01-02 19:06:09.486] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- 10.38.14.21 program health check
[2025-01-02 19:06:09.487] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: ls /proc/117682 
[2025-01-02 19:06:09.706] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:06:09.706] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: bash -c 'cat /proc/net/{tcp*,udp*}' | awk -F' ' '{print $2,$10}' | grep '00000000:1F90' | awk -F' ' '{print
 $2}' | uniq 
[2025-01-02 19:11:16.861] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:11:16.862] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- failed to start 10.38.14.21 ocp-server, remaining retries: 1
[2025-01-02 19:11:31.877] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- 10.38.14.21 program health check
[2025-01-02 19:11:31.878] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: ls /proc/117682 
[2025-01-02 19:11:31.958] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:11:31.959] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- admin@10.38.14.21 execute: bash -c 'cat /proc/net/{tcp*,udp*}' | awk -F' ' '{print $2,$10}' | grep '00000000:1F90' | awk -F' ' '{print
 $2}' | uniq 
[2025-01-02 19:11:32.135] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] --- exited code 0
[2025-01-02 19:11:32.192] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [ERROR] failed to start 10.38.14.21 ocp-server
[2025-01-02 19:11:32.192] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [ERROR] start ocp-server failed
[2025-01-02 19:11:32.192] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] -- sub start ref count to 0
[2025-01-02 19:11:32.192] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] -- export start
[2025-01-02 19:11:32.192] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [ERROR] ocp-server-ce start failed
[2025-01-02 19:11:32.193] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [ERROR] OBD-1005: Some of the servers in the cluster have been stopped
[2025-01-02 19:11:32.198] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [INFO] See https://www.oceanbase.com/product/ob-deployer/error-codes .
[2025-01-02 19:11:32.199] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [INFO] Trace ID: b97bcd40-c8f8-11ef-ad9c-e4434b2db820
[2025-01-02 19:11:32.199] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [INFO] If you want to view detailed obd logs, please run: obd display-trace b97bcd40-c8f8-11ef-ad9c-e4434b2db820
[2025-01-02 19:11:32.199] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] - share lock /home/admin/.obd/lock/mirror_and_repo release, count 0
[2025-01-02 19:11:32.199] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] - unlock /home/admin/.obd/lock/mirror_and_repo
[2025-01-02 19:11:32.199] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] - exclusive lock /home/admin/.obd/lock/deploy_upgrade release, count 0
[2025-01-02 19:11:32.199] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] - unlock /home/admin/.obd/lock/deploy_upgrade
[2025-01-02 19:11:32.200] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] - share lock /home/admin/.obd/lock/global release, count 0
[2025-01-02 19:11:32.200] [b97bcd40-c8f8-11ef-ad9c-e4434b2db820] [DEBUG] - unlock /home/admin/.obd/lock/global
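
A note on the health check visible in the obd.log above: each round runs ls /proc/117682 and then greps /proc/net/{tcp*,udp*} for '00000000:1F90', printing columns 2 and 10 (local address and inode). 1F90 is 8080 in hexadecimal, so the check appears to look for a socket bound to the wildcard address on the configured port 8080. The Python sketch below mirrors that lookup for illustration only; the function names and the sample line are assumptions, not obd's actual code, and the match here is made on the local-address column only.

```python
def listen_key(port: int, addr_hex: str = "00000000") -> str:
    """Build the 'ADDR:PORT' token used in /proc/net/tcp (hex, upper case)."""
    return f"{addr_hex}:{port:04X}"


def listening_inodes(proc_net_text: str, port: int) -> list:
    """Return the inodes of entries whose local address contains the key.

    Column 2 is local_address and column 10 is inode, matching the
    awk '{print $2,$10}' call in the log. Duplicates are dropped.
    """
    key = listen_key(port)
    inodes = []
    for line in proc_net_text.splitlines():
        fields = line.split()
        if len(fields) >= 10 and key in fields[1] and fields[9] not in inodes:
            inodes.append(fields[9])
    return inodes


# Made-up sample in /proc/net/tcp layout (header plus one entry).
SAMPLE = (
    "  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode\n"
    "   0: 00000000:1F90 00000000:0000 0A 00000000:00000000 00:00000000 00000000  1000        0 123456 1\n"
)

if __name__ == "__main__":
    print(listen_key(8080))
    print(listening_inodes(SAMPLE, 8080))
```

In the log the process 117682 stays alive (ls exits with code 0) while every round still ends in "failed to start", which is consistent with the port lookup returning nothing. The rounds are about 15 seconds apart and run from 19:01:25 to 19:11:32 before obd reports OBD-1005.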

[admin@mxt-master01 ~]$ obd cluster restart upgrade --wop --skip-create-tenant
Why are you starting it without parameters (--wop)? What is the reason?
Does running obd cluster restart upgrade directly report an error?

Please run the following queries:

obclient -hxxx -Pxxx -uroot@ocp_meta -p'xxx' -Dmeta_database -A

select * from ocp_exporter_address;

select * from compute_host;


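Did you write your yaml file by hand? It is different from mine. Normally the tenants should all be under oceanbase-ce, and I don't see an oceanbase-ce component in your file.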

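obd can deploy OCP and can also deploy observer.
My yml file only deploys ocp-server.
I found the problem: on my server there is one ocp-server under the root user and another one under admin, which was confusing. I stopped the ocp-server under admin and started the one under root, and it works now.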
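
For anyone hitting the same situation, one way to spot duplicate ocp-server installations is to group the running ocp-server.jar processes by their owning user. The sketch below parses text in the layout produced by ps -eo user,pid,args. It is a hypothetical helper, and the sample paths (in particular the one under /root) are made up for illustration rather than taken from this thread.

```python
from collections import defaultdict


def ocp_servers_by_user(ps_output: str, marker: str = "ocp-server.jar") -> dict:
    """Group PIDs of processes whose command line contains marker by user.

    Expects lines of the form 'USER PID COMMAND...'; the header line is
    skipped because its PID column is not numeric.
    """
    owners = defaultdict(list)
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        user, pid, args = parts
        if pid.isdigit() and marker in args:
            owners[user].append(int(pid))
    return dict(owners)


# Made-up sample output; real input would come from: ps -eo user,pid,args
SAMPLE_PS = """USER       PID COMMAND
root      2101 java -jar /root/ocp-server/lib/ocp-server.jar --bootstrap
admin   117682 java -jar /home/admin/ocp-server/lib/ocp-server.jar --bootstrap
admin     3300 bash
"""

if __name__ == "__main__":
    print(ocp_servers_by_user(SAMPLE_PS))
```

If more than one user shows up in the result, two separate installations are running or being started, which matches the root cause described above.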