obd cluster display test reports an error

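【 Environment 】 Test environment
【 OB or other components 】
【 Version 】 OceanBase 4.1
【 Problem description 】 obd cluster display test does not show the cluster information. The output is as follows:
Deploy "test" is deployed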
See OceanBase分布式数据库-海量数据 笔笔算数 .
Trace ID: af5fe84c-cc7d-11ed-bc50-d4ae5296982a
If you want to view detailed obd logs, please run: obd display-trace af5fe84c-cc7d-11ed-bc50-d4ae5296982a

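【 Steps to reproduce 】 The result is the same after a redeploy, and also after a destroy followed by a deploy under a different name.
【 Symptoms and impact 】 obd cluster list works fine: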
obd cluster list
+----------------------------------------------------------------+
|                          Cluster List                          |
+-------------+--------------------------------+-----------------+
| Name        | Configuration Path             | Status (Cached) |
+-------------+--------------------------------+-----------------+
| myoceanbase | /root/.obd/cluster/myoceanbase | destroyed       |
| test        | /root/.obd/cluster/test        | deployed        |
+-------------+--------------------------------+-----------------+
Trace ID: a2da6b38-cc7d-11ed-81b2-d4ae5296982a
If you want to view detailed obd logs, please run: obd display-trace a2da6b38-cc7d-11ed-81b2-d4ae5296982a

Log:
obd display-trace af5fe84c-cc7d-11ed-bc50-d4ae5296982a
[2023-03-27 16:59:30.364] [DEBUG] - mkdir /root/.obd/lock/
[2023-03-27 16:59:30.364] [DEBUG] - unknown lock mode
[2023-03-27 16:59:30.365] [DEBUG] - try to get share lock /root/.obd/lock/global
[2023-03-27 16:59:30.365] [DEBUG] - share lock /root/.obd/lock/global, count 1
[2023-03-27 16:59:30.365] [DEBUG] - cmd: ['test']
[2023-03-27 16:59:30.365] [DEBUG] - opts: {}
[2023-03-27 16:59:30.365] [DEBUG] - Get Deploy by name
[2023-03-27 16:59:30.366] [DEBUG] - mkdir /root/.obd/cluster/
[2023-03-27 16:59:30.366] [DEBUG] - mkdir /root/.obd/config_parser/
[2023-03-27 16:59:30.367] [DEBUG] - try to get exclusive lock /root/.obd/lock/deploy_test
[2023-03-27 16:59:30.367] [DEBUG] - exclusive lock /root/.obd/lock/deploy_test, count 1
[2023-03-27 16:59:30.384] [DEBUG] - Deploy status judge
[2023-03-27 16:59:30.384] [INFO] Deploy "test" is deployed
[2023-03-27 16:59:30.385] [INFO] See OceanBase分布式数据库-海量数据 笔笔算数 .
[2023-03-27 16:59:30.385] [INFO] Trace ID: af5fe84c-cc7d-11ed-bc50-d4ae5296982a
[2023-03-27 16:59:30.385] [INFO] If you want to view detailed obd logs, please run: obd display-trace af5fe84c-cc7d-11ed-bc50-d4ae5296982a
[2023-03-27 16:59:30.385] [DEBUG] - exclusive lock /root/.obd/lock/deploy_test release, count 0
[2023-03-27 16:59:30.385] [DEBUG] - unlock /root/.obd/lock/deploy_test
[2023-03-27 16:59:30.386] [DEBUG] - share lock /root/.obd/lock/global release, count 0
[2023-03-27 16:59:30.386] [DEBUG] - unlock /root/.obd/lock/global
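
When reading a trace like the one above, it can help to separate the few [INFO] lines from the [DEBUG] noise. Here the last debug step before the locks are released is "Deploy status judge", so display appears to stop at the status check. The following is only a rough sketch for filtering a pasted trace by level. The function name messages_at_level and the regular expression are my own, not part of obd, and the line layout is assumed from the lines shown in this thread.

```python
import re

# Assumed layout, taken from the trace in this thread:
#   [timestamp] [LEVEL] - message      (DEBUG lines)
#   [timestamp] [LEVEL] message        (INFO lines)
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\] \[(?P<level>[A-Z]+)\] (?:- )?(?P<msg>.*)$")


def messages_at_level(trace_text, level="INFO"):
    """Return the messages of all trace lines logged at the given level."""
    messages = []
    for line in trace_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and match.group("level") == level:
            messages.append(match.group("msg"))
    return messages


# Two lines copied from the trace above.
SAMPLE_TRACE = """\
[2023-03-27 16:59:30.384] [DEBUG] - Deploy status judge
[2023-03-27 16:59:30.385] [INFO] Trace ID: af5fe84c-cc7d-11ed-bc50-d4ae5296982a
"""

if __name__ == "__main__":
    for message in messages_at_level(SAMPLE_TRACE):
        print(message)
```

Saving the output of obd display-trace to a file and passing its contents to messages_at_level(text, "DEBUG") lists the debug steps in order.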

【 Attachments 】

The obd cluster deploy command only deploys the cluster. It does not start it, so display has nothing to show yet.
Please run obd cluster start next to start the cluster.
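
To check this before running display, the cached status column of obd cluster list can be read programmatically. This is a minimal sketch, assuming the table layout pasted earlier in this thread and assuming that a status of "deployed" means "deployed but not started", as the reply above describes. The helper names parse_cluster_list and needs_start are hypothetical, not obd APIs.

```python
def parse_cluster_list(output):
    """Map deployment name -> cached status from `obd cluster list` text."""
    statuses = {}
    for line in output.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        # Data rows have exactly three cells; borders, the title row and
        # the trailing "Trace ID" lines do not, and the header is skipped.
        if len(cells) != 3 or cells[0] == "Name":
            continue
        name, _path, status = cells
        statuses[name] = status
    return statuses


def needs_start(status):
    """Assumption: "deployed" means the cluster still has to be started."""
    return status == "deployed"


# Output copied from this thread.
SAMPLE_LIST = """\
+-------------+--------------------------------+-----------------+
| Name        | Configuration Path             | Status (Cached) |
+-------------+--------------------------------+-----------------+
| myoceanbase | /root/.obd/cluster/myoceanbase | destroyed       |
| test        | /root/.obd/cluster/test        | deployed        |
+-------------+--------------------------------+-----------------+
"""

if __name__ == "__main__":
    for name, status in parse_cluster_list(SAMPLE_LIST).items():
        if needs_start(status):
            print(f"run: obd cluster start {name}")
```

Note that the column is labelled "Status (Cached)", so the value may lag behind the real state of the processes.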

After running obd cluster start test, ocp-express failed to start with the following error:
bootstrap.log (42.4 KB)

java.sql.SQLException: execute sql task failed. task: ocp_metric_expr_config, sql:INSERT INTO ocp_metric_expr_config(metric,expr) VALUES ('wait_event_rt','sum(rate(ob_waitevent_wait_seconds_total{@LABELS}[@INTERVAL])) by (@GBLABELS) / sum(rate(ob_waitevent_wait_total{@LABELS}[@INTERVAL])) by (@GBLABELS)') ON DUPLICATE KEY UPDATE expr='sum(rate(ob_waitevent_wait_seconds_total{@LABELS}[@INTERVAL])) by (@GBLABELS) / sum(rate(ob_waitevent_wait_total{@LABELS}[@INTERVAL])) by (@GBLABELS)'
at com.oceanbase.ocp.bootstrap.db.DbInitializer.executeSqlTask(DbInitializer.java:216)
at com.oceanbase.ocp.bootstrap.db.DbInitializer.writeDefaultData(DbInitializer.java:254)
at com.oceanbase.ocp.bootstrap.db.DbInitializer.install(DbInitializer.java:122)
at com.oceanbase.ocp.bootstrap.db.DbInitializer.initialize(DbInitializer.java:101)
at com.oceanbase.ocp.bootstrap.spring.DBInitInterceptor.afterDataSourceCreation(DBInitInterceptor.java:70)
at com.oceanbase.ocp.bootstrap.spring.DataSourceInterceptor.lambda$afterDataSourceCreationHook$0(DataSourceInterceptor.java:43)
at java.util.ArrayList.forEach(ArrayList.java:1255)
at com.oceanbase.ocp.bootstrap.spring.DataSourceInterceptor.afterDataSourceCreationHook(DataSourceInterceptor.java:41)
at com.oceanbase.ocp.bootstrap.spring.DataSourceInterceptor.postProcessAfterInitialization(DataSourceInterceptor.java:34)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.applyBeanPostProcessorsAfterInitialization(AbstractAutowireCapableBeanFactory.java:455)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1808)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:620)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:542)
at org.springframework.beans.factory.support.AbstractBeanFactory.lambda$doGetBean$0(AbstractBeanFactory.java:335)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:234)
at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:333)
at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:208)
:

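I ran obd cluster start test once more and this time it worked.
However, the observer and obproxy processes had already been running since yesterday,
so I suspect that display only succeeds once every service has started normally.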

Yes. display only works after all services have been started.

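I ran into this problem too. Everything has to start successfully before display shows anything. It would be worth suggesting an improvement to the team. I don't know whether they have changed it yet.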

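Just now one node reported that its disk was full. After a stop and a start, ocp-express failed to start again with the same error as before. It looks like an INSERT timeout.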
bootstrap (2).log (84.3 KB)

Caused by: java.sql.SQLException: execute sql task failed. task: ocp_metric_expr_config, sql:INSERT INTO ocp_metric_expr_config(metric,expr) VALUES ('tps','sum(rate(ob_sysstat{stat_id="30007",@LABELS}[@INTERVAL])) by (@GBLABELS) + sum(rate(ob_sysstat{stat_id="30009",@LABELS}[@INTERVAL])) by (@GBLABELS) + sum(rate(ob_sysstat{stat_id="30011",@LABELS}[@INTERVAL])) by (@GBLABELS)') ON DUPLICATE KEY UPDATE expr='sum(rate(ob_sysstat{stat_id="30007",@LABELS}[@INTERVAL])) by (@GBLABELS) + sum(rate(ob_sysstat{stat_id="30009",@LABELS}[@INTERVAL])) by (@GBLABELS) + sum(rate(ob_sysstat{stat_id="30011",@LABELS}[@INTERVAL])) by (@GBLABELS)'

Caused by: java.sql.SQLTransientConnectionException: (conn=42) Timeout, query has reached the maximum query timeout: 10000000(us), maybe you can adjust the session variable ob_query_timeout or query_timeout hint, and try again.
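
The timeout in this message is given in microseconds, so 10000000(us) is 10 seconds. Below is a small sketch that pulls the value out of the message and builds the statement the message itself hints at. The helper names are hypothetical, and nothing in this thread confirms that raising ob_query_timeout fixes the ocp-express startup; with a node reporting a full disk, the slow INSERT may only be a symptom. Whether a session-level setting reaches the connection used by ocp-express is also an assumption.

```python
import re

# Matches the wording of the OceanBase error quoted above.
TIMEOUT_RE = re.compile(r"maximum query timeout: (\d+)\(us\)")


def query_timeout_seconds(message):
    """Extract the query timeout from the error text, converted to seconds."""
    match = TIMEOUT_RE.search(message)
    if match is None:
        return None
    return int(match.group(1)) / 1_000_000


def set_timeout_statement(seconds):
    """Build a SET statement; ob_query_timeout is expressed in microseconds."""
    return f"SET ob_query_timeout = {round(seconds * 1_000_000)};"


ERROR_TEXT = (
    "(conn=42) Timeout, query has reached the maximum query timeout: "
    "10000000(us), maybe you can adjust the session variable "
    "ob_query_timeout or query_timeout hint, and try again."
)

if __name__ == "__main__":
    print(query_timeout_seconds(ERROR_TEXT))
    print(set_timeout_statement(30))
```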

Got it. We will look into both the improvement suggestion and the error.

Did you open a new thread for this one: ocp-express fails to start after stopping and then restarting the cluster?