OCP tenant has no monitoring data

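【Environment】Test environment
【OB or other components】ocp 4.3.5-20250319105844, ocp_agent 4.3.5-20250319105844, ob 4.2.5.6
【Version】
【Problem description】The OB tenants under this cluster have no monitoring data (some metrics are present, others are missing), and there is no SQL diagnostics data either. Host monitoring data is present.
【Steps to reproduce】
【Attachments and logs】We recommend collecting diagnostic information with obdiag, the OceanBase agile diagnostic tool. For details, see the link (right-click to open):

【SOP Series 22】Fault diagnosis, step one (self-service diagnosis and collecting diagnostic information)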




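We recommend using the obdiag tool to collect logs. This helps us understand the problem; please share the collected logs with us so we can confirm the issue.

Log collection scenario: basic cluster information
Log collection command: obdiag gather scene run --scene=observer.base

Please provide the following information so we can help you better:

  1. Have there been any recent operations or changes in the cluster?
  2. When did the monitoring data start going missing, and over what time range?
  3. Are there any abnormal alerts on the OCP management platform?
  4. Can you provide a specific SQL statement as an example of the missing SQL diagnostics data?

Attached is the help link for the agile diagnostic tool obdiag.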


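Is this the OCP metadata database cluster? How many replicas does it have?

How many tenants are there? Do all tenants show the same pattern, with some metrics present and others missing?

Is the ocp_agent status normal?

Please post ocp-server.log and monagent.log.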


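No, it is a business cluster. Some metrics are missing, and there is no SQL diagnostics data.
The OCP cluster is single-node.
The ocp_agent status is normal. I have reinstalled it and restarted it, and neither helped.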


Please post the logs mentioned above.


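Link: https://foreignfile.catl.com/outpublish.html?code=A750eddb72ff04f2d85c1a8f0e6f35b85&lang=zh-cn#view
Password: 5BBA052C

The logs cannot be sent out directly; the upload was blocked. I uploaded them here instead. Please check whether this link opens.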


What are the IPs of the nodes in this cluster?

2025-12-29T11:26:23.33671+08:00 INFO [337983,75240fae7ea847a7] caller=common/middleware.go:60:func1: API request: [GET /metrics/stat, client=10.38.14.21, ocpServerIp=, traceId=75240fae7ea847a7, body=]
2025-12-29T11:26:23.38291+08:00 INFO [337983,910da67a09d92a97] caller=common/middleware.go:60:func1: API request: [GET /metrics/ob/basic, client=10.38.37.221, ocpServerIp=, traceId=910da67a09d92a97, body=]
2025-12-29T11:26:23.38296+08:00 ERROR [337983,] caller=common/middleware_auth.go:65:Authorize: invalid header authorization: , should contain 2 content. url: /metrics/ob/basic
2025-12-29T11:26:23.38304+08:00 ERROR [337983,910da67a09d92a97] caller=web/http_server.go:50:AuthorizeMiddleware: basic auth Authorize failed, err:OcpAgentError: code = 1101, message = Authentication failed for invalid header authorization fields: url=/metrics/ob/basic
2025-12-29T11:26:25.66655+08:00 ERROR [337983,] caller=common/middleware_auth.go:65:Authorize: invalid header authorization: , should contain 2 content. url: /metrics/obproxy
2025-12-29T11:26:25.66688+08:00 ERROR [337983,c174ba90b3791d4e] caller=web/http_server.go:50:AuthorizeMiddleware: basic auth Authorize failed, err:OcpAgentError: code = 1101, message = Authentication failed for invalid header authorization fields:, url=/metrics/obproxy
2025-12-29T11:26:25.66648+08:00 INFO [337983,] caller=logtailer/log_tailer_executor.go:360:getWatchedNewLogs: getLogsWithinTime match file fields:, matchedFileRealPath=/home/admin/CatlCloudDBtst/oceanbase/log/observer.log.20251229112610991
2025-12-29T11:26:27.61063+08:00 WARN [337983,606e7e9ccc907c2b] caller=mysql/table_input.go:401:collectData: slow sql, name: ob_system_event, duration: 15.54191752s (over 100ms), sql: select /* MONITOR_AGENT */ con_id tenant_id, wait_class as event_group, sum(total_waits) as total_waits, sum(time_waited_micro / 1000000) as time_waited from v$system_event where v$system_event.wait_class <> 'IDLE' and (con_id > 1000 or con_id = 1) group by tenant_id, event_group
2025-12-29T11:26:27.61289+08:00 WARN [337983,606e7e9ccc907c2b] caller=common/input_cache.go:47:Update: update cache for key ob_system_event, err: collect ob_metric ob_system_event timeout
github.com/oceanbase/obagent/monitor/plugins/inputs/mysql.(*TableInput).collectData
	/workspace/code-repo/rpm/.rpm_create/SOURCES/ocp-agent-ce/monitor/plugins/inputs/mysql/table_input.go:489
github.com/oceanbase/obagent/monitor/plugins/inputs/mysql.(*TableInput).InitCache.func1
	/workspace/code-repo/rpm/.rpm_create/SOURCES/ocp-agent-ce/monitor/plugins/inputs/mysql/table_input.go:211
github.com/oceanbase/obagent/monitor/plugins/common.(*InputCache).Update
	/workspace/code-repo/rpm/.rpm_create/SOURCES/ocp-agent-ce/monitor/plugins/common/input_cache.go:45
runtime.goexit
	/home/admin/go/src/runtime/asm_amd64.s:1598
Caused by: io.netty.channel.AbstractChannel$AnnotatedNoRouteToHostException: No route to host: /10.200.2.124:62889
Caused by: java.net.NoRouteToHostException: No route to host
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:714)
	at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:337)
	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:776)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at java.lang.Thread.run(Thread.java:748)

2025-12-29 14:05:23.362  INFO 31051 --- [metric-parse-9,,] c.o.o.m.s.OcpMetricCollectServiceImpl    : Collect failed, exporter=http://10.200.2.124:62889/metrics/stat, collectAt=1766988323, message=java.net.ConnectException: No route to host: /10.200.2.124:62889, rootCause=NoRouteToHostException: No route to host
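
To see which failures dominate an agent log like the one above, a small script can tally WARN and ERROR lines by caller and pull out the slow collection queries. This is only a sketch: the regular expressions are inferred from the log lines in this thread, `summarize` is a hypothetical helper and not part of ocp_agent, and only durations printed in `s` or `ms` are handled.

```python
import re
from collections import Counter

# Prefix of an ocp_agent log line as seen in this thread:
#   <timestamp> <LEVEL> [<pid>,<traceId>] caller=<file>:<line>:<func>: <message>
LINE_RE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<level>INFO|WARN|ERROR)\s+\[[^\]]*\]\s+"
    r"caller=(?P<caller>[^:\s]+):\d+:(?P<func>[^:\s]+):\s*(?P<msg>.*)$"
)
# "slow sql, name: ob_system_event, duration: 15.54191752s (over 100ms)"
SLOW_SQL_RE = re.compile(
    r"slow sql, name: (?P<name>\w+), duration: (?P<value>\d+(?:\.\d+)?)(?P<unit>ms|s)\b"
)


def summarize(lines):
    """Return (Counter of (level, caller, func) for WARN/ERROR, list of (sql_name, seconds))."""
    counts = Counter()
    slow_sql = []
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # stack-trace frames and other continuation lines
        if m["level"] in ("WARN", "ERROR"):
            counts[(m["level"], m["caller"], m["func"])] += 1
        s = SLOW_SQL_RE.search(m["msg"])
        if s is not None:
            seconds = float(s["value"]) / (1000.0 if s["unit"] == "ms" else 1.0)
            slow_sql.append((s["name"], seconds))
    return counts, slow_sql


# Three lines copied from the log above.
SAMPLE = [
    "2025-12-29T11:26:23.38296+08:00 ERROR [337983,] caller=common/middleware_auth.go:65:Authorize: invalid header authorization: , should contain 2 content. url: /metrics/ob/basic",
    "2025-12-29T11:26:25.66655+08:00 ERROR [337983,] caller=common/middleware_auth.go:65:Authorize: invalid header authorization: , should contain 2 content. url: /metrics/obproxy",
    "2025-12-29T11:26:27.61063+08:00 WARN [337983,606e7e9ccc907c2b] caller=mysql/table_input.go:401:collectData: slow sql, name: ob_system_event, duration: 15.54191752s (over 100ms), sql: select /* MONITOR_AGENT */ con_id tenant_id, wait_class as event_group, sum(total_waits) as total_waits, sum(time_waited_micro / 1000000) as time_waited from v$system_event where v$system_event.wait_class <> 'IDLE' and (con_id > 1000 or con_id = 1) group by tenant_id, event_group",
]

if __name__ == "__main__":
    counts, slow_sql = summarize(SAMPLE)
    for key, n in counts.most_common():
        print(n, *key)
    for name, seconds in slow_sql:
        print(f"slow sql {name}: {seconds:.2f}s")
```

Feeding the whole monagent.log through `summarize` (for example with `open(path, encoding="utf-8")`) would show whether the authorization failures or the collection timeouts are the more frequent problem.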

Business cluster name: CatlCloudDBtst
Cluster node IPs: 10.38.14.23, 10.38.14.24, 10.38.14.25
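
The ocp-server log above reports `No route to host` for `10.200.2.124:62889`, while the nodes listed here are all in `10.38.14.x`, so it may be worth confirming from the OCP host which addresses actually accept connections on the agent port. A minimal sketch, assuming port 62889 (taken from the logs); `can_connect` and `check_nodes` are hypothetical helper names, not part of OCP.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Try a plain TCP connect and return (ok, detail)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except OSError as exc:
        # Covers refused connections, timeouts and "No route to host".
        return False, f"{type(exc).__name__}: {exc}"


def check_nodes(hosts, port, timeout=3.0):
    """Map each host to the result of can_connect(host, port)."""
    return {host: can_connect(host, port, timeout) for host in hosts}
```

Run on the OCP server, `check_nodes(["10.38.14.23", "10.38.14.24", "10.38.14.25", "10.200.2.124"], 62889)` would show per address whether the TCP connect succeeds. A successful connect only rules out a routing problem; it says nothing about the authorization errors in the agent log.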

2025-12-29T09:21:18.66653+08:00 INFO [368767,] caller=logtailer/log_tailer_executor.go:360:getWatchedNewLogs: getLogsWithinTime match file fields: matchedFileRealPath=/home/admin/CatlCloudDBtst/oceanbase/log/trace.log
2025-12-29T09:21:18.70418+08:00 ERROR [368767,cde06c8e26a4d65f] caller=engine/route_manager.go:212:ServeHTTP: failed to write http response from buffer fields: error="write tcp 10.38.14.23:62889->10.38.14.21:65256: write: broken pipe"
2025-12-29T09:21:18.70426+08:00 INFO [368767,cde06c8e26a4d65f] caller=common/middleware.go:64:func1: request end fields: duration=3.599781244s, status=200, ocpServerIp=, client=10.38.14.21, url=/metrics/ob/basic
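
A `broken pipe` on write generally means the peer closed the connection before the agent finished sending the response, and the `request end` line with the same trace ID shows that this request took about 3.6 s. One way to check whether slow responses line up with these errors is to list the requests whose duration exceeds some limit. The sketch below assumes the `request end` format shown above; the threshold is an arbitrary placeholder, since the actual collection timeout on the OCP side is not stated in this thread, and the helper names are hypothetical.

```python
import re

# Seconds per unit for Go-style duration strings such as "3.599781244s" or "1m2.5s".
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 1e-3, "us": 1e-6, "µs": 1e-6, "ns": 1e-9}
_PART_RE = re.compile(r"(\d+(?:\.\d+)?)(ns|us|µs|ms|s|m|h)")
_REQ_END_RE = re.compile(
    r"request end fields: duration=(?P<duration>[^,\s]+), status=(?P<status>\d+),.*?url=(?P<url>\S+)"
)


def go_duration_to_seconds(text):
    """Convert a Go-style duration string to seconds; raise ValueError if it does not parse."""
    parts = _PART_RE.findall(text)
    if not parts or "".join(value + unit for value, unit in parts) != text:
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(float(value) * _UNIT_SECONDS[unit] for value, unit in parts)


def slow_requests(lines, threshold_seconds):
    """Return (url, status, seconds) for 'request end' lines slower than the threshold."""
    result = []
    for line in lines:
        m = _REQ_END_RE.search(line)
        if m is None:
            continue
        seconds = go_duration_to_seconds(m["duration"])
        if seconds > threshold_seconds:
            result.append((m["url"], int(m["status"]), seconds))
    return result


# The "request end" line copied from the log above.
SAMPLE_LINE = (
    "2025-12-29T09:21:18.70426+08:00 INFO [368767,cde06c8e26a4d65f] "
    "caller=common/middleware.go:64:func1: request end fields: duration=3.599781244s, "
    "status=200, ocpServerIp=, client=10.38.14.21, url=/metrics/ob/basic"
)

if __name__ == "__main__":
    # 1.0 s is a placeholder threshold, not a documented OCP setting.
    for url, status, seconds in slow_requests([SAMPLE_LINE], 1.0):
        print(f"{url} status={status} took {seconds:.3f}s")
```

If most `broken pipe` errors share a trace ID with a slow `request end` line, that would point toward the caller giving up on slow responses rather than a problem in the write path itself.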

@旭辉 What is causing this?


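The error is similar to the one in this thread, but following the method from that thread had no effect.

There is no SQL diagnostics data, yet when I query gv$ob_sql_audit in the business database, the records are there.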


It looks like the same issue. I suggest upgrading to OCP 4.3.6 or 4.4.0; 4.4.0 is the better choice.


The version is too old.


After upgrading monagent to 4.4.0 it still does not work; there is still no monitoring data.

2025-12-30T13:31:18.75933+08:00 ERROR [337983,cd84a61265620f21] caller=common/adapter_common.go:23:1: failed to write http response from buffer fields: error="write tcp 10.38.14.23:62889->10.38.14.21:62160: write: broken pipe"

This error also appears.
