obd deployment fails with: Failed to install repository grafana

[Environment] Test environment: Anolis 8.9, Python 3.6.8
[OB or other component] obd deployment
[Version] oceanbase-all-in-one-4.2.1_bp10_20241122.el8.x86_64.tar.gz
[Problem description] obd cluster deploy obcluster -c all-components-min.yaml
[Steps to reproduce]

I am running obd on 172.17.202.234 and deploying obcluster from the configuration file all-components-min.yaml.
172.17.202.234 is intended to host ocp-express, obproxy, prometheus, and grafana.
The observer nodes are 172.17.202.235, 172.17.202.236, and 172.17.202.237.

Operating system information:

[root@master-234 ~]# python -V
-bash: python: command not found
[root@master-234 ~]# python3 -V
Python 3.6.8
[root@master-234 ~]# lsb_release -a
-bash: lsb_release: command not found
[root@master-234 ~]# cat /etc/os-release
NAME="Anolis OS"
VERSION="8.9"
ID="anolis"
ID_LIKE="rhel fedora centos"
VERSION_ID="8.9"
PLATFORM_ID="platform:an8"
PRETTY_NAME="Anolis OS 8.9"
ANSI_COLOR="0;31"
HOME_URL="https://openanolis.cn/"
[root@master-234 ~]# obd cluster deploy gkobcluster -c all-components-min.yaml
+------------------------------------------------------------------------------------------------+
|                                            Packages                                            |
+-----------------+----------+------------------------+------------------------------------------+
| Repository      | Version  | Release                | Md5                                      |
+-----------------+----------+------------------------+------------------------------------------+
| oceanbase-ce    | 4.2.1.10 | 110000072024112010.el8 | b03c714bf9d03e3424203240514359a9e8b9317a |
| obproxy-ce      | 4.3.2.0  | 26.el8                 | ea8b7867af0f99ca867d40ee12de0b89f317fa7d |
| obagent         | 4.2.2    | 100000042024011120.el8 | bf152b880953c2043ddaf80d6180cf22bb8c8ac2 |
| prometheus      | 2.37.1   | 10000102022110211.el8  | e4f8a3e784512fca75bf1b3464247d1f31542cb9 |
| grafana         | 7.5.17   | 1                      | 1bf1f338d3a3445d8599dc6902e7aeed4de4e0d6 |
| ocp-express     | 4.2.2    | 100000022024011120.el8 | e5c152ebdd65839ed5f5521ff6c73e6a29cb9e75 |
| ob-configserver | 1.0.0    | 2.el8                  | 664f93205c913d5dc84e0779d565768fd60f1d5e |
+-----------------+----------+------------------------+------------------------------------------+
Repository integrity check ok
Load param plugin ok
Open ssh connection ok
Parameter check ok
Cluster status check ok
Initializes observer work home ok
Initializes obproxy work home ok
Initializes obagent work home ok
Initializes prometheus work home ok
Initializes grafana work home ok
Initializes ocp-express work home ok
Initializes ob-configserver work home ok
Remote oceanbase-ce-4.2.1.10-110000072024112010.el8-b03c714bf9d03e3424203240514359a9e8b9317a repository install ok
Remote oceanbase-ce-4.2.1.10-110000072024112010.el8-b03c714bf9d03e3424203240514359a9e8b9317a repository lib check !!
Remote obproxy-ce-4.3.2.0-26.el8-ea8b7867af0f99ca867d40ee12de0b89f317fa7d repository install ok
Remote obproxy-ce-4.3.2.0-26.el8-ea8b7867af0f99ca867d40ee12de0b89f317fa7d repository lib check ok
Remote obagent-4.2.2-100000042024011120.el8-bf152b880953c2043ddaf80d6180cf22bb8c8ac2 repository install ok
Remote obagent-4.2.2-100000042024011120.el8-bf152b880953c2043ddaf80d6180cf22bb8c8ac2 repository lib check ok
Remote prometheus-2.37.1-10000102022110211.el8-e4f8a3e784512fca75bf1b3464247d1f31542cb9 repository install ok
Remote prometheus-2.37.1-10000102022110211.el8-e4f8a3e784512fca75bf1b3464247d1f31542cb9 repository lib check ok
Remote grafana-7.5.17-1-1bf1f338d3a3445d8599dc6902e7aeed4de4e0d6 repository install x
[ERROR] Failed to install repository grafana-7.5.17-1-1bf1f338d3a3445d8599dc6902e7aeed4de4e0d6 to /root/grafana

See https://www.oceanbase.com/product/ob-deployer/error-codes .
Trace ID: 2042d1f0-b2b9-11ef-a533-000c2902cada
If you want to view detailed obd logs, please run: obd display-trace 2042d1f0-b2b9-11ef-a533-000c2902cada

Excerpt from the grafana log:

[2024-12-05 11:29:43.372] [DEBUG] -- root@172.17.202.234 execute: cd ${source} && find -type f | xargs -i ln -fs ${source}/{} ${target}/{} 
[2024-12-05 11:29:43.418] [DEBUG] -- exited code 0
[2024-12-05 11:29:43.419] [DEBUG] -- root@172.17.202.234 execute: cd ${source} && find -type l | xargs -i ln -fs ${source}/{} ${target}/{} 
[2024-12-05 11:29:43.449] [DEBUG] -- exited code 0
[2024-12-05 11:29:43.450] [DEBUG] -- root@172.17.202.234 export source='/root/.obd/repository/grafana/7.5.17/1bf1f338d3a3445d8599dc6902e7aeed4de4e0d6/public'
[2024-12-05 11:29:43.450] [DEBUG] -- root@172.17.202.234 export target='/root/grafana/public'
[2024-12-05 11:29:43.450] [DEBUG] -- root@172.17.202.234 execute: ls -1 ${source} 
[2024-12-05 11:29:43.520] [DEBUG] -- exited code 0
[2024-12-05 11:29:43.521] [DEBUG] -- root@172.17.202.234 execute: cd ${source} && find -type f | xargs -i cp -f ${source}/{} ${target}/{} 
[2024-12-05 11:29:48.525] [ERROR] Traceback (most recent call last):
[2024-12-05 11:29:48.525] [ERROR]   File "paramiko/channel.py", line 699, in recv
[2024-12-05 11:29:48.525] [ERROR]   File "paramiko/buffered_pipe.py", line 164, in read
[2024-12-05 11:29:48.525] [ERROR] paramiko.buffered_pipe.PipeTimeout
[2024-12-05 11:29:48.525] [ERROR] 
[2024-12-05 11:29:48.525] [ERROR] During handling of the above exception, another exception occurred:
[2024-12-05 11:29:48.525] [ERROR] 
[2024-12-05 11:29:48.525] [ERROR] Traceback (most recent call last):
[2024-12-05 11:29:48.525] [ERROR]   File "core.py", line 1545, in deploy_cluster
[2024-12-05 11:29:48.525] [ERROR]   File "core.py", line 1623, in _deploy_cluster
[2024-12-05 11:29:48.525] [ERROR]   File "core.py", line 1690, in install_repositories_to_servers
[2024-12-05 11:29:48.525] [ERROR]   File "core.py", line 198, in call_plugin
[2024-12-05 11:29:48.525] [ERROR]   File "_plugin.py", line 348, in __call__
[2024-12-05 11:29:48.525] [ERROR]   File "_plugin.py", line 305, in _new_func
[2024-12-05 11:29:48.525] [ERROR]   File "/root/.obd/plugins/general/0.1/install_repo.py", line 110, in install_repo
[2024-12-05 11:29:48.525] [ERROR]     if not install_to_home_path():
[2024-12-05 11:29:48.525] [ERROR]   File "/root/.obd/plugins/general/0.1/install_repo.py", line 61, in install_to_home_path
[2024-12-05 11:29:48.525] [ERROR]     success = client.execute_command("cd ${source} && find -type f | xargs -i %(install_cmd)s ${source}/{} ${target}/{}" % {"install_cmd": install_cmd}) and success
[2024-12-05 11:29:48.525] [ERROR]   File "_plugin.py", line 225, in new_method
[2024-12-05 11:29:48.526] [ERROR]   File "_stdio.py", line 956, in func_wrapper
[2024-12-05 11:29:48.526] [ERROR]   File "ssh.py", line 504, in execute_command
[2024-12-05 11:29:48.526] [ERROR]   File "_stdio.py", line 956, in func_wrapper
[2024-12-05 11:29:48.526] [ERROR]   File "ssh.py", line 465, in _execute_command
[2024-12-05 11:29:48.526] [ERROR]   File "paramiko/file.py", line 200, in read
[2024-12-05 11:29:48.526] [ERROR]   File "paramiko/channel.py", line 1361, in _read
[2024-12-05 11:29:48.526] [ERROR]   File "paramiko/channel.py", line 701, in recv
[2024-12-05 11:29:48.526] [ERROR] socket.timeout
[2024-12-05 11:29:48.526] [ERROR] 
[2024-12-05 11:29:48.526] [DEBUG] -- root@172.17.202.234 execute: cd ${source} && find -type l | xargs -i cp -f ${source}/{} ${target}/{} 
[2024-12-05 11:29:48.609] [DEBUG] -- exited code 0
[2024-12-05 11:29:48.609] [DEBUG] -- root@172.17.202.234 export source='/root/.obd/repository/grafana/7.5.17/1bf1f338d3a3445d8599dc6902e7aeed4de4e0d6/plugins-bundled'
[2024-12-05 11:29:48.609] [DEBUG] -- root@172.17.202.234 export target='/root/grafana/plugins-bundled'
[2024-12-05 11:29:48.609] [DEBUG] -- root@172.17.202.234 execute: ls -1 ${source} 
[2024-12-05 11:29:48.647] [DEBUG] -- exited code 0
[2024-12-05 11:29:48.647] [DEBUG] -- root@172.17.202.234 execute: cd ${source} && find -type f | xargs -i cp -f ${source}/{} ${target}/{} 
[2024-12-05 11:29:48.747] [DEBUG] -- exited code 0
[2024-12-05 11:29:48.747] [DEBUG] -- root@172.17.202.234 execute: cd ${source} && find -type l | xargs -i cp -f ${source}/{} ${target}/{} 
[2024-12-05 11:29:48.780] [DEBUG] -- exited code 0
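
One detail worth reading off the excerpt above is how long the failing command ran before the exception was logged. The short Python sketch below only does arithmetic on two timestamps copied from the log (the start of the `cp -f` pass over the `public` directory and the first `[ERROR]` line). The variable names are illustrative, and the conclusion is an inference from the log, not something obd reports directly.

```python
from datetime import datetime

# Timestamps copied verbatim from the grafana log excerpt.
FMT = "%Y-%m-%d %H:%M:%S.%f"
command_started = datetime.strptime("2024-12-05 11:29:43.521", FMT)
error_logged = datetime.strptime("2024-12-05 11:29:48.525", FMT)

# Time between starting the copy command and the socket.timeout traceback.
elapsed = (error_logged - command_started).total_seconds()
print(f"elapsed before socket.timeout: {elapsed:.3f} s")
```

The gap is almost exactly five seconds, while the other commands in the excerpt finish in well under a second. That pattern is what one would expect from a fixed timeout cutting off a slow command, rather than from a broken network link.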

prometheus and grafana configuration:

prometheus:
  servers:
    - 172.17.202.234
  depends:
    - obagent
  global:
    # The working directory for prometheus. prometheus is started under this directory. This is a required field.
    home_path: /root/prometheus
grafana:
  servers:
    - 172.17.202.234
  depends:
    - prometheus
  global:
    home_path: /root/grafana
    login_password: 'grafana' # Grafana login password.

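According to the documentation, the Failed to install repository grafana error you encountered may have the following causes:

  1. Password too short: the password of the Grafana admin user must be at least 5 characters long. Make sure the password set in login_password meets this requirement. For example, you could change login_password to a longer password such as 'grafana123'.

  2. Network timeout: the log shows a socket.timeout exception during installation, which may be caused by an unstable network connection or a slow server response. Check the network connection and make sure all nodes can communicate with each other.

Resolution steps

  1. Change the password length
    Run the following command to open the configuration file, change login_password, save the file, and then run the command that obd prints to apply the change.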

    obd cluster edit-config <deploy_name>
    

    In the configuration file, find the grafana section and change login_password to a password of at least 5 characters, for example:

    grafana:
      servers:
        - 172.17.202.234
      depends:
        - prometheus
      global:
        home_path: /root/grafana
        login_password: 'grafana123' # Change to a password of at least 5 characters
    
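  2. Check the network connection
    Make sure all nodes can communicate with each other, in particular from 172.17.202.234 to the other nodes. You can test connectivity with the ping command: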

    ping 172.17.202.235
    ping 172.17.202.236
    ping 172.17.202.237
    
  3. Redeploy
    After modifying and saving the configuration file, try deploying the cluster again:

    obd cluster deploy obcluster -c all-components-min.yaml
    

If the steps above still do not resolve the problem, check the detailed obd log for more information. You can view it with the following command:

obd display-trace 2042d1f0-b2b9-11ef-a533-000c2902cada

I hope these steps help you resolve the problem. If it persists, please contact OceanBase support staff for further help.

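Solved. user.timeout was set too low; keeping the default of 30 seconds is more appropriate. This setting is not only a connection timeout: it also affects command execution.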
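
To illustrate why a timeout can cut off a command that is merely slow, here is a minimal Python sketch. It does not use obd or paramiko. It assumes only what the traceback shows, namely that the failure is a `socket.timeout` raised while reading command output, and it uses a local socket pair as a stand-in for the SSH channel. The names `slow_peer` and `read_with_timeout` are hypothetical.

```python
import socket
import threading
import time


def slow_peer(conn, delay):
    """Stand-in for a remote command that needs `delay` seconds before it produces output."""
    time.sleep(delay)
    conn.sendall(b"done")
    conn.close()


def read_with_timeout(timeout, delay):
    """Wait for the peer's output, giving up after `timeout` seconds."""
    reader, writer = socket.socketpair()
    reader.settimeout(timeout)  # applies to every blocking read, not just to connecting
    worker = threading.Thread(target=slow_peer, args=(writer, delay))
    worker.start()
    try:
        return reader.recv(16).decode()
    except socket.timeout:
        return "timeout"
    finally:
        worker.join()
        reader.close()


# The connection is already established in both calls; only the read deadline differs.
print(read_with_timeout(timeout=0.2, delay=0.5))  # command slower than the timeout
print(read_with_timeout(timeout=1.0, delay=0.1))  # command faster than the timeout
```

The first call gives up even though the peer is reachable and would have answered. This is the same shape as the grafana copy step, which involves many files and can plausibly take longer than a very short timeout allows.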

That user.timeout parameter isn't in the yaml file, is it?
Where did you change it?

It is in the deployment file. The template I used is "all-components-min.yaml", and the setting is at the very top, under user: timeout.