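OCP fails to start after a restart [Urgent] [Important]

[Environment] Production or test environment
[OB or other component]
[Version]
[Problem description] Describe the problem clearly and precisely
[Reproduction steps] Operations performed before and after the problem occurred
[Symptoms and impact]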
– start to sync target config

  • sub start ref count to 0
  • export start
  • Call obagent-py_script_connect-1.3.0 for obagent-1.3.1-5.el7-ccfe93272a79ab1073e76f00580386c9c52e8324
  • import connect
  • add connect ref count to 1
    Connect to Obagent
    – connect obagent (192.168.1.20:8089 by user admin)
    – send http request method: GET, url: http://192.168.1.20:8089/api/v1/agent/status, data: None
    – connect obagent (192.168.1.21:8089 by user admin)
    – send http request method: GET, url: http://192.168.1.21:8089/api/v1/agent/status, data: None
    – connect obagent (192.168.1.22:8089 by user admin)
    – send http request method: GET, url: http://192.168.1.22:8089/api/v1/agent/status, data: None
  • sub connect ref count to 0
  • export connect
  • Call ocp-express-py_script_start-1.0.1 for ocp-express-1.0.1-100000072023051917.el7-59eb8062858271a23080c824c98a72f9e5896235
  • import start
  • add start ref count to 1
    Start ocp-express
    root@192.168.1.23 execute: cat /home/oceanbase/obs/ocpexpress/run/ocp-express.pid
    – exited code 0
    root@192.168.1.23 execute: ls /proc/2994
    – exited code 2, error output:
    ls: cannot access /proc/2994: No such file or directory

– connect 192.168.1.23 -P2883 -umeta@ocp -p213123
– connect 192.168.1.23 -P2883 -umeta@ocp 11
– connect 192.168.1.23 -P2883 -umeta@ocp -111

What error does the OCP log report?

Are the startup user and the directory permissions correct?
Has the mounted disk dropped off?

Please upload the obd log as an attachment so we can take a look. By default it is at ~/.obd/log/obd

Following this thread.

obd.zip (56.8 KB)

Here is the log.
This has happened about two or three times. I cleared the logs once.
The way I recover is to delete the file /home/oceanbase/obs/ocpexpress/run/ocp-express.pid
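For anyone hitting the same symptom: the log above shows obd reading ocp-express.pid and then running `ls /proc/2994`, which fails because no such process exists. The manual workaround (deleting the pid file) can be sketched as a small check that removes the file only when the recorded PID is really gone. This is a minimal sketch, not obd's own logic. The function names are hypothetical, and the thread does not confirm that a stale pid file is the root cause.

```python
import os
from pathlib import Path


def read_pid(pid_file):
    """Return the integer PID stored in pid_file, or None if missing or invalid."""
    try:
        return int(Path(pid_file).read_text().strip())
    except (FileNotFoundError, ValueError):
        return None


def process_exists(pid, proc_root="/proc"):
    """Mirror the `ls /proc/<pid>` check from the log: is there a /proc entry?"""
    return os.path.isdir(os.path.join(proc_root, str(pid)))


def remove_if_stale(pid_file, proc_root="/proc"):
    """Delete pid_file only if its PID has no live process. Return True if removed."""
    pid = read_pid(pid_file)
    if pid is None or process_exists(pid, proc_root):
        # Nothing usable to check, or the process is alive: leave the file alone.
        return False
    os.remove(pid_file)
    return True
```

On the affected host this would be called as `remove_if_stale("/home/oceanbase/obs/ocpexpress/run/ocp-express.pid")` before retrying the start. It only automates the manual deletion described above and does not explain why the pid file was left behind.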

The output of the display step all looks normal. Is there still a problem at the moment?

[2023-07-20 02:33:41.301] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] - Call oceanbase-ce-py_script_display-3.1.0 for oceanbase-ce-4.1.0.1-102000042023061314.el7-d03fafa6fa8ceb0636e4db05b5b5f6c3ac2256a3
[2023-07-20 02:33:41.301] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] - import display
[2023-07-20 02:33:41.303] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] - add display ref count to 1
[2023-07-20 02:33:41.303] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] Wait for observer init
[2023-07-20 02:33:41.304] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] -- execute sql: select * from oceanbase.__all_server. args: None
[2023-07-20 02:33:41.312] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] +------------------------------------------------+
[2023-07-20 02:33:41.312] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] |                    observer                    |
[2023-07-20 02:33:41.312] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] +--------------+---------+------+-------+--------+
[2023-07-20 02:33:41.312] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] | ip           | version | port | zone  | status |
[2023-07-20 02:33:41.312] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] +--------------+---------+------+-------+--------+
[2023-07-20 02:33:41.312] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] | 192.168.1.20 | 4.1.0.1 | 2881 | zone1 | ACTIVE |
[2023-07-20 02:33:41.312] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] | 192.168.1.21 | 4.1.0.1 | 2881 | zone2 | ACTIVE |
[2023-07-20 02:33:41.312] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] | 192.168.1.22 | 4.1.0.1 | 2881 | zone3 | ACTIVE |
[2023-07-20 02:33:41.312] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] +--------------+---------+------+-------+--------+
[2023-07-20 02:33:41.312] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] obclient -h192.168.1.20 -P2881 -uroot -p'***************' -Doceanbase -A
[2023-07-20 02:33:41.435] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] +------------------------------------------------+
[2023-07-20 02:33:41.436] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] |                    observer                    |
[2023-07-20 02:33:41.436] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] +--------------+---------+------+-------+--------+
[2023-07-20 02:33:41.436] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] | ip           | version | port | zone  | status |
[2023-07-20 02:33:41.436] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] +--------------+---------+------+-------+--------+
[2023-07-20 02:33:41.436] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] | 192.168.1.20 | 4.1.0.1 | 2881 | zone1 | ACTIVE |
[2023-07-20 02:33:41.436] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] | 192.168.1.21 | 4.1.0.1 | 2881 | zone2 | ACTIVE |
[2023-07-20 02:33:41.436] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] | 192.168.1.22 | 4.1.0.1 | 2881 | zone3 | ACTIVE |
[2023-07-20 02:33:41.436] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] +--------------+---------+------+-------+--------+
[2023-07-20 02:33:41.436] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] obclient -h192.168.1.20 -P2881 -uroot -p'***************' -Doceanbase -A
[2023-07-20 02:33:41.436] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO]
[2023-07-20 02:33:41.436] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] - sub display ref count to 0
[2023-07-20 02:33:41.436] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] - export display
[2023-07-20 02:33:41.437] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] - Call obproxy-ce-py_script_display-3.1.0 for obproxy-ce-4.1.0.0-7.el7-2a9d9bf67f179dcca2a8c9e7c77373d94e7e2abe
[2023-07-20 02:33:41.437] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] - import display
[2023-07-20 02:33:41.438] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] - add display ref count to 1
[2023-07-20 02:33:41.439] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] -- execute sql: show proxyconfig like "%port". args: None
[2023-07-20 02:33:41.448] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] +------------------------------------------------+
[2023-07-20 02:33:41.448] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] |                    obproxy                     |
[2023-07-20 02:33:41.448] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] +--------------+------+-----------------+--------+
[2023-07-20 02:33:41.448] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] | ip           | port | prometheus_port | status |
[2023-07-20 02:33:41.448] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] +--------------+------+-----------------+--------+
[2023-07-20 02:33:41.448] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] | 192.168.1.23 | 2883 | 2884            | active |
[2023-07-20 02:33:41.448] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] +--------------+------+-----------------+--------+
[2023-07-20 02:33:41.449] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] obclient -h192.168.1.23 -P2883 -uroot -p'***************' -Doceanbase -A
[2023-07-20 02:33:41.449] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] - sub display ref count to 0
[2023-07-20 02:33:41.449] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] - export display
[2023-07-20 02:33:41.450] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] - Call obagent-py_script_display-1.3.0 for obagent-1.3.1-5.el7-ccfe93272a79ab1073e76f00580386c9c52e8324
[2023-07-20 02:33:41.450] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] - import display
[2023-07-20 02:33:41.451] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] - add display ref count to 1
[2023-07-20 02:33:41.452] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] -- send http request method: GET, url: http://192.168.1.20:8089/api/v1/agent/status, data: None
[2023-07-20 02:33:41.468] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] -- send http request method: GET, url: http://192.168.1.21:8089/api/v1/agent/status, data: None
[2023-07-20 02:33:41.486] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] -- send http request method: GET, url: http://192.168.1.22:8089/api/v1/agent/status, data: None
[2023-07-20 02:33:41.504] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] +-----------------------------------------------------------------+
[2023-07-20 02:33:41.504] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] |                             obagent                             |
[2023-07-20 02:33:41.504] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] +--------------+--------------------+--------------------+--------+
[2023-07-20 02:33:41.504] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] | ip           | mgragent_http_port | monagent_http_port | status |
[2023-07-20 02:33:41.504] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] +--------------+--------------------+--------------------+--------+
[2023-07-20 02:33:41.504] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] | 192.168.1.20 | 8089               | 8088               | active |
[2023-07-20 02:33:41.504] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] | 192.168.1.21 | 8089               | 8088               | active |
[2023-07-20 02:33:41.504] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] | 192.168.1.22 | 8089               | 8088               | active |
[2023-07-20 02:33:41.504] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] +--------------+--------------------+--------------------+--------+
[2023-07-20 02:33:41.504] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] - sub display ref count to 0
[2023-07-20 02:33:41.504] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] - export display
[2023-07-20 02:33:41.505] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] - Call ocp-express-py_script_display-1.0.1 for ocp-express-1.0.1-100000072023051917.el7-59eb8062858271a23080c824c98a72f9e5896235
[2023-07-20 02:33:41.505] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] - import display
[2023-07-20 02:33:41.506] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] - add display ref count to 1
[2023-07-20 02:33:41.506] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] -- send http request method: GET, url: http://192.168.1.23:8180/api/v1/status, data: None
[2023-07-20 02:33:41.519] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] +-----------------------------------------------------------------+
[2023-07-20 02:33:41.519] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] |                           ocp-express                           |
[2023-07-20 02:33:41.519] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] +--------------------------+----------+------------------+--------+
[2023-07-20 02:33:41.519] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] | url                      | username | initial password | status |
[2023-07-20 02:33:41.519] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] +--------------------------+----------+------------------+--------+
[2023-07-20 02:33:41.519] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] | http://192.168.1.23:8180 | admin    | zGiI2%+3         | active |
[2023-07-20 02:33:41.519] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] +--------------------------+----------+------------------+--------+
[2023-07-20 02:33:41.519] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] - sub display ref count to 0
[2023-07-20 02:33:41.519] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] - export display
[2023-07-20 02:33:41.519] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] - Set obs deploy status to running
[2023-07-20 02:33:41.520] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] - dump deploy info to /root/.obd/cluster/obs/.data
[2023-07-20 02:33:41.522] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] obs running
[2023-07-20 02:33:41.525] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] Trace ID: b6d215ac-2662-11ee-a4d5-005056831b9e
[2023-07-20 02:33:41.526] [b6d215ac-2662-11ee-a4d5-005056831b9e] [INFO] If you want to view detailed obd logs, please run: obd display-trace b6d215ac-2662-11ee-a4d5-005056831b9e
[2023-07-20 02:33:41.526] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] - share lock /root/.obd/lock/mirror_and_repo release, count 3
[2023-07-20 02:33:41.527] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] - share lock /root/.obd/lock/mirror_and_repo release, count 2
[2023-07-20 02:33:41.527] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] - share lock /root/.obd/lock/mirror_and_repo release, count 1
[2023-07-20 02:33:41.527] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] - share lock /root/.obd/lock/mirror_and_repo release, count 0
[2023-07-20 02:33:41.527] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] - unlock /root/.obd/lock/mirror_and_repo
[2023-07-20 02:33:41.527] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] - exclusive lock /root/.obd/lock/deploy_obs release, count 0
[2023-07-20 02:33:41.527] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] - unlock /root/.obd/lock/deploy_obs
[2023-07-20 02:33:41.527] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] - share lock /root/.obd/lock/global release, count 0
[2023-07-20 02:33:41.527] [b6d215ac-2662-11ee-a4d5-005056831b9e] [DEBUG] - unlock /root/.obd/lock/global

After rebooting the server 2-3 times, it came back up.

The attached logs do not show any problem; everything in them looks normal.
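To double-check a large obd log for anything other than DEBUG/INFO entries, a quick filter over the `[timestamp] [trace id] [LEVEL] message` layout seen in the excerpt can help. The following is a rough sketch. The thread only shows DEBUG and INFO, so the level names it searches for (WARN, WARNING, ERROR, CRITICAL) are assumptions about what obd writes on failure, and the helper names are made up for illustration.

```python
import re
from collections import namedtuple

LogEntry = namedtuple("LogEntry", "timestamp trace_id level message")

# Matches lines such as:
# [2023-07-20 02:33:41.522] [b6d215ac-...] [INFO] obs running
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[(?P<trace>[^\]]+)\] \[(?P<level>[A-Z]+)\] ?(?P<msg>.*)$"
)

# Assumed level names; only DEBUG and INFO appear in the posted log.
PROBLEM_LEVELS = ("WARN", "WARNING", "ERROR", "CRITICAL")


def parse_line(line):
    """Parse one log line into a LogEntry, or return None if it does not match."""
    m = LINE_RE.match(line.rstrip("\n"))
    if m is None:
        return None
    return LogEntry(m["ts"], m["trace"], m["level"], m["msg"])


def find_problems(lines, levels=PROBLEM_LEVELS):
    """Return the entries whose level is in `levels`; unparseable lines are skipped."""
    hits = []
    for line in lines:
        entry = parse_line(line)
        if entry is not None and entry.level in levels:
            hits.append(entry)
    return hits
```

Typical use would be `find_problems(open(path_to_obd_log))`, then grouping the hits by `trace_id` so that the failing command can be inspected with `obd display-trace <trace id>`, as the log itself suggests.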

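1. By "restarting OCP", do you mean OCP Express? What command did you use to restart it?
2. Please confirm whether the other components had already started successfully at the time the OCP Express component failed to start.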
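One way to answer question 2 next time the failure occurs is to hit the same status URLs that obd requests in the log above. The sketch below only reports whether each endpoint answered over HTTP at all. The URLs are copied from the log, but the response bodies and any authentication the endpoints require are not shown in this thread, so an HTTP error code is reported as-is rather than interpreted. Proxies are deliberately bypassed on the assumption that these are internal addresses.

```python
import urllib.error
import urllib.request

# Status endpoints exactly as they appear in the obd log in this thread.
STATUS_URLS = {
    "obagent 192.168.1.20": "http://192.168.1.20:8089/api/v1/agent/status",
    "obagent 192.168.1.21": "http://192.168.1.21:8089/api/v1/agent/status",
    "obagent 192.168.1.22": "http://192.168.1.22:8089/api/v1/agent/status",
    "ocp-express 192.168.1.23": "http://192.168.1.23:8180/api/v1/status",
}

# An empty ProxyHandler disables any proxy configured in the environment.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def probe(url, timeout=3.0):
    """Return 'HTTP <code>' if the endpoint answered, otherwise 'unreachable'."""
    try:
        with _opener.open(url, timeout=timeout) as resp:
            return f"HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (for example 401).
        return f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return "unreachable"


def probe_all(urls=STATUS_URLS, timeout=3.0):
    """Probe every endpoint and return a {name: result} mapping."""
    return {name: probe(url, timeout) for name, url in urls.items()}
```

Running `probe_all()` right after a failed start and posting the resulting mapping would show at a glance whether only ocp-express was down or whether the obagent nodes were unreachable as well.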