[ERROR] OBD-2002: Failed to start 127.0.0.1 observer:

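[Environment] Production
The observer was installed and deployed with the obd tool, but it fails to start.
[Version]

[Problem description] Describe the problem clearly
[Steps to reproduce] Relevant operations before and after the problem appeared
Deploy the obd tool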

tar -zxvf oceanbase-all-in-one-*.tar.gz

cd oceanbase-all-in-one/bin/

./install.sh

source ~/.oceanbase-all-in-one/bin/env.sh

Edit the initial configuration file

vi ~/.oceanbase-all-in-one/obd/usr/obd/example/mini-single-example.yaml

Deploy the OceanBase database

obd cluster deploy jby -c ~/.oceanbase-all-in-one/obd/usr/obd/example/mini-single-example.yaml

Start the OceanBase database

obd cluster start jby

obd display-trace dfadef22-232d-11f0-abca-fefcfe4314d1

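[Attachments and logs] We recommend collecting diagnostic information with obdiag, the OceanBase agile diagnostic tool. For details, see the link below (right-click to open):

[SOP Series 22]: The first step of troubleshooting (self-service diagnosis and collecting diagnostic information)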


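Based on the information in the documentation library, the following are possible causes of the problem and their solutions:

1. Insufficient memory

  • Cause: According to document fragment 10, the OBD-2002 error may be related to insufficient memory.
  • Solution
    • Try releasing cached memory:
      sudo sysctl -w vm.drop_caches=1
      # or
      echo 1 | sudo tee /proc/sys/vm/drop_caches
      
    • If memory is still insufficient, adjust the memory_limit and system_memory parameters. Edit the configuration file:
      obd cluster edit-config jby
      
      Make sure the following conditions are met:
      • For OceanBase Database 4.x, memory_limit is at least 6G.
      • For OceanBase Database versions earlier than 4.x, memory_limit is at least 8G.
      • Parameter range: memory_limit/3 ≤ system_memory ≤ memory_limit/2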
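
The range quoted above can be checked with a few lines of shell arithmetic. This is only a sketch: the function name is hypothetical, and both values are assumed to be whole gigabytes so that integer arithmetic stays exact.

```shell
# Sketch: check memory_limit/3 <= system_memory <= memory_limit/2.
# $1 = memory_limit in GB, $2 = system_memory in GB (whole numbers, an assumption).
system_memory_in_range() {
  # The divisions are multiplied out to avoid integer rounding:
  #   memory_limit/3 <= system_memory  <=>  memory_limit <= 3 * system_memory
  #   system_memory <= memory_limit/2  <=>  2 * system_memory <= memory_limit
  [ "$1" -le $(( 3 * $2 )) ] && [ $(( 2 * $2 )) -le "$1" ]
}

if system_memory_in_range 6 2; then
  echo "system_memory=2G is within range for memory_limit=6G"
fi
if ! system_memory_in_range 6 1; then
  echo "system_memory=1G is below memory_limit/3 for memory_limit=6G"
fi
```

Whether this range applies to a given OceanBase version should be confirmed against the documentation for that version.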

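2. observer fails to start

  • Cause: According to document fragments 2 and 5, the observer process may fail to start because an internal component or service failed to initialize.
  • Solution
    • Check the log files for more details. The log path is usually /home/admin/oceanbase/log or /var/lib/oceanbase/log.
    • If the logs do not reveal the problem, contact technical support for assistance.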

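3. Permission issues

  • Cause: According to document fragment 3, the observer may hit permission problems at startup, for example inconsistent ownership of the installation directory or the data disk directory.
  • Solution
    • Make sure the etc directory under the installation directory (default /home/admin/oceanbase) has the correct owner:
      chown root /home/admin/oceanbase/etc
      
    • Make sure the data disk directory (default /var/lib/oceanbase/) has enough free disk space.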

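4. obd version issues

  • Cause: According to document fragment 1, an outdated obd version may cause deployment or startup to fail.
  • Solution
    • Make sure the latest version of obd is installed. If the current version is earlier than V2.10.0, upgrade by following the "How to upgrade obd" section of the FAQ.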

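5. Other suggestions

  • Use obdiag to collect diagnostic information for further troubleshooting. Make sure obdiag has been installed with the following command:
    obd obdiag deploy
    
  • If the problem persists, collect diagnostic information with obdiag, the OceanBase agile diagnostic tool, and contact technical support.

If none of the above resolves the problem, please provide more logs or attachments for further analysis.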

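For more information, see:

  1. Change components through the graphical interface
  2. 4000 ~ 4499
  3. Deploy OceanBase Database with systemd
  4. Deploy OceanBase Database on a single machine
  5. OceanBase installation and deployment tool V1.6.0
  6. Upgrade OceanBase Database
  7. Error codes
  8. Cluster diagnostics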

[root@localhost ~]# obd cluster start jby
Get local repositories ok
Load cluster param plugin ok
Open ssh connection ok
[WARN] OBD-1007: (127.0.0.1) The recommended number of stack size is unlimited (Current value: 20480)
[WARN] OBD-1012: (127.0.0.1) clog and data use the same disk (/)
cluster scenario: htap
Start observer x
[ERROR] OBD-2002: Failed to start 127.0.0.1 observer:
See https://www.oceanbase.com/product/ob-deployer/error-codes .
Trace ID: dfadef22-232d-11f0-abca-fefcfe4314d1
If you want to view detailed obd logs, please run: obd display-trace dfadef22-232d-11f0-abca-fefcfe4314d1
[root@localhost ~]# obd display-trace dfadef22-232d-11f0-abca-fefcfe4314d1
[2025-04-27 14:07:18.172] [DEBUG] - cmd: ['jby']
[2025-04-27 14:07:18.172] [DEBUG] - opts: {'servers': None, 'components': None, 'force_delete': None, 'strict_check': None, 'without_parameter': None}
[2025-04-27 14:07:18.173] [DEBUG] - mkdir /root/.obd/lock/
[2025-04-27 14:07:18.173] [DEBUG] - unknown lock mode
[2025-04-27 14:07:18.173] [DEBUG] - try to get share lock /root/.obd/lock/global
[2025-04-27 14:07:18.174] [DEBUG] - share lock /root/.obd/lock/global, count 1
[2025-04-27 14:07:18.174] [DEBUG] - Get Deploy by name
[2025-04-27 14:07:18.174] [DEBUG] - mkdir /root/.obd/cluster/
[2025-04-27 14:07:18.174] [DEBUG] - mkdir /root/.obd/config_parser/
[2025-04-27 14:07:18.175] [DEBUG] - try to get exclusive lock /root/.obd/lock/deploy_jby
[2025-04-27 14:07:18.175] [DEBUG] - exclusive lock /root/.obd/lock/deploy_jby, count 1
[2025-04-27 14:07:18.185] [DEBUG] - Deploy status judge
[2025-04-27 14:07:18.186] [INFO] Get local repositories
[2025-04-27 14:07:18.188] [DEBUG] - mkdir /root/.obd/repository
[2025-04-27 14:07:18.188] [DEBUG] - Get local repository oceanbase-ce-4.3.5.1-3a4f23adb7973d6d1d6969bcd9ae108f8c564b66
[2025-04-27 14:07:18.189] [DEBUG] - Search repository oceanbase-ce version: 4.3.5.1, tag: 3a4f23adb7973d6d1d6969bcd9ae108f8c564b66, release: None, package_hash: None
[2025-04-27 14:07:18.189] [DEBUG] - try to get share lock /root/.obd/lock/mirror_and_repo
[2025-04-27 14:07:18.189] [DEBUG] - share lock /root/.obd/lock/mirror_and_repo, count 1
[2025-04-27 14:07:18.189] [DEBUG] - mkdir /root/.obd/repository/oceanbase-ce
[2025-04-27 14:07:18.194] [DEBUG] - Found repository oceanbase-ce-4.3.5.1-101000042025031818.el8-3a4f23adb7973d6d1d6969bcd9ae108f8c564b66
[2025-04-27 14:07:18.320] [DEBUG] - Get deploy config
[2025-04-27 14:07:18.341] [INFO] Load cluster param plugin
[2025-04-27 14:07:18.342] [DEBUG] - Get local repository oceanbase-ce-4.3.5.1-3a4f23adb7973d6d1d6969bcd9ae108f8c564b66
[2025-04-27 14:07:18.342] [DEBUG] - Searching param plugin for components …
[2025-04-27 14:07:18.342] [DEBUG] - Search param plugin for oceanbase-ce
[2025-04-27 14:07:18.342] [DEBUG] - mkdir /root/.obd/plugins
[2025-04-27 14:07:18.344] [DEBUG] - Found for oceanbase-ce-param-4.3.3.0 for oceanbase-ce-4.3.5.1
[2025-04-27 14:07:18.344] [DEBUG] - Applying oceanbase-ce-param-4.3.3.0 for oceanbase-ce-4.3.5.1-101000042025031818.el8-3a4f23adb7973d6d1d6969bcd9ae108f8c564b66
[2025-04-27 14:07:19.194] [INFO] Open ssh connection
[2025-04-27 14:07:19.332] [DEBUG] - Searching start_check template for components …
[2025-04-27 14:07:19.333] [DEBUG] - mkdir /root/.obd/workflows
[2025-04-27 14:07:19.334] [DEBUG] - Call workflow oceanbase-ce-py_script_workflow_start_check-4.3.0.0 for oceanbase-ce-4.3.5.1-101000042025031818.el8-3a4f23adb7973d6d1d6969bcd9ae108f8c564b66
[2025-04-27 14:07:19.335] [DEBUG] - mkdir /root/.obd/mirror
[2025-04-27 14:07:19.335] [DEBUG] - mkdir /root/.obd/mirror/remote
[2025-04-27 14:07:19.335] [DEBUG] - mkdir /root/.obd/mirror/local
[2025-04-27 14:07:19.335] [DEBUG] - mkdir /root/.obd/optimize/
[2025-04-27 14:07:19.336] [DEBUG] - mkdir /root/.obd/tool/
[2025-04-27 14:07:19.336] [DEBUG] - import start_check
[2025-04-27 14:07:19.337] [DEBUG] - add start_check ref count to 1
[2025-04-27 14:07:19.337] [DEBUG] - sub start_check ref count to 0
[2025-04-27 14:07:19.337] [DEBUG] - export start_check
[2025-04-27 14:07:19.337] [DEBUG] - plugin oceanbase-ce-py_script_workflow_start_check-4.3.0.0 result: True
[2025-04-27 14:07:19.337] [DEBUG] - Found for oceanbase-ce-py_script_workflow_start_check-4.3.0.0 for oceanbase-ce-4.3.0.0
[2025-04-27 14:07:19.337] [DEBUG] - share lock /root/.obd/lock/mirror_and_repo, count 2
[2025-04-27 14:07:19.341] [DEBUG] - Searching start_check_pre plugin for components …
[2025-04-27 14:07:19.341] [DEBUG] - Searching start_check_pre plugin for oceanbase-ce-4.3.5.1-101000042025031818.el8-3a4f23adb7973d6d1d6969bcd9ae108f8c564b66
[2025-04-27 14:07:19.342] [DEBUG] - Found for oceanbase-ce-py_script_start_check_pre-4.3.0.0 for oceanbase-ce-4.3.5.1
[2025-04-27 14:07:19.342] [DEBUG] - Call plugin oceanbase-ce-py_script_start_check_pre-4.3.0.0 for oceanbase-ce-4.3.5.1-101000042025031818.el8-3a4f23adb7973d6d1d6969bcd9ae108f8c564b66
[2025-04-27 14:07:19.342] [DEBUG] - import start_check_pre
[2025-04-27 14:07:19.344] [DEBUG] - add start_check_pre ref count to 1
[2025-04-27 14:07:19.345] [DEBUG] - sub start_check_pre ref count to 0
[2025-04-27 14:07:19.346] [DEBUG] - export start_check_pre
[2025-04-27 14:07:19.346] [DEBUG] - plugin oceanbase-ce-py_script_start_check_pre-4.3.0.0 result: True
[2025-04-27 14:07:19.346] [DEBUG] - Searching status_check plugin for components …
[2025-04-27 14:07:19.346] [DEBUG] - Searching status_check plugin for oceanbase-ce-4.3.5.1-101000042025031818.el8-3a4f23adb7973d6d1d6969bcd9ae108f8c564b66
[2025-04-27 14:07:19.347] [DEBUG] - Found for oceanbase-ce-py_script_status_check-4.2.1.4 for oceanbase-ce-4.3.5.1
[2025-04-27 14:07:19.347] [DEBUG] - Call plugin oceanbase-ce-py_script_status_check-4.2.1.4 for oceanbase-ce-4.3.5.1-101000042025031818.el8-3a4f23adb7973d6d1d6969bcd9ae108f8c564b66
[2025-04-27 14:07:19.347] [DEBUG] - import status_check
[2025-04-27 14:07:19.348] [DEBUG] - add status_check ref count to 1
[2025-04-27 14:07:19.349] [DEBUG] – local execute: ls /usr/local/observer/store/clog/tenant_1/
[2025-04-27 14:07:19.355] [DEBUG] – exited code 0
[2025-04-27 14:07:19.355] [DEBUG] – local execute: cat /usr/local/observer/run/observer.pid
[2025-04-27 14:07:19.361] [DEBUG] – exited code 1, error output:
[2025-04-27 14:07:19.361] [DEBUG] cat: /usr/local/observer/run/observer.pid: No such file or directory
[2025-04-27 14:07:19.361] [DEBUG]
[2025-04-27 14:07:19.362] [DEBUG] - sub status_check ref count to 0
[2025-04-27 14:07:19.362] [DEBUG] - export status_check
[2025-04-27 14:07:19.362] [DEBUG] - plugin oceanbase-ce-py_script_status_check-4.2.1.4 result: True
[2025-04-27 14:07:19.362] [DEBUG] - Searching parameter_check plugin for components …
[2025-04-27 14:07:19.362] [DEBUG] - Searching parameter_check plugin for oceanbase-ce-4.3.5.1-101000042025031818.el8-3a4f23adb7973d6d1d6969bcd9ae108f8c564b66
[2025-04-27 14:07:19.363] [DEBUG] - Found for oceanbase-ce-py_script_parameter_check-4.0.0.0 for oceanbase-ce-4.3.5.1
[2025-04-27 14:07:19.363] [DEBUG] - Call plugin oceanbase-ce-py_script_parameter_check-4.0.0.0 for oceanbase-ce-4.3.5.1-101000042025031818.el8-3a4f23adb7973d6d1d6969bcd9ae108f8c564b66
[2025-04-27 14:07:19.363] [DEBUG] - import parameter_check
[2025-04-27 14:07:19.365] [DEBUG] - add parameter_check ref count to 1
[2025-04-27 14:07:19.367] [DEBUG] – local execute: ls /usr/local/observer/store/sstable/block_file
[2025-04-27 14:07:19.373] [DEBUG] – exited code 2, error output:
[2025-04-27 14:07:19.374] [DEBUG] ls: cannot access '/usr/local/observer/store/sstable/block_file': No such file or directory
[2025-04-27 14:07:19.374] [DEBUG]
[2025-04-27 14:07:19.374] [DEBUG] - sub parameter_check ref count to 0
[2025-04-27 14:07:19.374] [DEBUG] - export parameter_check
[2025-04-27 14:07:19.374] [DEBUG] - plugin oceanbase-ce-py_script_parameter_check-4.0.0.0 result: True
[2025-04-27 14:07:19.375] [DEBUG] - Searching system_limits_check plugin for components …
[2025-04-27 14:07:19.375] [DEBUG] - Searching system_limits_check plugin for oceanbase-ce-4.3.5.1-101000042025031818.el8-3a4f23adb7973d6d1d6969bcd9ae108f8c564b66
[2025-04-27 14:07:19.376] [DEBUG] - Found for oceanbase-ce-py_script_system_limits_check-3.1.0 for oceanbase-ce-4.3.5.1
[2025-04-27 14:07:19.376] [DEBUG] - Call plugin oceanbase-ce-py_script_system_limits_check-3.1.0 for oceanbase-ce-4.3.5.1-101000042025031818.el8-3a4f23adb7973d6d1d6969bcd9ae108f8c564b66
[2025-04-27 14:07:19.376] [DEBUG] - import system_limits_check
[2025-04-27 14:07:19.379] [DEBUG] - add system_limits_check ref count to 1
[2025-04-27 14:07:19.379] [DEBUG] – local execute: cat /proc/sys/fs/aio-max-nr /proc/sys/fs/aio-nr
[2025-04-27 14:07:19.385] [DEBUG] – exited code 0
[2025-04-27 14:07:19.386] [DEBUG] – local execute: ulimit -a
[2025-04-27 14:07:19.391] [DEBUG] – exited code 0
[2025-04-27 14:07:19.392] [WARNING] OBD-1007: (127.0.0.1) The recommended number of stack size is unlimited (Current value: 20480)
[2025-04-27 14:07:19.392] [DEBUG] – local execute: sysctl -a
[2025-04-27 14:07:19.429] [DEBUG] – exited code 0
[2025-04-27 14:07:19.433] [DEBUG] - sub system_limits_check ref count to 0
[2025-04-27 14:07:19.433] [DEBUG] - export system_limits_check
[2025-04-27 14:07:19.434] [DEBUG] - plugin oceanbase-ce-py_script_system_limits_check-3.1.0 result: True
[2025-04-27 14:07:19.434] [DEBUG] - Searching resource_check plugin for components …
[2025-04-27 14:07:19.434] [DEBUG] - Searching resource_check plugin for oceanbase-ce-4.3.5.1-101000042025031818.el8-3a4f23adb7973d6d1d6969bcd9ae108f8c564b66
[2025-04-27 14:07:19.435] [DEBUG] - Found for oceanbase-ce-py_script_resource_check-4.0.0.0 for oceanbase-ce-4.3.5.1
[2025-04-27 14:07:19.435] [DEBUG] - Call plugin oceanbase-ce-py_script_resource_check-4.0.0.0 for oceanbase-ce-4.3.5.1-101000042025031818.el8-3a4f23adb7973d6d1d6969bcd9ae108f8c564b66
[2025-04-27 14:07:19.435] [DEBUG] - import resource_check
[2025-04-27 14:07:19.439] [DEBUG] - add resource_check ref count to 1
[2025-04-27 14:07:19.439] [DEBUG] – local execute: cat /proc/meminfo
[2025-04-27 14:07:19.448] [DEBUG] – exited code 0
[2025-04-27 14:07:19.449] [DEBUG] – local execute: df --block-size=1024
[2025-04-27 14:07:19.460] [DEBUG] – exited code 0
[2025-04-27 14:07:19.461] [DEBUG] – get disk info for path /dev, total: 4194304 avail: 4194304
[2025-04-27 14:07:19.462] [DEBUG] – get disk info for path /dev/shm, total: 67234385920 avail: 67234385920
[2025-04-27 14:07:19.462] [DEBUG] – get disk info for path /run, total: 26893758464 avail: 26104553472
[2025-04-27 14:07:19.462] [DEBUG] – get disk info for path /sys/fs/cgroup, total: 4194304 avail: 4194304
[2025-04-27 14:07:19.462] [DEBUG] – get disk info for path /, total: 1039367151616 avail: 508312240128
[2025-04-27 14:07:19.462] [DEBUG] – get disk info for path /tmp, total: 67234390016 avail: 67148599296
[2025-04-27 14:07:19.462] [DEBUG] – get disk info for path /boot, total: 1020702720 avail: 770285568
[2025-04-27 14:07:19.462] [DEBUG] – get disk info for path /home, total: 32694411264 avail: 30767812608
[2025-04-27 14:07:19.462] [DEBUG] – get disk info for path /var/log/rtlog, total: 62914560 avail: 62914560
[2025-04-27 14:07:19.462] [DEBUG] – local execute: df --block-size=1024 /usr/local/observer/store
[2025-04-27 14:07:19.468] [DEBUG] – exited code 0
[2025-04-27 14:07:19.469] [DEBUG] – get disk info for path /, total: 1039367151616 avail: 508312240128
[2025-04-27 14:07:19.469] [DEBUG] – local execute: df --block-size=1024 /usr/local/observer/store/clog
[2025-04-27 14:07:19.474] [DEBUG] – exited code 0
[2025-04-27 14:07:19.474] [DEBUG] – get disk info for path /, total: 1039367151616 avail: 508312240128
[2025-04-27 14:07:19.474] [DEBUG] -- disk: {'/dev': {'total': 4194304, 'avail': 4194304, 'need': 0}, '/dev/shm': {'total': 67234385920, 'avail': 67234385920, 'need': 0}, '/run': {'total': 26893758464, 'avail': 26104553472, 'need': 0}, '/sys/fs/cgroup': {'total': 4194304, 'avail': 4194304, 'need': 0}, '/': {'total': 1039367151616, 'avail': 508312240128, 'need': 0}, '/tmp': {'total': 67234390016, 'avail': 67148599296, 'need': 0}, '/boot': {'total': 1020702720, 'avail': 770285568, 'need': 0}, '/home': {'total': 32694411264, 'avail': 30767812608, 'need': 0}, '/var/log/rtlog': {'total': 62914560, 'avail': 62914560, 'need': 0}}
[2025-04-27 14:07:19.475] [WARNING] OBD-1012: (127.0.0.1) clog and data use the same disk (/)
[2025-04-27 14:07:19.475] [DEBUG] - sub resource_check ref count to 0
[2025-04-27 14:07:19.475] [DEBUG] - export resource_check
[2025-04-27 14:07:19.475] [DEBUG] - plugin oceanbase-ce-py_script_resource_check-4.0.0.0 result: True
[2025-04-27 14:07:19.476] [DEBUG] - Searching environment_check plugin for components …
[2025-04-27 14:07:19.476] [DEBUG] - Searching environment_check plugin for oceanbase-ce-4.3.5.1-101000042025031818.el8-3a4f23adb7973d6d1d6969bcd9ae108f8c564b66
[2025-04-27 14:07:19.477] [DEBUG] - Found for oceanbase-ce-py_script_environment_check-4.2.0.0 for oceanbase-ce-4.3.5.1
[2025-04-27 14:07:19.477] [DEBUG] - Call plugin oceanbase-ce-py_script_environment_check-4.2.0.0 for oceanbase-ce-4.3.5.1-101000042025031818.el8-3a4f23adb7973d6d1d6969bcd9ae108f8c564b66
[2025-04-27 14:07:19.477] [DEBUG] - import environment_check
[2025-04-27 14:07:19.480] [DEBUG] - add environment_check ref count to 1
[2025-04-27 14:07:19.481] [DEBUG] – 127.0.0.1 port check
[2025-04-27 14:07:19.481] [DEBUG] -- local execute: bash -c 'cat /proc/net/{tcp*,udp*}' | awk -F' ' '{if($4=="0A") print $2,$4,$10}' | grep ':0B41' | awk -F' ' '{print $3}' | uniq
[2025-04-27 14:07:19.500] [DEBUG] -- exited code 0
[2025-04-27 14:07:19.501] [DEBUG] -- local execute: bash -c 'cat /proc/net/{tcp*,udp*}' | awk -F' ' '{if($4=="0A") print $2,$4,$10}' | grep ':0B42' | awk -F' ' '{print $3}' | uniq
[2025-04-27 14:07:19.519] [DEBUG] – exited code 0
[2025-04-27 14:07:19.520] [DEBUG] – local execute: ping -W 1 -c 1 127.0.0.1
[2025-04-27 14:07:19.530] [DEBUG] – exited code 0
[2025-04-27 14:07:19.531] [DEBUG] - sub environment_check ref count to 0
[2025-04-27 14:07:19.531] [DEBUG] - export environment_check
[2025-04-27 14:07:19.531] [DEBUG] - plugin oceanbase-ce-py_script_environment_check-4.2.0.0 result: True
[2025-04-27 14:07:19.531] [DEBUG] - Searching obshell_port_check plugin for components …
[2025-04-27 14:07:19.532] [DEBUG] - Searching obshell_port_check plugin for oceanbase-ce-4.3.5.1-101000042025031818.el8-3a4f23adb7973d6d1d6969bcd9ae108f8c564b66
[2025-04-27 14:07:19.533] [DEBUG] - Found for oceanbase-ce-py_script_obshell_port_check-4.2.1.4 for oceanbase-ce-4.3.5.1
[2025-04-27 14:07:19.533] [DEBUG] - Call plugin oceanbase-ce-py_script_obshell_port_check-4.2.1.4 for oceanbase-ce-4.3.5.1-101000042025031818.el8-3a4f23adb7973d6d1d6969bcd9ae108f8c564b66
[2025-04-27 14:07:19.533] [DEBUG] - import obshell_port_check
[2025-04-27 14:07:19.534] [DEBUG] - add obshell_port_check ref count to 1
[2025-04-27 14:07:19.535] [DEBUG] – 127.0.0.1 port check
[2025-04-27 14:07:19.535] [DEBUG] -- local execute: bash -c 'cat /proc/net/{tcp*,udp*}' | awk -F' ' '{if($4=="0A") print $2,$4,$10}' | grep ':0B46' | awk -F' ' '{print $3}' | uniq
[2025-04-27 14:07:19.555] [DEBUG] – exited code 0
[2025-04-27 14:07:19.555] [DEBUG] - sub obshell_port_check ref count to 0
[2025-04-27 14:07:19.556] [DEBUG] - export obshell_port_check
[2025-04-27 14:07:19.556] [DEBUG] - plugin oceanbase-ce-py_script_obshell_port_check-4.2.1.4 result: True
[2025-04-27 14:07:19.556] [DEBUG] - Searching scenario_start_check plugin for components …
[2025-04-27 14:07:19.556] [DEBUG] - Searching scenario_start_check plugin for oceanbase-ce-4.3.5.1-101000042025031818.el8-3a4f23adb7973d6d1d6969bcd9ae108f8c564b66
[2025-04-27 14:07:19.557] [DEBUG] - Found for oceanbase-ce-py_script_scenario_start_check-4.3.0.0 for oceanbase-ce-4.3.5.1
[2025-04-27 14:07:19.557] [DEBUG] - Call plugin oceanbase-ce-py_script_scenario_start_check-4.3.0.0 for oceanbase-ce-4.3.5.1-101000042025031818.el8-3a4f23adb7973d6d1d6969bcd9ae108f8c564b66
[2025-04-27 14:07:19.557] [DEBUG] - import scenario_start_check
[2025-04-27 14:07:19.558] [DEBUG] - add scenario_start_check ref count to 1
[2025-04-27 14:07:19.559] [DEBUG] - sub scenario_start_check ref count to 0
[2025-04-27 14:07:19.559] [DEBUG] - export scenario_start_check
[2025-04-27 14:07:19.559] [DEBUG] - plugin oceanbase-ce-py_script_scenario_start_check-4.3.0.0 result: True
[2025-04-27 14:07:19.559] [DEBUG] - Searching ocp_tenant_check plugin for components …
[2025-04-27 14:07:19.559] [DEBUG] - Searching ocp_tenant_check plugin for oceanbase-ce-4.3.5.1-101000042025031818.el8-3a4f23adb7973d6d1d6969bcd9ae108f8c564b66
[2025-04-27 14:07:19.560] [DEBUG] - Found for oceanbase-ce-py_script_ocp_tenant_check-4.0.0.0 for oceanbase-ce-4.3.5.1
[2025-04-27 14:07:19.560] [DEBUG] - Call plugin oceanbase-ce-py_script_ocp_tenant_check-4.0.0.0 for oceanbase-ce-4.3.5.1-101000042025031818.el8-3a4f23adb7973d6d1d6969bcd9ae108f8c564b66
[2025-04-27 14:07:19.560] [DEBUG] - import ocp_tenant_check
[2025-04-27 14:07:19.562] [DEBUG] - add ocp_tenant_check ref count to 1
[2025-04-27 14:07:19.563] [DEBUG] - sub ocp_tenant_check ref count to 0
[2025-04-27 14:07:19.563] [DEBUG] - export ocp_tenant_check
[2025-04-27 14:07:19.563] [DEBUG] - plugin oceanbase-ce-py_script_ocp_tenant_check-4.0.0.0 result: True
[2025-04-27 14:07:19.563] [DEBUG] - Searching start template for components …
[2025-04-27 14:07:19.564] [DEBUG] - Call workflow oceanbase-ce-py_script_workflow_start-4.2.1.4 for oceanbase-ce-4.3.5.1-101000042025031818.el8-3a4f23adb7973d6d1d6969bcd9ae108f8c564b66
[2025-04-27 14:07:19.564] [DEBUG] - import start
[2025-04-27 14:07:19.565] [DEBUG] - add start ref count to 1
[2025-04-27 14:07:19.565] [DEBUG] - sub start ref count to 0
[2025-04-27 14:07:19.565] [DEBUG] - export start
[2025-04-27 14:07:19.565] [DEBUG] - plugin oceanbase-ce-py_script_workflow_start-4.2.1.4 result: True
[2025-04-27 14:07:19.565] [DEBUG] - Found for oceanbase-ce-py_script_workflow_start-4.2.1.4 for oceanbase-ce-4.2.1.4
[2025-04-27 14:07:19.565] [DEBUG] - Searching configserver_pre plugin for components …
[2025-04-27 14:07:19.566] [DEBUG] - Searching configserver_pre plugin for oceanbase-ce-4.3.5.1-101000042025031818.el8-3a4f23adb7973d6d1d6969bcd9ae108f8c564b66
[2025-04-27 14:07:19.566] [DEBUG] - Found for oceanbase-ce-py_script_configserver_pre-3.1.0 for oceanbase-ce-4.3.5.1
[2025-04-27 14:07:19.566] [DEBUG] - Call plugin oceanbase-ce-py_script_configserver_pre-3.1.0 for oceanbase-ce-4.3.5.1-101000042025031818.el8-3a4f23adb7973d6d1d6969bcd9ae108f8c564b66
[2025-04-27 14:07:19.566] [DEBUG] - import configserver_pre
[2025-04-27 14:07:19.568] [DEBUG] - add configserver_pre ref count to 1
[2025-04-27 14:07:19.568] [DEBUG] - sub configserver_pre ref count to 0
[2025-04-27 14:07:19.568] [DEBUG] - export configserver_pre
[2025-04-27 14:07:19.568] [DEBUG] - plugin oceanbase-ce-py_script_configserver_pre-3.1.0 result: True
[2025-04-27 14:07:19.569] [DEBUG] - Searching start_pre plugin for components …
[2025-04-27 14:07:19.569] [DEBUG] - Searching start_pre plugin for oceanbase-ce-4.3.5.1-101000042025031818.el8-3a4f23adb7973d6d1d6969bcd9ae108f8c564b66
[2025-04-27 14:07:19.570] [DEBUG] - Found for oceanbase-ce-py_script_start_pre-4.3.0.0 for oceanbase-ce-4.3.5.1
[2025-04-27 14:07:19.570] [DEBUG] - Call plugin oceanbase-ce-py_script_start_pre-4.3.0.0 for oceanbase-ce-4.3.5.1-101000042025031818.el8-3a4f23adb7973d6d1d6969bcd9ae108f8c564b66
[2025-04-27 14:07:19.570] [DEBUG] - import start_pre
[2025-04-27 14:07:19.572] [DEBUG] - add start_pre ref count to 1
[2025-04-27 14:07:19.573] [INFO] cluster scenario: htap
[2025-04-27 14:07:19.573] [DEBUG] – local execute: ls /usr/local/observer/store/clog/tenant_1/
[2025-04-27 14:07:19.579] [DEBUG] – exited code 0
[2025-04-27 14:07:19.580] [DEBUG] – local execute: cat /usr/local/observer/run/observer.pid
[2025-04-27 14:07:19.586] [DEBUG] – exited code 1, error output:
[2025-04-27 14:07:19.586] [DEBUG] cat: /usr/local/observer/run/observer.pid: No such file or directory
[2025-04-27 14:07:19.586] [DEBUG]
[2025-04-27 14:07:19.586] [DEBUG] – 127.0.0.1 start command construction
[2025-04-27 14:07:19.586] [DEBUG] – update large_query_threshold to 600s because of scenario
[2025-04-27 14:07:19.587] [DEBUG] – update enable_record_trace_log to False because of scenario
[2025-04-27 14:07:19.587] [DEBUG] – update enable_syslog_recycle to 1 because of scenario
[2025-04-27 14:07:19.587] [DEBUG] – update max_syslog_file_count to 300 because of scenario
[2025-04-27 14:07:19.587] [DEBUG] - sub start_pre ref count to 0
[2025-04-27 14:07:19.587] [DEBUG] - export start_pre
[2025-04-27 14:07:19.587] [DEBUG] - plugin oceanbase-ce-py_script_start_pre-4.3.0.0 result: True
[2025-04-27 14:07:19.587] [DEBUG] - Searching start plugin for components …
[2025-04-27 14:07:19.588] [DEBUG] - Searching start plugin for oceanbase-ce-4.3.5.1-101000042025031818.el8-3a4f23adb7973d6d1d6969bcd9ae108f8c564b66
[2025-04-27 14:07:19.589] [DEBUG] - Found for oceanbase-ce-py_script_start-3.1.0 for oceanbase-ce-4.3.5.1
[2025-04-27 14:07:19.589] [DEBUG] - Call plugin oceanbase-ce-py_script_start-3.1.0 for oceanbase-ce-4.3.5.1-101000042025031818.el8-3a4f23adb7973d6d1d6969bcd9ae108f8c564b66
[2025-04-27 14:07:19.589] [DEBUG] - import start
[2025-04-27 14:07:19.590] [DEBUG] - add start ref count to 1
[2025-04-27 14:07:19.590] [INFO] Start observer
[2025-04-27 14:07:19.591] [DEBUG] – starting 127.0.0.1 observer
[2025-04-27 14:07:19.592] [DEBUG] -- root@127.0.0.1 export LD_LIBRARY_PATH='/usr/local/observer/lib:'
[2025-04-27 14:07:19.593] [DEBUG] -- local execute: cd /usr/local/observer; /usr/local/observer/bin/observer -r '127.0.0.1:2882:2881' -p 2881 -P 2882 -z 'zone1' -n 'jby' -c 1 -d '/usr/local/observer/store' -I '127.0.0.1' -o __min_full_resource_pool_memory=2147483648,memory_limit='32G',datafile_size='2G',datafile_next='2G',datafile_maxsize='20G',log_disk_size='14G',cpu_count=16,enable_syslog_wf=False,max_syslog_file_count=4,large_query_threshold='600s',enable_record_trace_log=False,enable_syslog_recycle=1
[2025-04-27 14:07:20.159] [DEBUG] – exited code -4, error output:
[2025-04-27 14:07:20.159] [DEBUG]
[2025-04-27 14:07:20.160] [DEBUG] – root@127.0.0.1 delete env LD_LIBRARY_PATH
[2025-04-27 14:07:20.245] [ERROR] OBD-2002: Failed to start 127.0.0.1 observer:
[2025-04-27 14:07:20.246] [DEBUG] - sub start ref count to 0
[2025-04-27 14:07:20.246] [DEBUG] - export start
[2025-04-27 14:07:20.246] [DEBUG] - plugin oceanbase-ce-py_script_start-3.1.0 result: None
[2025-04-27 14:07:20.251] [DEBUG] - share lock /root/.obd/lock/mirror_and_repo release, count 1
[2025-04-27 14:07:20.251] [DEBUG] - share lock /root/.obd/lock/mirror_and_repo release, count 0
[2025-04-27 14:07:20.251] [DEBUG] - unlock /root/.obd/lock/mirror_and_repo
[2025-04-27 14:07:20.251] [DEBUG] - exclusive lock /root/.obd/lock/deploy_jby release, count 0
[2025-04-27 14:07:20.252] [DEBUG] - unlock /root/.obd/lock/deploy_jby
[2025-04-27 14:07:20.252] [DEBUG] - share lock /root/.obd/lock/global release, count 0
[2025-04-27 14:07:20.252] [DEBUG] - unlock /root/.obd/lock/global
[2025-04-27 14:07:20.252] [INFO] See https://www.oceanbase.com/product/ob-deployer/error-codes .
[2025-04-27 14:07:20.252] [INFO] Trace ID: dfadef22-232d-11f0-abca-fefcfe4314d1
[2025-04-27 14:07:20.252] [INFO] If you want to view detailed obd logs, please run: obd display-trace dfadef22-232d-11f0-abca-fefcfe4314d1
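
The port checks in the trace above grep /proc/net/tcp* and udp* for `:0B41`, `:0B42` and `:0B46`. Those are the configured ports 2881, 2882 and 2886 written in hexadecimal, and state `0A` is the TCP listen state. A small sketch of the conversion (the function name is hypothetical):

```shell
# Convert a decimal port to the 4-digit uppercase hex form used in /proc/net/tcp.
port_to_proc_hex() {
  printf '%04X\n' "$1"
}

for port in 2881 2882 2886; do
  echo "$port -> :$(port_to_proc_hex "$port")"
done
# prints:
#   2881 -> :0B41
#   2882 -> :0B42
#   2886 -> :0B46
```

Each of these checks exits with code 0 in the trace and is not followed by an error.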


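Based on the logs you provided and the content of the documentation library, the following are possible causes of the problem and their solutions:


1. Insufficient memory

  • Cause: The log shows memory_limit='32G', that is, 32 GB, but startup may have failed because the memory actually available is insufficient.
  • Solution
    • Check whether system memory meets the requirements. For OceanBase Database 4.x, a memory_limit of at least 6G is the minimum requirement.
    • If memory is insufficient, try releasing cached memory:
      sudo sysctl -w vm.drop_caches=1
      # or
      echo 1 | sudo tee /proc/sys/vm/drop_caches
      
    • Make sure the memory_limit and system_memory parameters are configured sensibly. The reference range is:
      • memory_limit/3 ≤ system_memory ≤ memory_limit/2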

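2. observer fails to start

  • Cause: The log shows exited code -4. A negative exit code usually means the process was terminated by a signal; on Linux, signal 4 is SIGILL (illegal instruction).
  • Solution
    • Check the log file /usr/local/observer/log/observer.log for more details.
    • If the logs do not reveal the problem, contact technical support for assistance.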
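
The `exited code -4` in the trace can be mapped to a signal name with the shell's own `kill -l`. This sketch assumes that obd reports a process killed by signal N as exit code -N (the convention Python's subprocess module uses); the trace itself does not state this, and the function name is hypothetical.

```shell
# Map a negative exit code from the obd trace to a signal name.
# Assumption: -N means "terminated by signal N".
signal_name_for_exit_code() {
  if [ "$1" -lt 0 ]; then
    # `kill -l N` prints the name of signal N without the SIG prefix.
    echo "SIG$(kill -l "$(( 0 - $1 ))")"
  else
    echo "not a signal exit"
  fi
}

signal_name_for_exit_code -4   # on Linux this prints SIGILL
```

SIGILL means the CPU rejected an instruction in the binary, which fits an observer that dies before writing any log.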

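3. Permission issues

  • Cause: The log reports that /usr/local/observer/run/observer.pid does not exist, possibly because a permission problem prevented the file from being created or written.
  • Solution
    • Make sure the permissions on the /usr/local/observer/run directory are correct:
      chmod -R 755 /usr/local/observer/run
      chown -R observer:observer /usr/local/observer/run
      
    • If you deployed as the root user, make sure the permissions on all related directories are set correctly.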

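4. Insufficient disk space

  • Cause: The log reports that clog and data share the same disk (/), and disk space may be insufficient.
  • Solution
    • Check whether there is enough disk space:
      df -h /
      
    • If disk space is insufficient, clean up unneeded files or expand the disk.
    • Storing clog and data on separate disks is recommended to avoid a performance bottleneck. Set the data_dir and redo_dir parameters in the configuration file, for example:
      data_dir: /mnt/data
      redo_dir: /mnt/clog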
      

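5. Parameter configuration issues

  • Cause: The log shows several parameters being adjusted dynamically (such as large_query_threshold and enable_record_trace_log); some parameters may be configured unreasonably.
  • Solution
    • Check the configuration file for unreasonable settings. For example:
      • Is __min_full_resource_pool_memory=2147483648 too low?
      • Does cpu_count=16 match the actual hardware?
    • Try restoring the default configuration and restarting:
      obd cluster edit-config jby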
      

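6. Other suggestions

  • Check system limits: The log notes that the recommended stack size is unlimited, while the current value is 20480. Adjust it with the following command:
    ulimit -s unlimited
    
  • Collect diagnostic information: If the problem persists, use obdiag to collect diagnostic information:
    obd obdiag deploy
    
  • Upgrade obd: Make sure the latest version of obd is installed. If the current version is earlier than V2.10.0, upgrade by following the "How to upgrade obd" section of the FAQ.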
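
Before changing the stack size, the current value can be compared with the OBD-1007 recommendation. A minimal sketch with a hypothetical helper name:

```shell
# Succeeds when a `ulimit -s` value matches the OBD-1007 recommendation.
stack_size_is_recommended() {
  [ "$1" = "unlimited" ]
}

current=$(ulimit -s)
if stack_size_is_recommended "$current"; then
  echo "stack size OK"
else
  echo "stack size is $current, recommended: unlimited"
fi
```

Note that `ulimit -s unlimited` only affects the current shell and the processes it starts. In the trace, OBD-1007 is logged as a warning and the start sequence continues past it.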

If none of the above resolves the problem, please provide more logs or attachments for further analysis.

For details, see:

  1. Create a user tenant in MySQL mode


Please provide the yaml file and the observer logs under ~/xxx/oceanbase/log.


Contents of the yaml file:

## Only need to configure when remote login is required
user:
   username: root
   password: *****
#   key_file: your ssh-key file path if need
#   port: your ssh port, default 22
#   timeout: ssh connection timeout (second), default 30
oceanbase-ce:
  servers:
    # Please don't use hostname, only IP can be supported
    - 127.0.0.1
  global:
    #  The working directory for OceanBase Database. OceanBase Database is started under this directory. This is a required field.
    home_path: /usr/local/observer
    # The directory for data storage. The default value is $home_path/store.
    # data_dir: /data
    # The directory for clog, ilog, and slog. The default value is the same as the data_dir value.
    # redo_dir: /redo
    # Starting from observer version 4.2, the network selection for the observer is based on the 'local_ip' parameter, and the 'devname' parameter is no longer mandatory.
    # If the 'local_ip' parameter is set, the observer will first use this parameter for the configuration, regardless of the 'devname' parameter.
    # If only the 'devname' parameter is set, the observer will use the 'devname' parameter for the configuration.
    # If neither the 'devname' nor the 'local_ip' parameters are set, the 'local_ip' parameter will be automatically assigned the IP address configured above.
    # devname: eth0
    mysql_port: 2881 # External port for OceanBase Database. The default value is 2881. DO NOT change this value after the cluster is started.
    rpc_port: 2882 # Internal port for OceanBase Database. The default value is 2882. DO NOT change this value after the cluster is started.
    obshell_port: 2886 # Operation and maintenance port for Oceanbase Database. The default value is 2886. This parameter is valid only when the version of oceanbase-ce is 4.2.2.0 or later.
    zone: zone1
    cluster_id: 1
    # please set memory limit to a suitable value which is matching resource.
    memory_limit: 120G # The maximum running memory for an observer
    #system_memory: 1G # The reserved system memory. system_memory is reserved for general tenants. The default value is 30G.
    datafile_size: 2G # Size of the data file.
    datafile_next: 2G # The auto-extend step. Please enter a capacity, such as 2G
    datafile_maxsize: 20G # The auto-extend maximum size. Please enter a capacity, such as 20G
    log_disk_size: 14G # The size of disk space used by the clog files.
    cpu_count: 16
    production_mode: false
    enable_syslog_wf: false # Print system logs whose levels are higher than WARNING to a separate log file. The default value is true.
    max_syslog_file_count: 4 # The maximum number of reserved log files before enabling auto recycling. The default value is 0.
    root_password: ****** # root user password, can be empty
    appname: jby

The log is empty.

Run obd cluster list to check the cluster status.
Also confirm whether your host actually has 120G of memory; memory_limit is the memory parameter.
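
One way to compare memory_limit with the memory the host actually has is sketched below. The function name is hypothetical, and it assumes memory_limit is given in whole gigabytes with a trailing "G" and that 1G is 1024 * 1024 kB.

```shell
# Does a memory_limit such as "120G" fit within MemTotal (given in kB)?
# $1 = memory_limit with a trailing G, $2 = MemTotal in kB.
memory_limit_fits() {
  limit_gb=${1%G}   # strip the trailing "G": "120G" -> "120"
  [ $(( limit_gb * 1024 * 1024 )) -le "$2" ]
}

# On a Linux host, MemTotal can be read from /proc/meminfo:
#   mem_total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
# A made-up 64 GiB host is used here so the example does not depend on the machine.
example_total_kb=$(( 64 * 1024 * 1024 ))
if memory_limit_fits 32G "$example_total_kb"; then echo "32G fits"; fi
if ! memory_limit_fits 120G "$example_total_kb"; then echo "120G does not fit"; fi
```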



Changing it to 32G does not work either.


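First tear down this cluster with obd cluster destroy xxxx.
There are two suggestions:
1. Use obd demo to deploy a demo cluster for testing, then size your own parameters based on the demo cluster's yaml and deploy again.
2. Use obd web for a GUI-based deployment.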


I just tried it: obd demo does not work either and fails with the same error.

[root@localhost .oceanbase-all-in-one]# obd demo
Package prometheus-2.37.1-10000102022110211.el8 is available.
Package obproxy-ce-4.3.3.0-5.el8 is available.
Package grafana-7.5.17-1 is available.
Package obagent-4.2.2-100000042024011120.el8 is available.
install prometheus-2.37.1 for local ok
install obproxy-ce-4.3.3.0 for local ok
install grafana-7.5.17 for local ok
install obagent-4.2.2 for local ok
Cluster param config check ok
Open ssh connection ok
Generate prometheus configuration ok
Generate obproxy configuration ok
Generate grafana configuration ok
Generate obagent configuration ok
+--------------------------------------------------------------------------------------------+
|                                          Packages                                          |
+--------------+---------+------------------------+------------------------------------------+
| Repository   | Version | Release                | Md5                                      |
+--------------+---------+------------------------+------------------------------------------+
| prometheus   | 2.37.1  | 10000102022110211.el8  | e4f8a3e784512fca75bf1b3464247d1f31542cb9 |
| obproxy-ce   | 4.3.3.0 | 5.el8                  | 3e5179c4e9864a29704d6b4c2a35a4ad40e7ad56 |
| grafana      | 7.5.17  | 1                      | 1bf1f338d3a3445d8599dc6902e7aeed4de4e0d6 |
| obagent      | 4.2.2   | 100000042024011120.el8 | bf152b880953c2043ddaf80d6180cf22bb8c8ac2 |
| oceanbase-ce | 4.3.5.1 | 101000042025031818.el8 | 3a4f23adb7973d6d1d6969bcd9ae108f8c564b66 |
+--------------+---------+------------------------+------------------------------------------+
Repository integrity check ok
Load param plugin ok
Open ssh connection ok
Initializes obagent work home ok
Initializes observer work home ok
Initializes obproxy work home ok
Initializes prometheus work home ok
Initializes grafana work home ok
Parameter check ok
Remote prometheus-2.37.1-10000102022110211.el8-e4f8a3e784512fca75bf1b3464247d1f31542cb9 repository install ok
Remote prometheus-2.37.1-10000102022110211.el8-e4f8a3e784512fca75bf1b3464247d1f31542cb9 repository lib check ok
Remote obproxy-ce-4.3.3.0-5.el8-3e5179c4e9864a29704d6b4c2a35a4ad40e7ad56 repository install ok
Remote obproxy-ce-4.3.3.0-5.el8-3e5179c4e9864a29704d6b4c2a35a4ad40e7ad56 repository lib check ok
Remote grafana-7.5.17-1-1bf1f338d3a3445d8599dc6902e7aeed4de4e0d6 repository install ok
Remote grafana-7.5.17-1-1bf1f338d3a3445d8599dc6902e7aeed4de4e0d6 repository lib check ok
Remote obagent-4.2.2-100000042024011120.el8-bf152b880953c2043ddaf80d6180cf22bb8c8ac2 repository install ok
Remote obagent-4.2.2-100000042024011120.el8-bf152b880953c2043ddaf80d6180cf22bb8c8ac2 repository lib check ok
Remote oceanbase-ce-4.3.5.1-101000042025031818.el8-3a4f23adb7973d6d1d6969bcd9ae108f8c564b66 repository install ok
Remote oceanbase-ce-4.3.5.1-101000042025031818.el8-3a4f23adb7973d6d1d6969bcd9ae108f8c564b66 repository lib check ok
demo deployed
Get local repositories ok
Load cluster param plugin ok
Open ssh connection ok
[WARN] OBD-1007: (127.0.0.1) The recommended number of stack size is unlimited (Current value: 20480)
[WARN] OBD-1012: (127.0.0.1) clog and data use the same disk (/)
Check before start obagent ok
Check before start prometheus ok
Check before start grafana ok
cluster scenario: express_oltp
Start observer x
[ERROR] OBD-2002: Failed to start 127.0.0.1 observer: 
See https://www.oceanbase.com/product/ob-deployer/error-codes .
Trace ID: 6d126318-233b-11f0-9f11-fefcfe4314d1
If you want to view detailed obd logs, please run: obd display-trace 6d126318-233b-11f0-9f11-fefcfe4314d1

I have deployed other clusters before without any problems; only this one fails.


Were any observer logs generated? Run lscpu and check whether the CPU supports the AVX instruction set.


lscpu

No observer logs were generated; there is only output like obd display-trace 9ba3320e-233e-11f0-837a-fefcfe4314d1.

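The observer service failed to start, so clients cannot connect.
Troubleshooting checklist
1. observer startup status
2. Logs
3. Configuration parameters
4. Data directory permissions and disk space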

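Without the AVX instruction set, OceanBase cannot be installed or run. I suggest switching to a different CPU, or trying an older version such as 421.0.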
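
A quick way to check for the AVX flag without reading the lscpu output by eye is sketched below. The helper name is hypothetical; it takes the flags string as an argument so it can be tried with sample input.

```shell
# Succeeds if a space-separated CPU flags string contains the exact flag "avx".
# Padding with spaces keeps "avx2" or "avx512f" from matching on their own.
has_avx_flag() {
  case " $1 " in
    *" avx "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Real usage on an x86 Linux host:
#   has_avx_flag "$(grep -m1 '^flags' /proc/cpuinfo)" && echo "AVX available"
has_avx_flag "fpu sse sse2 avx avx2" && echo "AVX available"
has_avx_flag "fpu sse sse2 ssse3" || echo "AVX missing"
```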


:+1: :+1: :+1: Awesome! That was indeed the problem, and it is now resolved. Thank you very much!