obproxy fails to start after a restart

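【 Environment 】Production or test environment
【 OB or other component 】obproxy (OceanBase 4.3.1.0 4.el8)
【 Version 】
【 Problem description 】After a restart, obproxy fails to start. The log below is everything it prints; the process exits with return code 0.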

[2024-10-16 09:00:20.116632] INFO [PROXY] init_log (ob_proxy_main.cpp:622) [8506][Y0-0000000000000000] [lt=0] [dc=0] succ to init logger(max_log_file_size=268435456, async_tid=281460330719888)
[2024-10-16 09:00:20.116660] INFO open (ob_file.cpp:59) [8506][Y0-0000000000000000] [lt=0] [dc=0] open fname=[/dev/urandom] fd=16 flags=0 succ
[2024-10-16 09:00:20.116964] INFO [PROXY] start (ob_proxy_main.cpp:501) [8506][Y0-0000000000000000] [lt=0] [dc=0] ObProxy-OceanBase 4.3.1.0-4.el8-1-local-81eeae92526f5efbee516e863c7234532dfa6175
[2024-10-16 09:00:20.116971] INFO [PROXY] start (ob_proxy_main.cpp:507) [8506][Y0-0000000000000000] [lt=0] [dc=0] has no inherited sockets, start new obproxy((info={is_inherited:false, upgrade_version:-1, need_conn_accept:true, user_rejected:0, ipv4_fd:-1, ipv6_fd:-1, rpc_ipv4_fd:-1, rpc_ipv6_fd:-1, received_sig:-1, sub_pid:-1, graceful_exit_end_time:0, graceful_exit_start_time:0, active_client_vc_count:-1, local_addr:, rpc_local_addr:, rc_status:"", hu_cmd:"", state:"HU_STATE_WAIT_HU_CMD", hu_status:"", is_parent:true, sub_status:"", last_parent_status:"", last_sub_status:"", upgrade_version_buf:"", argc:11, argv[0]="/opt/obproxy/bin/obproxy", argv[1]="-o", argv[2]="client_session_id_version=2,proxy_id=3585,obproxy_sys_password=***,enable_strict_kernel_release=False,ip_listen_mode=3,enable_cluster_checkout=False,skip_proxy_sys_private_check=True,proxy_id=3585,client_session_id_version=2", argv[3]="--listen_port", argv[4]="2883", argv[5]="--prometheus_listen_port", argv[6]="2884", argv[7]="--rs_list", argv[8]="22.22.22.5:2881", argv[9]="--cluster_name", argv[10]="cluster1", inherited_argv[0]="/opt/obproxy/bin/obproxy", inherited_argv[1]="(null)", inherited_argv[2]="(null)", inherited_argv[3]="(null)"})
[2024-10-16 09:00:20.117180] INFO [PROXY] cleanup_log_file (ob_log_file_processor.cpp:199) [8506][Y0-0000000000000000] [lt=0] [dc=0] start clean up log file
[2024-10-16 09:00:20.117263] INFO [PROXY] init (ob_table_cache.cpp:220) [8506][Y0-0000000000000000] [lt=0] [dc=0] start init ObTableCache
[2024-10-16 09:00:20.117314] INFO [PROXY] init (ob_partition_cache.cpp:250) [8506][Y0-0000000000000000] [lt=0] [dc=0] start init ObPartitionCache
[2024-10-16 09:00:20.117380] INFO [PROXY] init (ob_index_cache.cpp:238) [8506][Y0-0000000000000000] [lt=0] [dc=0] start init ObIndexCache
[2024-10-16 09:00:20.117417] INFO [PROXY] init (ob_table_query_async_cache.cpp:224) [8506][Y0-0000000000000000] [lt=0] [dc=0] start init ObTableQueryAsyncCache
[2024-10-16 09:00:20.117516] INFO [PROXY] init (ob_routine_cache.cpp:319) [8506][Y0-0000000000000000] [lt=0] [dc=0] start init ObTableQueryAsyncCache
[2024-10-16 09:00:20.117573] INFO [PROXY] init (ob_sql_table_cache.cpp:40) [8506][Y0-0000000000000000] [lt=0] [dc=0] start init ObSqlTableCache
[2024-10-16 09:00:20.117613] INFO [PROXY.NET] init (ob_ssl_processor.cpp:41) [8506][Y0-0000000000000000] [lt=0] [dc=0] start init ObSSLProcessor
[2024-10-16 09:00:20.118984] INFO [PROXY] open_sqlite3 (ob_config_processor.cpp:853) [8506][Y0-0000000000000000] [lt=0] [dc=0] start open sqlite3
[2024-10-16 09:00:20.119115] INFO [PROXY] check_and_create_table (ob_config_processor.cpp:295) [8506][Y0-0000000000000000] [lt=0] [dc=0] start check_and_create_table
[2024-10-16 09:00:20.119712] INFO [PROXY] init_local_config (ob_proxy.cpp:714) [8506][Y0-0000000000000000] [lt=0] [dc=0] start init_local_config
[2024-10-16 09:00:20.119741] INFO [SHARE] add_config (ob_common_config.cpp:135) [8506][Y0-0000000000000000] [lt=0] [dc=0] Load config succ(name="client_session_id_version", value="2")
[2024-10-16 09:00:20.119755] INFO [SHARE] add_config (ob_common_config.cpp:135) [8506][Y0-0000000000000000] [lt=0] [dc=0] Load config succ(name="proxy_id", value="3585")
[2024-10-16 09:00:20.119761] INFO [SHARE] add_config (ob_common_config.cpp:135) [8506][Y0-0000000000000000] [lt=0] [dc=0] Load config succ(name="obproxy_sys_password", value="bad1133d69d244bdede506ea90ef8fe8630b0f68")
[2024-10-16 09:00:20.119768] INFO [SHARE] add_config (ob_common_config.cpp:135) [8506][Y0-0000000000000000] [lt=0] [dc=0] Load config succ(name="enable_strict_kernel_release", value="False")
[2024-10-16 09:00:20.119773] INFO [SHARE] add_config (ob_common_config.cpp:135) [8506][Y0-0000000000000000] [lt=0] [dc=0] Load config succ(name="ip_listen_mode", value="3")
[2024-10-16 09:00:20.119782] INFO [SHARE] add_config (ob_common_config.cpp:135) [8506][Y0-0000000000000000] [lt=0] [dc=0] Load config succ(name="enable_cluster_checkout", value="False")
[2024-10-16 09:00:20.119786] INFO [SHARE] add_config (ob_common_config.cpp:135) [8506][Y0-0000000000000000] [lt=0] [dc=0] Load config succ(name="skip_proxy_sys_private_check", value="True")
[2024-10-16 09:00:20.119789] INFO [SHARE] add_config (ob_common_config.cpp:135) [8506][Y0-0000000000000000] [lt=0] [dc=0] Load config succ(name="proxy_id", value="3585")
[2024-10-16 09:00:20.119791] INFO [SHARE] add_config (ob_common_config.cpp:135) [8506][Y0-0000000000000000] [lt=0] [dc=0] Load config succ(name="client_session_id_version", value="2")
[2024-10-16 09:00:20.119846] INFO [PROXY] init_proxy_kernel_release (ob_config_server_processor.cpp:1217) [8506][Y0-0000000000000000] [lt=0] [dc=0] succ to init_proxy_kernel_release by unknown kernel(kernel_release_=RELEASE_UNKNOWN, enable_strict_kernel_release="False")
[2024-10-16 09:00:20.119884] INFO [PROXY] load_service_name_info_from_local (ob_config_server_processor.cpp:2928) [8506][Y0-0000000000000000] [lt=0] [dc=0] fail to get service name info buf size, maybe file does not exist(ret=-4009)
[2024-10-16 09:00:20.119889] WDIAG [PROXY] init (ob_config_server_processor.cpp:205) [8506][Y0-0000000000000000] [lt=0] [dc=0] fail to load service name info from local, maybe file not exist(ret=-4009)

I removed the obproxy component and added it back, but that did not help; I still get exactly the log shown above.

Did this start after you restarted obproxy?

ps -ef|grep obproxy

Please post the obproxy and observer versions.

UID PID PPID C STIME TTY TIME CMD
root 1 0 0 08:45 ? 00:00:00 /bin/bash /entrypoint.sh start
root 8 1 0 08:45 ? 00:00:00 /usr/sbin/sshd
root 1780 1 0 08:46 ? 00:00:00 /usr/bin/coreutils --coreutils-prog-shebang=sleep /usr/bin/sleep infinity
root 1824 0 0 08:46 pts/0 00:00:00 sh -c clear; (bash || ash || sh)
root 1831 1824 0 08:46 pts/0 00:00:00 sh -c clear; (bash || ash || sh)
root 1832 1831 0 08:46 pts/0 00:00:00 bash
root 4022 0 0 08:54 pts/1 00:00:00 sh -c clear; (bash || ash || sh)
root 4029 4022 0 08:54 pts/1 00:00:00 sh -c clear; (bash || ash || sh)
root 4030 4029 0 08:54 pts/1 00:00:00 bash
root 4548 1 49 08:55 ? 00:07:55 /opt/oceanbase/bin/observer -r 22.22.22.5:2882:2881 -p 2881 -P 2882 -z zone1 -n cluster1 -c 1719387089 -d /opt/oceanbase/store -l WARN -I 22.22.22.5 -o __min_full
root 4607 1 0 08:55 ? 00:00:00 /opt/oceanbase/bin/obshell daemon --ip 22.22.22.5 --port 2886
root 4641 4607 0 08:55 ? 00:00:01 /opt/oceanbase/bin/obshell server --ip 22.22.22.5 --port 2886
root 6171 1 0 08:56 ? 00:00:00 /opt/obagent/bin/ob_agentd -c /opt/obagent/conf/agentd.yaml
root 6180 6171 0 08:56 ? 00:00:00 /opt/obagent/bin/ob_mgragent
root 6181 6171 0 08:56 ? 00:00:07 /opt/obagent/bin/ob_monagent
root 6302 1 12 08:56 ? 00:01:53 java -jar -Xms766m -Xmx766m -DJDBC_URL=jdbc:oceanbase://22.22.22.5:2881/ocp_meta -DJDBC_USERNAME=meta@ocp_meta -DPUBLIC_KEY= /opt/ocpexpress/lib/ocp-express-serve
root 12934 1832 0 09:11 pts/0 00:00:00 ps -ef

obproxy-ce-4.3.1.0-4.el8-446ebd84845911de07dc90db93e55c62338adb9b
oceanbase-ce-4.2.4.0-100000082024070810.el8-8b1365b55251aae29758571ac6009989f2262ca9
It stopped coming up after I restarted the OB cluster.


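This is the log from a normal startup. There, more output follows the two "fail" lines, whereas in the failed startup the log simply stops after them.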
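The comparison described here (find the point where the failing log stops relative to a healthy one) can be sketched in a few lines of Python. This is only an illustration: the function names `steps` and `first_missing_step` are made up for this example, and it assumes both logs use the `func (file.cpp:line)` layout visible in the log above. It does not tell you why obproxy stopped, only which startup step it never logged.

```python
import re

# Matches the "func (file.cpp:123)" part of an obproxy log line, e.g.
# "... INFO [PROXY] init (ob_table_cache.cpp:220) [8506]..."
STEP_RE = re.compile(r"\b(\w+) \((\w+\.cpp):\d+\)")


def steps(log_text):
    """Return the ordered list of (function, source file) pairs in a log."""
    found = []
    for line in log_text.splitlines():
        m = STEP_RE.search(line)
        if m:
            found.append((m.group(1), m.group(2)))
    return found


def first_missing_step(good_log, bad_log):
    """Return the first step of the good log that the bad log never reaches.

    Steps are matched in order; source line numbers are ignored so that
    small differences between builds do not matter. Returns None when the
    bad log contains every step of the good log.
    """
    bad = steps(bad_log)
    pos = 0
    for step in steps(good_log):
        try:
            pos = bad.index(step, pos) + 1
        except ValueError:
            return step
    return None
```

Feeding it the text of a healthy obproxy.log and the failing one returns the `(function, file)` pair of the first startup step that is absent from the failing run, which narrows down where to add extra logging.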

Was this installed with obd? Please post the yaml file.
Also start the cluster with obd and post a copy of the detailed obd log.

test.log (35.4 KB)

user:
  username: root
  port: 10022
oceanbase-ce:
  22.22.22.5:
    zone: zone1
  servers:
    - 22.22.22.5
  global:
    appname: cluster1
    root_password: pw5
    proxyro_password: pw4
    mysql_port: 2881
    rpc_port: 2882
    home_path: /opt/oceanbase
    scenario: htap
    cluster_id: 1719387089
    enable_syslog_recycle: true
    enable_syslog_wf: false
    max_syslog_file_count: 4
    memory_limit: 24240M
    memory_size: 16144M
    datafile_size: 8G
    datafile_next: 2G
    log_disk_size: 12G
    __min_full_resource_pool_memory: 1073741824
    syslog_level: WARN
    trace_log_slow_query_watermark: 300ms
    system_memory: 2048M
    cpu_count: 6
    production_mode: false
    ocp_agent_monitor_password: pw6
    ocp_root_password: pw7
    ocp_meta_password: pw8
obproxy-ce:
  depends:
    - oceanbase-ce
  servers:
    - 22.22.22.5
  global:
    listen_port: 2883
    ip_listen_mode: 3
    home_path: /opt/obproxy
    enable_cluster_checkout: false
    skip_proxy_sys_private_check: true
    enable_strict_kernel_release: false
    obproxy_sys_password: pw9
    observer_sys_password: pw4
  22.22.22.5:
    proxy_id: 3585
    client_session_id_version: 2
obagent:
  servers:
    - 22.22.22.5
  global:
    monagent_http_port: 8088
    mgragent_http_port: 8089
    home_path: /opt/obagent
    http_basic_auth_password: pw3
    ob_monitor_status: active
  depends:
    - oceanbase-ce
ocp-express:
  servers:
    - 22.22.22.5
  global:
    port: 8180
    home_path: /opt/ocpexpress
    admin_passwd: pw2
    ocp_root_password: pw1
    memory_size: 1532M
  depends:
    - obagent
    - oceanbase-ce

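Is the proxy process currently running? Check whether you can log in to the proxy as root@proxysys.
Also try logging in to OB as proxyro@sys.
Does proxy_id: 3585 match the value in the yaml?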

The proxy process is not running, and nothing is listening on port 2883 either.
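For anyone repeating this check from another machine, a minimal Python sketch of "is anything listening on the port" follows. The helper name `is_listening` is made up for this example; it only attempts a TCP connection over IPv4 and says nothing about whether the listener is actually obproxy.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port is accepted."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise
        # (for example "connection refused" when nothing is listening).
        return s.connect_ex((host, port)) == 0
```

With the values from this thread the call would be `is_listening("22.22.22.5", 2883)`, where 2883 is the `listen_port` passed to obproxy.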

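Could you send the yaml file as an attachment? The formatting came through scrambled, and some of the parameter settings look off as well.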

test.log (1.6 KB)

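Did you write the yaml file by hand?
The memory_size parameter should belong to obproxy; OB has no memory_size. I suggest deploying once through the obd web GUI instead: the parameters are more transparent there, and each one is labeled with its meaning.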

The problem right now is that obproxy will not start; observer is fine.

Any update?

obd cluster display 'cluster-name'
obd cluster start 'cluster-name' -c obproxy-ce

Replace cluster-name with the actual name of your cluster.

Then post the logs.

Also, the yaml configuration file you sent appears to be missing the obproxy configuration. How did you deploy obproxy?

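Found the problem. I built obproxy from source with extra logging and tracked it down: something goes wrong when obproxy tries to obtain the local IP (I still do not know exactly where), and after I added local_bound_ip it started.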
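If you hit the same symptom and need a candidate address to try for local_bound_ip, the Python sketch below asks the operating system which local IPv4 address it would use to reach a given target, such as the rs_list entry. The helper name `outbound_ipv4` is invented for this example, and this is not how obproxy itself detects its address; it is just a quick way to see what the host reports. How and where local_bound_ip was set is not stated in the post, so that part is left to your deployment.

```python
import socket


def outbound_ipv4(target_host, target_port):
    """Return the local IPv4 address the OS would use to reach the target.

    Connecting a UDP socket sends no packets; it only makes the kernel
    choose a route and a source address, which getsockname() then reports.
    Returns None if the kernel cannot pick a route.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        try:
            s.connect((target_host, target_port))
        except OSError:
            return None
        return s.getsockname()[0]
```

For the cluster in this thread the call would be `outbound_ipv4("22.22.22.5", 2881)`. If it returns None or an unexpected address inside the container, that is a hint that automatic local-address detection may also be unreliable there.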