OceanBase CE 4.1.0 self-hosted on ECS: obcdc_tailf reports errors when using CDC

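【Environment】Test environment
【OB or other component】OceanBase CDC
【Version】OB CE 4.1.0, libobcdc 4.1.0.1
【Problem description】The OB demo version was deployed on ECS with the all-in-one package. After installing CDC, running obcdc_tailf reports errors.
【Steps to reproduce】
Configure /root/home/admin/oceanbase/etc/libobcdc.conf: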

cluster_user=wzy@sys
cluster_password=test1234
tb_white_list=*.*.*   # The default value *.* caused an error, so I changed it to *.*.* as given in the documentation
rootserver_list=127.0.0.1:2882:2881

【Symptoms and impact】
After running ./obcdc_tailf -f ../etc/libobcdc.conf -o -t 1690449956, /root/home/admin/oceanbase/bin/log/libobcdc.log contains:

[2023-07-27 17:26:10.448876] ERROR issue_dba_error (ob_log.cpp:1792) [1672016][][T0][Y0-0000000000000000-0-0] [lt=14][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-4016, file="ob_log_binlog_record.cpp", line_no=55, info="DRCMessageFactory::createBinlogRecord fails")
[2023-07-27 17:26:10.449196] ERROR issue_dba_error (ob_log.cpp:1792) [1672016][][T0][Y0-0000000000000000-0-0] [lt=308][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=0, file="ob_log_binlog_record.cpp", line_no=152, info="IBinlogRecord has not been created")
[2023-07-27 17:26:10.449235] ERROR issue_dba_error (ob_log.cpp:1792) [1672016][][T0][Y0-0000000000000000-0-0] [lt=28][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-4006, file="ob_log_committer.cpp", line_no=686, info="init HEARTBEAT binlog record fail")
[2023-07-27 17:26:10.449306] ERROR issue_dba_error (ob_log.cpp:1792) [1672016][][T0][Y0-0000000000000000-0-0] [lt=64][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-4006, file="ob_log_committer.cpp", line_no=747, info="dispatch_heartbeat_binlog_record_ fail")
[2023-07-27 17:26:10.449369] INFO  [TLOG] handle_error (ob_log_instance.cpp:1809) [1672016][][T0][Y0-0000000000000000-0-0] [lt=7] HANDLE_ERROR: err_cb=0x55f003f788b0, errno=-4006, errmsg="committer HEARTBEAT thread exits, err=-4006"
[2023-07-27 17:26:10.449375] INFO  [TLOG] handle_error (ob_log_instance.cpp:1817) [1672016][][T0][Y0-0000000000000000-0-0] [lt=5] ERROR_CALLBACK begin(err_cb_=0x55f003f788b0)
[2023-07-27 17:26:10.449420] INFO  [TLOG] handle_error (ob_log_instance.cpp:1819) [1672016][][T0][Y0-0000000000000000-0-0] [lt=6] ERROR_CALLBACK end(err_cb_=0x55f003f788b0)
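
The value passed with -t in the command above looks like a Unix timestamp in seconds. That is an assumption based on the value itself; the thread does not document the flag. As a quick sanity check, and assuming GNU date is available, it can be converted to a readable time and compared with the log timestamps:

```shell
# Convert the -t value to UTC; assumes it is a Unix epoch in seconds.
start_ts=1690449956
start_utc=$(date -u -d "@${start_ts}" '+%Y-%m-%d %H:%M:%S')
echo "$start_utc"   # prints 2023-07-27 09:25:56
```

If the host clock is UTC+8, that corresponds to 17:25:56 local time, a few seconds before the first ERROR line at 17:26:10.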

【Attachments】

Filter observer.log for -4016 and see what errors show up.


There are no related errors in observer.log.

Could you pack and compress libobcdc.log* and send it over?

http://330205.oss-cn-hangzhou-zmf.aliyuncs.com/libobcdc.log.zip?OSSAccessKeyId=LTAI5tA3AyaXcv7HBjQGxBnf&Expires=1690567244&Signature=PQxw2y%2FdXyQvD6w3fefhYLhWxmw%3D

You probably left out the backslash when grepping. It should be like this:
grep "\-4016" observer.log*

Sorry, I typed that wrong. Your error code should be -4006, so search for that instead.

grep "\-4006" rootservice.*
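
A pattern that starts with a dash needs care because grep otherwise parses an argument such as -4006 as an option rather than as the search pattern (GNU grep reads -NUM as a context-line count). Besides escaping the dash, two standard alternatives are -e and --. A minimal sketch on a throwaway sample file, not on the real logs:

```shell
# Sample file created only for this demonstration.
sample=$(mktemp)
printf '%s\n' \
  'WDIAG [RS] stop_lease [errcode=-4006] not inited(ret=-4006)' \
  'INFO  [RS] destroy root balance destroy' > "$sample"

# -e marks the next argument as the pattern, even though it starts with "-".
count_with_e=$(grep -c -e '-4006' "$sample")

# "--" ends option parsing, so what follows is the pattern and the file names.
count_with_dashes=$(grep -c -- '-4006' "$sample")

echo "$count_with_e $count_with_dashes"   # prints "1 1"
rm -f "$sample"
```

Against the real logs the same forms would be, for example, grep -e '-4006' rootservice.log*.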

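Check the timestamps in your logs against the time of the incident, and confirm whether the relevant logs have already been overwritten.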

I ran the command again. The output is as follows:

[root@iZbp1as69ao6s2meszbhvuZ bin]# ./obcdc_tailf -f ../etc/libobcdc.conf -o
succ to open, filename=./log/libobcdc.log, fd=3, wf_fd=2
build_tsc_timestamp: use_tsc=1 scale=419
build_tsc_timestamp: use_tsc=1 scale=419
build_tsc_timestamp: use_tsc=1 scale=419
build_tsc_timestamp: use_tsc=1 scale=419
build_tsc_timestamp: use_tsc=1 scale=419
build_tsc_timestamp: use_tsc=1 scale=419
build_tsc_timestamp: use_tsc=1 scale=419
build_tsc_timestamp: use_tsc=1 scale=419
build_tsc_timestamp: use_tsc=1 scale=419
build_tsc_timestamp: use_tsc=1 scale=419
build_tsc_timestamp: use_tsc=1 scale=419
build_tsc_timestamp: use_tsc=1 scale=419
build_tsc_timestamp: use_tsc=1 scale=419
build_tsc_timestamp: use_tsc=1 scale=419
build_tsc_timestamp: use_tsc=1 scale=419
build_tsc_timestamp: use_tsc=1 scale=419
build_tsc_timestamp: use_tsc=1 scale=419
build_tsc_timestamp: use_tsc=1 scale=419
succ to open, filename=./log/libobcdc.log, fd=31, wf_fd=2
build_tsc_timestamp: use_tsc=1 scale=419
build_tsc_timestamp: use_tsc=1 scale=419
build_tsc_timestamp: use_tsc=1 scale=419
build_tsc_timestamp: use_tsc=1 scale=419
build_tsc_timestamp: use_tsc=1 scale=419
build_tsc_timestamp: use_tsc=1 scale=419
build_tsc_timestamp: use_tsc=1 scale=419
build_tsc_timestamp: use_tsc=1 scale=419
build_tsc_timestamp: use_tsc=1 scale=419
build_tsc_timestamp: use_tsc=1 scale=419
build_tsc_timestamp: use_tsc=1 scale=419
build_tsc_timestamp: use_tsc=1 scale=419
[2023-08-01 14:38:23.998392] INFO  [RS] destroy (ob_root_service.cpp:945) [3925118][][T0][Y0-0000000000000000-0-0] [lt=5] [ROOTSERVICE_NOTICE] start to destroy rootservice
[2023-08-01 14:38:23.998433] INFO  [RS] destroy (ob_root_service.cpp:953) [3925118][][T0][Y0-0000000000000000-0-0] [lt=35] start destroy archive_service_
[2023-08-01 14:38:23.998451] INFO  [RS] destroy (ob_root_service.cpp:958) [3925118][][T0][Y0-0000000000000000-0-0] [lt=10] finish destroy archive_service_
[2023-08-01 14:38:23.998464] INFO  [RS] destroy (ob_rs_reentrant_thread.cpp:115) [3925118][][T0][Y0-0000000000000000-0-0] [lt=11] rs_monitor_check : reentrant thread check unregister success(thread_name="", last_run_timestamp=0)
[2023-08-01 14:38:23.998478] INFO  [RS] destroy (ob_root_service.cpp:966) [3925118][][T0][Y0-0000000000000000-0-0] [lt=14] root balance destroy
[2023-08-01 14:38:23.998484] INFO  [RS] destroy (ob_root_service.cpp:973) [3925118][][T0][Y0-0000000000000000-0-0] [lt=5] empty server checker destroy
[2023-08-01 14:38:23.998490] INFO  [RS] destroy (ob_root_service.cpp:980) [3925118][][T0][Y0-0000000000000000-0-0] [lt=6] rs_monitor_check : thread checker destroy
[2023-08-01 14:38:23.998499] INFO  [RS] destroy (ob_root_service.cpp:986) [3925118][][T0][Y0-0000000000000000-0-0] [lt=8] schema history recycler destroy
[2023-08-01 14:38:23.998513] INFO  [RS] destroy (ob_root_service.cpp:990) [3925118][][T0][Y0-0000000000000000-0-0] [lt=13] inner queue destroy
[2023-08-01 14:38:23.998518] INFO  [RS] destroy (ob_root_service.cpp:992) [3925118][][T0][Y0-0000000000000000-0-0] [lt=6] inspect queue destroy
[2023-08-01 14:38:23.998537] INFO  [RS] destroy (ob_root_service.cpp:994) [3925118][][T0][Y0-0000000000000000-0-0] [lt=12] ddl builder destroy
[2023-08-01 14:38:23.998549] INFO  [RS] destroy (ob_rs_reentrant_thread.cpp:115) [3925118][][T0][Y0-0000000000000000-0-0] [lt=11] rs_monitor_check : reentrant thread check unregister success(thread_name="", last_run_timestamp=0)
[2023-08-01 14:38:23.998556] INFO  [RS] destroy (ob_root_service.cpp:999) [3925118][][T0][Y0-0000000000000000-0-0] [lt=6] heartbeat checker destroy
[2023-08-01 14:38:23.998570] INFO  [RS] destroy (ob_root_service.cpp:1003) [3925118][][T0][Y0-0000000000000000-0-0] [lt=11] event table operator destroy
[2023-08-01 14:38:23.998578] INFO  [RS] destroy (ob_rs_reentrant_thread.cpp:115) [3925118][][T0][Y0-0000000000000000-0-0] [lt=8] rs_monitor_check : reentrant thread check unregister success(thread_name="", last_run_timestamp=0)
[2023-08-01 14:38:23.998585] INFO  [RS] destroy (ob_root_service.cpp:1010) [3925118][][T0][Y0-0000000000000000-0-0] [lt=7] root backup task scheduler destroy
[2023-08-01 14:38:23.998597] INFO  [RS] destroy (ob_root_service.cpp:1017) [3925118][][T0][Y0-0000000000000000-0-0] [lt=11] root backup mgr destroy
[2023-08-01 14:38:23.998603] INFO  [RS] destroy (ob_root_service.cpp:1020) [3925118][][T0][Y0-0000000000000000-0-0] [lt=5] start destroy backup_lease_service_
[2023-08-01 14:38:23.998617] INFO  [RS] destroy (ob_backup_lease_service.cpp:171) [3925118][][T0][Y0-0000000000000000-0-0] [lt=14] start destroy ObBackupLeaseService
[2023-08-01 14:38:23.998642] WDIAG [RS] stop_lease (ob_backup_lease_service.cpp:133) [3925118][][T0][Y0-0000000000000000-0-0] [lt=10][errcode=-4006] not inited(ret=-4006)
[2023-08-01 14:38:23.998655] INFO  [RS] wait_backup_scheduler_stop_ (ob_backup_lease_service.cpp:274) [3925118][][T0][Y0-0000000000000000-0-0] [lt=11] [BACKUP_LEASE] start wait_backup_scheduler_stop_
[2023-08-01 14:38:23.998662] INFO  [RS] wait_backup_scheduler_stop_ (ob_backup_lease_service.cpp:286) [3925118][][T0][Y0-0000000000000000-0-0] [lt=6] [BACKUP_LEASE] finish wait_backup_scheduler_stop_(cost_ts=7)
[2023-08-01 14:38:23.998677] INFO  [RS] wait_mgr_stop_ (ob_backup_lease_service.cpp:291) [3925118][][T0][Y0-0000000000000000-0-0] [lt=15] [BACKUP_LEASE] start waiting backup mgr stop
[2023-08-01 14:38:23.998684] INFO  [RS] wait_mgr_stop_ (ob_backup_lease_service.cpp:297) [3925118][][T0][Y0-0000000000000000-0-0] [lt=6] [BACKUP_LEASE] finish waiting backup mgr stop(cost_ts=1)
[2023-08-01 14:38:23.998700] INFO  [RS] destroy (ob_backup_lease_service.cpp:180) [3925118][][T0][Y0-0000000000000000-0-0] [lt=15] finish destroy ObBackupLeaseService(cost_ts=84)
[2023-08-01 14:38:23.998707] INFO  [RS] destroy (ob_root_service.cpp:1022) [3925118][][T0][Y0-0000000000000000-0-0] [lt=6] finish destroy backup_lease_service_
[2023-08-01 14:38:23.998734] WDIAG [RS] destroy (ob_dbms_job_master.cpp:94) [3925118][][T0][Y0-0000000000000000-0-0] [lt=4][errcode=-4006] scheduler task not inited(ret=-4006, inited_=false)
[2023-08-01 14:38:23.998748] INFO  [RS] destroy (ob_root_service.cpp:1025) [3925118][][T0][Y0-0000000000000000-0-0] [lt=13] ObDBMSJobMaster destory
[2023-08-01 14:38:23.998754] INFO  [RS] destroy (ob_rs_reentrant_thread.cpp:115) [3925118][][T0][Y0-0000000000000000-0-0] [lt=5] rs_monitor_check : reentrant thread check unregister success(thread_name="", last_run_timestamp=0)
[2023-08-01 14:38:23.998762] INFO  [RS] destroy (ob_root_service.cpp:1031) [3925118][][T0][Y0-0000000000000000-0-0] [lt=7] disaster recovery task mgr destroy
[2023-08-01 14:38:23.998781] WDIAG [RS] destroy (ob_dbms_sched_job_master.cpp:95) [3925118][][T0][Y0-0000000000000000-0-0] [lt=11][errcode=-4006] scheduler task not inited(ret=-4006, inited_=false)
[2023-08-01 14:38:23.998791] INFO  [RS] destroy (ob_root_service.cpp:1035) [3925118][][T0][Y0-0000000000000000-0-0] [lt=9] ObDBMSSchedJobMaster destory
[2023-08-01 14:38:23.998808] INFO  [RS] destroy (ob_root_service.cpp:1037) [3925118][][T0][Y0-0000000000000000-0-0] [lt=10] global ctx timer destroyed
[2023-08-01 14:38:23.998818] INFO  [RS] destroy (ob_root_service.cpp:1046) [3925118][][T0][Y0-0000000000000000-0-0] [lt=10] [ROOTSERVICE_NOTICE] destroy rootservice end(ret=0, ret="OB_SUCCESS")
[2023-08-01 14:38:24.000465] INFO  [RS] stop (ob_disaster_recovery_task_table_updater.cpp:190) [3925118][][T0][Y0-0000000000000000-0-0] [lt=7] stop ObDRTaskTableUpdater success
[2023-08-01 14:38:24.000478] INFO  [RS] wait (ob_disaster_recovery_task_table_updater.cpp:196) [3925118][][T0][Y0-0000000000000000-0-0] [lt=12] wait ObDRTaskTableUpdater

The ERROR entries grepped from libobcdc.log are as follows:

[root@iZbp1as69ao6s2meszbhvuZ log]# grep ERROR libobcdc.log
[2023-08-01 14:37:58.834405] ERROR issue_dba_error (ob_log.cpp:1792) [3925618][][T0][Y0-0000000000000000-0-0] [lt=17][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-4016, file="ob_log_binlog_record.cpp", line_no=55, info="DRCMessageFactory::createBinlogRecord fails")
[2023-08-01 14:37:58.834731] ERROR issue_dba_error (ob_log.cpp:1792) [3925618][][T0][Y0-0000000000000000-0-0] [lt=287][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=0, file="ob_log_binlog_record.cpp", line_no=152, info="IBinlogRecord has not been created")
[2023-08-01 14:37:58.834782] ERROR issue_dba_error (ob_log.cpp:1792) [3925618][][T0][Y0-0000000000000000-0-0] [lt=25][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-4006, file="ob_log_committer.cpp", line_no=686, info="init HEARTBEAT binlog record fail")
[2023-08-01 14:37:58.834847] ERROR issue_dba_error (ob_log.cpp:1792) [3925618][][T0][Y0-0000000000000000-0-0] [lt=56][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-4006, file="ob_log_committer.cpp", line_no=747, info="dispatch_heartbeat_binlog_record_ fail")
[2023-08-01 14:37:58.834928] INFO  [TLOG] handle_error (ob_log_instance.cpp:1809) [3925618][][T0][Y0-0000000000000000-0-0] [lt=4] HANDLE_ERROR: err_cb=0x55d7c1d858b0, errno=-4006, errmsg="committer HEARTBEAT thread exits, err=-4006"
[2023-08-01 14:37:58.834932] INFO  [TLOG] handle_error (ob_log_instance.cpp:1817) [3925618][][T0][Y0-0000000000000000-0-0] [lt=3] ERROR_CALLBACK begin(err_cb_=0x55d7c1d858b0)
[2023-08-01 14:37:58.834991] INFO  [TLOG] handle_error (ob_log_instance.cpp:1819) [3925618][][T0][Y0-0000000000000000-0-0] [lt=4] ERROR_CALLBACK end(err_cb_=0x55d7c1d858b0)
[2023-08-01 14:37:59.570541] ERROR issue_dba_error (ob_log.cpp:1792) [3925659][][T0][YE47EAC100096-0000000000C00001-0-0] [lt=3][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=0, file="ob_log_batch_buffer.cpp", line_no=198, info="try_freeze failed")
[2023-08-01 14:38:23.997878] ERROR issue_dba_error (ob_log.cpp:1792) [3925118][][T0][Y0-0000000000000000-0-0] [lt=20][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-4000, file="ob_fifo_allocator.cpp", line_no=200, info="current_using_ is still used now")
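
To read the failure chain at a glance, the inner errcode, file, line_no, and info fields can be pulled out of each ERROR line. A minimal sketch, assuming the line format shown above; summarize_cdc_errors is a hypothetical helper name, not part of obcdc:

```shell
# Hypothetical helper: for each ERROR line in the given log file(s), print only
# the text inside "internal errcode(...)", i.e. errcode, file, line_no and info.
summarize_cdc_errors() {
  grep -h ' ERROR ' "$@" | sed -E 's/.*internal errcode\((.*)\)$/\1/'
}

# Usage: summarize_cdc_errors libobcdc.log
```

In the output above, the first entry is -4016 from DRCMessageFactory::createBinlogRecord, and the -4006 HEARTBEAT entries that follow appear to be consequences of that first failure.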

Searching observer.log still returns nothing. Details:

[root@iZbp1as69ao6s2meszbhvuZ log]# cd /root/oceanbase-ce/log/
[root@iZbp1as69ao6s2meszbhvuZ log]# ls
4016.txt                        election.log.wf                 observer.log.wf                    rootservice.log.wf
election.log                    observer.log                    rootservice.log                    trace.log
election.log.20230729005914921  observer.log.20230801133110007  rootservice.log.20230801014042778  trace.log.20230731220000357
election.log.20230730040450377  observer.log.20230801135140355  rootservice.log.20230801055500747
election.log.20230731070937876  observer.log.20230801141208155  rootservice.log.20230801101517908
election.log.20230801101416128  observer.log.20230801143237214  rootservice.log.20230801143528453
[root@iZbp1as69ao6s2meszbhvuZ log]# grep "-4006" rootservice.log*
[root@iZbp1as69ao6s2meszbhvuZ log]# grep "-4016" rootservice.log*
[root@iZbp1as69ao6s2meszbhvuZ log]# grep "-4388" rootservice.log*
[root@iZbp1as69ao6s2meszbhvuZ log]# grep "-4000" rootservice.log*
[root@iZbp1as69ao6s2meszbhvuZ log]# 

This is an OBCDC issue. For now there is no need to look at the observer logs.

Then how should this obcdc problem be resolved?

Did you get obcdc from an RPM package? Please send the RPM package name, and we will see whether we can reproduce the issue locally.

Also, the log file uploaded earlier no longer seems to be downloadable. Could you upload it again?

Package name: oceanbase-ce-cdc-4.1.0.1-102000052023061516.el8.x86_64.rpm
File link: http://330205.oss-cn-hangzhou-zmf.aliyuncs.com/libobcdc.log.zip?OSSAccessKeyId=LTAI5tA3AyaXcv7HBjQGxBnf&Expires=1690915746&Signature=ViaC%2FjxC2TOSUQHVu5IUCG6f3W8%3D

Please check whether libobcdc.conf contains any configuration items besides the ones listed.
What is the value of the drc_message_factory_binlog_record_type configuration item?

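The drc_message_factory_binlog_record_type configuration item currently does not need to be set by the user. If it has been set, remove it from the configuration file or map that is passed to OBCDC.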
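
A minimal sketch of that check for the file-based case, assuming one key=value item per line as in the configuration excerpt from the original post. remove_conf_item is a hypothetical helper name, and the sed call keeps a .bak copy of the file before editing it:

```shell
# Hypothetical helper: delete a key=value line from a config file,
# leaving a backup next to it as <file>.bak.
remove_conf_item() {
  conf_file=$1
  key=$2
  if grep -q -e "^[[:space:]]*${key}[[:space:]]*=" "$conf_file"; then
    sed -i.bak -e "/^[[:space:]]*${key}[[:space:]]*=/d" "$conf_file"
    echo "removed ${key} from ${conf_file}"
  else
    echo "${key} is not set in ${conf_file}"
  fi
}

# Usage (path taken from the original post):
# remove_conf_item /root/home/admin/oceanbase/etc/libobcdc.conf drc_message_factory_binlog_record_type
```

After the item is removed, obcdc_tailf would need to be started again so that it reads the updated file.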