OMS stops with an error after running for a few hours

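[Environment] Production
[OB or other component] OMS
[Version] OMS 4.10_CE
[Problem description] Every time an OMS task has been running for several hours, it starts reporting errors. The error message is shown in the screenshot.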

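  1. We cannot view the logs of the CM component (we do not know where they are located).
  2. The store and checker components are both normal, and their logs show nothing suspicious.
  3. Following the community documentation, we checked Docker resource usage. Utilization is very low, which rules out resource factors such as insufficient memory.
  4. Retrying the synchronization task successfully restores its progress.
    In production the task is now interrupted every few hours, which is a major inconvenience, and every retry requires opening a ticket, which is cumbersome. Please help us work out how to resolve this.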

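[Reproduction steps] Operations performed before and after the problem occurred
[Attachments and logs] We recommend collecting diagnostic information with obdiag, the OceanBase agile diagnostic tool. For details, see the link (right-click to open):

[SOP Series 22] The first step in troubleshooting (system inspection and diagnostic information collection)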

This looks like a component exception. Check the component monitoring and look at that component's logs.

The log entries at the time OMS was interrupted are:

2023-12-23 17:43:53.101 [runningProjectHandleExecutor-9] ERROR c.a.o.s.s.ProjectHandler 63 - [b0a335e6-47fc-4aff-877b-9f7df95ad3c8] Exception Stack:

com.alipay.oms.dto.nexception.AccessCmException: The response from the CM service is not success.: [RM_API_ERROR] {"message":"Do get for CM url (http://19.112.31.11:8088/checker/overview) failed, error : I/O error on GET request for \"http://19.112.31.11:8088/checker/overview\": Connection timed out (Connection timed out); nested exception is java.net.ConnectException: Connection timed out (Connection timed out)"}

at com.alipay.oms.service.impl.drc.DrcCmServiceImpl.executeGetRequestToCm(DrcCmServiceImpl.java:1893)

at com.alipay.oms.service.impl.drc.DrcCmServiceImpl.doCmGetRequestByProjectId(DrcCmServiceImpl.java:1800)

At other times, most of what the log file prints is:

2023-12-23 15:49:30.240 [http-nio-8090-exec-5] ERROR c.a.o.s.i.o.o.OmsOperatorStoreServiceImpl 352 - [9b7d53e3-44d7-4b2b-9c73-7cfa8b240009] Failed to find drc cm service by sub topic :p_5c5xmjcgel5s_dest-000-0.

Oddly, though, this message is also logged continuously while OMS is migrating normally.

This topic also seems to be related to the store component: the UUID in the topic name can be seen in the store component.

[screenshot]

common-error.log.2023-12-23.zip (164.5 KB)

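In the screenshot, the full migration reported an error. Use "View Component Monitoring" to check the state of the full-migration component, and also check whether the full-migration process still exists in the background.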

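The on-call staff prioritized recovery, so only the logs are available now. Please take a look at them. When the problem occurred, the component processes were all still present.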

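The cause of the full-migration error is "The response from the CM service is not success". I asked the person who handled it at the time: he had checked the monitoring, and both checker and store were normal.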

Were all components normal at the time? Check with: supervisorctl status
Please send the following logs from that time:
/home/admin/logs/ghana/Ghana/common-default.log and common-error.log
/home/admin/logs/cm/log/common-error.log and cm-api.log

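Also send a screenshot of the OMS O&M monitoring page for machines from that time, and a screenshot of df -h in the OMS container from that time.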

common-error.log is already attached. We will fetch the other files from production tomorrow.

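Related information:
supervisorctl status was run at the time and showed nothing abnormal.
df -h in the OMS container showed that there was enough space.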

Check whether the store p_5c5xmjcgel5s_dest-000-0 has any exception.

There is only this one: p_5c5xmjcgel5s_dest-000-0:0000000018

Process and disk status inside the container:

[root@w3131001 ~]# supervisorctl status
nginx                            RUNNING   pid 1564, uptime 147 days, 19:41:33
oms_console                      RUNNING   pid 1567, uptime 147 days, 19:41:23
oms_drc_cm                       RUNNING   pid 1717, uptime 147 days, 19:41:12
oms_drc_supervisor               RUNNING   pid 2062, uptime 147 days, 19:41:02
sshd                             RUNNING   pid 2440, uptime 147 days, 19:40:52
[root@w3131001 ~]# df -h
Filesystem             Size  Used Avail Use% Mounted on
overlay                100G  8.4G   92G   9% /
tmpfs                  378G     0  378G   0% /dev
tmpfs                  378G     0  378G   0% /sys/fs/cgroup
/dev/mapper/vg00-lv2   100G  8.4G   92G   9% /etc/hosts
shm                     64M     0   64M   0% /dev/shm
/dev/mapper/vgob-data   37T  1.6T   35T   5% /u01/ds/run
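
For repeated checks, the `supervisorctl status` listing can be scanned automatically for anything that is not RUNNING. A minimal sketch, assuming the two leading columns are program name and state as in the output above; the helper name `non_running` is hypothetical and not part of OMS or supervisor.

```python
def non_running(status_text):
    """Return (name, state) for every program whose state is not RUNNING."""
    problems = []
    for line in status_text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        name, state = parts[0], parts[1]
        if state != "RUNNING":
            problems.append((name, state))
    return problems


# Output copied from the thread.
SAMPLE = """\
nginx                            RUNNING   pid 1564, uptime 147 days, 19:41:33
oms_console                      RUNNING   pid 1567, uptime 147 days, 19:41:23
oms_drc_cm                       RUNNING   pid 1717, uptime 147 days, 19:41:12
oms_drc_supervisor               RUNNING   pid 2062, uptime 147 days, 19:41:02
sshd                             RUNNING   pid 2440, uptime 147 days, 19:40:52
"""

print(non_running(SAMPLE))  # [] : every program is RUNNING
```

An empty result only shows that the processes are alive. It says nothing about whether a process such as oms_drc_cm is still answering requests, which is the symptom reported in this thread.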

OMS is deployed on a physical machine with 96 CPU cores, 756 GB of memory, and 40 TB of disk, and no other programs run on that machine.

oms-api.log

2023-12-23 17:40:04.501 [INFO] [runningProjectHandleExecutor-13] [e8659ebf-7dc5-4df0-9941-56e9b976c016] Do get for CM url (http://19.112.31.11:8088/checker/overview) success, return ({"overview":{"total":1,"tableOverviewList":[{"schema":"tbrhkdb","imageOnly":0,"readTps":163320,"consistent":0,"insertTps":163320,"capacity":176520816793,"mismatched":0,"destTable":"tbl_ubqat_qtonl_trans_flow","deleted":0,"destSchema":"tbrhkdb","inserted":14940007231,"name":"overview-running","progress":"0.085","masterOnly":0,"startingDate":"2023-12-22 18:23:28","updated":0,"table":"tbl_ubqat_qtonl_trans_flow","readDestTps":0,"status":"Running"}],"heartBeat":{"dstRpsRef":32000,"srcRps":163320,"numberOfUpdatedRecords":0,"type":"migrate","consistentQuantity":0,"capacity":176520816793,"srcRt":0,"dstRt":0,"predictedTimeToFinish":604800,"srcIopsRef":33554432,"id":90231,"finishedTables":0,"dstRps":163320,"dstRtRef":1,"inconsistentQuantity":0,"srcIops":673617497,"message":"","startingGmtTime":1703240603,"recordProgress":"0.085","numberOfDeletedRecords":0,"subId":1,"reportingGmtTime":1703324399,"rps":163320,"numberOfInsertedRecords":14940007231,"processedRecords":14940007231,"progress":"0.000","srcRpsRef":32000,"srcRtRef":1,"dstIops":673617497,"status":"running"}},"isSuccess":true})
2023-12-23 17:40:09.556 [INFO] [runningProjectHandleExecutor-1] [5798c600-d197-4cd7-994d-b71efd13b410] Do get for CM url (http://19.112.31.11:8088/checker/overview) success, return ({"overview":{"total":1,"tableOverviewList":[{"schema":"tbrhkdb","imageOnly":0,"readTps":161864,"consistent":0,"insertTps":161864,"capacity":176520816793,"mismatched":0,"destTable":"tbl_ubqat_qtonl_trans_flow","deleted":0,"destSchema":"tbrhkdb","inserted":14941626031,"name":"overview-running","progress":"0.085","masterOnly":0,"startingDate":"2023-12-22 18:23:28","updated":0,"table":"tbl_ubqat_qtonl_trans_flow","readDestTps":0,"status":"Running"}],"heartBeat":{"dstRpsRef":32000,"srcRps":161880,"numberOfUpdatedRecords":0,"type":"migrate","consistentQuantity":0,"capacity":176520816793,"srcRt":0,"dstRt":0,"predictedTimeToFinish":604800,"srcIopsRef":33554432,"id":90231,"finishedTables":0,"dstRps":161880,"dstRtRef":1,"inconsistentQuantity":0,"srcIops":667813473,"message":"","startingGmtTime":1703240603,"recordProgress":"0.085","numberOfDeletedRecords":0,"subId":1,"reportingGmtTime":1703324409,"rps":161880,"numberOfInsertedRecords":14941626031,"processedRecords":14941626031,"progress":"0.000","srcRpsRef":32000,"srcRtRef":1,"dstIops":667813473,"status":"running"}},"isSuccess":true})
2023-12-23 17:40:14.616 [INFO] [runningProjectHandleExecutor-15] [93a59c24-79a0-43a6-a5c1-3219cfc958bd] Do get for CM url (http://19.112.31.11:8088/checker/overview) success, return ({"overview":{"total":1,"tableOverviewList":[{"schema":"tbrhkdb","imageOnly":0,"readTps":161864,"consistent":0,"insertTps":161864,"capacity":176520816793,"mismatched":0,"destTable":"tbl_ubqat_qtonl_trans_flow","deleted":0,"destSchema":"tbrhkdb","inserted":14941626031,"name":"overview-running","progress":"0.085","masterOnly":0,"startingDate":"2023-12-22 18:23:28","updated":0,"table":"tbl_ubqat_qtonl_trans_flow","readDestTps":0,"status":"Running"}],"heartBeat":{"dstRpsRef":32000,"srcRps":161880,"numberOfUpdatedRecords":0,"type":"migrate","consistentQuantity":0,"capacity":176520816793,"srcRt":0,"dstRt":0,"predictedTimeToFinish":604800,"srcIopsRef":33554432,"id":90231,"finishedTables":0,"dstRps":161880,"dstRtRef":1,"inconsistentQuantity":0,"srcIops":667813473,"message":"","startingGmtTime":1703240603,"recordProgress":"0.085","numberOfDeletedRecords":0,"subId":1,"reportingGmtTime":1703324409,"rps":161880,"numberOfInsertedRecords":14941626031,"processedRecords":14941626031,"progress":"0.000","srcRpsRef":32000,"srcRtRef":1,"dstIops":667813473,"status":"running"}},"isSuccess":true})
2023-12-23 17:40:19.674 [INFO] [runningProjectHandleExecutor-12] [48dacb49-30a5-4278-a24e-3ba556acc266] Do get for CM url (http://19.112.31.11:8088/checker/overview) success, return ({"overview":{"total":1,"tableOverviewList":[{"schema":"tbrhkdb","imageOnly":0,"readTps":161836,"consistent":0,"insertTps":161836,"capacity":176520816793,"mismatched":0,"destTable":"tbl_ubqat_qtonl_trans_flow","deleted":0,"destSchema":"tbrhkdb","inserted":14943244231,"name":"overview-running","progress":"0.085","masterOnly":0,"startingDate":"2023-12-22 18:23:28","updated":0,"table":"tbl_ubqat_qtonl_trans_flow","readDestTps":0,"status":"Running"}],"heartBeat":{"dstRpsRef":32000,"srcRps":161760,"numberOfUpdatedRecords":0,"type":"migrate","consistentQuantity":0,"capacity":176520816793,"srcRt":0,"dstRt":0,"predictedTimeToFinish":604800,"srcIopsRef":33554432,"id":90231,"finishedTables":0,"dstRps":161760,"dstRtRef":1,"inconsistentQuantity":0,"srcIops":667151182,"message":"","startingGmtTime":1703240603,"recordProgress":"0.085","numberOfDeletedRecords":0,"subId":1,"reportingGmtTime":1703324419,"rps":161760,"numberOfInsertedRecords":14943243631,"processedRecords":14943243631,"progress":"0.000","srcRpsRef":32000,"srcRtRef":1,"dstIops":667151182,"status":"running"}},"isSuccess":true})
2023-12-23 17:40:24.734 [INFO] [runningProjectHandleExecutor-9] [b95135ec-1020-4819-8d57-fb5d2ddffa65] Do get for CM url (http://19.112.31.11:8088/checker/overview) success, return ({"overview":{"total":1,"tableOverviewList":[{"schema":"tbrhkdb","imageOnly":0,"readTps":161836,"consistent":0,"insertTps":161836,"capacity":176520816793,"mismatched":0,"destTable":"tbl_ubqat_qtonl_trans_flow","deleted":0,"destSchema":"tbrhkdb","inserted":14943244231,"name":"overview-running","progress":"0.085","masterOnly":0,"startingDate":"2023-12-22 18:23:28","updated":0,"table":"tbl_ubqat_qtonl_trans_flow","readDestTps":0,"status":"Running"}],"heartBeat":{"dstRpsRef":32000,"srcRps":161760,"numberOfUpdatedRecords":0,"type":"migrate","consistentQuantity":0,"capacity":176520816793,"srcRt":0,"dstRt":0,"predictedTimeToFinish":604800,"srcIopsRef":33554432,"id":90231,"finishedTables":0,"dstRps":161760,"dstRtRef":1,"inconsistentQuantity":0,"srcIops":667151182,"message":"","startingGmtTime":1703240603,"recordProgress":"0.085","numberOfDeletedRecords":0,"subId":1,"reportingGmtTime":1703324419,"rps":161760,"numberOfInsertedRecords":14943243631,"processedRecords":14943243631,"progress":"0.000","srcRpsRef":32000,"srcRtRef":1,"dstIops":667151182,"status":"running"}},"isSuccess":true})
2023-12-23 17:40:29.790 [INFO] [runningProjectHandleExecutor-16] [a78d60b2-79b5-4151-8252-a60c48f28c1f] Do get for CM url (http://19.112.31.11:8088/checker/overview) success, return ({"overview":{"total":1,"tableOverviewList":[{"schema":"tbrhkdb","imageOnly":0,"readTps":160200,"consistent":0,"insertTps":160200,"capacity":176520816793,"mismatched":0,"destTable":"tbl_ubqat_qtonl_trans_flow","deleted":0,"destSchema":"tbrhkdb","inserted":14944846231,"name":"overview-running","progress":"0.085","masterOnly":0,"startingDate":"2023-12-22 18:23:28","updated":0,"table":"tbl_ubqat_qtonl_trans_flow","readDestTps":0,"status":"Running"}],"heartBeat":{"dstRpsRef":32000,"srcRps":160244,"numberOfUpdatedRecords":0,"type":"migrate","consistentQuantity":0,"capacity":176520816793,"srcRt":0,"dstRt":0,"predictedTimeToFinish":604800,"srcIopsRef":33554432,"id":90231,"finishedTables":0,"dstRps":160244,"dstRtRef":1,"inconsistentQuantity":0,"srcIops":661451058,"message":"","startingGmtTime":1703240603,"recordProgress":"0.085","numberOfDeletedRecords":0,"subId":1,"reportingGmtTime":1703324429,"rps":160244,"numberOfInsertedRecords":14944846231,"processedRecords":14944846231,"progress":"0.000","srcRpsRef":32000,"srcRtRef":1,"dstIops":661451058,"status":"running"}},"isSuccess":true})
2023-12-23 17:40:34.914 [INFO] [runningProjectHandleExecutor-8] [d4713df2-781f-4a38-b96b-5aa530a88b9f] Do get for CM url (http://19.112.31.11:8088/checker/overview) success, return ({"overview":{"total":1,"tableOverviewList":[{"schema":"tbrhkdb","imageOnly":0,"readTps":160200,"consistent":0,"insertTps":160200,"capacity":176520816793,"mismatched":0,"destTable":"tbl_ubqat_qtonl_trans_flow","deleted":0,"destSchema":"tbrhkdb","inserted":14944846231,"name":"overview-running","progress":"0.085","masterOnly":0,"startingDate":"2023-12-22 18:23:28","updated":0,"table":"tbl_ubqat_qtonl_trans_flow","readDestTps":0,"status":"Running"}],"heartBeat":{"dstRpsRef":32000,"srcRps":160244,"numberOfUpdatedRecords":0,"type":"migrate","consistentQuantity":0,"capacity":176520816793,"srcRt":0,"dstRt":0,"predictedTimeToFinish":604800,"srcIopsRef":33554432,"id":90231,"finishedTables":0,"dstRps":160244,"dstRtRef":1,"inconsistentQuantity":0,"srcIops":661451058,"message":"","startingGmtTime":1703240603,"recordProgress":"0.085","numberOfDeletedRecords":0,"subId":1,"reportingGmtTime":1703324429,"rps":160244,"numberOfInsertedRecords":14944846231,"processedRecords":14944846231,"progress":"0.000","srcRpsRef":32000,"srcRtRef":1,"dstIops":661451058,"status":"running"}},"isSuccess":true})
2023-12-23 17:40:39.971 [INFO] [runningProjectHandleExecutor-4] [3069b228-a1b2-4ba4-bf6b-0b23ad3da8a0] Do get for CM url (http://19.112.31.11:8088/checker/overview) success, return ({"overview":{"total":1,"tableOverviewList":[{"schema":"tbrhkdb","imageOnly":0,"readTps":160800,"consistent":0,"insertTps":160800,"capacity":176520816793,"mismatched":0,"destTable":"tbl_ubqat_qtonl_trans_flow","deleted":0,"destSchema":"tbrhkdb","inserted":14946454231,"name":"overview-running","progress":"0.085","masterOnly":0,"startingDate":"2023-12-22 18:23:28","updated":0,"table":"tbl_ubqat_qtonl_trans_flow","readDestTps":0,"status":"Running"}],"heartBeat":{"dstRpsRef":32000,"srcRps":160816,"numberOfUpdatedRecords":0,"type":"migrate","consistentQuantity":0,"capacity":176520816793,"srcRt":0,"dstRt":0,"predictedTimeToFinish":604800,"srcIopsRef":33554432,"id":90231,"finishedTables":0,"dstRps":160816,"dstRtRef":1,"inconsistentQuantity":0,"srcIops":662318030,"message":"","startingGmtTime":1703240603,"recordProgress":"0.085","numberOfDeletedRecords":0,"subId":1,"reportingGmtTime":1703324439,"rps":160816,"numberOfInsertedRecords":14946454231,"processedRecords":14946454231,"progress":"0.000","srcRpsRef":32000,"srcRtRef":1,"dstIops":662318030,"status":"running"}},"isSuccess":true})
2023-12-23 17:40:45.030 [INFO] [runningProjectHandleExecutor-7] [b82f5d4f-7aee-4be7-a30e-74b007817651] Do get for CM url (http://19.112.31.11:8088/checker/overview) success, return ({"overview":{"total":1,"tableOverviewList":[{"schema":"tbrhkdb","imageOnly":0,"readTps":160800,"consistent":0,"insertTps":160800,"capacity":176520816793,"mismatched":0,"destTable":"tbl_ubqat_qtonl_trans_flow","deleted":0,"destSchema":"tbrhkdb","inserted":14946454231,"name":"overview-running","progress":"0.085","masterOnly":0,"startingDate":"2023-12-22 18:23:28","updated":0,"table":"tbl_ubqat_qtonl_trans_flow","readDestTps":0,"status":"Running"}],"heartBeat":{"dstRpsRef":32000,"srcRps":160816,"numberOfUpdatedRecords":0,"type":"migrate","consistentQuantity":0,"capacity":176520816793,"srcRt":0,"dstRt":0,"predictedTimeToFinish":604800,"srcIopsRef":33554432,"id":90231,"finishedTables":0,"dstRps":160816,"dstRtRef":1,"inconsistentQuantity":0,"srcIops":662318030,"message":"","startingGmtTime":1703240603,"recordProgress":"0.085","numberOfDeletedRecords":0,"subId":1,"reportingGmtTime":1703324439,"rps":160816,"numberOfInsertedRecords":14946454231,"processedRecords":14946454231,"progress":"0.000","srcRpsRef":32000,"srcRtRef":1,"dstIops":662318030,"status":"running"}},"isSuccess":true})
2023-12-23 17:40:50.086 [INFO] [runningProjectHandleExecutor-3] [24108859-447f-403c-a5a4-36b1ad391e22] Do get for CM url (http://19.112.31.11:8088/checker/overview) success, return ({"overview":{"total":1,"tableOverviewList":[{"schema":"tbrhkdb","imageOnly":0,"readTps":161504,"consistent":0,"insertTps":161504,"capacity":176520816793,"mismatched":0,"destTable":"tbl_ubqat_qtonl_trans_flow","deleted":0,"destSchema":"tbrhkdb","inserted":14948069431,"name":"overview-running","progress":"0.085","masterOnly":0,"startingDate":"2023-12-22 18:23:28","updated":0,"table":"tbl_ubqat_qtonl_trans_flow","readDestTps":0,"status":"Running"}],"heartBeat":{"dstRpsRef":32000,"srcRps":161504,"numberOfUpdatedRecords":0,"type":"migrate","consistentQuantity":0,"capacity":176520816793,"srcRt":0,"dstRt":0,"predictedTimeToFinish":604800,"srcIopsRef":33554432,"id":90231,"finishedTables":0,"dstRps":161504,"dstRtRef":1,"inconsistentQuantity":0,"srcIops":664444985,"message":"","startingGmtTime":1703240603,"recordProgress":"0.085","numberOfDeletedRecords":0,"subId":1,"reportingGmtTime":1703324449,"rps":161504,"numberOfInsertedRecords":14948069431,"processedRecords":14948069431,"progress":"0.000","srcRpsRef":32000,"srcRtRef":1,"dstIops":664444985,"status":"running"}},"isSuccess":true})
2023-12-23 17:40:55.145 [INFO] [runningProjectHandleExecutor-5] [94638e3f-6f7f-49ce-8f3b-afc8d9bf011d] Do get for CM url (http://19.112.31.11:8088/checker/overview) success, return ({"overview":{"total":1,"tableOverviewList":[{"schema":"tbrhkdb","imageOnly":0,"readTps":161504,"consistent":0,"insertTps":161504,"capacity":176520816793,"mismatched":0,"destTable":"tbl_ubqat_qtonl_trans_flow","deleted":0,"destSchema":"tbrhkdb","inserted":14948069431,"name":"overview-running","progress":"0.085","masterOnly":0,"startingDate":"2023-12-22 18:23:28","updated":0,"table":"tbl_ubqat_qtonl_trans_flow","readDestTps":0,"status":"Running"}],"heartBeat":{"dstRpsRef":32000,"srcRps":161504,"numberOfUpdatedRecords":0,"type":"migrate","consistentQuantity":0,"capacity":176520816793,"srcRt":0,"dstRt":0,"predictedTimeToFinish":604800,"srcIopsRef":33554432,"id":90231,"finishedTables":0,"dstRps":161504,"dstRtRef":1,"inconsistentQuantity":0,"srcIops":664444985,"message":"","startingGmtTime":1703240603,"recordProgress":"0.085","numberOfDeletedRecords":0,"subId":1,"reportingGmtTime":1703324449,"rps":161504,"numberOfInsertedRecords":14948069431,"processedRecords":14948069431,"progress":"0.000","srcRpsRef":32000,"srcRtRef":1,"dstIops":664444985,"status":"running"}},"isSuccess":true})
2023-12-23 17:41:00.204 [INFO] [runningProjectHandleExecutor-6] [124ac8ee-568f-4759-9235-e3ccede68d87] Do get for CM url (http://19.112.31.11:8088/checker/overview) success, return ({"overview":{"total":1,"tableOverviewList":[{"schema":"tbrhkdb","imageOnly":0,"readTps":163320,"consistent":0,"insertTps":163320,"capacity":176520816793,"mismatched":0,"destTable":"tbl_ubqat_qtonl_trans_flow","deleted":0,"destSchema":"tbrhkdb","inserted":14949702631,"name":"overview-running","progress":"0.085","masterOnly":0,"startingDate":"2023-12-22 18:23:28","updated":0,"table":"tbl_ubqat_qtonl_trans_flow","readDestTps":0,"status":"Running"}],"heartBeat":{"dstRpsRef":32000,"srcRps":163276,"numberOfUpdatedRecords":0,"type":"migrate","consistentQuantity":0,"capacity":176520816793,"srcRt":0,"dstRt":0,"predictedTimeToFinish":604800,"srcIopsRef":33554432,"id":90231,"finishedTables":0,"dstRps":163276,"dstRtRef":1,"inconsistentQuantity":0,"srcIops":668670055,"message":"","startingGmtTime":1703240603,"recordProgress":"0.085","numberOfDeletedRecords":0,"subId":1,"reportingGmtTime":1703324459,"rps":163276,"numberOfInsertedRecords":14949702031,"processedRecords":14949702031,"progress":"0.000","srcRpsRef":32000,"srcRtRef":1,"dstIops":668670055,"status":"running"}},"isSuccess":true})
2023-12-23 17:41:05.263 [INFO] [runningProjectHandleExecutor-10] [79eafb9b-089c-4cc9-af9f-9878ba09b5d3] Do get for CM url (http://19.112.31.11:8088/checker/overview) success, return ({"overview":{"total":1,"tableOverviewList":[{"schema":"tbrhkdb","imageOnly":0,"readTps":163320,"consistent":0,"insertTps":163320,"capacity":176520816793,"mismatched":0,"destTable":"tbl_ubqat_qtonl_trans_flow","deleted":0,"destSchema":"tbrhkdb","inserted":14949702631,"name":"overview-running","progress":"0.085","masterOnly":0,"startingDate":"2023-12-22 18:23:28","updated":0,"table":"tbl_ubqat_qtonl_trans_flow","readDestTps":0,"status":"Running"}],"heartBeat":{"dstRpsRef":32000,"srcRps":163276,"numberOfUpdatedRecords":0,"type":"migrate","consistentQuantity":0,"capacity":176520816793,"srcRt":0,"dstRt":0,"predictedTimeToFinish":604800,"srcIopsRef":33554432,"id":90231,"finishedTables":0,"dstRps":163276,"dstRtRef":1,"inconsistentQuantity":0,"srcIops":668670055,"message":"","startingGmtTime":1703240603,"recordProgress":"0.085","numberOfDeletedRecords":0,"subId":1,"reportingGmtTime":1703324459,"rps":163276,"numberOfInsertedRecords":14949702031,"processedRecords":14949702031,"progress":"0.000","srcRpsRef":32000,"srcRtRef":1,"dstIops":668670055,"status":"running"}},"isSuccess":true})
2023-12-23 17:41:10.319 [INFO] [runningProjectHandleExecutor-11] [dfc80895-d066-4945-9f67-e13ab481ff9e] Do get for CM url (http://19.112.31.11:8088/checker/overview) success, return ({"overview":{"total":1,"tableOverviewList":[{"schema":"tbrhkdb","imageOnly":0,"readTps":162256,"consistent":0,"insertTps":162256,"capacity":176520816793,"mismatched":0,"destTable":"tbl_ubqat_qtonl_trans_flow","deleted":0,"destSchema":"tbrhkdb","inserted":14951325031,"name":"overview-running","progress":"0.085","masterOnly":0,"startingDate":"2023-12-22 18:23:28","updated":0,"table":"tbl_ubqat_qtonl_trans_flow","readDestTps":0,"status":"Running"}],"heartBeat":{"dstRpsRef":32000,"srcRps":162240,"numberOfUpdatedRecords":0,"type":"migrate","consistentQuantity":0,"capacity":176520816793,"srcRt":0,"dstRt":0,"predictedTimeToFinish":604800,"srcIopsRef":33554432,"id":90231,"finishedTables":0,"dstRps":162240,"dstRtRef":1,"inconsistentQuantity":0,"srcIops":667172300,"message":"","startingGmtTime":1703240603,"recordProgress":"0.085","numberOfDeletedRecords":0,"subId":1,"reportingGmtTime":1703324469,"rps":162240,"numberOfInsertedRecords":14951324431,"processedRecords":14951324431,"progress":"0.000","srcRpsRef":32000,"srcRtRef":1,"dstIops":667172300,"status":"running"}},"isSuccess":true})
2023-12-23 17:41:15.375 [INFO] [runningProjectHandleExecutor-2] [efbef26a-6bd6-4c53-86ee-1c946a2e1e5d] Do get for CM url (http://19.112.31.11:8088/checker/overview) success, return ({"overview":{"total":1,"tableOverviewList":[{"schema":"tbrhkdb","imageOnly":0,"readTps":162256,"consistent":0,"insertTps":162256,"capacity":176520816793,"mismatched":0,"destTable":"tbl_ubqat_qtonl_trans_flow","deleted":0,"destSchema":"tbrhkdb","inserted":14951325031,"name":"overview-running","progress":"0.085","masterOnly":0,"startingDate":"2023-12-22 18:23:28","updated":0,"table":"tbl_ubqat_qtonl_trans_flow","readDestTps":0,"status":"Running"}],"heartBeat":{"dstRpsRef":32000,"srcRps":162240,"numberOfUpdatedRecords":0,"type":"migrate","consistentQuantity":0,"capacity":176520816793,"srcRt":0,"dstRt":0,"predictedTimeToFinish":604800,"srcIopsRef":33554432,"id":90231,"finishedTables":0,"dstRps":162240,"dstRtRef":1,"inconsistentQuantity":0,"srcIops":667172300,"message":"","startingGmtTime":1703240603,"recordProgress":"0.085","numberOfDeletedRecords":0,"subId":1,"reportingGmtTime":1703324469,"rps":162240,"numberOfInsertedRecords":14951324431,"processedRecords":14951324431,"progress":"0.000","srcRpsRef":32000,"srcRtRef":1,"dstIops":667172300,"status":"running"}},"isSuccess":true})
2023-12-23 17:41:20.431 [INFO] [runningProjectHandleExecutor-14] [5b1ad7eb-9070-4989-a124-b4deeb365e6e] Do get for CM url (http://19.112.31.11:8088/checker/overview) success, return ({"overview":{"total":1,"tableOverviewList":[{"schema":"tbrhkdb","imageOnly":0,"readTps":162900,"consistent":0,"insertTps":162900,"capacity":176520816793,"mismatched":0,"destTable":"tbl_ubqat_qtonl_trans_flow","deleted":0,"destSchema":"tbrhkdb","inserted":14952954031,"name":"overview-running","progress":"0.085","masterOnly":0,"startingDate":"2023-12-22 18:23:28","updated":0,"table":"tbl_ubqat_qtonl_trans_flow","readDestTps":0,"status":"Running"}],"heartBeat":{"dstRpsRef":32000,"srcRps":162960,"numberOfUpdatedRecords":0,"type":"migrate","consistentQuantity":0,"capacity":176520816793,"srcRt":0,"dstRt":0,"predictedTimeToFinish":604800,"srcIopsRef":33554432,"id":90231,"finishedTables":0,"dstRps":162960,"dstRtRef":1,"inconsistentQuantity":0,"srcIops":669779862,"message":"","startingGmtTime":1703240603,"recordProgress":"0.085","numberOfDeletedRecords":0,"subId":1,"reportingGmtTime":1703324479,"rps":162960,"numberOfInsertedRecords":14952954031,"processedRecords":14952954031,"progress":"0.000","srcRpsRef":32000,"srcRtRef":1,"dstIops":669779862,"status":"running"}},"isSuccess":true})
2023-12-23 17:41:25.494 [INFO] [runningProjectHandleExecutor-13] [e826983c-50f5-4fb9-88b2-205a6a8103f7] Do get for CM url (http://19.112.31.11:8088/checker/overview) success, return ({"overview":{"total":1,"tableOverviewList":[{"schema":"tbrhkdb","imageOnly":0,"readTps":162900,"consistent":0,"insertTps":162900,"capacity":176520816793,"mismatched":0,"destTable":"tbl_ubqat_qtonl_trans_flow","deleted":0,"destSchema":"tbrhkdb","inserted":14952954031,"name":"overview-running","progress":"0.085","masterOnly":0,"startingDate":"2023-12-22 18:23:28","updated":0,"table":"tbl_ubqat_qtonl_trans_flow","readDestTps":0,"status":"Running"}],"heartBeat":{"dstRpsRef":32000,"srcRps":162960,"numberOfUpdatedRecords":0,"type":"migrate","consistentQuantity":0,"capacity":176520816793,"srcRt":0,"dstRt":0,"predictedTimeToFinish":604800,"srcIopsRef":33554432,"id":90231,"finishedTables":0,"dstRps":162960,"dstRtRef":1,"inconsistentQuantity":0,"srcIops":669779862,"message":"","startingGmtTime":1703240603,"recordProgress":"0.085","numberOfDeletedRecords":0,"subId":1,"reportingGmtTime":1703324479,"rps":162960,"numberOfInsertedRecords":14952954031,"processedRecords":14952954031,"progress":"0.000","srcRpsRef":32000,"srcRtRef":1,"dstIops":669779862,"status":"running"}},"isSuccess":true})
2023-12-23 17:41:30.657 [INFO] [runningProjectHandleExecutor-1] [a45d8a07-99ad-41a6-9930-9ca5e488eb99] Do get for CM url (http://19.112.31.11:8088/checker/overview) success, return ({"overview":{"total":1,"tableOverviewList":[{"schema":"tbrhkdb","imageOnly":0,"readTps":163140,"consistent":0,"insertTps":163140,"capacity":176520816793,"mismatched":0,"destTable":"tbl_ubqat_qtonl_trans_flow","deleted":0,"destSchema":"tbrhkdb","inserted":14954585431,"name":"overview-running","progress":"0.085","masterOnly":0,"startingDate":"2023-12-22 18:23:28","updated":0,"table":"tbl_ubqat_qtonl_trans_flow","readDestTps":0,"status":"Running"}],"heartBeat":{"dstRpsRef":32000,"srcRps":163124,"numberOfUpdatedRecords":0,"type":"migrate","consistentQuantity":0,"capacity":176520816793,"srcRt":0,"dstRt":0,"predictedTimeToFinish":604800,"srcIopsRef":33554432,"id":90231,"finishedTables":0,"dstRps":163124,"dstRtRef":1,"inconsistentQuantity":0,"srcIops":665400795,"message":"","startingGmtTime":1703240603,"recordProgress":"0.085","numberOfDeletedRecords":0,"subId":1,"reportingGmtTime":1703324489,"rps":163124,"numberOfInsertedRecords":14954585431,"processedRecords":14954585431,"progress":"0.000","srcRpsRef":32000,"srcRtRef":1,"dstIops":665400795,"status":"running"}},"isSuccess":true})
2023-12-23 17:41:35.715 [INFO] [runningProjectHandleExecutor-15] [c48e2e6f-c953-4ba5-b173-da63ba52ce79] Do get for CM url (http://19.112.31.11:8088/checker/overview) success, return ({"overview":{"total":1,"tableOverviewList":[{"schema":"tbrhkdb","imageOnly":0,"readTps":163140,"consistent":0,"insertTps":163140,"capacity":176520816793,"mismatched":0,"destTable":"tbl_ubqat_qtonl_trans_flow","deleted":0,"destSchema":"tbrhkdb","inserted":14954585431,"name":"overview-running","progress":"0.085","masterOnly":0,"startingDate":"2023-12-22 18:23:28","updated":0,"table":"tbl_ubqat_qtonl_trans_flow","readDestTps":0,"status":"Running"}],"heartBeat":{"dstRpsRef":32000,"srcRps":163124,"numberOfUpdatedRecords":0,"type":"migrate","consistentQuantity":0,"capacity":176520816793,"srcRt":0,"dstRt":0,"predictedTimeToFinish":604800,"srcIopsRef":33554432,"id":90231,"finishedTables":0,"dstRps":163124,"dstRtRef":1,"inconsistentQuantity":0,"srcIops":665400795,"message":"","startingGmtTime":1703240603,"recordProgress":"0.085","numberOfDeletedRecords":0,"subId":1,"reportingGmtTime":1703324489,"rps":163124,"numberOfInsertedRecords":14954585431,"processedRecords":14954585431,"progress":"0.000","srcRpsRef":32000,"srcRtRef":1,"dstIops":665400795,"status":"running"}},"isSuccess":true})
2023-12-23 17:41:40.774 [INFO] [runningProjectHandleExecutor-12] [f33dcdcd-c9d3-4ab5-bd3b-b61e1e1ded86] Do get for CM url (http://19.112.31.11:8088/checker/overview) success, return ({"overview":{"total":1,"tableOverviewList":[{"schema":"tbrhkdb","imageOnly":0,"readTps":162960,"consistent":0,"insertTps":162960,"capacity":176520816793,"mismatched":0,"destTable":"tbl_ubqat_qtonl_trans_flow","deleted":0,"destSchema":"tbrhkdb","inserted":14956215031,"name":"overview-running","progress":"0.085","masterOnly":0,"startingDate":"2023-12-22 18:23:28","updated":0,"table":"tbl_ubqat_qtonl_trans_flow","readDestTps":0,"status":"Running"}],"heartBeat":{"dstRpsRef":32000,"srcRps":162960,"numberOfUpdatedRecords":0,"type":"migrate","consistentQuantity":0,"capacity":176520816793,"srcRt":0,"dstRt":0,"predictedTimeToFinish":604800,"srcIopsRef":33554432,"id":90231,"finishedTables":0,"dstRps":162960,"dstRtRef":1,"inconsistentQuantity":0,"srcIops":666787064,"message":"","startingGmtTime":1703240603,"recordProgress":"0.085","numberOfDeletedRecords":0,"subId":1,"reportingGmtTime":1703324499,"rps":162960,"numberOfInsertedRecords":14956215031,"processedRecords":14956215031,"progress":"0.000","srcRpsRef":32000,"srcRtRef":1,"dstIops":666787064,"status":"running"}},"isSuccess":true})
2023-12-23 20:19:56.894 [INFO] [http-nio-8090-exec-8] [17629573-6cfa-46ba-8b3a-bd9cc6d1590a] Do get for CM url (http://19.112.31.11:8088/checker/overview) success, return ({"overview":{"total":1,"tableOverviewList":[{"schema":"tbrhkdb","imageOnly":0,"readTps":170083,"consistent":0,"insertTps":170083,"capacity":176520816793,"mismatched":0,"destTable":"tbl_ubqat_qtonl_trans_flow","deleted":0,"destSchema":"tbrhkdb","inserted":16487958631,"name":"overview-running","progress":"0.093","masterOnly":0,"startingDate":"2023-12-22 18:23:28","updated":0,"table":"tbl_ubqat_qtonl_trans_flow","readDestTps":0,"status":"Running"}],"heartBeat":{"dstRpsRef":32000,"srcRps":169980,"numberOfUpdatedRecords":0,"type":"migrate","consistentQuantity":0,"capacity":176520816793,"srcRt":0,"dstRt":0,"predictedTimeToFinish":604800,"srcIopsRef":33554432,"id":90231,"finishedTables":0,"dstRps":169980,"dstRtRef":1,"inconsistentQuantity":0,"srcIops":735628428,"message":"","startingGmtTime":1703240603,"recordProgress":"0.093","numberOfDeletedRecords":0,"subId":1,"reportingGmtTime":1703333989,"rps":169980,"numberOfInsertedRecords":16487956831,"processedRecords":16487956831,"progress":"0.000","srcRpsRef":32000,"srcRtRef":1,"dstIops":735628428,"status":"running"}},"isSuccess":true})
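
The entries above arrive about every 5 seconds until 17:41:40 and then resume at 20:19:56. A gap like that can be located mechanically by comparing consecutive timestamps. The following is a minimal sketch, assuming every entry starts with a `YYYY-MM-DD HH:MM:SS.mmm` timestamp as in this log; the function name `poll_gaps` and the 30-second threshold are illustrative choices, and the sample lines are abbreviated with `...`.

```python
import re
from datetime import datetime

TS = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) ")


def poll_gaps(lines, threshold_s=30.0):
    """Return (previous, current, gap_seconds) for consecutive entries
    whose timestamps are more than threshold_s apart."""
    gaps = []
    prev = None
    for line in lines:
        m = TS.match(line)
        if not m:
            continue  # continuation lines such as stack frames
        ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
        if prev is not None:
            gap = (ts - prev).total_seconds()
            if gap > threshold_s:
                gaps.append((prev, ts, gap))
        prev = ts
    return gaps


# Timestamps taken from the thread; the rest of each line is abbreviated.
SAMPLE = [
    "2023-12-23 17:41:35.715 [INFO] [runningProjectHandleExecutor-15] ... success ...",
    "2023-12-23 17:41:40.774 [INFO] [runningProjectHandleExecutor-12] ... success ...",
    "2023-12-23 20:19:56.894 [INFO] [http-nio-8090-exec-8] ... success ...",
]

for prev, cur, gap in poll_gaps(SAMPLE):
    print(f"no entries between {prev} and {cur} ({gap:.0f} s)")
```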

common-error.log

2023-12-23 15:49:30.240 [http-nio-8090-exec-5] ERROR c.a.o.s.i.o.o.OmsOperatorStoreServiceImpl 352 - [9b7d53e3-44d7-4b2b-9c73-7cfa8b240009] Failed to find drc cm service by sub topic :p_5c5xmjcgel5s_dest-000-0.
2023-12-23 17:43:53.100 [runningProjectHandleExecutor-9] ERROR c.a.o.s.i.ApiCommonService 43 - [b0a335e6-47fc-4aff-877b-9f7df95ad3c8] Do get for CM url (http://19.112.31.11:8088/checker/overview) failed, error : I/O error on GET request for "http://19.112.31.11:8088/checker/overview": Connection timed out (Connection timed out); nested exception is java.net.ConnectException: Connection timed out (Connection timed out)
2023-12-23 17:43:53.101 [runningProjectHandleExecutor-9] ERROR c.a.o.s.s.ProjectHandler 63 - [b0a335e6-47fc-4aff-877b-9f7df95ad3c8] Exception Stack:
com.alipay.oms.dto.nexception.AccessCmException: The response from the CM service is not success.: [RM_API_ERROR] {"message":"Do get for CM url (http://19.112.31.11:8088/checker/overview) failed, error : I/O error on GET request for \"http://19.112.31.11:8088/checker/overview\": Connection timed out (Connection timed out); nested exception is java.net.ConnectException: Connection timed out (Connection timed out)"}
	at com.alipay.oms.service.impl.drc.DrcCmServiceImpl.executeGetRequestToCm(DrcCmServiceImpl.java:1893)
	at com.alipay.oms.service.impl.drc.DrcCmServiceImpl.doCmGetRequestByProjectId(DrcCmServiceImpl.java:1800)
	at com.alipay.oms.service.impl.drc.DrcCmServiceImpl.getCheckerOverview(DrcCmServiceImpl.java:1121)
	at com.alipay.oms.service.impl.action.step.CheckerCommonAction.updateCheckerProgress(CheckerCommonAction.java:380)
	at com.alipay.oms.service.impl.action.step.CheckerCommonAction.rmCommonRunningAction(CheckerCommonAction.java:243)
	at com.alipay.oms.service.impl.action.step.FullMigrationAction.doRunningAction(FullMigrationAction.java:34)
	at com.alipay.oms.service.scheduler.AbstractProjectHandler.doActionForRunningSteps(AbstractProjectHandler.java:176)
	at com.alipay.oms.service.scheduler.AbstractProjectHandler.innerHandleRunningProject(AbstractProjectHandler.java:95)
	at com.alipay.oms.service.scheduler.ProjectHandler.handleRunningProject(ProjectHandler.java:43)
	at com.alipay.oms.service.scheduler.ProjectHandler$$FastClassBySpringCGLIB$$1947796d.invoke(<generated>)
	at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
	at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:750)
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
	at org.springframework.aop.interceptor.AsyncExecutionInterceptor.lambda$invoke$0(AsyncExecutionInterceptor.java:115)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:853)
Caused by: com.alipay.oms.dto.exception.RmApiException: [RM_API_ERROR] {"message":"Do get for CM url (http://19.112.31.11:8088/checker/overview) failed, error : I/O error on GET request for \"http://19.112.31.11:8088/checker/overview\": Connection timed out (Connection timed out); nested exception is java.net.ConnectException: Connection timed out (Connection timed out)"}
	at com.alipay.oms.service.impl.ApiCommonService.doDefaultGetRequest(ApiCommonService.java:44)
	at com.alipay.oms.service.impl.drc.DrcCmServiceImpl.executeGetRequestToCm(DrcCmServiceImpl.java:1876)
	... 17 common frames omitted
2023-12-23 20:19:56.530 [http-nio-8090-exec-6] ERROR c.a.o.s.i.o.o.OmsOperatorStoreServiceImpl 352 - [1befc2ce-8a90-49f0-b432-dbdcf6dc0b23] Failed to find drc cm service by sub topic :p_5c5xmjcgel5s_dest-000-0.
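
Combining the two logs gives a rough estimate of how long the failed request was blocked. The sketch below uses the last successful poll from oms-api.log and the error time from common-error.log; the 5.06 s poll interval is an assumption read off the spacing of the successful polls, not a documented OMS setting.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S.%f"

last_success = datetime.strptime("2023-12-23 17:41:40.774", FMT)  # oms-api.log
error_logged = datetime.strptime("2023-12-23 17:43:53.100", FMT)  # common-error.log

# Assumption: the next poll started one average interval after the last success.
poll_interval = timedelta(seconds=5.06)
attempt_started = last_success + poll_interval

blocked_for = (error_logged - attempt_started).total_seconds()
print(f"{blocked_for:.1f}")  # 127.3
```

About 127 s is what a Linux `connect()` takes to give up with the default `net.ipv4.tcp_syn_retries=6` when no reply arrives at all. That is consistent with the `ConnectException: Connection timed out` in the stack trace, but it is an inference from the timestamps and is not confirmed anywhere in the thread.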

Is p_5c5xmjcgel5s_dest-000-0 running normally?

Check the GC activity of the CM service process:
/opt/alibaba/java/bin/jstat -gcutil 1717 1s
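
To put a number on full-GC activity, the FGC (full GC count) and FGCT (full GC time) columns can be compared between the first and last sample. A minimal sketch: the sample rows are made up for illustration, since the thread does not include the real output, and the header layout assumes a JDK 8 style `-gcutil` listing. The helper name `fgc_delta` is hypothetical.

```python
def fgc_delta(jstat_text):
    """Return (sample count, FGC increase, FGCT increase in seconds)
    between the first and last row of `jstat -gcutil` output."""
    rows = [line.split() for line in jstat_text.splitlines() if line.strip()]
    header, samples = rows[0], rows[1:]
    fgc = header.index("FGC")    # cumulative number of full GCs
    fgct = header.index("FGCT")  # cumulative full GC time in seconds
    first, last = samples[0], samples[-1]
    return (
        len(samples),
        int(last[fgc]) - int(first[fgc]),
        float(last[fgct]) - float(first[fgct]),
    )


# Made-up sample, one row per second as with the `1s` interval above.
SAMPLE = """\
  S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT     GCT
  0.00   0.00 100.00  99.98  93.10  88.20   5210   61.230  1840 2210.400 2271.630
  0.00   0.00 100.00  99.99  93.10  88.20   5210   61.230  1841 2211.600 2272.830
  0.00   0.00 100.00  99.99  93.10  88.20   5210   61.230  1842 2212.900 2274.130
"""

n, count, seconds = fgc_delta(SAMPLE)
print(f"{count} full GCs, {seconds:.1f} s of full GC time over {n} samples")
```

In this invented sample the old generation (O) stays close to 100% and FGC rises with nearly every sample, which is the pattern one would expect when a JVM spends most of its time in full GC.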


It is currently in a normal state.

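This is not normal: there are a very large number of full GCs. I suggest increasing the JVM memory of CM.
Enter the OMS container and edit /home/admin/conf/command/start_oms_cm.sh, changing
-server -Xmx4g -Xms4g -Xmn3g
to
-server -Xmx8g -Xms8g -Xmn7g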
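
To see what the two flag sets mean for the heap layout, the generation sizes can be worked out from `-Xmx` (maximum heap) and `-Xmn` (young generation). The old generation is the remainder. A minimal sketch; the helper name `heap_layout` is hypothetical and it only handles the `k`, `m` and `g` suffixes.

```python
import re

UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
GIB = 1024 ** 3


def heap_layout(jvm_opts):
    """Return (max heap, young generation, old generation) in bytes,
    derived from the -Xmx and -Xmn flags in jvm_opts."""
    def size(flag):
        m = re.search(rf"-{flag}(\d+)([kmg])\b", jvm_opts, re.IGNORECASE)
        if m is None:
            raise ValueError(f"-{flag} not found in: {jvm_opts}")
        return int(m.group(1)) * UNITS[m.group(2).lower()]

    xmx, xmn = size("Xmx"), size("Xmn")
    return xmx, xmn, xmx - xmn


for opts in ("-server -Xmx4g -Xms4g -Xmn3g", "-server -Xmx8g -Xms8g -Xmn7g"):
    heap, young, old = heap_layout(opts)
    print(f"{opts}: heap {heap // GIB} GiB, young {young // GIB} GiB, old {old // GIB} GiB")
```

With these flags the young generation grows from 3 GiB to 7 GiB while the old generation stays at 1 GiB. Whether that split suits CM's allocation pattern is not discussed in the thread.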

Does the container need to be restarted for the change to take effect?

No, the container does not need to be restarted. Restarting the CM component is enough:
supervisorctl restart oms_drc_cm

Will tasks that are currently synchronizing be affected?

No.