OMS incremental sync component fails to start: GHANA-OPERAT000003

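【Environment】Test environment
【OB or other component】OMS
【Version】4.2.9_CE
【Problem description】I am using OMS incremental synchronization to migrate data from some PostgreSQL 15 tables to OceanBase_CE 4.3.5.0 in real time.
The task fails with error GHANA-OPERAT000003.
Error message: The response from the CM service is not success.
I checked /home/admin/logs/ghana/Ghana/common-error.log
and found this error: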
CM response not success.
No enough machin resource for Store task,reason current cpu usage 4.1550465 exceed limited 0.85.

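Is this alert ultimately caused by insufficient CPU resources? The machine has 8 cores, and at the time both the container's overall CPU load and the load shown by top inside the container stayed below 2.0, meaning less than 2 cores' worth of CPU was in use. Why does this error still occur and cause the task to fail?

Is the 0.85 here computed against total CPU utilization?

On this 8-core machine, if every core is at 80% utilization, the total CPU utilization is 640% / 8 = 80%. That is below the 85% threshold, so would the limit not be triggered in that case?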
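
The arithmetic in the question can be written out as a small sketch. It only contrasts usage expressed in cores with usage normalized by core count. How OMS actually derives the value it compares against 0.85 is not confirmed in this thread, and the function names are hypothetical.

```python
THRESHOLD = 0.85  # limit quoted in the error message


def cores_in_use(per_core_busy):
    """Sum of per-core busy fractions, i.e. usage expressed in cores."""
    return sum(per_core_busy)


def normalized_usage(per_core_busy):
    """Usage averaged over all cores, always between 0.0 and 1.0."""
    return cores_in_use(per_core_busy) / len(per_core_busy)


# The scenario from the question: 8 cores, each 80% busy.
per_core = [0.80] * 8
print(f"cores in use: {cores_in_use(per_core):.2f}")
print(f"normalized:   {normalized_usage(per_core):.2f}")
print("normalized exceeds threshold:", normalized_usage(per_core) > THRESHOLD)

# Hypothetical reading of the logged 4.1550465 as "cores in use" on 8 cores.
print(f"4.1550465 / 8 = {4.1550465 / 8:.3f}")
```

A normalized value can never exceed 1.0, so a logged usage of 4.1550465 suggests the figure was not divided by the core count the user expects, or was divided by a different one. That is an inference from the numbers, not a confirmed explanation.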

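At the time, I resolved the problem by restarting the OMS container.
If I raise the drcCfg.cpuUsedPercentThresHold parameter in drc-cm.properties and then restart the oms_drc_cm service with
supervisorctl restart oms_drc_cm, would that work as a temporary workaround?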

Take a look at the component monitoring to see exactly what the problem is.
Please also post the /home/admin/logs/ghana/Ghana/common-error.log file.

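Component monitoring shows only the Store component, and it is running normally.
There is no Incr-Sync component, however.

I am unable to retrieve the log file.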


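What I really want to know is whether the 0.85 here is computed against total CPU utilization.
If it is, does that mean the CPU usage sampled at the time was 415% × 8? My log even contains values of 5.1, so would that be 5 × 8, i.e. 4000%?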

What load average did you see at the time?

Run the following inside the OMS container and check the value:
cat /sys/fs/cgroup/cpuacct/cpu.cfs_quota_us

All below 2.0; it never went above 2.0.

cat /sys/fs/cgroup/cpuacct/cpu.cfs_quota_us
Please run the command from the reply above and report the value.

cat: /sys/fs/cgroup/cpuacct/cpu.cfs_quota_us: No such file or directory

The container was created step by step with docker_remote_deploy.sh, with no additional configuration.
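
For reference, here is a minimal sketch of how a container's CPU limit can be read from the cgroup files. It assumes the standard kernel layouts: on cgroup v1 the quota lives under the cpu controller as cpu.cfs_quota_us and cpu.cfs_period_us (a quota of -1 means unlimited), and on cgroup v2 it is the single file cpu.max. The helper names and the list of candidate directories are illustrative and are not taken from OMS.

```python
from pathlib import Path


def parse_v1_quota(quota_text, period_text):
    """cgroup v1: a quota of -1 means no limit; otherwise limit = quota / period."""
    quota = int(quota_text.strip())
    period = int(period_text.strip())
    if quota < 0:
        return None
    return quota / period


def parse_v2_cpu_max(text):
    """cgroup v2: cpu.max holds '<quota> <period>', with quota 'max' if unlimited."""
    quota, period = text.split()
    if quota == "max":
        return None
    return int(quota) / int(period)


def container_cpu_limit(root="/sys/fs/cgroup"):
    """Return the CPU limit in cores, or None if unlimited or no quota file exists."""
    root = Path(root)
    v2_file = root / "cpu.max"
    if v2_file.exists():
        return parse_v2_cpu_max(v2_file.read_text())
    # Candidate v1 controller directories; which ones exist depends on the host.
    for controller in ("cpu", "cpu,cpuacct", "cpuacct"):
        quota = root / controller / "cpu.cfs_quota_us"
        period = root / controller / "cpu.cfs_period_us"
        if quota.exists() and period.exists():
            return parse_v1_quota(quota.read_text(), period.read_text())
    return None


if __name__ == "__main__":
    print("CPU limit (cores):", container_cpu_limit())
```

With a quota of 600000 and a period of 100000, for example, the helper reports a limit of 6.0 cores.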

Run docker stats and take a look. Was the OMS container's CPU usage ever limited before?

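No CPU limit was set before, and that is when the error occurred. I have since brought up a new test environment with the container limited to 6 CPUs, and so far the error has not appeared again.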

Run docker info and check whether Cgroup Version is v1 or v2.

Run the following and post the results.
For cgroup2:
cat /sys/fs/cgroup/cpu.stat
cat /proc/stat
cat /sys/fs/cgroup/cpuset.cpus.effective

For cgroup1:
cat /sys/fs/cgroup/cpuacct/cpuacct.usage
cat /proc/stat
cat /sys/fs/cgroup/cpuacct/cpuacct.usage_percpu
cat /sys/fs/cgroup/cpuset/cpuset.cpus
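
The /proc/stat counters requested here are cumulative ticks since boot, so a utilization figure comes from the difference between two samples. The sketch below shows the usual calculation on the aggregate `cpu` line. The function names are hypothetical and this is not OMS's own code. Note that /proc/stat is normally not container-aware, so inside a container it typically reports host-wide counters.

```python
def parse_cpu_line(line):
    """Return [user, nice, system, idle, iowait, irq, softirq, steal] in ticks."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
    # Guest time is already included in user/nice, so only the first 8 fields count.
    return [int(value) for value in parts[1:9]]


def busy_fraction(prev, curr):
    """Fraction of time spent non-idle between two samples (0.0 to 1.0)."""
    deltas = [c - p for p, c in zip(prev, curr)]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    idle = deltas[3] + deltas[4]  # idle + iowait
    return (total - idle) / total


# The aggregate line posted later in this thread, compared against boot (all zeros),
# gives the average utilization since boot rather than an instantaneous value.
line = "cpu 9297750780 12338120 4735523752 124476209864 487214568 0 1808373155 0 0 0"
since_boot = busy_fraction([0] * 8, parse_cpu_line(line))
print(f"average busy fraction since boot: {since_boot:.4f}")
```

For a live reading, take two samples a few seconds apart and pass both parsed lines to `busy_fraction`.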

Server Version: 20.10.0
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: systemd
Cgroup Version: 1

root@uat03:~# cat /sys/fs/cgroup/cpuacct/cpuacct.usage

169627395330301594

root@uat03:~# cat /proc/stat

cpu 9297750780 12338120 4735523752 124476209864 487214568 0 1808373155 0 0 0

cpu0 374898244 269865 107107848 2906643948 11110960 0 102609316 0 0 0

cpu1 247559423 250509 119888597 3049982157 12331930 0 61447478 0 0 0

cpu2 234167111 379430 126942298 3095016198 12661461 0 45417431 0 0 0

cpu3 230497867 261148 118560424 3108906982 14472042 0 47272117 0 0 0

cpu4 329148719 330285 109569634 3042463347 11237892 0 28602582 0 0 0

cpu5 226543891 273824 117927825 3120002002 12850508 0 45793050 0 0 0

cpu6 224496139 375993 122762404 3121264181 12081988 0 36967646 0 0 0

cpu7 285678159 236427 118183384 3060797336 11845799 0 47440634 0 0 0

cpu8 251535199 337772 118795284 3101054236 11459476 0 37231703 0 0 0

cpu9 271396172 249519 116591919 3077971975 12251183 0 45494321 0 0 0

cpu10 222022319 379438 125056470 3118951987 11701761 0 40843544 0 0 0

cpu11 252508394 164207 115299160 3096279402 12083440 0 45719109 0 0 0

cpu12 219326316 374023 124004477 3121763829 11623377 0 41977326 0 0 0

cpu13 235417113 281800 116982080 3116064384 13634616 0 42477415 0 0 0

cpu14 219288017 378758 124961809 3119716621 11567251 0 43641190 0 0 0

cpu15 230012558 276901 116415437 3122547407 13476803 0 42227696 0 0 0

cpu16 214836883 365107 121817665 3128229545 11440205 0 44173140 0 0 0

cpu17 227140781 276861 115901497 3125963711 13207990 0 42406460 0 0 0

cpu18 215002284 342465 122481196 3127062262 11333350 0 44318882 0 0 0

cpu19 225604800 276468 116000707 3127503399 13003397 0 42622622 0 0 0

cpu20 222747391 403166 125495252 3116213795 11265678 0 43836062 0 0 0

cpu21 222759850 259990 114309273 3128418965 12734092 0 44418378 0 0 0

cpu22 210378245 373672 122291990 3131397923 11521810 0 44133937 0 0 0

cpu23 214787554 262107 114245794 3138756250 12680277 0 43183971 0 0 0

cpu24 210996913 407520 122307068 3132327049 11473348 0 43750102 0 0 0

cpu25 213934650 268921 114154998 3139680080 12721897 0 43467268 0 0 0

cpu26 225293505 360108 124888439 3112943616 11546484 0 43963137 0 0 0

cpu27 210338015 278375 112684390 3145981912 12641908 0 42850900 0 0 0

cpu28 252734878 308263 120156341 3089400308 10556478 0 46814155 0 0 0

cpu29 210678809 256604 112127754 3145856668 12622539 0 42967892 0 0 0

cpu30 216138200 375078 122739199 3126738093 11445252 0 42523207 0 0 0

cpu31 210826912 174872 112402507 3143919127 12555511 0 43848220 0 0 0

cpu32 212046782 380243 122222000 3131869984 11529401 0 42601258 0 0 0

cpu33 210963065 288611 114013263 3141939177 12769205 0 43712183 0 0 0

cpu34 208652158 378682 121631635 3135532873 11553918 0 42810621 0 0 0

cpu35 211222966 280907 114392137 3140664568 12775966 0 44132466 0 0 0

cpu36 239326406 351593 119648496 3102500376 10920196 0 44397490 0 0 0

cpu37 210411042 276256 114082535 3141477499 12812771 0 44434940 0 0 0

cpu38 236220273 298353 122238261 3100885836 12912103 0 47339305 0 0 0

cpu39 210212761 273981 114242288 3141520833 12800284 0 44503988 0 0 0

intr 134690384110 18 0 0 0 0 0 0 0 1 2 285 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 193 0 17743313 0 127184576 125384270 109694552 107845030 99084458 98952656 96115491 94234697 91709984 92125195 89032230 91667827 89828464 91471986 85222934 89200631 89331897 89366264 86720661 88097861 123542764 132319914 131470037 122241560 122989450 123231216 124772284 124830046 124909463 125013898 122587117 124358228 124326209 122141036 123282521 124421562 125058607 124794828 124938485 124606873 0 0 0 0 24 2127391808 1318437185 3456605582 1510523514 3003500095 0 0 0 ... (all remaining interrupt counters are 0)

ctxt 5544619389928

btime 1715677343

processes 1993891773

procs_running 4

procs_blocked 2

softirq 439278788352 7 259436422 200503251 3157643229 3937598263 0 523868867 1908890160 0 4089085849

root@uat03:~# cat /sys/fs/cgroup/cpuacct/cpuacct.usage_percpu

5304076579085191 4422128182546035 4320986569961504 4253406901142401 4944273232524405 4207901414151785 4153422600816206 4798982793669041 4380283047591475 4630010916211362 4199362359061926 4419663353043737 4174675924890766 4252502340256291 4201769498153940 4191171300409322 4124792188962883 4159809293165682 4134014360421029 4146092314932298 4239865464045428 4121566706296147 4083199315062415 4037068092943702 4081982745694573 4027034239461367 4255369853439062 3967992605303583 4474040320931252 3967919791862728 4133305006121217 3988654664056208 4081894620926440 4004330059522311 4046693395238522 4017681884245174 4314509394571336 4007071094461832 4349200502720386 4008690442947811

root@uat03:~# cat /sys/fs/cgroup/cpuset/cpuset.cpus

0-39
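
To relate these cgroup v1 values to a usage ratio: cpuacct.usage is cumulative CPU time in nanoseconds, so two samples give the number of cores in use, and cpuset.cpus gives the CPU count to normalize by. The sketch below assumes exactly that. The second sample is invented so the result mirrors the 4.1550465 from the error message, and nothing here is confirmed to be how OMS computes its figure.

```python
def count_cpus(cpuset):
    """Count CPUs in a cpuset list such as '0-39' or '0-3,8,10-11'."""
    total = 0
    for part in cpuset.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            total += int(high) - int(low) + 1
        else:
            total += 1
    return total


def cgroup_cpu_usage(usage_ns_start, usage_ns_end, elapsed_s, ncpus):
    """Return (cores in use, usage normalized by CPU count) between two samples."""
    cores_used = (usage_ns_end - usage_ns_start) / (elapsed_s * 1e9)
    return cores_used, cores_used / ncpus


ncpus = count_cpus("0-39")             # value posted above
start_ns = 169627395330301594          # cpuacct.usage value posted above
end_ns = start_ns + 4_155_046_500      # hypothetical second sample, 1 second later
cores_used, normalized = cgroup_cpu_usage(start_ns, end_ns, 1.0, ncpus)
print(f"CPUs in cpuset: {ncpus}")
print(f"cores in use:   {cores_used:.4f}")
print(f"normalized:     {normalized:.4f}")
```

With these inputs the un-normalized figure is about 4.16 cores while the normalized one is about 0.10, so whether a value of this size crosses a 0.85 limit depends entirely on whether, and by what CPU count, it is divided.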

Was this run inside the OMS container? If not, please run it inside the container and check.

Yes, inside the container.

Can this scenario be reproduced? If so, please post the supervisor log: /home/admin/logs/supervisor/supervisor.log


Under O&M monitoring > Machine you can see the CPU usage; under normal conditions it does not exceed 100%.