Error log suddenly appeared: [errcode=-4392] disk is hung

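[Environment] Production or test environment
[OB or other component]
[Version] Community Edition 4.1
[Problem description]
This OB cluster was used for testing and has been idle for over a week. I logged in to OCP today and found a "disk is hung" alert. I then checked the logs: the error appeared for only about one minute, after which the alert cleared on its own.
Taken literally, "disk is hung" sounds like a serious problem. What could cause this?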

[Attachments]

[2023-06-02 09:39:13.058690] ERROR detect_palf_hang_failure_ (ob_failure_detector.cpp:346) [5479][T1005_Occam][T1005][Y0-0000000000000000-0-0] [lt=25][errcode=-4392] disk is hung(msg="clog disk may be hung, add failure event", clog_disk_hang_event={type:PROCESS HANG, module:LOG, info:clog disk hang event, level:FATAL}, clog_disk_last_working_time=1685669948055441, hung time=5003071)
[2023-06-02 09:39:13.068131] ERROR detect_palf_hang_failure_ (ob_failure_detector.cpp:346) [5300][T1_Occam][T1][Y0-0000000000000000-0-0] [lt=74007][errcode=-4392] disk is hung(msg="clog disk may be hung, add failure event", clog_disk_hang_event={type:PROCESS HANG, module:LOG, info:clog disk hang event, level:FATAL}, clog_disk_last_working_time=1685669948006984, hung time=5061092)
[2023-06-02 09:39:13.075658] ERROR detect_palf_hang_failure_ (ob_failure_detector.cpp:346) [5640][T1006_Occam][T1006][Y0-0000000000000000-0-0] [lt=26][errcode=-4392] disk is hung(msg="clog disk may be hung, add failure event", clog_disk_hang_event={type:PROCESS HANG, module:LOG, info:clog disk hang event, level:FATAL}, clog_disk_last_working_time=1685669947998205, hung time=5077301)
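
To read the log fields above more easily, here is a minimal Python sketch that extracts `clog_disk_last_working_time` and `hung time` from one line. It assumes both values are in microseconds and that the server clock is UTC+8; neither is stated in the log, but under those assumptions the first line decodes to a last working time of 09:39:08, about 5 seconds before the 09:39:13 log timestamp. The function name `parse_hang` is made up for illustration.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches the two fields at the end of a "disk is hung" log line.
PATTERN = re.compile(r"clog_disk_last_working_time=(\d+), hung time=(\d+)")


def parse_hang(line, tz_offset_hours=8):
    """Return (last_working_time, hung_seconds), or None if the line does not match.

    Assumption: both fields are microseconds, and the server is at UTC+8.
    """
    m = PATTERN.search(line)
    if m is None:
        return None
    last_us, hung_us = int(m.group(1)), int(m.group(2))
    tz = timezone(timedelta(hours=tz_offset_hours))
    # Integer arithmetic keeps the microsecond part exact.
    last = datetime.fromtimestamp(last_us // 1_000_000, tz) + timedelta(
        microseconds=last_us % 1_000_000
    )
    return last, hung_us / 1_000_000


# Tail of the first log line above.
sample = "clog_disk_last_working_time=1685669948055441, hung time=5003071)"
last, hung = parse_hang(sample)
print(last.isoformat(), hung)
```

Read this way, each of the three tenants (T1, T1005, T1006) reported that its clog disk had made no progress for roughly 5 seconds.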

Check the /var/log/messages log: is there any disk failure information before the time of the error?
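
As a rough sketch of that check, the Python snippet below filters syslog lines for a few common Linux kernel disk-related messages. The keyword list is an assumption and far from complete, and `find_disk_errors` is a made-up helper name; adjust both for your distribution.

```python
import re

# Assumed keywords: typical kernel messages seen when a disk misbehaves.
DISK_PATTERNS = re.compile(
    r"I/O error|blocked for more than \d+ seconds|medium error",
    re.IGNORECASE,
)


def find_disk_errors(lines):
    """Return the lines that mention a disk-related kernel error."""
    return [ln.rstrip("\n") for ln in lines if DISK_PATTERNS.search(ln)]
```

To use it, open the log file (for example `/var/log/messages`, which usually needs root) and pass the file object to `find_disk_errors`, then look at the hits shortly before 09:39:13.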


No, nothing.


Hey, did you ever solve this? I've run into a similar problem.

It's probably still an I/O problem. I've decided to move the clog to an SSD, which should fix it.
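
To test the I/O theory before and after moving the clog, a rough Python probe like the one below can time small write-plus-fsync calls in a given directory. This is only a sketch, not how OceanBase itself detects a hang; `fsync_latencies` and its parameters are made up for illustration, and the directory you pass in should sit on the same disk as the clog.

```python
import os
import tempfile
import time


def fsync_latencies(directory, rounds=20, block_size=4096):
    """Write a small block and fsync it `rounds` times.

    Returns the latency of each write+fsync in seconds.
    The temporary probe file is removed afterwards.
    """
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory, prefix="fsync_probe_")
    try:
        block = b"\0" * block_size
        for _ in range(rounds):
            start = time.perf_counter()
            os.write(fd, block)
            os.fsync(fd)  # force the data to stable storage
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.remove(path)
    return latencies
```

Printing `max(fsync_latencies(some_dir))` for a directory on the clog disk gives a rough worst-case figure. Occasional stalls of several seconds would be consistent with the roughly 5-second hung times in the log above.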