ob_common_unexpected_internal_error

[Environment] Production
[OB or other components]
OCP Community Edition version: 4.3.3-20241219140415
OB cluster version used by OCP: 4.2.1.8
[Version in use]

[Problem description]
Alert event details

Alert rule
ob_common_unexpected_internal_error
Source
ob_common_unexpected_internal_error

Alert summary: alarm_template_id=0:ob_cluster=myocp-1735117217:host=10.0.202.93 OBServer unexpected internal error

Alert details: [OBServer unexpected internal error] Cluster: myocp, host: 10.0.202.93, log type: observer, log file: /home/admin/oceanbase/log/observer.log, log level: ERROR, keyword=Unexpected internal error happen, error code=4388, log details=[2025-03-13 23:06:35.232744] ERROR issue_dba_error (ob_log.cpp:1875) [4080407][T1002_L0_G0][T1002][YB420A00CA5D-00062B7F1A71A492-0-0] [lt=22][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-4002, file="ob_req_time_service.h", line_no=60, info="invalid start and end time").
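For anyone triaging the same alert, the fields in that log line can be pulled apart with a small script. This is only a sketch: the regular expression is an assumption derived from this single line, not an official description of the observer.log format, and `parse_alert_line` is a hypothetical helper name.

```python
import re
from datetime import datetime

# The log line from the alert details, written here with straight quotes.
LINE = (
    '[2025-03-13 23:06:35.232744] ERROR issue_dba_error (ob_log.cpp:1875) '
    '[4080407][T1002_L0_G0][T1002][YB420A00CA5D-00062B7F1A71A492-0-0] '
    '[lt=22][errcode=-4388] Unexpected internal error happen, please checkout '
    'the internal errcode(errcode=-4002, file="ob_req_time_service.h", '
    'line_no=60, info="invalid start and end time")'
)

# Assumed layout, inferred from the one line above.
PATTERN = re.compile(
    r'^\[(?P<ts>[^\]]+)\]\s+(?P<level>\w+)\s+.*?'
    r'\[errcode=(?P<outer>-?\d+)\].*?'
    r'\(errcode=(?P<inner>-?\d+),\s*file="(?P<file>[^"]+)",\s*'
    r'line_no=(?P<line_no>\d+),\s*info="(?P<info>[^"]*)"\)'
)


def parse_alert_line(line):
    """Return the parsed fields as a dict, or None if the line does not match."""
    # Alert notifications sometimes carry curly quotes; normalize them first.
    line = line.replace('\u201c', '"').replace('\u201d', '"')
    match = PATTERN.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields['ts'] = datetime.strptime(fields['ts'], '%Y-%m-%d %H:%M:%S.%f')
    for key in ('outer', 'inner', 'line_no'):
        fields[key] = int(fields[key])
    return fields


parsed = parse_alert_line(LINE)
print(parsed['outer'], parsed['inner'], parsed['file'], parsed['line_no'])
```

The outer code (-4388) is the generic "unexpected internal error" wrapper, while the inner code (-4002) together with the file, line, and info text is what points at the actual failing check.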
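[Reproduction steps] Operations performed before and after the problem appeared
There are no reproduction steps; I noticed it in an alert notification.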

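[Attachments and logs] We recommend collecting diagnostic information with obdiag, the OceanBase agile diagnostic tool. For details, see the link below (right-click to open):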

[SOP Series 22] Step one of troubleshooting (self-service diagnosis and diagnostic information collection)

@论坛小助手

Root cause analysis scenario: transaction rollback error
Root cause analysis command: obdiag rca run --scene=transaction_rollback

Log collection scenario: unknown problem
Log collection command: obdiag gather scene run --scene=observer.unknown

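To help us locate and resolve the problem, please run the obdiag commands above for root cause analysis and log collection, then share the collected logs with us so we can confirm the cause.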

To pin down the specific problem, we also need some information from you:

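  1. Can you provide the complete error log or its surrounding context, in particular the full stack information containing invalid start and end time?
  2. Was a specific SQL statement or operation running when the error occurred? If so, please provide the SQL.
  3. What is the current version of the OceanBase cluster?
  4. Have there been any recent configuration changes or changes in data volume?
  5. How often does the error occur? Is it intermittent, or is it triggered by every operation?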

Attached is the link to the usage guide for obdiag, the agile diagnostic tool

Is this the only error message? I suggest the following two steps:

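  1. Use obdiag, the agile diagnostic tool, to generate an inspection report with obdiag check run. Documentation: https://www.oceanbase.com/docs/common-obdiag-cn-1000000002488230
  2. Use obdiag to analyze the logs from roughly the 15 minutes before the error and see whether anything else went wrong: https://www.oceanbase.com/docs/common-obdiag-cn-1000000002488226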
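If obdiag is not at hand, the second step can be approximated by filtering the log by timestamp. The sketch below is an assumption-laden illustration, not a replacement for obdiag's own log analysis: it assumes every entry starts with a bracketed timestamp in the format seen in the alert, and the sample lines are made-up placeholders rather than real observer output. `window_before` and `lines_in_window` are hypothetical helper names.

```python
from datetime import datetime, timedelta

TS_FORMAT = '%Y-%m-%d %H:%M:%S.%f'


def window_before(alert_ts, minutes=15):
    """Return (start, end) covering the given number of minutes before alert_ts."""
    end = datetime.strptime(alert_ts, TS_FORMAT)
    return end - timedelta(minutes=minutes), end


def lines_in_window(lines, start, end):
    """Yield lines whose leading [timestamp] falls inside [start, end]."""
    for line in lines:
        close = line.find(']')
        if not line.startswith('[') or close == -1:
            continue  # continuation line or unexpected format
        try:
            ts = datetime.strptime(line[1:close], TS_FORMAT)
        except ValueError:
            continue  # bracketed prefix that is not a timestamp
        if start <= ts <= end:
            yield line


# Timestamp taken from the alert in this thread.
start, end = window_before('2025-03-13 23:06:35.232744')

# Placeholder lines for illustration only; they are not real log entries.
sample = [
    '[2025-03-13 22:40:00.000000] placeholder: before the window',
    '[2025-03-13 22:55:00.000000] placeholder: inside the window',
    'continuation line without a timestamp',
    '[2025-03-13 23:06:35.232744] placeholder: the alert instant',
    '[2025-03-13 23:10:00.000000] placeholder: after the window',
]
selected = list(lines_in_window(sample, start, end))
print(start, end, len(selected))
```

To run this against a real file, pass an open file object (for example the observer.log path from the alert) as `lines`. If the log has been rotated since the alert, the relevant entries may live in an older file, or may be gone entirely.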

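This is an OCP cluster installed from the all-in-one package. It is a single-node cluster used to manage other OB clusters, and the error was reported on OCP itself. It has been a while, and I can no longer find the error in the logs. The one-click inspection result is: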
+---------------------------------------------------------------------------------------------------------------------------------------------+
| critical-tasks-report |
+----------------------------+----------------------------------------------------------------------------------------------------------------+
| task | task_report |
+----------------------------+----------------------------------------------------------------------------------------------------------------+
| cluster.data_path_settings | [critical] [remote_10.0.202.93] data_dir_path is null . Please check your nodes.data_dir need absolute Path |
| disk.sstable_abnormal_file | [critical] [remote_10.0.202.93] sstable_dir_path is null . Please check your nodes.data_dir need absolute Path |
| network.TCP-retransmission | [critical] [remote_10.0.202.93] tsar is not installed. we can not check tcp retransmission. |
+----------------------------+----------------------------------------------------------------------------------------------------------------+
+------------------------------------------------------------------------------------------+
| warning-tasks-report |
+---------------------------------------------------+--------------------------------------+
| task | task_report |
+---------------------------------------------------+--------------------------------------+
| bugs.bug_385 | [warning] Unadapted by version. SKIP |
| cluster.ob_enable_plan_cache_bad_version | [warning] Unadapted by version. SKIP |
| cluster.optimizer_better_inlist_costing_parmmeter | [warning] Unadapted by version. SKIP |
| cluster.part_trans_action_max | [warning] Unadapted by version. SKIP |
| cluster.table_history_too_many | [warning] Unadapted by version. SKIP |
| system.instruction_set_avx2 | [warning] Unadapted by version. SKIP |
+---------------------------------------------------+--------------------------------------+
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| all-tasks-report |
+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------+
| task | task_report |
+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------+
| bugs.bug_182 | all pass |
| bugs.bug_385 | [warning] Unadapted by version. SKIP |
| bugs.bug_469 | all pass |
| clog.clog_disk_full | all pass |
| cluster.core_file_find | all pass |
| cluster.data_path_settings | [critical] [remote_10.0.202.93] data_dir_path is null . Please check your nodes.data_dir need absolute Path |
| cluster.deadlocks | all pass |
| cluster.global_indexes_too_much | all pass |
| cluster.major | all pass |
| cluster.mod_too_large | all pass |
| cluster.ob_enable_plan_cache_bad_version | [warning] Unadapted by version. SKIP |
| cluster.observer_not_active | all pass |
| cluster.optimizer_better_inlist_costing_parmmeter | [warning] Unadapted by version. SKIP |
| cluster.part_trans_action_max | [warning] Unadapted by version. SKIP |
| cluster.resource_limit_max_session_num | all pass |
| cluster.sys_log_level | all pass |
| cluster.table_history_too_many | [warning] Unadapted by version. SKIP |
| cluster.task_opt_stat | all pass |
| cluster.task_opt_stat_gather_fail | all pass |
| cluster.tenant_number | all pass |
| cpu.oversold | all pass |
| disk.clog_abnormal_file | all pass |
| disk.disk_full | all pass |
| disk.disk_hole | all pass |
| disk.disk_iops | all pass |
| disk.sstable_abnormal_file | [critical] [remote_10.0.202.93] sstable_dir_path is null . Please check your nodes.data_dir need absolute Path |
| disk.xfs_repair | all pass |
| err_code.find_err_4000 | all pass |
| err_code.find_err_4001 | all pass |
| err_code.find_err_4012 | all pass |
| err_code.find_err_4013 | all pass |
| err_code.find_err_4015 | all pass |
| err_code.find_err_4016 | all pass |
| err_code.find_err_4103 | all pass |
| err_code.find_err_4105 | all pass |
| err_code.find_err_4377 | all pass |
| network.TCP-retransmission | [critical] [remote_10.0.202.93] tsar is not installed. we can not check tcp retransmission. |
| system.aio | all pass |
| system.clock_source | all pass |
| system.core_pattern | all pass |
| system.dependent_software | all pass |
| system.dependent_software_swapon | all pass |
| system.getenforce | all pass |
| system.instruction_set_avx2 | [warning] Unadapted by version. SKIP |
| system.parameter | all pass |
| system.parameter_ip_local_port_range | all pass |
| system.parameter_tcp_rmem | all pass |
| system.parameter_tcp_wmem | all pass |
| system.ulimit_parameter | all pass |
| table.information_schema_tables_two_data | all pass |
| version.bad_version | all pass |
| version.old_version | all pass |
+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------+

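I don't know whether the critical items above indicate a real problem. This OCP cluster is a single node with no obproxy, and the installation directories are all defaults. When I inspect other clusters, none of these critical items show up.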
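Two of the critical messages ask to check that nodes.data_dir is an absolute path, so one thing worth ruling out is a missing or relative data_dir in the node settings the inspection was run with. The sketch below shows that check on plain Python dicts. The `nodes` structure, its keys, and the example paths are hypothetical stand-ins and do not reproduce obdiag's actual configuration schema, and nothing in this thread confirms that this is the cause here.

```python
import posixpath


def find_bad_data_dirs(nodes):
    """Return (ip, reason) pairs for nodes whose data_dir is unset or relative."""
    problems = []
    for node in nodes:
        data_dir = node.get('data_dir')
        if not data_dir:
            problems.append((node.get('ip'), 'data_dir is missing or empty'))
        elif not posixpath.isabs(data_dir):
            problems.append((node.get('ip'), 'data_dir is not absolute: ' + data_dir))
    return problems


# Hypothetical node entries; only the first IP comes from this thread.
nodes = [
    {'ip': '10.0.202.93'},                               # data_dir not set
    {'ip': '192.0.2.10', 'data_dir': 'store'},           # relative path
    {'ip': '192.0.2.11', 'data_dir': '/path/to/store'},  # absolute path, passes
]

for ip, reason in find_bad_data_dirs(nodes):
    print(ip, reason)
```

The third critical item (tsar is not installed) is a separate matter: the message itself says only that TCP retransmission could not be checked without that tool.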

The alert timestamp is 2025-03-13 23:06:35.232744. Is the observer.log from that time still available?