AVX issue: Rocky 9.5 VM still reports an error after CPU hardware virtualization was enabled

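【 Environment 】Production
【 OB or other component 】OB
【 Version 】Upgraded from 4.2.5.2 to 4.2.5.6
【Problem description】The system is Rocky 9.5. CPU hardware virtualization is already enabled for the virtual machine, but the following error is still reported: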
| system.instruction_set_avx | [critical] CPU on remote_192_168_18_21 does not support AVX instruction set. if you want to use observer, please upgrade the observer version to ‘4.2.5.6 or later’ or ‘4.3.5.4 or later’ or ‘4.4.1.0 or later’ |
【Steps to reproduce】Operations performed before and after the problem occurred
【Attachments and logs】

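In this case, the core problem is that the check found that the CPU does not support the AVX instruction set. A possible cause is that the kernel or the virtualization layer does not recognize the full feature set of the CPU, even though hardware virtualization support is enabled for the virtual machine. As a first step, you can use the obdiag tool to collect system information so that we can diagnose the problem further.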
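To cross-check what the guest kernel actually sees, independently of obdiag, the CPU feature flags can be read from `/proc/cpuinfo`. The following is a minimal sketch and not obdiag's own check; the function names `cpu_flags` and `has_avx` are hypothetical, and it assumes an x86 Linux guest where the feature list appears on a `flags` line.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    # No 'flags' line found (assumption: non-x86 or unexpected format).
    return set()


def has_avx(cpuinfo_text):
    """True only if the exact flag 'avx' is present (avx2 alone does not count)."""
    return "avx" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        text = path.read_text()
        print("avx:", has_avx(text), "avx2:", "avx2" in cpu_flags(text))
    else:
        print("/proc/cpuinfo not found (not a Linux host?)")
```

If `avx` is missing from this output, the flag is not visible to the guest operating system, which would point toward the hypervisor or VM CPU configuration rather than OceanBase itself.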

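We recommend collecting logs with obdiag so that we can determine the specific cause of the problem more accurately.

Log collection scenario: basic cluster information
Log collection command: obdiag gather scene run --scene=observer.base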

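Please run the command above first and upload the resulting log archive. In addition, to help us locate the problem more effectively, please answer the following questions:

  1. What is the exact model of the virtual machine, and what is its CPU model?
  2. Is a newer BIOS update available, and has it been applied?
  3. Did the same error appear before OceanBase was upgraded?
  4. Have you tried installing OceanBase directly on a physical machine with the same configuration to verify whether the same problem occurs?
  5. Can you describe the exact upgrade procedure, including the scripts or commands used?
  6. What is the version of the virtual machine management software?

The answers to these questions will help us understand your environment in more depth and provide a more effective solution.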

For reference, see the help documentation for the obdiag agile diagnostic tool.

| bugs.bug_469 | [critical] [remote_192_168_18_21] glibc version need less 2.34. This may cause observer crash. More information: [Enhancement]: obdiag check 巡检增加 glibc 版本检查项 · Issue #469 · oceanbase/obdiag · GitHub |
| system.core_pattern | [critical] [remote_192_168_18_21] kernel.core_pattern: |/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h , is not recommended for configuring functions other than the specified core path |
| system.instruction_set_avx | [critical] CPU on remote_192_168_18_21 does not support AVX instruction set. if you want to use observer, please upgrade the observer version to ‘4.2.5.6 or later’ or ‘4.3.5.4 or later’ or ‘4.4.1.0 or later’ |
| system.mount_options | [critical] node: remote_192_168_18_21 /home/obbak mount option sync is not exist |
| | [critical] node: remote_192_168_18_21 /home/obbak mount option lookupcache=positive is not exist |
| | [critical] node: remote_192_168_18_21 /home/obbak mount option nfsvers=4.1 is not exist |
| network.TCP-retransmission | [critical] [remote_192_168_18_21] tsar is not installed. we can not check tcp retransmission. |
| network.network_drop | [critical] [remote_192_168_18_21] network: ens192 RX drop is not 0, please check by ip -s link show ens192 |
| network.network_write_cond_wakeup | [critical] Found 7 ‘write cond wakeup’ occurrences in observer.log on remote_192_168_18_21. This indicates potential network issues between client and OBServer. Please check network connectivity and performance. |
| cluster.data_path_settings | [critical] [remote_192_168_18_21] ip:192.168.18.21 ,data_dir and log_dir_disk are on the same disk. |

warning-tasks-report

+-------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| task | task_report |
+-------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| system.check_command | [warning] node: remote_192_168_18_21 nslookup is not existed. bind-utls DNS toolset. Suggested installation |
| | [warning] node: remote_192_168_18_21 mtr is not existed. Network testing tools. Suggested installation |
| system.parameter | [warning] [remote_192_168_18_21] vm.max_map_count : 65530. recommended:327680 ≤ value ≤ 1000000. |
| | [warning] [remote_192_168_18_21] fs.pipe-user-pages-soft : 16384. recommended: 0. |
| system.python_version | [warning] node: remote_192_168_18_21 python version: 3.9.19. OceanBase related scripts depend on Python 2.7. x |
| system.ulimit_parameter | [warning] [remote_192_168_18_21] On ip : 192.168.18.21, ulimit -u as “max user processes” is 655350 . recommended: 655360. |
| | [warning] [remote_192_168_18_21] On ip : 192.168.18.21, ulimit -n as “open files” is 131072 . recommended: unlimited. |
| | [warning] [remote_192_168_18_21] On ip : 192.168.18.21, ulimit -s as “stack size” is 8192 . recommended: unlimited. |
| tenant.macroblock_utilization_rate_tenant | [warning] tenant: roc ratio: 0.74, dataSize: 161.00G, requiredSize: 219.01G. need major |
| tenant.parameters_default | [warning] the enable_record_trace_log value: False, default_value: True |
| | [warning] the cluster_id value: 4, default_value: 0 |
| | [warning] the data_dir value: /home/ob/oceanbase/store/backupob, default_value: store |
| | [warning] the _enable_mysql_compatible_dates tenant_ids: 1,1004, value: True, default_value: False |
| | [warning] the _enable_dbms_job_package value: False, default_value: True |
| | [warning] the system_memory value: 6G, default_value: 0M |
| | [warning] the devname value: ens192, default_value: bond0 |
| | [warning] the _parallel_ddl_control tenant_ids: 1004, value: TRUNCATE_TABLE:ON, SET_COMMENT:ON, CREATE_INDEX:ON, CREATE_VIEW:ON, DROP_TABLE:ON, default_value: |
| | [warning] the config_additional_dir value: /home/ob/log/backupob/etc2;/home/ob/data/backupob/etc3, default_value: etc2;etc3 |
| | [warning] the cpu_quota_concurrency tenant_ids: 1, value: 10, default_value: 4 |
| | [warning] the enable_ps_parameterize tenant_ids: 1004, value: False, default_value: True |
| | [warning] the observer_id value: 1, default_value: 0 |
| | [warning] the max_syslog_file_count value: 300, default_value: 0 |
| | [warning] the large_query_threshold value: 600s, default_value: 5s |
| | [warning] the cluster value: backupob, default_value: obcluster |
| | [warning] the partition_balance_schedule_interval tenant_ids: 1004, value: 0, default_value: 2h |
| | [warning] the log_disk_size value: 360G, default_value: 0M |
| | [warning] the _enable_ddl_worker_isolation tenant_ids: 1004, value: True, default_value: False |
| | [warning] the enable_syslog_recycle value: True, default_value: False |
| cluster.large_query_threshold | [warning] svr_ip: 192.168.18.21. large_query_threshold is 600s, recommended value is 5s. |
| cluster.mod_too_large | [warning] [cluster:rocob] mod max memory over 10G,Please check on oceanbase.__all_virtual_memory_info to find some large mod |
| cluster.cgroup | [warning] node: remote_192_168_18_21 cgroup path is not exist. Tenant isolation not enabled. issue #849 |
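Because the pasted report has lost its column alignment, it can help to regroup the rows by task before reading them. The sketch below assumes each row has the form `| task | task_report |` and that a row with an empty task cell continues the previous task; `group_report_rows` is a hypothetical helper and not part of obdiag.

```python
from collections import defaultdict


def group_report_rows(lines):
    """Group report messages by task name from '| task | task_report |' rows."""
    grouped = defaultdict(list)
    current_task = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("|"):
            continue  # skip borders, titles, and blank lines
        # Split only on the first separator: messages may contain '|' themselves.
        task, _, report = line[1:].partition("|")
        task = task.strip()
        report = report.strip()
        if report.endswith("|"):
            report = report[:-1].rstrip()  # drop the closing cell border
        if task == "task":
            continue  # header row
        if task:
            current_task = task  # an empty task cell keeps the previous task
        if current_task is None or not report:
            continue
        grouped[current_task].append(report)
    return dict(grouped)


if __name__ == "__main__":
    rows = [
        "| system.parameter | [warning] vm.max_map_count : 65530. |",
        "| | [warning] fs.pipe-user-pages-soft : 16384. |",
    ]
    for task, messages in group_report_rows(rows).items():
        print(task, len(messages))
```

Splitting on the first separator only keeps messages such as the `kernel.core_pattern` value, which itself begins with `|`, in one piece.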

Which method did you use for the upgrade?
Please run SHOW VARIABLES LIKE 'version_comment'; and post the exact version information.

version_comment OceanBase_CE 4.2.5.6 (r106000022025082510-69711158c2c965d735f9513257aaccab8b921834) (Built Aug 25 2025 10:15:07)

The upgrade was done with OCP's version upgrade feature.

One more question, about the message "glibc version need less 2.34. This may cause observer crash."

ldd (GNU libc) 2.34

Does this have any impact?
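The version reported by `ldd` can be compared against the 2.34 threshold named in the message. This is a minimal sketch under one stated assumption: that the check fires when the version is not below 2.34, as the wording "need less 2.34" suggests. The function names are hypothetical and this is not obdiag's implementation.

```python
import re


def parse_glibc_version(ldd_first_line):
    """Extract (major, minor) from a line such as 'ldd (GNU libc) 2.34'."""
    match = re.search(r"(\d+)\.(\d+)\s*$", ldd_first_line.strip())
    if match is None:
        raise ValueError(f"no version found in: {ldd_first_line!r}")
    return int(match.group(1)), int(match.group(2))


def triggers_glibc_warning(version, threshold=(2, 34)):
    # Assumption: the warning applies when the version is NOT below the threshold.
    return version >= threshold


if __name__ == "__main__":
    version = parse_glibc_version("ldd (GNU libc) 2.34")
    print(version, triggers_glibc_warning(version))
```

Comparing integer tuples rather than strings avoids the case where "2.9" would sort above "2.34".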

Please check whether the machine's CPU exposes the AVX instruction set. Also, how was OB 4.2.5.2 deployed originally?

4.2.5.2 was deployed with OCP.

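On CentOS 7.9, with "Expose hardware assisted virtualization to the guest OS" enabled, AVX is available.
On Rocky 9.5, with exactly the same setting, it is not.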

Please check it with lscpu | grep avx.
Also, please post the OCP version information.

Would cpuid | grep -i avx work as well?

So it is the obdiag inspection that reports the AVX instruction set as unsupported? Which obdiag version are you using?

obdiag version: 3.6.0

Have you already upgraded with OCP, or is it only the obdiag inspection that reports AVX as unsupported?

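The inspection already reported this error on 4.2.5.2.
I have now upgraded OB from 4.2.5.2 to 4.2.5.6 with OCP and run the obdiag inspection again, with the same result. The obdiag report is posted above.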
| bugs.bug_469 | [critical] [remote_192_168_18_21] glibc version need less 2.34. This may cause observer crash. More information: [Enhancement]: obdiag check 巡检增加 glibc 版本检查项 · Issue #469 · oceanbase/obdiag · GitHub |
| system.core_pattern | [critical] [remote_192_168_18_21] kernel.core_pattern: |/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h , is not recommended for configuring functions other than the specified core path |
| system.instruction_set_avx | [critical] CPU on remote_192_168_18_21 does not support AVX instruction set. if you want to use observer, please upgrade the observer version to ‘4.2.5.6 or later’ or ‘4.3.5.4 or later’ or ‘4.4.1.0 or later’ |

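To confirm my understanding: the OCP upgrade completed without errors and did not report any AVX problem, and it is only the obdiag inspection that reports the AVX instruction set as unsupported. Is that right?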

Yes.

OK, we will look into it first.