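【 Environment 】Production
【 OB or other component 】OceanBase Community Edition
【 Version 】OceanBase Community Edition 4.1.0
【 Problem description 】
The automatic merge (major compaction) that the sys tenant started at 02:00 on May 11 has not finished, and the automatic merge that the regular tenant started at 02:00 on May 4 has not finished either.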
Please check the logs:
grep "check merge progress success" rootservice.log | tail -10
grep 'WARN ' observer.log | grep merge
Then run the following queries and post the results:
SELECT * FROM __all_zone WHERE name LIKE '%merge%';
select * from __all_zone where name = 'is_merge_error' or info = 'ERROR';
select * from __all_rootservice_event_history where module = 'daily_merge' and event like '%merge_error%' order by gmt_create desc limit 1;
select * from __all_zone where name = 'global_broadcast_version' or name = 'broadcast_version';
select svr_ip,count(*) from __all_virtual_meta_table group by svr_ip ;
[admin@obdb1 log]$ grep "check merge progress success" rootservice.log* | tail -10
[admin@obdb1 log]$ grep "WARN" observer.log* | grep merge
[admin@obdb2 log]$ grep "check merge progress success" rootservice.log* | tail -10
[admin@obdb2 log]$ grep "WARN" observer.log* | grep merge
[admin@obdb3 log]$ grep "check merge progress success" rootservice.log* | tail -10
[admin@obdb3 log]$ grep "WARN" observer.log* | grep merge
Note: observer.log currently only contains entries from May 15 onward.
Please check the following:
select * from GV$OB_COMPACTION_DIAGNOSE_INFO;
select * from GV$OB_COMPACTION_PROGRESS;
select * from __all_virtual_tablet_meta_table where tenant_id = 1002 and compaction_scn < 1683136804641344843;
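For context on the literal in the last query: assuming SCN values in OceanBase 4.x are Unix timestamps in nanoseconds (an assumption, not something stated in this thread), the threshold can be converted to wall-clock time. This is only a sketch; the helper name `scn_to_utc` is hypothetical and it relies on GNU `date`.

```shell
# Hypothetical helper: convert an SCN to a UTC timestamp.
# Assumption: the SCN is a Unix timestamp expressed in nanoseconds.
scn_to_utc() {
    local scn="$1"
    # Integer-divide by 1e9 to get whole seconds, then format with GNU date.
    date -u -d "@$(( scn / 1000000000 ))" '+%Y-%m-%d %H:%M:%S'
}

# compaction_scn threshold taken from the query above
scn_to_utc 1683136804641344843
```

Under that assumption the threshold is 2023-05-03 18:00:04 UTC, which is 02:00:04 on May 4 in UTC+8 and lines up with the regular tenant's merge that was launched at 02:00 on May 4.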
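It looks like the standby-read timestamp is not advancing, so the minor compaction (dump) that must run before the merge cannot execute.
We will ask a colleague from the transaction team to confirm.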
1.
SELECT gmt_create,svr_ip,svr_port,event,name3,value3 FROM __all_server_event_history WHERE module='ELECTION' AND value1=1002 AND value2=1 ORDER BY gmt_create;
Use the result to identify which node is the leader, and please post the query result as well.
Then collect rootservice.log from that node.
2. Collect the observer logs from these two servers: 10.168.89.11 and 10.168.89.12.
3. select * from __all_virtual_dag_warning_history where tenant_id=1002;
查询结果.txt (4.4 KB)
election_89.13.tar.gz (2.6 MB)
election_89.11.tar.gz (1.4 MB)
observer_89_11.tar.gz (3.1 MB)
election_89.12.tar.gz (240.2 KB)
observer_89_12.tar.gz (2.7 MB)
The observer.log files are large, so for now I am providing the latest 50,000 lines. observer.log currently contains quite a few errors, for example:
Unexpected internal error happen, please checkout the internal errcode(errcode=-4103, file="log_iterator_impl.h", line_no=700, info="verify accumlate checksum failed")
Log out of disk space(msg="log disk space is almost full", ret=-4264, total_size(MB)=14745, used_size(MB)=12945, used_percent(%)=87, warn_size(MB)=11796, warn_percent(%)=80, limit_size(MB)=14008, limit_percent(%)=95, maximum_used_size(MB)=12945, maximum_log_stream=1, oldest_log_stream=1, oldest_scn={val:1683731979634087599})
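The percentages in the "log disk space is almost full" message can be re-derived from its MB figures. A minimal sketch follows; the function name `pct` is hypothetical, and truncating integer division is an assumption about how `used_percent` is computed (it does reproduce the logged value).

```shell
# Hypothetical helper: integer percentage of part relative to total (truncating).
pct() {
    echo $(( $1 * 100 / $2 ))
}

total_mb=14745   # total_size(MB) from the log line
used_mb=12945    # used_size(MB)
warn_mb=11796    # warn_size(MB)
limit_mb=14008   # limit_size(MB)

echo "used: $(pct "$used_mb" "$total_mb")%"              # 87, already above the warn threshold
echo "warn: $(pct "$warn_mb" "$total_mb")%"              # 80
echo "MB left before the limit: $(( limit_mb - used_mb ))"   # 1063
```

In other words, this tenant's log disk is past the 80% warning line and roughly 1 GB short of the 95% limit reported in the same message.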
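On May 10 an observer replacement was performed: first 89.12 was replaced with 89.14, and then 89.14 was replaced back with 89.12.
Please check whether the log disk on node 12 is full; run df -h to see.
Also please run select * from __all_server;
If the directory is full, check whether any files other than OceanBase's own are taking up space.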
[admin@obdb2 log]$ df -mh
Filesystem                 Size  Used Avail Use% Mounted on
/dev/mapper/rhel-root      300G  7.6G  293G   3% /
devtmpfs                    63G     0   63G   0% /dev
tmpfs                       63G     0   63G   0% /dev/shm
tmpfs                       63G   98M   63G   1% /run
tmpfs                       63G     0   63G   0% /sys/fs/cgroup
/dev/sda2                  194M  113M   81M  59% /boot
/dev/sda1                  200M  9.8M  191M   5% /boot/efi
/dev/mapper/vg_ob-lv_pro   300G  195G  106G  65% /data/oceanbase/product
/dev/mapper/vg_ob-lv_red   200G  180G   20G  91% /data/oceanbase/redolog
/dev/mapper/vg_ob-lv_sto   1.2T  1.1T  120G  91% /data/oceanbase/storage
/dev/mapper/rhel-gpdata    6.2T   34M  6.2T   1% /data/soft
tmpfs                       13G     0   13G   0% /run/user/6100
tmpfs                       13G     0   13G   0% /run/user/0
[admin@obdb2 log]$ cd /data/oceanbase/redolog/ob/obcluster/
[admin@obdb2 obcluster]$ ls
clog etc2
[admin@obdb2 obcluster]$ du -sh *
180G clog
8.0K etc2
[admin@obdb2 obcluster]$ cd clog/
[admin@obdb2 clog]$ du -sh *
97G log_pool
3.7G tenant_1
13G tenant_1001
67G tenant_1002
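To pick out the nearly full mounts from df output like the listing above without reading it by eye, a small filter can help. This is only a sketch: the function name `mounts_over` and the default threshold of 90 are arbitrary choices for illustration, and it assumes mount points without spaces.

```shell
# Hypothetical helper: print mount points whose Use% is at or above a threshold.
# Reads df-style output on stdin; the threshold defaults to 90.
mounts_over() {
    awk -v limit="${1:-90}" 'NR > 1 {
        use = $5
        sub(/%/, "", use)                  # strip the percent sign
        if (use + 0 >= limit) print $6, $5 # mount point, then usage
    }'
}

# Usage on a live host (df -P keeps each filesystem on one line):
df -P | mounts_over 90
```

Against the df output posted above, this would list /data/oceanbase/redolog and /data/oceanbase/storage, both at 91%.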
mysql> select * from __all_server;
+----------------------------+----------------------------+--------------+----------+----+-------+------------+-----------------+--------+-----------------------+------------------------------------------------------------------------------------------+-----------+--------------------+--------------+----------------+
| gmt_create                 | gmt_modified               | svr_ip       | svr_port | id | zone  | inner_port | with_rootserver | status | block_migrate_in_time | build_version                                                                            | stop_time | start_service_time | first_sessid | with_partition |
+----------------------------+----------------------------+--------------+----------+----+-------+------------+-----------------+--------+-----------------------+------------------------------------------------------------------------------------------+-----------+--------------------+--------------+----------------+
| 2023-04-20 19:22:06.756554 | 2023-04-20 19:25:52.897573 | 10.168.89.11 |     2982 |  1 | zone1 |       2981 |               1 | ACTIVE |                     0 | 4.1.0.0_100000202023040520-0765e69043c31bf86e83b5d618db0530cf31b707(Apr 5 2023 20:26:14) |         0 |   1681989951900812 |            0 |              1 |
| 2023-05-10 16:47:28.757269 | 2023-05-10 16:50:42.757071 | 10.168.89.12 |     2982 |  7 | zone2 |       2981 |               0 | ACTIVE |                     0 | 4.1.0.0_100000202023040520-0765e69043c31bf86e83b5d618db0530cf31b707(Apr 5 2023 20:26:14) |         0 |   1683708604272665 |            0 |              1 |
| 2023-04-20 19:22:06.793627 | 2023-04-20 19:25:53.293155 | 10.168.89.13 |     2982 |  3 | zone3 |       2981 |               0 | ACTIVE |                     0 | 4.1.0.0_100000202023040520-0765e69043c31bf86e83b5d618db0530cf31b707(Apr 5 2023 20:26:14) |         0 |   1681989952296456 |            0 |              1 |
+----------------------------+----------------------------+--------------+----------+----+-------+------------+-----------------+--------+-----------------------+------------------------------------------------------------------------------------------+-----------+--------------------+--------------+----------------+
3 rows in set (0.00 sec)
Please collect the rootservice.log from node 10.168.89.11.
Also run this query: select * from __all_virtual_disk_stat;
mysql> select * from __all_virtual_disk_stat;
+--------------+----------+---------------+-------------+---------------+---------------+---------------------+
| svr_ip       | svr_port | total_size    | used_size   | free_size     | is_disk_valid | disk_error_begin_ts |
+--------------+----------+---------------+-------------+---------------+---------------+---------------------+
| 10.168.89.12 |     2982 | 1159066550272 |  7851737088 | 1151210618880 |             1 |                   0 |
| 10.168.89.11 |     2982 | 1159066550272 | 10045358080 | 1149016997888 |             1 |                   0 |
| 10.168.89.13 |     2982 | 1159066550272 |  9208594432 | 1149853761536 |             1 |                   0 |
+--------------+----------+---------------+-------------+---------------+---------------+---------------------+
3 rows in set (0.05 sec)
rootservice89_11.tar.gz (4.6 MB)
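Hello, could you tell us whether the operating system is x86 or ARM?
Please also tell us the models of the log disk and the data disk.
Then run lsblk.
The operating system is x86.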
[root@obdb1 ~]# uname -a
Linux obdb1 3.10.0-693.el7.x86_64 #1 SMP Thu Jul 6 19:56:57 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux
[root@obdb1 ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.4 (Maipo)
[root@obdb1 ~]# lsblk
NAME               MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                  8:0    0  6.6T  0 disk
├─sda1               8:1    0  200M  0 part /boot/efi
├─sda2               8:2    0  200M  0 part /boot
└─sda3               8:3    0  6.6T  0 part
  ├─rhel-root      253:0    0  300G  0 lvm  /
  ├─rhel-swap      253:1    0  128G  0 lvm  [SWAP]
  └─rhel-gpdata    253:5    0  6.1T  0 lvm  /data/soft
sdb                  8:16   0  1.8T  0 disk
├─vg_ob-lv_pro     253:2    0  300G  0 lvm  /data/oceanbase/product
├─vg_ob-lv_red     253:3    0  200G  0 lvm  /data/oceanbase/redolog
└─vg_ob-lv_sto     253:4    0  1.2T  0 lvm  /data/oceanbase/storage
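The log disk and the data disk are LVM volumes built on IBM SSDs.
Please take another look when you have time.
Has the disk been expanded? Is the problem resolved now?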