【 Environment 】Production or test environment
Test environment
【 OB or other components 】
k8s v1.8.14
docker 18.06.1-ce
【 Version 】
Image versions:
oceanbasedev/oceanbase-cn:4.0.0-snapshot-20230113112829
oceanbase/obagent:1.2.0
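【Problem description】Describe the problem clearly and precisely
I deployed an OB cluster on a k8s cluster. After the deployment finished, the observer in the ob-server pod failed to start, and the log shows "fail to reserve log pool(ret=-4009, ret="OB_IO_ERROR")".
【Steps to reproduce】Operations performed before and after the problem appeared
1. kubectl apply -f crd.yaml
2. kubectl apply -f local-path-storage.yaml
3. kubectl create ns obcluster
【Symptoms and impact】
The observer inside the pod fails to start, so the cluster deployment fails.
【Attachments】
Newly registered users cannot upload attachments, so the log and the yaml file are pasted below as text.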
observer.log
[2023-01-18 10:15:58.212417] ERROR [CLOG] resize (ob_server_log_block_mgr.cpp:197) [51][][T0][Y0-0000000000000000-0-0] [lt=41] do_resize_ failed(ret=-4009, this={dir::"/home/admin/oceanbase/store//clog/log_pool", dir_fd:14, meta_fd:15, log_pool_meta:{curr_total_size:0, next_total_size:0, status:0}, min_block_id:0, max_block_id:0, is_inited:true}, old_log_pool_meta={curr_total_size:0, next_total_size:0, status:0}, new_log_pool_meta={curr_total_size:0, next_total_size:42949672960, status:1}) BACKTRACE:0xc9699fb 0xc5d6ba6 0x45e4e98 0x45e4b92 0x45e498f 0x45dd0ff 0x4ce93b6 0x4ce8663 0x4ce8295 0x677bb00 0x45beff4 0x7f624395a463 0x45bdd41
[2023-01-18 10:15:58.212466] ERROR [CLOG] reserve (ob_server_log_block_mgr.cpp:155) [51][][T0][Y0-0000000000000000-0-0] [lt=47] resize failed(ret=-4009, this={dir::"/home/admin/oceanbase/store//clog/log_pool", dir_fd:14, meta_fd:15, log_pool_meta:{curr_total_size:0, next_total_size:0, status:0}, min_block_id:0, max_block_id:0, is_inited:true}) BACKTRACE:0xc9699fb 0xc5d6ba6 0x45e4321 0x45e4024 0x45e3e37 0x45c89eb 0x4ce87c1 0x4ce82e1 0x677bb00 0x45beff4 0x7f624395a463 0x45bdd41
[2023-01-18 10:15:58.212505] ERROR [SERVER] start (ob_server.cpp:598) [51][][T0][Y0-0000000000000000-0-0] [lt=30] fail to reserve log pool(ret=-4009, ret="OB_IO_ERROR") BACKTRACE:0xc9699fb 0xc5d6ba6 0x45e4321 0x45e4024 0x45e3e37 0x45c89eb 0x677d22f 0x677c19c 0x45beff4 0x7f624395a463 0x45bdd41
[2023-01-18 10:15:58.212538] INFO [SERVER] start (ob_server.cpp:767) [51][][T0][Y0-0000000000000000-0-0] [lt=30] check if multi tenant synced(ret=-4009, ret="OB_IO_ERROR", stop_=true, synced=false)
[2023-01-18 10:15:58.212552] INFO [SERVER] start (ob_server.cpp:795) [51][][T0][Y0-0000000000000000-0-0] [lt=13] check if schema ready(ret=-4009, ret="OB_IO_ERROR", stop_=true, schema_ready=false)
[2023-01-18 10:15:58.212561] INFO [SERVER] start (ob_server.cpp:805) [51][][T0][Y0-0000000000000000-0-0] [lt=8] check if timezone usable(ret=-4009, ret="OB_IO_ERROR", stop_=true, timezone_usable=false)
[2023-01-18 10:15:58.212574] INFO [SERVER] start (ob_server.cpp:810) [51][][T0][Y0-0000000000000000-0-0] [lt=12] [NOTICE] check if sys srs usable(ret=-4009, stop_=true)
[2023-01-18 10:15:58.212581] ERROR [SERVER] start (ob_server.cpp:824) [51][][T0][Y0-0000000000000000-0-0] [lt=7] failure occurs, try to set stop and wait(ret=-4009, ret="OB_IO_ERROR") BACKTRACE:0xc9699fb 0xc5d6ba6 0x45e4321 0x45e4024 0x45e3e37 0x45c89eb 0x6780cd9 0x677c75d 0x45beff4 0x7f624395a463 0x45bdd41
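For reference, the failing resize asks for next_total_size:42949672960. Below is a rough sketch for checking that this number corresponds to the log_disk_size of "40G" set in obcluster.yaml. The helper name requested_log_pool_gib and the shortened sample line are only for illustration and are not part of OceanBase.

```python
import re

# Sample line abridged from the observer.log excerpt above (illustrative only).
LOG_LINE = (
    "do_resize_ failed(ret=-4009, "
    "old_log_pool_meta={curr_total_size:0, next_total_size:0, status:0}, "
    "new_log_pool_meta={curr_total_size:0, next_total_size:42949672960, status:1})"
)


def requested_log_pool_gib(line: str) -> float:
    """Return the last next_total_size value found in a log line, in GiB."""
    sizes = re.findall(r"next_total_size:(\d+)", line)
    if not sizes:
        raise ValueError("no next_total_size field found")
    # The last occurrence belongs to new_log_pool_meta, i.e. the target size.
    return int(sizes[-1]) / 1024**3


print(requested_log_pool_gib(LOG_LINE))  # 40.0
```

So the observer is trying to reserve exactly 40 GiB for the log pool, which matches the configured log_disk_size, and the reservation itself is what returns OB_IO_ERROR.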
obcluster.yaml
apiVersion: cloud.oceanbase.com/v1
kind: OBCluster
metadata:
  name: ob-test
  namespace: obcluster
spec:
  imageRepo: 10.1.1.56:1080/dxhydev/oceanbasedev/oceanbase-cn
  tag: 4.0.0
  imageObagent: 10.1.1.56:1080/dxhydev/oceanbase/obagent:1.2.0
  clusterID: 1
  topology:
    - cluster: cn
      zone:
      - name: zone1
        region: region1
        nodeSelector:
          topology.kubernetes.io/zone: zone1
        replicas: 1
      - name: zone2
        region: region1
        nodeSelector:
          topology.kubernetes.io/zone: zone2
        replicas: 1
      - name: zone3
        region: region1
        nodeSelector:
          topology.kubernetes.io/zone: zone2
        replicas: 1
      parameters:
        - name: log_disk_size
          value: "40G"
  resources:
    cpu: 2
    memory: 5Gi
    storage:
      - name: data-file
        storageClassName: "nfs-data"
        size: 50Gi
      - name: data-log
        storageClassName: "nfs-data"
        size: 50Gi
      - name: log
        storageClassName: "nfs-data"
        size: 30Gi
      - name: obagent-conf-file
        storageClassName: "nfs-data"
        size: 1Gi
    volume:
      name: backup
      path: /data/nfs