OB4.2.1.8社区LOAD DATA旁路导入不支持索引和分区

【 使用环境 】 测试环境
【 OB or 其他组件 】OB
【 使用版本 】OB4.2.1.8社区版
【问题描述】LOAD DATA旁路导入不支持索引和分区,旁路导入的表格中有索引或者分区是会报错

【附件及日志】推荐使用OceanBase敏捷诊断工具obdiag收集诊断信息,详情参见链接(右键跳转查看):
建表sql如下:
CREATE TABLE hg38_splice_ai_20220217 (
chr varchar(32) DEFAULT NULL,
pos bigint(20) NOT NULL,
id varchar(10) DEFAULT NULL,
ref varchar(10) DEFAULT NULL,
alt varchar(10) DEFAULT NULL,
qual varchar(10) DEFAULT NULL,
filter varchar(10) DEFAULT NULL,
allele varchar(10) DEFAULT NULL,
gene varchar(30) DEFAULT NULL,
ds_ag decimal(6,5) DEFAULT NULL,
ds_al decimal(6,5) DEFAULT NULL,
ds_dg decimal(6,5) DEFAULT NULL,
ds_dl decimal(6,5) DEFAULT NULL,
dp_ag int(11) DEFAULT NULL,
dp_al int(11) DEFAULT NULL,
dp_dg int(11) DEFAULT NULL,
dp_dl int(11) DEFAULT NULL
) DEFAULT CHARSET = utf8mb4 ROW_FORMAT = DYNAMIC COMPRESSION = ‘zstd_1.3.8’ REPLICA_NUM = 1 BLOCK_SIZE = 16384 USE_BLOOM_FILTER = FALSE TABLET_SIZE = 134217728 PCTFREE = 0
partition by list(pos%5)
(partition p0 values in (0),
partition p1 values in (1),
partition p2 values in (2),
partition p3 values in (3),
partition p4 values in (4))

导入报错如下:

Server version: 5.7.25 OceanBase_CE 4.2.1.8 (r108000022024072217-3149c25ca2dadbb7707686ad02a1367b1b43e0b5) (Built Jul 23 2024 02:01:58)

Copyright (c) 2000, 2024, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type ‘help;’ or ‘\h’ for help. Type ‘\c’ to clear the current input statement.

mysql> use falcon-dev
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> show tables;
±------------------------+
| Tables_in_falcon-dev |
±------------------------+
| hg38_splice_ai_20220217 |
| tbl1 |
±------------------------+
2 rows in set (0.01 sec)

mysql> source ./hg38_splice_ai_20220217.sql
Query OK, 0 rows affected (0.05 sec)

Query OK, 0 rows affected, 1 warning (3.98 sec)

ERROR:
No query specified

mysql> LOAD DATA /*+ direct(true,1024) parallel(16) /INFILE ‘/home/admin/hg38_splice_ai_20220217.2612.tsv’ INTO TABLE tbl1 FIELDS TERMINATED BY ‘\t’;
ERROR 1172 (42000): Result consisted of more than one row
mysql> LOAD DATA /
+ direct(true,1024) parallel(16) */INFILE ‘/home/admin/hg38_splice_ai_20220217.2612.tsv’ INTO TABLE tbl1 FIELDS TERMINATED BY ‘\t’;
ERROR 1172 (42000): Result consisted of more than one row

先参考下这个帖子
【旁路导入】LoadData 导入一直报错`ERROR 1172 (42000): Result consisted of more than one row` - #16,来自 bulk_load_testp2nmzz_gaMTIwMjU5NTAzNC4xNzE0OTc1MTY1_ga_T35KTM57DZ*MTcyNDcyMjM4OC4xNjMuMS4xNzI0NzI1MDY1LjYwLjAuMA…

另外发下load data相关的完整日志 以及 数据格式

我的导入文件是用tab分割的,我OB的版本4.2.1.8 我看您推荐的帖子是4.2.2不知道这个会不会有影响
下面是我看过帖子之后的两次修改:
第一次:
mysql> LOAD DATA /*+ direct(false,11) parallel(16) */INFILE ‘/home/admin/hg38_splice_ai_20220217.2612.tsv’ INTO TABLE hg38_splice_ai_20220217 FIELDS TERMINATED BY ‘\t’ LINES TERMINATED BY ‘\n’ (chr, pos, id, ref, alt, qual, filter, allele, gene, ds_ag, ds_al, ds_dg, ds_dl, dp_ag, dp_al, dp_dg, dp_dl);
ERROR 1235 (0A000): Not supported feature or function

第二次:
mysql> LOAD DATA /*+ direct(false,11) parallel(16) */INFILE ‘/home/admin/hg38_splice_ai_20220217.2612.tsv’ INTO TABLE hg38_splice_ai_20220217 FIELDS TERMINATED BY ‘\t’ LINES TERMINATED BY ‘\n’;
ERROR 1172 (42000): Result consisted of more than one row

原始文件截取行:
admin@iZwz94mqx5zw8nt7d8sfnnZ:~$ head hg38_splice_ai_20220217.2612.tsv
chr10 134786 . C CA . . CA ZMYND11 0.00 0.00 0.00 0.07 2 -10 15 0
chr10 134786 . C CC . . CC ZMYND11 0.00 0.00 0.00 0.13 0 -10 15 0
chr10 134786 . C CG . . CG ZMYND11 0.00 0.00 0.00 0.00 -10 1 0 38
chr10 134786 . C CT . . CT ZMYND11 0.00 0.00 0.00 0.13 0 -10 15 0
chr10 134786 . CG C . . C ZMYND11 0.00 0.00 0.00 0.35 -10 9 15 0
chr10 134786 . CGT C . . C ZMYND11 0.00 0.00 0.00 0.35 3 -10 15 0
chr10 134786 . CGTA C . . C ZMYND11 0.00 0.00 0.00 0.35 47 -10 15 0
chr10 134786 . CGTAC C . . C ZMYND11 0.00 0.00 0.00 0.35 47 -10 15 0
chr10 141080 . A AA . . AA ZMYND11 0.00 0.00 0.00 0.00 8 10 8 -1
chr10 141080 . A AC . . AC ZMYND11 0.00 0.00 0.00 0.00 8 10 8 -1

数据可以脱敏后放附件几行吗?我测试下

归档.zip (111.1 KB)

压缩文件中包含两个文件,.sql是建表语句,head.tsv是原始文件的前10000行,
导入使用的命令是:
LOAD DATA /
+ direct(true,1024) parallel(16) */INFILE ‘/home/admin/hg38_splice_ai_20220217.2612.tsv’ INTO TABLE hg38_splice_ai_20220217 FIELDS TERMINATED BY ‘\t’ LINES TERMINATED BY ‘\n’;
导入的时候报错,如果取head.tsv的前1000行,则可以正确导入
我已经在本地复现了,请帮忙看一下
ob版本4.2.18

联系这块的老师分析中

1.第一次报错原因:指定部分列是不支持的,可以指定列的顺序,但是必须包含所有列,指定部分列这个是在425和433支持

2.第二次报错原因:表列数和数据文件列数不一致,虽然第一个列是自增列,但是数据里面需要为这列加上空值才行

3.第三次报错原因:direct(true, 1024),这里第二个参数表示可接受的错误行是1024,只导入1000行,不会报错的,但这种select不出来数据的,因为没有符合条件的数据导入