两张大表更新如何优化?

【 使用环境 】 测试环境
【 OB or 其他组件 】OB
【 使用版本 】4.2.1
【问题描述】 update a b set a.name=b.name
where a.id=b.id
a表 b表各有2000万左右数据

这个sql执行太耗时了。有什么办法优化下? a表非分区表,id字段是索引。b表是分区表,id是分区字段。
a表改成分区是不是 这个sql会更快??

【复现路径】问题出现前后相关操作
【附件及日志】推荐使用OceanBase敏捷诊断工具obdiag收集诊断信息,详情参见链接(右键跳转查看):

【SOP系列 22 】——故障诊断第一步(自助诊断和诊断信息收集)

不管在哪个数据库里,大批量数据更新都是不友好的行为,数据库上更新性能会很差,且让数据库陷于一种高风险的地步。

在 OB 里,大批量数据更新的风险就是大事务对租户增量内存大量消耗导致转储变多,IO 负载增加,数据库性能下降。此外,更新期间的大事务长期持有记录锁也很可能导致其他会话在该表上的 DML 被阻塞。
OB 为了规避大事务、长事务,有事务未提交超时、空闲超时两个机制,默认时间在 100秒和120秒(不同版本可能不一样,有的会改为 86400秒)。所以大批量数据更新还可能会碰到这个超时而失败。

正确的做法是写存储过程查询大表要更新的记录,然后用游标循环 更新数据,批量提交。如果存储过程参数设计得当,还可以并行调用。如果不会写程序,可以写程序去并行更新。这种分批并行的方法是所有方案中效率最高风险最低的方案。

撇开上面这些因素如果还要坚持数据库更新,那就用 OB 的 DML 并行更新特性。

(SYS@localhost:sql.sock) [TPCH]> explain extended update /*+ enable_parallel_dml parallel(8) */ (select a.o_orderkey a_o_orderkey, b.o_orderkey b_o_orderkey from orders a, orders3 b where a.o_orderkey = b.o_orderkey ) set a_o_orderkey = b_o_ord
erkey ;                                                                                                                                                                                                 
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Query Plan                                                                                                                                                                                           |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| =====================================================================                                                                                                                                |
| |ID|OPERATOR                        |NAME    |EST.ROWS |EST.TIME(us)|                                                                                                                                |
| ---------------------------------------------------------------------                                                                                                                                |
| |0 |DISTRIBUTED UPDATE              |        |150000000|1144921448  |                                                                                                                                |
| |1 |+-PX COORDINATOR                |        |150000000|10111136    |                                                                                                                                |
| |2 |  +-EXCHANGE OUT DISTR          |:EX10000|150000000|4041736     |                                                                                                                                |
| |3 |    +-PX PARTITION ITERATOR     |        |150000000|2395623     |                                                                                                                                |
| |4 |      +-MERGE JOIN              |        |150000000|2395623     |                                                                                                                                |
| |5 |        |-TABLE FULL SCAN       |B       |150000000|483532      |                                                                                                                                |
| |6 |        +-COLUMN TABLE FULL SCAN|A       |150000000|926676      |                                                                                                                                |
| =====================================================================                                                                                                                                |
| Outputs & filters:                                                                                                                                                                                   |
| -------------------------------------                                                                                                                                                                |
|   0 - output(nil), filter(nil)                                                                                                                                                                       |
|       table_columns([{A: ({ORDERS: (A.O_ORDERKEY(0x7f49f3426a30), A.O_CUSTKEY(0x7f49f343f970), A.O_ORDERSTATUS(0x7f49f343fcb0), A.O_TOTALPRICE(0x7f49f345d6e0),                                      |
|        A.O_ORDERDATE(0x7f49f3466f00), A.O_ORDERPRIORITY(0x7f49f3467240), A.O_CLERK(0x7f49f3472c20), A.O_SHIPPRIORITY(0x7f49f347e5f0), A.O_COMMENT(0x7f49f3487e20))})}]),                             |
|                                                                                                                                                                                                      |
|       update([A.O_ORDERKEY(0x7f49f3426a30)=column_conv(NUMBER,PS:(-1,-85),NOT NULL,B.O_ORDERKEY(0x7f49f34290c0))(0x7f49f34380d0)])                                                                   |
|   1 - output([A.O_ORDERKEY(0x7f49f3426a30)], [A.O_CUSTKEY(0x7f49f343f970)], [A.O_ORDERSTATUS(0x7f49f343fcb0)], [A.O_TOTALPRICE(0x7f49f345d6e0)], [A.O_ORDERDATE(0x7f49f3466f00)],                    |
|        [A.O_ORDERPRIORITY(0x7f49f3467240)], [A.O_CLERK(0x7f49f3472c20)], [A.O_SHIPPRIORITY(0x7f49f347e5f0)], [A.O_COMMENT(0x7f49f3487e20)], [B.O_ORDERKEY(0x7f49f34290c0)]), filter(nil), rowset=256 |
|   2 - output([A.O_ORDERKEY(0x7f49f3426a30)], [A.O_CUSTKEY(0x7f49f343f970)], [A.O_ORDERSTATUS(0x7f49f343fcb0)], [A.O_TOTALPRICE(0x7f49f345d6e0)], [A.O_ORDERDATE(0x7f49f3466f00)],                    |
|        [A.O_ORDERPRIORITY(0x7f49f3467240)], [A.O_CLERK(0x7f49f3472c20)], [A.O_SHIPPRIORITY(0x7f49f347e5f0)], [A.O_COMMENT(0x7f49f3487e20)], [B.O_ORDERKEY(0x7f49f34290c0)]), filter(nil), rowset=256 |
|       dop=8                                                                                                                                                                                          |
|   3 - output([A.O_ORDERKEY(0x7f49f3426a30)], [A.O_CUSTKEY(0x7f49f343f970)], [A.O_ORDERSTATUS(0x7f49f343fcb0)], [A.O_TOTALPRICE(0x7f49f345d6e0)], [A.O_ORDERDATE(0x7f49f3466f00)],                    |
|        [A.O_ORDERPRIORITY(0x7f49f3467240)], [A.O_CLERK(0x7f49f3472c20)], [A.O_SHIPPRIORITY(0x7f49f347e5f0)], [A.O_COMMENT(0x7f49f3487e20)], [B.O_ORDERKEY(0x7f49f34290c0)]), filter(nil), rowset=256 |
|       partition wise, force partition granule                                                                                                                                                        |
|   4 - output([A.O_ORDERKEY(0x7f49f3426a30)], [A.O_CUSTKEY(0x7f49f343f970)], [A.O_ORDERSTATUS(0x7f49f343fcb0)], [A.O_TOTALPRICE(0x7f49f345d6e0)], [A.O_ORDERDATE(0x7f49f3466f00)],                    |
|        [A.O_ORDERPRIORITY(0x7f49f3467240)], [A.O_CLERK(0x7f49f3472c20)], [A.O_SHIPPRIORITY(0x7f49f347e5f0)], [A.O_COMMENT(0x7f49f3487e20)], [B.O_ORDERKEY(0x7f49f34290c0)]), filter(nil), rowset=256 |
|       equal_conds([A.O_ORDERKEY(0x7f49f3426a30) = B.O_ORDERKEY(0x7f49f34290c0)(0x7f49f3429ef0)]), other_conds(nil)                                                                                   |
|       merge_directions([ASC])                                                                                                                                                                        |
|   5 - output([B.O_ORDERKEY(0x7f49f34290c0)]), filter(nil), rowset=256                                                                                                                                |
|       access([B.O_ORDERKEY(0x7f49f34290c0)]), partitions(p[0-15])                                                                                                                                    |
|       is_index_back=false, is_global_index=false,                                                                                                                                                    |
|       range_key([B.O_ORDERKEY(0x7f49f34290c0)]), range(MIN ; MAX)always true                                                                                                                         |
|   6 - output([A.O_ORDERKEY(0x7f49f3426a30)], [A.O_CUSTKEY(0x7f49f343f970)], [A.O_ORDERSTATUS(0x7f49f343fcb0)], [A.O_TOTALPRICE(0x7f49f345d6e0)], [A.O_ORDERDATE(0x7f49f3466f00)],                    |
|        [A.O_ORDERPRIORITY(0x7f49f3467240)], [A.O_CLERK(0x7f49f3472c20)], [A.O_SHIPPRIORITY(0x7f49f347e5f0)], [A.O_COMMENT(0x7f49f3487e20)]), filter(nil), rowset=256                                 |
|       access([A.O_ORDERKEY(0x7f49f3426a30)], [A.O_CUSTKEY(0x7f49f343f970)], [A.O_ORDERSTATUS(0x7f49f343fcb0)], [A.O_TOTALPRICE(0x7f49f345d6e0)], [A.O_ORDERDATE(0x7f49f3466f00)],                    |
|        [A.O_ORDERPRIORITY(0x7f49f3467240)], [A.O_CLERK(0x7f49f3472c20)], [A.O_SHIPPRIORITY(0x7f49f347e5f0)], [A.O_COMMENT(0x7f49f3487e20)]), partitions(p[0-15])                                     |
|       is_index_back=false, is_global_index=false,                                                                                                                                                    |
|       range_key([A.O_ORDERKEY(0x7f49f3426a30)]), range(MIN ; MAX)always true                                                                                                                         || Used Hint:                                                                                                                                                                                           |
| -------------------------------------                                                                                                                                                                |
|   /*+                                                                                                                                                                                                |
|                                                                                                                                                                                                      |
|       PARALLEL(8)                                                                                                                                                                                    |
|       ENABLE_PARALLEL_DML                                                                                                                                                                            |
|   */                                                                                                                                                                                                 |
| Qb name trace:                                                                                                                                                                                       |
| -------------------------------------                                                                                                                                                                |
|   stmt_id:0, stmt_type:T_EXPLAIN                                                                                                                                                                     |
|   stmt_id:1, UPD$1 > UPD$3E5BD4E4                                                                                                                                                                    |
|   stmt_id:2, SEL$1                                                                                                                                                                                   |
| Outline Data:                                                                                                                                                                                        |
| -------------------------------------                                                                                                                                                                |
|   /*+                                                                                                                                                                                                |
|       BEGIN_OUTLINE_DATA                                                                                                                                                                             |
|       USE_DISTRIBUTED_DML(@"UPD$3E5BD4E4")                                                                                                                                                           |
|       LEADING(@"UPD$3E5BD4E4" ("TPCH"."B"@"SEL$1" "TPCH"."A"@"SEL$1"))                                                                                                                               |
|       USE_MERGE(@"UPD$3E5BD4E4" "TPCH"."A"@"SEL$1")                                                                                                                                                  |
|       PQ_DISTRIBUTE(@"UPD$3E5BD4E4" "TPCH"."A"@"SEL$1" NONE NONE)                                                                                                                                    |
|       PARALLEL(@"UPD$3E5BD4E4" "B"@"SEL$1" 8)                                                                                                                                                        |
|       FULL(@"UPD$3E5BD4E4" "B"@"SEL$1")                                                                                                                                                              |
|       PARALLEL(@"UPD$3E5BD4E4" "A"@"SEL$1" 8)                                                                                                                                                        |
|       FULL(@"UPD$3E5BD4E4" "A"@"SEL$1")                                                                                                                                                              |
|       USE_COLUMN_TABLE(@"UPD$3E5BD4E4" "A"@"SEL$1")                                                                                                                                                  |
|       MERGE(@"SEL$1" > "UPD$1")                                                                                                                                                                      |
|       PARALLEL(8)                                                                                                                                                                                    |
|       OPTIMIZER_FEATURES_ENABLE('4.3.0.0')                                                                                                                                                           |
|       END_OUTLINE_DATA                                                                                                                                                                               |
|   */                                                                                                                                                                                                 |
| Optimization Info:                                                                                                                                                                                   |
| -------------------------------------                                                                                                                                                                |
|   B:                                                                                                                                                                                                 |
|       table_rows:150000000                                                                                                                                                                           |
|       physical_range_rows:150000000                                                                                                                                                                  |
|       logical_range_rows:150000000                                                                                                                                                                   |
|       index_back_rows:0                                                                                                                                                                              |
|       output_rows:150000000                                                                                                                                                                          |
|       table_dop:8                                                                                                                                                                                    |
|       dop_method:Global DOP                                                                                                                                                                          |
|       avaiable_index_name:[ORDERS3]                                                                                                                                                                  |
|       stats version:1712911468008216                                                                                                                                                                 |
|       dynamic sampling level:0                                                                                                                                                                       |
|       estimation method:[OPTIMIZER STATISTICS]                                                                                                                                                       |
|   A:                                                                                                                                                                                                 ||       table_rows:150000000                                                                                                                                                                           |
|       physical_range_rows:150000000                                                                                                                                                                  |
|       logical_range_rows:150000000                                                                                                                                                                   |
|       index_back_rows:0                                                                                                                                                                              |
|       output_rows:150000000                                                                                                                                                                          |
|       table_dop:8                                                                                                                                                                                    |
|       dop_method:Global DOP                                                                                                                                                                          |
|       avaiable_index_name:[ORDERS]                                                                                                                                                                   |
|       stats version:1713054613616275                                                                                                                                                                 |
|       dynamic sampling level:0                                                                                                                                                                       |
|       estimation method:[OPTIMIZER STATISTICS]                                                                                                                                                       |
|   Plan Type:                                                                                                                                                                                         |
|       DISTRIBUTED                                                                                                                                                                                    |
|   Note:                                                                                                                                                                                              |
|       Degree of Parallelism is 8 because of hint                                                                                                                                                     |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
100 rows in set (0.008 sec)

1 个赞

在不改a表为分区表的前提下,可以考虑分批次更新,每批次按照b表的分区去扫描,也就是每批次更新b一个分区的数据(同时要关注b一个分区的数据量,如果过大的话,还是可能要继续拆小)

a表改为分区表可能要好一些,结合表组和并行技术,优化的力度要更大一些。参考这个文档
https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000000641829

1 个赞