大佬,一张38亿的表按月聚合每天的数据量月900+万,有没有最佳实践。

【 使用环境 】 测试环境
【 OB or 其他组件 】
【 使用版本 】
【问题描述】
按月聚合后,将聚合后的数据插入表中,数据量巨大执行不出来。

基本上就是全表聚合,where条件筛选出来的数据与全表相当。

看看怎么解决

2 个赞

sql

+----------------------------+--------------+-------------+----------+---------+-----------------------------------+------------------+----------------------------------+------------+-------------+-----------+-------------+-----------+----------------+---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| rt                         | ELAPSED_TIME | SVR_IP      | SVR_PORT | PLAN_ID | TRACE_ID                          | TRANSACTION_HASH | SQL_ID                           | SID        | CLIENT_IP   | TENANT_ID | TENANT_NAME | USER_NAME | USER_CLIENT_IP | DB_NAME | query_sql                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                | RET_CODE |
+----------------------------+--------------+-------------+----------+---------+-----------------------------------+------------------+----------------------------------+------------+-------------+-----------+-------------+-----------+----------------+---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 2026-04-08 17:14:32.270690 |    293718605 | 10.8*.0.241 |     2882 |   24237 | YB420A5400F1-000648D34B242517-0-0 |                0 | 756D373E5D197215E6D3121BEDD87B91 | 3221731027 | 10.8*.1.50  |      2112 | DMJYWCS     | DMJ       | 10.*.1.50      | DMJ     | INSERT INTO DMJ.DWD_CUST_TRAN_DAILY_YTD SELECT  a.YEAR_MONTH, a.DT, a.WEEKDAY, a.CUST_ID, a.PROD_TYPE, a.GL_CODE, a.CR_DR_IND, a.AMT_CALC_TYPE, a.TRAN_DESC, a.SOURCE_TYPE, a.SYSTEM_CODE, a.NARRATIVE_CODE, sum(TRAN_AMT) TRAN_AMT from (select th.CLIENT_NO CUST_ID, th.PROD_TYPE PROD_TYPE, th.GL_CODE GL_CODE, th.TRAN_AMT TRAN_AMT, th.CR_DR_IND CR_DR_IND, th.TRAN_DESC TRAN_DESC, th.AMT_CALC_TYPE AMT_CALC_TYPE, th.SOURCE_TYPE SOURCE_TYPE, th.SYSTEM_CODE SYSTEM_CODE, th.NARRATIVE_CODE NARRATIVE_CODE, TO_CHAR(TO_DATE(th.ORIG_TRAN_TIMESTAMP,'YYYY-MM-DD HH24:MI:SS'),'Day') WEEKDAY, SUBSTR(to_char(th.dt),1,6) YEAR_MONTH, th.dt DT from ODS_DC_RB_TRAN_HIST th JOIN APP_DCC_CUST_INFO_YTD ci on th.client_no=ci.cust_id where th.CCY='CNY' and th.TRAN_STATUS != 'F' and th.dt>=20250105 and th.dt<20250111 )a group by a.YEAR_MONTH, a.DT, a.WEEKDAY, a.CUST_ID, a.PROD_TYPE, a.GL_CODE, a.CR_DR_IND, a.AMT_CALC_TYPE, a.TRAN_DESC, a.SOURCE_TYPE, a.SYSTEM_CODE, a.NARRATIVE_CODE;                     |    -5066 |
| 2026-04-08 16:58:46.938647 |    503095045 | 10.8*.0.241 |     2882 |   24237 | YB420A5400F1-000648D2FC543775-0-0 |                0 | 756D373E5D197215E6D3121BEDD87B91 | 3221537347 | 10.8*.0.241 |      2112 | DMJYWCS     | DMJ       | 10.*.1.50      | DMJ     | INSERT INTO DMJ.DWD_CUST_TRAN_DAILY_YTD SELECT  a.YEAR_MONTH, a.DT, a.WEEKDAY, a.CUST_ID, a.PROD_TYPE, a.GL_CODE, a.CR_DR_IND, a.AMT_CALC_TYPE, a.TRAN_DESC, a.SOURCE_TYPE, a.SYSTEM_CODE, a.NARRATIVE_CODE, sum(TRAN_AMT) TRAN_AMT from (select th.CLIENT_NO CUST_ID, th.PROD_TYPE PROD_TYPE, th.GL_CODE GL_CODE, th.TRAN_AMT TRAN_AMT, th.CR_DR_IND CR_DR_IND, th.TRAN_DESC TRAN_DESC, th.AMT_CALC_TYPE AMT_CALC_TYPE, th.SOURCE_TYPE SOURCE_TYPE, th.SYSTEM_CODE SYSTEM_CODE, th.NARRATIVE_CODE NARRATIVE_CODE, TO_CHAR(TO_DATE(th.ORIG_TRAN_TIMESTAMP,'YYYY-MM-DD HH24:MI:SS'),'Day') WEEKDAY, SUBSTR(to_char(th.dt),1,6) YEAR_MONTH, th.dt DT from ODS_DC_RB_TRAN_HIST th JOIN APP_DCC_CUST_INFO_YTD ci on th.client_no=ci.cust_id where th.CCY='CNY' and th.TRAN_STATUS != 'F' and th.dt>=20250101 and th.dt<20250111 )a group by a.YEAR_MONTH, a.DT, a.WEEKDAY, a.CUST_ID, a.PROD_TYPE, a.GL_CODE, a.CR_DR_IND, a.AMT_CALC_TYPE, a.TRAN_DESC, a.SOURCE_TYPE, a.SYSTEM_CODE, a.NARRATIVE_CODE;                     |    -5066 |
| 2026-04-08 16:33:52.926025 |    902627325 | 10.8*.0.241 |     2882 |   24186 | YB420A5400F1-000648D2A0F61E44-0-0 |                0 | 1B2A10F89F6455361D4F7F29C296F077 | 3221652628 | 10.8*.1.50  |      2112 | DMJYWCS     | DMJ       | 10.*.1.50      | DMJ     | INSERT INTO DMJ.DWD_CUST_TRAN_DAILY_YTD PARTITION(ym202501) SELECT  a.YEAR_MONTH, a.DT, a.WEEKDAY, a.CUST_ID, a.PROD_TYPE, a.GL_CODE, a.CR_DR_IND, a.AMT_CALC_TYPE, a.TRAN_DESC, a.SOURCE_TYPE, a.SYSTEM_CODE, a.NARRATIVE_CODE, sum(TRAN_AMT) TRAN_AMT from (select th.CLIENT_NO CUST_ID, th.PROD_TYPE PROD_TYPE, th.GL_CODE GL_CODE, th.TRAN_AMT TRAN_AMT, th.CR_DR_IND CR_DR_IND, th.TRAN_DESC TRAN_DESC, th.AMT_CALC_TYPE AMT_CALC_TYPE, th.SOURCE_TYPE SOURCE_TYPE, th.SYSTEM_CODE SYSTEM_CODE, th.NARRATIVE_CODE NARRATIVE_CODE, TO_CHAR(TO_DATE(th.ORIG_TRAN_TIMESTAMP,'YYYY-MM-DD HH24:MI:SS'),'Day') WEEKDAY, SUBSTR(to_char(th.dt),1,6) YEAR_MONTH, th.dt DT from ODS_DC_RB_TRAN_HIST th JOIN APP_DCC_CUST_INFO_YTD ci on th.client_no=ci.cust_id where th.CCY='CNY' and th.TRAN_STATUS != 'F' and th.dt>=20250101 and th.dt<20250201 )a group by a.YEAR_MONTH, a.DT, a.WEEKDAY, a.CUST_ID, a.PROD_TYPE, a.GL_CODE, a.CR_DR_IND, a.AMT_CALC_TYPE, a.TRAN_DESC, a.SOURCE_TYPE, a.SYSTEM_CODE, a.NARRATIVE_CODE; |    -5066 |
| 2026-04-08 16:26:20.817082 |    442180854 | 10.8*.0.241 |     2882 |   24186 | YB420A5400F1-000648D34BF42647-0-0 |                0 | 1B2A10F89F6455361D4F7F29C296F077 | 3221592629 | 10.8*.1.50  |      2112 | DMJYWCS     | DMJ       | 10.*.1.50      | DMJ     | INSERT INTO DMJ.DWD_CUST_TRAN_DAILY_YTD PARTITION(ym202501) SELECT  a.YEAR_MONTH, a.DT, a.WEEKDAY, a.CUST_ID, a.PROD_TYPE, a.GL_CODE, a.CR_DR_IND, a.AMT_CALC_TYPE, a.TRAN_DESC, a.SOURCE_TYPE, a.SYSTEM_CODE, a.NARRATIVE_CODE, sum(TRAN_AMT) TRAN_AMT from (select th.CLIENT_NO CUST_ID, th.PROD_TYPE PROD_TYPE, th.GL_CODE GL_CODE, th.TRAN_AMT TRAN_AMT, th.CR_DR_IND CR_DR_IND, th.TRAN_DESC TRAN_DESC, th.AMT_CALC_TYPE AMT_CALC_TYPE, th.SOURCE_TYPE SOURCE_TYPE, th.SYSTEM_CODE SYSTEM_CODE, th.NARRATIVE_CODE NARRATIVE_CODE, TO_CHAR(TO_DATE(th.ORIG_TRAN_TIMESTAMP,'YYYY-MM-DD HH24:MI:SS'),'Day') WEEKDAY, SUBSTR(to_char(th.dt),1,6) YEAR_MONTH, th.dt DT from ODS_DC_RB_TRAN_HIST th JOIN APP_DCC_CUST_INFO_YTD ci on th.client_no=ci.cust_id where th.CCY='CNY' and th.TRAN_STATUS != 'F' and th.dt>=20250101 and th.dt<20250201 )a group by a.YEAR_MONTH, a.DT, a.WEEKDAY, a.CUST_ID, a.PROD_TYPE, a.GL_CODE, a.CR_DR_IND, a.AMT_CALC_TYPE, a.TRAN_DESC, a.SOURCE_TYPE, a.SYSTEM_CODE, a.NARRATIVE_CODE; |    -5066 |
| 2026-04-08 15:35:13.954941 |    906345617 | 10.8*.0.241 |     2882 |   23740 | YB420A5400F1-000648D2FC5435A9-0-0 |                0 | FD1F4929B7C2DB381376B00F63EF2ECF | 3221732979 | 10.8*.1.50  |      2112 | DMJYWCS     | DMJ       | 10.*.1.50      | DMJ     | INSERT INTO DMJ.DWD_DC_RB_TRAN_DAILY PARTITION(ym202501) SELECT  a.year_month, a.dt, a.WEEKDAY, a.cust_id, a.prod_type, a.gl_code, a.CR_DR_IND, a.AMT_CALC_TYPE, a.TRAN_DESC, a.SOURCE_TYPE, a.SYSTEM_CODE, a.NARRATIVE_CODE, sum(TRAN_AMT) TRAN_AMT from (select th.CLIENT_NO CUST_ID, th.PROD_TYPE PROD_TYPE, th.GL_CODE GL_CODE, th.TRAN_AMT TRAN_AMT, th.CR_DR_IND CR_DR_IND, th.TRAN_DESC TRAN_DESC, th.AMT_CALC_TYPE AMT_CALC_TYPE, th.SOURCE_TYPE SOURCE_TYPE, th.SYSTEM_CODE SYSTEM_CODE, th.NARRATIVE_CODE NARRATIVE_CODE, TO_CHAR(TO_DATE(th.ORIG_TRAN_TIMESTAMP,'YYYY-MM-DD HH24:MI:SS'),'Day') WEEKDAY, SUBSTR(to_char(th.dt),1,6) YEAR_MONTH, th.dt DT from ODS_DC_RB_TRAN_HIST th where th.CCY='CNY' and th.TRAN_STATUS != 'F' and th.dt>=20250101 and th.dt<20250201 )a group by a.year_month, a.dt, a.WEEKDAY, a.cust_id, a.prod_type, a.gl_code, a.CR_DR_IND, a.AMT_CALC_TYPE, a.TRAN_DESC, a.SOURCE_TYPE, a.SYSTEM_CODE, a.NARRATIVE_CODE;                                                             |    -5066 |
+----------------------------+--------------+-------------+----------+---------+-----------------------------------+------------------+----------------------------------+------------+-------------+-----------+-------------+-----------+----------------+---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+

1 个赞

数据量


obclient> select owner,table_name,num_rows from all_tables where table_name = 'ODS_DC_RB_TRAN_HIST';
+-------+---------------------+------------+
| OWNER | TABLE_NAME          | NUM_ROWS   |
+-------+---------------------+------------+
| DMJ   | ODS_DC_RB_TRAN_HIST | 3814311829 |
+-------+---------------------+------------+
1 row in set (0.02 sec)


一个分区数据量

obclient> select  count(*) from DMJ.ODS_DC_RB_TRAN_HIST PARTITION (dt20250101) ;
+----------+
| COUNT(*) |
+----------+
|  9782416 |
+----------+
1 row in set (4.88 sec)

1 个赞

表结构:表结构.txt (70.9 KB)

1 个赞
from ODS_DC_RB_TRAN_HIST th JOIN APP_DCC_CUST_INFO_YTD ci on th.client_no=ci.cust_id where th.CCY='CNY' and th.TRAN_STATUS != 'F' and th.dt>=20250105 and th.dt<20250111

看SQL关联条件是这个,但你提供的表结构一个索引都没有,先把基础的索引构建好再优化

1 个赞

学习一下

1 个赞

SQL 诊断 此处env中的trace_id对应gv$ob_sql_audit的trace_id

obdiag gather scene run --scene=observer.sql_err --env “{db_connect=’-hxx -Pxx -uxx -pxx -Dxx’, trace_id=‘xx’}”
使用obdiag提取一下sql信息

1 个赞

这个子查询基本上就是查的全表的数据,整个分区的数据做sum分组聚合,where条件过滤不了数据,我试了建了索引也不会走索引。大佬有什么建议?比如这个索引怎么建

1 个赞

DT字段不是你的分区键吗,你的条件是and th.dt>=20250105 and th.dt<20250111,这个时间范围不大,建索引没道理不走。另外关联字段要有索引,这个是前提,比如这个关联条件(th.client_no=ci.cust_id )

1 个赞

mark
蹲一个后续,看看这种怎么超大数据量大佬们是怎么处理的

1 个赞

思路:
大的拆小, 按主键设定100个范围, 一条SQL拆成了100条, 最后聚合一下。
用shell或者其他什么工具, 串行或者并行执行, 可控就是需要耗费时间做一些重复性的拆SQL动作。

1 个赞

这个DT 字段 要建索引的话是不是得建全局索引,list分区,分区内dt的值是一样的

1 个赞

数据量太大,全局索引怕是也建不出来

1 个赞