同一个sql语句,执行多次结果不同

【 使用环境 】生产环境
【 OB or 其他组件 】OB
【 使用版本 】5.7.25-OceanBase_CE-v4.3.5.2
【问题描述】多次执行同一个sql,结果不同!!

update dw_naire_answer_rec r join (
SELECT  distinct
	r.q_naire_id,
	r.diagnostician_id,
	COUNT(
		DISTINCT a.diag_obj_id,
		a.q_id
	) q_count
FROM
	dw_actual_answer_naire_question a
JOIN dw_naire_answer_rec r ON a.naire_answer_rec_id = r.naire_answer_rec_id
JOIN dw_diag_obj o on a.diag_obj_id=o.diag_obj_id and o.diag_obj_type_id not in ('ZHIYUANJS')
WHERE
	a.school_code = 'bndsftzx'
AND a.school_term_id = '2521'
GROUP BY
	r.q_naire_id,
	r.diagnostician_id
) a on r.q_naire_id=a.q_naire_id and r.diagnostician_id=a.diagnostician_id
set r.q_count = a.q_count

例如第一次执行:q_count =137 ,再次执行:q_count =-33591

每次执行子查询结果是对的。但更新到表dw_naire_answer_rec后结果不一致

这样查询下 每次结果一样吗?

select * from  dw_naire_answer_rec r join (
SELECT 
	r.q_naire_id,
	r.diagnostician_id,
	COUNT(
		DISTINCT a.diag_obj_id,
		a.q_id
	) q_count
FROM
	dw_actual_answer_naire_question a
JOIN dw_naire_answer_rec r ON a.naire_answer_rec_id = r.naire_answer_rec_id
JOIN dw_diag_obj o on a.diag_obj_id=o.diag_obj_id and o.diag_obj_type_id not in ('ZHIYUANJS')
WHERE
	a.school_code = 'bndsftzx'
AND a.school_term_id = '2521'
GROUP BY
	r.q_naire_id,
	r.diagnostician_id
) a on r.q_naire_id=a.q_naire_id and r.diagnostician_id=a.diagnostician_id
and r.q_count = a.q_count

还有这情况,软解析应该都一样,为啥结果不同呢

这样查询结果是一样的,只执行子查询也一样

  • 子查询里用别名 加个表别名r2。别和需要update的重复使用

是的,update结果不同,子查询一样

update  dw_naire_answer_rec rr join (
SELECT distinct
	r.q_naire_id,
	r.diagnostician_id,
	COUNT(
		DISTINCT a.diag_obj_id,
		a.q_id
	) q_count

FROM
	dw_actual_answer_naire_question a
JOIN dw_naire_answer_rec r ON a.naire_answer_rec_id = r.naire_answer_rec_id
JOIN dw_diag_obj o on a.diag_obj_id=o.diag_obj_id and o.diag_obj_type_id not in ('ZHIYUANJS')
WHERE
	a.school_code = 'bndsftzx'
AND a.school_term_id = '2521'
GROUP BY
	r.q_naire_id,
	r.diagnostician_id
) a on rr.q_naire_id=a.q_naire_id and rr.diagnostician_id=a.diagnostician_id
set rr.q_count = a.q_count  

在子查询中有一个distinct

试过了,也不行

UPDATE dw_naire_answer_rec AS r
JOIN (
SELECT
r2.q_naire_id,
r2.diagnostician_id,
COUNT(DISTINCT CONCAT(a.diag_obj_id, ‘-’, a.q_id)) AS q_count
FROM dw_actual_answer_naire_question a
JOIN dw_naire_answer_rec r2
ON a.naire_answer_rec_id = r2.naire_answer_rec_id
JOIN dw_diag_obj o
ON a.diag_obj_id = o.diag_obj_id
AND o.diag_obj_type_id NOT IN (‘ZHIYUANJS’)
WHERE a.school_code = ‘bndsftzx’
AND a.school_term_id = ‘2521’
GROUP BY r2.q_naire_id, r2.diagnostician_id
) AS t
ON r.q_naire_id = t.q_naire_id
AND r.diagnostician_id = t.diagnostician_id
SET r.q_count = t.q_count;

试一下这个

这样可以,重复执行了十几次,没问题。上面的sql感觉是随机的,在我们etl过程中用了不少类似的,什么原因的?可以通过不修改sql避免吗修改sql量太大了。。。

具体思路的话

  • COUNT(DISTINCT CONCAT(...)) 解决多列去重;
  • 把统计结果先做成一个派生表(或临时表),再与 dw_naire_answer_rec 联表更新,避免 MySQL 限制。

确认一下语意是否存在变化