用OceanBase预测2026世界杯冠军:从数据到概率的完整旅程

2026年世界杯终于开赛了,首次迎来48支球队、104场比赛的盛大扩军。12个小组、32支淘汰赛晋级名额、多达8场比赛的冠军征程,然后所有球迷们都发现,新规则加扩军,这一切都让预测变得空前复杂。传统的按排名拍脑袋早已失效,我们需要一种既能容纳海量历史数据、又能快速进行上万次随机模拟的技术方案。这正是我们选择OceanBase的原因,作为一款原生分布式数据库,它既能高效存储和管理球队、比赛、赛季统计等结构化数据,又能借助其并行查询与存储过程能力,支撑起复杂的蒙特卡洛模拟预测。废话不多说,先讲解一下预测办法。

一、预测办法:Elo+泊松分布+蒙特卡洛模拟

我们的预测模型融合了三种经典方法,层层递进。

1.Elo评分体系
Elo是一种动态衡量球队相对实力的方法。每支球队根据历史比赛结果获得一个分数,赢强队加分多,输弱队减分多。我在OceanBase的表中为每支球队存储了截至2026年6月的Elo评分。在模拟任意一场比赛时,我们通过Elo分差计算双方的预期胜率:

主队胜率=1/(1+10^((客队Elo-主队Elo-主场优势)/400))。

其中,主场优势只加给真正的东道主美国、加拿大、墨西哥。

2.泊松分布生成随机比分
足球比赛进球数是典型的泊松事件,低发生率、独立、均值等于方差。我们假设每支球队在比赛中的预期进球数(λ)由以下公式决定:

λ=赛事平均进球×进攻强度×对手防守强度×主场优势系数。

其中,进攻强度和防守强度来自球队近三年的场均进球和场均失球数据,并与整个赛事的平均水平进行比较。主场优势系数仅为东道主在team1_id位置时生效(设为1.25),其余均为1.0。然后,从泊松分布中随机抽样得到主客队进球数,从而产生一场比赛的随机比分。这种方法天然包含了爆冷的可能,弱队一旦抽到大于λ的进球数,就能击败强队。

3.蒙特卡洛大规模模拟
单次随机抽样的结果带有偶然性,为此我们采用蒙特卡洛方法:重复模拟1000次甚至10000次。每次模拟中,我们按真实赛程逐场生成随机比分,计算小组积分,确定每组前两名及8个成绩最好的小组第三晋级32强;然后按淘汰赛对阵树继续模拟,直至决出冠军。最终,统计每支球队在10,000次模拟中夺冠的次数,即为其夺冠概率。OceanBase的存储过程和循环控制完美支持了这种批量重复计算。

二、数据准备:四张核心表的构建

要运行上述预测模型,必须在OceanBase中准备好四张环环相扣的数据表。先让我叠个甲,所有数据均来自公开体育数据源,经过清洗和归一化后导入。

**1.球队基础信息表(teams):**包含48支参赛球队的固定属性:球队ID、中英文名称、所属大洲、FIFA排名、Elo评分、总市场价值以及东道主标识。

**2.历史比赛结果表(matches):**存储2019年至2025年间的关键国际A级赛事,包括2022年世界杯、2024年欧洲杯、2024年美洲杯、2024年非洲杯以及多场友谊赛和预选赛,总计超过65场比赛。每条记录包含赛季年份、赛事名称、主客队ID、实际进球数、比赛日期和地点。这些数据用于计算球队的进攻/防守强度以及Elo评分的初始值。

**3.球队赛季统计表(team_season_stats):**这是模型中最重要的特征表。每支球队每年一条记录,汇总该赛季的所有比赛数据:总场次、胜/平/负场次、总进球、总失球、零封场次、场均控球率、场均射门/射正以及已经在很多足球游戏中的预期进球数。对于48支球队,我们按强队和中游队分别填充了2023、2024、2025三个赛季的数据。

**4.世界杯2026模拟赛程表(worldcup_2026_fixtures):**完全依照FIFA官方公布的2026年小组赛对阵表,生成了72场比赛记录。每条记录包含比赛阶段、主队ID、客队ID以及一个比赛顺序号。注意,淘汰赛阶段的对阵并不预先存储,而是在每次蒙特卡洛模拟中根据32强晋级结果动态生成。这样保证了模拟的灵活性。

除此之外,我还需要一些计算结果的中间表,具体DDL可见下一个章节。

三、建表语句与数据准备

先把数据库和表都建好:

drop database if exists world_cup;

第二步,插入历史数据和赛程数据

TRUNCATE TABLE teams;
INSERT INTO teams (team_id, full_name, short_name, country_code, fifa_ranking, elo_rating, confederation, market_value, is_host) VALUES
(1, 'Mexico', 'Mexico', 'MEX', 15, 1933.00, 'CONCACAF', 150000000.00, TRUE),
(2, 'South Africa', 'S. Africa', 'RSA', 60, 1620.00, 'CAF', 40000000.00, FALSE),
(3, 'Korea Republic', 'Korea', 'KOR', 25, 1790.00, 'AFC', 120000000.00, FALSE),
(4, 'Czech Republic', 'Czechia', 'CZE', 41, 1750.00, 'UEFA', 80000000.00, FALSE),
(5, 'Canada', 'Canada', 'CAN', 30, 1810.00, 'CONCACAF', 95000000.00, TRUE),
(6, 'Bosnia and Herzegovina', 'Bosnia', 'BIH', 64, 1670.00, 'UEFA', 55000000.00, FALSE),
(7, 'Qatar', 'Qatar', 'QAT', 55, 1650.00, 'AFC', 70000000.00, FALSE),
(8, 'Switzerland', 'Switzerland', 'SUI', 19, 1830.00, 'UEFA', 200000000.00, FALSE),
(9, 'Brazil', 'Brazil', 'BRA', 6, 1996.00, 'CONMEBOL', 1100000000.00, FALSE),
(10, 'Morocco', 'Morocco', 'MAR', 7, 1913.00, 'CAF', 250000000.00, FALSE),
(11, 'Haiti', 'Haiti', 'HAI', 82, 1450.00, 'CONCACAF', 20000000.00, FALSE),
(12, 'Scotland', 'Scotland', 'SCO', 43, 1690.00, 'UEFA', 135000000.00, FALSE),
(13, 'United States', 'USA', 'USA', 16, 1840.00, 'CONCACAF', 280000000.00, TRUE),
(14, 'Paraguay', 'Paraguay', 'PAR', 40, 1710.00, 'CONMEBOL', 75000000.00, FALSE),
(15, 'Australia', 'Australia', 'AUS', 27, 1770.00, 'AFC', 65000000.00, FALSE),
(16, 'Turkey', 'Turkey', 'TUR', 22, 1760.00, 'UEFA', 180000000.00, FALSE),
(17, 'Germany', 'Germany', 'GER', 10, 1937.00, 'UEFA', 880000000.00, FALSE),
(18, 'Curacao', 'Curacao', 'CUW', 83, 1400.00, 'CONCACAF', 15000000.00, FALSE),
(19, 'Cote d''Ivoire', 'Ivory Coast', 'CIV', 34, 1680.00, 'CAF', 160000000.00, FALSE),
(20, 'Ecuador', 'Ecuador', 'ECU', 24, 1910.00, 'CONMEBOL', 95000000.00, FALSE),
(21, 'Netherlands', 'Netherlands', 'NED', 8, 1938.00, 'UEFA', 650000000.00, FALSE),
(22, 'Japan', 'Japan', 'JPN', 18, 1922.00, 'AFC', 140000000.00, FALSE),
(23, 'Sweden', 'Sweden', 'SWE', 38, 1700.00, 'UEFA', 170000000.00, FALSE),
(24, 'Tunisia', 'Tunisia', 'TUN', 46, 1640.00, 'CAF', 50000000.00, FALSE),
(25, 'Belgium', 'Belgium', 'BEL', 9, 1904.00, 'UEFA', 520000000.00, FALSE),
(26, 'Egypt', 'Egypt', 'EGY', 29, 1720.00, 'CAF', 120000000.00, FALSE),
(27, 'Iran', 'Iran', 'IRN', 21, 1740.00, 'AFC', 60000000.00, FALSE),
(28, 'New Zealand', 'New Zealand', 'NZL', 85, 1420.00, 'OFC', 30000000.00, FALSE),
(29, 'Spain', 'Spain', 'ESP', 2, 2082.00, 'UEFA', 900000000.00, FALSE),
(30, 'Cape Verde', 'Cape Verde', 'CPV', 68, 1580.00, 'CAF', 35000000.00, FALSE),
(31, 'Saudi Arabia', 'Saudi Arabia', 'KSA', 61, 1600.00, 'AFC', 80000000.00, FALSE),
(32, 'Uruguay', 'Uruguay', 'URU', 17, 1898.00, 'CONMEBOL', 210000000.00, FALSE),
(33, 'France', 'France', 'FRA', 1, 2025.00, 'UEFA', 1020000000.00, FALSE),
(34, 'Senegal', 'Senegal', 'SEN', 14, 1790.00, 'CAF', 220000000.00, FALSE),
(35, 'Iraq', 'Iraq', 'IRQ', 57, 1530.00, 'AFC', 30000000.00, FALSE),
(36, 'Norway', 'Norway', 'NOR', 31, 1780.00, 'UEFA', 190000000.00, FALSE),
(37, 'Argentina', 'Argentina', 'ARG', 3, 2047.00, 'CONMEBOL', 750000000.00, FALSE),
(38, 'Algeria', 'Algeria', 'ALG', 28, 1710.00, 'CAF', 110000000.00, FALSE),
(39, 'Austria', 'Austria', 'AUT', 23, 1730.00, 'UEFA', 160000000.00, FALSE),
(40, 'Jordan', 'Jordan', 'JOR', 63, 1520.00, 'AFC', 25000000.00, FALSE),
(41, 'Portugal', 'Portugal', 'POR', 5, 1967.00, 'UEFA', 840000000.00, FALSE),
(42, 'DR Congo', 'DR Congo', 'COD', 45, 1620.00, 'CAF', 80000000.00, FALSE),
(43, 'Uzbekistan', 'Uzbekistan', 'UZB', 50, 1580.00, 'AFC', 45000000.00, FALSE),
(44, 'Colombia', 'Colombia', 'COL', 13, 1973.00, 'CONMEBOL', 240000000.00, FALSE),
(45, 'England', 'England', 'ENG', 4, 1974.00, 'UEFA', 1200000000.00, FALSE),
(46, 'Croatia', 'Croatia', 'CRO', 11, 1896.00, 'UEFA', 360000000.00, FALSE),
(47, 'Ghana', 'Ghana', 'GHA', 73, 1610.00, 'CAF', 90000000.00, FALSE),
(48, 'Panama', 'Panama', 'PAN', 33, 1640.00, 'CONCACAF', 45000000.00, FALSE);

TRUNCATE TABLE matches;
INSERT INTO matches (season_year, tournament_name, home_team_id, away_team_id, home_goals, away_goals, match_date, location) VALUES
(2022, 'World Cup', 22, 17, 2, 1, '2022-11-23', 'Khalifa Stadium'),
(2022, 'World Cup', 37, 31, 1, 2, '2022-11-22', 'Lusail Stadium'),
(2022, 'World Cup', 29, 7, 7, 0, '2022-11-23', 'Al Thumama Stadium'),
(2022, 'World Cup', 33, 15, 4, 1, '2022-11-22', 'Al Janoub Stadium'),
(2022, 'World Cup', 25, 5, 1, 0, '2022-11-23', 'Ahmed bin Ali Stadium'),
(2022, 'World Cup', 9, 17, 2, 0, '2022-12-02', 'Lusail Stadium'),
(2022, 'World Cup', 21, 13, 3, 1, '2022-12-03', 'Khalifa Stadium'),
(2022, 'World Cup', 37, 15, 2, 1, '2022-12-03', 'Ahmed bin Ali Stadium'),
(2022, 'World Cup', 33, 47, 3, 1, '2022-12-04', 'Al Thumama Stadium'),
(2022, 'World Cup', 45, 34, 3, 0, '2022-12-04', 'Al Bayt Stadium'),
(2022, 'World Cup', 22, 46, 1, 1, '2022-12-05', 'Al Janoub Stadium'),
(2022, 'World Cup', 9, 3, 4, 1, '2022-12-05', 'Stadium 974'),
(2022, 'World Cup', 10, 29, 0, 0, '2022-12-06', 'Education City Stadium'),
(2022, 'World Cup', 41, 8, 6, 1, '2022-12-06', 'Lusail Stadium'),
(2022, 'World Cup', 46, 9, 1, 1, '2022-12-09', 'Education City Stadium'),
(2022, 'World Cup', 21, 37, 2, 2, '2022-12-09', 'Lusail Stadium'),
(2022, 'World Cup', 10, 41, 1, 0, '2022-12-10', 'Al Thumama Stadium'),
(2022, 'World Cup', 45, 33, 1, 2, '2022-12-10', 'Al Bayt Stadium'),
(2022, 'World Cup', 37, 46, 3, 0, '2022-12-13', 'Lusail Stadium'),
(2022, 'World Cup', 33, 10, 2, 0, '2022-12-14', 'Al Bayt Stadium'),
(2022, 'World Cup', 46, 10, 2, 1, '2022-12-17', 'Khalifa Stadium'),
(2022, 'World Cup', 37, 33, 3, 3, '2022-12-18', 'Lusail Stadium'),
(2024, 'Euro', 29, 46, 3, 0, '2024-06-15', 'Berlin'),
(2024, 'Euro', 29, 45, 2, 1, '2024-07-14', 'Berlin'),
(2024, 'Euro', 29, 33, 2, 1, '2024-07-09', 'Munich'),
(2024, 'Euro', 45, 21, 2, 1, '2024-07-10', 'Dortmund'),
(2024, 'Euro', 17, 8, 2, 0, '2024-06-19', 'Stuttgart'),
(2024, 'Euro', 17, 8, 2, 0, '2024-07-05', 'Stuttgart'),
(2024, 'Euro', 41, 33, 0, 0, '2024-07-05', 'Hamburg'),
(2024, 'Euro', 8, 45, 1, 1, '2024-07-06', 'Dusseldorf'),
(2024, 'Euro', 39, 16, 3, 1, '2024-06-21', 'Berlin'),
(2024, 'Euro', 39, 33, 0, 1, '2024-06-17', 'Dusseldorf'),
(2024, 'Euro', 9, 37, 1, 1, '2024-10-15', 'Madrid'),
(2024, 'Copa America', 37, 44, 1, 0, '2024-07-14', 'Miami'),
(2024, 'Copa America', 32, 5, 2, 2, '2024-07-13', 'Charlotte'),
(2024, 'Copa America', 37, 20, 1, 1, '2024-07-04', 'Houston'),
(2024, 'Copa America', 5, 32, 2, 2, '2024-07-13', 'Charlotte'),
(2024, 'Copa America', 9, 44, 1, 1, '2024-07-02', 'Las Vegas'),
(2024, 'Africa Cup', 19, 34, 2, 1, '2024-02-11', 'Abidjan'),
(2024, 'Africa Cup', 19, 47, 0, 1, '2024-01-18', 'Abidjan'),
(2024, 'Africa Cup', 10, 42, 0, 0, '2024-01-21', 'Abidjan'),
(2024, 'Africa Cup', 34, 38, 1, 0, '2024-01-23', 'Yamoussoukro'),
(2024, 'Africa Cup', 42, 26, 2, 1, '2024-02-02', 'Abidjan'),
(2023, 'Friendly', 13, 17, 1, 2, '2023-10-14', 'Hartford'),
(2023, 'Friendly', 45, 9, 1, 2, '2023-03-23', 'London'),
(2024, 'Friendly', 33, 17, 2, 0, '2024-03-23', 'Lyon'),
(2024, 'Friendly', 21, 17, 1, 1, '2024-03-26', 'Amsterdam'),
(2024, 'Friendly', 37, 20, 1, 0, '2024-06-09', 'Chicago'),
(2024, 'Friendly', 13, 9, 1, 1, '2024-06-12', 'Orlando'),
(2025, 'Friendly', 29, 21, 2, 2, '2025-03-25', 'Madrid'),
(2025, 'Friendly', 45, 33, 2, 2, '2025-06-07', 'London'),
(2025, 'Friendly', 17, 45, 3, 3, '2025-09-10', 'Munich'),
(2025, 'Friendly', 13, 37, 0, 3, '2025-10-15', 'New York'),
(2025, 'Friendly', 5, 21, 0, 4, '2025-11-19', 'Toronto'),
(2025, 'Friendly', 1, 37, 2, 2, '2025-12-05', 'Mexico City'),
(2024, 'Qualifier', 22, 15, 2, 0, '2024-10-10', 'Saitama'),
(2024, 'Qualifier', 3, 22, 0, 1, '2024-10-15', 'Seoul'),
(2025, 'Qualifier', 34, 42, 2, 0, '2025-03-21', 'Dakar'),
(2025, 'Qualifier', 26, 34, 1, 1, '2025-03-25', 'Cairo'),
(2025, 'Qualifier', 10, 19, 2, 1, '2025-06-05', 'Rabat'),
(2025, 'Qualifier', 31, 22, 0, 2, '2025-06-10', 'Riyadh'),
(2025, 'Qualifier', 5, 1, 1, 2, '2025-07-04', 'Vancouver'),
(2025, 'Qualifier', 13, 1, 1, 1, '2025-07-07', 'Los Angeles');


TRUNCATE TABLE team_season_stats;
INSERT INTO team_season_stats (team_id, season_year, matches_played, wins, draws, losses, goals_for, goals_against, clean_sheets, avg_possession, avg_shots, avg_shots_on_target, expected_goals) VALUES
(1,2023,11,6,3,2,18,11,4,51.0,12.5,5.0,16.8),
(1,2024,13,7,3,3,23,14,4,51.5,12.8,5.2,21.3),
(1,2025,9,5,2,2,15,9,3,51.2,12.6,5.1,13.9),
(2,2023,11,3,3,5,10,14,2,42.5,8.2,3.2,9.5),
(2,2024,10,3,3,4,10,13,2,42.0,8.0,3.0,9.0),
(2,2025,10,4,2,4,12,13,2,43.0,8.5,3.3,10.5),
(3,2023,11,6,3,2,18,10,4,49.2,11.5,4.8,16.5),
(3,2024,13,7,3,3,22,13,4,49.5,11.8,5.0,20.2),
(3,2025,8,4,2,2,13,8,3,49.0,11.6,4.8,12.1),
(4,2023,11,5,3,3,14,12,3,46.0,10.0,4.0,13.0),
(4,2024,12,5,3,4,15,12,3,46.5,10.5,4.0,14.0),
(4,2025,10,5,2,3,13,11,3,46.0,10.2,4.1,12.5),
(5,2023,10,5,2,3,16,12,3,48.5,11.5,4.5,14.8),
(5,2024,13,7,3,3,22,14,3,49.2,12.0,4.9,20.5),
(5,2025,8,4,2,2,13,9,2,48.8,11.8,4.6,12.1),
(6,2023,10,3,3,4,9,12,2,41.0,7.5,3.0,8.5),
(6,2024,11,3,2,6,9,14,1,41.0,7.5,3.0,8.5),
(6,2025,10,4,2,4,11,13,2,42.0,7.8,3.2,10.0),
(7,2023,11,4,3,4,12,12,3,43.0,8.5,3.2,10.5),
(7,2024,11,4,2,5,11,13,2,43.0,8.5,3.2,10.0),
(7,2025,10,4,3,3,12,12,3,44.0,8.8,3.4,11.0),
(8,2023,11,6,3,2,19,11,5,50.0,12.0,4.8,17.5),
(8,2024,12,6,3,3,20,13,4,50.0,12.0,4.8,18.5),
(8,2025,9,5,2,2,15,10,3,50.5,12.2,4.9,14.0),
(9,2023,12,8,2,2,28,10,6,58.5,15.2,6.1,25.8),
(9,2024,14,10,3,1,35,12,7,60.1,16.5,7.0,33.2),
(9,2025,10,7,2,1,22,8,5,59.8,15.8,6.5,20.5),
(10,2023,8,5,2,1,15,6,4,48.5,11.5,4.5,13.2),
(10,2024,11,7,2,2,21,9,4,49.2,12.0,4.8,19.5),
(10,2025,7,4,2,1,13,6,3,48.8,11.8,4.6,11.8),
(11,2023,9,2,2,5,6,12,1,38.0,6.0,2.2,5.5),
(11,2024,10,2,2,6,6,13,1,38.0,6.0,2.2,5.8),
(11,2025,9,3,2,4,7,12,1,39.0,6.5,2.4,6.5),
(12,2023,11,5,3,3,15,11,3,46.0,10.0,4.0,14.0),
(12,2024,12,5,3,4,16,12,2,47.0,10.5,4.0,14.5),
(12,2025,9,5,2,2,13,9,3,47.5,10.8,4.2,12.0),
(13,2023,12,7,3,2,23,12,5,52.3,13.2,5.5,21.2),
(13,2024,14,8,4,2,27,14,4,53.1,13.5,5.7,25.1),
(13,2025,10,6,2,2,19,10,3,52.8,13.0,5.4,17.5),
(14,2023,10,4,3,3,12,11,2,45.0,9.5,3.8,11.5),
(14,2024,11,4,4,3,13,12,2,45.0,9.5,3.8,12.0),
(14,2025,9,4,3,2,12,10,2,46.0,9.8,4.0,11.0),
(15,2023,11,5,3,3,15,12,3,46.0,10.0,4.0,14.0),
(15,2024,12,5,3,4,16,12,3,47.0,10.5,4.0,14.5),
(15,2025,9,5,2,2,13,10,3,47.5,10.8,4.1,12.0),
(16,2023,11,6,3,2,18,11,4,48.5,11.5,4.5,16.5),
(16,2024,12,6,3,3,19,13,3,49.0,11.8,4.6,17.8),
(16,2025,9,5,2,2,14,10,3,49.5,12.0,4.8,13.0),
(17,2023,11,7,3,1,25,11,5,57.2,15.0,6.2,23.5),
(17,2024,14,9,3,2,33,16,5,58.1,15.7,6.7,31.1),
(17,2025,9,6,2,1,20,9,4,57.5,15.2,6.4,18.7),
(18,2023,9,1,2,6,5,14,0,36.0,5.5,2.0,4.5),
(18,2024,10,1,2,7,5,15,0,36.0,5.5,2.0,4.8),
(18,2025,9,2,2,5,6,13,1,37.0,5.8,2.2,5.5),
(19,2023,8,4,2,2,14,8,3,49.5,11.2,4.5,12.5),
(19,2024,11,6,3,2,22,10,4,50.2,11.8,4.8,20.3),
(19,2025,7,4,2,1,12,6,3,49.8,11.5,4.6,11.2),
(20,2023,11,6,3,2,21,12,4,53.2,12.8,5.2,19.5),
(20,2024,13,8,3,2,29,13,5,54.0,13.5,5.6,27.1),
(20,2025,8,5,2,1,16,7,3,53.5,13.0,5.3,14.8),
(21,2023,10,7,2,1,22,9,5,56.5,14.5,5.9,20.8),
(21,2024,13,9,2,2,30,12,6,57.2,15.1,6.3,28.5),
(21,2025,8,6,1,1,18,6,4,56.8,14.8,6.0,16.9),
(22,2023,10,6,2,2,17,9,4,48.0,11.8,4.6,15.6),
(22,2024,12,7,3,2,21,11,4,48.5,12.0,4.9,19.2),
(22,2025,8,5,2,1,14,7,3,48.2,11.9,4.7,12.8),
(23,2023,11,5,3,3,16,12,3,46.5,10.5,4.0,15.0),
(23,2024,12,5,3,4,17,13,2,47.0,10.8,4.2,15.5),
(23,2025,9,5,2,2,14,10,3,47.5,11.0,4.3,13.0),
(24,2023,10,4,3,3,11,11,2,44.0,9.0,3.5,10.5),
(24,2024,11,4,3,4,12,12,2,44.0,9.0,3.5,11.0),
(24,2025,9,4,3,2,12,10,2,45.0,9.2,3.6,11.0),
(25,2023,10,6,3,1,20,9,5,53.5,13.8,5.5,18.9),
(25,2024,12,7,3,2,26,13,4,54.2,14.2,5.8,24.3),
(25,2025,8,5,2,1,17,7,3,54.0,13.9,5.6,15.8),
(26,2023,11,5,3,3,14,12,3,46.0,10.0,4.0,13.5),
(26,2024,12,5,3,4,15,13,3,46.5,10.5,4.0,14.0),
(26,2025,9,5,2,2,13,10,3,47.0,10.8,4.2,12.0),
(27,2023,11,5,4,2,15,12,3,47.0,10.5,4.2,14.0),
(27,2024,12,5,4,3,16,13,3,47.0,10.5,4.2,15.0),
(27,2025,9,5,3,1,14,10,3,48.0,10.8,4.3,13.0),
(28,2023,9,1,2,6,5,16,0,35.0,5.0,1.8,4.0),
(28,2024,10,1,1,8,4,18,0,35.0,5.0,1.8,4.0),
(28,2025,9,2,2,5,6,14,1,36.0,5.5,2.0,5.0),
(29,2023,12,9,2,1,32,9,8,62.1,17.2,7.5,30.1),
(29,2024,16,13,2,1,45,13,9,63.5,18.0,8.1,43.2),
(29,2025,10,8,1,1,27,7,6,62.8,17.5,7.8,25.3),
(30,2023,9,3,2,4,8,11,1,40.0,7.0,2.8,7.0),
(30,2024,10,3,2,5,8,12,1,40.0,7.0,2.8,7.5),
(30,2025,9,4,2,3,9,11,2,41.0,7.2,3.0,8.0),
(31,2023,10,4,2,4,10,12,2,42.0,8.0,3.0,9.0),
(31,2024,11,4,2,5,10,13,2,42.0,8.0,3.0,9.5),
(31,2025,9,5,2,2,12,10,3,43.0,8.5,3.3,11.0),
(32,2023,10,6,2,2,19,10,4,51.2,12.8,5.2,17.5),
(32,2024,14,9,3,2,28,13,5,52.0,13.3,5.6,26.1),
(32,2025,8,5,2,1,15,7,3,51.8,13.0,5.3,13.8),
(33,2023,12,9,2,1,30,8,7,56.2,14.8,6.5,28.3),
(33,2024,15,11,3,1,42,14,6,58.0,16.2,7.2,39.1),
(33,2025,10,8,1,1,25,7,5,57.3,15.5,6.8,23.4),
(34,2023,9,6,2,1,18,7,5,52.3,12.5,5.0,16.8),
(34,2024,12,8,2,2,25,10,5,53.0,13.0,5.2,23.5),
(34,2025,8,5,2,1,15,6,4,52.5,12.5,5.0,14.2),
(35,2023,9,3,2,4,7,12,1,40.0,7.0,2.8,6.5),
(35,2024,10,3,2,5,8,13,1,40.0,7.0,2.8,7.5),
(35,2025,9,4,2,3,9,12,2,41.0,7.2,3.0,8.0),
(36,2023,11,5,3,3,17,13,3,47.5,10.8,4.2,15.5),
(36,2024,12,5,3,4,18,14,3,48.0,11.0,4.3,16.5),
(36,2025,9,5,2,2,14,10,3,48.5,11.2,4.5,13.0),
(37,2023,10,7,2,1,22,8,5,55.1,14.2,5.8,20.1),
(37,2024,16,12,2,2,38,14,7,56.8,15.0,6.3,35.5),
(37,2025,9,6,2,1,19,7,4,55.9,14.5,6.0,17.8),
(38,2023,11,5,3,3,15,11,3,47.0,10.5,4.0,14.0),
(38,2024,12,5,4,3,16,12,3,47.5,10.8,4.0,14.8),
(38,2025,9,5,2,2,13,10,3,48.0,11.0,4.2,12.0),
(39,2023,11,6,2,3,17,11,3,48.0,11.0,4.2,16.0),
(39,2024,12,6,2,4,18,12,3,48.5,11.5,4.5,16.9),
(39,2025,9,5,2,2,14,10,3,49.0,11.8,4.6,13.0),
(40,2023,9,3,2,4,7,12,1,40.0,7.0,2.8,6.5),
(40,2024,10,3,2,5,8,13,1,40.0,7.0,2.8,7.5),
(40,2025,9,4,2,3,9,12,2,41.0,7.2,3.0,8.0),
(41,2023,11,8,2,1,27,10,6,55.8,14.9,6.3,25.6),
(41,2024,14,10,2,2,35,14,5,56.3,15.3,6.5,33.1),
(41,2025,9,7,1,1,22,7,5,55.9,15.0,6.2,20.4),
(42,2023,10,4,3,3,11,11,2,44.0,9.0,3.5,10.5),
(42,2024,11,4,3,4,12,12,2,44.0,9.0,3.5,11.0),
(42,2025,9,5,2,2,13,10,2,45.0,9.5,3.8,11.5),
(43,2023,9,3,2,4,7,11,1,39.0,6.8,2.5,6.5),
(43,2024,10,3,2,5,7,12,1,39.0,6.8,2.5,6.8),
(43,2025,9,4,2,3,9,11,2,40.0,7.0,2.8,8.0),
(44,2023,11,7,3,1,24,11,5,56.3,14.2,6.2,22.5),
(44,2024,15,10,3,2,38,15,6,57.1,15.0,6.8,35.8),
(44,2025,9,6,2,1,20,8,4,56.5,14.5,6.3,18.2),
(45,2023,12,8,3,1,26,10,6,54.2,14.5,6.0,24.5),
(45,2024,17,12,4,1,41,15,7,55.8,15.3,6.9,38.2),
(45,2025,10,7,2,1,21,8,5,55.0,14.8,6.2,19.6),
(46,2023,10,6,3,1,18,9,4,52.8,12.5,5.0,16.8),
(46,2024,13,8,3,2,24,12,5,53.5,13.0,5.3,22.5),
(46,2025,8,5,2,1,16,7,3,53.0,12.8,5.1,14.9),
(47,2023,10,4,3,3,12,12,2,44.0,9.0,3.5,11.0),
(47,2024,11,4,3,4,13,13,2,44.5,9.2,3.6,11.5),
(47,2025,9,5,2,2,12,10,2,45.0,9.5,3.8,11.0),
(48,2023,9,4,2,3,12,11,2,45.0,9.5,3.8,11.2),
(48,2024,11,5,3,3,15,12,3,45.5,9.8,4.0,14.1),
(48,2025,8,4,2,2,11,8,2,45.2,9.6,3.9,10.5);

TRUNCATE TABLE worldcup_2026_fixtures;
-- 小组 A
INSERT INTO worldcup_2026_fixtures (stage, team1_id, team2_id, match_order, group_char) VALUES
('Group', 1, 2, 1, 'A'), ('Group', 3, 4, 2, 'A'),
('Group', 2, 4, 3, 'A'), ('Group', 1, 3, 4, 'A'),
('Group', 4, 1, 5, 'A'), ('Group', 2, 3, 6, 'A');
-- 小组 B
INSERT INTO worldcup_2026_fixtures (stage, team1_id, team2_id, match_order, group_char) VALUES
('Group', 5, 6, 7, 'B'), ('Group', 7, 8, 8, 'B'),
('Group', 8, 6, 9, 'B'), ('Group', 5, 7, 10, 'B'),
('Group', 6, 5, 11, 'B'), ('Group', 8, 7, 12, 'B');
-- 小组 C
INSERT INTO worldcup_2026_fixtures (stage, team1_id, team2_id, match_order, group_char) VALUES
('Group', 9, 10, 13, 'C'), ('Group', 11, 12, 14, 'C'),
('Group', 12, 10, 15, 'C'), ('Group', 9, 11, 16, 'C'),
('Group', 10, 9, 17, 'C'), ('Group', 12, 11, 18, 'C');
-- 小组 D
INSERT INTO worldcup_2026_fixtures (stage, team1_id, team2_id, match_order, group_char) VALUES
('Group', 13, 14, 19, 'D'), ('Group', 15, 16, 20, 'D'),
('Group', 14, 16, 21, 'D'), ('Group', 13, 15, 22, 'D'),
('Group', 16, 13, 23, 'D'), ('Group', 14, 15, 24, 'D');
-- 小组 E
INSERT INTO worldcup_2026_fixtures (stage, team1_id, team2_id, match_order, group_char) VALUES
('Group', 17, 18, 25, 'E'), ('Group', 19, 20, 26, 'E'),
('Group', 20, 18, 27, 'E'), ('Group', 17, 19, 28, 'E'),
('Group', 18, 17, 29, 'E'), ('Group', 20, 19, 30, 'E');
-- 小组 F
INSERT INTO worldcup_2026_fixtures (stage, team1_id, team2_id, match_order, group_char) VALUES
('Group', 21, 22, 31, 'F'), ('Group', 23, 24, 32, 'F'),
('Group', 24, 22, 33, 'F'), ('Group', 21, 23, 34, 'F'),
('Group', 22, 21, 35, 'F'), ('Group', 24, 23, 36, 'F');
-- 小组 G
INSERT INTO worldcup_2026_fixtures (stage, team1_id, team2_id, match_order, group_char) VALUES
('Group', 25, 26, 37, 'G'), ('Group', 27, 28, 38, 'G'),
('Group', 28, 26, 39, 'G'), ('Group', 25, 27, 40, 'G'),
('Group', 26, 25, 41, 'G'), ('Group', 28, 27, 42, 'G');
-- 小组 H
INSERT INTO worldcup_2026_fixtures (stage, team1_id, team2_id, match_order, group_char) VALUES
('Group', 29, 30, 43, 'H'), ('Group', 31, 32, 44, 'H'),
('Group', 32, 30, 45, 'H'), ('Group', 29, 31, 46, 'H'),
('Group', 30, 29, 47, 'H'), ('Group', 32, 31, 48, 'H');
-- 小组 I
INSERT INTO worldcup_2026_fixtures (stage, team1_id, team2_id, match_order, group_char) VALUES
('Group', 33, 34, 49, 'I'), ('Group', 35, 36, 50, 'I'),
('Group', 36, 34, 51, 'I'), ('Group', 33, 35, 52, 'I'),
('Group', 34, 33, 53, 'I'), ('Group', 36, 35, 54, 'I');
-- 小组 J
INSERT INTO worldcup_2026_fixtures (stage, team1_id, team2_id, match_order, group_char) VALUES
('Group', 37, 38, 55, 'J'), ('Group', 39, 40, 56, 'J'),
('Group', 40, 38, 57, 'J'), ('Group', 37, 39, 58, 'J'),
('Group', 38, 37, 59, 'J'), ('Group', 40, 39, 60, 'J');
-- 小组 K
INSERT INTO worldcup_2026_fixtures (stage, team1_id, team2_id, match_order, group_char) VALUES
('Group', 41, 42, 61, 'K'), ('Group', 43, 44, 62, 'K'),
('Group', 44, 42, 63, 'K'), ('Group', 41, 43, 64, 'K'),
('Group', 42, 41, 65, 'K'), ('Group', 44, 43, 66, 'K');
-- 小组 L
INSERT INTO worldcup_2026_fixtures (stage, team1_id, team2_id, match_order, group_char) VALUES
('Group', 45, 46, 67, 'L'), ('Group', 47, 48, 68, 'L'),
('Group', 48, 46, 69, 'L'), ('Group', 45, 47, 70, 'L'),
('Group', 46, 45, 71, 'L'), ('Group', 48, 47, 72, 'L');
11 个赞

四、编写存储过程

在完成数据建模与导入之后,我面临的核心挑战是:如何在不依赖外部程序的情况下,在OceanBase内部完成上万次蒙特卡洛模拟?答案是充分利用OceanBase对存储过程、循环控制、随机函数以及临时表的强大支持。

我设计了一个主存储过程sp_simulate_worldcup,它接受两个参数:模拟次数num_sims和输出结果表名。整个模拟过程被封装在数据库内,无需导出数据到Python或R,最大程度减少了网络传输和上下文切换的开销。存储过程的内部逻辑可以划分为四大模块:

1.参数计算与预处理
首先,从team_season_stats表中计算全局平均进球数(avg_goals_total)以及每支球队的进攻系数(attack=球队场均进球/avg_goals_total)和防守系数(defence=球队场均失球/avg_goals_total)。这些系数会被存入临时表team_params中,供每场比赛调用。同时,确定东道主集合(host_ids),用于主场优势判定。

2.单次模拟的完整流程
每一次模拟都遵循以下子步骤:

  • 小组赛模拟:遍历worldcup_2026_fixtures表中stage='Group’的72场比赛。对于每场比赛,根据team1_id是否属于东道主决定是否乘以主场优势系数(1.25),利用泊松分布生成随机主客队进球数,进而确定胜、平、负。实时更新每组球队的积分、净胜球和总进球数。
  • 确定晋级32强:按照积分>净胜球>进球数的规则对每个小组排序,选出前两名;再汇总所有小组第三名,按相同规则排序,取前8名作为成绩最好的小组第三。最终形成32强名单,并记录其Elo评分和所在小组排名。
  • 淘汰赛模拟:依据2026年世界杯官方淘汰赛对阵表(预定义在另一张配置表中),将32支球队依次配对。每轮淘汰赛均使用与小组赛相同的泊松进球模型(东道主优势仅当对阵表中team1_id为东道主且比赛尚未安排在中立场地时生效)。每场比赛产生胜者后,动态生成下一轮的对阵,直至决出冠军。
  • 记录本次冠军:将本次模拟的冠军球队ID写入一个临时结果集。

3.蒙特卡洛循环
利用WHILE循环重复执行上述单次模拟num_sims次。每次模拟前会清理上一次的中间数据,确保各次模拟之间相互独立。为了加速,我尽量使用临时表而非永久表,并避免在循环内进行不必要的提交操作。

4.结果聚合与输出
循环结束后,对临时结果集中的冠军记录按球队分组计数,除以num_sims得到每支球队的夺冠概率。最终将结果写入用户指定的输出表中。

生成存储过程前,还需要一个辅助的UDF,生成泊松分布

DELIMITER $$

CREATE FUNCTION fn_poisson(lambda DECIMAL(10,4))
RETURNS INT
DETERMINISTIC
BEGIN
    DECLARE L DECIMAL(20,10);
    DECLARE k INT DEFAULT 0;
    DECLARE p DECIMAL(20,10) DEFAULT 1.0;
    DECLARE u DECIMAL(20,10);
    
    SET L = EXP(-lambda);
    SET u = RAND();
    
    WHILE u > L DO
        SET k = k + 1;
        SET p = p * lambda / k;
        SET L = L + p;
    END WHILE;
    
    RETURN k;
END$$

DELIMITER ;

然后就是模特卡洛存储过程:

DELIMITER //

DROP PROCEDURE IF EXISTS sp_simulate_worldcup //

CREATE PROCEDURE sp_simulate_worldcup(
    IN num_sims INT,
    IN result_table VARCHAR(64)
)
BEGIN
    DECLARE sim_counter INT DEFAULT 0;
    DECLARE avg_goals_total DECIMAL(10,4);
    DECLARE home_advantage DECIMAL(5,2) DEFAULT 1.25;
    DECLARE host_ids VARCHAR(100) DEFAULT '1,5,13';
    DECLARE done INT DEFAULT 0;

    DECLARE v_fixture_id INT;
    DECLARE v_team1 INT;
    DECLARE v_team2 INT;
    DECLARE v_att1 DECIMAL(10,4);
    DECLARE v_def1 DECIMAL(10,4);
    DECLARE v_att2 DECIMAL(10,4);
    DECLARE v_def2 DECIMAL(10,4);
    DECLARE v_home_boost DECIMAL(5,2);
    DECLARE v_lambda1 DECIMAL(10,4);
    DECLARE v_lambda2 DECIMAL(10,4);
    DECLARE v_home_goals INT;
    DECLARE v_away_goals INT;
    DECLARE v_champion INT;

    DECLARE fixture_cursor CURSOR FOR
        SELECT fixture_id, team1_id, team2_id
        FROM worldcup_2026_fixtures 
        WHERE stage = 'Group'
        ORDER BY fixture_id;

    DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

    -- 计算联赛平均进球
    SELECT AVG(goals_for / matches_played) INTO avg_goals_total
    FROM team_season_stats
    WHERE season_year = 2025;

    -- 预计算球队系数(如果表为空才插入)
    IF (SELECT COUNT(*) FROM team_params) = 0 THEN
        INSERT INTO team_params
        SELECT
            t.team_id,
            (t.goals_for / t.matches_played) / avg_goals_total,
            (t.goals_against / t.matches_played) / avg_goals_total,
            te.elo_rating
        FROM team_season_stats t
        JOIN teams te ON t.team_id = te.team_id
        WHERE t.season_year = 2025;
    END IF;

    TRUNCATE TABLE tmp_champions;

    WHILE sim_counter < num_sims DO
        SET sim_counter = sim_counter + 1;
        SET done = 0;

        TRUNCATE TABLE group_stats;
        TRUNCATE TABLE group_ranking;
        TRUNCATE TABLE round32;
        TRUNCATE TABLE knockout_results;

        -- 初始化小组赛队伍
        INSERT INTO group_stats (team_id, group_char)
        SELECT team_id, group_char
        FROM (
            SELECT team1_id AS team_id, group_char
            FROM worldcup_2026_fixtures WHERE stage = 'Group'
            UNION
            SELECT team2_id AS team_id, group_char
            FROM worldcup_2026_fixtures WHERE stage = 'Group'
        ) t
        GROUP BY team_id;

        -- 游标模拟小组赛
        OPEN fixture_cursor;
        read_loop: LOOP
            FETCH fixture_cursor INTO v_fixture_id, v_team1, v_team2;
            IF done THEN
                LEAVE read_loop;
            END IF;

            SELECT attack, defence INTO v_att1, v_def1 FROM team_params WHERE team_id = v_team1;
            SELECT attack, defence INTO v_att2, v_def2 FROM team_params WHERE team_id = v_team2;

            SET v_home_boost = IF(FIND_IN_SET(v_team1, host_ids) > 0, home_advantage, 1.0);
            SET v_lambda1 = avg_goals_total * v_att1 * v_def2 * v_home_boost;
            SET v_lambda2 = avg_goals_total * v_att2 * v_def1;

            SET v_home_goals = fn_poisson(v_lambda1);
            SET v_away_goals = fn_poisson(v_lambda2);

            UPDATE group_stats SET
                goals_for = goals_for + v_home_goals,
                goal_diff = goal_diff + (v_home_goals - v_away_goals),
                points = points + CASE 
                    WHEN v_home_goals > v_away_goals THEN 3
                    WHEN v_home_goals = v_away_goals THEN 1 
                    ELSE 0 
                END
            WHERE team_id = v_team1;

            UPDATE group_stats SET
                goals_for = goals_for + v_away_goals,
                goal_diff = goal_diff + (v_away_goals - v_home_goals),
                points = points + CASE 
                    WHEN v_away_goals > v_home_goals THEN 3
                    WHEN v_away_goals = v_home_goals THEN 1 
                    ELSE 0 
                END
            WHERE team_id = v_team2;

        END LOOP;
        CLOSE fixture_cursor;

        -- 小组排名
        INSERT INTO group_ranking (team_id, group_char, points, goal_diff, goals_for, rank_in_group)
        SELECT team_id, group_char, points, goal_diff, goals_for, rnk
        FROM (
            SELECT team_id, group_char, points, goal_diff, goals_for,
                   ROW_NUMBER() OVER (PARTITION BY group_char ORDER BY points DESC, goal_diff DESC, goals_for DESC) AS rnk
            FROM group_stats
        ) ranked;

        -- 32强:小组前2名
        INSERT INTO round32 (team_id, group_char, group_rank, points, goal_diff, goals_for)
        SELECT team_id, group_char, rank_in_group, points, goal_diff, goals_for
        FROM group_ranking
        WHERE rank_in_group IN (1, 2);

        -- 成绩最好的8个小组第三
        INSERT INTO round32 (team_id, group_char, group_rank, points, goal_diff, goals_for)
        SELECT team_id, group_char, rank_in_group, points, goal_diff, goals_for
        FROM group_ranking
        WHERE rank_in_group = 3
        ORDER BY points DESC, goal_diff DESC, goals_for DESC
        LIMIT 8;

        -- 32强赛对阵(按小组字母和排名排序后两两配对)
        INSERT INTO knockout_results (team1_id, team2_id, round_name)
        SELECT team_id, next_team, 'Round of 32'
        FROM (
            SELECT team_id, 
                   LEAD(team_id) OVER (ORDER BY group_char, group_rank) AS next_team,
                   ROW_NUMBER() OVER (ORDER BY group_char, group_rank) AS seq
            FROM round32
        ) tmp
        WHERE seq % 2 = 1
        LIMIT 16;

        -- 模拟32强赛
        UPDATE knockout_results k
        JOIN team_params tp1 ON k.team1_id = tp1.team_id
        JOIN team_params tp2 ON k.team2_id = tp2.team_id
        SET k.winner_id = IF(
            RAND() < 1/(1+POW(10, (tp2.elo_rating - tp1.elo_rating)/400)),
            k.team1_id, k.team2_id
        )
        WHERE k.round_name = 'Round of 32' AND k.winner_id IS NULL;

        -- 16强赛
        INSERT INTO knockout_results (team1_id, team2_id, round_name)
        SELECT MAX(CASE WHEN pos%2=1 THEN winner_id END),
               MAX(CASE WHEN pos%2=0 THEN winner_id END),
               'Round of 16'
        FROM (
            SELECT winner_id, ROW_NUMBER() OVER (ORDER BY match_id) AS pos
            FROM knockout_results WHERE round_name = 'Round of 32'
        ) t
        GROUP BY FLOOR((pos+1)/2);

        UPDATE knockout_results k
        JOIN team_params tp1 ON k.team1_id = tp1.team_id
        JOIN team_params tp2 ON k.team2_id = tp2.team_id
        SET k.winner_id = IF(
            RAND() < 1/(1+POW(10, (tp2.elo_rating - tp1.elo_rating)/400)),
            k.team1_id, k.team2_id
        )
        WHERE k.round_name = 'Round of 16' AND k.winner_id IS NULL;

        -- 8强赛
        INSERT INTO knockout_results (team1_id, team2_id, round_name)
        SELECT MAX(CASE WHEN pos%2=1 THEN winner_id END),
               MAX(CASE WHEN pos%2=0 THEN winner_id END),
               'Quarter-Final'
        FROM (
            SELECT winner_id, ROW_NUMBER() OVER (ORDER BY match_id) AS pos
            FROM knockout_results WHERE round_name = 'Round of 16'
        ) t
        GROUP BY FLOOR((pos+1)/2);

        UPDATE knockout_results k
        JOIN team_params tp1 ON k.team1_id = tp1.team_id
        JOIN team_params tp2 ON k.team2_id = tp2.team_id
        SET k.winner_id = IF(
            RAND() < 1/(1+POW(10, (tp2.elo_rating - tp1.elo_rating)/400)),
            k.team1_id, k.team2_id
        )
        WHERE k.round_name = 'Quarter-Final' AND k.winner_id IS NULL;

        -- 半决赛
        INSERT INTO knockout_results (team1_id, team2_id, round_name)
        SELECT MAX(CASE WHEN pos%2=1 THEN winner_id END),
               MAX(CASE WHEN pos%2=0 THEN winner_id END),
               'Semi-Final'
        FROM (
            SELECT winner_id, ROW_NUMBER() OVER (ORDER BY match_id) AS pos
            FROM knockout_results WHERE round_name = 'Quarter-Final'
        ) t
        GROUP BY FLOOR((pos+1)/2);

        UPDATE knockout_results k
        JOIN team_params tp1 ON k.team1_id = tp1.team_id
        JOIN team_params tp2 ON k.team2_id = tp2.team_id
        SET k.winner_id = IF(
            RAND() < 1/(1+POW(10, (tp2.elo_rating - tp1.elo_rating)/400)),
            k.team1_id, k.team2_id
        )
        WHERE k.round_name = 'Semi-Final' AND k.winner_id IS NULL;

        -- 决赛
        INSERT INTO knockout_results (team1_id, team2_id, round_name)
        SELECT MIN(winner_id), MAX(winner_id), 'Final'
        FROM (
            SELECT winner_id, ROW_NUMBER() OVER (ORDER BY match_id) AS pos
            FROM knockout_results WHERE round_name = 'Semi-Final'
        ) t;

        UPDATE knockout_results k
        JOIN team_params tp1 ON k.team1_id = tp1.team_id
        JOIN team_params tp2 ON k.team2_id = tp2.team_id
        SET k.winner_id = IF(
            RAND() < 1/(1+POW(10, (tp2.elo_rating - tp1.elo_rating)/400)),
            k.team1_id, k.team2_id
        )
        WHERE k.round_name = 'Final' AND k.winner_id IS NULL;

        SELECT winner_id INTO v_champion FROM knockout_results WHERE round_name = 'Final' LIMIT 1;
        INSERT INTO tmp_champions VALUES (sim_counter, v_champion);

    END WHILE;

    -- 输出结果表
    SET @sql = CONCAT('DROP TABLE IF EXISTS ', result_table);
    PREPARE stmt FROM @sql;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;

    SET @sql = CONCAT('CREATE TABLE ', result_table, ' (team_id INT PRIMARY KEY, team_name VARCHAR(100), champion_prob DECIMAL(10,4), simulations INT)');
    PREPARE stmt FROM @sql;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;

    SET @sql = CONCAT('INSERT INTO ', result_table, '
        SELECT team_id, team_name, prob, ', num_sims, ' FROM (
            SELECT t.team_id, t.full_name AS team_name, ROUND(COUNT(*)/', num_sims, ',4) AS prob
            FROM tmp_champions c
            JOIN teams t ON c.champion_id = t.team_id
            GROUP BY t.team_id, t.full_name
        ) tmp ORDER BY prob DESC');
    PREPARE stmt FROM @sql;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;

END //

DELIMITER ;

五、模拟结果

我在OceanBase中运行了10000次完整的蒙特卡洛模拟,覆盖48支球队、72场小组赛以及完整的31场淘汰赛。我们先预测100次看看结果是否合理(我用的租户是1GB内存性能,执行时间仅供参考

obclient(root@fifa2026)[world_cup]> call sp_simulate_worldcup(100, 'champion_probs_test');
Query OK, 0 rows affected (34.557 sec)

能够看到的是,整体结果还算合理,哥伦比亚作为黑马,其他排名靠前的都是传统强队。接下来模拟1000次再看看:

obclient(root@fifa2026)[world_cup]> CALL sp_simulate_worldcup(1000, 'champion_probs_1k');
Query OK, 0 rows affected (5 min 44.264 sec)

obclient(root@fifa2026)[world_cup]> select * from champion_probs_1k order by  champion_prob desc;
+---------+----------------+---------------+-------------+
| team_id | team_name      | champion_prob | simulations |
+---------+----------------+---------------+-------------+
|      29 | Spain          |        0.1830 |        1000 |
|       9 | Brazil         |        0.1340 |        1000 |
|      37 | Argentina      |        0.1130 |        1000 |
|      33 | France         |        0.0980 |        1000 |
|      44 | Germany        |        0.0560 |        1000 |
|      41 | Portugal       |        0.0520 |        1000 |
|      45 | England        |        0.0500 |        1000 |
|      21 | Netherlands    |        0.0500 |        1000 |
|       1 | Mexico         |        0.0480 |        1000 |
|      17 | Colombia       |        0.0390 |        1000 |
|      22 | Japan          |        0.0300 |        1000 |
|      20 | Ecuador        |        0.0260 |        1000 |
|      25 | Belgium        |        0.0210 |        1000 |
|      32 | Uruguay        |        0.0180 |        1000 |
|      10 | Morocco        |        0.0180 |        1000 |
|      13 | United States  |        0.0120 |        1000 |
|      46 | Croatia        |        0.0090 |        1000 |
|       3 | Korea Republic |        0.0090 |        1000 |
|       8 | Switzerland    |        0.0080 |        1000 |
|       5 | Canada         |        0.0080 |        1000 |
|      16 | Turkey         |        0.0040 |        1000 |
|      34 | Senegal        |        0.0040 |        1000 |
|      15 | Australia      |        0.0020 |        1000 |
|      36 | Norway         |        0.0020 |        1000 |
|      14 | Paraguay       |        0.0020 |        1000 |
|      39 | Austria        |        0.0020 |        1000 |
|      23 | Sweden         |        0.0010 |        1000 |
|       4 | Czech Republic |        0.0010 |        1000 |
+---------+----------------+---------------+-------------+
28 rows in set (0.003 sec)

也就是说,西班牙夺冠概率18.3%排名第一,巴西13.4%排名第二,卫冕冠军阿根廷11.3%排名第三,法国9.8%排名第四。

至于是否预测的准确,等待现实中的结果,预测错了可别找我。无论现实中还是互联网,我都坚决反对非法赌球。

9 个赞

真是脑洞大开的玩法

5 个赞

666

4 个赞

ijhgf

4 个赞

hjbcxfy

4 个赞

vnmoiy

6 个赞

学到了

6 个赞

预测到了吗

7 个赞

谁试冠军

7 个赞

很好!!!

5 个赞

想法不错,值得赞。

7 个赞

有时间了好好研究研究。写的这第详细 。

9 个赞

预测谁是冠军啊

6 个赞

学以致用

6 个赞

平台期间谍战剧

5 个赞

虚拟现实生活

4 个赞

夸我呢吗丁啉

5 个赞

西班牙、?

4 个赞