World Cup 2026 Prediction Pool

Run a World Cup prediction pool for your team, company, or friends — deployed in minutes on Databricks.

Quick Deploy | Features | Scoring | Configuration | Development

Predict scorelines for all 104 World Cup matches, pick the tournament champion and top goal scorers, and compete on a live leaderboard against your colleagues. Match results sync automatically — just deploy and invite your team.

Built as a Databricks App powered by Lakebase (managed PostgreSQL) and deployed via Databricks Asset Bundles.

Predict scores for every match — group stage through the final

Pick the tournament winner and top 3 goal scorers before kickoff

Live leaderboard — track rankings across 150+ participants

Features

104 match predictions — fill in scorelines for every group and knockout match. Knockout brackets update automatically based on your predictions.
Tournament picks — choose the champion and up to 3 top goal scorers before the tournament starts.
Live leaderboard — real-time rankings with detailed points breakdown (outcome, exact score, scorer goals, advancer bonus).
Automatic sync — match fixtures and live scores pulled from football-data.org every 5 minutes.
Custom branding — upload your company logo and set a pool name from the admin panel.
Player profiles — display name, nationality, and profile picture for each participant.
Scales to hundreds of users — multi-worker FastAPI with connection pooling and batch operations.
One-click deploy — Databricks Asset Bundle handles everything: app, sync job, and optional AI/BI dashboard.

How scoring works

Points are awarded per match after results are synced. Knockout rounds are worth more.

Round	Multiplier	Outcome	Exact score	Scorer	Advancer
Group	x1	2	5	2	—
R32	x1.5	3	8	3	5
R16	x2	4	10	4	6
QF	x2.5	5	13	5	8
SF / 3rd / Final	x3	6	15	6	9

Tournament winner: +25 pts when the final result matches your champion pick.

Quick deploy

Prerequisites: A Databricks workspace with Lakebase enabled, Databricks CLI (v0.285+), Node.js 18+, psql and jq, and an API token from football-data.org — see API token tier below for which plan to pick.

# 1. Clone
git clone https://github.com/onno101/worldcup-pool.git && cd worldcup-pool

# 2. Store your football-data.org API token as a Databricks secret
#    (used by the 5-minute scheduled sync job)
databricks secrets create-scope worldcup_pool
databricks secrets put-secret worldcup_pool football_data_token --string-value "YOUR_TOKEN"

# 3. Deploy — pass your email as admin so you can manage the pool
databricks bundle deploy -t dev --var admin_emails=you@company.com

# 4. Grant the app's service principal access to Lakebase (one-time, idempotent)
./scripts/bootstrap_lakebase_app.sh dev

# 5. Start the app
databricks bundle run worldcup_pool_app -t dev

Step 3 — admin emails must be set at deploy time. --var admin_emails writes ADMIN_EMAILS into the app environment so you can access admin endpoints (config, logo upload, manual sync). You can re-run bundle deploy with a different value any time; the script in the next step is idempotent so it won't redo work.

For multiple admins, set the value in databricks.yml instead of via --var. The Databricks CLI doesn't reliably forward comma-separated values through the --var flag, so passing --var admin_emails=alice@x.com,bob@x.com ends up with only the first address taking effect. Open databricks.yml, find your target (e.g. dev) under targets:, and add an admin_emails entry to its variables: block:
targets:
  dev:
    default: true
    mode: development
    variables:
      admin_emails: "alice@company.com,bob@company.com"
Then deploy without --var admin_emails: databricks bundle deploy -t dev. Commit this change to your fork so future deploys keep the same admin list.

Why step 4? The Databricks App runs as its own service principal. Lakebase requires that SP to be registered as a Postgres OAuth role and granted schema permissions before the app can create its tables. The bootstrap script reads the SP from the deployed app and provisions everything via the Lakebase API.

Why step 6 is a UI step. The scheduled sync job reads FOOTBALL_DATA_TOKEN from your secret scope (step 2), so matches will populate within ~5 minutes regardless. Setting the token directly on the app speeds up first sync and lets the in-app admin "Sync now" button work immediately. If you'd rather wait for the cron, you can skip step 6.

The Lakebase endpoint defaults to projects/worldcup-pool/branches/production/endpoints/primary. Override with:

databricks bundle deploy -t dev --var lakebase_endpoint="projects/YOUR_PROJECT/branches/production/endpoints/primary"

Create a Lakebase project first if you don't have one yet: in the Databricks UI, go to Catalog > Lakebase > Create project.

API token tier

The pool uses football-data.org for fixtures, live scores, and goal scorers. Pick your plan based on which features you want:

Plan	Cost	Fixtures & live scores	Goal scorer events
Free Tier	Free	Yes	No — scorer points won't be awarded
Free + Deep Data	€29 / month	Yes	Yes — required for top-scorer scoring
Higher paid tiers	Paid	Yes	Yes

If you want the top scorer picks and the per-match scorer points to actually score, you need the Free + Deep Data add-on (€29/month at the time of writing) or a higher paid tier. The standard free key returns matches and scores but omits the goal-event detail the sync job needs to credit scorer points.

Subscribe to Deep Data from your account page after registering at football-data.org/client/register. Without it, the app still runs — outcome / exact-score / advancer points all work, but every match's scorer column will be 0.

Configuration

Variable	Description	Default
`LAKEBASE_ENDPOINT`	Lakebase Autoscale endpoint resource name	`projects/worldcup-pool/.../primary`
`LAKEBASE_DATABASE`	Postgres database name	`databricks_postgres`
`FOOTBALL_DATA_TOKEN`	API token from football-data.org	(required)
`ADMIN_EMAILS`	Comma-separated admin emails	(empty)
`WEB_CONCURRENCY`	Uvicorn worker processes	`2`
`PREDICTION_LOCK_BEFORE_KICKOFF_HOURS`	Hours before kickoff when predictions close	`1`
`TOURNAMENT_PICKS_LOCK_AT_UTC`	When tournament picks become read-only (ISO8601)	`2026-06-11T18:00:00+00:00`
`INIT_SCHEMA_ON_START`	Run DDL on app cold start	`false`
`AUTO_SYNC_MATCHES_IF_EMPTY`	Auto-sync fixtures when matches table is empty	`true`

See .env.example for a complete template.

Bundle targets

Three targets ship in databricks.yml. Pick the one that matches what you're doing — they differ in which Lakebase branch they point at, whether predictions are still open, and whether schema init / fixture sync run on cold start.

Target	When to use	Tournament lock	Auto-sync fixtures	Init schema on start	Mode
`dev` (default)	Active development, internal testing, your real company pool	`2026-06-11T18:00:00Z` (first kickoff)	`true`	`true`	development
`simulation`	Demoing what a finished pool looks like — 150 simulated users with picks already submitted	`2026-04-01T00:00:00Z` (already past — picks read-only)	`false` (uses pre-seeded data)	`false`	development
`prod`	Production deployment for your live pool	`2026-06-11T18:00:00Z`	`true`	`true`	production (locks workspace path, requires explicit deploy)

dev is the default and the lightweight option. Use it to test UI changes, try out scoring tweaks, run through the prediction flow, and generally play with the app. Deploys with a bare databricks bundle deploy — no flags needed. If you don't plan to fork the code or customize much, dev is perfectly fine as your real, live company pool too — share its URL, invite your team, you're done.

simulation is for showing the pool off before the tournament starts — recruiters, leadership, internal demos. It points at a separate Lakebase branch (branches/simulation) so it doesn't pollute your real pool's data, and tournament picks are already locked, so visitors see a fully populated leaderboard immediately. To populate the simulated users, run scripts/simulate_data.py against the simulation Lakebase after deploy.

prod is mostly a naming convention — it deploys an app called worldcup-pool-prod (separate URL from -dev) and runs under Databricks Asset Bundle production mode. As shipped, that mode mainly affects bundle-side validation (rejects deploys to per-user paths, requires run_as on jobs) — it doesn't restrict who can update the app or lock down the deployment. The real value is just having two side-by-side namespaces: a stable URL you share with your team, and a scratch URL for trying changes without breaking the first one. If you're not iterating on the code, you don't need prod at all — dev is fine as your live pool. If you do want stronger production guardrails (workspace path locking, permissions, service-principal run_as), add them to the prod target yourself.

All three targets accept --var lakebase_endpoint=projects/<your-project>/branches/<your-branch>/endpoints/primary to point at your own Lakebase project. The defaults in databricks.yml use placeholder values that you'll need to override.

Architecture

┌─────────────────────────────────────────────────────┐
│                  Databricks App                     │
│  ┌───────────┐    ┌──────────────────────────────┐  │
│  │ React SPA │───▶│  FastAPI (multi-worker)       │  │
│  │  (Vite)   │    │  - Match predictions API     │  │
│  └───────────┘    │  - Tournament picks API      │  │
│                   │  - Leaderboard (cached)       │  │
│                   │  - Admin (sync, config)       │  │
│                   └──────────┬───────────────────┘  │
│                              │                      │
└──────────────────────────────┼──────────────────────┘
                               │ SQLAlchemy + OAuth
                               ▼
                    ┌──────────────────────┐
                    │  Lakebase (Postgres)  │
                    │  Auto-scaling         │
                    └──────────────────────┘
                               ▲
           ┌───────────────────┘
           │ Scheduled job (5 min)
┌──────────┴──────────┐        ┌─────────────────────┐
│   Match Sync Job    │───────▶│  football-data.org   │
│  (Databricks Job)   │        │  (scores & fixtures) │
└─────────────────────┘        └─────────────────────┘

Local development

uv sync
cp .env.example .env   # fill in your values
export DATABASE_URL_OVERRIDE=postgresql://user:pass@localhost:5432/worldcup
export WORLDCUP_DEV_CORS=1
uv run uvicorn worldcup_pool.backend.app:app --reload --host 127.0.0.1 --port 8000

In another terminal:

cd ui && npm ci && npm run dev

Without a football-data token, call POST /api/dev/seed-demo-matches to populate sample data.

Scaling

Budget roughly WEB_CONCURRENCY x (DB_POOL_SIZE + DB_MAX_OVERFLOW) connections. Keep that under your Lakebase connection limit. Match prediction saves use batch upserts for efficiency.

AI/BI Dashboard (optional)

The bundle includes a Lakebase-to-Delta mirror that powers an AI/BI dashboard — showing that one database drives both the app and the Lakehouse.

databricks bundle deploy -t dev --var dashboard_warehouse_id=YOUR_ID
databricks bundle run worldcup_sync_matches -t dev

This deploys a sync task that mirrors four tables into Unity Catalog Delta, plus a dashboard with KPIs, champion pick distribution, and prediction activity.

Troubleshooting

`npm ci` fails during deploy

The committed ui/package-lock.json resolves packages from registry.npmjs.org (pinned via ui/.npmrc). If your local network can't reach the public npm registry — for example, you're behind a corporate proxy that only allows an internal mirror — the predeploy step will fail.

To regenerate the lockfile against whichever registry your machine can reach:

rm -rf ui/node_modules ui/package-lock.json
cd ui && npm install && cd ..
databricks bundle deploy -t dev --var admin_emails=you@company.com

The regenerated lockfile is local-only; don't commit it back if it now references an internal mirror, since that would break deploys for other people on different networks.

License

Apache 2.0 — see LICENSE.

Name		Name	Last commit message	Last commit date
Latest commit History 32 Commits
docs/screenshots		docs/screenshots
resources		resources
scripts		scripts
ui		ui
worldcup_pool		worldcup_pool
.databricksignore		.databricksignore
.env.example		.env.example
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
app.yaml.template		app.yaml.template
databricks.yml		databricks.yml
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt
uv.lock		uv.lock
wc2026_bracket.json		wc2026_bracket.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

World Cup 2026 Prediction Pool

Features

How scoring works

Quick deploy

API token tier

Configuration

Bundle targets

Architecture

Local development

Scaling

AI/BI Dashboard (optional)

Troubleshooting

`npm ci` fails during deploy

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

World Cup 2026 Prediction Pool

Features

How scoring works

Quick deploy

API token tier

Configuration

Bundle targets

Architecture

Local development

Scaling

AI/BI Dashboard (optional)

Troubleshooting

npm ci fails during deploy

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

`npm ci` fails during deploy

Packages