
perf(workspace): skip checking out oversized submodules during prep #60

Merged: nathanwhit merged 1 commit into main from skip-oversized-submodules on Jun 24, 2026

@nathanwhit (Owner)

The problem (measured)

A fresh deno worker checkout sits in prep ~7 min before its session even flips to running (it shows as queued the whole time). I measured a live deno checkout on Vultr:

path                                  files     size
tests/wpt/suite                       159,392   938 MB
rest of repo (excl. .git/target)       51,078

The WPT submodule is 76% of the working-tree file count. The cost isn't download (objects come from the --reference cache) — it's git materializing 159k files into every fresh worker's working tree. That's the bulk of the 7 minutes, and it repeats on every new worker. It also bloats each checkout by ~1 GB / 160k inodes (feeds the disk pressure too).

The fix — generic, no hardcoded paths

After warming each submodule's mirror, count its pinned tree's files from the cache (no working tree written to decide), and leave any submodule over maxEagerSubmoduleFiles (50k) uninitialized — checking out only the small ones via an explicit pathspec. The superproject itself is ~50k files, so anything bigger is bulk test data; WPT (160k) trips it, deno's small submodules (std, node_compat, lzld) don't.
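A minimal sketch of how the file count could be read from the warmed cache without writing a working tree. It assumes the mirror is a bare repository and shells out to git ls-tree -r -z --name-only; the names countTreeEntries and countPinnedFiles are hypothetical and are not taken from the PR's code.

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
)

// countTreeEntries counts the NUL-terminated paths in the output of
// `git ls-tree -r -z --name-only <commit>`. With -z every entry ends in a
// NUL byte, so paths containing newlines are still counted once.
func countTreeEntries(lsTreeOutput []byte) int {
	return bytes.Count(lsTreeOutput, []byte{0})
}

// countPinnedFiles lists every path in the tree at commit, reading objects
// from mirrorDir (assumed to be a bare mirror). Nothing is checked out.
// A non-nil error means the size is unknown, not that it is zero.
func countPinnedFiles(mirrorDir, commit string) (int, error) {
	cmd := exec.Command("git", "-C", mirrorDir, "ls-tree", "-r", "-z", "--name-only", commit)
	out, err := cmd.Output()
	if err != nil {
		return 0, fmt.Errorf("ls-tree %s in %s: %w", commit, mirrorDir, err)
	}
	return countTreeEntries(out), nil
}
```

Returning the error separately from the count lets the caller tell "could not measure" apart from "measured as small", which matters for the keep-on-failure rule described under Safety.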

No repo- or path-specific knowledge is baked in — it's a pure file-count heuristic. (You'd flagged hardcoding tests/wpt/suite as ugly; this avoids naming it at all.)

Safety:

  • A submodule whose size can't be measured is kept — the rule never drops one it merely failed to size.
  • A worker that actually needs a skipped submodule runs git submodule update --init <path> itself (the objects are already in the warmed cache). Build / lint / unit tests don't need WPT.
  • An uninitialized submodule is an empty gitlink dir — git status ignores it and git add -A won't sweep it.
  • maxEagerSubmoduleFiles <= 0 disables the skip (old behavior).
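The selection rule and its safety properties can be sketched as a pure function. This is an illustration under assumptions, not the PR's implementation: the submodule type, partitionSubmodules, and submoduleUpdateArgs are hypothetical names, and only the threshold semantics (skip when measured and over the ceiling, keep when unmeasured, disable when the ceiling is <= 0) come from the description above.

```go
package main

// submodule is a hypothetical record for one entry in .gitmodules.
type submodule struct {
	Path      string
	FileCount int  // only meaningful when Measured is true
	Measured  bool // false when the pinned tree could not be sized
}

// partitionSubmodules splits submodules into those to check out eagerly and
// those to leave uninitialized. A submodule is skipped only when its size
// was measured and exceeds maxEagerFiles; maxEagerFiles <= 0 disables the skip.
func partitionSubmodules(subs []submodule, maxEagerFiles int) (eager, skipped []string) {
	for _, s := range subs {
		oversized := maxEagerFiles > 0 && s.Measured && s.FileCount > maxEagerFiles
		if oversized {
			skipped = append(skipped, s.Path)
		} else {
			eager = append(eager, s.Path)
		}
	}
	return eager, skipped
}

// submoduleUpdateArgs builds the git arguments for checking out only the
// eager paths. It returns nil for an empty list, because running
// `git submodule update --init` with no pathspec would initialize everything.
func submoduleUpdateArgs(eager []string) []string {
	if len(eager) == 0 {
		return nil
	}
	args := []string{"submodule", "update", "--init", "--"}
	return append(args, eager...)
}
```

The empty-list guard is the one design point worth calling out: when every submodule is oversized, the caller has to skip the update command entirely instead of passing an empty pathspec.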

Expected: deno worker prep drops from ~7 min to ~1–2 min.

Test

TestPrepareIsolated_SkipsOversizedSubmodule: with the ceiling lowered, a 3-file submodule is left uninitialized while a 1-file submodule alongside it is checked out normally. Existing submodule tests unchanged (default ceiling keeps them).

Follow-up (not in this PR)

Worth a one-line hint in the worker preamble — 'large submodules like tests/wpt/suite aren't checked out by default; run git submodule update --init <path> if your task needs one' — so a WPT-test worker isn't surprised. Happy to add if you want.
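As a complement to the hint, a worker could detect skipped submodules itself. The sketch below assumes the documented git submodule status format, where a leading - marks a submodule that is not initialized; uninitializedSubmodules is a hypothetical helper, and it does not handle paths containing whitespace.

```go
package main

import "strings"

// uninitializedSubmodules parses the output of `git submodule status` and
// returns the paths whose line starts with '-', which git uses to mark a
// submodule that has not been initialized.
func uninitializedSubmodules(statusOutput string) []string {
	var paths []string
	for _, line := range strings.Split(statusOutput, "\n") {
		if !strings.HasPrefix(line, "-") {
			continue
		}
		// After the marker the line is "<sha> <path>".
		fields := strings.Fields(line[1:])
		if len(fields) >= 2 {
			paths = append(paths, fields[1])
		}
	}
	return paths
}
```

Each returned path can then be passed to git submodule update --init <path> when the task actually needs it.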

A fresh worker checkout of denoland/deno spends ~7 minutes in prep, and the
dominant cost is materializing the WPT submodule (tests/wpt/suite): ~160k files,
roughly 3x the rest of the repo's working tree. Even with the --reference object
cache, git still has to write every one of those files — that's what a worker
sees as 'queued for ~7 min' before its session even flips to running.

Skip it generically: after warming each submodule's mirror, count its pinned
tree's files from the cache (no working tree written to decide) and leave any
submodule over maxEagerSubmoduleFiles (50k) uninitialized — checking out only the
small ones via an explicit pathspec. No repo- or path-specific knowledge baked
in; it's a file-count heuristic (the superproject itself is ~50k files, so
anything bigger is bulk test data). Unmeasurable submodules are kept, so the rule
never drops one it merely failed to size.

Workers/reviewers/follow-ups get a one-line note in their preamble that an
oversized submodule may not be checked out and to run 'git submodule update
--init <path>' if their task needs it.

Cuts deno worker prep from ~7 min to ~1-2 min and shrinks each checkout by ~1GB
/ 160k inodes.