fix(workspace): don't let reconcileSubmodulePins detach the superproject via a skipped submodule#71
Merged
Conversation
…ect via a skipped submodule A partition-skipped oversized submodule (deno's tests/wpt/suite) is left as an empty placeholder dir with no .git. reconcileSubmodulePins iterated it anyway and ran 'git -C <sub> checkout --detach --force <pin>', which — finding no repo in the empty dir — walked UP to the superproject and detached ITS HEAD onto the submodule's pinned commit. The agent then booted into detached-HEAD, ran 'git checkout main' to recover, and committed off its orcha work branch, so publish saw 'no commits between main and the push branch' and the whole change had to be redone (observed on deno #35516: HEAD detached onto WPT commit d85e753). The old 'rev-parse HEAD' guard couldn't catch it — against the superproject the command succeeds, it just returns the wrong HEAD. Gate the reset on 'git -C <sub> rev-parse --show-superproject-working-tree' being non-empty, which holds only when <sub> genuinely is a submodule of a superproject and is empty when git escaped to the superproject (or the dir isn't an initialized submodule). Also covers a kept submodule that failed to initialize. Only fires on a warm cache (the pin object must be present locally; base()'s local clone hardlinks it in), which is why it hit prod but not cold checkouts. Regression test needs a kept submodule alongside the skipped one (with only an oversized one, updateSubmodules early-returns before reconcile) and two preps to warm the cache; without the fix it fails with HEAD detached.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
Objective
eed9f6b5(deno #35516) burned a full extra implement+review cycle (~1h) because the first implementer's commit "vanished" — the worker's goal literally read "previous worker reported commit 444923e92019 but Orcha PR creation says there are no commits between main and the push branch."Root cause is in
reconcileSubmodulePins, exposed by the partition-skip optimization (#60) interacting with the reconcile step (#47):tests/wpt/suite) is partition-skipped as oversized, so prep leaves it an empty placeholder dir with no.git.reconcileSubmodulePinsiterated all.gitmodulespaths — including the skipped WPT — and rangit -C tests/wpt/suite checkout --detach --force <pin>.git -Cwalked up to the superproject and detached its HEAD onto the WPT commitd85e753…, during prep, before the agent started.git checkout mainto recover, and committed its fix onmaininstead oforcha/impl-<id>.orcha/impl-<id>, still at base → "no commits between main and the push branch" → manager re-spawned a reapply worker + second reviewer.The old
rev-parse HEADguard couldn't catch this: against the superproject the command succeeds, it just returns the wrong HEAD.It's warm-cache-only — the pin object must be present locally, and
base()'s localgit clone cache wshardlinks it in — which is why it bites prod's long-lived.orcha-cachebut not a cold checkout. Likely behind recurring "empty PR / no commits" failures on deno.Proof (impl checkout reflog)
And
git rev-parse main:tests/wpt/suite==d85e753…exactly;git -C tests/wpt/suite rev-parse --show-toplevelreturns the superproject root.Fix
Gate the reset on
git -C <sub> rev-parse --show-superproject-working-treebeing non-empty — true only when<sub>genuinely is a submodule of a superproject, empty whengit -Cescaped to the superproject (or the dir isn't an initialized submodule). Also covers a kept submodule that failed to initialize.Test
TestPrepareIsolated_SkippedSubmoduleKeepsSuperprojectOnBranchreproduces the exact prod symptom. It needs a kept submodule alongside the skipped one (with only an oversized submodule,updateSubmodulesearly-returns before reconcile runs) and two preps to warm the cache. Without the fix it fails with the superprojectHEADdetached; with it, HEAD stays on the work branch. Fullinternal/workspacesuite,go vet, andgo build ./...all green.