feat(codegen): fail consolidate on unallowlisted name collisions#919
Merged
Conversation
The consolidate step flattens generated_poc/* into a single namespace using first-seen / stem-preference order. ~201 bare type names are defined in more than one module, so `from adcp.types import <Name>` silently resolves to whichever module sorts first. Previously only ~5 were tracked and the rest were logged as warnings, so the build never failed and the set grew silently. Add a build guard: any collision not handled via qualified imports (KNOWN_COLLISIONS) and not in a checked-in allowlist snapshot raises a ValueError, hard-failing the consolidate step. The allowlist is seeded with the 197 collisions present today (generated via --update-allowlist) so the build passes now while any new or changed collision fails. The error message points contributors at the three resolutions: a disambiguated alias, an allowlist entry, or KNOWN_COLLISIONS. Partially addresses #911 (Step 1). Leaves generated_poc/ and aliases.py untouched. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
PR #922 (derive response arms from schemas) reshaped the generated_poc modules, changing the set of colliding bare type names. Re-snapshot the checked-in allowlist against the post-#922 tree so the consolidate-step build guard and test_allowlist_matches_current_collisions_exactly pass. Net +13 names (197 -> 210): added BrandContext, Colors, CreditLimit, Disclosure, Feature, House, Input2, Logo, Preview, Preview2, Preview3, Result, Rights, Setup; removed PreviewInput. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
1ebfee9 to
1dfd8b4
Compare
Contributor
There was a problem hiding this comment.
Approving. Makes a known silent footgun loud at build time without touching the generated tree or the public surface — the right place to catch an unpredictable class shape is before it ships, not after an adopter files a bug.
Things I checked
- No public-API / wire change. Diff is confined to
scripts/+ a test._generated.pyis byte-identical (only the regen timestamp would move) —known_collisions = KNOWN_COLLISIONSis a pure hoist, the consolidation logic is unchanged.feat(codegen):is the right prefix; non-breaking, no!needed. - Two scan paths agree.
_scan_name_to_modules()and the inline build ingenerate_consolidated_exports()use identical_module_sort_key, identical filters (__init__/ dotfile /bundledexclusion), identicalmodule_nameconstruction, and the sameextract_exports_from_module. Sort order is irrelevant to set membership, so--update-allowlistcannot write a snapshot the guard then rejects.code-reviewer: confirmed identical. - Fail-closed ordering.
_enforce_collision_allowlistraises insidegenerate_consolidated_exports()beforemain()reachesOUTPUT_FILE.write_text()(consolidate_exports.py:515-518) — a new collision exits non-zero with nothing written.--update-allowlistbypasses the guard via_scan_name_to_modules()so regeneration never self-trips. - Test isn't tautological.
test_new_collision_not_in_allowlist_raisesinjects a real 2-module synthetic name and asserts the raise plus all three remediation hints;test_allowlist_matches_current_collisions_exactlycatches both stale and padded snapshots. Import patternfrom scripts.consolidate_exports import ...is already established (test_code_generation.py:49). - Allowlist diff is reviewable —
sorted(),indent=2, one name per line. A future collision reads as a clean single-line insertion.
Follow-ups (non-blocking — file as issues)
- PR body / docstring overstates the enforcement points. The claim that
make validate-generatedhard-fails is wrong: that target onlypy_compiles_generated.py(Makefile:90-93). The guard actually fires onmake regenerate-schemas(Makefile:85) and viatests/test_collision_guard.pyundermake test. Correct the prose so the next maintainer doesn't trust a greenvalidate-generated. (dx-expert) KNOWN_COLLISIONSis subtracted by name, not by module set._collisions_from_accountingremoves the keys wholesale, so if a known name (e.g.FormatId) later appears in a module outside its declared value set, the new collision is neither qualified-imported nor flagged. Narrower blind spot than the pre-PR state (where all ~196 were unguarded), so not a regression — but worth tightening to flagname_to_modules[name] - KNOWN_COLLISIONS[name].--update-allowlistoverwrites silently. It prints only a count. Printing the added/removed set-diff before writing would turn a silent re-snapshot into a visible decision and nudge contributors toward an alias over silencing. (dx-expert)
Minor nits (non-blocking)
- Alias remediation lacks an example. The
ValueErroroption 1 says "add a disambiguated alias in src/adcp/types/aliases.py" but points at no existing pattern. ASyncCatalogResult-style breadcrumb would save a contributor the reverse-engineering trip. - Count prose drift. PR body says 197 (and "201 minus 4"); the second commit bumps to 210; the checked-in file carries 211 entries. Nothing asserts the number, so it's cosmetic — but the figures don't line up.
The second commit re-snapshotting the allowlist the same day #922 reshaped the modules is the guard earning its keep on day one — notable that the first thing it caught was a sibling PR.
LGTM. Follow-ups noted above; none block.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Partially addresses #911 (Step 1). Makes generated-type name collisions loud at build time instead of silently letting one class shadow the others.
scripts/consolidate_exports.pyflattens everygenerated_poc/*module into a single namespace (_generated.py) using first-seen / stem-preference order. ~201 bare type names are defined in more than one module (e.g.Creativeacross 4 modules,Accountacross 4,Authenticationacross 5,Sort,Unit,MediaBuy,GovernanceAgent, ...), sofrom adcp.types import <Name>silently resolves to whichever module sorts first. Previously only ~5 names were tracked viaknown_collisions; the remaining ~196 were merely logged as warnings and the build never failed — so adopters following the migration guide silently get an unpredictable class shape.This PR adds a build-time guard. It does not add disambiguated aliases (that is Step 2, out of scope here) and leaves
generated_poc/andaliases.pyuntouched.What changed
scripts/consolidate_exports.py): the consolidate step now records every module that defines each name and, after the first pass, raises aValueErrorfor any collision that is neither handled via qualified imports (KNOWN_COLLISIONS) nor present in a checked-in allowlist. The consolidate step exits non-zero, somake regenerate-schemas/make validate-generatedhard-fail on a new or changed collision.scripts/collision_allowlist.json): a programmatically generated snapshot of the 197 collisions that exist today (the 201 cross-module names minus the 4 currently inKNOWN_COLLISIONS). Seeding keeps the build green now; any new/changed collision trips the guard. Regenerate withpython scripts/consolidate_exports.py --update-allowlist— a JSON data file so future diffs are reviewable.tests/test_collision_guard.py): asserts the current tree consolidates cleanly against the seeded allowlist, the allowlist exactly matches the real collision set (no drift/padding), and a synthetic new collision not in the allowlist makes the guard raise with the expected remediation hints.Guard error message
When a collision is not allowlisted, the build fails with:
Verification
python scripts/consolidate_exports.pyexits 0 on the seeded allowlist; exits 1 when a name is removed from the allowlist (verified end-to-end)._generated.pyoutput is byte-identical (only the regen timestamp would change) — no semantic change to generated types.tests/test_collision_guard.py— 6 passedtests/test_code_generation.py+tests/test_import_layering.py— passmypy src/adcp/— no issues;ruff check src/— clean;black --check— clean🤖 Generated with Claude Code