Validate changelog section keys at parse-time using changelogs/sections.yaml #4516

Closed
Copilot wants to merge 3 commits into copilot/relax-changelogtyping-support from copilot/refine-changelog-typing-support

Copilot AI commented May 14, 2026

This PR tightens changelog schema control without reintroducing hard-coded section names in typing. Section keys remain flexible at the type level, but are now validated at runtime against changelogs/sections.yaml wherever parsed changelog data enters the system.

  • Typing refinement (without hard-coded section keys)

    • Added SourceChangelogChangeSectionsDict = dict[str, SourceChangeList] for typed source-entry section maps.
    • Kept ChangelogSourceDict and ChangelogDict as dict[str, Any].
    • Documented that section authority is runtime validation via AChangelogs.validate_sections, while BaseChangelogDict continues to carry date: str.
  • Runtime section validator on AChangelogs

    • Added validate_sections(self, data, path=None) -> ChangelogDict.
    • Allows date unconditionally.
    • Validates all other keys against self.sections (loaded from changelogs/sections.yaml).
    • Raises ChangelogParseError with unknown section names, optional source path context, and a pointer to CHANGELOG_SECTIONS_PATH.
    • Returns input data unchanged on success so call sites can inline it.
  • Parse-path wiring

    • Updated AChangelog.data to validate parsed data via:
      • self.project.changelogs.validate_sections(await self.project.execute(self.get_data, self.path), self.path)
    • Left AChangelog.get_data unchanged (still classmethod-based parsing).
  • Test coverage

    • Added direct unit coverage for validate_sections:
      • pass-through for valid keys,
      • single/multiple unknown-key failures,
      • sorted unknown-key formatting,
      • date handling,
      • path-in-message vs no-path behavior.
    • Updated AChangelog.data test to assert validator invocation and return value.
    • Added unknown-section parse-path test asserting ChangelogParseError on await changelog.data.
@async_property(cache=True)
async def data(self) -> typing.ChangelogDict:
    return self.project.changelogs.validate_sections(
        await self.project.execute(self.get_data, self.path),
        self.path)
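
The test coverage listed above could look roughly like the following sketch. It is not the PR's actual test module: `FakeChangelogs`, `error_message`, and the section names `bug_fixes` / `new_features` are hypothetical stand-ins, and the validator body is copied from the original prompt so the example runs without the real `AChangelogs`.

```python
class ChangelogParseError(Exception):
    """Stand-in for exceptions.ChangelogParseError."""


# Assumed value; the real constant lives in the envoy.base.utils package.
CHANGELOG_SECTIONS_PATH = "changelogs/sections.yaml"


class FakeChangelogs:
    """Hypothetical stand-in for AChangelogs with a fixed section set."""

    def __init__(self, sections):
        self.sections = sections

    def validate_sections(self, data, path=None):
        allowed = set(self.sections) | {"date"}
        unknown = sorted(k for k in data if k not in allowed)
        if unknown:
            where = f" ({path})" if path is not None else ""
            raise ChangelogParseError(
                f"Unknown changelog section(s){where}: "
                f"{', '.join(unknown)}. "
                f"Valid sections come from {CHANGELOG_SECTIONS_PATH}.")
        return data


def error_message(changelogs, data, path=None):
    """Return the validation error text, or None if validation passes."""
    try:
        changelogs.validate_sections(data, path)
    except ChangelogParseError as e:
        return str(e)
    return None


# Hypothetical section names, used for illustration only.
changelogs = FakeChangelogs({"bug_fixes": {}, "new_features": {}})

# Pass-through: valid input comes back as the same object.
valid = {"date": "Pending", "bug_fixes": []}
passthrough = changelogs.validate_sections(valid)

# Multiple unknown keys are reported in sorted order, without a path.
multiple = error_message(changelogs, {"zeta": [], "alpha": []})

# A supplied path appears in the message.
with_path = error_message(
    changelogs, {"zeta": []}, "changelogs/current.yaml")

# `date` on its own is always accepted.
date_only = error_message(changelogs, {"date": "Pending"})
```

Collecting the message in a helper keeps each case to one line; a real suite would more likely use `pytest.raises` and parametrization.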
Original prompt

Context

This refines PR #4500 (branch copilot/relax-changelogtyping-support), part of the per-entry changelog plan (see #4498, #4499, and sibling PRs #4501 / #4502).

The reviewer's feedback on the current state of PR #4500:

The sections should be controlled somewhere. Right now this PR removes the hard-coded keys from the type system but doesn't replace them with anything — ChangelogSourceDict and ChangelogDict both become dict[str, Any], which is too permissive. The schema for sections is tightly defined and needs to be validated somewhere. If we drive that from changelogs/sections.yaml, envoy (or any downstream consumer) can override the default sections by editing that YAML — exactly the configurability we want. Typing doesn't have to enforce it, but something must.

What this PR should do

Three changes, in this order:

1. Tighten the typing (partial revert of the current PR #4500 state)

Current state on the branch (py/envoy.base.utils/envoy/base/utils/typing.py):

ChangelogSourceDict = dict[str, Any]
ChangelogChangeSectionsDict = dict[str, ChangeList]
ChangelogDict = dict[str, Any]

Change to:

# Section-name -> entries. Section names are arbitrary at the type level
# and validated at runtime against the project's `changelogs/sections.yaml`
# (see AChangelogs.sections / AChangelogs.validate_sections).
ChangelogChangeSectionsDict = dict[str, ChangeList]
SourceChangelogChangeSectionsDict = dict[str, SourceChangeList]

# `date` is always typed; section keys live alongside it but cannot be
# expressed in TypedDict together with a typed required field, so these
# are intentionally `dict[str, Any]`. Runtime validation lives in
# AChangelogs.validate_sections.
ChangelogSourceDict = dict[str, Any]
ChangelogDict = dict[str, Any]

i.e. keep ChangelogSourceDict / ChangelogDict as dict[str, Any] (the looseness is necessary so arbitrary section keys don't trigger mypy errors), but:

  • Add the SourceChangelogChangeSectionsDict alias (mirrors ChangelogChangeSectionsDict but uses SourceChangeList) so PR 2's directory reader has a typed intermediate shape to build from.
  • Add the explanatory comment pointing at AChangelogs.validate_sections as the runtime authority.
  • Keep BaseChangelogDict unchanged (it carries the date: str contract that consumers still rely on, even though ChangelogDict no longer inherits from it after this PR).

Do not re-introduce hard-coded section keys.
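
As a rough illustration of how the aliases fit together, the sketch below builds a changelog dict from a typed section map. The `ChangeList` / `SourceChangeList` definitions and the `area` / `change` entry shape are hypothetical stand-ins (the real ones live elsewhere in `typing.py` and are not shown here), and `BaseChangelogDict` is assumed to be a `TypedDict`.

```python
from typing import Any, TypedDict

# Hypothetical stand-ins: the real ChangeList / SourceChangeList are
# defined elsewhere in typing.py and are not shown in this PR.
ChangeList = list[dict[str, str]]
SourceChangeList = list[dict[str, str]]


class BaseChangelogDict(TypedDict):
    # Assumed shape: only the `date: str` contract is described here.
    date: str


# Section-name -> entries; names are arbitrary at the type level.
ChangelogChangeSectionsDict = dict[str, ChangeList]
SourceChangelogChangeSectionsDict = dict[str, SourceChangeList]

# Intentionally loose: `date` plus arbitrary section keys.
ChangelogSourceDict = dict[str, Any]
ChangelogDict = dict[str, Any]

# A typed intermediate shape, e.g. for a reader that collects entries
# per section before the date is attached.
sections: SourceChangelogChangeSectionsDict = {
    "some_section": [{"area": "http", "change": "Fixed a bug."}],
}

# A type checker would reject extra keys on a BaseChangelogDict literal,
# which is why the merged shape is typed as dict[str, Any] instead.
changelog: ChangelogDict = {"date": "Pending", **sections}
```

The looseness of `ChangelogDict` is what makes the runtime check necessary: nothing at the type level stops `some_section` from being a typo.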

2. Add a runtime section validator on AChangelogs

File: py/envoy.base.utils/envoy/base/utils/abstract/project/changelog.py

AChangelogs already loads changelogs/sections.yaml into self.sections (a ChangelogSectionsDict, i.e. dict[str, ChangelogSectionDict]). That's the schema. Add a method that uses it:

def validate_sections(
        self,
        data: typing.ChangelogDict,
        path: pathlib.Path | None = None) -> typing.ChangelogDict:
    """Validate that every non-`date` key in `data` is a known section.

    Known sections are read from `changelogs/sections.yaml` via
    `self.sections`. Unknown keys raise `ChangelogParseError` naming the
    offending section(s) and (if provided) the source path.

    Returns `data` unchanged on success so the call can be inlined at
    parse sites.
    """
    allowed = set(self.sections) | {"date"}
    unknown = sorted(k for k in data if k not in allowed)
    if unknown:
        where = f" ({path})" if path is not None else ""
        raise exceptions.ChangelogParseError(
            f"Unknown changelog section(s){where}: "
            f"{', '.join(unknown)}. "
            f"Valid sections come from {CHANGELOG_SECTIONS_PATH}.")
    return data

Notes:

  • Use exceptions.ChangelogParseError for consistency with AChangelog.get_data's existing error path.
  • The "date" key is always allowed (it's the one typed field).
  • path is optional so callers that don't have a meaningful path (e.g. an in-memory dict) can still call it.
  • Do not mutate data; return it as-is on success.
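
To make the configurability point concrete, here is a minimal standalone sketch in which the same data passes or fails depending on which section mapping is in force. The validator is written as a free function taking `sections` explicitly (an adaptation for the example, not the method signature above), plain dicts stand in for the parsed `changelogs/sections.yaml`, and the section names and `title` field are hypothetical.

```python
import pathlib


class ChangelogParseError(Exception):
    """Stand-in for exceptions.ChangelogParseError."""


# Assumed value of the real constant.
CHANGELOG_SECTIONS_PATH = "changelogs/sections.yaml"


def validate_sections(sections, data, path=None):
    """Free-function variant of the method, for illustration only."""
    allowed = set(sections) | {"date"}
    unknown = sorted(k for k in data if k not in allowed)
    if unknown:
        where = f" ({path})" if path is not None else ""
        raise ChangelogParseError(
            f"Unknown changelog section(s){where}: "
            f"{', '.join(unknown)}. "
            f"Valid sections come from {CHANGELOG_SECTIONS_PATH}.")
    return data


# Plain dicts standing in for the parsed sections.yaml of two projects.
default_sections = {"bug_fixes": {"title": "Bug fixes"}}
downstream_sections = {
    **default_sections,
    "vendor_notes": {"title": "Vendor notes"},
}

source = pathlib.Path("changelogs/current.yaml")
data = {"date": "Pending", "vendor_notes": []}

# The downstream project added `vendor_notes`, so the data is accepted
# and returned unchanged.
validated = validate_sections(downstream_sections, data, source)

# Under the default sections the same data is rejected.
try:
    validate_sections(default_sections, data, source)
except ChangelogParseError as e:
    rejected = str(e)
else:
    rejected = None
```

Because the allowed set is derived from the mapping's keys on every call, a downstream project only has to edit its YAML; no code or type changes are involved.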

3. Wire validate_sections into the existing parse path

AChangelog.get_data (classmethod, lines ~132-149) currently does:

@classmethod
def get_data(cls, path) -> typing.ChangelogDict:
    try:
        data = utils.from_yaml(path, typing.ChangelogSourceDict)
    except (_yaml.reader.ReaderError, utils.TypeCastingError) as e:
        raise exceptions.ChangelogParseError(...)
    return cast(typing.ChangelogDict, {...})

It is a classmethod, so it does not have access to an AChangelogs instance. The wiring should therefore happen at the call site that does have access — AChangelog.data:

@async_property(cache=True)
async def data(self) -> typing.ChangelogDict:
    return self.project.changelogs.validate_sections(
        await self.project.execute(self.get_data, self.path),
        self.path)

This:

  • Keeps get_data pure / classmethod-friendly (PR 2's `get_...
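
The split between a classmethod parser and an instance-level validation step can be sketched with plain `asyncio`. Everything here is a simplified stand-in: `Project.execute` is assumed to run a callable off the event loop, `data` is an ordinary coroutine method rather than a cached `async_property`, and `get_data` returns a fixed dict instead of parsing YAML.

```python
import asyncio


class ChangelogParseError(Exception):
    """Stand-in for exceptions.ChangelogParseError."""


class Changelogs:
    """Stand-in for AChangelogs; owns the section schema."""

    def __init__(self, sections):
        self.sections = sections

    def validate_sections(self, data, path=None):
        allowed = set(self.sections) | {"date"}
        unknown = sorted(k for k in data if k not in allowed)
        if unknown:
            raise ChangelogParseError(
                f"Unknown changelog section(s) ({path}): "
                f"{', '.join(unknown)}.")
        return data


class Project:
    """Stand-in project exposing `changelogs` and `execute`."""

    def __init__(self, changelogs):
        self.changelogs = changelogs

    async def execute(self, fn, *args):
        # Assumption: the real `execute` offloads blocking work; here
        # the default thread pool executor is used.
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, fn, *args)


class Changelog:
    """Stand-in for AChangelog."""

    def __init__(self, project, path):
        self.project = project
        self.path = path

    @classmethod
    def get_data(cls, path):
        # Stand-in for YAML parsing. As a classmethod it has no access
        # to the project, so it cannot validate sections itself.
        return {"date": "Pending", "bug_fixes": []}

    async def data(self):
        # The instance does have access to the project, so validation
        # wraps the parsed result at this call site.
        return self.project.changelogs.validate_sections(
            await self.project.execute(self.get_data, self.path),
            self.path)


project = Project(Changelogs({"bug_fixes": {}}))
result = asyncio.run(
    Changelog(project, "changelogs/current.yaml").data())
```

Since `validate_sections` returns its input on success, the call composes directly around the awaited parse result without an intermediate variable.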

This pull request was created from Copilot chat.

Copilot AI changed the title from "[WIP] Refine changelog typing support based on review feedback" to "Validate changelog section keys at parse-time using changelogs/sections.yaml" on May 14, 2026
Copilot AI requested a review from phlax May 14, 2026 15:39
Copilot finished work on behalf of phlax May 14, 2026 15:39
phlax closed this May 14, 2026