autoregressive-generation-concept-page#208

Open
AndreasAbdi wants to merge 6 commits into main from autoregressive-generation-concept-page

@AndreasAbdi

Contributor

{
"project": "Model Atlas — Autoregressive Generation Canonical Concept Page",
"branchName": "autoregressive-generation-concept-page",
"description": "Publish one canonical English autoregressive-generation concept page, backed by the existing concept.autoregressive-generation registry record and localized messages, so readers can understand the next-token loop and move cleanly between nearby architecture, serving, and sampling docs.",
"context": {
"customerAsk": "Customer ask alignment: refill the queue back toward the 3-4 active worker target now that memory-system-page has landed cleanly on current main and the GPT-2 reader path is down to one remaining live lane. Add one fresh, narrow concept-page slice for autoregressive-generation so readers can follow the generation loop that connects GPT-2, prefill/decode serving, and later sampling pages. Keep this as one reviewable content slice on current main. Scope: create the canonical concept page plus colocated messages/en.json and any needed assets.json, using the existing published concept.autoregressive-generation registry record rather than creating a duplicate concept; classify the page correctly with the project docs; wire aliases, tags, and related links so readers can move between decoder, encoder-decoder, token, logit, softmax, sampling-overview, kv-cache, prefill, and prefill-decode-split; and add only the focused validation coverage needed for the touched page and discovery surfaces. The prose should explain in simple language how one token becomes the next token, why the loop repeats one step at a time, and how readers should distinguish autoregressive generation from diffusion or encoder-only flows without turning the page into a math lecture. Keep the slice English-only and avoid reopening locale infrastructure, tokenizer-family work, or broad taxonomy churn beyond what is required to land the page cleanly. Acceptance criteria: the autoregressive-generation concept page exists on current main as a canonical docs page, resolves against the existing registry record, connects cleanly to nearby generation and serving pages, and focused touched checks pass.",
"problem": "The site already ships the published concept.autoregressive-generation registry record, a glossary entry for autoregressive generation, and several nearby generation and serving pages. What it still lacks is the canonical concept page that explains autoregressive generation as the broader idea across decoder-only and encoder-decoder systems. Readers therefore have to piece the concept together from a short glossary definition and scattered references from decoder, logit, softmax, sampling-overview, kv-cache, prefill, and prefill-decode-split, without one plain-language explanation of how the next-token loop works, why it advances one step at a time, and how it differs from diffusion or encoder-only flows.",
"solution": "Publish a canonical /docs/concepts/autoregressive-generation page using the standard concept-page contract, English-only message-driven content, and a colocated assets.json only if one structured teaching asset is genuinely needed. Reuse the existing concept.autoregressive-generation registry record, keep discovery metadata aligned with the broad concept, connect the page to the existing glossary entry and the nearby decoder, encoder-decoder, token, logit, softmax, sampling-overview, kv-cache, prefill, and prefill-decode-split surfaces, and add only the focused validation needed to prove route, message, registry, and discovery integrity for this narrow slice."
},
"acceptanceCriteria": [
"A published canonical docs page exists for autoregressive-generation with kind: \"concept\", registryId: \"concept.autoregressive-generation\", English messages, and a colocated assets.json.",
"The page follows the concept-page contract and docs-writing standards with one folded openingSummary, plain-language explanation, distinct section jobs, and no phase or process meta language.",
"The page explains how a current token context produces logits for candidate next tokens, how a next token is chosen, and why the model then repeats the same loop one step later.",
"The page explains the practical bridge from architecture to serving by connecting the next-token loop to decoder, encoder-decoder, kv-cache, prefill, and prefill-decode-split.",
"The page distinguishes autoregressive generation from diffusion-style generation and from encoder-only understanding flows without turning into a math-heavy or training-heavy detour.",
"Registry-backed discovery metadata and relationships let readers move between the new concept page and decoder, encoder-decoder, token, logit, softmax, sampling-overview, kv-cache, prefill, and prefill-decode-split.",
"The implementation remains English-only and avoids unrelated locale, tokenizer, or broad taxonomy churn.",
"Quality gate: typecheck, lint, and focused tests pass."
],
"userStories": [
{
"id": "autoregressive-generation-concept-page-001",
"title": "Align the existing autoregressive generation record for canonical discovery",
"description": "As a reader searching for autoregressive generation, I want the existing concept.autoregressive-generation registry record to behave like a first-class broad concept so discovery surfaces route me to one canonical explainer instead of only the glossary entry or scattered mentions on nearby pages.",
"acceptanceCriteria": [
"The existing concept.autoregressive-generation record remains the canonical backing record and is updated only as needed with aliases, related ids, tags, citations, or other controlled metadata that match a broad concept page.",
"Registry relationships connect the concept page to shipped nearby docs for decoder, encoder-decoder, token, logit, softmax, sampling-overview, kv-cache, prefill, and prefill-decode-split, while preserving the glossary bridge for glossary/autoregressive-generation.",
"Discovery metadata distinguishes the broad autoregressive-generation concept page from the existing glossary entry without introducing duplicate canonical targets or broad taxonomy churn.",
"Typecheck passes",
"Tests pass"
],
"priority": 1,
"passes": true,
"notes": ""
},
{
"id": "autoregressive-generation-concept-page-002",
"title": "Publish the canonical autoregressive generation concept page",
"description": "As a technical layperson learning how language models produce text, I want a dedicated autoregressive-generation concept page so I can understand how one token becomes the next token and why that loop repeats step by step.",
"acceptanceCriteria": [
"A canonical concept page exists at /docs/concepts/autoregressive-generation with matching frontmatter, messages/en.json, and a colocated assets.json that is empty or minimal unless one loop-oriented teaching asset is genuinely needed.",
"The page opens with one folded openingSummary and explains autoregressive generation in plain language before narrowing into serving or architecture-specific usage.",
"The page clearly explains that the model reads the current context, produces logits over candidate next tokens, selects or samples one token, appends it to the context, and repeats the process.",
"The page clearly explains why the loop advances one token at a time even when the model processes many positions in parallel during prefill.",
"The page clearly explains how decoder-only and encoder-decoder systems can both generate autoregressively, while encoder-only systems are not primarily built for this next-token loop.",
"The page clearly distinguishes autoregressive generation from diffusion-style generation without turning the page into a broad comparison survey.",
"The page remains understandable in isolation for a first-time reader and complements the glossary entry rather than duplicating it word for word.",
"Typecheck passes",
"Tests pass",
"Verify in browser using the Browser plugin"
],
"priority": 2,
"passes": true,
"notes": ""
},
{
"id": "autoregressive-generation-concept-page-003",
"title": "Route readers between the concept page and nearby generation and serving docs",
"description": "As a reader moving through GPT-style generation, sampling, and serving topics, I want related docs, tags, and glossary bridges to guide me into the broad autoregressive-generation explainer and onward to the right neighboring pages.",
"acceptanceCriteria": [
"Representative discovery paths from decoder, encoder-decoder, token, logit, softmax, sampling-overview, kv-cache, prefill, and prefill-decode-split expose the canonical autoregressive-generation concept page as a reachable related destination.",
"The concept page renders registry-backed related docs and tag surfaces that connect it to the named neighboring pages without hand-maintained duplicate navigation.",
"At least one touched neighboring discovery surface or related-doc surface presents the new concept page in a way a reviewer can follow without typing the exact concept slug.",
"Browser-visible rendering shows the title, folded summary, tags, and related-doc links without broken links or missing-content placeholders.",
"Typecheck passes",
"Tests pass",
"Verify in browser using the Browser plugin"
],
"priority": 3,
"passes": true,
"notes": ""
},
{
"id": "autoregressive-generation-concept-page-004",
"title": "Add focused validation for the autoregressive generation concept-page slice",
"description": "As a maintainer, I want targeted automated proof for the autoregressive-generation concept-page slice so route, message, registry, and discovery regressions are caught without unrelated test expansion.",
"acceptanceCriteria": [
"Validation or tests confirm the /docs/concepts/autoregressive-generation route, concept.autoregressive-generation record, and default English messages resolve together.",
"Coverage asserts at least one page-specific discovery outcome and at least one page-specific related-link or glossary-bridge expectation for the new concept page.",
"Focused checks stay limited to touched content and discovery integrity rather than inventory snapshots, locale churn, or unrelated suite expansion.",
"Typecheck passes",
"Tests pass"
],
"priority": 4,
"passes": true,
"notes": ""
}
]
}
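
For reviewers who want a concrete picture of the loop that story 002 asks the page to explain (read the context, produce logits, pick one token, append it, repeat), here is a minimal sketch. It is not code from this PR. The `LogitFn` type, the toy model, and greedy selection are all assumptions made for illustration; a real system would call an actual model and would often sample instead of taking the top token.

```typescript
// Hypothetical stand-in for a model: maps a token context to one logit per vocabulary entry.
type LogitFn = (context: number[]) => number[];

// Turn logits into probabilities with a numerically stable softmax.
function softmax(logits: number[]): number[] {
  const max = Math.max(...logits);
  const exps = logits.map((l) => Math.exp(l - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Greedy selection: index of the largest value.
function argmax(values: number[]): number {
  let best = 0;
  for (let i = 1; i < values.length; i++) {
    if (values[i] > values[best]) best = i;
  }
  return best;
}

// The autoregressive loop: each new token is appended to the context
// before the next step runs, so step N+1 depends on the output of step N.
function generate(
  model: LogitFn,
  prompt: number[],
  maxNewTokens: number,
  eosToken: number,
): number[] {
  const context = [...prompt];
  for (let step = 0; step < maxNewTokens; step++) {
    const probs = softmax(model(context));
    const next = argmax(probs);
    context.push(next);
    if (next === eosToken) break;
  }
  return context;
}

// Toy model (assumption): always favors (last token + 1) modulo the vocabulary size.
const VOCAB_SIZE = 5;
const toyModel: LogitFn = (context) => {
  const last = context[context.length - 1];
  return Array.from({ length: VOCAB_SIZE }, (_, t) =>
    t === (last + 1) % VOCAB_SIZE ? 5 : 0,
  );
};

const output = generate(toyModel, [0], 10, 4);
console.log(output); // → [0, 1, 2, 3, 4]
```

The dependency of each step on the previous output is why decode advances one token at a time, even though the prompt positions can be processed together during prefill.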

@AndreasAbdi

Contributor Author

BLOCKING review summary for autoregressive-generation-concept-page

Concrete blocking issues:

  1. BLOCKING: the live /docs/concepts/autoregressive-generation page does not render the required folded openingSummary.

    • src/content/docs/concepts/autoregressive-generation/messages/en.json:4 defines the folded summary text, but browser QA on http://127.0.0.1:3457/docs/concepts/autoregressive-generation shows only the description paragraph under the title and no rendered folded-summary/details element.
    • I verified this in Playwright and by checking the DOM: the summary text is present only in the hydration script payload, not as visible article content.
    • This fails the project acceptance criterion "The page follows the concept-page contract ... with one folded openingSummary", and it also fails the docs-writing and docs-quality standards that require one folded summary at the top.
  2. BLOCKING: the rendered references list is incomplete.

    • src/content/registry/concepts/autoregressive-generation.json:40-43 declares three citations, including citation.gpt-2-report.
    • On the live page, the References section renders only 2 items (attention-is-all-you-need and raffel-t5).
    • Root cause: citation.gpt-2-report exists as JSON, but it is not imported or parsed into the runtime citation registry in src/lib/content/registry-runtime.ts.
      Evidence: src/lib/content/registry-runtime.ts:1-19,372-396 includes attentionIsAllYouNeedCitation and raffelT5Citation in citationRecords, but no gpt-2-report import/parse entry.
    • This leaves the user-facing page with missing source support even though the structural registry test passes.
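
A guard along these lines could catch this class of gap, where a record declares a citation that the runtime registry never loaded. This is a sketch under assumptions, not the repo's code: the `ConceptRecord` shape and the plain `Set` registry are hypothetical, and the two non-GPT-2 ids are illustrative guesses at the id format.

```typescript
// Hypothetical shape; the real registry record type in this repo may differ.
interface ConceptRecord {
  id: string;
  citations: string[];
}

// Returns every citation id the record declares that the runtime registry cannot resolve.
function findUnresolvedCitations(
  record: ConceptRecord,
  runtimeCitationIds: ReadonlySet<string>,
): string[] {
  return record.citations.filter((id) => !runtimeCitationIds.has(id));
}

// Illustrative data mirroring the situation described above.
const conceptRecord: ConceptRecord = {
  id: "concept.autoregressive-generation",
  citations: [
    "citation.attention-is-all-you-need", // illustrative id
    "citation.raffel-t5", // illustrative id
    "citation.gpt-2-report",
  ],
};

// The runtime registry only knows two of the three citations.
const runtimeIds = new Set([
  "citation.attention-is-all-you-need",
  "citation.raffel-t5",
]);

console.log(findUnresolvedCitations(conceptRecord, runtimeIds));
// → ["citation.gpt-2-report"]
```

A structural test that only validates the JSON record would pass here, which is consistent with the registry test passing while the rendered list dropped an item.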

Quality evidence:

  • Local make test: PASS
  • Local bun run lint: PASS
  • Local bun run typecheck: PASS
  • GitHub checks for current head: gh pr checks 208 reports no checks for this branch
  • Browser QA: completed on a local production build at http://127.0.0.1:3457
  • docs/internal/processes/manual-qa.md was referenced by the review instructions but is not present in this worktree, so I used direct browser verification instead.

Project acceptance criteria:

  • PASS: A published canonical docs page exists for autoregressive-generation with kind: "concept", registryId: "concept.autoregressive-generation", English messages, and a colocated assets.json.
    The route bundle, frontmatter, messages, and assets.json are present.
  • FAIL: The page follows the concept-page contract and docs-writing standards with one folded openingSummary, plain-language explanation, distinct section jobs, and no phase or process meta language.
    Plain-language writing and section separation are fine, but the required folded openingSummary is not rendered on the live page.
  • PASS: The page explains how a current token context produces logits for candidate next tokens, how a next token is chosen, and why the model then repeats the same loop one step later.
    The What It Is and One Token At A Time sections cover the loop clearly.
  • PASS: The page explains the practical bridge from architecture to serving by connecting the next-token loop to decoder, encoder-decoder, kv-cache, prefill, and prefill-decode-split.
    The From Architecture To Serving section and related links satisfy this.
  • PASS: The page distinguishes autoregressive generation from diffusion-style generation and from encoder-only understanding flows without turning into a math-heavy or training-heavy detour.
    The Common Confusions section does this cleanly.
  • PASS: Registry-backed discovery metadata and relationships let readers move between the new concept page and decoder, encoder-decoder, token, logit, softmax, sampling-overview, kv-cache, prefill, and prefill-decode-split.
    Registry relationships and tag/discovery surfaces route correctly; /tags/attention exposes the concept page as a concept result.
  • PASS: The implementation remains English-only and avoids unrelated locale, tokenizer, or broad taxonomy churn.
    Scope stayed narrow and English-only.
  • PASS: Quality gate: typecheck, lint, and focused tests pass.
    Verified locally.

Behavioral assertion check for passes:true stories:

  • PASS: each story includes at least one behavioral acceptance criterion.
  • PASS with note: the added test set is not purely meta; it includes rendered page/search/discovery assertions. However, it missed two user-visible failures: folded-summary rendering and full citation rendering.

Docs-writing standards checklist:

  • FAIL: The page opens with one folded summary and no duplicate title chrome.
    No folded summary is rendered on the live page.
  • PASS: The page is understandable in isolation and does not define the topic only through one architecture slot.
  • PASS: The opening summary and first sections explain why the topic matters in plain language.
    The content is written in plain language; the issue is that the folded summary is not rendered.
  • PASS: The title and first mentions use full names before acronyms or shorthand.
  • PASS: Narrative sections have distinct jobs and do not repeat adjacent sections.
  • PASS: Math sections keep symbol-only definitions directly under equations and avoid concept rows such as projections or grouping mechanics.
    No math section was added.
  • PASS: Customer-facing copy contains no phase, process, or authoring meta language.
  • PASS: Baseline templates and rendered copy contain no reader-shortcut callouts.
  • FAIL: The page follows the companion quality checklist in docs-quality-standards.
    It misses the rendered folded summary and has incomplete rendered references.

Docs-quality standards checklist:

  • FAIL: One folded summary appears at the top with no duplicate title chrome.
  • PASS: The page works for a first-time reader without requiring adjacent pages.
  • PASS: The first sections explain both the concept and its value in plain language.
  • PASS: Titles and first mentions expand full names before acronyms or shorthand.
  • PASS: Customer-facing message files contain no phase, process, or meta language.
  • PASS: Math sections carry symbol-only definitions directly under equations.
    No math section was added.
  • PASS: Graphs, tables, and captions follow graphing standards.
    No graph/table changes were added here.
  • PASS: Narrative sections stay scannable, each paragraph advances one idea, and each section contributes something new.

General website standards review checklist:

  • PASS: architecture and dependency boundaries are clear; the page reuses shared docs components.
  • PASS: data flow and ownership are clear; registry-driven related docs/tags/citations remain centralized.
  • PASS: shared UI patterns are reused appropriately.
  • PASS: accessibility/responsive/browser behavior was considered; the route renders and links are keyboard-addressable in the docs shell.
  • FAIL: test evidence does not fully match the user-facing risk.
    The current tests prove route/discovery structure, but they did not catch the missing folded-summary render or the dropped third citation.

Required fixes before merge:

  • Render the folded openingSummary for this concept page according to the concept-page contract, then add a behavioral test that proves the summary is present in rendered output, not only in message payloads.
  • Add citation.gpt-2-report to the runtime citation registry so CitationList can resolve it, and add a rendered-page assertion that the references section shows all expected citations.
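
To illustrate the difference between "present in rendered output" and "present only in message payloads", here is a rough sketch of a check over a rendered HTML string. It assumes the folded summary renders as a `<details>` element, which matches the DOM observation above but is not confirmed as the component's contract. The function names are hypothetical, and regex matching is a simplification; a real test would more likely query a parsed DOM.

```typescript
// Remove <script> blocks so hydration payloads do not count as visible content.
function stripScripts(html: string): string {
  return html.replace(/<script\b[^>]*>[\s\S]*?<\/script>/gi, "");
}

// True only when some <details> element outside of scripts contains the summary text.
function hasVisibleFoldedSummary(html: string, summaryText: string): boolean {
  const visible = stripScripts(html);
  const detailsPattern = /<details\b[^>]*>[\s\S]*?<\/details>/gi;
  let match: RegExpExecArray | null;
  while ((match = detailsPattern.exec(visible)) !== null) {
    if (match[0].includes(summaryText)) return true;
  }
  return false;
}

// Illustrative summary text and markup, not the page's real content.
const summaryText = "The model predicts one token at a time.";

// Case 1: the text exists only inside a hydration script.
const payloadOnlyHtml =
  `<h1>Title</h1><p>Description</p>` +
  `<script>{"openingSummary":"${summaryText}"}</script>`;

// Case 2: the text is rendered inside a details element.
const renderedHtml =
  `<h1>Title</h1><details><summary>Summary</summary><p>${summaryText}</p></details>` +
  `<script>{"openingSummary":"${summaryText}"}</script>`;

console.log(hasVisibleFoldedSummary(payloadOnlyHtml, summaryText)); // → false
console.log(hasVisibleFoldedSummary(renderedHtml, summaryText)); // → true
```

A plain substring search over the full HTML would return true in both cases, which is the failure mode the first review round ran into.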

@AndreasAbdi

Contributor Author

Addressed the blocking review items in commit 2407864d.

Fix map:

  1. Folded openingSummary now renders on live concept pages.
  • Updated src/app/docs/docs-slug-renderer.tsx to mount FoldedSummary for local concepts routes in the shared docs shell, between the page description and article body.
  • Added src/lib/content/concept-shell-render.tsx plus a shell-level assertion in src/lib/content/autoregressive-generation-concept.test.ts so this is verified against final page chrome rather than article-only markup.
  • Browser verification on http://localhost:3478/docs/concepts/autoregressive-generation confirmed the Summary disclosure is visible and expands with the expected opening summary text.
  2. The missing GPT-2 reference now resolves in the rendered References list.
  • Added citation.gpt-2-report to both runtime citation paths: src/lib/content/citations.ts and src/lib/content/registry-runtime.ts.
  • Added coverage in src/lib/content/citations.test.ts and extended the autoregressive concept shell test to assert the rendered references include the expected citation set.
  • Browser verification confirmed the live references list now contains three items, including Language Models are Unsupervised Multitask Learners.

Validation run after the fix:

  • bun test src/lib/content/autoregressive-generation-concept.test.ts src/lib/content/citations.test.ts
  • bun run lint
  • bun run typecheck
  • make test

Remaining browser console issue on local dev was the pre-existing favicon.ico 404 only.

@AndreasAbdi

Contributor Author

BLOCKING review summary for autoregressive-generation-concept-page on current head 2407864de8cd47af61ac2355ca6cbad7e0cc8e5a

Superseding earlier blocking feedback:

  • The earlier rendered-content blockers are now cleared on the current head after a fresh production build and browser QA. The live /docs/concepts/autoregressive-generation page now renders the folded openingSummary, and the References list now renders all 3 expected citations, including the GPT-2 report.

Remaining blocking issue:

  1. BLOCKING: the PR is currently not mergeable because GitHub reports mergeable: CONFLICTING and mergeStateStatus: DIRTY.
  • I verified this from gh pr view --json headRefOid,mergeStateStatus,mergeable,statusCheckRollup on June 22, 2026.
  • Required action: rebase or merge main, resolve the merge conflicts, and push the updated branch so the review loop can finish.
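
For anyone automating this check, the two fields can be classified roughly as below. This is a sketch, not tooling from this repo: the `classifyMergeState` name and its three result labels are assumptions, and the sample JSON strings stand in for the output of the `gh pr view --json` call rather than being captured from it.

```typescript
// Subset of the fields returned by `gh pr view --json mergeable,mergeStateStatus`.
interface PrMergeState {
  mergeable: string; // e.g. "MERGEABLE", "CONFLICTING", "UNKNOWN"
  mergeStateStatus: string; // e.g. "CLEAN", "DIRTY", "UNSTABLE"
}

type MergeVerdict = "conflict" | "unsettled" | "ready";

// Hypothetical classification:
// - conflicts always block
// - only MERGEABLE + CLEAN counts as ready
// - everything else (checks running, state not yet computed) is unsettled
function classifyMergeState(state: PrMergeState): MergeVerdict {
  if (state.mergeable === "CONFLICTING" || state.mergeStateStatus === "DIRTY") {
    return "conflict";
  }
  if (state.mergeable === "MERGEABLE" && state.mergeStateStatus === "CLEAN") {
    return "ready";
  }
  return "unsettled";
}

// Sample payload shaped like the state reported in this review.
const reported: PrMergeState = JSON.parse(
  '{"mergeable":"CONFLICTING","mergeStateStatus":"DIRTY"}',
);
console.log(classifyMergeState(reported)); // → "conflict"
```

Treating anything other than MERGEABLE plus CLEAN as unsettled is a deliberately conservative choice, so that a PR with checks still running is not reported as ready.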

Quality evidence on the current head:

  • make test: PASS
  • Fresh production build via bun run build: PASS
  • Browser QA on http://127.0.0.1:3478/docs/concepts/autoregressive-generation: PASS after rebuilding the current head
  • Browser QA on http://127.0.0.1:3478/tags/attention: PASS; the concept page is discoverable without typing the exact slug
  • gh pr checks 208: no checks reported on this branch
  • docs/internal/processes/manual-qa.md: not present in this worktree, so I used direct browser verification against a fresh production build

Project acceptance criteria:

  • PASS: A published canonical docs page exists for autoregressive-generation with kind: "concept", registryId: "concept.autoregressive-generation", English messages, and a colocated assets.json.
    Verified in the page bundle and live route.
  • PASS: The page follows the concept-page contract and docs-writing standards with one folded openingSummary, plain-language explanation, distinct section jobs, and no phase or process meta language.
    Verified in source and in the rendered production page.
  • PASS: The page explains how a current token context produces logits for candidate next tokens, how a next token is chosen, and why the model then repeats the same loop one step later.
    Covered clearly in What It Is and One Token At A Time.
  • PASS: The page explains the practical bridge from architecture to serving by connecting the next-token loop to decoder, encoder-decoder, kv-cache, prefill, and prefill-decode-split.
    Covered in From Architecture To Serving and verified through rendered links.
  • PASS: The page distinguishes autoregressive generation from diffusion-style generation and from encoder-only understanding flows without turning into a math-heavy or training-heavy detour.
    Covered in Common Confusions.
  • PASS: Registry-backed discovery metadata and relationships let readers move between the new concept page and decoder, encoder-decoder, token, logit, softmax, sampling-overview, kv-cache, prefill, and prefill-decode-split.
    Verified in the rendered related-doc surfaces and in the attention tag landing page.
  • PASS: The implementation remains English-only and avoids unrelated locale, tokenizer, or broad taxonomy churn.
    Scope stayed narrow.
  • PASS: Quality gate: typecheck, lint, and focused tests pass.
    Satisfied via make test and the fresh production build.

Behavioral assertion check for passes:true stories:

  • PASS: each passing story includes at least one behavioral acceptance criterion.
  • PASS: the touched tests are not purely meta-only; they include rendered/discovery behavior, and the browser QA confirms the key user-visible outcomes.

Docs-writing standards checklist:

  • PASS: The page opens with one folded summary and no duplicate title chrome.
  • PASS: The page is understandable in isolation and does not define the topic only through one architecture slot.
  • PASS: The opening summary and first sections explain why the topic matters in plain language.
  • PASS: The title and first mentions use full names before acronyms or shorthand.
  • PASS: Narrative sections have distinct jobs and do not repeat adjacent sections.
  • PASS: Math sections keep symbol-only definitions directly under equations and avoid concept rows such as projections or grouping mechanics.
    No math section was added.
  • PASS: Customer-facing copy contains no phase, process, or authoring meta language.
  • PASS: Baseline templates and rendered copy contain no reader-shortcut callouts.
  • PASS: The page follows the companion quality checklist in docs-quality-standards.

Docs-quality standards checklist:

  • PASS: One folded summary appears at the top with no duplicate title chrome.
  • PASS: The page works for a first-time reader without requiring adjacent pages.
  • PASS: The first sections explain both the concept and its value in plain language.
  • PASS: Titles and first mentions expand full names before acronyms or shorthand.
  • PASS: Customer-facing message files contain no phase, process, or meta language.
  • PASS: Math sections carry symbol-only definitions directly under equations.
    No math section was added.
  • PASS: Graphs, tables, and captions follow graphing standards.
    No graph/table change in this slice.
  • PASS: Narrative sections stay scannable, each paragraph advances one idea, and each section contributes something new.

General website standards checklist:

  • PASS: architecture and dependency boundaries are clear.
  • PASS: data flow and ownership are clear.
  • PASS: shared UI patterns are reused appropriately.
  • PASS: loading, empty, error, and success state expectations are not regressed by this change.
  • PASS: accessibility, responsive behavior, localization scope, and browser behavior were considered for the touched surface.
  • PASS: test evidence matches the user-visible risk for this narrow docs slice.

Review rules application:

  • PASS: correctness before style. The current implementation solves the stated problem after the fresh rebuild.
  • PASS: no obvious regressions were found in the touched discovery or docs surfaces.
  • PASS: architecture and dependency fit are appropriate.
  • PASS: readability and maintainability are acceptable for the slice size.
  • PASS: automated and manual quality evidence are present.
  • BLOCKING: the branch must be rebased and conflicts resolved before merge.

Processor action:

  • Rebase or merge main into this branch, resolve the merge conflicts, and push the updated branch. After that, this is ready for a final merge check.

… autoregressive generation record for canonical discovery
…onical autoregressive generation concept page]
…etween the concept page and nearby generation and serving docs]
…idation for the autoregressive generation concept-page slice]
…onical autoregressive generation concept page]
…idation for the autoregressive generation concept-page slice]
@AndreasAbdi AndreasAbdi force-pushed the autoregressive-generation-concept-page branch from 2407864 to fdde95e Compare June 22, 2026 07:08
@AndreasAbdi

Contributor Author

Addressed the latest blocking mergeability feedback in commit fdde95e6 after rebasing the branch onto current main.

Fix map for the June 22, 2026 DIRTY/CONFLICTING blocker:

  1. Rebased autoregressive-generation-concept-page onto current main and force-pushed with lease, which cleared the PR merge conflict state.
  2. Updated the touched autoregressive-generation regression expectations to match current canonical discovery surfaces from main:
  • attention-tag concept groups now sort alphabetically, so the canonical concept link appears before kv-cache and prefill.
  • kv-cache and prefill now resolve through /docs/concepts/... on current main, so the autoregressive-generation related-doc assertions were updated to the current canonical hrefs.
  • the autoregressive-generation registry record now carries the broader citation set from main, so the registry assertion now checks the concept-page citations as a required subset instead of pinning the older exact list.
  • citation metadata coverage now validates the GPT-2 report through citation.url instead of assuming every MLA string repeats the URL inline.
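
The two assertion styles mentioned above (required subset instead of an exact list, and relative order instead of a fixed position) can be sketched as follows. The helper names and the sample data are hypothetical and are not taken from the touched test files. The slugs are sorted with `localeCompare` as one plausible reading of "alphabetically", and `citation.added-on-main` is a made-up id standing in for whatever main added.

```typescript
// Returns the required ids that are absent from the actual list.
function missingFrom(required: readonly string[], actual: readonly string[]): string[] {
  const present = new Set(actual);
  return required.filter((id) => !present.has(id));
}

// True when both items exist and `first` comes earlier than `second`.
function appearsBefore(list: readonly string[], first: string, second: string): boolean {
  const i = list.indexOf(first);
  const j = list.indexOf(second);
  return i >= 0 && j >= 0 && i < j;
}

// Subset check: the record may carry extra citations without breaking the test.
const requiredCitations = ["citation.gpt-2-report"];
const recordCitations = [
  "citation.added-on-main", // hypothetical extra entry
  "citation.gpt-2-report",
];
console.log(missingFrom(requiredCitations, recordCitations)); // → []

// Order check: after an alphabetical sort, the concept slug precedes its neighbors.
const sortedSlugs = ["prefill", "kv-cache", "autoregressive-generation"].sort((a, b) =>
  a.localeCompare(b),
);
console.log(appearsBefore(sortedSlugs, "autoregressive-generation", "kv-cache")); // → true
console.log(appearsBefore(sortedSlugs, "autoregressive-generation", "prefill")); // → true
```

Both helpers tolerate unrelated additions on main, which is the property the exact-list assertion lacked.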

Files adjusted for this mergeability follow-up:

  • src/lib/content/autoregressive-generation-concept.test.ts
  • src/lib/content/autoregressive-generation-registry.test.ts
  • src/lib/content/citations.test.ts
  • src/lib/content/content-reconciliation-attention-tag.test.ts
  • src/tests/content/attention-tag-landing.test.ts

Local validation after the rebase:

  • bun run lint
  • bun run typecheck
  • bun run prepare:content-runtime && bun test src/lib/content/autoregressive-generation-registry.test.ts src/lib/content/autoregressive-generation-concept.test.ts src/lib/content/citations.test.ts src/lib/content/content-reconciliation-attention-tag.test.ts src/tests/content/attention-tag-landing.test.ts

Current PR state after push:

  • mergeable: MERGEABLE
  • mergeStateStatus: UNSTABLE only because the fresh CI run is now in progress on head fdde95e6f92c1c9d42d97d13a9b3cbc13d7b5d5d.
  • The reviewed files remain present in the PR diff after the push.
