docs(docs-web): add llms.txt / llms-full.txt for AI-tool ingestion #1379

@danielscholl

Problem

  • https://archon.diy/llms.txt and /llms-full.txt both return 404; the Starlight site at packages/docs-web/ has no llms.txt plugin configured.
  • Experienced by anyone pointing an AI tool (Cursor, Claude projects, ChatGPT retrieval) at the Archon docs — they must scrape HTML, clone the repo, or paste pages manually.
  • Comes up every time a user asks their LLM about workflow authoring, node types, config, or variable substitution and gets stale or hallucinated answers.
  • Particularly acute here because Archon's audience is, by definition, LLM-assisted developers.

Proposed Solution

Add starlight-llms-txt as a dev dependency of @archon/docs-web. The plugin reads existing markdown under src/content/docs/ and generates three endpoints at build time:

  • /llms.txt — index of all doc pages with titles and descriptions (llmstxt.org spec)
  • /llms-full.txt — full markdown concatenation of all docs
  • /llms-small.txt — trimmed variant of the full docs for tools with smaller context windows

Implementation touches two places:

  • packages/docs-web/package.json — add starlight-llms-txt to devDependencies
  • packages/docs-web/astro.config.mjs — import the plugin and add it to the plugins: array in the starlight() config
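
A minimal sketch of the astro.config.mjs change, assuming the file follows the usual Starlight layout. The `title` value and the elided options are placeholders rather than the real Archon config, and `site` is shown because the plugin is understood to need it for building absolute URLs; treat that as an assumption to confirm against the plugin docs.

```javascript
// packages/docs-web/astro.config.mjs (sketch; existing options are elided)
import { defineConfig } from 'astro/config';
import starlight from '@astrojs/starlight';
import starlightLlmsTxt from 'starlight-llms-txt';

export default defineConfig({
  // Assumed to be set already; used to build absolute links in the output.
  site: 'https://archon.diy',
  integrations: [
    starlight({
      title: 'Archon', // placeholder, keep whatever the real config uses
      // Append to any plugins already configured here.
      plugins: [starlightLlmsTxt()],
    }),
  ],
});
```

With no options passed, the plugin should pick up everything under src/content/docs/, so no per-page changes are expected.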

User Flow

Before (current)

User asks LLM: "How do I author a DAG workflow in Archon?"
  [!] LLM has little or no current training data on Archon
  [!] User has to paste docs manually or accept hallucinated answers

After (proposed)

User: "Use https://archon.diy/llms-full.txt as context"
  [+] LLM fetches concatenated docs
  [+] Accurate, current answers grounded in the real docs

Alternatives Considered

| Alternative | Pros | Cons | Why not chosen |
| --- | --- | --- | --- |
| Do nothing | Zero work | AI tools can't ingest docs cleanly | Misses Archon's core audience |
| Hand-author llms.txt | Full curation control | Manual maintenance; drifts from docs | Plugin auto-regenerates on every build |
| Custom Astro integration | No third-party dep | Reinvents a solved problem | starlight-llms-txt is purpose-built and maintained |
| Rely on existing sitemap.xml | Already generated | Not the llms.txt convention; tools look for /llms.txt specifically | Doesn't solve discovery |

Scope

  • Package(s) likely affected: docs-web
  • Breaking change? No — purely additive; existing routes unchanged
  • Database changes needed? No
  • New external dependencies? Yes — starlight-llms-txt (dev dep on @archon/docs-web only, not shipped to users)

Security Considerations

  • New permissions/capabilities? No
  • New external network calls? No — build-time only; serves static files
  • Secrets/tokens handling? No — re-serializes already-public markdown content
  • No new trust boundary: same content as the HTML docs, just a different serialization

Definition of Done

  • starlight-llms-txt added to packages/docs-web/package.json devDependencies
  • Plugin wired into packages/docs-web/astro.config.mjs
  • https://archon.diy/llms.txt returns a valid index after deploy
  • https://archon.diy/llms-full.txt returns full concatenated docs
  • Brief note added (e.g. in reference/index.md or the site footer) pointing AI users at the URL
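
The two URL checks above could be automated with something like the following sketch. It assumes Node 18+ (for the global `fetch`) and that "valid index" means the llmstxt.org shape: an H1 title followed by markdown links. The function names `checkLlmsIndex` and `checkDeployedEndpoints` are hypothetical helpers, not part of the plugin or of Archon.

```javascript
// Minimal structural check for an llms.txt index, following the llmstxt.org
// convention: an H1 title first, then markdown links to the doc pages.
function checkLlmsIndex(text) {
  const lines = text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter(Boolean);
  const problems = [];
  if (lines.length === 0 || !/^# \S/.test(lines[0])) {
    problems.push('first non-empty line is not an H1 title');
  }
  const links = lines.filter((line) => /\[[^\]]+\]\([^)\s]+\)/.test(line));
  if (links.length === 0) {
    problems.push('no markdown links to doc pages found');
  }
  return { ok: problems.length === 0, linkCount: links.length, problems };
}

// Fetch both endpoints and report their HTTP status; the index is also
// checked structurally. fetchImpl is injectable so the check can be stubbed.
async function checkDeployedEndpoints(baseUrl, fetchImpl = fetch) {
  const results = {};
  for (const path of ['/llms.txt', '/llms-full.txt']) {
    const response = await fetchImpl(new URL(path, baseUrl));
    results[path] = { status: response.status };
    if (path === '/llms.txt' && response.ok) {
      results[path].index = checkLlmsIndex(await response.text());
    }
  }
  return results;
}
```

After deploy, `checkDeployedEndpoints('https://archon.diy').then(console.log)` would be expected to show status 200 for both paths and `index.ok === true`; today both paths return 404.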

Metadata

Labels

  • P3: Low priority. Nice to have, consider closing if stale
  • area: docs: Documentation
  • effort/low: Single file or function, one responsibility, isolated change
  • feature: New functionality (planned)
