Proposal: A QuantEcon Claude Code toolkit (agents, skills, plugins) #304

@mmcky

Summary

We've accumulated a number of one-off prompts, scripts, and GitHub Actions that use Claude to do recurring work across our lecture repos — LaTeX→MyST conversion, style-guide enforcement, English→Chinese translation sync, citation/DOI handling, and various Jupyter Book build fixes. These currently live in different places, with inconsistent interfaces, and no shared way to install or update them.

This proposal recommends consolidating that work into a quantecon Claude Code marketplace — a single repository (or small set of repositories) that packages our reusable agents, slash commands, skills, and supporting scripts as installable plugins. Any contributor working in a QuantEcon repo could then run:

/plugin marketplace add quantecon/claude-toolkit
/plugin install latex-to-myst@quantecon

…and get a versioned, maintained toolkit instead of copy-pasting prompts from Slack or hunting through old branches.

Background: what Claude Code now offers

Over the last several months Anthropic has shipped a stack of features for packaging Claude-driven workflows. The four building blocks we care about:

| Building block | What it is | Where it lives |
| --- | --- | --- |
| Slash command | A saved prompt invoked with /name; runs inline in the main conversation | .claude/commands/<name>.md |
| Subagent | A specialist with its own isolated context window, system prompt, and tool restrictions; the main agent delegates to it | .claude/agents/<name>.md |
| Skill | A reference module (instructions + supporting files) auto-loaded when Claude decides it's relevant | .claude/skills/<name>/SKILL.md |
| Plugin | A bundle of the above (plus hooks and MCP servers) packaged as one installable unit | .claude-plugin/plugin.json at the plugin's root |

A marketplace is a repository that lists multiple plugins, installable via /plugin marketplace add owner/repo.

The practical shape: scripts (Python, shell, pandoc invocations) do the deterministic transformations; agents handle the judgement-heavy edge cases; skills hold institutional knowledge ("here's why exercise environments don't round-trip cleanly"); plugins bundle and distribute the lot.

Why this matters for QuantEcon

We have several characteristics that make us a particularly good fit for this stack:

  • Multi-repo work. The same conversion, validation, and sync logic applies across lecture-python.myst, lecture-python-advanced.myst, lecture-julia.myst, etc. Today we copy or re-derive; with plugins we install once and update centrally.
  • Long-running parallelizable tasks. Translation sync across dozens of chapters is the textbook subagent use case — fan out, each subagent handles one file in its own context, main session orchestrates.
  • Established scripts. Pandoc wrappers, reference fixers, and build helpers we've already written keep working — agents call them via the Bash tool. We don't replace deterministic logic; we wrap it.
  • CI/CD already in place. Our GitHub Actions can run claude in headless mode with the same plugins pre-installed, so the local developer experience and the CI experience use the same source of truth.

Recommended architecture

Repository layout

Two reasonable options:

Option A — single repo. One quantecon/claude-toolkit repo containing the marketplace plus all plugins. Lower overhead, easier to start with.

claude-toolkit/
├── .claude-plugin/
│   └── marketplace.json
├── plugins/
│   ├── latex-to-myst/
│   │   ├── .claude-plugin/plugin.json
│   │   ├── agents/
│   │   ├── commands/
│   │   ├── skills/
│   │   └── scripts/
│   ├── lecture-style/
│   ├── translation-sync/
│   └── ...
└── docs/
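
As a rough sketch of how the Option A marketplace file could be kept in sync with the plugins/ directory, the script below regenerates marketplace.json from each plugin's own plugin.json. The marketplace name `quantecon`, the owner entry, and the function names are assumptions for illustration, and the manifest fields shown (`name`, `owner`, `plugins`, relative `source` paths) should be checked against the current Claude Code marketplace documentation before relying on them.

```python
import json
from pathlib import Path


def build_marketplace(toolkit_root: Path, name: str = "quantecon") -> dict:
    """Build a marketplace manifest listing every plugin under plugins/."""
    plugins = []
    for plugin_dir in sorted((toolkit_root / "plugins").iterdir()):
        manifest = plugin_dir / ".claude-plugin" / "plugin.json"
        if not manifest.is_file():
            continue  # skip directories that do not contain a plugin manifest yet
        meta = json.loads(manifest.read_text(encoding="utf-8"))
        plugins.append({
            "name": meta["name"],
            # Relative source path, so the plugin is versioned with this repo.
            "source": f"./plugins/{plugin_dir.name}",
            "description": meta.get("description", ""),
        })
    # The owner value is a placeholder assumption, not a confirmed setting.
    return {"name": name, "owner": {"name": "QuantEcon"}, "plugins": plugins}


def write_marketplace(toolkit_root: Path) -> Path:
    """Write .claude-plugin/marketplace.json at the toolkit root and return its path."""
    out = toolkit_root / ".claude-plugin" / "marketplace.json"
    out.parent.mkdir(parents=True, exist_ok=True)
    manifest = build_marketplace(toolkit_root)
    out.write_text(json.dumps(manifest, indent=2) + "\n", encoding="utf-8")
    return out
```

Generating the file rather than editing it by hand means adding a plugin is a matter of creating its directory; whether that is worth the extra script for half a dozen plugins is a judgement call.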

Option B — marketplace + separate plugin repos. quantecon/claude-toolkit holds only marketplace.json with source: { source: "github", repo: "quantecon/latex-to-myst-plugin" } entries pointing at standalone plugin repos. More overhead, but better for independent versioning and contribution.

Recommendation: start with Option A, split out plugins to their own repos only when they outgrow the shared cadence (a translation-sync plugin that ships weekly probably wants its own release process; a style-guide plugin that changes twice a year doesn't).

Plugin internal structure

Each plugin follows the same pattern:

latex-to-myst/
├── .claude-plugin/
│   └── plugin.json
├── agents/
│   ├── latex-converter.md          # main converter subagent
│   ├── myst-validator.md           # lints output MyST
│   └── citation-fixer.md           # \cite → {cite}
├── commands/
│   ├── convert-lecture.md          # /convert-lecture orchestrator
│   └── diff-original.md            # /diff-original vs upstream .tex
├── skills/
│   └── latex-to-myst/
│       ├── SKILL.md                # when to use, edge-case catalogue
│       ├── math-environments.md
│       └── examples/               # before/after pairs
└── scripts/
    ├── pandoc_wrapper.sh
    ├── fix_refs.py
    └── extract_figures.py

Important constraint from the Anthropic docs: plugins are copied to a cache location on install, so they cannot reference files outside their directory via ../. Scripts must be bundled inside the plugin.
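
One way to catch violations of this constraint before a release is a small lint pass over each plugin directory. The sketch below is a heuristic, not an official check: it assumes that `../` references in Markdown, JSON, shell, and Python files are resolved relative to the file that contains them, and the function name and suffix list are made up for illustration.

```python
import re
from pathlib import Path

# Matches path-like tokens containing a parent-directory hop, e.g. ../shared/util.py
PARENT_REF = re.compile(r"[\w./-]*\.\./[\w./-]*")
# File types scanned; an assumption about where paths are likely to appear.
TEXT_SUFFIXES = {".md", ".json", ".sh", ".py"}


def find_escaping_refs(plugin_root: Path) -> list[tuple[str, int, str]]:
    """Return (file, line number, reference) for each ../ path that leaves plugin_root."""
    root = plugin_root.resolve()
    problems = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in TEXT_SUFFIXES:
            continue
        lines = path.read_text(encoding="utf-8", errors="replace").splitlines()
        for lineno, line in enumerate(lines, start=1):
            for ref in PARENT_REF.findall(line):
                # Resolve the reference relative to the file that mentions it.
                target = (path.parent / ref).resolve()
                if not target.is_relative_to(root):
                    problems.append((path.relative_to(root).as_posix(), lineno, ref))
    return problems
```

A reference such as `../scripts/pandoc_wrapper.sh` inside agents/ stays within the plugin and passes; anything that climbs above the plugin root is reported with its file and line.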

Agent + script split

The pattern that's been working in practice:

  1. Scripts do the 80% of work that's deterministic (pandoc, regex fixes, file shuffling).
  2. Agents handle the long tail of judgement calls (what to do when pandoc emits a malformed exercise environment, how to disambiguate a citation, when to convert a code listing to an executable cell).
  3. Skills are the growing reference catalogue of edge cases and conventions.

A trimmed example of a subagent definition:

---
name: latex-converter
description: Converts a single LaTeX file to MyST Markdown for QuantEcon lectures. Use when given a .tex file or pointed at one in lecture-source/.
tools: Read, Write, Edit, Bash, Grep
model: sonnet
---
You convert QuantEcon LaTeX lectures to MyST Markdown.

Workflow:
1. Run `scripts/pandoc_wrapper.sh <input.tex>` for a baseline conversion.
2. Read the output and fix QuantEcon-specific things pandoc gets wrong:
   - `\begin{exercise}` blocks → `{exercise}` directives
   - `\cite{key}` → `` {cite}`key` ``
   - Code listings → executable code cells with appropriate tags
3. Run `scripts/fix_refs.py <output.md>` for mechanical reference rewrites.
4. Hand off to the myst-validator subagent for a final check.

Load skills/latex-to-myst/SKILL.md before starting — it catalogues edge cases.
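
To make the script side of this split concrete, here is a minimal sketch of the kind of mechanical rewrite a script like fix_refs.py might perform. It is not the existing script: the function name, the regex, and the command-to-role mapping are assumptions, and it deliberately handles only the simplest forms so that anything ambiguous (for example `\citep` or optional arguments) is left for the agent.

```python
import re

# LaTeX command -> MyST role. This mapping is an assumption for illustration.
ROLE_FOR_COMMAND = {"cite": "cite", "ref": "ref", "eqref": "eq"}

# Only bare \cite{...}, \ref{...} and \eqref{...}; variants are left untouched.
_PATTERN = re.compile(r"\\(cite|ref|eqref)\{([^{}]+)\}")


def rewrite_refs(text: str) -> str:
    """Rewrite simple \\cite{..}, \\ref{..} and \\eqref{..} commands as MyST roles."""

    def _replace(match: re.Match) -> str:
        role = ROLE_FOR_COMMAND[match.group(1)]
        # Normalise "a, b" to "a,b" so multi-key citations stay in one role.
        keys = ",".join(key.strip() for key in match.group(2).split(","))
        return f"{{{role}}}`{keys}`"

    return _PATTERN.sub(_replace, text)


if __name__ == "__main__":
    sample = r"See \cite{sargent1987, ljungqvist2018} and \eqref{eq:bellman}."
    print(rewrite_refs(sample))
```

Keeping the regex narrow is the design choice here: the script should never guess, because a wrong mechanical rewrite is harder for the validator subagent to spot than an untouched LaTeX command.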

Proposed plugins

Initial set, ordered by likely impact:

  1. latex-to-myst — Converts LaTeX lectures (legacy or external contributions) to QuantEcon MyST. Wraps pandoc, fixes references, normalises directives, handles math environments.
  2. lecture-style — Promotes our existing GitHub Action style-guide checker into a plugin. Same prompt, same logic, runnable locally as /check-style and in CI as a headless claude invocation.
  3. translation-sync — Orchestrates English→Simplified Chinese sync. Main agent fans out subagents per chapter; each subagent diffs against the canonical English source and produces a PR-ready Chinese update.
  4. doi-citations — Handles DOI lookups, citation key normalisation, and bibliography hygiene across lecture repos.
  5. jupyter-book-fixer — Build error triage. When a Jupyter Book build fails, this plugin's agent reads the build log, identifies the failing notebook/file, and proposes a fix.
  6. repo-hygiene — Catch-all for the small-but-recurring ops work: gh-pages branch size, stale workflows, broken cross-repo links.

Each plugin should stay small. Community guidance favours single-purpose plugins with roughly 3–4 components each, which keeps token usage down and makes each plugin easier to reason about.
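
For translation-sync, the fan-out only needs to cover chapters that actually changed, and that part is deterministic. The sketch below assumes a hypothetical JSON manifest that records the SHA-256 of each English chapter at the time of its last sync; the manifest format, file layout, and function names are illustrative, not an existing QuantEcon convention.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def chapters_needing_sync(english_dir: Path, manifest_path: Path) -> list[str]:
    """List English chapters whose content changed since the last recorded sync."""
    synced = {}
    if manifest_path.is_file():
        # Hypothetical manifest: {"chapter.md": "<sha256 at last sync>", ...}
        synced = json.loads(manifest_path.read_text(encoding="utf-8"))
    stale = []
    for chapter in sorted(english_dir.glob("*.md")):
        if synced.get(chapter.name) != file_digest(chapter):
            stale.append(chapter.name)
    return stale
```

The orchestrating agent could run this first and hand one subagent each stale chapter, so unchanged chapters never consume a context window.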

Migration path

Phased so we get value quickly without committing to the full architecture before we've learned its shape:

Phase 1 — proof of concept (1–2 weeks). Pick one workflow we run often (suggest lecture-style since the prompt already exists in our action). Build it as a standalone plugin in a fresh repo. Install it in one lecture repo. Iterate until it feels right.

Phase 2 — second plugin, prove portability (1–2 weeks). Build latex-to-myst as a second plugin in the same repo. Confirm the marketplace structure works and that both plugins can be installed independently.

Phase 3 — promote to quantecon/claude-toolkit (1 week). Move the repo into the QuantEcon org, document the install flow in the contributing guides for each lecture repo, announce.

Phase 4 — port remaining workflows. Translation sync, DOI handling, build fixer, etc. — each as its own plugin, added incrementally.

Phase 5 — CI parity. Update GitHub Actions to install the marketplace and invoke plugins via headless claude. Single source of truth for prompts; CI and local dev stay in lockstep.
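
A CI step for Phase 5 could be a thin wrapper around the headless CLI. The sketch below uses `claude -p` (print mode) with `--output-format json`; everything else is an assumption: that the plugin is already installed in the CI environment, that a plugin slash command such as `/check-style` can be passed as the prompt in print mode, and the wrapper function names themselves.

```python
import json
import subprocess


def build_headless_command(prompt: str) -> list[str]:
    """Build an argv list for a non-interactive Claude Code run."""
    # -p runs a single prompt and exits; JSON output is easier to parse in CI.
    return ["claude", "-p", prompt, "--output-format", "json"]


def run_style_check(path: str, runner=subprocess.run) -> dict:
    """Invoke the (assumed) /check-style command on one file and parse the result."""
    # Passing a plugin slash command as the prompt is an assumption to verify.
    prompt = f"/check-style {path}"
    completed = runner(
        build_headless_command(prompt),
        capture_output=True,
        text=True,
        check=True,  # raise if the CLI exits non-zero so the CI job fails
    )
    return json.loads(completed.stdout)
```

The `runner` parameter exists only so the wrapper can be exercised in tests without calling the real CLI.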

Open questions for discussion

  • Marketplace visibility. Public from day one, or private until v1? Public has the benefit that other educational orgs using MyST could borrow patterns (and we could borrow back). Private lets us iterate without external commitments.
  • Versioning and breaking changes. Plugins are installed by reference to a repo. How do we communicate breaking changes to lecture-repo contributors who have it installed? Pinning by tag/commit in marketplace.json is supported but adds friction.
  • Model selection. Default to Sonnet for most agents? Reserve Opus for the conversion agents where quality matters most? Document per-plugin recommendations in plugin.json.
  • Skill granularity. Should latex-to-myst ship one SKILL.md or several (math, citations, code cells, figures)? Probably depends on how often each section gets edited independently.
  • Shared utilities. If three plugins all need a "find the canonical lecture source for this filename" helper, where does it live? Plugins can't share via ../. Options: duplicate, publish a shared script package on PyPI, or keep the helper inside one plugin and call it as a subagent from the others.
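
On the versioning question, if we do decide to pin, a small check in the toolkit's own CI could flag any externally hosted plugin that is not pinned. This sketch assumes that pinning is expressed through a `ref` or `sha` key inside an object-valued `source` entry in marketplace.json; those key names are my reading of the marketplace format and should be confirmed against the Anthropic docs, and the function name is hypothetical.

```python
def unpinned_plugins(marketplace: dict) -> list[str]:
    """Return names of plugins whose external source has no tag, branch, or commit pin."""
    names = []
    for entry in marketplace.get("plugins", []):
        source = entry.get("source")
        if not isinstance(source, dict):
            # A relative path lives in the marketplace repo and is versioned with it.
            continue
        # "ref" and "sha" are assumed key names for pinning; verify before use.
        if not (source.get("ref") or source.get("sha")):
            names.append(entry["name"])
    return names


if __name__ == "__main__":
    example = {
        "plugins": [
            {"name": "lecture-style", "source": "./plugins/lecture-style"},
            {
                "name": "latex-to-myst",
                "source": {"source": "github", "repo": "quantecon/latex-to-myst-plugin"},
            },
        ]
    }
    print(unpinned_plugins(example))
```

Under Option A every plugin uses a relative path, so this check only starts to matter once plugins are split out as in Option B.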

Happy to spike Phase 1 (lecture-style plugin) as a concrete starting point — that gets us a working example to evaluate the rest of the proposal against without committing to the full architecture upfront.
