Recursive CLI documentation introspection for humans and AI agents.
cmdgraph runs CLI help commands (--help, -h, -H, or help), discovers subcommands recursively, parses the output into structured data, and exports documentation as JSON, Markdown, static HTML, llms.txt, and sitemap.xml.
Most CLIs are documented in unstructured terminal text. cmdgraph turns that into:
- Machine-friendly JSON for indexing, retrieval, and agent pipelines
- Human-readable Markdown for generated docs and internal references
- Static single-page HTML for hosting CLI docs as a site
- Explicit `llms.txt` output for LLM-facing discovery
- Explicit `sitemap.xml` output for search-engine discovery
- A command tree (AST) that preserves hierarchy and relationships

Features:
- Recursive command discovery from `--help`, `-h`, `-H`, or `help`
- Plugin parser system (`heuristic`, `oclif`, `commander`, `yargs`, `cobra`, `thor`, `picocli`, `urfave-cli`, `system-commandline`, `commandlineparser`, `click`, `typer`, `clap`, `argparse`)
- Best-effort metadata extraction for version, arguments, examples, and aliases
- Concurrency control for recursive help crawling
- Automatic in-memory caching of help outputs within a process
- Timeout-safe command execution using `execa`
- Non-interactive execution defaults (`CI=1`, `NO_COLOR=1`)
- JSON, Markdown, static HTML, `llms.txt`, and `sitemap.xml` output formats
- Searchable single-page HTML docs with client-side command filtering
- Search-engine and LLM-friendly discovery artifacts without coupling them to HTML output
- Unit + integration + e2e tests with deterministic fixtures

Requirements:
- Node.js `>=18`

Install:
npm install -g cmdgraph
Usage:
cmdgraph generate <command> [options]
Examples:
cmdgraph generate git --format=json --format=md --output=./docs
cmdgraph generate git --format=html --output=./site
cmdgraph generate git --format=html --output-html-title="Git CLI Docs" --output-html-project-link=https://github.com/acme/git-cli --output-html-readme=README.md --output=./site
cmdgraph generate git --format=html --format=llms-txt --format=sitemap --output-root-command-name=cmdgraph --output-llms-txt-base-url=https://docs.example.com/git/ --output-sitemap-base-url=https://docs.example.com/git/ --output=./site
cmdgraph generate git --max-depth=3 --concurrency=4 --format=json --output=./docs
cmdgraph generate kubectl --max-depth=3 --timeout=8000 --format=json --output=./docs
cmdgraph generate kubectl --config=./configs/cmdgraph.kubectl.json
cmdgraph generate "node ./tools/my-cli.mjs" --parser=heuristic --format=md --output=./docs

cmdgraph can load options and flags from JSON.
- By default, it reads `./cmdgraph.config.json` from the current working directory when the file exists.
- Use `--config=<path>` to load any JSON file.
- Explicit CLI flags always win over config values.
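
The precedence rule above can be pictured as a layered merge. The sketch below is an illustration only: `resolveOptions` is a hypothetical helper, not part of cmdgraph, and it assumes options are plain key/value pairs where later layers override earlier ones.

```typescript
type Options = Record<string, unknown>;

// Hypothetical helper (not cmdgraph's API): merge defaults, then config file
// values, then explicit CLI flags. Undefined values never override a layer below.
function resolveOptions(defaults: Options, config: Options, cliFlags: Options): Options {
  const merged: Options = { ...defaults };
  for (const source of [config, cliFlags]) {
    for (const [key, value] of Object.entries(source)) {
      if (value !== undefined) merged[key] = value;
    }
  }
  return merged;
}

const resolved = resolveOptions(
  { timeout: 5000, concurrency: 4, format: ["json"] }, // documented defaults
  { timeout: 8000, format: ["json", "md"] },           // values from a config file
  { timeout: 2000 },                                   // explicit --timeout=2000
);
// resolved["timeout"] is 2000 (CLI flag wins), resolved["format"] comes from the
// config file, and resolved["concurrency"] falls back to the default of 4.
```
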
Config keys support both flat flag names and grouped nested sections:
{
  "max-depth": 3,
  "timeout": 8000,
  "concurrency": 4,
  "parser": "heuristic",
  "format": ["json", "md"],
  "output": {
    "directory": "./docs",
    "rootCommandName": "cmdgraph",
    "html": {
      "title": "cmdgraph CLI Docs",
      "projectLink": "https://github.com/haoliangyu/cmdgraph",
      "readme": "./README.md"
    },
    "llmsTxt": {
      "baseUrl": "https://docs.example.com/cmdgraph/"
    },
    "sitemap": {
      "baseUrl": "https://docs.example.com/cmdgraph/"
    }
  },
  "crawler": {
    "maxDepth": 3,
    "timeout": 8000,
    "concurrency": 4,
    "parser": "heuristic"
  }
}

| Option | Type | Default | Description |
|---|---|---|---|
| `--max-depth` | integer | | Maximum recursion depth for subcommands (when omitted, crawl continues until leaf commands) |
| `--concurrency` | integer | `4` | Maximum number of help commands to run in parallel |
| `--timeout` | integer | `5000` | Per-command timeout in ms |
| `--parser` | string | | Force a parser plugin by name |

| Option | Type | Default | Description |
|---|---|---|---|
| `--config` | string | `./cmdgraph.config.json` | Path to a JSON config file; if omitted, the default file is used only when it exists |

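
Since config keys may be written either flat or in nested sections, one way to think about loading is as a normalization step onto the flat flag names. The sketch below is a guess at such a step, not cmdgraph's actual loader: the `NESTED_TO_FLAT` table is inferred from the example config and the option names, and the choice that flat keys win over nested ones is an assumption.

```typescript
// Assumed mapping from nested config paths to flat flag names (inferred, not confirmed).
const NESTED_TO_FLAT: Record<string, string> = {
  "crawler.maxDepth": "max-depth",
  "crawler.timeout": "timeout",
  "crawler.concurrency": "concurrency",
  "crawler.parser": "parser",
  "output.directory": "output",
  "output.rootCommandName": "output-root-command-name",
  "output.html.title": "output-html-title",
  "output.html.projectLink": "output-html-project-link",
  "output.html.readme": "output-html-readme",
  "output.llmsTxt.baseUrl": "output-llms-txt-base-url",
  "output.sitemap.baseUrl": "output-sitemap-base-url",
};

// Walk a dotted path such as "output.html.title"; returns undefined if any step is missing.
function getPath(obj: unknown, path: string): unknown {
  return path.split(".").reduce<unknown>(
    (cur, key) =>
      cur !== null && typeof cur === "object" ? (cur as Record<string, unknown>)[key] : undefined,
    obj,
  );
}

// Hypothetical normalizer: copy flat (non-object) keys, then fill in nested values.
function flattenConfig(config: Record<string, unknown>): Record<string, unknown> {
  const flat: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(config)) {
    if (value === null || typeof value !== "object" || Array.isArray(value)) flat[key] = value;
  }
  for (const [path, flag] of Object.entries(NESTED_TO_FLAT)) {
    const value = getPath(config, path);
    // Assumption: a flat key takes priority over its nested equivalent.
    if (value !== undefined && !(flag in flat)) flat[flag] = value;
  }
  return flat;
}

const flatConfig = flattenConfig({
  "max-depth": 3,
  format: ["json", "md"],
  output: { directory: "./docs", html: { title: "cmdgraph CLI Docs" } },
  crawler: { maxDepth: 5, timeout: 8000 },
});
```

Under these assumptions `flatConfig` holds `max-depth: 3` (the flat key), `timeout: 8000`, `output: "./docs"`, and `output-html-title: "cmdgraph CLI Docs"`.
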
| Option | Type | Default | Description |
|---|---|---|---|
| `--format` | repeatable; one of `json`, `md`, `html`, `llms-txt`, `sitemap` | `json` | Output format; repeat the flag to write multiple outputs |
| `--output` | string | `./docs` | Output directory |
| `--output-root-command-name` | string | | Override the displayed root command name in generated outputs |
| `--output-html-title` | string | | Set HTML page title |
| `--output-html-project-link` | string | | Project URL shown in the HTML footer |
| `--output-html-readme` | string | | Path to a `.md` file rendered as a README section in the HTML page |
| `--output-llms-txt-base-url` | string | | Base URL used to generate `llms.txt` links |
| `--output-sitemap-base-url` | string | | Base URL used to generate `sitemap.xml` links (required for sitemap output) |

cmdgraph can also be used as a library:
import { generate, introspect } from 'cmdgraph'

// Crawl only: returns the command tree plus any warnings.
const { tree, warnings } = await introspect('git', {
  timeoutMs: 5000,
  concurrency: 4,
})

// Crawl and render: each requested format is returned as a property on the result.
const generated = await generate('git', {
  timeout: 5000,
  concurrency: 4,
  parser: 'heuristic',
  'output-root-command-name': 'cmdgraph',
  'output-html-title': 'Git CLI Documentation',
  'output-html-project-link': 'https://github.com/haoliangyu/cmdgraph',
  'output-html-readme': './README.md',
  'output-llms-txt-base-url': 'https://docs.example.com/git/',
  'output-sitemap-base-url': 'https://docs.example.com/git/',
  format: ['json', 'md', 'html', 'llms-txt', 'sitemap'],
})

console.log(generated.json)
console.log(generated.markdown)
console.log(generated.html)
console.log(generated.llmsTxt)
console.log(generated.sitemap)

Library API notes:
- `introspect(command, options)` returns `{ tree, warnings }`
- `generate(command, options)` returns `{ tree, json?, markdown?, html?, llmsTxt?, sitemap?, warnings }`
- omit `maxDepth`/`max-depth` to recurse until leaf commands; set it explicitly to cap traversal depth
- `options.format` supports `json`, `md`, `html`, `llms-txt`, and `sitemap`; pass an array for multiple outputs, and omit it to default to JSON
- `options.output-root-command-name` overrides the displayed root command name in generated outputs
- `options.output-html-title` customizes the HTML page title
- `options.output-html-project-link` adds a project URL link to the HTML footer
- `options.output-html-readme` points to a `.md` file to render as a README section in HTML output
- `options.output-sitemap-base-url` is required for `sitemap`; `options.output-llms-txt-base-url` controls `llms.txt` links
- `generate` options align with CLI flag names: `max-depth`, `timeout`, `concurrency`, `parser`, `format`, `output-root-command-name`, `output-html-title`, `output-html-project-link`, `output-html-readme`, `output-llms-txt-base-url`, and `output-sitemap-base-url`
- advanced injection (`executor`, `parserRegistry`) is available for tests/custom integration
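
To make the depth rule in these notes concrete, here is a toy model of a depth-capped recursive crawl with an injected help runner and an in-memory cache. Everything in it is hypothetical (`HelpRunner`, `CrawlNode`, `listSubcommands`, `crawl`, and the rule that the root sits at depth 0), and it is synchronous for brevity, whereas a real executor would be asynchronous. It is a sketch of the idea, not cmdgraph's implementation.

```typescript
// Hypothetical types for illustration only.
type HelpRunner = (path: string[]) => string; // returns raw help text for a command path

interface CrawlNode {
  path: string[];
  children: CrawlNode[];
}

// Toy extraction: indented names listed directly under a "Commands:" line.
function listSubcommands(help: string): string[] {
  const lines = help.split("\n");
  const start = lines.findIndex((line) => line.trim() === "Commands:");
  if (start === -1) return [];
  const names: string[] = [];
  for (const line of lines.slice(start + 1)) {
    const match = /^\s{2,}(\S+)/.exec(line);
    if (!match) break; // the section ends at the first non-indented line
    names.push(match[1]);
  }
  return names;
}

// Recurse until leaf commands when maxDepth is omitted; otherwise stop at the cap.
function crawl(
  path: string[],
  run: HelpRunner,
  maxDepth?: number,
  cache: Map<string, string> = new Map(),
): CrawlNode {
  const key = path.join(" ");
  let help = cache.get(key);
  if (help === undefined) {
    help = run(path); // each help command runs at most once per cache
    cache.set(key, help);
  }
  const depth = path.length - 1; // assumption: the root command is depth 0
  const atCap = maxDepth !== undefined && depth >= maxDepth;
  const children = atCap
    ? []
    : listSubcommands(help).map((name) => crawl([...path, name], run, maxDepth, cache));
  return { path, children };
}

// A fake CLI standing in for a real executor.
const fakeHelp: Record<string, string> = {
  "tool": "Usage: tool\n\nCommands:\n  add\n  remote\n",
  "tool add": "Usage: tool add\n",
  "tool remote": "Usage: tool remote\n\nCommands:\n  list\n",
  "tool remote list": "Usage: tool remote list\n",
};
const fakeRun: HelpRunner = (path) => fakeHelp[path.join(" ")] ?? "";

const fullTree = crawl(["tool"], fakeRun);      // reaches "tool remote list"
const cappedTree = crawl(["tool"], fakeRun, 1); // stops below "tool remote"
```
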
cmdgraph uses a plugin-based parser registry. You can force one with --parser, or let cmdgraph auto-detect.
- `heuristic`: default and fallback parser; handles common help layouts (`Usage`, `Commands`, `Options`/`Flags`); recommended for most tools.
- `oclif`: parser for oclif-style CLIs (supports uppercase section blocks such as `USAGE`, `COMMANDS`, `FLAGS`).
- `commander`: parser for Commander.js-style output (`display help for command`, `output the version number`).
- `yargs`: parser for yargs-style output (`Show help`, `Show version number`, type hints like `[boolean]`).
- `cobra`: parser for Cobra-style CLIs (`Available Commands`, `Flags`, `Global Flags`).
- `thor`: parser for Thor-style CLIs (`Usage: ... COMMAND [ARGS]`, `Commands`/`Tasks` headings, e.g. Bundler CLI).
- `picocli`: parser for picocli-style Java CLIs (`Show this help message and exit.`, `Print version information and exit.`, e.g. Gradle).
- `urfave-cli`: parser for urfave/cli-style Go CLIs (`NAME`, `USAGE`, `COMMANDS`, `GLOBAL OPTIONS`).
- `system-commandline`: parser for .NET System.CommandLine CLIs (`Usage:` heading blocks and `Show help and usage information`).
- `commandlineparser`: parser for C# CommandLineParser CLIs (`USAGE:`, `OPTIONS:`, `Display this help screen.`, `Display version information.`).
- `click`: parser for Click-style output (`[OPTIONS]`, `Show this message and exit`).
- `typer`: parser for Typer-style output (Click-based plus completion flags and boxed sections).
- `clap`: parser for clap-style output (`Print help`, `Print version`).
- `argparse`: parser for Python argparse-style output (`usage:`, `show this help message and exit`).
Parser selection behavior:
- If `--parser` is provided, that parser is used.
- Otherwise, parser `detect()` methods are checked.
- If nothing matches, `heuristic` is used.
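
The three selection rules above can be sketched as a small function. The `ParserPlugin` interface, `selectParser`, and the detection checks below are assumptions made for illustration; the source only states that plugins expose a `detect()` method, and the marker strings are borrowed from the parser descriptions rather than from the real detectors.

```typescript
// Hypothetical plugin shape; only the existence of detect() comes from the README.
interface ParserPlugin {
  name: string;
  detect(helpText: string): boolean;
}

const heuristicPlugin: ParserPlugin = { name: "heuristic", detect: () => true };
const cobraPlugin: ParserPlugin = {
  name: "cobra",
  detect: (text) => text.includes("Available Commands:"),
};
const clapPlugin: ParserPlugin = {
  name: "clap",
  detect: (text) => text.includes("Print help"),
};

function selectParser(
  helpText: string,
  plugins: ParserPlugin[],
  fallback: ParserPlugin,
  forced?: string,
): ParserPlugin {
  // Rule 1: an explicitly requested parser is used without detection.
  if (forced !== undefined) {
    const match = [...plugins, fallback].find((plugin) => plugin.name === forced);
    // Assumption: an unknown name is an error rather than a silent fallback.
    if (!match) throw new Error(`Unknown parser: ${forced}`);
    return match;
  }
  // Rule 2: the first plugin whose detect() matches wins.
  // Rule 3: otherwise fall back to the heuristic parser.
  return plugins.find((plugin) => plugin.detect(helpText)) ?? fallback;
}

const registry = [cobraPlugin, clapPlugin];
const detected = selectParser("Available Commands:\n  serve", registry, heuristicPlugin);
const forcedChoice = selectParser("Available Commands:\n  serve", registry, heuristicPlugin, "clap");
const fallbackChoice = selectParser("plain text with no markers", registry, heuristicPlugin);
```

Here `detected.name` is `cobra`, `forcedChoice.name` is `clap` even though the text looks like Cobra output, and `fallbackChoice.name` is `heuristic`.
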
For cmdgraph generate git --format=json --format=md --format=html --format=llms-txt --format=sitemap --output-llms-txt-base-url=https://docs.example.com/git/ --output-sitemap-base-url=https://docs.example.com/git/ --output=./docs, you get:
- `docs/git.json`
- `docs/git.md`
- `docs/index.html`
- `docs/llms.txt`
- `docs/sitemap.xml`
Why these formats:
- JSON is agent-ready because it is structured, stable, and easy to index, diff, validate, and consume in automation pipelines.
- Markdown is human-readable because it is hierarchy-first, scannable in docs/reviews, and works well in repos, wikis, and generated documentation sites.
- HTML is hosting-ready because it renders the canonical command tree into a single accessible page with dark mode support and navigation for static-site deployment.
- `llms.txt` is explicit because it gives LLM crawlers a compact text map of the hosted documentation without embedding that responsibility into the HTML page itself.
- `sitemap.xml` is explicit because search-engine discovery depends on deployable site URLs, not just local output files.
JSON shape:
{
  "name": "git",
  "description": "The stupid content tracker",
  "version": "2.49.0",
  "usage": "git [options] [command]",
  "aliases": [],
  "arguments": [],
  "examples": [],
  "options": [
    { "flag": "-h, --help", "description": "display help" }
  ],
  "subcommands": ["add", "commit", "push"],
  "path": ["git"],
  "children": []
}

Matching Markdown output:
# Command Documentation
## git
The stupid content tracker
**Usage:** `git [options] [command]`
**Version:** `2.49.0`
**Options**
- `-h, --help`: display help
**Subcommands**
- `add`
- `commit`
- `push`

HTML output characteristics:
- generated as a single `index.html` file for static hosting
- rendered from a React template via server-side rendering
- styled with Tailwind CSS and shadcn/ui-inspired component patterns
- modern light-green theme by default, with an accessible dark mode toggle
- includes client-side command filtering for large documentation pages
- includes crawlable semantic content, metadata, and structured data for search engines and LLM bots
Discovery artifact characteristics:
- `llms.txt` is generated separately and lists the hosted documentation page plus command-level anchors
- `sitemap.xml` is generated separately and requires `--output-sitemap-base-url` so it contains valid deployable URLs
- HTML output does not implicitly generate either file; request them explicitly with `--format=llms-txt` and `--format=sitemap`
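
The base-URL requirement can be illustrated with standard URL resolution. The sketch below uses the WHATWG `URL` class available in Node.js; `absoluteUrl`, `canStandAlone`, and the `#git-commit` anchor id are invented for the example, and whether cmdgraph resolves links this way internally is an assumption.

```typescript
// Hypothetical helper: resolve a relative target against the configured base URL.
function absoluteUrl(baseUrl: string, target: string): string {
  return new URL(target, baseUrl).href;
}

// A relative path cannot be parsed as a URL on its own, which is why a base is needed.
function canStandAlone(target: string): boolean {
  try {
    new URL(target);
    return true;
  } catch {
    return false;
  }
}

const docsBase = "https://docs.example.com/git/";
const pageUrl = absoluteUrl(docsBase, "index.html");
// "https://docs.example.com/git/index.html"
const anchorUrl = absoluteUrl(docsBase, "index.html#git-commit"); // invented anchor id
// "https://docs.example.com/git/index.html#git-commit"

// Without the trailing slash, the last path segment is replaced instead of extended.
const withoutSlash = absoluteUrl("https://docs.example.com/git", "index.html");
// "https://docs.example.com/index.html"

const relativeOnly = canStandAlone("index.html"); // false
```

If links are resolved this way, the trailing slash on the base URL matters, which matches the base URLs used in the examples in this document.
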
`npm run build:docs:release` generates a JSON reference guide and places it inside the published package payload:
- `dist/agent-reference/cmdgraph.json` (stable path for agents)
How to use it in an agent/tooling workflow:
- Install the package.
- Read `dist/agent-reference/cmdgraph.json` from the installed package directory.
- Use the command tree, options, and examples as the source of truth when generating or validating `cmdgraph` usage.
Notes:
- The file is generated from live introspection of the built CLI.
- It is rebuilt on package release.
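
As a rough idea of what "source of truth" could look like in tooling, the sketch below walks a tree shaped like the JSON example in this document and checks whether a command path and flag exist. The `CommandNode` type is inferred from that example (which fields are optional is a guess), `findCommand` and `hasFlag` are hypothetical helpers, and the sample data is inlined rather than read from the package file.

```typescript
// Types inferred from the documented JSON shape; optionality is an assumption.
interface CommandOption {
  flag: string;
  description: string;
}

interface CommandNode {
  name: string;
  path: string[];
  options: CommandOption[];
  children: CommandNode[];
  description?: string;
  usage?: string;
  subcommands?: string[];
}

// Follow a path such as ["git", "commit"] down through the children arrays.
function findCommand(root: CommandNode, path: string[]): CommandNode | undefined {
  if (path[0] !== root.name) return undefined;
  let node: CommandNode = root;
  for (const name of path.slice(1)) {
    const next = node.children.find((child) => child.name === name);
    if (!next) return undefined;
    node = next;
  }
  return node;
}

// Treat "-h, --help" as two spellings; ignore value placeholders such as "<msg>".
function hasFlag(node: CommandNode, flag: string): boolean {
  return node.options.some((option) =>
    option.flag
      .split(",")
      .map((part) => part.trim().split(/[\s=]/)[0])
      .includes(flag),
  );
}

// Inline sample standing in for the parsed reference file.
const sampleTree: CommandNode = {
  name: "git",
  path: ["git"],
  options: [{ flag: "-h, --help", description: "display help" }],
  subcommands: ["commit"],
  children: [
    {
      name: "commit",
      path: ["git", "commit"],
      options: [{ flag: "-m, --message <msg>", description: "commit message" }],
      children: [],
    },
  ],
};

const commitNode = findCommand(sampleTree, ["git", "commit"]);
const knowsMessage = commitNode !== undefined && hasFlag(commitNode, "--message"); // true
const knowsBogus = commitNode !== undefined && hasFlag(commitNode, "--bogus");     // false
```

An agent could run checks like these before emitting a command, rejecting paths or flags that the reference tree does not contain.
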
Run default tests (build + unit/integration):
npm test
Run real CLI e2e tests:
npm run test:e2e
Watch mode:
npm run test:watch
Current test coverage includes:
- Executor behavior (success + timeout)
- Heuristic parser with common and real-world fixtures (`git`, `docker`, `kubectl`, `gh` styles)
- Framework parser detection and parsing fixtures (`oclif`, `commander`, `yargs`, `cobra`, `thor`, `picocli`, `urfave-cli`, `system-commandline`, `commandlineparser`, `click`, `typer`, `clap`, `argparse`)
- Metadata extraction for aliases, arguments, and examples
- Library API tests for programmatic introspection and formatted output generation
- HTML formatter rendering and static site generation
- Explicit `llms.txt` and `sitemap.xml` generation and validation
- Integration crawling against a real fixture executable
- End-to-end generation through built CLI, with auto-skip when target CLIs are unavailable (including Bundler/Thor-style, Gradle/picocli-style, urfave/cli-style Go CLIs, and C# System.CommandLine/CommandLineParser CLIs)

Development:
npm install
npm run build
npm run lint
npm test
npm run test:e2e
Format code:
npm run format

License: MIT