Canon

Automated code review that enforces your repo's existing patterns — so humans can focus on what matters.

Code reviews are the bottleneck of modern development. Teams ship faster than ever, but every PR still waits in a queue for a senior dev to check naming conventions, duplicated utilities, and style inconsistencies. That's hours of human time spent on things a machine can catch in seconds.

The problem gets worse with new team members. A developer joining a large codebase doesn't know that fetchUser should follow the getEntity pattern, or that there's already a formatCurrency helper three directories away. The result: review bouncebacks, frustration, and slower ramp-up — often weeks of back-and-forth before new devs internalize the unwritten rules.

Canon fixes this. It reads your codebase, understands its patterns, and reviews every PR against them — automatically. No configuration files to maintain, no linter rules to write. Canon learns directly from the code that already exists.

How it works

1. A PR is opened in any repo where Canon is installed.
2. Canon fetches the diff and generates embeddings for the changed files.
3. It combines semantic search and lexical matching to find the most relevant existing files in the repo, the ones that should inform how the new code is written.
4. A RAG prompt is built from the diff, the similar files as context, team conventions (CANON.md), and any learned rules from past feedback.
5. An LLM analyzes the diff for deviations from the patterns in the existing codebase.
6. Inline review comments are posted on the exact lines, with one-click suggested fixes.
7. When a human replies to a Canon comment (agreeing, correcting, or adding context), that feedback is captured and linked to the original finding.
8. Once enough feedback accumulates, Canon uses an LLM to distill it into learned rules: concise team preferences that are injected into all future reviews for that repo.
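
The retrieval step, which combines semantic search with lexical matching, could look roughly like the sketch below. This README does not describe Canon's actual scoring, so everything here is an assumption made for illustration: the names `Candidate` and `rankCandidates`, the use of Jaccard token overlap as the lexical signal, and the 0.7 weight. In Canon the cosine search runs inside pgvector rather than in memory.

```typescript
// Hypothetical sketch: blend a semantic score (cosine similarity of embeddings)
// with a lexical score (Jaccard overlap of identifier-like tokens).
interface Candidate {
  filePath: string;
  embedding: number[];
  content: string;
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Lowercased identifier-like tokens; numbers and punctuation are ignored.
function tokenize(text: string): Set<string> {
  return new Set<string>(text.toLowerCase().match(/[a-z_][a-z0-9_]*/g) ?? []);
}

function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 0;
  let shared = 0;
  a.forEach((token) => {
    if (b.has(token)) shared++;
  });
  return shared / (a.size + b.size - shared);
}

// topK defaults to 5 to mirror TOP_K_SIMILAR; semanticWeight is an assumed value.
function rankCandidates(
  queryEmbedding: number[],
  queryText: string,
  candidates: Candidate[],
  topK = 5,
  semanticWeight = 0.7,
): { filePath: string; score: number }[] {
  const queryTokens = tokenize(queryText);
  return candidates
    .map((c) => ({
      filePath: c.filePath,
      score:
        semanticWeight * cosineSimilarity(queryEmbedding, c.embedding) +
        (1 - semanticWeight) * jaccard(queryTokens, tokenize(c.content)),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

A weighted sum is only one way to merge the two signals; rank fusion would be another reasonable choice.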

The more your team interacts with Canon, the better it gets. No setup beyond installing the GitHub App — Canon works on day one and improves from day two.

See it in action

In this example PR, a new file is submitted with several pattern violations. Canon reviewed it automatically and flagged 4 findings (2 high, 2 medium), requesting changes:

🔴 Duplicated hashing logic — reimplements sha256Hash inline instead of importing the existing shared utility from src/lib/hash
🔴 Duplicated concurrency helper — rewrites mapWithConcurrency locally instead of using the one from src/lib/concurrency
🟡 Wrong naming convention — interfaces use snake_case (file_analysis_result) when the entire codebase uses PascalCase (IndexProgress, RepoParams)
🟡 Catch variable pattern — uses err instead of error in transaction catch blocks, breaking the repo's established convention

Each finding is posted as an inline comment on the exact diff line, with a one-click suggested fix. See the full review on the PR.

Key features

Pattern-aware reviews, not generic linting

Canon doesn't check for "best practices" — it checks for your practices. If your repo uses camelCase for services and snake_case for DB columns, Canon knows. If you have a retry() wrapper and someone reimplements retry logic, Canon catches it.

One-click suggested fixes

Every finding includes a GitHub suggestion block. Reviewers (or the author) can apply the fix with a single click — no copy-paste, no manual edits.
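
GitHub renders a suggestion when a review comment contains a fenced block whose info string is `suggestion`. A minimal sketch of assembling such a comment body follows. The `Finding` shape and the `buildSuggestionComment` name are hypothetical, not Canon's actual types.

```typescript
// Three backticks, built at runtime so this example stays readable inside a fence.
const FENCE = "`".repeat(3);

// Hypothetical finding shape, for illustration only.
interface Finding {
  title: string;
  explanation: string;
  suggestedCode: string;
}

// Builds the markdown body of an inline review comment with a suggestion block.
function buildSuggestionComment(finding: Finding): string {
  return [
    `**${finding.title}**`,
    "",
    finding.explanation,
    "",
    `${FENCE}suggestion`,
    finding.suggestedCode,
    FENCE,
  ].join("\n");
}
```

The suggestion replaces the line or lines the comment is attached to, so the comment must be anchored to a valid line in the diff.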

Learns from your team

When a human replies to a Canon comment (agreeing, disagreeing, or clarifying), Canon captures that feedback. After enough feedback accumulates, it distills team preferences into learned rules that improve future reviews. Canon gets smarter the more your team uses it.

CANON.md — codify your conventions

Run /canon init on any repo to automatically generate a CANON.md file that documents your team's patterns. Canon samples representative files, analyzes them with an LLM, and opens a PR with proposed conventions. Your team reviews, edits, and merges. From then on, Canon enforces those conventions with higher priority.

Multi-repo, single instance

One Canon deployment serves every repo where the GitHub App is installed. Embeddings are stored per-repo in PostgreSQL with pgvector. No per-repo configuration needed.

Smart review rounds

Canon tracks review iterations per PR. On each subsequent review, the confidence threshold increases, so only new, high-confidence findings surface and noise stays low. The round count resets automatically, or manually with /canon reset.
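
One way to picture the rising threshold is a linear ramp with a cap, sketched below. Only the 0.9 starting point comes from this README (the `MIN_CONFIDENCE` default); the step size, the cap, and the function names are assumptions, and Canon's real schedule may differ.

```typescript
// Hypothetical schedule: start at the base threshold and raise it each round.
function confidenceThreshold(
  round: number, // 1 for the first review of a PR
  base = 0.9, // MIN_CONFIDENCE default from this README
  stepPerRound = 0.03, // assumed value
  cap = 0.99, // assumed value
): number {
  const raised = base + stepPerRound * Math.max(0, round - 1);
  return Math.min(cap, raised);
}

// Keeps only the findings that clear the threshold for the given round.
function filterFindings<T extends { confidence: number }>(
  findings: T[],
  round: number,
): T[] {
  const threshold = confidenceThreshold(round);
  return findings.filter((f) => f.confidence >= threshold);
}
```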

Automatic index updates

When code is pushed to the default branch, Canon incrementally updates its embeddings index. No manual reindexing needed — the similarity search stays current as the codebase evolves.

Commands

Canon responds to comments on PRs:

| Command | Description |
| --- | --- |
| `/canon review` | Re-trigger a review on the current PR |
| `/canon index` | Full repo re-indexing (regenerate all embeddings) |
| `/canon init` | Analyze repo patterns and propose a `CANON.md` via PR |
| `/canon reset` | Reset review count and trigger a fresh review |

Canon reacts with 👀 when processing and 🚀 when done.
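
Recognizing these commands in a PR comment could be sketched as follows. The parser name and the rule that the command must start the comment are assumptions; the four command names are the ones listed above.

```typescript
type CanonCommand = "review" | "index" | "init" | "reset";

const COMMANDS: readonly CanonCommand[] = ["review", "index", "init", "reset"];

// Returns the command named in a comment, or null when the comment is not a
// known /canon command. Assumes the command appears at the start of the comment.
function parseCanonCommand(commentBody: string): CanonCommand | null {
  const match = commentBody.trim().match(/^\/canon\s+([a-z]+)\b/i);
  if (!match) return null;
  const name = match[1].toLowerCase();
  return COMMANDS.find((command) => command === name) ?? null;
}
```

Returning `null` for unknown commands lets the caller ignore ordinary comments without raising an error.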

Tech stack

| Component | Technology |
| --- | --- |
| Runtime | Node.js + TypeScript (strict mode) |
| GitHub integration | Probot v14 |
| LLM | OpenAI `gpt-5.3-codex` (structured outputs via Zod) |
| Embeddings | OpenAI `text-embedding-3-small` |
| Vector storage | PostgreSQL + pgvector (cosine similarity, HNSW index) |
| Schema validation | Zod (structured LLM outputs) |
| Tests | Jest + ts-jest (330 tests, 30 suites) + testcontainers |
| Containerization | Docker Compose |

Supports both the Responses API (gpt-5.3-codex, o3, o4-mini) and Chat Completions API (gpt-4o, gpt-4.1, etc.) — auto-detected from the model name.
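
Detection from the model name might work along these lines. The rule below is a guess chosen only to reproduce the examples listed above; the logic in `openai-models.ts` is not shown in this README and may be different.

```typescript
type OpenAiApi = "responses" | "chat-completions";

// Hypothetical rule: o-series models and names containing "codex" use the
// Responses API; everything else falls back to Chat Completions.
function detectApi(model: string): OpenAiApi {
  const name = model.toLowerCase();
  const usesResponses = /^o\d/.test(name) || name.includes("codex");
  return usesResponses ? "responses" : "chat-completions";
}
```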

Architecture

┌───────────────────────────────────────────────────────────────────────┐
│                         GitHub Events                                 │
│                                                                       │
│ PR Opened  Comment (/canon *)  Push (default branch)  Reply to Canon  │
└──────┬──────────────┬─────────────────────┬──────────────────────┬────┘
       │              │                     │                      │
       ▼              ▼                     ▼                      ▼
┌─────────────────────────────┐   ┌─────────────┐    ┌──────────────────────┐
│     src/index.ts            │   │ push-handler│    │  feedback-handler    │
│     Event Router            │   │             │    │                      │
└──────────┬──────────────────┘   │ Incremental │    │ Links human replies  │
           │                      │ embedding   │    │ to original findings │
           ▼                      │ updates     │    │                      │
┌─────────────────────────────┐   └─────────────┘    │ Triggers distillation│
│     pr-handler.ts           │                      │ when threshold met   │
│     Review Orchestrator     │                      └───────────┬──────────┘
└──────────┬──────────────────┘                                  │
           │                                                     ▼
           │  ┌────────────────────────────────────────────────────────────────┐
           │  │                   PostgreSQL + pgvector                        │
           │  │                                                                │
           │  │  repo_embeddings     │ pr_reviews         │ pr_review_feedback │
           │  │  (owner, repo,       │ pr_review_findings │ repo_learned_rules │
           │  │   file_path,         │                    │                    │
           │  │   embedding,         │                    │                    │
           │  │   content_hash)      │                    │                    │
           │  └────────────────────────────────────────────────────────────────┘
           │
           ├─── 1. Fetch PR files ──────────────────── github-client.ts
           │
           ├─── 2. Parse diff ──────────────────────── diff-parser.ts
           │
           ├─── 3. Generate embeddings ─────────────── embeddings.ts ──► OpenAI Embeddings API
           │
           ├─── 4. Find similar files ──────────────── similarity.ts ──► pgvector cosine search 
           │
           ├─── 5. Fetch file contents ─────────────── github-client.ts
           │
           ├─── 6. Load context ────────────────────── CANON.md + learned rules
           │
           ├─── 7. Build RAG prompt ────────────────── prompt-builder.ts
           │
           ├─── 8. LLM analysis ────────────────────── reviewer.ts ────► OpenAI
           │
           ├─── 9. Post inline comments ────────────── commenter.ts
           │
           └── 10. Persist review ──────────────────── review-store.ts

Feedback loop

Human replies to Canon comment
        │
        ▼
feedback-handler.ts ──► storeFeedback() ──► pr_review_feedback table
        │
        │  (when feedback count >= DISTILL_THRESHOLD)
        ▼
distiller.ts ──► LLM summarizes feedback into rules
        │
        ▼
repo_learned_rules table ──► injected into future review prompts
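
The threshold trigger in this loop can be sketched in memory as follows. In Canon the feedback lives in the `pr_review_feedback` table and distillation calls an LLM; here both are replaced by a map and a callback. The class name, the `Feedback` shape, and the choice to clear the batch after distilling are all assumptions.

```typescript
// Hypothetical feedback record, for illustration only.
interface Feedback {
  findingId: string;
  reply: string;
}

type DistillFn = (repo: string, batch: Feedback[]) => void;

class FeedbackBuffer {
  private readonly pending = new Map<string, Feedback[]>();
  private readonly threshold: number;
  private readonly distill: DistillFn;

  constructor(threshold: number, distill: DistillFn) {
    this.threshold = threshold; // plays the role of DISTILL_THRESHOLD
    this.distill = distill;
  }

  // Stores one reply and returns true when this reply triggered distillation.
  add(repo: string, feedback: Feedback): boolean {
    const batch = this.pending.get(repo) ?? [];
    batch.push(feedback);
    if (batch.length >= this.threshold) {
      this.pending.delete(repo); // assumed: each reply is distilled once
      this.distill(repo, batch);
      return true;
    }
    this.pending.set(repo, batch);
    return false;
  }
}
```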

Getting started

Prerequisites

Node.js >= 18
npm >= 9
OpenAI API key
GitHub account with permissions to create GitHub Apps
PostgreSQL with pgvector (provided via Docker Compose, or install manually)

1. Clone and install

git clone <repo-url>
cd canon
npm install

2. Create a GitHub App

1. Go to github.com/settings/apps and click New GitHub App
2. Fill in:
   - Name: Canon (or any name)
   - Homepage URL: any URL
   - Webhook URL: for local dev, create a channel at smee.io/new and use that URL
   - Webhook secret: generate a random string and save it
3. Set permissions:
   - Pull requests: Read & Write
   - Contents: Read & Write
   - Issues: Read
4. Subscribe to events:
   - Pull request
   - Issue comment
   - Push
   - Pull request review comment
5. Click Create GitHub App
6. Note the App ID from the app page
7. Under Private keys, click Generate a private key and save the .pem file
8. Go to Install App and install on the repositories you want Canon to review

3. Configure environment

cp .env.example .env

Required

| Variable | Description |
| --- | --- |
| `APP_ID` | GitHub App ID |
| `PRIVATE_KEY_PATH` | Path to the `.pem` file (local dev) |
| `WEBHOOK_SECRET` | Secret from GitHub App creation |
| `OPENAI_API_KEY` | OpenAI API key |
| `DATABASE_URL` | PostgreSQL connection string |

In production, use PRIVATE_KEY (full .pem content as string) instead of PRIVATE_KEY_PATH. Probot reads PRIVATE_KEY first.

Optional

| Variable | Default | Description |
| --- | --- | --- |
| `PORT` | `3000` | Server port |
| `OPENAI_MODEL` | `gpt-5.3-codex` | Model for reviews |
| `EMBEDDING_MODEL` | `text-embedding-3-small` | Model for embeddings |
| `MIN_CONFIDENCE` | `0.9` | Minimum confidence to report a finding |
| `MAX_FINDINGS` | `10` | Max findings per review |
| `MAX_REVIEW_ROUNDS` | `3` | Max review iterations per PR |
| `TOP_K_SIMILAR` | `5` | Similar files used as context |
| `MAX_FILES_TO_EMBED` | `20` | Max files to embed per review |
| `MAX_REPO_FILES` | `10000` | Max files to traverse during indexing |
| `MAX_PROMPT_TOKENS` | `120000` | Token budget for the prompt |
| `RETRY_MAX_ATTEMPTS` | `3` | Retries for transient OpenAI errors |
| `RETRY_BASE_DELAY_MS` | `1000` | Base delay for exponential backoff |
| `EMBEDDING_DIMENSIONS` | auto | Override embedding dimensions |
| `DISTILL_THRESHOLD` | `5` | Feedback count before distilling learned rules |
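
Loading a few of these variables with their defaults could be sketched as below. The function names and the `ReviewConfig` shape are hypothetical, and the validation rules are assumptions; the variable names and default values are the ones from the table.

```typescript
// Hypothetical subset of the configuration, for illustration only.
interface ReviewConfig {
  port: number;
  minConfidence: number;
  maxFindings: number;
  distillThreshold: number;
}

type Env = Record<string, string | undefined>;

// Reads a numeric variable, falling back to the default when it is unset.
function readNumber(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isFinite(value)) {
    throw new Error(`${key} must be a number, got "${raw}"`);
  }
  return value;
}

function loadReviewConfig(env: Env): ReviewConfig {
  const minConfidence = readNumber(env, "MIN_CONFIDENCE", 0.9);
  if (minConfidence < 0 || minConfidence > 1) {
    throw new Error("MIN_CONFIDENCE must be between 0 and 1");
  }
  return {
    port: readNumber(env, "PORT", 3000),
    minConfidence,
    maxFindings: readNumber(env, "MAX_FINDINGS", 10),
    distillThreshold: readNumber(env, "DISTILL_THRESHOLD", 5),
  };
}
```

In a Node.js process this would be called as `loadReviewConfig(process.env)`. Failing at startup on a malformed value is usually easier to debug than a silent fallback.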

4. Run

Local development with Docker

docker compose up

Set WEBHOOK_PROXY_URL in .env to your smee.io channel
The GitHub App's Webhook URL must match the same channel
Open a PR in an installed repo to test

Production

npm run build
npm start

Scripts

| Script | Description |
| --- | --- |
| `npm run dev` | Hot-reload dev server (nodemon watches `src/`) |
| `npm run build` | Compile TypeScript to `dist/` |
| `npm start` | Run compiled build with Probot |
| `npm run lint` | Type-check without compiling (`tsc --noEmit`) |
| `npm test` | Run all tests (330 tests, 30 suites) |
| `npm run test:integration` | Integration tests with real PostgreSQL (testcontainers) |

Project structure

src/
├── index.ts                     # Entry point — routes GitHub events to handlers
├── config/
│   └── index.ts                 # Environment variable loading and validation
├── github/
│   ├── github-client.ts         # GitHub API helpers, CanonContext type
│   ├── pr-handler.ts            # Review orchestrator (the main flow)
│   ├── comment-handler.ts       # /canon commands and @canon mentions
│   ├── feedback-handler.ts      # Captures human replies to Canon comments
│   ├── init-handler.ts          # /canon init — generates CANON.md
│   └── push-handler.ts          # Incremental embedding updates on push
├── review/
│   ├── diff-parser.ts           # Parses PR diffs, builds valid line map
│   ├── prompt-builder.ts        # RAG prompt with numbered file content
│   ├── reviewer.ts              # LLM call with Zod structured outputs
│   └── commenter.ts             # Posts inline comments with deduplication
├── intelligence/
│   ├── embeddings.ts            # Generates and stores embeddings (pgvector)
│   ├── similarity.ts            # Cosine similarity search
│   ├── indexer.ts               # Full repo indexing
│   └── distiller.ts             # Distills feedback into learned rules
├── db/
│   ├── migrate.ts               # Schema migration (pgvector setup)
│   ├── review-store.ts          # Persists reviews + findings + feedback
│   ├── review-counts.ts         # Tracks review rounds per PR
│   └── learned-rules.ts         # Team preference rules storage
└── lib/
    ├── concurrency.ts           # Parallel execution with concurrency limit
    ├── content-cache.ts         # In-memory file content cache
    ├── db-client.ts             # PostgreSQL connection pool
    ├── hash.ts                  # SHA256 content hashing
    ├── openai-client.ts         # Singleton OpenAI client
    ├── openai-errors.ts         # Transient vs fatal error handling
    ├── openai-models.ts         # Model type detection
    ├── openai-runner.ts         # Dual-API text generation wrapper
    ├── retry.ts                 # Exponential backoff with jitter
    └── truncate.ts              # Text and token truncation

tests/                           # Mirrors src/ (30 suites, 330 tests)
tests/integration/               # Real PostgreSQL via testcontainers
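
The tree lists `retry.ts` as exponential backoff with jitter. A minimal sketch of that technique follows, using the "full jitter" variant, where the delay is drawn uniformly below an exponentially growing ceiling. Which variant Canon uses is not stated here, and treating `RETRY_MAX_ATTEMPTS` as the total number of attempts is also an assumption, as are the function names.

```typescript
// Delay before the next try. attempt is 0-based: 0 after the first failure.
// The random source is injectable so the function can be tested deterministically.
function backoffDelayMs(
  attempt: number,
  baseDelayMs = 1000, // RETRY_BASE_DELAY_MS default
  random: () => number = Math.random,
): number {
  const ceiling = baseDelayMs * 2 ** attempt;
  return Math.floor(random() * ceiling);
}

// Runs an async operation, retrying transient failures with backoff.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 3, // RETRY_MAX_ATTEMPTS default, read here as total attempts
  baseDelayMs = 1000,
  isTransient: (error: unknown) => boolean = () => true,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      const outOfAttempts = attempt + 1 >= maxAttempts;
      if (outOfAttempts || !isTransient(error)) throw error;
      const delay = backoffDelayMs(attempt, baseDelayMs);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

Jitter spreads retries out in time so that many concurrent reviews hitting the same rate limit do not all retry at the same instant.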

Multi-repo

Canon is multi-repo by design. One deployment serves all repos where the GitHub App is installed. Embeddings are stored per-repo in PostgreSQL, keyed by (owner, repo, file_path) with SHA256 content hashing for cache invalidation. Run /canon index on any repo to index it upfront.
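
Hash-based cache invalidation can be sketched as follows: a file is re-embedded only when the SHA256 of its current content differs from the stored `content_hash`. The name `sha256Hash` appears in this README, but its implementation here is an assumption, and `filesNeedingEmbedding` is a hypothetical helper. For brevity the sketch keys the stored hashes by file path within a single repo instead of the full (owner, repo, file_path) key.

```typescript
import { createHash } from "node:crypto";

// Hex-encoded SHA256 of a file's content.
function sha256Hash(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex");
}

interface RepoFile {
  filePath: string;
  content: string;
}

// Returns the paths whose content is new or has changed since the last index.
function filesNeedingEmbedding(
  files: RepoFile[],
  storedHashes: Map<string, string>, // file_path -> content_hash
): string[] {
  return files
    .filter((file) => storedHashes.get(file.filePath) !== sha256Hash(file.content))
    .map((file) => file.filePath);
}
```

Skipping unchanged files avoids paying for embeddings that would come out identical.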
