prlearn is a local-first CLI that learns from your GitHub pull requests. It stores PRs, review feedback, comments, files, commits, check failures, extracted learning cards, and evidence in SQLite, then exports compact personal coding memory files for agents and editors.
Runtime dependencies are intentionally minimal: Python 3.11+, SQLite from the standard library, and git. Live sync can use a stable GitHub App installation or the GitHub CLI gh fallback. Fixture sync and the default heuristic extractor work without GitHub credentials, Ollama, Codex, or an OpenAI API key.
For a full first-run guide, see docs/local-setup.md. For maintainer publication checks, see docs/public-release.md and SECURITY.md.
Install from GitHub for now:
python3 -m venv .venv
. .venv/bin/activate
python -m pip install "prlearn @ git+https://github.com/0xLLM73/prlearn.git@v0.1.0"
prlearn is not published to PyPI yet. Until there is a signed release workflow with trusted publishing, GitHub tags are the supported public distribution path.
For development from a local checkout, install in editable mode with the test extra:
python3 -m venv .venv
. .venv/bin/activate
python -m pip install -e ".[test]"
You can also run it directly from the repo:
python -m prlearn --help
python -m pytest -q
The only runtime package dependency is cryptography; the test extra adds pytest.
If your system has a python command that points to Python 3.11 or newer, you can use python -m ... after activating the venv. The commands below use python inside the venv.
Copy .env.example only as a template. Keep real environment values in your shell, scheduler, or private secret manager, not in git.
python -m prlearn init
python -m prlearn doctor
python -m prlearn daily --fixture tests/fixtures/github_small.json
python -m prlearn list
python -m prlearn export
The default home is ~/.prlearn and the default database is ~/.prlearn/prlearn.db. Every command accepts --home and --db overrides.
Use this loop when you want prlearn to help before coding or reviewing:
python3 -m venv .venv
. .venv/bin/activate
python -m pip install -e ".[test]"
python -m prlearn init
python -m prlearn doctor
python -m prlearn sync --author @me --limit 50
python -m prlearn extract --engine heuristic
python -m prlearn review --limit 10
python -m prlearn accept <id>
python -m prlearn preflight
python -m prlearn export
Keep the deterministic heuristic engine as the first local path. Use hybrid, ollama, codex, or openai only after the fixture workflow is passing and the chosen model provider is available.
For fixture-only validation without GitHub:
python -m prlearn daily --fixture tests/fixtures/github_small.json
python -m prlearn daily --fixture tests/fixtures/github_small.json
python -m prlearn daily --fixture tests/fixtures/github_incremental.json
python -m prlearn review --limit 10
python -m prlearn preflight --task "build a Stripe webhook handler"
python -m prlearn export
preflight is the main day-to-day command. It detects the current git repo, branch, changed files, optional paths, and task text, then ranks accepted lessons by specificity, recurrence, evidence quality, repo/path/language/topic match, recency, and generic-card penalties.
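The ranking described above can be pictured as a weighted score over the listed signals. The sketch below is illustrative only: the Lesson dataclass, the field names, the weights, and the score_lesson and rank helpers are all hypothetical and are not taken from prlearn's source.

```python
from dataclasses import dataclass


@dataclass
class Lesson:
    # Hypothetical shape; prlearn's real card schema may differ.
    title: str
    specificity: float       # 0..1, how concrete the lesson is
    recurrence: int          # how many times the pattern was seen
    evidence_quality: float  # 0..1
    context_match: float     # 0..1, repo/path/language/topic overlap
    days_since_seen: int
    is_generic: bool


def score_lesson(lesson: Lesson) -> float:
    """Combine the signals into one number. Weights are made up for illustration."""
    recency = 1.0 / (1.0 + lesson.days_since_seen / 30.0)
    score = (
        2.0 * lesson.specificity
        + 0.5 * min(lesson.recurrence, 5)  # cap so one noisy pattern cannot dominate
        + 1.5 * lesson.evidence_quality
        + 3.0 * lesson.context_match
        + 1.0 * recency
    )
    if lesson.is_generic:
        score -= 2.0  # generic-card penalty
    return score


def rank(lessons: list[Lesson], limit: int = 5) -> list[Lesson]:
    """Return the highest-scoring lessons first."""
    return sorted(lessons, key=score_lesson, reverse=True)[:limit]
```

Under these assumed weights, a specific, recently seen lesson that matches the current repo and paths outranks a generic one even when both are accepted.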
- prlearn init creates the home directory, config, logs, reports, exports, bin directory, SQLite DB, and schema migrations. It is idempotent.
- prlearn doctor checks Python, DB writability, git, GitHub App auth, gh, gh auth status, optional Ollama reachability/model availability, optional Codex login, optional OpenAI API key configuration, optional sqlite-vec, and scheduler status. Missing GitHub auth or model-provider auth is reported with next steps but does not block fixture or heuristic workflows.
- prlearn sync syncs from GitHub or a fixture. Use --fixture path to avoid live GitHub calls. Live --incremental sync uses sync_state.last_successful_sync_at with a default 3-day lookback. Fixture incremental replay stays deterministic unless you pass an explicit --since.
- prlearn seed inventories accessible repositories, prioritizes active repos, backfills PRs first, optionally scans commit history for high-signal fixes/reverts/security changes, stores resumable repo cursors, and creates reviewable candidates.
- prlearn extract creates learning output from dirty PR evidence. Use --engine heuristic|ollama|hybrid|codex|openai or the alias --provider.
- prlearn dedupe merges repeated learning patterns by canonical key and attaches new evidence instead of creating duplicate cards.
- prlearn daily runs doctor, incremental sync, extract, dedupe, export, and report with a lock file and daily log. It finishes with a concise summary and next command. Use --engine heuristic|ollama|hybrid|codex|openai or --provider.
- prlearn compare runs the same fixture PR corpus through multiple engines and prints comparable cards/candidates.
- prlearn compare-live runs live GitHub account seeding in isolated homes for multiple engines and prints comparable candidates without mixing provider state.
- prlearn privacy status|encrypt-raw|decrypt-raw inspects, encrypts, or decrypts sensitive stored raw GitHub payloads.
- prlearn list lists learning cards with status, tag, repo, JSON, limit, accepted/pending/rejected/archive shortcuts, usefulness scores, and verbose evidence summaries.
- prlearn review, accept, reject, archive, and merge manage card status. accept-candidate, reject-candidate, and merge-candidate manage model learning candidates. review --json, --ready, and --min-score keep review scriptable.
- prlearn telegram status|send|poll|rate sends pending learning candidates to Telegram and records ratings as major, minor, or not important.
- prlearn preflight [paths...] --paths ... --verbose --json ranks relevant accepted or pending learnings before coding.
- prlearn insights reports whether prlearn is learning useful things: accepted/pending counts, recurring lessons, generic cards, weak-evidence cards, ready candidates, and suggested next commands. Add --actionable to route learnings into "do before next PR", "recurring risks", "needs human review", and "low-value or stale" buckets.
- prlearn export writes LEARNINGS.md, context.json, and rules.md. The markdown memory file is concise, accepted-only by default, grouped by category, and avoids evidence dumps.
- prlearn schedule prints or installs a daily scheduler definition.
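To make the dedupe behavior concrete, here is a minimal sketch of merging by canonical key while pooling evidence. The canonical_key normalization, the card dictionaries, and the evidence identifiers are assumptions made for illustration; prlearn's actual key derivation is not documented here and may differ.

```python
import re


def canonical_key(title: str) -> str:
    """Hypothetical normalization: lowercase, keep alphanumeric runs, join with '-'."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def dedupe(cards: list[dict]) -> list[dict]:
    """Merge cards that share a canonical key and attach new evidence to the first one."""
    merged: dict[str, dict] = {}
    for card in cards:
        key = canonical_key(card["title"])
        if key not in merged:
            merged[key] = {"title": card["title"], "evidence": []}
        for item in card["evidence"]:
            # Attach only evidence that is not already recorded on the card.
            if item not in merged[key]["evidence"]:
                merged[key]["evidence"].append(item)
    return list(merged.values())
```

Running the same input twice through a function like this yields the same cards, which is the property the fixture workflow relies on when it replays a fixture.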
python -m prlearn preflight
python -m prlearn preflight src/foo.py
python -m prlearn preflight --paths src/foo.py tests/test_foo.py
python -m prlearn preflight --task "add an API webhook handler"
python -m prlearn preflight --verbose
python -m prlearn preflight --json
Default output is intentionally short: title, why it matters, why it was selected, and a compact checklist. --verbose adds redacted evidence snippets and PR references. --json returns the detected git context and ranked lessons for editor/agent integrations.
- heuristic is the default and preserves the original deterministic fixture workflow.
- hybrid attempts Ollama candidate extraction first, then falls back to the heuristic path if Ollama is unavailable.
- ollama is strict: missing Ollama or a missing configured model returns a clear non-zero error.
- codex is strict: it invokes codex exec non-interactively through the local Codex CLI login and stores the output as reviewable candidates.
- openai is strict: it calls the OpenAI Responses API with structured JSON output and stores the result as reviewable candidates.
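The difference between the hybrid fallback and the strict engines can be sketched as follows. ProviderUnavailable, extract_hybrid, and extract_strict are hypothetical names used only to illustrate the control flow; they are not prlearn's internal API.

```python
from typing import Callable

Extractor = Callable[[dict], list[dict]]


class ProviderUnavailable(Exception):
    """Hypothetical error raised when a model provider cannot be reached."""


def extract_hybrid(
    pr: dict, model_extract: Extractor, heuristic_extract: Extractor
) -> tuple[str, list[dict]]:
    """Try the model path first; fall back to the heuristic if it is unavailable."""
    try:
        return "ollama", model_extract(pr)
    except ProviderUnavailable:
        return "heuristic", heuristic_extract(pr)


def extract_strict(pr: dict, model_extract: Extractor) -> list[dict]:
    """Strict engines let the error propagate so the caller can exit non-zero."""
    return model_extract(pr)
```

The design choice is that hybrid favors always producing output, while the strict engines favor making a missing provider visible.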
By default, Ollama extraction only runs when prlearn finds primary learning evidence such as review feedback, failed checks, high-signal PR context, or check annotations. If your PR history has sparse review feedback and you want reviewable model suggestions from PR bodies, patches, and check context, opt in explicitly:
python -m prlearn extract --engine hybrid --allow-context-only
python -m prlearn review --candidates --limit 10
Context-only candidates are still pending review; they are not exported until accepted.
Use insights --actionable after sync/extract/review to see what should affect behavior now:
python -m prlearn insights --actionable
python -m prlearn insights --actionable --json
The actionable report groups output into:
- do_before_next_pr: accepted rules that should influence preflight and the next PR.
- recurring_risks: patterns seen more than once, promoted even if the individual event looked minor.
- needs_human_review: pending cards or candidates that need accept/reject/merge decisions.
- low_value: generic, weak-evidence, rejected, or stale items to clean up.
Telegram ratings and review actions feed back into this ranking. A major rating increases actionability, a minor rating keeps the item useful but lower priority, and a not-important rating pushes the item toward low-value cleanup.
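One way to picture the bucket routing is a small decision function. The item fields (status, rating, recurrence, generic, weak_evidence, stale) and the precedence between buckets are assumptions made for this sketch; the real report may weigh them differently or place an item in more than one bucket.

```python
def bucket_for(item: dict) -> str:
    """Route one learning item into an actionable bucket (illustrative rules only)."""
    if item.get("rating") == "not_important" or item.get("status") == "rejected":
        return "low_value"
    if item.get("generic") or item.get("weak_evidence") or item.get("stale"):
        return "low_value"
    if item.get("status") == "pending":
        return "needs_human_review"
    if item.get("recurrence", 1) > 1:
        # Seen more than once, so it is promoted even if each event looked minor.
        return "recurring_risks"
    return "do_before_next_pr"
```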
Ollama settings live in ~/.prlearn/config.json under the ollama key. Defaults use http://127.0.0.1:11434, require localhost, use qwen3.5:9b as the extractor model, and set deterministic model options. With require_localhost=true, only localhost, 127.0.0.1, and ::1 are accepted. All-interface or remote URLs such as 0.0.0.0 require explicitly setting require_localhost=false. doctor reports installed/missing models and suggests commands such as ollama pull qwen3.5:9b.
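The require_localhost rule can be sketched with the standard library URL parser. The function name is_allowed_ollama_url is hypothetical; only the three accepted hosts come from the description above.

```python
from urllib.parse import urlparse

# Hosts accepted when require_localhost is true, as listed above.
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def is_allowed_ollama_url(url: str, require_localhost: bool = True) -> bool:
    """Return True when the URL's host passes the localhost policy."""
    host = urlparse(url).hostname  # lowercased, with IPv6 brackets removed
    if host is None:
        return False
    if not require_localhost:
        return True
    return host in LOCAL_HOSTS
```

With the default policy, http://127.0.0.1:11434 and http://[::1]:11434 pass, while http://0.0.0.0:11434 is rejected unless require_localhost is set to false.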
Ollama output is stored as reviewable learning_candidates, not direct accepted cards. Exports include accepted learning_cards by default; pending or rejected candidates do not appear in rules.md.
Ollama is optional. heuristic works without Ollama. hybrid falls back to heuristic if local Ollama is unavailable. ollama is strict and fails clearly if the local model is missing.
For unattended daily sync, prefer GitHub App installation auth over gh auth. A GitHub App can be installed on selected repositories, uses fine-grained permissions, and lets prlearn mint short-lived installation tokens on demand.
Create a GitHub App in GitHub with these repository permissions:
- Metadata: read
- Pull requests: read
- Issues: read
- Contents: read
- Checks: read
After creating the app, install it on the repositories you want prlearn to learn from and generate a private key. The app ID is shown on the app's settings page. The installation ID is the numeric ID in the installation settings URL after you install the app.
Store the private key outside the repo, for example:
mkdir -p ~/.prlearn/keys
mv ~/Downloads/*.private-key.pem ~/.prlearn/keys/prlearn-github-app.pem
chmod 600 ~/.prlearn/keys/prlearn-github-app.pem
Configure the daily environment:
export PRLEARN_GITHUB_APP_ID="123456"
export PRLEARN_GITHUB_INSTALLATION_ID="98765432"
export PRLEARN_GITHUB_APP_PRIVATE_KEY_FILE="$HOME/.prlearn/keys/prlearn-github-app.pem"
export PRLEARN_GITHUB_AUTHOR="your-github-login"
PRLEARN_GITHUB_AUTHOR is required when commands use the default --author @me, because installation tokens are repository-scoped and do not represent a specific user. You can also pass --author your-github-login.
Verify the connection:
python -m prlearn doctor --json
python -m prlearn sync --github-auth app --author your-github-login --limit 5 --json
--github-auth auto is the default: it uses GitHub App auth when the required app settings are present, otherwise it falls back to gh. Use --github-auth app in daily automation if you want failures to be explicit instead of silently falling back.
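The auto-versus-app behavior can be sketched as a small resolver. This is a sketch under stated assumptions: resolve_github_auth is a hypothetical helper, the "gh" mode value is assumed rather than documented above, and treating exactly these three environment variables as the "required app settings" is also an assumption.

```python
# Assumed to be the settings that make GitHub App auth usable.
APP_ENV_VARS = (
    "PRLEARN_GITHUB_APP_ID",
    "PRLEARN_GITHUB_INSTALLATION_ID",
    "PRLEARN_GITHUB_APP_PRIVATE_KEY_FILE",
)


def resolve_github_auth(mode: str, env: dict[str, str]) -> str:
    """Return "app" or "gh"; raise when app auth is forced but not configured."""
    missing = [name for name in APP_ENV_VARS if not env.get(name)]
    if mode == "app":
        if missing:
            raise RuntimeError("missing GitHub App settings: " + ", ".join(missing))
        return "app"
    if mode == "gh":  # assumed mode value
        return "gh"
    # mode == "auto": prefer the app when fully configured, else fall back to gh.
    return "gh" if missing else "app"
```

Forcing app mode turns a missing setting into an explicit error, which is why it suits unattended jobs better than a silent fallback.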
Use seed for an initial account backfill or an occasional deeper refresh. It first inventories repositories available to the configured GitHub identity, skips archived/fork/inactive repos by default, scores the remaining repos by recent activity, then backfills PR evidence before using commit messages as a secondary signal. Commit-derived items are stored as pending learning_candidates with evidence, so they are reviewable before they become exported rules.
Start with a bounded fixture smoke:
python -m prlearn seed --fixture tests/fixtures/github_seed.json --engine heuristic --json
python -m prlearn review --candidates --json
For a live GitHub App backfill:
python -m prlearn seed \
--github-auth app \
--author your-github-login \
--lookback-days 180 \
--max-repos 25 \
--max-prs 500 \
--max-commits-per-repo 50 \
--engine hybrid \
--allow-context-only \
--json
The caps are safety bounds for one run, not daily learning limits. For a busy day, daily automation should use prlearn daily without a low --limit; seed can be rerun later with a wider --lookback-days, --include-inactive, --include-forks, or specific --repo owner/name filters when you want older history.
For model-provider backfills, --max-model-prs limits only the number of PRs sent to the model in that run. It does not limit GitHub sync, repository inventory, commit seeding, or future daily runs.
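The effect of --max-model-prs can be pictured as splitting the PRs awaiting extraction into a batch sent now and a backlog left for later runs. select_for_model and the PR dictionaries are hypothetical; the sketch only illustrates that the cap applies to model calls, not to what sync stores.

```python
from typing import Optional


def select_for_model(
    dirty_prs: list[dict], max_model_prs: Optional[int]
) -> tuple[list[dict], list[dict]]:
    """Split PRs awaiting extraction into (send_now, backlog)."""
    if max_model_prs is None:
        # No cap configured: everything goes to the model in this run.
        return list(dirty_prs), []
    return dirty_prs[:max_model_prs], dirty_prs[max_model_prs:]
```

The backlog is not lost in this picture: every PR is still stored by sync, and a later run with a larger cap can pick up the remainder.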
Codex is the hosted no-API-key provider path. It uses the local Codex CLI login, so codex login can authenticate through the ChatGPT/Codex browser OAuth flow and cache the refreshed credentials locally. No OPENAI_API_KEY is required for the codex engine.
Codex settings live in ~/.prlearn/config.json under the codex key. Defaults use the codex command, the user's existing Codex CLI login, gpt-5.5, and low reasoning effort. Set PRLEARN_CODEX_MODEL or codex.model to change the model, and set PRLEARN_CODEX_REASONING_EFFORT or codex.reasoning_effort to change reasoning. Codex requests use the same JSON schema and reviewable candidate storage as Ollama. Unlike Ollama, Codex prompts keep repo, PR, path, URL, and actor metadata readable by default because the provider already has code-writing context; credential-shaped secrets are still scrubbed before requests and storage.
To compare the default 5 fixture PRs across deterministic heuristics, local Ollama, and Codex:
python -m prlearn compare --engines heuristic,ollama,codex --json
The comparison command runs each engine in an isolated home and reports cards, candidates, counts, and provider errors independently. Use --keep-home if you want to inspect each provider's SQLite database after the run.
To compare live account output through the no-key Codex OAuth path:
python -m prlearn compare-live \
--github-auth app \
--author your-github-login \
--lookback-days 14 \
--max-repos 3 \
--max-prs 50 \
--max-model-prs 5 \
--engines heuristic,ollama,codex \
--allow-context-only \
--keep-home \
--json
The direct openai engine is optional and only for environments that already have an OpenAI API key. If you want the website OAuth flow instead, use the codex engine above.
OpenAI settings live in ~/.prlearn/config.json under the openai key. Defaults use OPENAI_API_KEY, https://api.openai.com/v1, gpt-5.5, low reasoning effort, and 2048 max output tokens. The openai engine uses the same prompt, JSON schema, candidate validation, and per-PR candidate cache as Ollama and Codex.
export OPENAI_API_KEY="..."
python -m prlearn doctor --json
python -m prlearn extract --engine openai --max-model-prs 5 --json
python -m prlearn review --candidates --json
To compare the direct OpenAI API provider without mixing provider databases:
python -m prlearn compare-live \
--github-auth app \
--author your-github-login \
--lookback-days 14 \
--max-repos 3 \
--max-prs 50 \
--max-model-prs 5 \
--engines heuristic,ollama,openai \
--allow-context-only \
--keep-home \
--json
For a daily unattended learning job, use GitHub App auth and avoid a low PR sync limit. A practical default is:
python -m prlearn daily \
--github-auth app \
--author your-github-login \
--lookback-days 3 \
--engine hybrid \
--json
If you have an OpenAI API key and want direct API provider coverage in the daily job, use --engine openai --max-model-prs N where N is your daily budget for model-reviewed PRs. The sync still records every changed PR, and later runs can process the backlog with a wider --max-model-prs or a targeted --repo owner/name backfill.
If you do not have an OpenAI API key, use the Codex OAuth path instead:
codex login
python -m prlearn daily \
--github-auth app \
--author your-github-login \
--lookback-days 3 \
--engine codex \
--max-model-prs 5 \
--json
The sync still records every changed PR; --max-model-prs only controls how many PRs are sent to Codex in that run.
Telegram review is optional and uses the Bot API through local CLI commands. It does not store the bot token in SQLite. Set credentials in your shell:
export PRLEARN_TELEGRAM_BOT_TOKEN="..."
export PRLEARN_TELEGRAM_CHAT_ID="..."
python -m prlearn telegram status
Create the bot with BotFather and send the bot one message from the target chat. If you do not know the chat ID yet, run:
export PRLEARN_TELEGRAM_BOT_TOKEN="..."
python -m prlearn telegram chats
Use the printed chat ID as PRLEARN_TELEGRAM_CHAT_ID. Keep both values out of committed files.
Send pending candidates with inline rating buttons:
python -m prlearn telegram send --limit 10 --ready
After tapping a button in Telegram, poll once to apply the rating:
python -m prlearn telegram poll --timeout 10
Reviewed Telegram messages are deleted after the rating is recorded, so completed reviews do not build up in the chat. Use --keep-reviewed if you want to keep those messages for debugging.
Ratings map to product actions:
- Major learning: accepts the candidate and marks the resulting card as high severity.
- Minor learning: accepts the candidate and marks the resulting card as low severity.
- Not important: rejects the candidate with a Telegram rating reason.
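The mapping above can be written as a small lookup table. The rating keys match the values accepted by telegram rate; the candidate dictionary, the apply_rating helper, and the exact reason string are hypothetical and serve only to illustrate the mapping.

```python
# rating -> (action, severity of the resulting card)
RATING_ACTIONS = {
    "major": ("accept", "high"),
    "minor": ("accept", "low"),
    "not_important": ("reject", None),
}


def apply_rating(candidate: dict, rating: str) -> dict:
    """Return a copy of the candidate with the rating's action applied."""
    if rating not in RATING_ACTIONS:
        raise ValueError(f"unknown rating: {rating}")
    action, severity = RATING_ACTIONS[rating]
    updated = dict(candidate)
    if action == "accept":
        updated["status"] = "accepted"
        updated["severity"] = severity
    else:
        updated["status"] = "rejected"
        updated["reason"] = "telegram rating: not important"  # illustrative wording
    return updated
```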
For local testing or manual rating without Telegram network calls:
python -m prlearn telegram rate <candidate_id> major
python -m prlearn telegram rate <candidate_id> minor
python -m prlearn telegram rate <candidate_id> not_important
By default, prlearn keeps normalized learning data readable and stores GitHub raw JSON payloads as plaintext so fixture development is simple. To hide raw PR payloads, patches, review payloads, and check payloads at rest, set a passphrase and encrypt the raw columns:
export PRLEARN_PASSPHRASE="use-a-long-local-passphrase"
python -m prlearn privacy encrypt-raw
python -m prlearn privacy status
After encrypt-raw, future sync and extraction runs keep prs.raw_json, events.raw_json, pr_files.raw_json, and check_runs.raw_json encrypted. Commands that need those raw payloads require the same PRLEARN_PASSPHRASE, or --prompt-passphrase.
You usually do not need to decrypt the database to research old PRs; keep it encrypted and unlock per command with the passphrase. privacy decrypt-raw is available as a recovery/export escape hatch and turns raw payload storage back to plaintext in config.json.
This is field-level encryption for raw GitHub payloads, not full SQLite database encryption. Learning cards, normalized columns, reports, and exports remain readable unless a later SQLCipher-style full database lock is added.
Exports, reports, prompts, and preflight output use normalized/redacted learning evidence. They do not read or print encrypted raw payload blobs.
- python: command not found: create a venv with python3 -m venv .venv, activate it, then use python -m ... inside the venv.
- ModuleNotFoundError: cryptography: install the package with python -m pip install -e . or install test dependencies with python -m pip install -e ".[test]".
- No module named pytest: install the test extra with python -m pip install -e ".[test]".
- git missing: install Git and rerun python -m prlearn doctor.
- GitHub App unavailable: check PRLEARN_GITHUB_APP_ID, PRLEARN_GITHUB_INSTALLATION_ID, PRLEARN_GITHUB_APP_PRIVATE_KEY_FILE, and PRLEARN_GITHUB_AUTHOR, then run python -m prlearn doctor --json.
- gh missing or unauthenticated: configure a GitHub App for stable sync, or install GitHub CLI and run gh auth login. Fixture sync and heuristic extraction still work without GitHub auth.
- Fixture path errors: run commands from the repository root or pass an absolute fixture path.
- Ollama unavailable or model missing: use --engine heuristic for the deterministic local path, or install Ollama and pull the models suggested by python -m prlearn doctor --json.
- Codex unavailable or unauthenticated: install the Codex CLI, run codex login, then rerun python -m prlearn doctor --json.
- OpenAI unavailable or unauthenticated: set the environment variable named by openai.api_key_env in ~/.prlearn/config.json and rerun python -m prlearn doctor --json.
- Telegram bot errors: confirm PRLEARN_TELEGRAM_BOT_TOKEN and PRLEARN_TELEGRAM_CHAT_ID are set, that you have messaged the bot at least once, and that you run telegram poll after tapping rating buttons.
- Missing PRLEARN_PASSPHRASE: set it before encrypted raw-payload operations, or pass --prompt-passphrase on privacy commands that support prompting.
- Stale daily lock: daily removes stale locks when the recorded process is gone. If a lock remains after a crash, confirm no prlearn daily process is running before deleting ~/.prlearn/prlearn.lock.
- Sharing support output: redact doctor --json, live list --json, preflight, reports, and exports before posting them publicly. They can include repository names, branch names, PR URLs, paths, actor names, or local environment details.
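For the stale-lock item above, a typical way to tell whether a recorded process is gone is to probe its PID. This is a POSIX-only sketch: it assumes the lock file body is just the owning PID as text, which may not match prlearn's actual lock format, and pid_is_running and lock_is_stale are hypothetical helpers. Do not use os.kill this way on Windows, where it terminates the target process.

```python
import os


def pid_is_running(pid: int) -> bool:
    """POSIX-only: signal 0 checks that the process exists without signaling it."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def lock_is_stale(lock_text: str) -> bool:
    """Assumes the lock file contains only the owning PID (hypothetical format)."""
    try:
        pid = int(lock_text.strip())
    except ValueError:
        return True  # unreadable contents are treated as stale in this sketch
    if pid <= 0:
        return True  # never probe process groups or "all processes"
    return not pid_is_running(pid)
```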
The fixture acceptance loop is:
TMP_HOME="$(mktemp -d)"
python -m prlearn daily --home "$TMP_HOME" --fixture tests/fixtures/github_small.json --json
python -m prlearn daily --home "$TMP_HOME" --fixture tests/fixtures/github_small.json --json
python -m prlearn daily --home "$TMP_HOME" --fixture tests/fixtures/github_incremental.json --json
python -m prlearn list --home "$TMP_HOME" --json
The second run must not create duplicate events or learning cards. The incremental run should increase the recurrence/evidence count for the existing null/empty-state testing lesson.
You can run the same deterministic quality gate directly:
python -m prlearn eval \
--fixture tests/fixtures/github_small.json \
--incremental-fixture tests/fixtures/github_incremental.json \
--json
prlearn eval runs the fixture loop in an isolated home, verifies expected card count, idempotency, recurrence behavior, expected lesson titles, and export generation. CI runs this command after the test suite.
Run the full local validation suite:
python -m pytest -q
python -m prlearn eval \
--fixture tests/fixtures/github_small.json \
--incremental-fixture tests/fixtures/github_incremental.json \
--json