RepoMind AI

Offline AI-Powered Repository Intelligence Platform

RepoMind AI turns a GitHub URL, ZIP, or local repository into an evidence-backed intelligence workspace: architecture maps, dependency views, security findings, CTO/recruiter reports, technical debt summaries, and cited repository chat.

It runs locally. The only generation model is:

${FORGE_MODELS}/qwen-judge

No model routing. No cloud LLM fallback. No mock answers.

What It Does

RepoMind AI analyzes real repositories and keeps the results (metadata, vector indexes, and reports) after the source checkout is deleted.

Ingest a GitHub repository, ZIP archive, or local folder.
Parse Python, JavaScript, TypeScript, JSX, and TSX with AST extraction.
Build repository summaries, route maps, dependency graphs, security findings, and score evidence.
Index code with BAAI/bge-small-en-v1.5 embeddings in ChromaDB.
Answer repository questions with retrieval, reranking, citations, Mermaid diagrams, and qwen-judge inference.
Generate CTO, recruiter, security, technical debt, roadmap, and project status reports.
Persist metadata, vector indexes, and reports.
Delete cloned repository contents after analysis to avoid repository accumulation.

Architecture

Executive Architecture

flowchart LR
  Frontend["Frontend Workspace"] --> Backend["FastAPI Backend"]
  Backend --> Analysis["Analysis Engine"]
  Analysis --> Vector["Chroma Vector Store"]
  Analysis --> LLM["qwen-judge Local LLM"]
  Backend --> Reports["Report Artifacts"]
  Vector --> Chat["Cited Repository Chat"]
  LLM --> Reports
  LLM --> Chat

Service Architecture

flowchart LR
  Ingestion["Repository Ingestion Service"] --> AST["AST Analysis Service"]
  AST --> Dependency["Dependency Engine"]
  AST --> Security["Security Engine"]
  AST --> RAG["RAG Engine"]
  Dependency --> Report["Report Engine"]
  Security --> Report
  RAG --> Report
  RAG --> Chat["Repository Answer Engine"]

Analysis Pipeline

flowchart TD
  Source["GitHub / ZIP / Local Path"] --> Filter["Ignore generated folders"]
  Filter --> Parse["Tree-sitter + Python AST parsing"]
  Parse --> Extract["Functions, classes, methods, imports, routes, models"]
  Extract --> Graph["Dependency graph"]
  Extract --> Security["Bandit + Semgrep + custom rules"]
  Graph --> Scores["Evidence-backed scores"]
  Security --> Scores
  Scores --> Reports["Markdown reports"]

RAG Pipeline

flowchart LR
  Question["Repository question"] --> Embed["BGE-small embedding"]
  Embed --> Chroma["ChromaDB search"]
  Chroma --> Rerank["Lexical + path-aware reranking"]
  Rerank --> Evidence["Cited evidence chunks"]
  Evidence --> Qwen["qwen-judge"]
  Qwen --> Answer["Structured answer + diagram + risks"]

Repository Lifecycle

flowchart LR
  Clone["Clone/import repository"] --> Analyze["Analyze + index"]
  Analyze --> Persist["Persist metadata, vectors, reports"]
  Persist --> Delete["Delete repository contents"]
  Delete --> Retain["Retain reports and indexes"]
  Retain --> Cleanup["Scheduled retention cleanup"]

Architecture Experience

The Architecture tab is the product's primary view. It offers four levels of detail:

Executive Architecture: business-system view with no files.
Service Architecture: ingestion, AST analysis, dependency, security, RAG, and reporting services.
Module Architecture: collapsed module groups with optional expansion.
Implementation Architecture: files, routes, symbols, and imports only when debugging.

The graph uses React Flow, ELK auto-layout, Dagre fallback, animated edges, minimap, zoom, search, focus, fullscreen, service icons, and hover impact cards.

Dependency Explorer

The dependency view groups code by architectural layer:

Frontend
API
Business Logic
Analysis
RAG
Storage
LLM

This avoids file-level hairballs while still exposing critical paths and module details.
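
One way to picture the layer grouping is to collapse file-to-file imports into weighted layer-to-layer edges. The sketch below assumes a simple path-prefix classifier; the prefixes under `backend/repomind/`, the fallback to "Business Logic", and the function names are hypothetical and may not match how the project actually assigns layers.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Hypothetical path-prefix rules; the real project may classify modules differently.
LAYER_RULES: list[tuple[str, str]] = [
    ("frontend/", "Frontend"),
    ("backend/repomind/api/", "API"),
    ("backend/repomind/analysis/", "Analysis"),
    ("backend/repomind/rag/", "RAG"),
    ("backend/repomind/storage/", "Storage"),
    ("backend/repomind/llm/", "LLM"),
]
DEFAULT_LAYER = "Business Logic"  # assumed fallback for unmatched backend code


def layer_for(path: str) -> str:
    """Map a repository-relative file path to an architectural layer."""
    normalized = PurePosixPath(path).as_posix()
    for prefix, layer in LAYER_RULES:
        if normalized.startswith(prefix):
            return layer
    return DEFAULT_LAYER


def layer_edges(file_edges: list[tuple[str, str]]) -> dict[tuple[str, str], int]:
    """Collapse file-level import edges into weighted cross-layer edges."""
    weights: dict[tuple[str, str], int] = defaultdict(int)
    for importer, imported in file_edges:
        src, dst = layer_for(importer), layer_for(imported)
        if src != dst:  # imports inside one layer are hidden at this level
            weights[(src, dst)] += 1
    return dict(weights)
```

With this shape, four file-level imports between the API and RAG packages would render as one edge with weight 4 rather than four separate lines.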

Repository Chat

Example question:

How does authentication work?

RepoMind AI answers with:

Direct answer
Architecture impact
Critical files
Rendered Mermaid diagram
Risks
Improvements
Citations

If the repository contains no evidence for an answer, the product says so. For example, when RepoMind AI analyzes its own repository, it correctly reports that authentication is not implemented.
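
The answer sections and the "no evidence" rule could be modeled roughly as follows. This is a sketch under assumptions: the class names, field names, and the `NO_EVIDENCE` wording are hypothetical, chosen only to mirror the sections listed above.

```python
from dataclasses import dataclass, field

# Hypothetical wording; the product's actual message may differ.
NO_EVIDENCE = "No supporting evidence was found in the indexed repository."


@dataclass(frozen=True)
class Citation:
    path: str
    start_line: int
    end_line: int


@dataclass
class RepositoryAnswer:
    direct_answer: str
    architecture_impact: str = ""
    critical_files: list[str] = field(default_factory=list)
    mermaid_diagram: str = ""
    risks: list[str] = field(default_factory=list)
    improvements: list[str] = field(default_factory=list)
    citations: list[Citation] = field(default_factory=list)


def enforce_evidence(draft: RepositoryAnswer) -> RepositoryAnswer:
    """Return the draft only if it cites evidence; otherwise report that none exists."""
    if draft.citations:
        return draft
    # An uncited draft is replaced wholesale rather than shown with a warning.
    return RepositoryAnswer(direct_answer=NO_EVIDENCE)
```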

Security View

Security analysis combines:

Bandit findings
Semgrep findings
Custom repository rules
Evidence-backed scoring
Positive and negative score contributors

Why It Is Different

| Tool | Strength | RepoMind AI difference |
| --- | --- | --- |
| Sourcegraph | Enterprise-scale code search | RepoMind AI focuses on local repository intelligence, generated reviews, diagrams, and offline model inference. |
| Cursor | IDE-native code assistance | RepoMind AI is a repository-level audit and showcase surface, not an editor autocomplete loop. |
| GitHub code search | Fast symbol/text lookup | RepoMind AI adds AST extraction, vector retrieval, reports, security scoring, architecture maps, and cited answers. |

Features

Local-first qwen-judge inference.
Real BGE embeddings and ChromaDB vector search.
AST parsing for Python, TypeScript, JavaScript, TSX, and JSX.
Architecture diagrams at executive, service, module, and implementation levels.
Layered dependency graph with search, minimap, zoom, and focus.
Cited repository chat with answer-quality guardrails.
Security audit with Bandit, Semgrep, and custom rules.
Evidence-backed security, production, maintainability, CTO, and recruiter scores.
Generated CTO review, recruiter review, roadmap, security report, technical debt report, and project status.
Post-analysis repository deletion with retained metadata, vectors, and generated reports.

Example Questions

How does authentication work?
Where are API routes defined?
What services talk to the database?
How is repository cleanup handled?
What would prevent this project from production deployment?
Which files are most important for the RAG pipeline?

Technical Deep Dive

Parsing

RepoMind AI extracts implementation facts instead of relying on regex-only source scans. It captures imports, exports, classes, functions, methods, routes, environment variables, and database models, then turns those facts into dependency and architecture evidence.

Retrieval

Repository chunks are embedded with BAAI/bge-small-en-v1.5 and stored in ChromaDB. Retrieval combines vector search, lexical reranking, topic-specific boosts, pinned evidence paths, and citation metadata.
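
The reranking step can be illustrated without any vector store. The sketch below assumes each retrieved chunk already carries a similarity score where higher is better; the `Chunk` type, the token overlap measure, and both weights are hypothetical choices for illustration, not the project's tuned values.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    path: str
    text: str
    vector_score: float  # assumed similarity from the vector search, higher is better


def _tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def rerank(
    question: str,
    chunks: list[Chunk],
    lexical_weight: float = 0.3,  # hypothetical weight
    path_weight: float = 0.2,     # hypothetical weight
) -> list[Chunk]:
    """Reorder vector hits using lexical overlap with the chunk text and its path."""
    query = _tokens(question)

    def score(chunk: Chunk) -> float:
        if not query:
            return chunk.vector_score
        # Fraction of question terms that appear in the chunk body.
        lexical = len(query & _tokens(chunk.text)) / len(query)
        # Fraction of question terms that appear in the file path.
        path_hit = len(query & _tokens(chunk.path)) / len(query)
        return chunk.vector_score + lexical_weight * lexical + path_weight * path_hit

    return sorted(chunks, key=score, reverse=True)
```

The path term is what makes the reranker "path-aware": a question about auth routes gets a boost for a file such as `backend/auth/routes.py` even when its embedding score is slightly lower than that of a generic documentation chunk.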

Architecture Extraction

The UI intentionally separates abstraction levels:

executive and service views show systems and services;
module view groups code ownership areas;
implementation view exposes concrete files and symbols.

This prevents the common failure mode where architecture diagrams become unreadable file graphs.

Security Analysis

Findings from Bandit, Semgrep, and custom rules are normalized into one security score. Scores include positive contributors, negative contributors, and a calculation explanation.
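
A score with contributors and an explanation might be assembled along these lines. Everything numeric here is an assumption made for illustration: the base score, the per-severity penalties, and the positive point values are hypothetical, and the project's actual formula is not documented in this README.

```python
from dataclasses import dataclass

# Hypothetical values, not the project's real weights.
BASE_SCORE = 70
SEVERITY_PENALTY = {"low": 2, "medium": 5, "high": 15}


@dataclass(frozen=True)
class Finding:
    tool: str      # "bandit", "semgrep", or "custom"
    rule_id: str
    severity: str  # "low", "medium", or "high"


def security_score(findings: list[Finding], positives: dict[str, int]) -> dict:
    """Combine normalized findings and positive signals into one explained score."""
    negative = [
        (f"{f.tool}:{f.rule_id}", -SEVERITY_PENALTY[f.severity]) for f in findings
    ]
    positive = sorted(positives.items())

    pos_total = sum(points for _, points in positive)
    neg_total = -sum(points for _, points in negative)
    raw = BASE_SCORE + pos_total - neg_total
    score = max(0, min(100, raw))  # keep the result on a 0 to 100 scale

    return {
        "score": score,
        "positive": positive,
        "negative": negative,
        "explanation": f"{BASE_SCORE} + {pos_total} - {neg_total} = {raw}, clamped to {score}",
    }
```

Returning the contributors alongside the number is the point of the design: a reader can trace each deduction back to a specific tool and rule.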

Benchmarks

Benchmarks were run on real repositories with ingestion, analysis, BGE embeddings, ChromaDB indexing, qwen-judge report generation, qwen-judge explainers, and cleanup verification.

| Repository | Analysis (s) | Indexing (s) | Report generation (s) | Files | Indexed chunks | Retrieval quality | Cleanup |
| --- | --- | --- | --- | --- | --- | --- | --- |
| FastAPI | 214.669 | 34.913 | 75.228 | 2,748 | 10,862 | auth/routing/db strong | passed |
| Flask | 71.975 | 1.940 | 67.421 | 231 | 857 | auth/db strong, routing partial | passed |
| Next.js | 200.799 | 92.848 | 65.757 | 25,024 | 50,996 | auth/routing/db strong | passed |
| RepoMindAI | 84.715 | 9.770 | 73.348 | 66 | 220 | auth/routing/db partial | passed |

Full details: BENCHMARK_RESULTS.md.

Screenshots

| View | Screenshot |
| --- | --- |
| Dashboard | `screenshots/dashboard-overview.png` |
| Architecture | `screenshots/architecture-view.png` |
| Dependencies | `screenshots/dependency-view.png` |
| Security | `screenshots/security-view.png` |
| Repository Chat | `screenshots/repository-chat.png` |

Installation

Requirements

Python 3.11+
Node.js 18+
Local model at ${FORGE_MODELS}/qwen-judge
CUDA-capable GPU recommended for qwen-judge

Backend

cd ${PROJECT_ROOT}
make setup
PYTHONPATH=backend .venv/bin/uvicorn repomind.main:app --host 0.0.0.0 --port 8000

Frontend

cd ${PROJECT_ROOT}/frontend
npm install
npm run build
cp -R .next/static .next/standalone/.next/static
HOSTNAME=0.0.0.0 PORT=3000 node .next/standalone/server.js

Open:

http://localhost:3000

Configuration

Repository cleanup is controlled by environment settings:

AUTO_DELETE_AFTER_ANALYSIS=true
RETENTION_MINUTES=1440

Validation

PYTHONPATH=backend .venv/bin/ruff check backend tests scripts
PYTHONPATH=backend .venv/bin/pytest tests/api/test_api.py::test_health_endpoint tests/unit/test_utils_and_parsing.py -q
cd frontend && npm run build

Current Release Status

RepoMind AI is a local AI engineering showcase. It is not a fully managed SaaS product.

Known release gaps:

Next.js/PostCSS audit advisories remain until a safe framework upgrade.
qwen-judge report generation is accurate but slow (roughly 66 to 75 seconds per repository in the benchmarks above).
Browser test coverage should be expanded beyond screenshot generation.
Public setup still assumes the local Forge model path.

See GITHUB_RELEASE_CHECKLIST.md and PRODUCT_REVIEW.md.

Roadmap

Stream ingestion and report-generation progress.
Add Playwright regression tests for key UI surfaces.
Improve report post-processing for even tighter executive prose.
Add benchmark trend history.
Reduce large-repo indexing latency.
Package model-path configuration for easier public setup.

Contributing

Contributions should preserve the core constraint: repository intelligence must be evidence-backed and generated from the configured local model only.

Useful contribution areas:

UI polish and accessibility
Retrieval quality tests
Benchmark harness improvements
Security rule coverage
Documentation and examples

License

MIT
