A Multi-Agent System for Improving the Quality, Discoverability, and Credibility of AI/ML Repositories.
Publication Assistant for AI Projects is a multi-agent AI system that analyzes a GitHub repository and automatically generates publication improvements, including:
- A clearer, more engaging README
- Better project titles and metadata
- Discoverability improvements (tags, keywords)
- Structural and documentation recommendations
- Automated fact-checking of technical claims
The system is built on LangGraph orchestration, integrates multiple specialized agents, and uses tool-augmented reasoning rather than relying on LLM text generation alone. It was developed as part of the Mastering AI Agents program and demonstrates production-style agent collaboration.
The design and architecture cover three core AI-agent concepts: multi-agent collaboration, graph-based orchestration, and tool integration.
- Multiple agents with distinct responsibilities
- Clear handoff of state and artifacts between agents
- Coordinated execution through a shared orchestration graph
- Workflow implemented using LangGraph
- Deterministic execution order with shared state
- Modular, extensible pipeline design
- Each agent is tool-augmented
- Tools extend agent capabilities beyond text generation
- Graceful fallbacks when optional tools are unavailable
The system is composed of five core agents, each with a focused role:
| Agent | Responsibility |
|---|---|
| RepoAnalyzerAgent | Parses repository structure, README, and code statistics |
| MetadataRecommenderAgent | Suggests project titles, tags, and short descriptions |
| ContentImproverAgent | Rewrites and improves README using RAG + web examples |
| ReviewerCriticAgent | Scores documentation quality and flags issues |
| FactCheckerAgent | Verifies technical claims using arXiv |
All agents are coordinated using a LangGraph StateGraph, ensuring clean, reproducible execution.
graph TD
A["🔍 Repo Analysis"] --> B["🏷️ Metadata Recommendation"]
B --> C["✍️ Content Improvement (RAG + Web Search)"]
C --> D["🧐 Review & Critique"]
D --> E["📚 Fact Checking"]
E --> F["✅ Final Report"]
style A fill:#e1f5fe,stroke:#0288d1,stroke-width:2px;
style B fill:#fff3e0,stroke:#f57c00,stroke-width:2px;
style C fill:#e8f5e9,stroke:#388e3c,stroke-width:2px;
style D fill:#fce4ec,stroke:#c2185b,stroke-width:2px;
style E fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px;
style F fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px;
Each step enriches the shared state and passes structured outputs to the next agent.
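The sequential flow above can be sketched as follows. This is a minimal illustration, not the project's actual `orchestration/graph.py`: the state keys, node names, and placeholder agent bodies are all assumptions made for the example. It wires the five steps through a LangGraph `StateGraph` when LangGraph is installed, and otherwise runs the same steps in a plain loop so the sketch stays runnable.

```python
from typing import TypedDict


class PipelineState(TypedDict, total=False):
    # Hypothetical shared-state keys; the real project may use different ones.
    repo_url: str
    repo_summary: str
    tags: list[str]
    improved_readme: str
    review_score: float
    verified_claims: list[str]


# Placeholder agents: each returns only the keys it adds to the shared state.
def analyze_repo(state):
    return {"repo_summary": f"summary of {state['repo_url']}"}


def recommend_metadata(state):
    return {"tags": ["ai-agents", "langgraph"]}


def improve_content(state):
    return {"improved_readme": "# Project\n\n" + state["repo_summary"]}


def review(state):
    return {"review_score": 0.8}


def fact_check(state):
    return {"verified_claims": []}


# Node names are kept distinct from the state keys above.
STEPS = [
    ("repo_analysis", analyze_repo),
    ("metadata_recommendation", recommend_metadata),
    ("content_improvement", improve_content),
    ("review_critique", review),
    ("fact_checking", fact_check),
]


def run_pipeline(initial):
    """Run the five steps in order, each one enriching the shared state."""
    try:
        from langgraph.graph import StateGraph, START, END
    except ImportError:
        # Illustration-only fallback: apply the steps in a plain loop.
        state = dict(initial)
        for _name, step in STEPS:
            state.update(step(state))
        return state

    builder = StateGraph(PipelineState)
    for name, step in STEPS:
        builder.add_node(name, step)
    names = [name for name, _ in STEPS]
    builder.add_edge(START, names[0])
    for src, dst in zip(names, names[1:]):
        builder.add_edge(src, dst)
    builder.add_edge(names[-1], END)
    return builder.compile().invoke(initial)


if __name__ == "__main__":
    print(run_pipeline({"repo_url": "https://github.com/user/project"}))
```

Because each step returns only the keys it contributes, a new agent can be added by appending one entry to `STEPS` without touching the others.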
This project integrates six tools, one of them optional, including both built-in and custom implementations:
| Tool | Purpose |
|---|---|
| RepoParser | Reads local, ZIP, or remote GitHub repositories |
| KeywordExtractor (Gemini / Heuristic) | Extracts technical keywords |
| TavilySearchTool | Finds similar successful repositories |
| RAGRetriever (ChromaDB) | Retrieves best-practice documentation hints |
| ArxivScholarTool | Verifies scientific and technical claims |
| MCPBus (Optional) | Lightweight pub/sub communication layer |
All tools are optional-dependency-safe and fail gracefully.
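One way such graceful failure could look is sketched below, using the keyword extractor as the example. The function names, the stopword list, and the heuristic itself are assumptions made for illustration; the project's `tools/keyword_extractor.py` may work differently.

```python
import importlib
import re
from collections import Counter


def load_optional(module_name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


# Small illustrative stopword list (an assumption, not the project's list).
STOPWORDS = {"the", "and", "for", "with", "that", "this", "from", "are", "using"}


def heuristic_keywords(text, top_k=5):
    """Frequency-based fallback: the most common non-stopword tokens."""
    words = re.findall(r"[a-z][a-z0-9\-]{2,}", text.lower())
    counts = Counter(word for word in words if word not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_k)]


def extract_keywords(text, llm_extractor=None, top_k=5):
    """Prefer an LLM-backed extractor; fall back if it is absent or fails."""
    if llm_extractor is not None:
        try:
            return llm_extractor(text)[:top_k]
        except Exception:
            # Any failure of the optional tool degrades to the heuristic.
            pass
    return heuristic_keywords(text, top_k)
```

A caller would pass an LLM-backed callable only when `load_optional` finds the provider's package and an API key is configured; otherwise `extract_keywords` quietly uses the heuristic path.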
- 🔍 Automatic repository inspection (local, ZIP, or GitHub URL)
- ✍️ README rewriting using RAG + Web Search
- 🏷️ Intelligent metadata generation (titles, tags, descriptions)
- 📊 Documentation quality scoring
- 📚 Claim verification using academic sources
- 🧩 Modular and extensible agent design
- 🖥️ CLI and Gradio App support
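As an illustration of the claim-verification feature listed above, the sketch below queries the public arXiv API for papers related to the terms in a claim. It is an assumption about how such a lookup might be built, not the project's `ArxivScholarTool`; the function names are hypothetical, and finding related papers is only a first step toward verifying a claim.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

ARXIV_API = "http://export.arxiv.org/api/query"
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}


def build_query_url(claim_terms, max_results=3):
    """Build an arXiv API URL that searches all fields for every term."""
    query = " AND ".join(f'all:"{term}"' for term in claim_terms)
    params = {"search_query": query, "start": 0, "max_results": max_results}
    return f"{ARXIV_API}?{urllib.parse.urlencode(params)}"


def parse_entries(atom_xml):
    """Extract the title and identifier of each entry in an Atom feed."""
    root = ET.fromstring(atom_xml)
    results = []
    for entry in root.findall("atom:entry", ATOM_NS):
        title = entry.findtext("atom:title", default="", namespaces=ATOM_NS)
        entry_id = entry.findtext("atom:id", default="", namespaces=ATOM_NS)
        # Titles in the feed may be wrapped across lines; normalize whitespace.
        results.append({"title": " ".join(title.split()), "id": entry_id.strip()})
    return results


def search_arxiv(claim_terms, max_results=3, timeout=10.0):
    """Return related papers, or an empty list if the lookup fails."""
    url = build_query_url(claim_terms, max_results)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return parse_entries(response.read().decode("utf-8"))
    except (OSError, ET.ParseError):
        # Network or parsing problems degrade to "no evidence found".
        return []
```

Returning an empty list on failure keeps the fact-checking step from aborting the rest of the pipeline when the network is unavailable.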
Interactive Gradio UI (Screenshots):
Video Walkthrough: 🎥 Watch the Video Demo on YouTube/Loom
- Languages: Python 3.11+
- Orchestration / LLM Framework: LangGraph, LangChain
- LLM Providers: Groq (Llama-3, Mixtral), Google GenAI (Gemini)
- Web UI Framework: Gradio
- Vector Database (RAG): ChromaDB
- Web Search Integration: Tavily Python Client
- Scientific Verification: ArXiv API
Publication Assistant/
├── agents/
│ ├── __init__.py
│ ├── repo_analyzer.py
│ ├── metadata_recommender.py
│ ├── content_improver.py
│ ├── reviewer_critic.py
│ └── fact_checker.py
│
├── orchestration/
│ ├── __init__.py
│ └── graph.py
│
├── tools/
│ ├── __init__.py
│ ├── repo_parser.py
│ ├── web_search.py
│ ├── rag_retriever.py
│ ├── keyword_extractor.py
│ └── arxiv_scholar.py
│
├── venv
├── utils/
│ ├── __init__.py
│ ├── evaluation.py
│ ├── logging.py
│ └── mcp.py
│
├── .env
├── .env.example
├── .gitignore
├── app.py
├── Dockerfile
├── main.py
├── README.md
└── requirements.txt
Before you begin, make sure you have the following:
- ✅ Python 3.11+
- 🔑 Groq API Key (required)
- 🔑 Google API Key (optional)
- 🔑 Tavily API Key (optional, for web search capabilities)
git clone https://github.com/your-username/publication-assistant.git
cd publication-assistant
pip install -r requirements.txt
Create a .env file:
GOOGLE_API_KEY=your_google_api_key
GROQ_API_KEY=your_groq_api_key
TAVILY_API_KEY=your_tavily_api_key
(Only the Groq key is required; the system still runs when the optional keys are missing.)
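A startup check for these keys might look like the sketch below. It assumes the variables have already been loaded into the process environment (for example by a dotenv loader, which this README does not confirm). The function name and the feature descriptions are hypothetical.

```python
import os

# Required and optional keys, following the prerequisites listed earlier.
REQUIRED_KEYS = ("GROQ_API_KEY",)
OPTIONAL_KEYS = {
    "GOOGLE_API_KEY": "Gemini-backed features",
    "TAVILY_API_KEY": "web search",
}


def check_api_keys(env=os.environ):
    """Raise if a required key is missing; return the features to disable."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]
    if missing:
        raise RuntimeError(f"Missing required API keys: {', '.join(missing)}")
    return [
        feature
        for key, feature in OPTIONAL_KEYS.items()
        if not env.get(key, "").strip()
    ]


if __name__ == "__main__":
    # Example with an explicit mapping instead of the real environment.
    disabled = check_api_keys({"GROQ_API_KEY": "example-key"})
    print("Disabled optional features:", disabled)
```

Failing fast on the required key while merely reporting the optional ones matches the split between required and optional keys described above.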
Once the application is installed, you can use it via the interactive Gradio app or the command line.
The Gradio app provides the richest experience for exploring the generated documentation.
To start the server:
python app.py
How to use:
- Open your browser and navigate to http://localhost:7860.
- Project Setup: Paste a public GitHub repository URL into the input field and click "Validate".
- Configuration: On the left panel, select your preferred "Writing Style" (e.g., Technical Blog) and "AI Model".
- Generation: Click "Generate Article". The system will trigger the multi-agent pipeline and present the improved README, tags, and titles on the right.
You can directly parse local or remote repositories from your terminal for quick analysis.
Analyze a local repository:
python main.py --repo-path ./some_local_repo
Analyze a remote repository:
python main.py --repo-path https://github.com/user/project
The CLI will output a concise report in your terminal containing suggested titles, tags, review scores, and missing sections.
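The way a single `--repo-path` argument could cover local folders, ZIP archives, and GitHub URLs is sketched below. Only the `--repo-path` flag comes from the commands above; the `classify_source` helper and its rules are assumptions for illustration, not the project's `main.py`.

```python
import argparse
from urllib.parse import urlparse


def classify_source(repo_path):
    """Guess whether the path is a GitHub URL, a ZIP archive, or a local folder."""
    parsed = urlparse(repo_path)
    if parsed.scheme in ("http", "https") and parsed.netloc.lower() in (
        "github.com",
        "www.github.com",
    ):
        return "github"
    if repo_path.lower().endswith(".zip"):
        return "zip"
    return "local"


def build_parser():
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Analyze a repository and print a short report.",
    )
    parser.add_argument(
        "--repo-path",
        required=True,
        help="Local directory, ZIP archive, or GitHub repository URL.",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"Source type: {classify_source(args.repo_path)}")
```

Classifying the source up front lets the rest of the pipeline receive an already-resolved repository regardless of how it was supplied.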
- Separation of Concerns – each agent has a single responsibility
- Tool-Augmented Intelligence – agents do not rely on LLMs alone
- Fault Tolerance – optional tools fail gracefully
- Extensibility – new agents or tools can be added easily
- Formal evaluation metrics against baseline READMEs
- Multi-repo batch analysis
- GitHub Actions integration
- Automatic PR creation with improved README
- Support for MCP over network
Contributions are welcome! Please open an issue or submit a pull request with clear documentation.
Licensed under the MIT license.
- Ready Tensor – Agentic AI Developer Certification
- LangGraph Framework – Official Documentation
- LangChain – Building context-aware reasoning applications
- Gradio – The fastest way to build & share ML apps
- ChromaDB – Open-source AI-native embedding database
- Tavily Search – Search Engine Optimized for LLMs
- ArXiv API – Scholarly Research API



