henryli2002/Report-Summary
Report-By-Agent

Automated pipeline for generating academic-quality survey reports from structured PDF paper collections. Powered by MinerU (PDF parsing) and Gemini (content generation).

How It Works

PDF Papers          MinerU           LLM (x5 stages)              Final Report
(organized in   ──────────>  Markdown  ──────────────────>  output/<topic>/report_final.md
 folders)                    per paper   Outline
                                         Section Briefs
                                         Section Writing
                                         Assembly
                                         Review & Translate

Pipeline Stages

| Stage | What happens | Output |
| --- | --- | --- |
| 1. PDF Parsing | MinerU converts each PDF to markdown; already-parsed PDFs are skipped. | *.md alongside each PDF |
| 2. Outline | The LLM reads all abstracts plus your README.md and produces a structured outline with [AuthorYear] citation keys. | output/<topic>/outline.md |
| 3. Section Briefs | The LLM generates a tailored writing brief for each section, based on the outline and your README.md, using Structured Output (JSON). | output/<topic>/prompts/sections/*.txt |
| 4. Section Writing | The LLM writes each section from its auto-generated brief, seeing full paper text for the section's own papers and abstracts only for other sections' papers. | output/<topic>/sections/*.md |
| 5. Assembly | The LLM assembles all sections into one coherent report with a title, abstract, transitions, and a references list. | output/<topic>/report_draft.md |
| 6. Review & Translate | The LLM performs factual, citation, and grammar checks, then translates to FINAL_LANGUAGE if it is not English. | output/<topic>/report_final.md |

All intermediate stages use English. Only the final review stage translates to your configured language.
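The stage-4 context strategy described above can be sketched in Python. The function and field names here are illustrative, not the actual src/section_generator.py API:

```python
def build_section_context(target, all_sections):
    """Assemble the LLM context for one section: full markdown for the
    section's own papers, abstracts only for every other section's papers."""
    parts = []
    for section in all_sections:
        for paper in section["papers"]:
            if section["name"] == target["name"]:
                parts.append(f"## {paper['title']} (full text)\n{paper['markdown']}")
            else:
                parts.append(f"## {paper['title']} (abstract)\n{paper['abstract']}")
    return "\n\n".join(parts)
```

This keeps the prompt for each section focused while still letting the model cite neighboring sections' papers by abstract.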

Quick Start

Prerequisites

  • Python 3.12 (recommended)
  • uv
  • MinerU (mineru v3.0+)
  • A Google Gemini API key

Setup

# 1. Install uv (if not installed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# 2. Create virtual environment with Python 3.12
uv venv .venv --python 3.12
source .venv/bin/activate

# 3. Install dependencies
uv pip install -e "."

# 4. Configure environment
cp .env_example .env
# Edit .env: set GOOGLE_API_KEY and FINAL_LANGUAGE

# 5. Run
./run.sh topics/RAG

If you skip steps 2-3, run.sh auto-creates the venv and installs dependencies. It also auto-sanitizes PDF filenames, replacing spaces and hyphens with underscores.
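The filename sanitization can be sketched as follows. This is a Python approximation of what run.sh does; whether the script also collapses runs of separators into a single underscore, as this sketch does, is an assumption:

```python
import re

def sanitize_filename(name: str) -> str:
    """Replace spaces and hyphens in the stem with underscores,
    leaving the file extension untouched (approximates run.sh)."""
    stem, dot, ext = name.rpartition(".")
    if not dot:                      # no extension at all
        return re.sub(r"[\s-]+", "_", name)
    return re.sub(r"[\s-]+", "_", stem) + "." + ext
```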

CLI Options

./run.sh <topic_dir> [options]

Options:
  --output-dir DIR       Override output directory
  --model MODEL          Override Gemini model (default: gemini-2.5-pro)
  --thinking-budget N    Override thinking budget in tokens (default: 8192)
  --reset                Discard saved progress, start from scratch

Creating a New Topic

A topic is a folder under topics/ containing PDFs organized into sections:

topics/
  YourTopic/
    README.md                          # Your structural guidance (required)
    Section A/
      paper1.pdf
      paper2.pdf
    Section B/
      Subsection B1/
        paper3.pdf
      Subsection B2/
        paper4.pdf
    Section C/
      paper5.pdf

Folder Rules

  • Folder names = section headings in the generated report
  • Nesting = heading hierarchy (subfolder becomes a subsection)
  • PDF placement determines which papers belong to which section
  • The same PDF can appear in multiple sections if needed

The README.md File

This is the only file you write per topic. It tells the pipeline your intended structure and narrative focus. Example:

I. Introduction (origins and motivation)
II. Core Architecture & Bottlenecks
III. Evaluation (place benchmarks before advanced methods so readers have the measuring stick first)
IV. Advanced Methods: Comparative Analysis
V. Challenges, Security & Future Directions

You don't need to match folder names exactly. The LLM uses your README as narrative guidance and maps it to the actual folder structure. You can add notes about emphasis, ordering, what to compare, etc.

Resume & Fault Tolerance

The pipeline saves progress after every stage to output/<topic>/state.json via atomic writes to prevent corruption.

  • Re-run the same command to resume from the last checkpoint
  • PDF parsing is incremental: only new/unparsed PDFs are processed
  • LLM failures retry automatically with exponential backoff (3 attempts)
  • PDF parsing failures retry with exponential backoff (2 attempts)
  • Use --reset to force a complete re-run
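The atomic checkpoint write can be sketched as follows; this is the standard temp-file-plus-rename pattern, not necessarily the exact code in src/state.py:

```python
import json, os, tempfile

def save_state_atomic(state: dict, path: str) -> None:
    """Write state.json atomically: dump to a temp file in the same
    directory, flush to disk, then rename over the target. A crash
    mid-write leaves the old checkpoint intact."""
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f, indent=2)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)   # atomic rename on POSIX
    except BaseException:
        os.unlink(tmp)
        raise
```

The rename must stay on the same filesystem as the target, which is why the temp file is created in the same directory.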

Configuration

All options can be set in .env (see .env_example):

| Variable | Default | Description |
| --- | --- | --- |
| GOOGLE_API_KEY | (required) | Gemini API key |
| FINAL_LANGUAGE | English | Language for the final report; intermediate stages are always English. |
| MODEL_NAME | gemini-2.5-pro | Gemini model to use |
| THINKING_BUDGET | 8192 | Token budget for Gemini extended thinking |
| OUTPUT_DIR | output | Output base directory |
| PROMPTS_DIR | prompts | Reusable prompts directory |

CLI arguments override .env values.
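The precedence rule can be sketched as a small resolver. The function name and defaults table are illustrative, not the actual src/config.py API:

```python
import os

DEFAULTS = {
    "MODEL_NAME": "gemini-2.5-pro",
    "THINKING_BUDGET": "8192",   # kept as a string, like an env var
    "OUTPUT_DIR": "output",
    "FINAL_LANGUAGE": "English",
}

def resolve(key, cli_value=None):
    """Precedence: CLI argument > environment (.env) > built-in default."""
    if cli_value is not None:
        return cli_value
    return os.environ.get(key, DEFAULTS.get(key))
```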

Project Structure

Report-By-Agent/
├── run.sh                     # Entry point (auto-sanitizes PDF names)
├── .env_example               # Configuration template
├── pyproject.toml             # Python dependencies (mineru>=3.0.0)
├── scripts/                   # Utility scripts
│   └── sanitize_pdfs.sh       #   Cleans up PDF filenames
├── tests/                     # Unit tests
│   └── test_pdf_parser.py     #   Tests for robust PDF reference truncation
├── prompts/                   # Reusable prompts (shared across all topics)
│   ├── outline.txt            #   Outline generation
│   ├── gen_section_briefs.txt #   Meta-prompt for auto-generating section briefs
│   ├── section_default.txt    #   Section writing instructions
│   ├── assemble.txt           #   Report assembly
│   └── review.txt             #   Review + translation
├── src/                       # Pipeline source code
│   ├── main.py                #   Orchestrator
│   ├── config.py              #   Configuration loading
│   ├── state.py               #   Checkpoint/resume state (Atomic save)
│   ├── llm_client.py          #   Gemini API wrapper with retry & Structured Output
│   ├── pdf_parser.py          #   MinerU integration & robust reference truncation
│   ├── structure.py           #   Folder tree → section tree
│   ├── outline_generator.py   #   Stage 2
│   ├── prompt_generator.py    #   Stage 3 (Uses JSON response_schema)
│   ├── section_generator.py   #   Stage 4
│   ├── assembler.py           #   Stage 5
│   └── reviewer.py            #   Stage 6
├── topics/                    # Input: one folder per topic
│   └── RAG/
│       ├── README.md
│       ├── Introduction/
│       ├── Foundational RAG Architecture/
│       │   ├── Pre-Retrieval/
│       │   ├── Ranking & Hybrid Search/
│       │   └── Long Context/
│       ├── Evaluation Benchmarks & Metrics/
│       ├── Advanced Methodologies/
│       │   ├── Agentic RAG/
│       │   ├── Graph RAG/
│       │   └── Multimodel RAG/
│       ├── System Implementations & Domain Applications/
│       └── Challenges, Security & Future Work/
└── output/                    # Generated output
    └── RAG/
        ├── state.json
        ├── outline.md
        ├── prompts/sections/  # Auto-generated section briefs
        ├── sections/          # Individual section drafts
        ├── report_draft.md
        └── report_final.md    # The final report

Customizing Prompts

The five prompts in prompts/ are generic and topic-agnostic. You can edit them to adjust:

  • Writing style or depth (section_default.txt)
  • Outline format (outline.txt)
  • Assembly strategy (assemble.txt)
  • Review criteria (review.txt)
  • Brief generation logic (gen_section_briefs.txt)

These changes apply to all topics. Per-topic customization is done exclusively through topics/<name>/README.md.
