Jenny0932/research-app


Research Explorer

A dark-themed research web app built for a principal data scientist — streams the latest GenAI and ML papers live from arXiv across 8 curated topic feeds. Browse, search, annotate, and save notes on papers that matter to your work.


What This App Does

Research Explorer solves the problem of staying current with fast-moving GenAI research without spending hours on Google Scholar or Twitter. It gives you a single, always-up-to-date view of the papers most relevant to applied ML work — especially in production environments like financial services.

The core workflow:

  1. Browse — pick a topic from the sidebar (e.g. Multi-Agent Systems) and instantly see the 30 most recent arXiv papers, sorted by submission date
  2. Search — run a free-text search across arXiv titles and abstracts when you need something specific
  3. Read — click Read on any paper card to open the split-view reader: PDF on the left, your notes on the right
  4. Annotate — write notes alongside the paper; they auto-save as you type and are stored as Markdown files on your local machine
  5. Review — revisit all your annotated papers in the Saved Notes view, with previews and metadata intact

Key Features

Paper Discovery

  • 8 curated topic feeds — each with a hand-tuned arXiv query covering the most important terms for that topic
  • Live arXiv API — results come straight from the arXiv API, sorted by submission date, with no stale intermediate index; responses are cached in memory for up to 2 hours unless you hit Refresh
  • Date window filter — scope results to the last 30 days, 90 days, 6 months, or 1 year
  • Free-text search — matches your query against arXiv titles and abstracts across all CS and quantitative finance categories
  • Refresh on demand — a Refresh button bypasses the 2-hour cache to pull the absolute latest papers at any time

Reading Experience

  • Expandable abstracts — read the full abstract inline on the card without navigating away
  • NEW badge — papers submitted in the last 7 days are highlighted so you can spot the freshest work at a glance
  • Split-view PDF reader — click Read to open a dedicated page with the PDF on the left and your notes on the right; drag the divider to resize the panels to any ratio
  • Native PDF tools — the PDF renders in the browser's built-in viewer, so text selection, copy, and browser highlight tools all work exactly as normal
  • Copy arXiv ID — one-click copy button for citing or sharing a specific paper

Paper Annotation

  • Split-view note editor — notes panel sits alongside the PDF so you can write while you read, without switching windows
  • Inline note editor — click 📝 Add Note on any paper card for a quick note without leaving the main feed
  • Auto-save — notes in the split-view save automatically 2 seconds after you stop typing; a status indicator shows Unsaved → Saving… → Saved ✓
  • Keyboard shortcut — press ⌘↵ (Mac) or Ctrl↵ (Windows/Linux) to save immediately in either editor
  • Markdown persistence — each note is saved to notes/{paper-title}.md on your local machine, containing the full paper metadata (arXiv ID, authors, categories, links) plus your notes
  • Edit and delete — reopen any note to update it; the file is overwritten in place
  • 📝 NOTE badge — cards with existing notes show an amber badge so you can see what you've already reviewed
  • Saved Notes view — a dedicated sidebar section lists all annotated papers with note previews, save timestamps, and filenames

Saved Notes Format

Each note is saved as a readable, portable Markdown file:

# Attention Is All You Need

**arXiv ID:** `1706.03762`
**Authors:** Ashish Vaswani, Noam Shazeer, ...
**Published:** 2017-06-12
**Categories:** cs.CL, cs.LG
**Abstract:** https://arxiv.org/abs/1706.03762
**PDF:** https://arxiv.org/pdf/1706.03762

---

## Notes

Foundational transformer paper. Self-attention mechanism eliminates
recurrence entirely — key insight for our document classification work.
Revisit the multi-head attention section before the architecture review.

---

*Last saved: 2026-04-29 14:30*

Files are stored locally in notes/ and are compatible with Obsidian, VS Code, Typora, or any Markdown viewer.
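
For readers who want to generate compatible files from their own scripts, the following is a minimal sketch that renders the format shown above. The function name `render_note` and the dictionary keys (`title`, `arxiv_id`, `authors`, `published`, `categories`) are assumptions for illustration; the app's internal data model may differ.

```python
from datetime import datetime


def render_note(paper: dict, notes: str, saved_at: datetime) -> str:
    """Render a note file in the Markdown layout shown above.

    The keys read from `paper` are hypothetical, chosen to mirror the
    metadata fields that appear in the example file.
    """
    arxiv_id = paper["arxiv_id"]
    lines = [
        f"# {paper['title']}",
        "",
        f"**arXiv ID:** `{arxiv_id}`",
        f"**Authors:** {', '.join(paper['authors'])}",
        f"**Published:** {paper['published']}",
        f"**Categories:** {', '.join(paper['categories'])}",
        f"**Abstract:** https://arxiv.org/abs/{arxiv_id}",
        f"**PDF:** https://arxiv.org/pdf/{arxiv_id}",
        "",
        "---",
        "",
        "## Notes",
        "",
        notes.strip(),
        "",
        "---",
        "",
        f"*Last saved: {saved_at:%Y-%m-%d %H:%M}*",
        "",
    ]
    return "\n".join(lines)
```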


Topics

| Icon | Topic | What it covers |
|------|-------|----------------|
| ⚙️ | LLM Fine-Tuning | LoRA, QLoRA, PEFT, DPO, RLHF, instruction tuning, SFT |
| 📊 | Agentic Evaluation | Benchmarks and evaluation frameworks for AI agents |
| 🤝 | Multi-Agent Systems | Orchestration frameworks, agent coordination and communication |
| 🔍 | RAG & Retrieval | Retrieval-augmented generation, GraphRAG, knowledge retrieval |
| 🧠 | Reasoning & Planning | Chain-of-thought, ReAct, tree-of-thought, structured inference |
| 🛡️ | AI Safety & Alignment | Safety, alignment, red-teaming, adversarial robustness |
| 💹 | AI in Finance | LLMs for risk management, fraud detection, credit scoring, trading |
| 🏗️ | Foundation Models | LLM architecture, scaling laws, pre-training, model compression |

Prerequisites

  • Python 3.11+
  • uv (curl -LsSf https://astral.sh/uv/install.sh | sh)

Setup

# 1. Clone and enter the project
git clone https://github.com/Jenny0932/research-app.git
cd research-app

# 2. Install dependencies (creates .venv automatically)
uv sync

Running

uv run uvicorn main:app --host 0.0.0.0 --port 8765

Open http://localhost:8765 in your browser.

For development with auto-reload:

uv run uvicorn main:app --host 0.0.0.0 --port 8765 --reload

API Endpoints

| Method | Path | Description |
|--------|------|-------------|
| GET | / | Main UI |
| GET | /pdf/{arxiv_id} | Split-view PDF reader + notes for a paper |
| GET | /api/topics | All topic configs and cache status |
| GET | /api/papers/{topic_id} | Papers for a topic (?days=90&refresh=false) |
| GET | /api/search | Free-text arXiv search over titles and abstracts (?q=...&days=90) |
| POST | /api/notes | Save a note for a paper |
| GET | /api/notes | List all saved notes with previews |
| GET | /api/notes/{arxiv_id} | Get note content for a specific paper |
| DELETE | /api/notes/{arxiv_id} | Delete a note |

/api/papers/{topic_id} parameters

| Param | Default | Description |
|-------|---------|-------------|
| days | 90 | How many days back to search |
| refresh | false | true to bypass the 2-hour cache |

Valid topic_id values

llm_finetuning, agentic_eval, multi_agent, rag, reasoning, ai_safety, ai_finance, foundation_models
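
A small client-side sketch of calling the papers endpoint, using only the standard library. The path, parameters, defaults, and topic IDs are the ones documented above; the helper names `papers_url` and `fetch_papers` are hypothetical, and the assumption that the endpoint returns JSON is not confirmed by this README.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8765"
TOPIC_IDS = {
    "llm_finetuning", "agentic_eval", "multi_agent", "rag",
    "reasoning", "ai_safety", "ai_finance", "foundation_models",
}


def papers_url(topic_id: str, days: int = 90, refresh: bool = False,
               base_url: str = BASE_URL) -> str:
    """Build the URL for GET /api/papers/{topic_id}."""
    if topic_id not in TOPIC_IDS:
        raise ValueError(f"unknown topic_id: {topic_id!r}")
    query = urlencode({"days": days, "refresh": str(refresh).lower()})
    return f"{base_url}/api/papers/{topic_id}?{query}"


def fetch_papers(topic_id: str, **kwargs):
    """Fetch and decode the response (assumes the server is running and replies with JSON)."""
    with urlopen(papers_url(topic_id, **kwargs)) as resp:
        return json.load(resp)
```

For example, `papers_url("rag", days=30, refresh=True)` targets the RAG feed for the last 30 days and asks the server to skip its cache.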


Configuration

Key constants in main.py:

| Constant | Default | Description |
|----------|---------|-------------|
| CACHE_TTL | 7200 | Paper cache TTL in seconds (2 hours) |
| REQUEST_DELAY | 4.0 | Minimum gap in seconds between arXiv API calls (rate limit) |

arXiv's API terms ask clients to send no more than one request every 3 seconds, so the 4-second delay leaves safe headroom.
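
One way to enforce a minimum gap between outbound calls in an async app is a lock-guarded throttle, sketched below. `REQUEST_DELAY` matches the constant named above; the `Throttle` class is a hypothetical illustration and may not resemble how main.py implements the delay.

```python
import asyncio
import time

REQUEST_DELAY = 4.0  # seconds, same value as the constant documented above


class Throttle:
    """Serialize callers so consecutive calls are at least `min_gap` seconds apart."""

    def __init__(self, min_gap: float = REQUEST_DELAY) -> None:
        self._min_gap = min_gap
        self._lock = asyncio.Lock()
        self._last_call: float | None = None

    async def wait(self) -> None:
        # The lock ensures concurrent requests queue up instead of racing.
        async with self._lock:
            if self._last_call is not None:
                remaining = self._min_gap - (time.monotonic() - self._last_call)
                if remaining > 0:
                    await asyncio.sleep(remaining)
            self._last_call = time.monotonic()
```

A caller would `await throttle.wait()` immediately before each arXiv request. The first call passes straight through, and later calls sleep only for whatever part of the gap has not already elapsed.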


Project Structure

research-app/
├── main.py                  # FastAPI app, arXiv client, notes API, caching
├── pyproject.toml           # Project metadata and dependencies
├── .pre-commit-config.yaml  # ruff lint + format hooks
├── templates/
│   ├── index.html           # Main UI — topic feeds, search, paper cards
│   └── pdf_view.html        # Split-view PDF reader + notes editor
├── notes/                   # Saved note files — gitignored, local only
│   ├── .index.json          # arXiv ID → filename lookup map
│   └── *.md                 # One Markdown file per annotated paper
└── README.md
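
To show how the `.index.json` lookup map and the per-paper Markdown files could fit together, here is a minimal sketch. The directory layout follows the tree above; the `slugify` and `save_note` helpers, and the choice to derive filenames from a lowercased, hyphenated title, are assumptions made for illustration only.

```python
import json
import re
from pathlib import Path


def slugify(title: str) -> str:
    """Turn a paper title into a filesystem-safe name (hypothetical scheme)."""
    slug = re.sub(r"[^A-Za-z0-9]+", "-", title).strip("-").lower()
    return slug or "untitled"


def save_note(notes_dir: Path, arxiv_id: str, title: str, body: str) -> Path:
    """Write a note file and record arXiv ID -> filename in .index.json."""
    notes_dir.mkdir(parents=True, exist_ok=True)
    index_path = notes_dir / ".index.json"
    index = json.loads(index_path.read_text(encoding="utf-8")) if index_path.exists() else {}
    # Reuse the existing filename so edits overwrite the note in place.
    filename = index.get(arxiv_id) or f"{slugify(title)}.md"
    note_path = notes_dir / filename
    note_path.write_text(body, encoding="utf-8")
    index[arxiv_id] = filename
    index_path.write_text(json.dumps(index, indent=2), encoding="utf-8")
    return note_path
```

Keying the index by arXiv ID means a note can be found from a paper card without scanning every Markdown file.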

Tech Stack

| Layer | Technology |
|-------|------------|
| Backend | FastAPI + uvicorn |
| HTTP client | httpx (async) |
| Templates | Jinja2 |
| Frontend | Alpine.js + Tailwind CSS (CDN, no build step) |
| Paper source | arXiv API (free, no auth required) |
| Code quality | ruff via pre-commit |
| Package manager | uv |

Keyboard Shortcuts

| Key | Action |
|-----|--------|
| / | Focus the search bar from anywhere on the page |
| Enter | Submit search |
| ⌘↵ / Ctrl↵ | Save note (inline editor or split-view reader) |

Notes

  • Papers are cached in memory per topic + date window. Cache clears on server restart.
  • The arXiv API is free and requires no API key or account.
  • The notes/ directory is gitignored — your annotations stay local and private.
  • Notes files are plain Markdown and open in any editor (Obsidian, VS Code, Typora, etc.).
