Anteater10/DocTalk

DocTalk

DocTalk helps patients and caregivers understand medical notes in calmer, clearer language. Users can paste a note, see highlighted terms in context, and read a plain-English summary.

Why This Project Exists

Medical notes are often dense and stressful to read. DocTalk is designed to reduce cognitive load, especially for older adults and caregivers preparing for appointments or follow-up calls.

Current Capabilities

  • Detect medical terms from a seeded glossary and aliases
  • Detect acronym mentions and mark ambiguous acronym cases
  • Detect common units/measurements in note text
  • Mark likely negated findings (e.g., "no mention of pneumonia")
  • Produce a deterministic plain-English summary
  • Optionally improve summary wording via OpenRouter LLM (safely gated, fallback-first)
  • Frontend tuned for accessibility-first reading and low-stress interaction
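
As a rough illustration of how glossary-based detection and heuristic negation marking might fit together, here is a minimal sketch. It is not DocTalk's actual implementation: the `GLOSSARY` data, the `NEGATION_CUES` list, the `detect_terms` function, and the 30-character lookback window are all assumptions made for this example.

```python
import re

# Hypothetical seeded glossary: canonical term -> aliases (illustrative data only).
GLOSSARY = {
    "pneumonia": ["pneumonia"],
    "hypertension": ["hypertension", "high blood pressure", "htn"],
}

# Hypothetical negation cues, searched for in the text just before a match.
NEGATION_CUES = ("no", "denies", "without", "negative for")

# Assumed lookback size in characters; a real system would likely respect
# sentence boundaries instead of using a fixed window.
LOOKBACK_CHARS = 30


def detect_terms(note: str) -> list:
    """Return glossary matches with character offsets and a negation flag."""
    lowered = note.lower()
    matches = []
    for canonical, aliases in GLOSSARY.items():
        for alias in aliases:
            pattern = r"\b" + re.escape(alias) + r"\b"
            for m in re.finditer(pattern, lowered):
                # Inspect a short window before the match for a negation cue.
                window = lowered[max(0, m.start() - LOOKBACK_CHARS):m.start()]
                negated = any(
                    re.search(r"\b" + re.escape(cue) + r"\b", window)
                    for cue in NEGATION_CUES
                )
                matches.append(
                    {
                        "term": canonical,
                        "start": m.start(),
                        "end": m.end(),
                        "negated": negated,
                    }
                )
    # Sort by position so highlights can be rendered left to right.
    return sorted(matches, key=lambda item: item["start"])
```

Calling `detect_terms("No mention of pneumonia. History of high blood pressure.")` under these assumptions flags the pneumonia match as negated and maps "high blood pressure" to the canonical term "hypertension". A fixed-width window is crude, which is consistent with negation handling being described as heuristic.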

Repo Structure

  • web/ - Next.js frontend (the main experience is served at /)
  • api/ - FastAPI backend (/api/v1/* routes)
  • .github/workflows/ci.yml - basic API + web CI checks
  • Makefile - optional shortcuts (make help)

Local Setup

Prerequisites

  • Python 3.10+ (3.11 recommended)
  • Node 20+
  • Postgres (local) for dictionary-backed detection data

1) Backend Setup (api/)

cd api
python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
pip install -e .

Create .env from the example:

cp .env.example .env

Initialize and seed the dictionary data:

python -m app.db.repo init
python -m app.db.repo seed

Run API:

uvicorn app.main:app --reload --port 8000

2) Frontend Setup (web/)

cd web
npm install
npm run dev

The frontend runs on http://localhost:3000 and calls the API at http://localhost:8000.

Environment Variables

Backend env variables (api/.env):

  • DATABASE_URL - required for dictionary reads/seeds
  • CORS_ORIGINS - allowed frontend origin(s); defaults to the local dev origin
  • LOG_LEVEL - logging verbosity
  • OPENROUTER_API_KEY - optional, only used when LLM is enabled
  • OPENROUTER_MODEL - optional model override
  • DOCTALK_ENABLE_LLM - optional feature gate for LLM summary (true/false)
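
To make the roles of these variables concrete, the following is a hedged sketch of how a backend might read them. The `load_settings` function, the returned dictionary keys, the default values (`http://localhost:3000`, `INFO`), the comma-separated `CORS_ORIGINS` format, and the set of accepted truthy strings are assumptions for illustration, not the project's actual configuration code.

```python
import os
from typing import Mapping, Optional


def _as_bool(value: Optional[str]) -> bool:
    """Interpret common truthy strings; anything else counts as False."""
    return (value or "").strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build a settings dict from environment variables (hypothetical shape)."""
    env = os.environ if env is None else env
    api_key = env.get("OPENROUTER_API_KEY", "").strip()
    # Assumed format: a comma-separated list of origins.
    raw_origins = env.get("CORS_ORIGINS", "http://localhost:3000")
    return {
        "database_url": env.get("DATABASE_URL", ""),
        "cors_origins": [o.strip() for o in raw_origins.split(",") if o.strip()],
        "log_level": env.get("LOG_LEVEL", "INFO").upper(),
        "openrouter_model": env.get("OPENROUTER_MODEL") or None,
        # The LLM path is only active when the gate is on AND a key is present.
        "llm_enabled": _as_bool(env.get("DOCTALK_ENABLE_LLM")) and bool(api_key),
    }
```

Passing a plain dictionary instead of relying on `os.environ` keeps the function easy to test. Note that in this sketch setting `DOCTALK_ENABLE_LLM=true` without an API key still leaves the LLM path disabled.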

LLM Summary Behavior (Optional)

/api/v1/simplify is always deterministic-first:

  1. Build deterministic baseline summary
  2. If DOCTALK_ENABLE_LLM=true and API key exists, attempt OpenRouter summary
  3. Validate LLM output (fail closed on unsafe/meta output)
  4. Fall back to deterministic baseline on any failure

This keeps the endpoint reliable and schema-compatible.
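
The four steps above can be sketched roughly as follows. This is an illustration of the deterministic-first pattern, not the endpoint's real code: `simplify`, `is_safe_summary`, `META_MARKERS`, the callback parameters, and the `{"summary", "source"}` response shape are all hypothetical names chosen for this example.

```python
from typing import Callable, Optional

# Hypothetical phrases that suggest the model talked about itself
# instead of summarizing the note.
META_MARKERS = ("as an ai", "i cannot", "here is the summary")


def is_safe_summary(text: str) -> bool:
    """Fail closed: reject empty output or output containing meta phrases."""
    cleaned = text.strip()
    if not cleaned:
        return False
    lowered = cleaned.lower()
    return not any(marker in lowered for marker in META_MARKERS)


def simplify(
    note: str,
    baseline_fn: Callable[[str], str],
    llm_fn: Optional[Callable[[str, str], str]] = None,
    llm_enabled: bool = False,
    api_key: Optional[str] = None,
) -> dict:
    # Step 1: always build the deterministic baseline first.
    baseline = baseline_fn(note)
    fallback = {"summary": baseline, "source": "deterministic"}

    # Step 2: only attempt the LLM when the gate is on and a key exists.
    if not (llm_enabled and api_key and llm_fn):
        return fallback

    try:
        candidate = llm_fn(note, baseline)
        # Step 3: validate the LLM output before trusting it.
        if isinstance(candidate, str) and is_safe_summary(candidate):
            return {"summary": candidate.strip(), "source": "llm"}
    except Exception:
        # Step 4: any provider or network failure falls through to the baseline.
        pass
    return fallback
```

Because the baseline is computed before the LLM is ever called, every code path returns a response with the same shape, which is what keeps the endpoint schema-compatible in this sketch.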

Running Tests

Backend tests (contract + quality eval + LLM safety):

cd api
python -m pytest -q

From the repo root you can also run make api-test (see make help).

Quality eval docs:

  • api/tests/README_quality_eval.md

API Endpoints (v1)

  • GET /api/v1/health
  • POST /api/v1/detect
  • POST /api/v1/simplify
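
For quick manual checks against a locally running API, a small standard-library client along these lines may help. The request and response schemas are not documented here, so the `{"text": ...}` payload in the usage note is an assumption; `API_BASE`, `build_request`, and `call` are hypothetical helper names.

```python
import json
import urllib.request
from typing import Optional

# Matches the local dev port used in the setup steps.
API_BASE = "http://localhost:8000/api/v1"


def build_request(path: str, payload: Optional[dict] = None) -> urllib.request.Request:
    """Build a GET request (no payload) or a JSON POST request (with payload)."""
    url = f"{API_BASE}/{path.lstrip('/')}"
    if payload is None:
        return urllib.request.Request(url, method="GET")
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call(request: urllib.request.Request, timeout: float = 5.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the API running, `call(build_request("health"))` exercises the health route, and `call(build_request("simplify", {"text": "No mention of pneumonia."}))` would exercise the simplify route if the payload field is indeed named `text`; check the backend's request models before relying on that.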

Limitations

  • Detection quality depends on seeded glossary/acronym/unit dictionaries
  • Negation handling is heuristic (rule-based, not full clinical NLP)
  • Optional LLM output quality depends on provider/model and is intentionally guarded by fallback logic

Reviewer Quick Start

  1. Start API on port 8000 (e.g. make api-run after setup)
  2. Start web app on port 3000 (e.g. make web-run)
  3. Paste sample medical text in the UI
  4. Verify:
    • highlights + term explanations
    • plain-English summary
    • copy/download actions
  5. Run python -m pytest -q in api/ for regression checks

About

DocTalk: an NLP-powered web app that simplifies medical notes into plain English. A full-stack demo showcasing React, FastAPI, Postgres, and a hybrid classical + modern NLP pipeline.
