Compendia

Overview

Compendia is a web application designed to combat "doom scrolling" and information overload. Instead of mindlessly consuming content, users build a Personal Curriculum around a specific interest. Enter a topic such as machine learning, ceramics, or gardening, and Compendia builds a clear multi-week curriculum around it.

Each plan is structured, intentional, and distraction-free. It draws on high-quality open resources and filters out shorts, reactions, and other noise, so learning feels calm, focused, and meaningful.

How to Set Up

Prerequisites

Node.js (includes npm)
Python 3.10+
MongoDB (local or hosted)

1) Clone and enter the repo

git clone https://github.com/linuteresa/compendia.git
cd compendia

2) Configure environment variables

Create a .env file at the repo root:

MONGODB_URI=<your-mongodb-connection-string>
PORT=5001
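
As a rough illustration of how these two values might be read at startup, the sketch below parses KEY=VALUE lines by hand. The function name parse_env and the sample connection string are assumptions for illustration only; the actual backend may load the file differently, for example through a dotenv library.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values such as connection
        # strings with query parameters stay intact.
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


# Hypothetical file contents; substitute your own connection string.
sample = "MONGODB_URI=mongodb://localhost:27017/compendia\nPORT=5001\n"
settings = parse_env(sample)
port = int(settings.get("PORT", "5001"))
```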

3) Install backend dependencies

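These commands use Windows path syntax. On macOS or Linux, activate with source .venv/bin/activate and use forward slashes in paths.

python -m venv .venv
.\.venv\Scripts\activate
pip install -r backend\requirements.txt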

4) Run the backend API

cd backend
python -m uvicorn main:app --reload --port 5001

The API will be available at http://localhost:5001.

5) Install frontend dependencies

cd ..\Client
npm install

6) Run the frontend

npm run dev

Open http://localhost:5173 in your browser.

Core Features

Topic-Driven Curriculum

Users provide a free-text topic and optional context, which is normalized and expanded into core concepts using open and verifiable sources.
Curriculum depth is defined by Bloom’s Taxonomy, with adjacent cognitive levels combined into a single progression stage to control rigor.
The total number of weeks directly shapes scope and pacing, producing a structured syllabus grounded in the user’s input.

Intelligent Video Curation

Every video is a direct YouTube watch link that plays immediately.
Titles are compared for similarity to avoid near-duplicate introductions and repetitive content.
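
A minimal sketch of title comparison, assuming a bag-of-words cosine similarity over lowercased tokens. The 0.8 threshold and the function names are illustrative assumptions, not values taken from the Compendia codebase.

```python
import math
import re
from collections import Counter


def title_vector(title):
    """Count lowercase alphanumeric tokens in a title."""
    return Counter(re.findall(r"[a-z0-9]+", title.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def drop_near_duplicates(titles, threshold=0.8):
    """Keep a title only if it is below the threshold against all kept titles."""
    kept, kept_vectors = [], []
    for title in titles:
        vector = title_vector(title)
        if all(cosine_similarity(vector, other) < threshold for other in kept_vectors):
            kept.append(title)
            kept_vectors.append(vector)
    return kept


titles = [
    "Introduction to Machine Learning",
    "Machine Learning Introduction",
    "Gradient Descent Explained",
]
unique_titles = drop_near_duplicates(titles)
```

With these sample titles, the second entry shares three of its tokens with the first and scores about 0.87, so it is dropped, while the third shares none and is kept.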

Deep Reading Retrieval

Pulls readings from high-authority open-web domains such as .edu sites, nih.gov, mit.edu, and britannica.com.
Explicitly excludes paywalled or gated academic aggregators such as Scribd, ResearchGate, and Course Hero, so every link is immediately accessible.
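
The allow and block rules could look roughly like the sketch below. The lists contain only the domains named above, and the function names are hypothetical; the real lists in the project are likely longer.

```python
from urllib.parse import urlparse

# Illustrative lists built from the domains named in this README.
ALLOWED_TLD_SUFFIXES = (".edu",)
ALLOWED_DOMAINS = ("nih.gov", "mit.edu", "britannica.com")
BLOCKED_DOMAINS = ("scribd.com", "researchgate.net", "coursehero.com")


def hostname(url):
    """Return the lowercase host of a URL without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def matches(host, domain):
    """True if host is the domain itself or one of its subdomains."""
    return host == domain or host.endswith("." + domain)


def is_allowed_reading(url):
    """Apply the blocklist first, then the allowlist."""
    host = hostname(url)
    if not host:
        return False
    if any(matches(host, domain) for domain in BLOCKED_DOMAINS):
        return False
    if host.endswith(ALLOWED_TLD_SUFFIXES):
        return True
    return any(matches(host, domain) for domain in ALLOWED_DOMAINS)
```

Checking the blocklist before the allowlist means a blocked host is rejected even if a later rule would have accepted it.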

Pedagogical Scaling

Depth Levels (1-3): The curriculum adapts its complexity to the depth level the user selects.

Bloom's Taxonomy Integration:

Level 1: Focuses on "Remember & Understand"
Level 2: Focuses on "Apply & Analyze"
Level 3: Focuses on "Evaluate & Create"
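
The mapping above is small enough to express as a lookup table. This is only a sketch; the names BLOOM_FOCUS and bloom_focus are assumptions, not identifiers from the repository.

```python
# Depth level -> pair of adjacent Bloom's Taxonomy levels, as listed above.
BLOOM_FOCUS = {
    1: ("Remember", "Understand"),
    2: ("Apply", "Analyze"),
    3: ("Evaluate", "Create"),
}


def bloom_focus(depth):
    """Return the Bloom focus label for a depth level of 1, 2, or 3."""
    if depth not in BLOOM_FOCUS:
        raise ValueError(f"depth must be 1, 2, or 3, got {depth!r}")
    return " & ".join(BLOOM_FOCUS[depth])
```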

Backend Pipeline

Phase 1 Topic Intake and Seed Discovery

Input is the raw user topic string and the number of weeks.
The topic is normalized and tokenized to extract keywords.
The MediaWiki API is queried to select a Seed Title, the most relevant Wikipedia page for the topic.
Related Wikipedia pages are added to a pool using targeted searches such as "<topic> overview" and "<topic> history".
The Seed Title page is parsed for section headers, which serve as anchors for week naming.
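
The intake steps can be sketched as follows. The stopword list, the function names, and the choice of the English Wikipedia endpoint are assumptions for illustration; the query parameters (action=query with list=search, and action=parse with prop=sections) are standard MediaWiki API options. The sketch only builds request URLs and does not perform any network calls.

```python
import re
from urllib.parse import urlencode

WIKI_API = "https://en.wikipedia.org/w/api.php"
# Small illustrative stopword list; a real one would be larger.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "for", "in", "on", "with"}


def normalize_topic(topic):
    """Collapse whitespace and lowercase the raw topic string."""
    return re.sub(r"\s+", " ", topic).strip().lower()


def extract_keywords(topic):
    """Tokenize the normalized topic and drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", normalize_topic(topic))
    return [token for token in tokens if token not in STOPWORDS]


def seed_search_url(topic, limit=5):
    """Build a full-text search request used to pick a Seed Title."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": normalize_topic(topic),
        "srlimit": limit,
        "format": "json",
    }
    return WIKI_API + "?" + urlencode(params)


def related_queries(topic):
    """Targeted searches used to widen the pool of related pages."""
    base = normalize_topic(topic)
    return [f"{base} overview", f"{base} history"]


def section_headers_url(title):
    """Build a request for the section headers of the seed page."""
    params = {"action": "parse", "page": title, "prop": "sections", "format": "json"}
    return WIKI_API + "?" + urlencode(params)
```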

Phase 2 Resource Retrieval and Filtering

The YouTube fetcher queries the YouTube Data API when a key is available, and falls back to parsing YouTube search HTML when the key is missing or the API returns a 403.
Every video is stored as an exact watch URL and is deduplicated globally across the full curriculum.
Video titles are filtered with a cosine similarity threshold to avoid near-duplicates.
Results are filtered for junk terms such as shorts, reaction, and memes, and boosted for channels like MIT OpenCourseWare and StatQuest.
Open web readings are pulled from the Wikipedia external links on the Seed Title and related pages.
External links are filtered through an allowlist of high-authority domains and a blocklist for paywalls and academic aggregators, plus file type filters like pdf and ppt.
Each week includes one Wikipedia reading first, then open web links when available, with domain diversity enforced per week.
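
The video filtering steps above can be sketched as a single pass over candidate results. The candidate dictionary shape (title, url, channel), the function names, and the simple one-point boost are assumptions for illustration, not the project's actual scoring.

```python
from urllib.parse import parse_qs, urlparse

JUNK_TERMS = ("shorts", "reaction", "meme")
BOOSTED_CHANNELS = ("mit opencourseware", "statquest")
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def video_id(url):
    """Extract the video ID from a watch URL or a youtu.be short link."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    if host == "youtu.be":
        return parsed.path.lstrip("/") or None
    if host in YOUTUBE_HOSTS and parsed.path == "/watch":
        return parse_qs(parsed.query).get("v", [None])[0]
    return None


def select_videos(candidates, seen_ids):
    """Drop junk and repeats, rank boosted channels first, return watch URLs.

    seen_ids is shared across weeks so a video appears at most once
    in the whole curriculum.
    """
    ranked = []
    for video in candidates:
        vid = video_id(video["url"])
        if vid is None or vid in seen_ids:
            continue
        title = video["title"].lower()
        if any(term in title for term in JUNK_TERMS):
            continue
        channel = video.get("channel", "").lower()
        boost = 1 if any(name in channel for name in BOOSTED_CHANNELS) else 0
        seen_ids.add(vid)
        ranked.append((boost, vid))
    # Stable sort: boosted channels first, original order otherwise.
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return [f"https://www.youtube.com/watch?v={vid}" for _, vid in ranked]


candidates = [
    {"title": "Neural Networks #shorts", "url": "https://www.youtube.com/watch?v=aaa", "channel": "Clips"},
    {"title": "Backpropagation explained", "url": "https://www.youtube.com/watch?v=bbb", "channel": "Some Channel"},
    {"title": "Lecture 1: Introduction", "url": "https://www.youtube.com/watch?v=ccc&t=10s", "channel": "MIT OpenCourseWare"},
    {"title": "Backpropagation explained again", "url": "https://youtu.be/bbb", "channel": "Some Channel"},
]
seen = set()
week_one = select_videos(candidates, seen)
```

Plain substring matching on junk terms is deliberately crude: a term like reaction would also reject a legitimate chemistry title, so a real filter would likely need topic-aware exceptions.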

Phase 3 Curriculum Assembly and Output

Weeks are generated from the requested week count, with depth level controlling the number of items per week (1, 2, or 3).
Weekly themes rotate through fixed phases such as orientation, history, science, methods, practice, risks, policy, ethics, future, and synthesis.
Bloom focus is set by depth level, with adjacent levels grouped: 1 is Remember and Understand, 2 is Apply and Analyze, and 3 is Analyze, Evaluate, and Create.
Output is a single JSON object with meta, a course summary paragraph, and a weeks array containing each week's title, a 50-word summary, videos, and reading links.
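
A rough sketch of the assembly step, assuming the phase list wraps around with a modulo when the course runs longer than ten weeks. All field names in the output (meta, summary, weeks, phase, and so on) and the week title format are hypothetical; the actual JSON keys may differ.

```python
import json

PHASES = [
    "orientation", "history", "science", "methods", "practice",
    "risks", "policy", "ethics", "future", "synthesis",
]


def build_curriculum(topic, weeks, depth, summary=""):
    """Assemble an empty curriculum skeleton for the given week count and depth."""
    if weeks < 1:
        raise ValueError("weeks must be at least 1")
    if depth not in (1, 2, 3):
        raise ValueError("depth must be 1, 2, or 3")
    plan = []
    for index in range(weeks):
        # Rotate through the fixed phases, wrapping after the last one.
        phase = PHASES[index % len(PHASES)]
        plan.append({
            "week": index + 1,
            "title": f"Week {index + 1}: {topic} ({phase})",
            "phase": phase,
            "summary": "",   # filled later with a 50-word summary
            "videos": [],    # up to `depth` watch URLs
            "readings": [],  # Wikipedia reading first, then open web links
        })
    return {
        "meta": {"topic": topic, "weeks": weeks, "depth": depth, "items_per_week": depth},
        "summary": summary,
        "weeks": plan,
    }


curriculum = build_curriculum("Ceramics", weeks=12, depth=2)
payload = json.dumps(curriculum, indent=2)
```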

Hackathon Scope

Primary Tech Stack: Python, Google Gemini, YouTube Data API.

Original Work: All curriculum logic and filtering algorithms were built during the hackathon period.

Goal: To demonstrate how AI can curate safe, educational pathways on the open web.

Demonstration Video

Watch the full walkthrough: Demo video
