IRIS — Full-Stack Search Engine (BM25)

A full-stack search engine built from scratch. It implements web crawling, inverted indexing, and BM25 ranking, with a FastAPI backend and a React frontend.

Key Features

Web Crawler
- Multi-seed crawling
- Domain-restricted traversal
- HTML parsing & text extraction (BeautifulSoup)
- Noise filtering (scripts, nav, Wikipedia special pages)
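
The multi-seed, domain-restricted traversal listed above can be sketched as a breadth-first crawl. This is a minimal illustration, not the project's actual crawler: the `crawl` function, its `fetch_links` callback, and the `max_pages` limit are hypothetical names, and link extraction is injected as a callback so the sketch runs without network access (in the real crawler that role is played by Requests and BeautifulSoup).

```python
from collections import deque
from typing import Callable, Iterable, List
from urllib.parse import urldefrag, urljoin, urlparse


def crawl(
    seeds: Iterable[str],
    fetch_links: Callable[[str], List[str]],  # hypothetical: returns the hrefs found on a page
    max_pages: int = 50,
) -> List[str]:
    """Breadth-first crawl from several seeds, staying on the seeds' own domains."""
    seeds = list(seeds)
    allowed = {urlparse(s).netloc for s in seeds}  # domain restriction
    queue = deque(seeds)
    seen = set(seeds)
    visited: List[str] = []

    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for href in fetch_links(url):
            # Resolve relative links against the current page and drop #fragments.
            link, _fragment = urldefrag(urljoin(url, href))
            if urlparse(link).netloc not in allowed:
                continue  # off-domain link: skip it
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


if __name__ == "__main__":
    # A tiny in-memory "web" standing in for real HTTP fetches.
    site = {
        "https://a.example/": ["/about", "https://other.example/x"],
        "https://a.example/about": ["/", "#top"],
        "https://b.example/": ["https://a.example/about"],
    }
    print(crawl(["https://a.example/", "https://b.example/"], lambda u: site.get(u, [])))
```

Keeping the fetch step behind a callback is a design choice for this sketch only; it makes the traversal logic easy to test in isolation from parsing and noise filtering.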

Search Engine Core
- Inverted index construction
- BM25 ranking algorithm (relevance scoring)
- Tokenization + stopword filtering
- Multi-term query handling
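
As a rough illustration of how these pieces fit together, the sketch below tokenizes text, filters stopwords, builds an inverted index, and scores multi-term queries with Okapi BM25. It is not the project's code: the `BM25Index` class, the `tokenize` helper, the short stopword set, and the parameter values `k1=1.5` and `b=0.75` are assumptions chosen for the example, and the IDF uses the common "+1 inside the logarithm" variant so that scores stay non-negative.

```python
import math
import re
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

# Illustrative subset only; a real stopword list would be much longer.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "is", "to"}


def tokenize(text: str) -> List[str]:
    """Lowercase, split on non-alphanumerics, and drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


class BM25Index:
    def __init__(self, docs: Dict[str, str], k1: float = 1.5, b: float = 0.75):
        self.k1, self.b = k1, b
        # Inverted index: term -> {doc_id: term frequency in that document}
        self.postings: Dict[str, Dict[str, int]] = defaultdict(dict)
        self.doc_len: Dict[str, int] = {}
        for doc_id, text in docs.items():
            tokens = tokenize(text)
            self.doc_len[doc_id] = len(tokens)
            for term, tf in Counter(tokens).items():
                self.postings[term][doc_id] = tf
        self.avg_len = sum(self.doc_len.values()) / max(len(self.doc_len), 1)

    def search(self, query: str, top_k: int = 10) -> List[Tuple[str, float]]:
        """Return (doc_id, score) pairs, best first, summing BM25 over query terms."""
        scores: Dict[str, float] = defaultdict(float)
        n_docs = len(self.doc_len)
        for term in tokenize(query):
            posting = self.postings.get(term)
            if not posting:
                continue  # term does not occur in the corpus
            df = len(posting)
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            for doc_id, tf in posting.items():
                # Length normalization: longer-than-average documents are penalized.
                norm = 1 - self.b + self.b * self.doc_len[doc_id] / self.avg_len
                scores[doc_id] += idf * tf * (self.k1 + 1) / (tf + self.k1 * norm)
        return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
```

For example, `BM25Index({"d1": "BM25 ranking function", "d2": "inverted index"}).search("bm25 ranking")` returns only `d1`, because only documents that contain at least one query term receive a score.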

Backend API
- FastAPI-based REST service
- /search?q= endpoint
- JSON responses with ranked results
- CORS-enabled for frontend integration

Frontend UI
- React + TypeScript (Vite)
- Search interface with real-time results
- Snippets + relevance scores
- Keyboard + button search support

Architecture

Crawler → Documents → Indexer → Inverted Index
                                      ↓
                                 Query Engine (BM25)
                                      ↓
                                   FastAPI API
                                      ↓
                              React Frontend UI

Tech Stack

Languages: Python, TypeScript
Backend: FastAPI
Frontend: React (Vite)
Parsing: BeautifulSoup, Requests
IR Model: BM25 (Okapi)

Highlights

- Built a search engine from scratch using information retrieval concepts
- Implemented inverted indexing and the BM25 ranking algorithm
- Designed a REST API with FastAPI for query processing
- Developed a React + TypeScript frontend for real-time search
- Optimized the crawler with domain filtering and noise reduction
- Handled CORS, async API integration, and full-stack communication
