BookScout is a full‑stack, AI‑powered book recommendation web app that uses semantic similarity, genre filtering, and automated metadata enrichment to suggest books you’ll love. Built with Flask, Pandas, BeautifulSoup, Hugging Face Transformers, and Chart.js, it features a dark‑mode UI and a scalable CSV‑backed cache.
- Search & Recommend
  - Query by book title
  - TF-IDF + cosine similarity over descriptions and genres
  - Optional genre filter checkboxes
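As a rough illustration of the ranking idea, here is a minimal, standard-library-only sketch of TF-IDF weighting plus cosine similarity over a combined "description + genres" string. The function names, the smoothed IDF formula, the whitespace tokenizer, and the sample books are all assumptions made for this example; the project's actual implementation may differ (for instance, it may rely on a vectorizer library).

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and split on whitespace; a real pipeline would likely strip punctuation too.
    return text.lower().split()


def tfidf_vectors(documents):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({
            # Smoothed IDF (an assumption): log((1 + N) / (1 + df)) + 1
            term: (count / total) * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(title, books, top_n=3, genres=None):
    """books: list of dicts with 'title', 'description', and 'genres' (list of str)."""
    texts = [f"{b['description']} {' '.join(b['genres'])}" for b in books]
    vectors = tfidf_vectors(texts)
    titles = [b["title"].lower() for b in books]
    if title.lower() not in titles:
        return []
    query_idx = titles.index(title.lower())
    scored = []
    for i, book in enumerate(books):
        if i == query_idx:
            continue  # never recommend the query book itself
        if genres and not set(genres) & set(book["genres"]):
            continue  # optional genre filter
        scored.append((cosine_similarity(vectors[query_idx], vectors[i]), book["title"]))
    scored.sort(reverse=True)
    return [t for _, t in scored[:top_n]]


# Hypothetical sample data, for illustration only.
books = [
    {"title": "Dune", "description": "desert planet politics and prophecy", "genres": ["science-fiction"]},
    {"title": "Foundation", "description": "galactic empire politics and prophecy", "genres": ["science-fiction"]},
    {"title": "Emma", "description": "matchmaking in an english village", "genres": ["romance"]},
]
print(recommend("Dune", books, top_n=1))
```

Appending the genre labels to the description text lets shared genres raise the similarity score, while the separate `genres` argument acts as a hard filter, mirroring the checkbox behavior described above.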
- Metadata Enrichment
  - Google Books API for base metadata
  - Wikipedia fallback for plot summaries & categories
  - Zero-shot genre classification (facebook/bart-large-mnli)
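The enrichment chain above could be sketched roughly as follows. This is an illustrative outline, not the project's code: `build_query_url`, `extract_metadata`, `enrich`, and `classify_genres` are hypothetical names, the fetchers are passed in as plain callables so the fallback order is easy to see, and the 0.5 score threshold is an arbitrary assumption. The Google Books volumes endpoint and the Transformers `zero-shot-classification` pipeline are real, but how BookScout calls them is not shown in this README.

```python
from urllib.parse import urlencode

GOOGLE_BOOKS_ENDPOINT = "https://www.googleapis.com/books/v1/volumes"


def build_query_url(title, api_key=None):
    """Build a Google Books volumes search URL; the API key is optional."""
    params = {"q": f"intitle:{title}", "maxResults": 1}
    if api_key:
        params["key"] = api_key
    return f"{GOOGLE_BOOKS_ENDPOINT}?{urlencode(params)}"


def extract_metadata(payload):
    """Pull base fields out of a Google Books JSON response (already parsed to a dict)."""
    items = payload.get("items") or []
    if not items:
        return None
    info = items[0].get("volumeInfo", {})
    return {
        "title": info.get("title", ""),
        "authors": info.get("authors", []),
        "description": info.get("description", ""),
        "categories": info.get("categories", []),
    }


def enrich(title, fetch_google, fetch_wikipedia, classify):
    """Assumed fallback order: Google Books, then Wikipedia, then zero-shot genres."""
    meta = extract_metadata(fetch_google(title)) or {
        "title": title, "authors": [], "description": "", "categories": [],
    }
    if not meta["description"] or not meta["categories"]:
        wiki = fetch_wikipedia(title) or {}
        meta["description"] = meta["description"] or wiki.get("description", "")
        meta["categories"] = meta["categories"] or wiki.get("categories", [])
    if not meta["categories"] and meta["description"]:
        meta["categories"] = classify(meta["description"])
    return meta


def classify_genres(description, candidate_genres, threshold=0.5):
    """Zero-shot genre tagging; requires `transformers` and downloads the model on first use."""
    from transformers import pipeline  # imported lazily because it is a heavy dependency

    classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
    result = classifier(description, candidate_labels=candidate_genres, multi_label=True)
    return [label for label, score in zip(result["labels"], result["scores"]) if score >= threshold]
```

Passing the fetchers as arguments keeps the network calls out of the core logic, so the fallback order can be exercised with stub functions before wiring in real HTTP requests.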
- Bulk Addition
  - Scrape Goodreads lists for new titles
  - Clean & dedupe titles (removes series info)
  - Add in bulk via `bulk_add.py` with auto-preprocessing
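The "clean & dedupe" step might look something like the sketch below. It assumes series info appears as a trailing parenthetical containing a `#` number, such as `(The Hunger Games, #1)`, and that duplicates should be matched case-insensitively; both the regular expression and the function names are illustrative assumptions rather than what `bulk_add.py` necessarily does.

```python
import re

# Assumed pattern: a trailing "(Series Name, #N)" style suffix.
SERIES_SUFFIX = re.compile(r"\s*\([^()]*#\s*\d+[^()]*\)\s*$")


def strip_series_info(title):
    """Drop a trailing series parenthetical, leaving other parentheses untouched."""
    return SERIES_SUFFIX.sub("", title).strip()


def clean_and_dedupe(titles):
    """Normalize whitespace, remove series info, and keep the first occurrence of each title."""
    seen = set()
    cleaned = []
    for raw in titles:
        title = " ".join(strip_series_info(raw).split())
        key = title.casefold()  # case-insensitive duplicate detection
        if title and key not in seen:
            seen.add(key)
            cleaned.append(title)
    return cleaned


scraped = [
    "The Hunger Games (The Hunger Games, #1)",
    "Dune (Dune #1)",
    "dune",
    "  Emma ",
]
print(clean_and_dedupe(scraped))
```

Requiring a `#` inside the parentheses is a deliberately conservative choice: it avoids stripping parentheticals that are part of a real title or edition note.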
- Data Utilities (`/scripts`)
  - `scrape_goodreads.py` → Fetch new titles from Goodreads
  - `strip_malformed_rows.py` → Clean up broken CSV rows
  - `clean_categories.py` → Sanitize category fields
  - `genre_counter.py` → Count & normalize genres (CSV + `<option>` HTML)
  - `structure_maker.py` → Print project folder tree
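To make the genre-counting utility more concrete, here is a small sketch of counting normalized genres and rendering them as `<option>` tags. It is written from the one-line description above only: the comma separator, the lowercase normalization, the count shown in the label, and all function names are assumptions, and the real `genre_counter.py` may read and write its data differently.

```python
from collections import Counter
from html import escape


def normalize_genre(raw):
    """Lowercase and collapse internal whitespace so variants count as one genre."""
    return " ".join(raw.strip().lower().split())


def count_genres(category_fields, separator=","):
    """Count genres across a column of separator-delimited category strings."""
    counts = Counter()
    for field in category_fields:
        for raw in field.split(separator):
            genre = normalize_genre(raw)
            if genre:
                counts[genre] += 1
    return counts


def to_option_tags(counts):
    """Render genres as <option> tags, most frequent first, ties broken alphabetically."""
    return [
        f'<option value="{escape(genre, quote=True)}">{escape(genre.title())} ({n})</option>'
        for genre, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    ]


# Hypothetical category column values.
counts = count_genres(["Fantasy, Science  Fiction", "fantasy", ""])
for tag in to_option_tags(counts):
    print(tag)
```

Escaping the genre text with `html.escape` keeps stray characters in scraped categories from breaking the generated markup.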
- Clone the repo
  - `git clone https://github.com/anand25116/bookscout.git`
  - `cd bookscout`
- Create & activate virtual environment
  - `python3 -m venv venv`
  - `source venv/bin/activate` (macOS/Linux)
  - `venv\Scripts\activate` (Windows)
- Install dependencies
  - `pip install -r requirements.txt`
- Configure environment
  - Copy `.env.example` to `.env`
  - Add your `GOOGLE_BOOKS_API_KEY` (optional)
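Because the API key is optional, the app has to cope with it being absent. The sketch below shows one way that lookup could work using only the standard library; the simple `KEY=value` parser and the function names are assumptions for illustration, and the project may instead load `.env` through a dedicated package.

```python
import os


def parse_env_file(text):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_api_key(env_values, environ=os.environ):
    """Prefer the real environment, fall back to .env values, else None (key is optional)."""
    return environ.get("GOOGLE_BOOKS_API_KEY") or env_values.get("GOOGLE_BOOKS_API_KEY") or None


sample = '# local settings\nGOOGLE_BOOKS_API_KEY="abc123"\n'
print(get_api_key(parse_env_file(sample), environ={}))
```

Returning `None` rather than raising lets the caller decide whether to send unauthenticated requests when no key is configured.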
- `export FLASK_APP=app.py`
- `export FLASK_ENV=development`
- `flask run`
- Open http://localhost:5000
- Enter a book title and optional genre filters → See recommendations!
- Generate new titles (optional)
  - `python scripts/scrape_goodreads.py`
- Add to cache & preprocess
  - `python bulk_add.py`
- Clean categories
  - `python scripts/clean_categories.py`
- Count genres & generate options
  - `python scripts/genre_counter.py`
- Dockerize with a `Dockerfile` (not included)
- Deploy to Heroku / AWS / GCP with `Procfile` & environment variables