Backend service for the Library project.
This part contains the FastAPI API, database layer, migrations, background workers, parser jobs, mail workflows, and S3-compatible storage integration.
- Python 3.12
- FastAPI
- SQLAlchemy asyncio
- Alembic
- PostgreSQL
- Redis
- RabbitMQ
- Taskiq
- MinIO/S3-compatible storage
- Poetry
backend/
|-- app/
| |-- api/v1/ # Route modules
| |-- core/ # Config, database, broker, scheduler, storage
| |-- core/models/ # SQLAlchemy models and enums
| |-- crud/ # Database access helpers
| |-- dependencies/ # FastAPI dependencies
| |-- mailing/ # Mail sending and templates
| |-- schemas/ # Pydantic schemas
| |-- scripts/ # One-off runnable scripts
| |-- services/ # Domain services
| |-- tasks/ # Taskiq background tasks
| `-- run.py # Uvicorn entrypoint
|-- alembic/ # Alembic migrations
|-- Dockerfile
|-- docker-compose.yml
|-- pyproject.toml
`-- .env.example
Create a local environment file before running the backend:
cp .env.example .env
Important environment groups:
- POSTGRES_* configures PostgreSQL.
- REDIS_* configures Redis.
- RMQ_* configures RabbitMQ.
- MINIO_* configures MinIO and default buckets.
- MAIL_* configures SMTP mail sending.
- SECURE_* configures JWT, cookies, CSRF, and token TTL values.
- CORS_* configures allowed frontend origins and headers.
- ADMIN_* configures bootstrap admin creation.
- PARSER_* configures external book parsing.
- RUN_* configures the API host and port.
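As a rough illustration of how the prefix groups partition the environment, the sketch below buckets variables by the documented prefixes. It is not the project's config loader (that lives in app/core/config.py); the function name group_env and the sample variable names such as POSTGRES_HOST are hypothetical.

```python
from collections import defaultdict

# Prefixes taken from the environment groups listed in this README.
PREFIXES = (
    "POSTGRES", "REDIS", "RMQ", "MINIO", "MAIL",
    "SECURE", "CORS", "ADMIN", "PARSER", "RUN",
)


def group_env(env: dict) -> dict:
    """Bucket variables by documented prefix; other names are ignored."""
    groups = defaultdict(dict)
    for name, value in env.items():
        prefix, sep, rest = name.partition("_")
        if sep and rest and prefix in PREFIXES:
            groups[prefix][rest] = value
    return dict(groups)


# Hypothetical values, for illustration only.
sample = {
    "POSTGRES_HOST": "localhost",
    "POSTGRES_PORT": "5432",
    "RUN_PORT": "8000",
    "PATH": "/usr/bin",
}
grouped = group_env(sample)
```

Calling group_env(dict(os.environ)) after loading .env is one quick way to check that every expected group has at least one value set.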
The backend contains an automatic parser for importing public-domain book data from Gutendex. The default source is configured by PARSER_API_BASE and points to:
https://gutendex.com
Gutendex is used as a catalog API. The parser reads book metadata from Gutendex JSON responses and follows the plain-text file links provided in each book's formats object.
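A minimal sketch of following a plain-text link from a book's formats object, assuming formats maps MIME type strings to URLs. The helper name pick_plain_text_url and the example.org URLs are hypothetical, and the tie-breaking order (text/plain keys first, then .txt.utf-8 URLs, then .txt URLs) is one reading of the parser's preference, not a copy of the real code in app/services/parser/.

```python
from typing import Optional


def pick_plain_text_url(formats: dict) -> Optional[str]:
    """Return a plain-text download URL from a formats mapping, or None."""
    # First choice: any entry whose MIME type is text/plain (charset may follow).
    for mime, url in formats.items():
        if mime.lower().startswith("text/plain"):
            return url
    # Fallback: judge by URL suffix, UTF-8 variant first.
    for suffix in (".txt.utf-8", ".txt"):
        for url in formats.values():
            if url.lower().endswith(suffix):
                return url
    return None


# Made-up formats objects for illustration.
with_mime = {
    "text/html": "https://example.org/1.html",
    "text/plain; charset=us-ascii": "https://example.org/1.txt",
}
suffix_only = {
    "application/octet-stream": "https://example.org/2.txt.utf-8",
    "image/jpeg": "https://example.org/2.cover.jpg",
}
```

Returning None lets the caller skip books that have no usable text file instead of raising in the middle of a batch.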
- Category discovery starts at GET /books/?sort=popular.
- The parser reads subjects and bookshelves from popular books and turns them into local categories.
- For each discovered category, it requests GET /books/?topic=<category>&sort=popular.
- Every book is checked against the current category to avoid unrelated search results.
- The parser extracts the title, first author, author birth/death years, first summary as annotation, and cover URL.
- It chooses the best plain-text format from formats, preferring text/plain, .txt.utf-8, and .txt links.
- It downloads the text file, removes Project Gutenberg start/end boilerplate, normalizes Unicode/control characters, and keeps paragraph breaks.
- The cleaned text is split into pages using PARSER_PAGE_CHARS.
- Books are saved in batches with their category, author, metadata, cover URL, and generated BookPage rows.
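The cleaning and paging steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's implementation: clean_text and split_pages are hypothetical names, the marker regexes cover the common "*** START OF THE/THIS PROJECT GUTENBERG EBOOK ... ***" form only, and reflowing each paragraph onto one line is a simplification chosen for this example.

```python
import re
import unicodedata

START_RE = re.compile(
    r"\*\*\*\s*START OF (?:THE|THIS) PROJECT GUTENBERG EBOOK[^\n]*\*\*\*",
    re.IGNORECASE,
)
END_RE = re.compile(
    r"\*\*\*\s*END OF (?:THE|THIS) PROJECT GUTENBERG EBOOK", re.IGNORECASE
)


def clean_text(raw: str) -> str:
    """Drop Gutenberg boilerplate, normalize characters, keep paragraph breaks."""
    start = START_RE.search(raw)
    if start:
        raw = raw[start.end():]
    end = END_RE.search(raw)
    if end:
        raw = raw[:end.start()]
    text = unicodedata.normalize("NFC", raw)
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Remove control characters other than newline and tab.
    text = "".join(
        ch for ch in text if ch in "\n\t" or unicodedata.category(ch) != "Cc"
    )
    # Blank lines separate paragraphs; whitespace inside one is collapsed.
    paragraphs = (" ".join(p.split()) for p in re.split(r"\n\s*\n", text))
    return "\n\n".join(p for p in paragraphs if p)


def split_pages(text: str, page_chars: int) -> list:
    """Pack whole paragraphs into pages of at most page_chars characters."""
    if page_chars <= 0:
        raise ValueError("page_chars must be positive")
    pages, current = [], ""
    for paragraph in text.split("\n\n"):
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if current and len(candidate) > page_chars:
            pages.append(current)
            current = paragraph
        else:
            current = candidate
        # A single oversized paragraph is hard-split at the page size.
        while len(current) > page_chars:
            pages.append(current[:page_chars])
            current = current[page_chars:]
    if current:
        pages.append(current)
    return pages
```

Here page_chars plays the role of PARSER_PAGE_CHARS. Packing by paragraph keeps page boundaries at natural breaks where possible, at the cost of pages that vary in length.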
The parser avoids importing the same book twice by checking the combination of title, author, and category before text download and again before save.
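A simplified, in-memory sketch of that duplicate check. The real parser presumably queries the database; the class SeenBooks, the helper book_key, and the case/whitespace normalization are assumptions made for this example.

```python
def book_key(title: str, author: str, category: str) -> tuple:
    """Build a comparison key; case and extra whitespace are ignored (assumption)."""
    return tuple(
        " ".join(part.casefold().split()) for part in (title, author, category)
    )


class SeenBooks:
    """In-memory stand-in for the title/author/category existence check."""

    def __init__(self, existing=()):
        self._keys = {book_key(*item) for item in existing}

    def is_duplicate(self, title: str, author: str, category: str) -> bool:
        return book_key(title, author, category) in self._keys

    def add(self, title: str, author: str, category: str) -> None:
        self._keys.add(book_key(title, author, category))


seen = SeenBooks([("Moby Dick", "Herman Melville", "Adventure")])
```

Checking once before the text download saves bandwidth, and checking again before the save guards against a book that was imported while the download was in progress.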
It also limits the import size so a scheduled run remains controlled:
- PARSER_DISCOVERY_PAGES_LIMIT limits how many popular catalog pages are scanned for categories.
- PARSER_MAX_CATEGORIES limits the number of categories imported per run.
- PARSER_MAX_AUTHORS_PER_CATEGORY limits author variety inside one category.
- PARSER_MAX_BOOKS_PER_AUTHOR limits how many books one author can contribute to one category.
- PARSER_BATCH_SIZE controls how many parsed books are saved per database batch.
- PARSER_REQUEST_RETRIES controls retry attempts for Gutendex JSON and text requests.
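To make the per-category limits concrete, here is a hedged sketch of how the author and per-author caps might interact when filtering one category's candidates. The function limit_books and the dict shape with an "author" key are hypothetical; the actual selection logic in app/services/parser/ may differ.

```python
def limit_books(books: list, max_authors: int, max_books_per_author: int) -> list:
    """Keep books in order while enforcing both per-category caps."""
    per_author = {}  # author name -> number of books kept so far
    kept = []
    for book in books:
        author = book["author"]
        is_new_author = author not in per_author
        if is_new_author and len(per_author) >= max_authors:
            continue  # author cap reached for this category
        if per_author.get(author, 0) >= max_books_per_author:
            continue  # this author already contributed the maximum
        per_author[author] = per_author.get(author, 0) + 1
        kept.append(book)
    return kept


# Made-up candidates, already sorted by popularity.
candidates = [
    {"title": "A1", "author": "A"},
    {"title": "A2", "author": "A"},
    {"title": "A3", "author": "A"},
    {"title": "B1", "author": "B"},
    {"title": "C1", "author": "C"},
]
selected = limit_books(candidates, max_authors=2, max_books_per_author=2)
```

With these inputs the third book by author A and the only book by author C are dropped, which mirrors the roles of PARSER_MAX_BOOKS_PER_AUTHOR and PARSER_MAX_AUTHORS_PER_CATEGORY.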
The parser implementation lives in app/services/parser/.
Main entry points:
- app/services/parser/runner.py runs the full parser flow.
- app/tasks/parser_tasks.py exposes the scheduled Taskiq task.
- app/scripts/run_parser_once.py queues one parser task manually through the broker.
Docker services related to the parser:
- worker executes parser and email tasks from RabbitMQ.
- scheduler queues the parser on the configured cron schedule.
- parser_bootstrap queues one parser run when started.
Run parser bootstrap once:
poetry run python -m app.scripts.run_parser_once
Run the worker that executes parser jobs:
poetry run taskiq worker app.core.broker:broker app.tasks.parser_tasks app.tasks.email_tasks --log-level INFO --max-prefetch 1 --ack-type when_executed --shutdown-timeout 30
Run the scheduler:
poetry run taskiq scheduler app.core.scheduler:scheduler app.tasks.parser_tasks --log-level INFO
From the repository root, start the whole project:
docker compose up -d --build
From this directory, start only the backend stack:
docker compose up -d --build
The API container runs migrations before starting the server:
alembic -c app/alembic.ini upgrade head && python -m app.run
Default local URLs from .env.example:
- API: http://localhost:8000
- OpenAPI docs: http://localhost:8000/docs
- RabbitMQ UI: http://localhost:15672
- MinIO UI: http://localhost:9001
Install dependencies:
poetry install
Run migrations:
poetry run alembic -c app/alembic.ini upgrade head
Run the API:
poetry run python -m app.run
Run a worker:
poetry run taskiq worker app.core.broker:broker app.tasks.parser_tasks app.tasks.email_tasks --log-level INFO --max-prefetch 1 --ack-type when_executed --shutdown-timeout 30
Run the scheduler:
poetry run taskiq scheduler app.core.scheduler:scheduler app.tasks.parser_tasks --log-level INFO
Run parser bootstrap once:
poetry run python -m app.scripts.run_parser_once
Create a migration after model changes:
poetry run alembic -c app/alembic.ini revision --autogenerate -m "describe change"
Apply migrations:
poetry run alembic -c app/alembic.ini upgrade head
- API routes: app/api/v1/
- Application factory: app/api/main.py
- Runtime config: app/core/config.py
- Database models: app/core/models/
- CRUD layer: app/crud/
- Pydantic schemas: app/schemas/
- Auth/security helpers: app/services/, app/dependencies/
- Background tasks: app/tasks/
- Parser implementation: app/services/parser/
- Scheduled parser task: app/tasks/parser_tasks.py
- One-time parser queue script: app/scripts/run_parser_once.py
- Migrations: alembic/versions/