ianbakst/image-classification

Quantivly

Quantivly is a REST API platform for training and serving medical image classification models on MedMNIST datasets. It lets you browse and download any MedMNIST dataset, train a CNN on it, and classify new images.

Architecture

| Component | Technology |
| --- | --- |
| API server | FastAPI + Uvicorn |
| Background jobs | arq (asyncio job queue) |
| Job broker | Redis |
| Database | PostgreSQL (SQLModel / asyncpg) |
| Object storage | S3-compatible (AWS S3, MinIO, etc.) |
| Package manager | uv |

Prerequisites

Getting Started

1. Clone the repository

git clone https://github.com/ianbakst/quantivly.git
cd quantivly

2. Configure environment variables

Create a .env file in the project root. The application reads it automatically via pydantic-settings.

# S3-compatible object storage
AWS__BUCKET_NAME=quantivly
AWS__ACCESS_KEY_ID=minioadmin
AWS__SECRET_ACCESS_KEY=minioadmin
AWS__ENDPOINT_URL=http://minio:9000
AWS__REGION_NAME=us-east-1

# PostgreSQL
DB__USERNAME=quantivly
DB__PASSWORD=quantivly
DB__DB_NAME=quantivly
DB__HOST=postgres
DB__PORT=5432

# Redis (arq job broker)
REDIS_URL=redis://redis:6379

| Variable | Description |
| --- | --- |
| AWS__BUCKET_NAME | S3 bucket for datasets and model weights |
| AWS__ACCESS_KEY_ID | S3 access key ID |
| AWS__SECRET_ACCESS_KEY | S3 secret access key |
| AWS__ENDPOINT_URL | S3 endpoint URL (use your MinIO or AWS regional endpoint) |
| AWS__REGION_NAME | S3 region (default: us-east-1) |
| DB__USERNAME | PostgreSQL username |
| DB__PASSWORD | PostgreSQL password |
| DB__DB_NAME | PostgreSQL database name |
| DB__HOST | PostgreSQL hostname |
| DB__PORT | PostgreSQL port (default: 5432) |
| REDIS_URL | Redis DSN used by both the API and the arq worker |
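
To illustrate how the double-underscore names group into nested settings, here is a stdlib-only sketch. It is not the project's settings code (which reads the file via pydantic-settings). The helper names `parse_env`, `nest`, and `postgres_dsn` are hypothetical, and the `postgresql+asyncpg://` URL form is an assumption about how the connection string might be assembled.

```python
from urllib.parse import quote


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def nest(flat: dict[str, str], delimiter: str = "__") -> dict:
    """Group DB__HOST-style keys into {"db": {"host": ...}}."""
    nested: dict = {}
    for key, value in flat.items():
        group, sep, field = key.partition(delimiter)
        if sep:
            nested.setdefault(group.lower(), {})[field.lower()] = value
        else:
            # No delimiter (e.g. REDIS_URL): keep as a top-level setting.
            nested[key.lower()] = value
    return nested


def postgres_dsn(db: dict[str, str]) -> str:
    """Build a connection URL (assumed format, for illustration only)."""
    port = db.get("port", "5432")  # documented default
    user = quote(db["username"], safe="")
    password = quote(db["password"], safe="")
    return f"postgresql+asyncpg://{user}:{password}@{db['host']}:{port}/{db['db_name']}"


EXAMPLE_ENV = """
# PostgreSQL
DB__USERNAME=quantivly
DB__PASSWORD=quantivly
DB__DB_NAME=quantivly
DB__HOST=postgres
REDIS_URL=redis://redis:6379
"""

settings = nest(parse_env(EXAMPLE_ENV))
print(postgres_dsn(settings["db"]))
```

Because `DB__PORT` is left out of the example, the sketch falls back to the documented default of 5432.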

Building and Running

docker compose up --build

This starts the following services:

| Service | Description | Default port |
| --- | --- | --- |
| api | FastAPI application (Uvicorn) | 8000 |
| worker | arq background worker | |
| postgres | PostgreSQL database | 5432 |
| redis | Redis job broker | 6379 |

The interactive API docs are available at http://localhost:8000/docs once the stack is running.

Workflows

a. Browse available remote datasets

curl http://localhost:8000/dataset/remote

Returns a JSON array of MedMNIST dataset names, e.g. ["pathmnist", "chestmnist", "dermamnist", ...].

b. Download a dataset

curl -X POST http://localhost:8000/dataset/remote/pathmnist/download
{"workflow_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"}

The dataset is downloaded from the MedMNIST CDN and uploaded to S3 in the background. Use the returned workflow_id to track progress.

c. Check download progress

curl http://localhost:8000/workflow/a1b2c3d4-e5f6-7890-abcd-ef1234567890
{
  "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "status": "completed",
  "task": "download_dataset",
  "dataset_name": "pathmnist",
  "arq_job_id": "arq:job:abc123",
  "s3_key": null,
  "error": null,
  "created_at": "2026-03-24T12:00:00Z",
  "updated_at": "2026-03-24T12:01:30Z"
}

Possible status values: pending, running, completed, failed, cancelled.
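
If you would rather script the progress check than re-run curl by hand, a polling loop along these lines may help. It is a minimal sketch: the function names, the polling interval, and the timeout are assumptions, and only the endpoint path and the status values come from this README.

```python
import json
import time
import urllib.request
from typing import Callable

# Statuses after which the workflow will not change any more.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def fetch_workflow(workflow_id: str, base_url: str = "http://localhost:8000") -> dict:
    """GET /workflow/{id} and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url}/workflow/{workflow_id}") as response:
        return json.load(response)


def wait_for_workflow(
    workflow_id: str,
    fetch: Callable[[str], dict] = fetch_workflow,
    interval: float = 2.0,
    timeout: float = 600.0,
) -> dict:
    """Poll until the workflow reaches a terminal status or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        workflow = fetch(workflow_id)
        if workflow["status"] in TERMINAL_STATUSES:
            return workflow
        if time.monotonic() >= deadline:
            raise TimeoutError(f"workflow {workflow_id} is still {workflow['status']}")
        time.sleep(interval)
```

The `fetch` argument exists so the loop can be exercised without a running stack, for example by passing a function that returns canned responses.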

d. Browse available architectures

curl http://localhost:8000/model/
[
  {
    "name": "CNN",
    "parameters": [
      {"name": "in_channels", "type": "int", "default": null, "required": true},
      {"name": "num_classes", "type": "int", "default": null, "required": true},
      {"name": "dropout", "type": "float", "default": "0.3", "required": false}
    ]
  }
]
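
Note that the response serialises defaults as strings ("0.3") alongside a type name. A client that wants concrete constructor arguments could resolve them roughly as follows. This is a hypothetical helper, not part of the API, and the cast table covers only the type names visible in the example above plus `str`.

```python
# Map the type names used in the response to Python casts (assumed mapping).
CASTS = {"int": int, "float": float, "str": str}


def resolve_parameters(schema: list[dict], supplied: dict) -> dict:
    """Fill in defaults and cast values according to a parameter schema."""
    unknown = set(supplied) - {param["name"] for param in schema}
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    resolved = {}
    for param in schema:
        name, cast = param["name"], CASTS[param["type"]]
        if name in supplied:
            resolved[name] = cast(supplied[name])
        elif param["required"]:
            raise ValueError(f"missing required parameter: {name}")
        elif param["default"] is not None:
            resolved[name] = cast(param["default"])
    return resolved


# Schema copied from the example response above.
CNN_SCHEMA = [
    {"name": "in_channels", "type": "int", "default": None, "required": True},
    {"name": "num_classes", "type": "int", "default": None, "required": True},
    {"name": "dropout", "type": "float", "default": "0.3", "required": False},
]

# The supplied values here are illustrative only.
print(resolve_parameters(CNN_SCHEMA, {"in_channels": 3, "num_classes": 9}))
```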

e. Start training

First, note the dataset UUID from GET /dataset/ (after the download completes):

curl http://localhost:8000/dataset/

Then start a training job:

curl -X POST http://localhost:8000/model/CNN/train \
  -H "Content-Type: application/json" \
  -d '{
    "dataset_id": "f7e6d5c4-b3a2-1098-fedc-ba9876543210",
    "learning_rate": 0.001,
    "batch_size": 64,
    "n_epochs": 10
  }'
{"workflow_id": "b2c3d4e5-f6a7-8901-bcde-f12345678901"}

f. Monitor training progress

curl http://localhost:8000/workflow/b2c3d4e5-f6a7-8901-bcde-f12345678901

The response has the same shape as the one shown in step c. Once status is completed, the trained model record appears in GET /trained-model/.

g. List trained models

curl http://localhost:8000/trained-model/
[
  {
    "id": "c3d4e5f6-a7b8-9012-cdef-123456789012",
    "architecture": "CNN",
    "dataset_id": "f7e6d5c4-b3a2-1098-fedc-ba9876543210",
    "s3_key": "models/CNN/pathmnist.pth",
    "accuracy": 0.9123,
    "f1": 0.9045,
    "auc": 0.9876,
    "confusion_matrix": [[...], ...],
    "training_params": {"learning_rate": 0.001, "batch_size": 64, "n_epochs": 10},
    "created_at": "2026-03-24T12:05:00Z"
  }
]

h. Classify an image

curl -X POST http://localhost:8000/trained-model/c3d4e5f6-a7b8-9012-cdef-123456789012/inference \
  -F "image=@/path/to/your/image.png"
{
  "predicted_class": 3,
  "probabilities": {
    "0": 0.01,
    "1": 0.02,
    "2": 0.04,
    "3": 0.91,
    "4": 0.02
  }
}

The image is normalised to [-1, 1] and converted to the channel mode (grayscale or RGB) that matches the dataset the model was trained on.
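
The preprocessing described above can be pictured with a small pure-Python sketch. It is not the project's code: the use of a 0.5 mean and 0.5 standard deviation to reach [-1, 1], the luma weights for RGB-to-grayscale conversion, and all function names are assumptions made for illustration.

```python
def normalise(value: int) -> float:
    """Map an 8-bit intensity in [0, 255] to [-1.0, 1.0]."""
    return (value / 255.0 - 0.5) / 0.5


def convert_pixel(pixel, channels: int) -> tuple:
    """Convert one pixel (int for grayscale, 3-tuple for RGB) to `channels` values."""
    values = (pixel,) if isinstance(pixel, int) else tuple(pixel)
    if len(values) == channels:
        return values
    if channels == 1:
        # RGB to grayscale using the common ITU-R 601 luma weights (assumed).
        r, g, b = values
        return (round(0.299 * r + 0.587 * g + 0.114 * b),)
    # Grayscale to RGB: repeat the single intensity on every channel.
    return values * 3


def preprocess(rows: list, channels: int) -> list:
    """Return a channels-first nested list [C][H][W] of normalised values."""
    planes = [[] for _ in range(channels)]
    for row in rows:
        converted = [convert_pixel(pixel, channels) for pixel in row]
        for c in range(channels):
            planes[c].append([normalise(px[c]) for px in converted])
    return planes


# A 1x2 grayscale image fed to a model trained on RGB data.
print(preprocess([[0, 255]], channels=3))
```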

Cancelling a Job

Any pending or running workflow can be cancelled:

curl -X DELETE http://localhost:8000/workflow/b2c3d4e5-f6a7-8901-bcde-f12345678901/cancel

The job is aborted in Redis and the workflow status is set to cancelled. Attempting to cancel a job that is already completed, failed, or cancelled returns HTTP 422.
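
The cancellation rule amounts to a small state check. The sketch below captures only that rule; the exception class and function names are hypothetical, and the Redis abort step is deliberately left out.

```python
# Only workflows in these states may be cancelled.
CANCELLABLE_STATUSES = {"pending", "running"}


class WorkflowNotCancellable(Exception):
    """Raised for workflows that are already completed, failed, or cancelled."""

    status_code = 422  # the HTTP status the API reports in this case


def cancel(workflow: dict) -> dict:
    """Return a copy of the workflow marked cancelled, or raise if not allowed."""
    if workflow["status"] not in CANCELLABLE_STATUSES:
        raise WorkflowNotCancellable(
            f"workflow {workflow['id']} is already {workflow['status']}"
        )
    return {**workflow, "status": "cancelled"}
```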

Choosing the Model Architecture

Achieving optimal model performance was not the goal of this exercise, so my research into architectures was limited. From what I found, CNN-based architectures are appropriate for these datasets. I took inspiration from models built for the MNIST handwritten-digit dataset, then chose a slightly more complex architecture. I kept it small enough to train on a CPU in a short time, since the aim was to prove that the pipeline works. The resulting network has 3 convolutional layers followed by a final AdaptiveAvgPool layer, which reduces the spatial dimensions to a fixed size so that different image sizes are handled flexibly. Given more time, I would research the architectures that the literature recommends for this problem and implement those first.
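
To show why adaptive average pooling makes the network indifferent to input size, here is a pure-Python sketch of the operation on a single feature map. The window rule (floor for the start index, ceiling for the end index) is, to my understanding, the one PyTorch's adaptive pooling uses. The 1x1 output size and the input sizes below are assumptions for illustration, not values taken from the project.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division."""
    return -(-a // b)


def adaptive_avg_pool2d(feature_map: list, output_size: tuple) -> list:
    """Average-pool an HxW nested list down to a fixed (out_h, out_w) grid."""
    in_h, in_w = len(feature_map), len(feature_map[0])
    out_h, out_w = output_size
    pooled = []
    for i in range(out_h):
        # Row window for output row i: [floor(i*H/out_h), ceil((i+1)*H/out_h)).
        r0, r1 = (i * in_h) // out_h, ceil_div((i + 1) * in_h, out_h)
        row = []
        for j in range(out_w):
            c0, c1 = (j * in_w) // out_w, ceil_div((j + 1) * in_w, out_w)
            window = [feature_map[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(window) / len(window))
        pooled.append(row)
    return pooled


# Two differently sized feature maps collapse to the same output shape.
small = [[1.0] * 28 for _ in range(28)]
large = [[1.0] * 64 for _ in range(64)]
print(adaptive_avg_pool2d(small, (1, 1)), adaptive_avg_pool2d(large, (1, 1)))
```

Because the pooled output has the same shape regardless of input size, a fully connected classifier placed after it always receives a vector of the same length.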

Next Steps

As with all projects, the perfect cannot be the enemy of the good. This project was time-boxed, so I did not get to everything I would have liked.

At a high level, with more time I would've implemented the following:

  • Training already happens on its own worker, but I would've optimized that worker specifically for training models. If this were actually deployed to the cloud, I would've provisioned accelerated hardware (GPUs) for the training image.
  • On that note, inference is served through the main app, which isn't ideal. With more time I would've built a more robust serving system that would promote a model to its own service. I would then make that served model available to handle batch classification as well as single-image classification.
  • The bulk of the time of this project was spent setting up the environment to reliably load data and train on that data. Not too much time was spent investigating model architectures and their performance.
  • I would add the ability to manage and add metrics.
  • I would also have loved to make the model architecture composable via REST endpoints, though this would require significant effort.
  • Transforms are currently baked into the training pipeline and are minimal (just what's needed to get the data into a usable format). With more time I would like to extract transforms out as their own configurable objects that can be added to either the download step, a separate preprocessing step (I'd need to make this), or a configurable piece of the training invocation.
  • I would implement logging and tracing to more than just stdout. I would write log files somewhere that is accessible, either a logging/tracing platform or files linked to the model/training run. This way, there's a full trace of what happened in the development and performance of a model.
  • Lastly, I would research and implement an optimal model for these classification problems. This would require reading the literature on the field and on these datasets, followed by training, hyperparameter tuning, and performance comparison before serving a production-ready model.
