AML: Pneumonia Classification via Chest X-Rays

Repository for the Applied Machine Learning course (WBAI065-05) at the University of Groningen.

Development

We use uv for project management.

Clone the project.
Synchronise the project.

uv sync

Create a copy of example.config.yaml and rename it to config.yaml. Update the configuration, if desired.
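The copy step can also be scripted. The following is a minimal sketch, not part of the repository: the helper name `ensure_config` is hypothetical, and it assumes both files live in the project root.

```python
import shutil
from pathlib import Path


def ensure_config(project_root: Path) -> Path:
    """Copy example.config.yaml to config.yaml unless config.yaml already exists.

    Hypothetical helper for illustration; it assumes both files sit in project_root.
    """
    example = project_root / "example.config.yaml"
    config = project_root / "config.yaml"
    if not config.exists():
        # copyfile raises FileNotFoundError if the example file is missing.
        shutil.copyfile(example, config)
    return config
```

Calling `ensure_config(Path("."))` from the project root leaves an existing config.yaml untouched, so local changes are not overwritten.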

Command Line Interface (CLI)

The project can be run via a CLI, for convenient usage and testing.

Downloading Data

Option 1: Download Script using Kaggle API

uv run -m src.data.download [--force]

--force: Forces a redownload of the data, in the event of missing or corrupted raw data. Defaults to False.

This requires a Kaggle API token to be set up on your device: https://www.kaggle.com/settings/api
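Before running the download script, it can help to check that credentials are discoverable. The sketch below looks in the places the Kaggle API client conventionally reads: the `KAGGLE_USERNAME` and `KAGGLE_KEY` environment variables, or a `kaggle.json` file in `~/.kaggle` (or in the directory named by `KAGGLE_CONFIG_DIR`). The function name is hypothetical, and newer client versions may accept additional token mechanisms that this check does not cover.

```python
import os
from pathlib import Path


def kaggle_credentials_present(env=None, home=None) -> bool:
    """Return True if Kaggle credentials appear to be configured.

    Hypothetical helper: checks environment variables first, then kaggle.json.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    # Credentials supplied directly through the environment.
    if env.get("KAGGLE_USERNAME") and env.get("KAGGLE_KEY"):
        return True
    # Otherwise look for the token file in the configured or default directory.
    config_dir = Path(env.get("KAGGLE_CONFIG_DIR", home / ".kaggle"))
    return (config_dir / "kaggle.json").is_file()
```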

Option 2: Manual Download

Dataset: https://www.kaggle.com/datasets/tolgadincer/labeled-chest-xray-images

Extract the archive and place it in the directory DATA_DIR/raw/ (e.g. data/raw/<the extracted folder>).
Then run the download script to reorganise the data automatically.

Preprocessing and Feature Extraction

uv run -m src.features.preprocess_data [--pipeline] [--lgb-size]

--pipeline: Chooses which pipeline to run: pytorch, lightgbm, all. The lightgbm pipeline requires the pytorch pipeline to have been run first. Defaults to all.
--lgb-size: Determines the edge size for downsampling in LightGBM feature extraction. Defaults to 64.
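To illustrate what the edge size controls, here is a minimal sketch that downsamples a single-channel image to `size` x `size` and flattens it into a feature vector of length `size * size` (4096 for the default of 64). It is an assumption-laden illustration: it uses nearest-neighbour sampling on plain Python lists, whereas the repository's actual feature extraction may use a different resampling method or additional features.

```python
def downsample_and_flatten(image, size):
    """Nearest-neighbour downsample of a 2D grayscale image, flattened row-major.

    image: list of equal-length rows of pixel values.
    size:  target edge length (the role played by --lgb-size in this sketch).
    """
    height, width = len(image), len(image[0])
    features = []
    for r in range(size):
        src_r = r * height // size  # source row for target row r
        for c in range(size):
            src_c = c * width // size  # source column for target column c
            features.append(image[src_r][src_c])
    return features
```

A smaller edge size yields fewer features per image and faster tree training, at the cost of spatial detail.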

Training a Model

uv run -m src.training.train --model <model_name> [options]

--model: The model architecture to train: cnn, resnet, lgbm.
--epochs: Number of training epochs. Defaults dynamically.
--batch-size: Batch size for PyTorch models. Defaults to 32.
--lr: Learning rate. Defaults dynamically.
--patience: Epochs to wait for improvement before early stopping. Defaults to 3.
--num-leaves: Number of leaves for LightGBM. Defaults to 31.
--max-depth: Maximum tree depth for LightGBM. Defaults to -1.
--weight-decay: Weight decay for PyTorch models. Defaults to 0.0.
--device: Device for PyTorch models (cuda, mps, cpu). Defaults to auto-detection.
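The `--patience` option describes early stopping. A minimal sketch of that logic follows, assuming the monitored quantity is a validation loss where lower is better; the source does not state which metric the training script tracks, and the class name `EarlyStopping` is hypothetical.

```python
class EarlyStopping:
    """Stop training after `patience` consecutive epochs without improvement."""

    def __init__(self, patience=3):
        self.patience = patience
        self.best_loss = float("inf")
        self.epochs_without_improvement = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best_loss:
            # New best value: remember it and reset the counter.
            self.best_loss = val_loss
            self.epochs_without_improvement = 0
        else:
            self.epochs_without_improvement += 1
        return self.epochs_without_improvement >= self.patience
```

In a training loop, `step` would be called once per epoch after validation, and the loop would break as soon as it returns True.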

Cross-Validation

uv run -m src.training.cv --model <model_name> [options]

--model: The model to cross-validate: cnn, resnet, lgbm.
--splits: Number of folds (k). Defaults to 5.
--epochs: Number of training epochs. Defaults dynamically.
--batch-size: Batch size for PyTorch models. Defaults to 32.
--lr: Learning rate. Defaults dynamically.
--weight-decay: Weight decay for PyTorch models. Defaults to 0.0.
--device: Device for PyTorch models (cuda, mps, cpu). Defaults to auto-detection.
--grid-search: Enable hyperparameter grid search cross-validation.
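The options above combine two ideas: splitting the data into k folds and, with `--grid-search`, evaluating every combination of candidate hyperparameters. The sketch below shows both in plain Python. It is illustrative only: the function names are hypothetical, the folds are contiguous without shuffling or stratification, and the repository may rely on a library splitter instead.

```python
from itertools import product


def k_fold_indices(n_samples, n_splits=5):
    """Yield (train_indices, val_indices) for each of n_splits contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [
        n_samples // n_splits + (1 if i < n_samples % n_splits else 0)
        for i in range(n_splits)
    ]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size


def parameter_grid(grid):
    """Yield one dict per combination of the candidate values in grid."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))
```

With a grid such as `{"lr": [1e-3, 1e-4], "weight_decay": [0.0, 0.01]}` (hypothetical values), a grid search trains and validates once per fold for each of the four combinations, then compares the mean validation scores.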

TensorBoard Dashboard

uv run tensorboard --logdir logs/tensorboard

Running Tests

uv run pytest tests
