glzn (/²ɡlɪsːn/) is a minimal library to facilitate rapid training and inference of ML / AI models for research, developed by the Digital Signal Processing and Image Analysis Group at the Institute of Informatics at the University of Oslo.
The name glzn is a textese disemvowelment of glissen, meaning sparse in Norwegian.
glzn is packaged in submodules, each designed to facilitate a specific role in running machine learning experiments with PyTorch.
- glzn.data: WDS format data wrapper for handling data of different modalities.
- glzn.log: local logging functionality; can be paired with Aim for more extensive reporting.
- glzn.cfg: config and argparse module to handle experiments with pydantic.
- glzn.proc: training and validation processor for runs, providing optimization in a neatly packaged context manager.
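As a rough illustration of the role glzn.cfg plays (typed experiment configs that can be overridden from the command line), here is a minimal standard-library sketch. It is not glzn's API: `RunConfig`, `build_parser`, and `resolve` are hypothetical names, dataclasses stand in for pydantic to keep the example dependency-free, and the precedence order (defaults, then file values, then CLI flags) is an assumption.

```python
import argparse
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical experiment config; the field names are illustrative only."""
    lr: float = 1e-3
    batch_size: int = 256
    epochs: int = 100


def build_parser() -> argparse.ArgumentParser:
    """Create one CLI flag per config field, typed from the field annotation."""
    parser = argparse.ArgumentParser(description="toy experiment config")
    for f in fields(RunConfig):
        # default=None lets us tell "flag not given" apart from an explicit value.
        flag = "--" + f.name.replace("_", "-")
        parser.add_argument(flag, type=f.type, default=None)
    return parser


def resolve(file_values: dict, argv: list) -> RunConfig:
    """Assumed precedence, lowest to highest: defaults, file values, CLI flags."""
    args = vars(build_parser().parse_args(argv))
    cli = {k: v for k, v in args.items() if v is not None}
    merged = {**file_values, **cli}

    known = {f.name: f.type for f in fields(RunConfig)}
    unknown = set(merged) - set(known)
    if unknown:
        raise KeyError(f"unknown config keys: {sorted(unknown)}")

    # Coerce each value to its declared type, a crude stand-in for validation.
    typed = {name: known[name](value) for name, value in merged.items()}
    return RunConfig(**typed)


# File values override the defaults; the CLI flag overrides the file value.
cfg = resolve({"lr": 0.01, "epochs": 30}, ["--epochs", "5"])
print(cfg)  # RunConfig(lr=0.01, batch_size=256, epochs=5)
```

Passing `default=None` to every flag is what makes the precedence work: only flags the user actually typed end up overriding the file values.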
Planned submodules:
- glzn.optim: Commonly used optimizers for large scale image training.
- glzn.aug: Augmentation factories for commonly used setups.
- glzn.parse: Run factories for common supervised / self-supervised vision pipelines.
glzn is designed to stay minimal and efficient on HPC resources, and minimalism is what drives its development.
pip install git+https://github.com/dsbifi/glzn.git
pip install git+ssh://git@github.com/dsbifi/glzn.git
- `data` submodule for data handling.
  - iTar implementation.
  - Basic grouping support.
  - Stem search and extraction.
  - Improved extension filtering.
  - Low overhead stateful sampling capability.
  - Add-ons (low priority):
    - Add optional encoders.
      - blosc2-openhtj2k.
      - pillow-jxl-plugin.
      - Additional video codecs.
      - Seismic data support.
    - Add encoder based grouping format.
    - Collator factory with support for NamedTuple or dict from Dataloader.
- `aug` submodule for augmentations.
  - Standard ViT Augmentations.
  - DEIT3 Augmentations.
  - DINO / iBOT Augmentations.
  - DINOv2 / v3 support.
  - MAE Augmentations.
- `cfg` submodule for config declaration.
  - Pydantic type verification.
  - Precedence logic.
- `parse` module for a modular approach to central config / run parsing.
  - LLRD parsing support.
  - Factories for creating runs for supervised training.
    - IN1k training.
    - IN22k training.
    - COCO Segmentation training.
    - COCO Detection / Instance Seg. training.
  - Factories for creating runs for self-supervised training.
    - DINO (no MIM)
    - iBOT / DINOv2 / DINOv3
    - MAE / MIMR (MIM Refiner)
- `log` module for rudimentary logging to jsonl and stdout.
  - Basic logging support.
  - Add-ons (medium priority):
    - Add Aim support.
    - W&B support (low priority, locks users into pay-to-use)
- `optim` module with commonly used optimizers not covered by PyTorch.
  - cAdamW, StableAdamW, cStableAdamW.
  - LAMB, cLAMB.
  - Flags / registry for adaptive selection of gradient clipping (based on optimizer functionality).
  - Add-ons (medium priority):
    - Scion.
- `proc` submodule for train / validation processing and wrappers.
  - Simple `ema` wrapper.
  - Simple `sched` module.
    - Meta style precomputed array based schedulers.
  - `wrap` module for wrapping scheduled events.
  - `step` module, tracks relevant training / validation phases.
    - `StepState` class, for immutables.
    - `StepTelemetry` class (clock for run start, etc.).
    - `StepTracker` class for full experiment tracking.
  - Main `proc` module for context-based batch processing.
    - Gradient clipping support.
    - Gradient accumulation support, in conjunction with `step`.
    - AMP support / gradient scaling.
    - Scheduling support through `wrap`.
    - Simple logging via `log` + `step` modules.
    - Context manager implementation.
- Simple