Releases: SahilKumar75/TuneOS
Releases · SahilKumar75/TuneOS
v0.2.0 — Modal Cloud GPU, DPO Training, Phase 4 Feature Complete
What's New in v0.2.0
This release brings TuneOS to feature-complete Phase 4, adding cloud GPU training, DPO preference learning, advanced trainer capabilities, and a completely refreshed UI.
🚀 New Features
Cloud GPU Training (Modal.com)
- Jobs can now run on a free Modal T4 GPU instead of the local device — ideal when no local GPU is available
- New Compute backend selector in Step 4 (Local GPU / Modal / HF Spaces)
workers/modal_runner.pyserialises the dataset, runs the full training pipeline remotely, and streams adapter + eval metrics back to local disk- Enable by setting
MODAL_TOKEN_ID+MODAL_TOKEN_SECRET;modalis optional (poetry install --with modal)
DPO Preference Training (P4-C)
- Full DPO recipe wired end-to-end through trainer, API, and UI
- Step 4 now exposes DPO hyperparameters; dataset step validates preference columns before build
- TRL
DPOTrainerintegration with configurablebetaand reference model
Advanced Trainer (P4-A / P4-B)
- Prompt template registry — Alpaca, ChatML, Llama-3, Phi-3, and raw formats; sample packing for throughput
- Configurable
report_tofor HF experiment-tracker integration - Flash Attention 2 / SDPA and RoPE scaling plumbed into model loading
- LoRA init strategy exposed via
init_lora_weights - Extended architecture support: Qwen3, Phi-4, Cohere, OLMo, Mixtral, MPT, StarCoder2, GPT-BigCode
use_4bitnow optional instead of forced; all checkpoints usesafe_serialization=True
Distillation, FSDP & INT8 Export (P4-F)
- Knowledge distillation from a teacher model
- FSDP (Fully Sharded Data Parallel) for multi-GPU training
- Hyperparameter sweep support
- INT8 export via
bitsandbytes
Richer Eval Metrics & Live Streaming (P4-D / P4-E)
- ROUGE-1 and BLEU computed from held-out evaluation sample after training
- Live Modal training log streaming in Step 5
- Eval metrics persisted to SQLite and surfaced in Step 6 Results
Multi-run Compare & Benchmark (P4-F)
- New compare page with overlaid loss curves across runs
lm-evalbenchmark wrapper for standardised model evaluation
API Hardening (P4-A)
- LRU inference model cache (
maxsize=3) — avoids GB-scale model reloads GET /api/jobspagination (default 50, max 500)GET /api/gpunow reportsdevice_count,vram_total_gb,vram_free_gb,cuda_version- Eval metrics fall back to durable SQLite when Redis copy has expired
PostgreSQL Experiment Backend
- Set
EXPERIMENTS_DB_URLto apostgresql://DSN to share experiment data across workers - All upserts migrated to portable
ON CONFLICT … DO UPDATE(SQLite 3.24+ and PostgreSQL compatible)
UI Polish
- Chat panel consolidated into a single, cleaner component
- Step 5/6 surfaces all eval metrics + Modal badge
- Register-to-model-registry card on Results step
- Hyperparameter comparison table (Technique, LR, LoRA r, Batch columns)
🔧 CI Improvements
- Monolithic CI split into dedicated path-scoped jobs (lint, test-core, test-ui, docker, docs)
- Docker build significantly faster via BuildKit cache injection for pip/poetry downloads
mypystatic typing added to lint job
📖 Docs
- Quickstart updated for Celery fallback,
HF_TOKEN, and Modal cloud GPU docs/api.mddocumentsGET /api/health/workersandcompute_backenddocs/DEPLOY.mdgains full.envreference tabledocs/supported-models.mdreflects autotarget_modulesdetection
🔄 Changed
- State data models now use
pydantic.BaseModel(Reflexrx.Baseremoved in newer versions) - Minimum
trlversion raised to>=0.12.0
Full Changelog: v0.1.0...v0.2.0