[codex] Update Torch and Lightning for Python 3.9 by jbohnslav · Pull Request #180 · jbohnslav/deepethogram

jbohnslav · 2026-07-03T22:15:12Z

Summary

Bump Torch to the highest release that still supports Python 3.9: torch==2.8.0.
Bump PyTorch Lightning to the highest release that still supports Python 3.9: pytorch_lightning==2.6.0.
Move Linux Torch wheels from cu124 to cu126 and update the Modal GPU test image to CUDA 12.6.3.
Port training setup to Lightning 2 APIs for Trainer devices, Tuner usage, progress bars, and callback hook signatures.
Make TensorBoard an explicit dependency and update checkpoint loading for Torch 2.6+ weights_only behavior.
Remove the obsolete ReduceLROnPlateau(verbose=...) kwarg rejected by Torch 2.8.

Validation

uv lock --check --python 3.9 --no-python-downloads
uv run ruff check deepethogram/base.py deepethogram/callbacks.py deepethogram/losses.py deepethogram/schedulers.py deepethogram/utils.py modal_gpu_test.py
uv run pytest -> 28 passed, 5 deselected
uvx --from modal modal run --detach modal_gpu_test.py --write-junit pytest-gpu.xml -> 5 passed on CUDA 12.6.3 / Torch 2.8.0+cu126

Modal run: https://modal.com/apps/jbohnslav/main/ap-uM8UCZmZzvAZCKx2w6Dsgt

claude · 2026-07-03T22:15:26Z

Claude finished @jbohnslav's task in 3m 14s —— View job

Review of PR #180 — Update Torch/Lightning for Python 3.9

Gather context (PR diff against correct merge-base 42a21c5^)
Read CLAUDE.md for repo guidelines (none present at root)
Review base.py (Lightning 2 Trainer/Tuner APIs)
Review callbacks.py (callback hook signatures)
Review losses.py / schedulers.py / utils.py
Review pyproject.toml / uv.lock dependency bumps
Review tests/test_callbacks.py
Post final review feedback

Summary

This is a clean, well-scoped dependency bump with a correct port to the Lightning 2 API. Verified each hunk against the actual PR diff (git diff 895154c HEAD, which matches the stated +414/-326 file stats).

Correctness checks — all look right:

deepethogram/base.py:40-43 — new trainer_device_kwargs() correctly replaces the removed gpus=[...] kwarg with accelerator/devices, and gracefully falls back to CPU when CUDA isn't available (an improvement over the old code, which would've hard-failed on CPU-only environments).
Tuner import from pytorch_lightning.tuner (replacing pl.tuner.tuning.Tuner) and use of trainer.tuner → a persisted local tuner variable — correct for Lightning 2, which removed Trainer.tuner.
TQDMProgressBar(refresh_rate=...) + enable_progress_bar=refresh_rate > 0 correctly replaces the removed progress_bar_refresh_rate kwarg (base.py:392-406).
callbacks.py hook signatures correctly match Lightning 2: dataloader_idx dropped from on_train_batch_start/end (single train loader assumption), kept with =0 default on validation/test hooks, on_keyboard_interrupt → on_exception with an isinstance(exception, KeyboardInterrupt) guard, and removal of the deprecated on_init_end/on_epoch_start/on_epoch_end hooks.
tests/test_callbacks.py adds solid coverage for the new on_exception behavior on MetricsCallback and CheckpointCallback (both the keyboard-interrupt path and the "ignore other exceptions" path).
torch.load(..., weights_only=False) in losses.py/utils.py correctly preserves pre-2.6 behavior for checkpoints that bundle non-tensor training metadata, now that Torch 2.6+ defaults weights_only=True.
Removed ReduceLROnPlateau(verbose=True) — correctly matches Torch 2.8 removing that kwarg. Only side effect is losing the automatic per-step LR-change log line, which is a minor, acceptable trade-off.
I confirmed there are no other stale torch.load call sites, gpus=[...] usages, or removed-hook usages left elsewhere in the codebase.

Minor nit (non-blocking):

deepethogram/base.py:285-286 — the comment above the tuning branch still says reload_dataloaders_every_epoch = True: a bit slower..., but the code (both before and after this PR) actually uses the newer reload_dataloaders_every_n_epochs kwarg. Stale terminology, purely cosmetic. Fix this →

Note on weights_only=False: this restores the pre-Torch-2.6 default rather than introducing new risk, but it's worth keeping in mind that it disables Torch's newer guard against unsafe unpickling for any pretrained-weight files sourced from outside the project (e.g. user-shared checkpoints). Not a regression from this PR — just flagging for awareness since Torch's changed default was a deliberate security hardening step.

No functional bugs found. Given the validation already performed (unit tests, ruff, and a real GPU run on Modal with CUDA 12.6.3 / Torch 2.8.0+cu126), this looks ready to merge.

cursor · 2026-07-04T01:26:31Z

Bugbot is not enabled for your account, so this pull request was not reviewed.

Enable Bugbot in the Cursor dashboard to get automatic reviews on future PRs.

Update Torch and Lightning for Python 3.9

42a21c5

Fix Lightning exception callbacks

12822d5

jbohnslav marked this pull request as ready for review July 4, 2026 01:26

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[codex] Update Torch and Lightning for Python 3.9#180

[codex] Update Torch and Lightning for Python 3.9#180
jbohnslav wants to merge 2 commits into
masterfrom
codex/update-torch-lightning-py39

jbohnslav commented Jul 3, 2026

Uh oh!

claude Bot commented Jul 3, 2026 •

edited

Loading

Uh oh!

cursor Bot commented Jul 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

jbohnslav commented Jul 3, 2026

Summary

Validation

Uh oh!

claude Bot commented Jul 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review of PR #180 — Update Torch/Lightning for Python 3.9

Summary

Uh oh!

cursor Bot commented Jul 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

claude Bot commented Jul 3, 2026 •

edited

Loading