Port to PyTorch 2.x / CUDA 12.x / Python 3.12 #53
Open
ilessiorobotflowlabs wants to merge 1 commit into
Complete compatibility port for modern stack:
- PyTorch 2.10+, CUDA 12.8, Python 3.12, Pillow 12, NumPy 2.x
Core changes:
- torch.cuda.amp.autocast → torch.amp.autocast('cuda') across all files
- torch.cuda.amp.GradScaler → torch.amp.GradScaler('cuda')
- torch._six.inf → math.inf
- pkg_resources → importlib.resources
- weights_only=False for legacy LDM checkpoints
- Deferred imports for optional deps (gradio, nltk)
CUDA C++ (Mask2Former deformable attention):
- Tensor.data<T>() → data_ptr<T>() (removed in PyTorch 2.x)
- AT_ERROR → TORCH_CHECK(false, ...)
- Removed deleted ATen/cuda/CUDAApplyUtils.cuh include
- Added gpuAtomicAdd wrapper with BFloat16/Half specializations
- Removed -D__CUDA_NO_HALF* flags for fp16 support
- use_reentrant=False in gradient checkpointing
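The checkpointing change can be sketched as follows (the layer is illustrative); PyTorch 2.x warns when `use_reentrant` is not passed explicitly, and the non-reentrant path is the recommended one:

```python
import torch
from torch.utils.checkpoint import checkpoint

layer = torch.nn.Linear(4, 4)
x = torch.randn(2, 4, requires_grad=True)

# Passing use_reentrant=False silences the deprecation warning and uses
# the non-reentrant implementation, which supports more autograd features.
y = checkpoint(layer, x, use_reentrant=False)
y.sum().backward()
```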
Bug fixes found via code review:
- Fixed NameError on non-main DDP workers (writers variable)
- Fixed OmegaConf crash with undeclared enable_visualizer key
- Fixed inverted autocast logic in msdeformattn.py
- Fixed = instead of += for demo_stuff_colors (module global mutation)
- Fixed bare except: catching KeyboardInterrupt
- Fixed file handle leak in default_setup
- Fixed shell injection via $CXX in collect_env
- Fixed operator precedence bug in extract_features.py
- Added torch.meshgrid indexing='ij' to silence deprecation
- NumPy 2.x int casts for np.linspace throughout
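Two of the deprecation fixes above can be illustrated in a few lines (the values are arbitrary):

```python
import numpy as np
import torch

# torch.meshgrid now wants an explicit indexing argument; 'ij' matches
# the old default behaviour.
ys, xs = torch.meshgrid(torch.arange(3), torch.arange(4), indexing="ij")

# NumPy 2.x: np.linspace's count must be a real integer, so float-valued
# counts (e.g. produced by true division) need an explicit int() cast.
steps = 10 / 2                        # float under Python 3 division
pts = np.linspace(0.0, 1.0, int(steps))
```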
Third-party:
- pytorch_lightning.utilities.distributed → .rank_zero
- PIL.Image.LINEAR → Image.BILINEAR
- Gradio 3.x → 4.x API migration in demo/app.py
- Removed detectron2 v0.6 hard pin in Mask2Former/setup.py
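The Pillow change amounts to swapping the removed alias for the surviving one (image size is arbitrary):

```python
from PIL import Image

# Pillow 10 removed the bare Image.LINEAR alias; Image.BILINEAR (also
# available as Image.Resampling.BILINEAR) is the portable spelling.
img = Image.new("RGB", (64, 64))
resized = img.resize((32, 32), resample=Image.BILINEAR)
```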
Validated: all imports, CUDA ops (fp32+fp16), config loading, LDM,
demo inference on 4 images — zero errors on 8xL4 GPU server.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Complete compatibility port of ODISE to the modern PyTorch ecosystem. All 42 files updated, validated end-to-end on 8xL4 GPU server with zero errors.
Stack: PyTorch 2.10 · CUDA 12.8 · Python 3.12 · Pillow 12 · NumPy 2.x · Gradio 4.x · pytorch-lightning 2.x
Changes
PyTorch 2.x API migrations:
- torch.cuda.amp.autocast → torch.amp.autocast('cuda') (10 files)
- torch.cuda.amp.GradScaler → torch.amp.GradScaler('cuda')
- torch._six.inf → math.inf
- pkg_resources → importlib.resources
- torch.load(..., weights_only=False) for legacy LDM checkpoints
- torch.meshgrid(..., indexing='ij') to silence deprecation
- use_reentrant=False in gradient checkpointing

CUDA C++ (Mask2Former deformable attention):
- Tensor.data<T>() → Tensor.data_ptr<T>() (14 call sites, removed in PyTorch 2.x)
- AT_ERROR → TORCH_CHECK(false, ...)
- Removed the deleted ATen/cuda/CUDAApplyUtils.cuh include
- Added a gpuAtomicAdd wrapper with BFloat16/Half specializations
- Removed the -D__CUDA_NO_HALF* build flags (no longer needed in PyTorch 2.x)

NumPy 2.x:
- np.int → np.int64, np.bool → np.bool_
- int() wrapping for np.linspace count args

Third-party compatibility:
- pytorch_lightning.utilities.distributed → .rank_zero
- PIL.Image.LINEAR → Image.BILINEAR (removed in Pillow 10+)
- Gradio 3.x → 4.x API migration in demo/app.py
- Removed the detectron2==0.6 pin in Mask2Former/setup.py

Bug fixes found during port:
- NameError on non-main DDP workers (writers variable unbound)
- Undeclared enable_visualizer key crashing OmegaConf struct mode
- Inverted autocast logic in msdeformattn.py (re-enabled AMP during inference instead of disabling it)
- = instead of += for demo_stuff_colors (mutated a module-level constant)
- Bare except: catching KeyboardInterrupt in the CUDA op fallback
- File handle leak in default_setup
- Shell injection via $CXX in collect_env.py (shell=True → list args)
- Operator precedence bug in extract_features.py

Validation
Tested on 8x NVIDIA L4 (CUDA 12.8, PyTorch 2.10, Python 3.12): all imports, CUDA ops (fp32 and fp16), config loading, LDM checkpoint loading, and demo inference on 4 images, with zero errors.
Motivation
ODISE is an excellent open-vocabulary panoptic segmentation model, but the original codebase targets PyTorch 1.x / CUDA 11.x, which makes it unusable on modern GPU infrastructure. This PR brings full compatibility with current-generation hardware and software, making ODISE accessible to researchers and practitioners on modern setups.
Ported by RobotFlow Labs 🤖
🤖 Generated with Claude Code