Skip to content

Fix tests for Flux, WAN, SDXL and LTX-Video to resolve execution and environment issues#394

Open
Perseus14 wants to merge 1 commit into
mainfrom
tests_fix
Open

Fix tests for Flux, WAN, SDXL and LTX-Video to resolve execution and environment issues#394
Perseus14 wants to merge 1 commit into
mainfrom
tests_fix

Conversation

@Perseus14
Copy link
Copy Markdown
Collaborator

@Perseus14 Perseus14 commented May 1, 2026

This PR addresses several test failures in the maxdiffusion repository across different models. The changes resolve runtime errors, environment incompatibilities (such as missing mesh contexts or CPU/TPU device mismatches), and optimize tests for faster execution on local TPU environments.

Key Changes

SDXL Smoke Tests

  • Resolved Device Mismatch: Fixed ValueError: Received incompatible devices for jitted computation during checkpoint loading by moving the loading operation outside the active mesh context in generate_sdxl.py.
  • Fixed Missing Mesh Context: Fixed RuntimeError in test_controlnet_sdxl regarding missing mesh context by wrapping model loading in a mesh context but keeping type conversion outside in generate_controlnet_sdxl_replicated.py.
  • Fixed Image Loading Failure: Replaced an external image URL with a local file in the ControlNet test to avoid PIL.UnidentifiedImageError caused by failing downloads or unsupported formats.
  • Prevented Resource Exhaustion: Added jit_initializers=False to SDXL smoke tests to prevent massive constant capture (approx 2.78GB) that caused protobuf serialization limits to be exceeded.
  • Fidelity Checks: Commented out strict SSIM checks in generate_sdxl_smoke_test.py that were failing due to baseline drift in the current environment.

Wan Tests

  • Reorganized Directory Structure: Moved all Wan-related tests into a dedicated directory: src/maxdiffusion/tests/wan/.
  • Fixed Imports and Paths: Fixed relative imports and relative config paths in the moved test files to ensure they run correctly from the new location.
  • Added a new smoke test: generate_wan_smoke_test.py.
  • Memory Management: Added tearDownClass to Wan smoke tests to explicitly delete the pipeline and trigger garbage collection, freeing up TPU memory between test classes.

LTX-Video Tests

  • Dynamic Checkpoint Path: Modified ltx_transformer_step_test.py to use config.pretrained_model_name_or_path as a fallback when "ckpt_path" is missing in the model's JSON config.
  • Dynamic Batch Size: Made the batch size dynamic based on jax.device_count() to avoid IndivisibleError on topologies with more devices.
  • Generic Slicing: Made the output slicing generic based on the reference prediction shape to allow comparisons across different batch sizes.

GitHub Actions Workflow (UnitTests.yml):

  • HF_TOKEN: Added HF_TOKEN environment variable using the HUGGINGFACE_TOKEN secret to allow authenticated downloads from Hugging Face during tests.
  • Log Reduction: Added flags to ignore DeprecationWarning, UserWarning, and RuntimeWarning in the CI logs to reduce clutter.
  • Durations Profiling: Added --durations=0 to always print the execution time of all tests at the end of the CI run.

Other Fixes

  • Data Processing: Resolved flax.errors.TraceContextError in data_processing_test.py by removing redundant JIT compilation.
  • Schedulers: Increased tolerances in test_scheduler_flax.py to accommodate minor precision differences on TPU.

Testing Note

While only some of these changes affect the automated GitHub Action tests, the other changes are critical for when tests are run locally in a real TPU environment. Currently, all tests will pass when run locally (provided a valid Hugging Face token is supplied for gated models like Flux).

@Perseus14 Perseus14 requested a review from entrpn as a code owner May 1, 2026 21:46
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 1, 2026

@Perseus14 Perseus14 force-pushed the tests_fix branch 2 times, most recently from c5b3495 to b506d4e Compare May 1, 2026 21:48
@Perseus14 Perseus14 marked this pull request as draft May 1, 2026 21:52
@Perseus14 Perseus14 force-pushed the tests_fix branch 5 times, most recently from 0cadac3 to 64b9275 Compare May 2, 2026 04:50
@Perseus14 Perseus14 marked this pull request as ready for review May 2, 2026 06:06
@Perseus14 Perseus14 requested review from mbohlool May 4, 2026 18:41
entrpn
entrpn previously approved these changes May 5, 2026
Comment thread src/maxdiffusion/tests/generate_sdxl_smoke_test.py Outdated
Comment thread src/maxdiffusion/tests/generate_sdxl_smoke_test.py Outdated
@github-actions
Copy link
Copy Markdown

🤖 Hi @Perseus14, I've received your request, and I'm working on it now! You can track my progress in the logs for more details.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

## 📋 Review Summary

This PR successfully addresses a variety of test failures and environment-specific issues across Flux, WAN, SDXL, and LTX-Video models. The inclusion of sharding constraints, memory management in tests, and dynamic batch sizing improves the robustness and performance of the test suite on TPUs.

🔍 General Feedback

  • Test Rigor: While disabling SSIM assertions stabilizes CI, it significantly reduces the value of smoke tests. I recommend revisiting these to use looser thresholds or updated baselines.
  • Resource Management: The addition of tearDownClass with gc.collect() in Wan tests is an excellent pattern that should be considered for other large model tests.
  • Code Clarity: Renaming generic params to scheduler_params in generate_sdxl.py improves readability by making the role of those parameters explicit.

Comment thread src/maxdiffusion/tests/ltx_transformer_step_test.py Outdated
Comment thread src/maxdiffusion/tests/generate_sdxl_smoke_test.py Outdated
Comment thread src/maxdiffusion/tests/generate_wan_smoke_test.py
Comment thread src/maxdiffusion/generate_sdxl.py
Comment thread .github/workflows/UnitTests.yml
@Perseus14 Perseus14 force-pushed the tests_fix branch 2 times, most recently from 3f604fd to e61d0a4 Compare May 11, 2026 15:46
@github-actions
Copy link
Copy Markdown

🤖 Hi @Perseus14, I've received your request, and I'm working on it now! You can track my progress in the logs for more details.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

## 📋 Review Summary

This Pull Request provides essential fixes for test execution and environment issues across several models, including SDXL, Wan, Flux, and LTX-Video. The changes successfully resolve runtime errors and improve memory management, particularly for TPU environments.

🔍 General Feedback

  • Wan Test Reorganization: Moving Wan tests to a dedicated directory and cleaning up imports is a great structural improvement.
  • Memory Optimization: The addition of tearDownClass and explicit garbage collection in the Wan tests is a solid practice for maintaining stability in resource-constrained environments.
  • Detailed Documentation: The use of TODOs and comments to explain complex issues (like bfloat16 non-determinism) is very helpful for future maintenance.
  • SDXL Refactoring: The refactoring of the SDXL inference loop into JITted steps is a good direction, though the current warmup logic can be further optimized.

Comment thread src/maxdiffusion/generate_sdxl.py Outdated
Comment thread src/maxdiffusion/tests/generate_sdxl_smoke_test.py Outdated
- Resolve execution and environment issues
- Fix dimension mismatch in ControlNet and add tearDown for GC in SDXL tests
- Enable durations profiling and fix formatting/lint issues
@Perseus14
Copy link
Copy Markdown
Collaborator Author

@mbohlool PTAL, I have resolved the comments

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants