feat: Time-to-Move sampling #13707
Conversation
📝 Walkthrough

Adds latent-space video compositing and a step-windowed KSampler workflow, plus RGB→latent mask conversion. New functions in comfy_extras/nodes_custom_sampler.py: video_latent_composite (blends 5D latents with spatial offsets and an optional mask), time_to_move_sample (iterative per-step KSampler with optionally recomputed noisy latents and compositing), and time_to_move_common_ksampler (prepares the latent and noise, sets up callbacks, and returns the updated latent). New node TimeToMoveKSamplerAdvanced registered. In comfy_extras/nodes_mask.py: convert_rgb_mask_to_latent_mask (temporal sampling + spatial nearest-resize) and node RGBMaskToLatentMask registered.

🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@comfy_extras/nodes_custom_sampler.py`:
- Around line 86-117: The loop can be skipped when start_step >= effective last
step, leaving samples undefined; fix by guarding/initializing before the loop
(check the same condition used in the for: if min(last_step, steps) - start_step
<= 0) and set samples to a valid tensor (e.g., convert latent_image to the
expected device/dtype via
comfy.model_management.intermediate_device()/intermediate_dtype() or simply
assign latent_image) and return it, or initialize samples to latent_image before
entering the loop; update the code around the for-loop that uses start_step,
last_step, steps, samples, and latent_image to ensure samples is always defined
when returned.
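A minimal sketch of the suggested guard, assuming the variable names used in the review (latent_image, start_step, last_step, steps) rather than the actual code in nodes_custom_sampler.py; the function name is hypothetical:

```python
def sample_window_sketch(latent_image, start_step, last_step, steps):
    """Hypothetical sketch of the guard; not the actual node code."""
    end_step = min(last_step, steps)
    # Initialize before the loop so `samples` is always defined,
    # even when start_step >= end_step and the loop body never runs.
    samples = latent_image
    for step in range(start_step, end_step):
        # ... the real per-step KSampler call and compositing would
        # update `samples` here
        pass
    return samples
```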
In `@comfy_extras/nodes_mask.py`:
- Around line 49-63: The temporal sampling is incorrect: don’t assume
vae.downscale_ratio is a tuple or that a stride k reconstructs the latent frame
count; instead derive the target temporal length explicitly (e.g., accept a
target_latent_T or compute it from vae.downscale_ratio whether it’s an int or a
tuple) and resample the input mask to that exact T using a proper mapping (use
torch.linspace to create target frame indices and gather/interpolate along dim=0
or use torch.nn.functional.interpolate after adding a channel dim) rather than
mask[0:1] + mask[1::k]; update convert_rgb_mask_to_latent_mask to compute
target_T robustly from vae.downscale_ratio or an explicit parameter and replace
the mask0/mask1 sampling with index-based gather/interpolate, and apply the same
fix at the other occurrence (lines ~455-457) so the returned latent_mask frame
count matches the VAE latent frames.
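A minimal sketch of the index-based temporal resampling the review suggests, with a hypothetical target_t parameter standing in for the latent frame count derived from the VAE:

```python
import torch

def resample_mask_temporally(mask: torch.Tensor, target_t: int) -> torch.Tensor:
    # Hypothetical sketch: nearest-index resampling of a (T, H, W) mask to
    # exactly `target_t` frames, replacing the stride-based mask[0:1] + mask[1::k].
    t = mask.shape[0]
    # Evenly spaced source indices spanning the full input range.
    idx = torch.linspace(0, t - 1, steps=target_t).round().long()
    return mask[idx]
```

For an 81-frame mask and target_t = 21, this selects frames [0, 4, 8, ..., 80], so the output frame count matches the VAE latent frames exactly.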
Clarifies that the node is intended to work with causal Video VAEs.
🧹 Nitpick comments (1)
comfy_extras/nodes_mask.py (1)
451-457: ⚡ Quick win: Consider adding a type guard for clearer error messages.

The description correctly documents that this node is for causal Video VAEs, but if a user accidentally wires a non-video VAE (which has a scalar downscale_ratio), they'll get a cryptic `'int' object is not subscriptable` error. A quick check would improve the UX.

💡 Optional: Add validation for VAE type

```diff
 @classmethod
 def execute(cls, mask, vae) -> IO.NodeOutput:
     # Ensure we work on a copy of the mask to remain non-destructive
     mask_copy = mask.clone()
     downscale_ratio = vae.downscale_ratio
+    if not isinstance(downscale_ratio, tuple) or len(downscale_ratio) < 3:
+        raise ValueError("RGBMaskToLatentMask requires a causal Video VAE (e.g., Wan). The provided VAE does not have a compatible downscale_ratio.")
     k = (mask.shape[0] - 1) // (downscale_ratio[0](mask.shape[0]) - 1) if (downscale_ratio[0](mask.shape[0]) - 1) > 1 else 1
     return IO.NodeOutput(convert_rgb_mask_to_latent_mask(mask_copy, k, spatial_downsample_h=downscale_ratio[1], spatial_downsample_w=downscale_ratio[2]))
```
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@comfy_extras/nodes_mask.py`:
- Around line 451-457: The execute method assumes vae.downscale_ratio is
indexable (video VAE) but a scalar (image VAE) causes "'int' object is not
subscriptable"; add a type guard at the start of execute to validate
vae.downscale_ratio is a sequence/iterable with at least three elements (or
supports __getitem__ for indices 0..2), and if not raise a clear ValueError (or
TypeError) referencing the node and value (e.g., "nodes_mask.execute expects
video VAE with downscale_ratio like (h,w,...) but got <value>"); keep using
mask.clone(), downscale_ratio, k calculation and
convert_rgb_mask_to_latent_mask/IO.NodeOutput after the guard.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@comfy_extras/nodes_mask.py`:
- Line 458: The computation for k incorrectly uses the guard
(downscale_ratio[0](mask.shape[0]) - 1) > 1 which forces k=1 when T_latent==2;
change it to only guard against division by zero by checking > 0 and call the
downscale callable once: store ratio = downscale_ratio[0](mask.shape[0]) then
compute k = (mask.shape[0] - 1) // (ratio - 1) if (ratio - 1) > 0 else 1 so
T_latent==2 yields the correct k = T - 1 and avoids the double call to
downscale_ratio[0].
```python
downscale_ratio = vae.downscale_ratio
if not isinstance(downscale_ratio, tuple) or len(downscale_ratio) < 3:
    raise ValueError("RGBMaskToLatentMask requires a causal Video VAE (e.g., Wan). The provided VAE does not have a compatible downscale_ratio.")
k = (mask.shape[0] - 1) // (downscale_ratio[0](mask.shape[0]) - 1) if (downscale_ratio[0](mask.shape[0]) - 1) > 1 else 1
```
k guard condition > 1 silently misbehaves when T_latent == 2.
The condition (downscale_ratio[0](mask.shape[0]) - 1) > 1 evaluates to False when T_latent - 1 == 1 (i.e., T_latent == 2), so k is hard-clamped to 1 instead of the correct k = T - 1. A mask with 81 input frames would be returned with all 81 temporal frames rather than the expected 2, silently producing a shape mismatch downstream. The guard only needs to avoid division by zero, so the threshold should be > 0.
🐛 Proposed fix

```diff
-k = (mask.shape[0] - 1) // (downscale_ratio[0](mask.shape[0]) - 1) if (downscale_ratio[0](mask.shape[0]) - 1) > 1 else 1
+t_latent = downscale_ratio[0](mask.shape[0])
+k = (mask.shape[0] - 1) // (t_latent - 1) if (t_latent - 1) > 0 else 1
```

The refactor also avoids calling the downscale_ratio[0] callable twice.
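A quick worked example of the guard bug, assuming a Wan-style causal mapping t_latent = (T - 1) // 4 + 1 (the actual callable lives on the VAE; the numbers here are illustrative):

```python
# Illustrative values, not taken from the PR.
T, t_latent = 81, 21
k = (T - 1) // (t_latent - 1)                                     # 4, fine either way

T, t_latent = 5, 2                                                # t_latent - 1 == 1
k_buggy = (T - 1) // (t_latent - 1) if (t_latent - 1) > 1 else 1  # 1 (wrong)
k_fixed = (T - 1) // (t_latent - 1) if (t_latent - 1) > 0 else 1  # 4 (correct: T - 1)
```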
Hey, have you tried my implementation in KJNodes, and if so is it lacking something? I called it
@kijai I was not aware of your implementation, thanks for letting me know. Yes, would love to see this in core. And your implementation is quite helpful in teaching me how to patch the model.
Adds native support for #10745, which is a request for https://github.com/time-to-move/TTM.
There is an implementation in the WanVideoWrapper repository, but this one should in theory work for other video models as well.
Adds two nodes:
TimeToMoveKSamplerAdvanced, which modifies KSamplerAdvanced to add inputs for a latent_mask and a time_to_move_end_at_step. Logic is as follows:
RGBMaskToLatentMask, which uses the input VAE's downsampling factor to create a compatible latent mask to feed into TimeToMoveKSamplerAdvanced. For example, a video mask that is 81 frames long will be interpolated using nearest neighbors to a 21-frame mask if the Wan VAE is the input (see the sketch after this list).
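For reference, a minimal sketch of that 81 → 21 mapping, assuming Wan's causal temporal downscale factor of 4:

```python
# Assumed causal temporal mapping: the first frame is kept as-is,
# then each group of 4 subsequent frames maps to one latent frame.
frames = 81
latent_frames = (frames - 1) // 4 + 1           # 21
stride = (frames - 1) // (latent_frames - 1)    # 4: frame 0, then every 4th frame
```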
Considerations:
Issues