Issue with ORI token in partial diffusion #275

Open
jonfunk21 wants to merge 1 commit into RosettaCommons:production from jonfunk21:fix/partial-diffusion-com-and-ori-token

@jonfunk21

`_set_origin` for partial diffusion previously hard-coded `ori_token=None` and `infer_ori_strategy="com"`, with two consequences:

  1. The user-supplied `ori_token` (and `infer_ori_strategy`) was silently dropped. The documented behavior is that the diffused-region COM ends up within ~5 A of the user's ORI token, but in partial diffusion mode any value passed by the caller had no effect on the output coordinates.

  2. The default centering used the COM of the entire input (motif + diffused region) instead of the diffused-region COM. The training pipeline uses `center_option=diffuse` (see `configs/datasets/design_base.yaml`), so the model expects the diffused region's COM at the origin. Joint-COM centering puts the diffused region far from the origin, in a frame the model never saw during training, which biases denoising toward dragging the diffused region to the motif COM. On asymmetric systems (large target, small binder), the binder visibly drifts toward the target with increasing `partial_t` -- consistent with the "binder winds around target" reports.
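
To give a feel for the size of the offset in point 2, here is a toy calculation in plain Python. The atom counts and coordinates are invented for illustration, and the COM is assumed to be an unweighted mean of positions; none of this is taken from the repository code.

```python
def center_of_mass(coords):
    """Unweighted mean position of a list of (x, y, z) tuples (assumption: no mass weighting)."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))


# Hypothetical asymmetric system: 900 target (motif) atoms at the origin
# and 100 binder (diffused) atoms 30 A away along x.
target = [(0.0, 0.0, 0.0)] * 900
binder = [(30.0, 0.0, 0.0)] * 100

joint_com = center_of_mass(target + binder)  # dominated by the large target
binder_com = center_of_mass(binder)

# Where the binder COM lands after subtracting each candidate origin.
offset_joint = tuple(b - j for b, j in zip(binder_com, joint_com))
offset_diffused = tuple(b - d for b, d in zip(binder_com, binder_com))

print("binder COM after joint-COM centering:   ", offset_joint)
print("binder COM after diffused-COM centering:", offset_diffused)
```

With these made-up numbers the joint COM sits only 3 A from the target, so joint-COM centering leaves the binder COM 27 A from the origin, whereas diffused-COM centering places it exactly at the origin.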

This change updates partial-diffusion centering to:

  1. Skip centering for symmetric structures (unchanged).
  2. Use user-supplied `ori_token` / `infer_ori_strategy` if provided.
  3. Otherwise center on the diffused-region COM (matches training).
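
The three-step precedence above can be sketched as follows. This is a minimal illustration, not the code in this PR: `choose_origin` and `apply_origin` are hypothetical names, coordinates are plain tuples, the COM is assumed to be an unweighted mean, and `infer_ori_strategy` is left out because its options other than `"com"` are not described here.

```python
def mean_position(coords):
    """Unweighted mean of a list of (x, y, z) tuples."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))


def choose_origin(coords, is_diffused, is_symmetric=False, ori_token=None):
    """Return the point to subtract from every coordinate, or None to skip centering.

    Hypothetical helper mirroring the precedence listed above.
    """
    if is_symmetric:
        return None  # 1. symmetric structures are left uncentered
    if ori_token is not None:
        return tuple(float(v) for v in ori_token)  # 2. user-supplied ORI token wins
    diffused = [c for c, d in zip(coords, is_diffused) if d]
    if not diffused:
        raise ValueError("no diffused atoms to center on")
    return mean_position(diffused)  # 3. diffused-region COM, as in training


def apply_origin(coords, origin):
    """Translate coordinates so that `origin` becomes (0, 0, 0)."""
    if origin is None:
        return list(coords)
    return [tuple(c[i] - origin[i] for i in range(3)) for c in coords]


# Toy input: one fixed atom and two diffused atoms along x.
coords = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
is_diffused = [False, True, True]

default_origin = choose_origin(coords, is_diffused)
shifted = apply_origin(coords, choose_origin(coords, is_diffused, ori_token=[50, 0, 0]))
print(default_origin)
print(shifted)
```

In this toy input the default origin is the mean of the two diffused atoms, and passing `ori_token=[50, 0, 0]` translates every atom by -50 A in x, which is the kind of frame shift the caller would expect to see in the output.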

…tial diffusion

Empirical verification on a CD3E binder showed the inward pull at
`partial_t=15` dropped from -4.46 A (joint-COM centering) to -0.46 A
(diffused-COM centering). On a thelma enzyme run, supplying
`ori_token=[50,0,0]` now shifts the output frame by -50 A in x as
expected (was 0 A before), while fixed-atom RMSD stays negligible
(~0.02 A after Kabsch alignment).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@rclune requested a review from Ubiquinone-dot on May 7, 2026, 11:47