Candle nn lora clean #26
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 32963aeeae
    let merged_w = (base_w + delta)?;
    Ok(Linear::new(merged_w, self.base.bias().cloned()))
Detach merged weights before constructing Linear
`merge()` builds `merged_w` as `base_w + delta`, but `delta` is still connected to `lora_a`/`lora_b` through the autograd graph. Because the result is returned without calling `detach()`, the merged layer retains its LoRA graph dependencies. Calling `backward()` through the merged layer can therefore produce gradients for the LoRA vars, and the adapter tensors stay alive in memory. Both effects contradict the method’s contract that the merged layer has no LoRA parameters.
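To make the memory side of this concrete, here is a minimal sketch in plain Rust. It does not use Candle: `Node`, `merge_attached`, and `merge_detached` are hypothetical names, and reference-counted parent links stand in for autograd graph edges. `delta` stands in for the LoRA product that `merge()` adds to `base_w`. The sketch only illustrates why a result that keeps its history also keeps its inputs alive, and how a detach-style copy cuts that link.

```rust
use std::rc::Rc;

/// Toy tensor: values plus the tensors it was computed from.
/// Hypothetical stand-in for an autograd node; this is not Candle's `Tensor`.
pub struct Node {
    pub data: Vec<f32>,
    pub parents: Vec<Rc<Node>>,
}

impl Node {
    /// A leaf with no graph history (comparable to a trainable variable).
    pub fn leaf(data: Vec<f32>) -> Rc<Node> {
        Rc::new(Node { data, parents: Vec::new() })
    }

    /// Element-wise add that records both operands as parents.
    pub fn add(a: &Rc<Node>, b: &Rc<Node>) -> Rc<Node> {
        assert_eq!(a.data.len(), b.data.len());
        let data = a.data.iter().zip(&b.data).map(|(x, y)| x + y).collect();
        Rc::new(Node { data, parents: vec![Rc::clone(a), Rc::clone(b)] })
    }

    /// Copy the values but drop the graph history.
    pub fn detach(&self) -> Rc<Node> {
        Node::leaf(self.data.clone())
    }
}

/// Merge without detaching: the result still references `delta`.
pub fn merge_attached(base_w: &Rc<Node>, delta: &Rc<Node>) -> Rc<Node> {
    Node::add(base_w, delta)
}

/// Merge, then detach: the result holds plain values only.
pub fn merge_detached(base_w: &Rc<Node>, delta: &Rc<Node>) -> Rc<Node> {
    Node::add(base_w, delta).detach()
}
```

In this toy model, the result of `merge_attached` holds a strong reference to `delta` for as long as it lives, while the result of `merge_detached` has an empty `parents` list. The gradient side of the issue is not modelled here.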
No description provided.