transformers for quant and full with unified example for smol lm3 by DrJesseGlass · Pull Request #2 · DrJesseGlass/candle

DrJesseGlass · 2025-12-04T18:24:40Z

No description provided.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2025-12-04T18:26:57Z

+        let scale = 1.0 / (self.head_dim as f64).sqrt();
+        // Make q contiguous before matmul to avoid stride mismatch
+        let q = q.contiguous()?;
+        let attn_weights = (q.matmul(&k.t()?)? * scale)?;


Use proper transpose for attention matmul

The quantized SmolLM3 attention builds Q/K/V as 4D tensors (B, num_heads, seq_len, head_dim) but computes scores with q.matmul(&k.t()?). Tensor::t() only handles 2D tensors, so with 4D k this call will fail at runtime and the quantized model cannot run. The full-precision path uses k.transpose(2, 3)? instead, which is the needed permutation to produce (B, H, L, L) attention scores.
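For reference, the score computation under discussion can be sketched for a single head in plain Rust. This is only an illustration of the arithmetic (dot product of each query row with each key row, scaled by 1/sqrt(head_dim)); the helper name `attention_scores` and the nested-`Vec` layout are assumptions made for the example and are not part of candle or of this PR.

```rust
/// Scaled attention scores for one head (hypothetical helper, not candle API).
/// `q` and `k` are (seq_len, head_dim) matrices stored as rows.
/// scores[i][j] = dot(q[i], k[j]) / sqrt(head_dim), i.e. (q @ k^T) * scale.
fn attention_scores(q: &[Vec<f64>], k: &[Vec<f64>]) -> Vec<Vec<f64>> {
    let head_dim = q[0].len();
    let scale = 1.0 / (head_dim as f64).sqrt();
    q.iter()
        .map(|qi| {
            k.iter()
                // Dotting q[i] with k[j] is the same as multiplying by k transposed.
                .map(|kj| qi.iter().zip(kj).map(|(a, b)| a * b).sum::<f64>() * scale)
                .collect()
        })
        .collect()
}
```

With `seq_len` rows in both `q` and `k`, the result is a `seq_len x seq_len` matrix, which corresponds to the trailing (L, L) part of the (B, H, L, L) shape mentioned above.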


DrJesseGlass · Dec 5, 2025

// From candle's tensor.rs
pub fn t(&self) -> Result<Tensor> {
    let rank = self.rank();
    if rank < 2 { /* error */ }
    else { self.transpose(rank - 2, rank - 1) } // Same as transpose(2, 3) for 4D!
}

So you are wrong. LOL
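The shape argument in this reply can be checked without candle. The sketch below models only the shape bookkeeping, taking the rule quoted above at face value (error below rank 2, otherwise swap the last two dimensions). The helpers `transpose_shape`, `t_shape`, and `matmul_shape` are hypothetical names invented for this illustration and are not candle functions; the matmul rule assumes the leading batch dimensions must match exactly.

```rust
/// Shape after swapping dimensions `d1` and `d2` (models a general transpose).
fn transpose_shape(shape: &[usize], d1: usize, d2: usize) -> Result<Vec<usize>, String> {
    if d1 >= shape.len() || d2 >= shape.len() {
        return Err(format!("dims ({d1}, {d2}) out of range for rank {}", shape.len()));
    }
    let mut out = shape.to_vec();
    out.swap(d1, d2);
    Ok(out)
}

/// Shape rule quoted above for `t()`: error below rank 2, else swap the last two dims.
fn t_shape(shape: &[usize]) -> Result<Vec<usize>, String> {
    let rank = shape.len();
    if rank < 2 {
        return Err(format!("t() needs rank >= 2, got {rank}"));
    }
    transpose_shape(shape, rank - 2, rank - 1)
}

/// Shape of a batched matmul `lhs @ rhs`, assuming identical leading (batch) dims.
fn matmul_shape(lhs: &[usize], rhs: &[usize]) -> Result<Vec<usize>, String> {
    let (r, rr) = (lhs.len(), rhs.len());
    if r < 2 || r != rr || lhs[..r - 2] != rhs[..r - 2] || lhs[r - 1] != rhs[r - 2] {
        return Err(format!("incompatible shapes {lhs:?} and {rhs:?}"));
    }
    // Keep batch dims and the row count of lhs, then take the column count of rhs.
    let mut out = lhs[..r - 1].to_vec();
    out.push(rhs[r - 1]);
    Ok(out)
}
```

For a key tensor of shape (B, H, L, D) = (2, 4, 5, 64), `t_shape` and `transpose_shape(.., 2, 3)` both give (2, 4, 64, 5), and multiplying a (2, 4, 5, 64) query by it yields (2, 4, 5, 5), the (B, H, L, L) score shape.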

chatgpt-codex-connector Bot reviewed Dec 4, 2025


DrJesseGlass added 9 commits December 5, 2025 16:35

quantized and full SmolLM3

ea162e8

include chrono for prompt

ebfc456

resolve pub consist and unused var

f8168ed

formatted

8447af4

last spacing in format

fcb22b4

add credits

ce922dc

chat template

9621cbf

integrate new chat template for smollm3 example

526ab5a

fmt and clippy

e9cf0e3

DrJesseGlass force-pushed the examples/add/smolLM3 branch from f812d1f to e9cf0e3 Compare December 5, 2025 21:35

DrJesseGlass changed the base branch from downstream-transformers to downstream December 30, 2025 21:33

DrJesseGlass closed this Dec 30, 2025

DrJesseGlass wants to merge 9 commits into downstream from examples/add/smolLM3
