Add TranslateGemma support and reorganize Gemma module structure CAN-34 #10

Open
DrJesseGlass wants to merge 15 commits into downstream from translate-gemma

@DrJesseGlass

Owner

This PR adds support for Google's TranslateGemma translation models and reorganizes the Gemma model family into a unified module structure.
Changes
Module reorganization:

Moved gemma.rs → gemma/gemma1.rs
Consolidated gemma2.rs, gemma3.rs, quantized_gemma3.rs under gemma/
Added gemma/mod.rs with re-exports for backward compatibility
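
As a rough illustration of the re-export approach (not the actual contents of gemma/mod.rs), the sketch below uses inline modules to stand in for the gemma/ directory. The item names (`Config`, `hidden_size`, `sliding_window`) and the choice of which items get re-exported are assumptions made for the example.

```rust
// Illustrative stand-in for the gemma/ directory; all item names are hypothetical.
pub mod gemma {
    // Would live in gemma/gemma1.rs (formerly gemma.rs).
    pub mod gemma1 {
        #[derive(Debug, Clone, PartialEq)]
        pub struct Config {
            pub hidden_size: usize,
        }
    }

    // Would live in gemma/gemma3.rs.
    pub mod gemma3 {
        #[derive(Debug, Clone, PartialEq)]
        pub struct Config {
            pub sliding_window: usize,
        }
    }

    // The mod.rs part: re-export the first-generation items so that the
    // pre-reorganization path `gemma::Config` keeps resolving.
    pub use self::gemma1::Config;
}

/// Code written against the old flat path.
pub fn old_style() -> gemma::Config {
    gemma::Config { hidden_size: 2048 }
}

/// Code written against the new nested path.
pub fn new_style() -> gemma::gemma1::Config {
    gemma::gemma1::Config { hidden_size: 2048 }
}
```

Both functions return the same type, so callers that still import through the old path compile unchanged.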

New: TranslateGemma support:

Added gemma/translate_gemma.rs with prompt formatting utilities and ISO 639-1 language codes (55 languages)
Added examples/translate-gemma.rs supporting both full precision and quantized inference
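
A minimal sketch of what a language-code helper could look like, assuming an enum with a `code()` accessor like the one the example calls via `source.code()`. The four variants, the `from_code` helper, and its case-insensitive matching are hypothetical; the real module covers 55 languages and may be organized differently.

```rust
/// Hypothetical subset of the supported languages.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Language {
    English,
    French,
    German,
    Japanese,
}

impl Language {
    /// ISO 639-1 two-letter code for this language.
    pub fn code(self) -> &'static str {
        match self {
            Language::English => "en",
            Language::French => "fr",
            Language::German => "de",
            Language::Japanese => "ja",
        }
    }

    /// Parse a two-letter code, ignoring ASCII case (assumed helper).
    pub fn from_code(code: &str) -> Option<Self> {
        match code.to_ascii_lowercase().as_str() {
            "en" => Some(Language::English),
            "fr" => Some(Language::French),
            "de" => Some(Language::German),
            "ja" => Some(Language::Japanese),
            _ => None,
        }
    }
}
```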

Bug fixes:

gemma3.rs: Make KV tensors contiguous before cache append. This fixes the "slice-set only supports contiguous tensors" error that surfaces with certain GQA ratios (e.g., TranslateGemma 4B)
quantized_gemma3.rs: Added clear_kv_cache() method for multi-turn inference

Key and value states become non-contiguous after transpose but
KvCache::append() requires contiguous tensors for slice_set.
This worked for some model dimensions but failed for others
(e.g., TranslateGemma 4B with different GQA ratios).
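
To illustrate why a transpose can trip a contiguity check, here is a toy strided 2-D view in plain Rust. It is a simplified model, not candle's tensor type or KvCache: `View2`, `append`, and the layout check are assumptions for the example. Transposing swaps shape and strides without moving data, so the appending step rejects the view until it is copied into row-major order.

```rust
/// Toy 2-D strided view over a flat buffer (stand-in for a real tensor).
#[derive(Debug, Clone)]
pub struct View2 {
    data: Vec<f32>,
    shape: [usize; 2],
    strides: [usize; 2],
}

impl View2 {
    /// Build a row-major (contiguous) view.
    pub fn new(data: Vec<f32>, rows: usize, cols: usize) -> Self {
        assert_eq!(data.len(), rows * cols);
        Self { data, shape: [rows, cols], strides: [cols, 1] }
    }

    /// Swap shape and strides; the underlying buffer order is unchanged.
    pub fn transpose(&self) -> Self {
        Self {
            data: self.data.clone(),
            shape: [self.shape[1], self.shape[0]],
            strides: [self.strides[1], self.strides[0]],
        }
    }

    /// Row-major check; dimensions of size 1 cannot break contiguity.
    pub fn is_contiguous(&self) -> bool {
        let mut expected = 1;
        for i in (0..2).rev() {
            if self.shape[i] > 1 && self.strides[i] != expected {
                return false;
            }
            expected *= self.shape[i];
        }
        true
    }

    /// Copy the elements into a fresh row-major buffer if needed.
    pub fn contiguous(&self) -> Self {
        if self.is_contiguous() {
            return self.clone();
        }
        let mut out = Vec::with_capacity(self.data.len());
        for r in 0..self.shape[0] {
            for c in 0..self.shape[1] {
                out.push(self.data[r * self.strides[0] + c * self.strides[1]]);
            }
        }
        Self::new(out, self.shape[0], self.shape[1])
    }
}

/// Toy cache append that, like a slice-set, only accepts contiguous input.
pub fn append(cache: &mut Vec<f32>, x: &View2) -> Result<(), String> {
    if !x.is_contiguous() {
        return Err("slice-set only supports contiguous tensors".to_string());
    }
    cache.extend_from_slice(&x.data);
    Ok(())
}
```

In this model a transposed view with a size-1 dimension still passes the check, which is one way a missing `contiguous()` call can go unnoticed for some shapes and fail for others.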
@DrJesseGlass changed the title from "Add TranslateGemma support and reorganize Gemma module structure" to "Add TranslateGemma support and reorganize Gemma module structure CAN-34" on Jan 21, 2026
@DrJesseGlass

DrJesseGlass commented Jan 21, 2026

Owner Author

Full-precision models are downloaded automatically from Hugging Face. Quantized inference requires a local GGUF file passed via --model-path, since there are no official GGUF conversions from Google. Community conversions are available on Hugging Face.

@DrJesseGlass

Owner Author

@codex review

chatgpt-codex-connector (Bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b83bcc5add

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +370 to +375
let prompt = format_translate_prompt(text, source.code(), target.code());

let encoding = self
    .tokenizer
    .encode(prompt.as_str(), true)
    .map_err(E::msg)?;


P2: Avoid double-adding special tokens in TranslateGemma

format_translate_prompt already embeds <bos>/turn markers, so calling encode(..., true) will run the tokenizer post-processor again. For Gemma tokenizers that add BOS/EOS, this yields a duplicated <bos> and a trailing <eos> in the input, which can cause empty/early-terminated translations or otherwise shift the generation context. Consider encode(..., false) here (or remove <bos> from the prompt) so the prompt is not post-processed twice.
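
The effect can be reproduced with a toy encoder, sketched below. This is not the tokenizers crate: `encode`, the whitespace splitting, and the word-to-id mapping are hypothetical. It only models a post-processor that prepends BOS when `add_special_tokens` is true; the trailing EOS case is left out.

```rust
/// Toy special-token id (assumed for the example).
pub const BOS_ID: u32 = 2;

/// Toy encoder: whitespace tokens, with a literal "<bos>" mapped to BOS_ID.
/// When `add_special_tokens` is true, a BOS is prepended, mimicking a
/// post-processor that runs regardless of what the prompt already contains.
pub fn encode(prompt: &str, add_special_tokens: bool) -> Vec<u32> {
    let mut ids = Vec::new();
    if add_special_tokens {
        ids.push(BOS_ID);
    }
    for word in prompt.split_whitespace() {
        if word == "<bos>" {
            ids.push(BOS_ID);
        } else {
            // Arbitrary stand-in for a vocabulary lookup.
            ids.push(100 + word.len() as u32);
        }
    }
    ids
}

/// Count how many BOS tokens ended up in the encoded input.
pub fn count_bos(ids: &[u32]) -> usize {
    ids.iter().filter(|&&id| id == BOS_ID).count()
}
```

With a prompt that already starts with `<bos>`, passing `true` yields two BOS tokens and passing `false` yields one, which matches the suggestion to let only one side own the special tokens.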

Useful? React with 👍 / 👎.

…ls; however this is because quantized_gemma3 and gemma3 have different activation functions
Comment thread candle-transformers/src/models/gemma/translate_gemma.rs Outdated
Comment thread candle-transformers/src/models/gemma/translate_gemma.rs Outdated
@DrJesseGlass DrJesseGlass reopened this Apr 2, 2026