Add ability to compose multi-value Machinery translations by mathjazz · Pull Request #4236 · mozilla/pontoon

mathjazz · 2026-06-18T14:25:25Z

For Fluent and MF2-handled formats (Android, Gettext, WebExt, Xcode, Xliff), the Machinery panel only matched on the first input field (via `getPlainMessage()`), so attributes and selector variants were never surfaced. This change mirrors Pretranslation's complex string composition in Machinery, adding directly-pasteable composed suggestions alongside the existing results.

More details:

1. Parameterize Pretranslation with mt_provider/mt_service_name/mt_supported, and move entity-walking into `Pretranslation.walk_entity()` so Machinery can reuse the composition pipeline.
2. Add `/machinery-composed/` endpoint that walks the entity, looks up each value in TM, and falls back to the requested MT service for any remaining value. Returns the composed string + the actual mix of services used (TM badge + MT badge for hybrid results).
3. Frontend fires composed requests in parallel with the existing fetches when the entity format can have multiple values. Composed results dedupe through the existing `addResults()` merge.
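The TM-first, MT-fallback composition described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: `ComposedResult`, `compose`, `tm_lookup` and `mt_translate` are hypothetical names, and returning `None` when a value cannot be translated is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class ComposedResult:
    """Composed translation plus the set of services that contributed to it."""

    values: dict[str, str] = field(default_factory=dict)
    services: set[str] = field(default_factory=set)


def compose(
    source_values: dict[str, str],
    tm_lookup: Callable[[str], Optional[str]],
    mt_translate: Optional[Callable[[str], Optional[str]]] = None,
) -> Optional[ComposedResult]:
    """Translate every value, preferring TM and falling back to MT.

    Assumption: if any value cannot be translated, nothing is returned,
    because a partially composed string would not be directly pasteable.
    """
    result = ComposedResult()
    for name, source in source_values.items():
        translation = tm_lookup(source)
        service = "tm"
        if translation is None and mt_translate is not None:
            # No TM match for this value: fall back to the MT service.
            translation = mt_translate(source)
            service = "mt"
        if translation is None:
            return None
        result.values[name] = translation
        result.services.add(service)
    return result


# Toy stand-ins for Translation Memory and a machine translation service.
tm = {"Save": "Shrani"}
composed = compose(
    {"value": "Save", "title": "Save the file"},
    tm_lookup=tm.get,
    mt_translate=lambda text: f"[mt] {text}",
)
```

Here `composed.services` contains both `"tm"` and `"mt"`, which corresponds to the hybrid case where both badges would be shown.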

on the entity having more than one translatable input, reusing the editor's field-counting logic. 2. Surface a quality badge. When every value is a 100% TM match, the composed string is a perfect TM match, so return quality 100 and pass it through to the panel. 3. Render composed (multi-value) suggestions as labeled fields, the same representation as the original string panel.
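A minimal sketch of the quality rule, assuming each value's lookup is summarised as a hypothetical `(service, quality)` pair; the function name and data shape are illustrative and not taken from the PR.

```python
from typing import Optional


def composed_quality(matches: list[tuple[str, int]]) -> Optional[int]:
    """Return 100 only when every value is a 100% TM match, else None.

    `matches` holds one (service, quality) pair per translated value,
    where service is "tm" or "mt" (hypothetical labels).
    """
    if matches and all(
        service == "tm" and quality == 100 for service, quality in matches
    ):
        return 100
    return None
```

A composition that includes any MT value or any partial TM match gets no quality badge under this rule.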

across all input fields, reusing the field-building logic that is already used by the History panel. The plain message is recorded as `machinery.translation` so that source attribution still matches the saved translation on submit.

string isn't suggested back to itself. The composed path didn't, so a composed TM result could be reconstructed from the entity's own translation after it was translated. Add an opt-in `exclude_entity` flag to Pretranslation that excludes the entity's own TM entries from per-value lookups, and enable it from the Machinery composed view. A value that can only be served by the entity's own translation then has no TM match, so a TM-only composition relying on it is no longer produced. Pretranslation behavior is unchanged.
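The effect of the opt-in `exclude_entity` flag can be illustrated with an in-memory stand-in for TM. `TMEntry` and `tm_lookup` are hypothetical; the real lookup runs against the database and is not shown in this conversation.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class TMEntry:
    """Hypothetical TM record: which entity produced it, source and target."""

    entity_id: int
    source: str
    target: str


def tm_lookup(
    entries: Iterable[TMEntry],
    source: str,
    exclude_entity: Optional[int] = None,
) -> Optional[str]:
    """Return the first exact TM match, optionally skipping one entity's entries."""
    for entry in entries:
        if entry.source != source:
            continue
        if exclude_entity is not None and entry.entity_id == exclude_entity:
            # The entity's own translation must not be suggested back to it.
            continue
        return entry.target
    return None


entries = [TMEntry(entity_id=1, source="Save", target="Shrani")]
own_match = tm_lookup(entries, "Save")
filtered_match = tm_lookup(entries, "Save", exclude_entity=1)
```

With the flag off the lookup behaves as before, which matches the note that Pretranslation behavior is unchanged.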

Re-applying a composed Machinery suggestion (or restoring history) after editing a field took two clicks: the first did nothing. Typing updates only CodeMirror's internal doc and EditorResult, not EditorData.state.fields, so TranslationForm doesn't re-render and each EditField keeps a stale `defaultValue`. EditField re-syncs its document only in `useEffect(() => setValue(defaultValue), [defaultValue])`. distributeEntrySource builds new fields with placeholder handles while the on-screen editors stay bound, via their React key, to the previous fields' live handles. The re-applied value for the edited field equals its stale `defaultValue`, so the effect doesn't fire and the field isn't updated. A later re-render refreshes `defaultValue`, which is why the second click works. Push the distributed values straight into the live handles, matched by field id, the same way clearEditor does, so one click suffices.

The "Refine using AI" dropdown doesn't work on composed (multi-field/plural) suggestions: the loader never shows, the refined result never updates the UI, and copying dumps the raw Fluent source into the first field. The backend /gpt-transform/ endpoint also refines a single string, so it can't preserve the entry structure (e.g. it returns 2 plural forms instead of 4). Hide the dropdown for composed suggestions so they behave like a plain Google Translate source. Proper composed support is left as a follow-up.

When a suggestion combines multiple sources (e.g. GOOGLE TRANSLATE and TRANSLATION MEMORY), the source titles ran together with no separator.

codecov-commenter · 2026-06-18T14:28:16Z

Codecov Report

❌ Patch coverage is 84.48276% with 36 lines in your changes missing coverage. Please review.
✅ Project coverage is 80.90%. Comparing base (cb5f5d8) to head (e11f61f).
⚠️ Report is 11 commits behind head on main.


eemeli

Looked at the Python parts only thus far.

eemeli · 2026-06-22T05:20:28Z

+COMPOSED_MT_SERVICES = {
+    "google-translate": (
+        lambda text, locale, preserve_placeables: get_google_translate_data(
+            text=text, locale=locale, preserve_placeables=preserve_placeables
+        ),
+        "google_translate_code",
+    ),
+    "microsoft-translator": (
+        lambda text, locale, preserve_placeables: get_microsoft_translator_data(
+            text, locale.ms_translator_code
+        ),
+        "ms_translator_code",
+    ),
+}


I'd prefer not separately defining the lambdas like this, and to have the calls directly in the code below.

eemeli · 2026-06-22T05:28:39Z

+COMPOSED_FORMATS = {
+    Resource.Format.FLUENT,
+    Resource.Format.ANDROID,
+    Resource.Format.GETTEXT,
+    Resource.Format.WEBEXT,
+    Resource.Format.XCODE,
+    Resource.Format.XLIFF,
+}


The behaviour should not be gated on the format, but on whether the Entity.value and .properties represent a single-pattern or multi-pattern message.
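One way to express such a shape-based check, as a sketch only: the `PatternMessage` and `SelectMessage` classes below are simplified local stand-ins rather than the moz.l10n types, and `is_multi_pattern` is a hypothetical helper.

```python
from dataclasses import dataclass, field
from typing import Optional, Union


@dataclass
class PatternMessage:
    """Stand-in for a message with a single pattern."""

    pattern: list = field(default_factory=list)


@dataclass
class SelectMessage:
    """Stand-in for a message with one pattern per selector variant."""

    variants: dict = field(default_factory=dict)


Message = Union[PatternMessage, SelectMessage]


def count_patterns(msg: Optional[Message]) -> int:
    """Number of translatable patterns contributed by one message."""
    if msg is None:
        return 0
    if isinstance(msg, SelectMessage):
        return len(msg.variants)
    return 1


def is_multi_pattern(
    value: Optional[Message], properties: Optional[dict[str, Message]]
) -> bool:
    """True when value and properties together hold more than one pattern."""
    total = count_patterns(value)
    total += sum(count_patterns(prop) for prop in (properties or {}).values())
    return total > 1
```

A check like this depends only on the message shape, so a single-pattern Fluent message would skip the composed path while a plural Gettext message would take it.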

eemeli · 2026-06-22T05:30:40Z

+    Return a composed multi-value translation for a Fluent / MF2 entity.
+
+    Each translatable leaf (Fluent value/attribute, MF2 variant) is looked up in
+    Translation Memory; leaves without a 100% TM match fall back to the requested
+    MT service. Mirrors the Pretranslation pipeline so the Machinery panel can
+    surface a directly-pasteable composed translation alongside the per-leaf
+    results.


The direct references to the formats here are misleading, as I presume "MF2" is encompassing all the formats (like Android and Gettext) that are internally represented using MF2 syntax?

eemeli · 2026-06-22T05:34:12Z

+    if service == "translation-memory":
+        mt_provider = None
+        mt_service_name = "tm"
+        mt_supported = False
+    elif service in COMPOSED_MT_SERVICES:


See comment above; this should be a match statement handling all the valid service values.

eemeli · 2026-06-22T05:51:55Z

+        if entity.resource.format == Resource.Format.FLUENT:
+            entry = fluent_parse_entry(entity.string, with_linepos=False)
+            if entry.value:
+                self.message(entry.value)
+            accesskeys: list[tuple[str, Message]] = []
+            for key, prop in entry.properties.items():
+                if key.endswith("accesskey"):
+                    accesskeys.append((key, prop))
+                else:
+                    self.message(prop)
+            for key, prop in accesskeys:
+                set_accesskey(entry, key, prop)
+            return FluentSerializer().serialize_entry(
+                fluent_astify_entry(entry, escape_syntax=False)
+            )
+
+        if entity.resource.format in {
+            Resource.Format.ANDROID,
+            Resource.Format.GETTEXT,
+            Resource.Format.WEBEXT,
+            Resource.Format.XCODE,
+            Resource.Format.XLIFF,
+        }:
+            format = Format.mf2
+            msg = parse_message(format, entity.string)
+        else:
+            format = None
+            msg = PatternMessage([entity.string])
+        self.message(msg)
+        return serialize_message(format, msg)


Include

from moz.l10n.message import message_from_json

above, and then do something like this, with the serialization done later:

Suggested change

-        if entity.resource.format == Resource.Format.FLUENT:
-            entry = fluent_parse_entry(entity.string, with_linepos=False)
-            if entry.value:
-                self.message(entry.value)
-            accesskeys: list[tuple[str, Message]] = []
-            for key, prop in entry.properties.items():
-                if key.endswith("accesskey"):
-                    accesskeys.append((key, prop))
-                else:
-                    self.message(prop)
-            for key, prop in accesskeys:
-                set_accesskey(entry, key, prop)
-            return FluentSerializer().serialize_entry(
-                fluent_astify_entry(entry, escape_syntax=False)
-            )
-
-        if entity.resource.format in {
-            Resource.Format.ANDROID,
-            Resource.Format.GETTEXT,
-            Resource.Format.WEBEXT,
-            Resource.Format.XCODE,
-            Resource.Format.XLIFF,
-        }:
-            format = Format.mf2
-            msg = parse_message(format, entity.string)
-        else:
-            format = None
-            msg = PatternMessage([entity.string])
-        self.message(msg)
-        return serialize_message(format, msg)
+        value = message_from_json(entity.value)
+        if value:
+            self.message(entry.value)
+        properties = {
+            key: message_from_json(prop)
+            for key, prop in entity.properties.items()
+        } if entity.properties else {}
+        accesskeys: list[tuple[str, Message]] = []
+        for key, prop in properties.items():
+            if key.endswith("accesskey"):
+                accesskeys.append((key, prop))
+            else:
+                self.message(prop)
+        for key, prop in accesskeys:
+            set_accesskey(entry, key, prop)
+        return value, properties

eemeli · 2026-06-22T05:54:06Z

+        self.mt_provider = mt_provider or get_google_translate_data
+        self.mt_service_name = mt_service_name


Are "mt_provider" and "mt_service" effectively synonymous, or somehow different? In any case, it seems weird to fall back here for one, but not the other.

mathjazz · 2026-06-22T18:12:16Z

Thanks for the comments.

Please note that I've intentionally not flagged anyone for code review yet, because I'd like to first get feedback on the functionality. The code is still deployed to https://pontoon.allizom.org/.

mathjazz added 9 commits June 17, 2026 17:29

Make sure the composed Machinery suggestions are always at the top

670d3b7

Fix failing test

b241ca8

Add separator between Machinery source titles

e11f61f

When a suggestion combines multiple sources (e.g. GOOGLE TRANSLATE and TRANSLATION MEMORY), the source titles ran together with no separator.

eemeli requested changes Jun 22, 2026

mathjazz wants to merge 9 commits into mozilla:main from mathjazz:machinery-compose-multi-value-2886


3 participants
