Add offline Chinese->English translation support and improved inplace layout by itrejomx · Pull Request #241 · bquenin/interpreter

itrejomx · 2026-05-02T00:57:06Z

Summary

add Chinese -> English offline mode with RapidOCR + OPUS-MT zh-en
add language-aware status labels and safer source-language normalization
improve inplace overlay layout by stacking overlaps and ordering labels for vertical reading flow
refresh README intro and language support docs

Testing

pytest tests -q
python3.11 -m py_compile src/interpreter/config.py src/interpreter/gui/main_window.py src/interpreter/overlay/base.py

greptile-apps · 2026-05-02T01:00:21Z

Greptile Summary

This PR adds Chinese→English offline translation (RapidOCR + OPUS-MT zh-en) alongside the existing Japanese pipeline, introduces a language selector in the GUI, and refactors the inplace overlay to stack overlapping labels for vertical reading flow.

P1 — inplace overlay swaps labels: arrange_overlay_regions sorts regions by (-x, y) internally, so the returned arranged list is in a different order than label_entries. The index-based zip(label_entries, arranged) on line 411 of base.py then assigns each QLabel the position of a different region — in the two-region case the two labels are fully swapped on screen.

Confidence Score: 3/5

Not safe to merge as-is: the inplace overlay label-position bug will visibly swap translated text across regions whenever more than one caption is on screen.

One confirmed P1 (label/position index mismatch after internal sort) pulls the score below the P1 ceiling of 4. The P2 findings (CUDA leak, warnings scope) are non-blocking but worth fixing before shipping.

src/interpreter/overlay/base.py (P1 label mismatch), src/interpreter/translate.py (P2 CUDA/warnings scope)

Important Files Changed

Filename	Overview
src/interpreter/overlay/base.py	Adds `arrange_overlay_regions` for overlap-aware inplace label layout, but contains a P1 bug: `arranged` is returned in sorted (-x, y) order while `label_entries` retains insertion order, causing index-based `zip` to assign each label the wrong position.
src/interpreter/translate.py	Refactors into `JapaneseTranslator`/`ChineseTranslator` with a language-aware `Translator` wrapper; `warnings.catch_warnings()` scope exits before the heavier `AutoTokenizer.from_pretrained` call, and the CUDA fallback path leaves a GPU translator object allocated before overwriting it.
src/interpreter/ocr_rapid.py	New `ChineseOCR` backend using RapidOCR; duplicates the `bgra_to_rgb` helper already defined in `ocr.py` (previously flagged).
src/interpreter/ocr.py	Renames `OCR` to `JapaneseOCR` and adds a language-aware `OCR` façade that delegates to `JapaneseOCR` or `ChineseOCR`; straightforward delegation pattern with no issues.
src/interpreter/config.py	Adds `SourceLanguage` enum, `normalize_source_language`, and model-name helpers; serialization round-trip and fallback handling look correct.
src/interpreter/gui/main_window.py	Adds language combo, dynamic model-name labels, and extracts `_restart_process_worker`; the `_fixing_ocr` flag is still not set for the Chinese OCR failure path (previously flagged).
src/interpreter/gui/workers.py	Adds `contains_han`, `set_source_language`, and language-gated translation-skip logic; clean and correct.
tests/test_overlay_layout.py	New tests verify `arrange_overlay_regions` return values by `text` key but don't catch the index-based `zip` mismatch in the caller that causes the P1 bug.
tests/test_translate.py	Adds boilerplate version patching; existing translate tests unchanged and look correct.
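
The `warnings.catch_warnings()` scope finding for `translate.py` can be illustrated with the standard library alone. This is a minimal sketch, not the project's code: `load_config` and `load_tokenizer` are hypothetical stand-ins that only emit a warning, with `load_tokenizer` playing the role of the heavier `AutoTokenizer.from_pretrained` call. An outer recording context is used purely to observe which warnings escape the suppression block.

```python
import warnings


def load_config() -> str:
    # Hypothetical stand-in for a light loading step that emits a warning.
    warnings.warn("config warning", UserWarning)
    return "config"


def load_tokenizer() -> str:
    # Hypothetical stand-in for the heavier tokenizer load.
    warnings.warn("tokenizer warning", UserWarning)
    return "tokenizer"


def load_narrow_scope() -> list[str]:
    """Suppression ends before the tokenizer load, so its warning escapes."""
    with warnings.catch_warnings(record=True) as leaked:
        warnings.simplefilter("always")
        with warnings.catch_warnings():
            warnings.simplefilter("ignore")
            load_config()
        load_tokenizer()  # outside the suppression scope
    return [str(w.message) for w in leaked]


def load_wide_scope() -> list[str]:
    """Both loads run inside the suppression scope, so nothing escapes."""
    with warnings.catch_warnings(record=True) as leaked:
        warnings.simplefilter("always")
        with warnings.catch_warnings():
            warnings.simplefilter("ignore")
            load_config()
            load_tokenizer()
    return [str(w.message) for w in leaked]
```

Under these assumptions, widening the `with` block to cover every loading call is the only change needed to keep startup output quiet.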

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Screen Capture] --> B{source_language}
    B -->|JAPANESE| C[JapaneseOCR / MeikiOCR]
    B -->|CHINESE| D[ChineseOCR / RapidOCR]
    C --> E[OCR regions list]
    D --> E
    E --> F{contains_japanese / contains_han filter}
    F -->|pass| G{Translator wrapper}
    F -->|skip| H[emit empty]
    G -->|JAPANESE| I[JapaneseTranslator / Sugoi V4]
    G -->|CHINESE| J[ChineseTranslator / OPUS-MT zh-en]
    I --> K[translated regions]
    J --> K
    K --> L{overlay_mode}
    L -->|BANNER| M[BannerOverlay]
    L -->|INPLACE| N[arrange_overlay_regions sort by -x,y]
    N --> O["zip(label_entries, arranged) order mismatch"]
    O --> P[InplaceOverlay QLabels]

Reviews (2): Last reviewed commit: "chore: quiet Chinese model startup warni..."

greptile-apps · 2026-05-02T01:00:30Z

        failed = self._process_worker.get_failed_models()
-        if "ocr" in failed:
+        if "ocr" in failed and self._source_language == SourceLanguage.JAPANESE:
            delete_model_cache("rtr46/meiki.text.detect.v0")
            delete_model_cache("rtr46/meiki.txt.recognition.v0")
            self._fixing_ocr = True


_fixing_ocr never set for Chinese OCR failures

When OCR fails in Chinese mode and the user clicks "Fix Models", _fixing_ocr remains False because there is no branch for SourceLanguage.CHINESE. The worker restarts and the ocr_status signal fires "loading", but the status label path that shows "Downloading..." (guarded by _fixing_ocr) is skipped, so the label jumps directly from its error state to "Loading...". Setting self._fixing_ocr = True when Chinese OCR is in the failed set would keep the labels consistent.
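
One way to keep the flag consistent across languages is to separate "which caches to delete" from "whether an OCR fix is in progress". The sketch below is an illustration under stated assumptions, not the project's implementation: `plan_ocr_fix` and `OCR_MODEL_CACHES` are hypothetical names, the enum values are assumed, and the Chinese cache list is left empty because the review does not name a cache to delete for RapidOCR.

```python
from enum import Enum


class SourceLanguage(Enum):
    # Mirrors the enum named in the diff; the member values are assumed.
    JAPANESE = "ja"
    CHINESE = "zh"


# Japanese cache ids are copied from the diff above. The Chinese entry is
# intentionally empty: no RapidOCR cache id is given in the review.
OCR_MODEL_CACHES = {
    SourceLanguage.JAPANESE: [
        "rtr46/meiki.text.detect.v0",
        "rtr46/meiki.txt.recognition.v0",
    ],
    SourceLanguage.CHINESE: [],
}


def plan_ocr_fix(failed: set[str], language: SourceLanguage) -> tuple[list[str], bool]:
    """Return (caches to delete, value for the _fixing_ocr flag)."""
    if "ocr" not in failed:
        return [], False
    # The flag is set for every language; only the cache list differs.
    return list(OCR_MODEL_CACHES.get(language, [])), True
```

A caller could then delete each returned cache and assign the boolean to `self._fixing_ocr`, so the "Downloading..." label path is reached in Chinese mode as well.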

greptile-apps · 2026-05-02T01:00:31Z

+from pathlib import Path
+
+ROOT = Path(__file__).resolve().parents[1]
+SRC = ROOT / "src"


Duplicate import importlib.metadata

importlib.metadata is imported on line 5 and again on line 13. The second import is redundant. The same pattern appears in tests/test_language_backends.py (lines 5 and 13 in that file).

greptile-apps · 2026-05-04T17:34:55Z

+            content_offset=content_offset,
+        )
+
+        for (label, _, _), layout in zip(label_entries, arranged, strict=False):


Label-to-position mismatch due to internal sort in arrange_overlay_regions

arrange_overlay_regions sorts the input regions list internally by (-x, y) before computing positions, so the returned arranged list is in a different order than label_entries. The zip(label_entries, arranged) on line 411 pairs by index, meaning label for region A ends up receiving the position computed for region B.

Concrete example: two regions at x=100 and x=200 — label_entries keeps the original order [A(x=100), B(x=200)], but arranged is [B_layout, A_layout] (sorted by -x). The zip assigns label_A.move(B_layout) and label_B.move(A_layout), swapping both overlays on screen.

Fix: match by text field rather than relying on positional alignment:

    layout_by_text = {item["text"]: item for item in arranged}
    for label, text, _ in label_entries:
        if text in layout_by_text:
            layout = layout_by_text[text]
            label.move(layout["x"], layout["y"])
            label.show()
            self._labels.append(label)
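
Keying by text can collide when two regions carry the same translated string. A possible alternative is to carry each region's original index through the sort and match on that. The sketch below assumes a simplified layout function: `arrange_with_index`, `pair_by_zip`, and `pair_by_index` are hypothetical helpers that only reproduce the `(-x, y)` ordering described above, not the real stacking logic of `arrange_overlay_regions`.

```python
def arrange_with_index(regions: list[dict]) -> list[dict]:
    """Sort regions by (-x, y) and tag each layout with its original index."""
    order = sorted(
        range(len(regions)),
        key=lambda i: (-regions[i]["x"], regions[i]["y"]),
    )
    return [{"index": i, "x": regions[i]["x"], "y": regions[i]["y"]} for i in order]


def pair_by_zip(labels: list[str], regions: list[dict]) -> list[tuple[str, int, int]]:
    """Positional pairing: reproduces the mismatch after the internal sort."""
    arranged = arrange_with_index(regions)
    return [(label, item["x"], item["y"]) for label, item in zip(labels, arranged)]


def pair_by_index(labels: list[str], regions: list[dict]) -> list[tuple[str, int, int]]:
    """Index pairing: each label gets the layout computed for its own region."""
    by_index = {item["index"]: item for item in arrange_with_index(regions)}
    return [
        (label, by_index[i]["x"], by_index[i]["y"])
        for i, label in enumerate(labels)
    ]


# Two regions with identical text, so a text-keyed lookup could not tell them apart.
regions = [
    {"text": "OK", "x": 100, "y": 10},  # region A
    {"text": "OK", "x": 200, "y": 10},  # region B
]
labels = ["label_A", "label_B"]
```

With this input, `pair_by_zip` hands `label_A` the position of region B and vice versa, while `pair_by_index` keeps each label on its own region even though both texts are equal.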

itrejomx added 6 commits May 1, 2026 15:11

feat: add chinese to english offline translation support

5d09b40

Merge branch 'chinese-support'

e67f716

feat: clarify active models in status panel

18cee58

fix: normalize selected source language values

cb563b1

fix: stack overlapping inplace translations

7b794dc

fix: order inplace translations by vertical reading flow

8e64c5d

greptile-apps Bot reviewed May 2, 2026

chore: quiet Chinese model startup warnings

d1d139d

greptile-apps Bot reviewed May 4, 2026

itrejomx wants to merge 7 commits into bquenin:main from itrejomx:main
