Instant Selection, Smart OCR and UI/UX polish #62

Open
Art-ovv wants to merge 13 commits into AKS-Labs:main from Art-ovv:art-ovv

Art-ovv commented Jun 12, 2026

Overview

This PR is a major overhaul of the text selection and translation pipelines. The main goals are to bring text selection closer to the Google Lens experience (Instant Selection), improve OCR accuracy for mixed Cyrillic/Latin text, fix rendering bugs in the translation module, and eliminate UI stuttering.

Key Features and Improvements

Instant Text Selection

  • Auto-Trigger: Text analysis now triggers automatically in the background when the overlay is launched. The "Select Text" button has been removed.
  • Seamless Overlay: The selection view is cleanly integrated into the background. Tapping on recognized text highlights it, while tapping outside falls through to the drawing canvas.
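
The tap behaviour described above amounts to a hit test against the recognized word boxes. The following is a minimal sketch in plain Java, not the code from this PR: `TapHitTest`, `WordBox`, and `findWordAt` are hypothetical names, and the real overlay would run this logic inside its touch handler.

```java
import java.util.List;

/** Hypothetical sketch: decide whether a tap selects a word or falls through. */
final class TapHitTest {

    /** Bounding box of one recognized word (assumed shape; right/bottom exclusive). */
    static final class WordBox {
        final String text;
        final int left, top, right, bottom;

        WordBox(String text, int left, int top, int right, int bottom) {
            this.text = text;
            this.left = left;
            this.top = top;
            this.right = right;
            this.bottom = bottom;
        }

        boolean contains(int x, int y) {
            return x >= left && x < right && y >= top && y < bottom;
        }
    }

    /**
     * Returns the word under the tap, or null when the tap hit no text.
     * A null result means the selection layer should not consume the event,
     * so it can reach the drawing canvas underneath.
     */
    static WordBox findWordAt(List<WordBox> words, int x, int y) {
        for (WordBox word : words) {
            if (word.contains(x, y)) {
                return word;
            }
        }
        return null;
    }
}
```

A touch handler built on this would highlight the returned word and report the event as handled, or report it as unhandled when the result is null.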

OCR Accuracy & Smart Pre-processing

  • Dual-Language Support: Tesseract now runs in dual-language 'rus+eng' mode by default, so mixed Russian/English text is parsed against both dictionaries.
  • Script Un-mixer: Implemented a custom post-processing algorithm to fix Cyrillic/Latin dictionary collisions, greatly improving Cyrillic text recognition reliability.
  • Smart Binarization: The pipeline calculates average image luminance. If a Dark Mode UI is detected, the image is automatically inverted and binarized, improving accuracy on faint text and colored chat bubbles.
  • Multi-Column Clustering: Upgraded the line-clustering algorithm to respect horizontal proximity. Side-by-side elements (like chat bubbles) are no longer merged into single unreadable lines.
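
As an illustration of the Smart Binarization step, here is a minimal sketch in plain Java that works on an ARGB pixel array. It is not the PR's implementation: the class name, the Rec. 601 luma weights, the midpoint test for "dark", and the use of the mean as the binarization threshold are all assumptions made for the example.

```java
/** Hypothetical sketch: luminance-based inversion and binarization for OCR input. */
final class SmartBinarizer {
    static final int BLACK = 0xFF000000;
    static final int WHITE = 0xFFFFFFFF;

    /** Integer luma (0..255) using Rec. 601 weights; an assumed choice. */
    static int luma(int argb) {
        int r = (argb >> 16) & 0xFF;
        int g = (argb >> 8) & 0xFF;
        int b = argb & 0xFF;
        return (299 * r + 587 * g + 114 * b) / 1000;
    }

    /** Mean luma over all pixels; returns 0 for an empty image. */
    static double averageLuma(int[] pixels) {
        if (pixels.length == 0) {
            return 0;
        }
        long sum = 0;
        for (int p : pixels) {
            sum += luma(p);
        }
        return (double) sum / pixels.length;
    }

    /**
     * Produces a black-on-white image. If the mean luma is below the midpoint,
     * the image is treated as dark mode and inverted before thresholding.
     */
    static int[] binarize(int[] pixels) {
        double mean = averageLuma(pixels);
        boolean darkBackground = mean < 128.0;
        // Inverting every pixel also mirrors the mean around 255.
        double threshold = darkBackground ? 255.0 - mean : mean;
        int[] out = new int[pixels.length];
        for (int i = 0; i < pixels.length; i++) {
            int y = luma(pixels[i]);
            if (darkBackground) {
                y = 255 - y;
            }
            out[i] = y < threshold ? BLACK : WHITE;
        }
        return out;
    }
}
```

A global mean threshold is the simplest possible choice; an adaptive or Otsu threshold would likely cope better with gradients and colored chat bubbles.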
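
The Multi-Column Clustering idea can be sketched the same way: two word boxes join the same line only if they sit on the same row and the horizontal gap between them is small. The class, the `maxGapInHeights` parameter, and the specific tolerances below are assumptions for illustration, not the algorithm used in this PR.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch: group word boxes into lines without merging distant columns. */
final class LineClusterer {

    static final class Box {
        final int left, top, right, bottom;

        Box(int left, int top, int right, int bottom) {
            this.left = left;
            this.top = top;
            this.right = right;
            this.bottom = bottom;
        }

        int centerY() { return (top + bottom) / 2; }
        int height() { return bottom - top; }
    }

    /**
     * Greedy clustering. Boxes are visited left to right, so the last box of
     * each line is always its rightmost one. maxGapInHeights is the largest
     * allowed horizontal gap, expressed as a multiple of the box height.
     */
    static List<List<Box>> cluster(List<Box> boxes, double maxGapInHeights) {
        List<Box> sorted = new ArrayList<>(boxes);
        sorted.sort(Comparator.comparingInt((Box b) -> b.left));

        List<List<Box>> lines = new ArrayList<>();
        for (Box box : sorted) {
            List<Box> target = null;
            for (List<Box> line : lines) {
                Box last = line.get(line.size() - 1);
                int tolerance = Math.min(last.height(), box.height()) / 2;
                boolean sameRow = Math.abs(last.centerY() - box.centerY()) <= tolerance;
                int gap = box.left - last.right;
                boolean near = gap <= maxGapInHeights * Math.max(last.height(), box.height());
                if (sameRow && near) {
                    target = line;
                    break;
                }
            }
            if (target == null) {
                target = new ArrayList<>();
                lines.add(target);
            }
            target.add(box);
        }
        // Return lines in reading order: top to bottom, then left to right.
        lines.sort(Comparator.comparingInt((List<Box> l) -> l.get(0).top)
                .thenComparingInt(l -> l.get(0).left));
        return lines;
    }
}
```

With a purely vertical criterion, two chat bubbles at the same height would end up in one line; the gap check is what keeps them apart.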

Performance & Memory

  • Tesseract Caching: Removed redundant image splitting logic and cached the Tesseract instance in memory. This avoids reloading language models from disk on every scan; the commit messages cite a 5-10x speedup in extraction.
  • Optimized Rendering: Completely rewrote the text selection rendering loop. Heavy object allocations were moved out of onDraw into a state-update method, which prevents garbage collection churn and keeps the UI smooth.
  • Bitmap GC Safety: Removed aggressive manual .recycle() calls tied to Compose states to fix random crashes related to recycled bitmaps.
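
The caching idea can be shown without any Tesseract dependency. The sketch below is generic and hypothetical: `CachedEngine` and its `loader` function stand in for whatever creates and initializes the real engine, and it only demonstrates that the expensive load runs once per language string.

```java
import java.util.function.Function;

/** Hypothetical sketch: keep one initialized OCR engine alive between scans. */
final class CachedEngine<E> {
    private final Function<String, E> loader; // expensive: reads models from disk
    private String cachedLanguages;
    private E cachedEngine;

    CachedEngine(Function<String, E> loader) {
        this.loader = loader;
    }

    /** Returns the cached engine, reloading only when the language set changes. */
    synchronized E get(String languages) {
        if (cachedEngine == null || !languages.equals(cachedLanguages)) {
            // A real engine would also need to be released here before replacement.
            cachedEngine = loader.apply(languages);
            cachedLanguages = languages;
        }
        return cachedEngine;
    }

    /** Drops the cached engine, for example under memory pressure. */
    synchronized void clear() {
        cachedEngine = null;
        cachedLanguages = null;
    }
}
```

Repeated calls such as `cache.get("rus+eng")` then return the same instance, and the loader runs again only after `clear()` or a change of language string.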

Translation Quality

  • On-Device Screen Translation: Screen translation is now fully functional. By default it translates text into the system language; the target language can be changed at any time from the three-dot menu.
  • Accurate Background Sampling: Added outward padding to the color sampler. The scanner now successfully reads the true background color instead of accidentally sampling the font's ink.
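
One way to picture the padded sampler: instead of reading pixels on the text box itself, read a ring a few pixels outside it and take the most common color. This is a minimal sketch under stated assumptions (inclusive box coordinates, a non-empty ARGB pixel array in row-major order, "most frequent color" as the selection rule); the names are hypothetical and the PR may sample differently.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: sample the background color just outside a text box. */
final class BackgroundSampler {

    /**
     * Returns the most frequent color on the rectangle that lies `padding`
     * pixels outside the given box. Box coordinates are inclusive and the
     * expanded rectangle is clamped to the image bounds.
     */
    static int sampleBackground(int[] pixels, int width, int height,
                                int left, int top, int right, int bottom,
                                int padding) {
        int l = Math.max(0, left - padding);
        int t = Math.max(0, top - padding);
        int r = Math.min(width - 1, right + padding);
        int b = Math.min(height - 1, bottom + padding);

        Map<Integer, Integer> counts = new HashMap<>();
        // Top and bottom edges of the ring.
        for (int x = l; x <= r; x++) {
            counts.merge(pixels[t * width + x], 1, Integer::sum);
            counts.merge(pixels[b * width + x], 1, Integer::sum);
        }
        // Left and right edges, skipping the corners already counted.
        for (int y = t + 1; y < b; y++) {
            counts.merge(pixels[y * width + l], 1, Integer::sum);
            counts.merge(pixels[y * width + r], 1, Integer::sum);
        }

        int best = 0;
        int bestCount = -1;
        for (Map.Entry<Integer, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > bestCount) {
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        return best;
    }
}
```

With `padding` set to 0 the ring coincides with the box edge, which is where glyph strokes tend to land; a small positive padding moves the samples onto the surrounding background.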

UI and UX Polish

  • Auto-Hide Panels: The top header and bottom control bar now smoothly slide out of the way when starting to draw or resize a selection.
  • Tap-to-Toggle: Tapping an empty background area now toggles the visibility of the UI panels.
  • Code Cleanup: Resolved deprecation warnings from legacy Android APIs.

art-ovv added 13 commits June 12, 2026 23:08
Include dependencies for Text Recognition, Translation, and Language Identification. Added kotlinx-coroutines-play-services for Task API support.
Implement automated screen translation pipeline using ML Kit Translation and Language ID. Added intelligent background color sampling and WCAG-compliant text contrast calculation.
Create a dedicated dialog for selecting target translation languages, supporting all ML Kit languages with a searchable interface.
Add logic to automatically detect and highlight interactive entities on the screen using OCR data and regex patterns.
Enable automatic text recognition when the overlay opens. Added Tesseract caching for 5-10x performance boost and simplified the scanning UI.
Switch to dual-language 'rus+eng' mode by default. Added a character un-mixer and refined vertical/horizontal clustering to prevent word fragmentation.
Ensure all inline documentation is in English. Cleaned up redundant code in AccessibilityService and updated UI preferences.
Improved Cyrillic character disambiguation with a custom un-mixer. Refined line clustering with multi-column detection and added intelligent case correction for OCR errors.
Updated WCAG contrast formula to correctly alternate between light and dark text based on background luminance. Added background sampling padding to avoid font-edge color bleeding.
Implemented allocation-free drawing logic with Canvas scaling and Path caching in CopyTextOverlayManager. Refined bitmap lifecycle management in OverlayActivity.
Panels now automatically slide out of the way during drawing or resizing and return upon release. Added background tap gesture to manually toggle top and bottom bar visibility.
Optimized Tesseract scanning by implementing smart binarization, automatic inversion for dark mode, and high-contrast preprocessing. Added TessBaseAPI instance caching and removed redundant tiling for a 5-10x speed boost. All remaining Russian code comments have been translated to English.
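
Two of the commit messages above mention a WCAG-based contrast calculation that alternates between light and dark text depending on background luminance. For reference, the standard WCAG 2.x relative luminance and contrast ratio formulas look like this in plain Java. The class and method names are hypothetical, and choosing between pure black and pure white is an assumption about how the result is used.

```java
/** Hypothetical sketch: WCAG 2.x contrast ratio and a black/white text picker. */
final class WcagContrast {
    static final int BLACK = 0xFF000000;
    static final int WHITE = 0xFFFFFFFF;

    /** Converts one 8-bit sRGB channel to linear light, per the WCAG 2.x definition. */
    static double linearize(int channel) {
        double c = channel / 255.0;
        return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
    }

    /** Relative luminance in the range 0 (black) to 1 (white). */
    static double relativeLuminance(int argb) {
        double r = linearize((argb >> 16) & 0xFF);
        double g = linearize((argb >> 8) & 0xFF);
        double b = linearize(argb & 0xFF);
        return 0.2126 * r + 0.7152 * g + 0.0722 * b;
    }

    /** Contrast ratio between two colors, from 1 (identical) to 21 (black on white). */
    static double contrastRatio(int colorA, int colorB) {
        double la = relativeLuminance(colorA);
        double lb = relativeLuminance(colorB);
        double lighter = Math.max(la, lb);
        double darker = Math.min(la, lb);
        return (lighter + 0.05) / (darker + 0.05);
    }

    /** Picks whichever of white or black text contrasts more with the background. */
    static int pickTextColor(int background) {
        return contrastRatio(WHITE, background) >= contrastRatio(BLACK, background)
                ? WHITE
                : BLACK;
    }
}
```

Under these formulas a dark background yields white text and a light or saturated yellow background yields black text, which matches the alternating behaviour the commit message describes.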