Route LiteRT-LM native media inputs #205

Open
leehack wants to merge 5 commits into main from litert-multimodal-input

leehack commented Jun 7, 2026

Owner

Refs #189

Summary

  • routes native LiteRT-LM LlamaImageContent / LlamaAudioContent path and encoded-byte inputs through the existing Conversation API message JSON
  • configures litert_lm_engine_settings_set_max_num_images for image-bearing requests and exposes LiteRtLmMediaInput / LiteRtLmMediaType for advanced runtime callers
  • rejects unsupported media shapes before native generation: remote image URLs, missing payloads, raw RGB image bytes with dimensions, and raw PCM Float32List audio samples
  • updates README, website docs, LiteRT-LM template notes, and changelog to distinguish bundle-native media input from llama.cpp mmproj lifecycle
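
The pre-generation rejection rules listed above can be sketched roughly as follows. This is a minimal illustration only, not the PR's implementation: `PartKind`, `MediaPart`, its fields, and `rejectionReason` are hypothetical stand-ins, not llamadart's `LlamaImageContent` / `LlamaAudioContent` API, and treating "bytes plus width/height" as raw RGB is an assumption made for the example.

```dart
import 'dart:typed_data';

enum PartKind { image, audio }

/// Hypothetical media part; field names are illustrative, not llamadart's.
class MediaPart {
  const MediaPart(
    this.kind, {
    this.path,
    this.url,
    this.bytes,
    this.width,
    this.height,
    this.pcmSamples,
  });

  final PartKind kind;
  final String? path; // local file path
  final String? url; // remote URL (unsupported in this sketch)
  final Uint8List? bytes; // encoded bytes, or raw RGB when dimensions are set
  final int? width;
  final int? height;
  final Float32List? pcmSamples; // raw PCM audio samples
}

/// Returns null when the part can be routed, otherwise the rejection reason.
String? rejectionReason(MediaPart part) {
  if (part.url != null) {
    return 'remote URLs are not supported';
  }
  if (part.pcmSamples != null) {
    return 'raw PCM Float32List samples are not supported';
  }
  // Assumption: bytes that come with explicit dimensions are raw RGB pixels.
  if (part.bytes != null && (part.width != null || part.height != null)) {
    return 'raw RGB image bytes with dimensions are not supported';
  }
  if (part.path == null && part.bytes == null) {
    return 'missing payload: provide a local path or encoded bytes';
  }
  return null;
}

void main() {
  final parts = <MediaPart>[
    const MediaPart(PartKind.image, path: '/tmp/cat.png'),
    const MediaPart(PartKind.image, url: 'https://example.com/cat.png'),
    MediaPart(PartKind.audio, pcmSamples: Float32List(4)),
    const MediaPart(PartKind.audio),
  ];
  for (final part in parts) {
    // Prints "ok" for routable parts and the reason for rejected ones.
    print(rejectionReason(part) ?? 'ok');
  }
}
```

Returning a reason string instead of throwing keeps the sketch easy to test; a real service would more likely raise an error before native generation starts, as the summary describes.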

Scope notes

  • This does not wire loadMultimodalProjector, supportsVision, or supportsAudio for .litertlm bundles. LiteRT-LM media is bundle-native and does not use an external projector path; reporting true capabilities still needs a reliable runtime/model probe.
  • The default visual token budget remains LiteRT-LM's native model-config default. The C ABI exposes litert_lm_conversation_optional_args_set_visual_token_budget; this PR adds the binding path but does not add a public GenerationParams knob.
  • LiteRT-LM web remains text-only through @litert-lm/core in llamadart.

Validation

  • dart analyze lib/llamadart.dart lib/src/backends/litert_lm/litert_lm_runtime.dart lib/src/backends/litert_lm/litert_lm_runtime_stub.dart lib/src/backends/litert_lm/litert_lm_service.dart test/unit/backends/litert_lm/litert_lm_service_test.dart test/unit/backends/litert_lm/litert_lm_runtime_test.dart
  • dart test test/unit/backends/litert_lm/litert_lm_service_test.dart test/unit/backends/litert_lm/litert_lm_runtime_test.dart
  • ./tool/docs/validate_links.sh
  • git diff --check

github-actions Bot commented Jun 7, 2026

Contributor

Chat app preview deployed for d5907b1.

codecov-commenter commented Jun 7, 2026

Codecov Report

❌ Patch coverage is 66.21622% with 25 lines in your changes missing coverage. Please review.
✅ Project coverage is 80.48%. Comparing base (343574c) to head (d5907b1).
⚠️ Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| lib/src/backends/litert_lm/litert_lm_service.dart | 66.07% | 19 Missing ⚠️ |
| lib/src/backends/litert_lm/litert_lm_runtime.dart | 66.66% | 6 Missing ⚠️ |

❌ Your patch status has failed because the patch coverage (66.21%) is below the target coverage (70.00%). You can increase the patch coverage or adjust the target coverage.
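
As a rough check of the figures in this report, the percentages can be recomputed from line counts. The project totals (9193 of 11407 lines on base, 9235 of 11474 on head) are copied from the coverage diff in this report; the patch size of 74 lines is not stated anywhere and is inferred here from 25 missing lines at 66.21622%, so treat it as an assumption. `coveragePercent` is a hypothetical helper, not part of Codecov.

```dart
/// Coverage as a percentage of [hit] covered lines out of [total] lines.
double coveragePercent(int hit, int total) =>
    total == 0 ? 100.0 : hit * 100 / total;

void main() {
  // Project coverage, using hit/line totals from the coverage diff.
  final base = coveragePercent(9193, 11407);
  final head = coveragePercent(9235, 11474);

  // Patch coverage: 25 missing lines are reported; 74 total is inferred.
  const patchLines = 74;
  const patchMissing = 25;
  final patch = coveragePercent(patchLines - patchMissing, patchLines);

  print('base  ${base.toStringAsFixed(4)}%');
  print('head  ${head.toStringAsFixed(4)}%');
  print('patch ${patch.toStringAsFixed(4)}%');
  print('patch meets 70% target: ${patch >= 70.0}');
}
```

Displayed values in the report may differ from these in the last digit, depending on whether the tool rounds or truncates.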

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #205      +/-   ##
==========================================
- Coverage   80.59%   80.48%   -0.11%     
==========================================
  Files          85       85              
  Lines       11407    11474      +67     
==========================================
+ Hits         9193     9235      +42     
- Misses       2214     2239      +25     
| Flag | Coverage Δ | |
| --- | --- | --- |
| unittests | 80.48% <66.21%> (-0.11%) | ⬇️ |

@leehack leehack marked this pull request as ready for review June 7, 2026 02:38
Copilot AI review requested due to automatic review settings June 7, 2026 02:38

Copilot AI left a comment

Pull request overview

Adds native multimodal (image/audio) input routing for LiteRT-LM .litertlm bundles by translating LlamaImageContent / LlamaAudioContent parts into LiteRT-LM Conversation API message JSON, and updates docs/tests to reflect the new bundle-native media flow (distinct from llama.cpp mmproj).

Changes:

  • Route supported native media shapes (local path + encoded bytes/blob) through LiteRT-LM conversation message JSON, and reject unsupported shapes early (remote URLs, raw PCM samples, raw RGB-with-dimensions).
  • Configure LiteRT-LM engine settings for image-bearing requests (max_num_images) and expose LiteRtLmMediaInput / LiteRtLmMediaType for advanced runtime callers.
  • Add/extend unit tests and update README/website docs/changelog to document LiteRT-LM’s bundle-native multimodal behavior.
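
To illustrate the routing described in the changes above, here is a hedged sketch of turning media items into a single user-message JSON string and deriving an image count for a `max_num_images`-style setting. The JSON field names (`role`, `content`, `type`, `text`, `path`, `blob`) and the base64 encoding of byte payloads are assumptions about the message shape, and `MediaKind`, `MediaItem`, `buildUserMessage`, and `countImages` are hypothetical names, not the PR's `LiteRtLmMediaInput` API.

```dart
import 'dart:convert';
import 'dart:typed_data';

enum MediaKind { image, audio }

/// Hypothetical media item carrying either a local path or encoded bytes.
class MediaItem {
  const MediaItem(this.kind, {this.path, this.bytes});

  final MediaKind kind;
  final String? path;
  final Uint8List? bytes;

  /// Serializes to an assumed content-item shape; throws if no payload is set.
  Map<String, Object?> toJson() {
    final p = path;
    final b = bytes;
    if (p != null) return {'type': kind.name, 'path': p};
    if (b != null) return {'type': kind.name, 'blob': base64Encode(b)};
    throw ArgumentError('media item has no payload');
  }
}

/// Builds one user message with a text part followed by the media parts.
String buildUserMessage(String text, List<MediaItem> media) => jsonEncode({
      'role': 'user',
      'content': [
        {'type': 'text', 'text': text},
        for (final item in media) item.toJson(),
      ],
    });

/// Number of image items, usable as an upper bound for an image-count setting.
int countImages(List<MediaItem> media) =>
    media.where((m) => m.kind == MediaKind.image).length;

void main() {
  final media = <MediaItem>[
    const MediaItem(MediaKind.image, path: '/tmp/chart.png'),
    MediaItem(MediaKind.audio, bytes: Uint8List.fromList([1, 2, 3])),
  ];
  print(buildUserMessage('Describe the inputs.', media));
  print('images: ${countImages(media)}');
}
```

Counting images from the same list that is serialized keeps the engine setting and the message content in agreement, which is the relationship the second bullet describes.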

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| website/docs/platforms/support-matrix.md | Updates support matrix narrative to reflect native LiteRT-LM media input support and remaining limitations. |
| website/docs/guides/multimodal.md | Documents the distinct GGUF projector flow vs .litertlm bundle-native media flow, with a LiteRT-LM example. |
| website/docs/guides/backend-selection.md | Updates capability table and guidance to describe LiteRT-LM native media inputs (and web limitations). |
| test/unit/backends/litert_lm/litert_lm_service_test.dart | Adds tests covering routing of local image parts and pre-generation rejection of unsupported media shapes. |
| test/unit/backends/litert_lm/litert_lm_runtime_test.dart | Adds serialization tests for LiteRtLmMediaInput and validates maxNumImages argument handling. |
| README.md | Updates LiteRT-LM limitations/notes and adds explicit guidance for .litertlm native media parts. |
| lib/src/backends/litert_lm/litert_lm_service.dart | Converts content parts into LiteRT-LM media inputs, validates shapes, and plumbs maxNumImages + media inputs into runtime generation. |
| lib/src/backends/litert_lm/litert_lm_runtime.dart | Introduces LiteRtLmMediaInput/LiteRtLmMediaType, builds Conversation API message JSON with media items, and adds optional visual token budget plumbing. |
| lib/src/backends/litert_lm/litert_lm_runtime_stub.dart | Mirrors new runtime-facing media types/signatures on non-FFI platforms. |
| lib/llamadart.dart | Exports LiteRtLmMediaInput / LiteRtLmMediaType from the public API surface. |
| doc/litert_lm_templates.md | Notes how LiteRT-LM media parts are sent via Conversation API JSON and differ from llama.cpp mmproj. |
| CHANGELOG.md | Records the LiteRT-LM native multimodal input feature and the supported/unsupported shapes. |

Comment thread lib/src/backends/litert_lm/litert_lm_runtime.dart