Route LiteRT-LM native media inputs #205
Codecov Report
❌ Your patch status has failed because the patch coverage (66.21%) is below the target coverage (70.00%). You can increase the patch coverage or adjust the target coverage.
Additional details and impacted files
@@ Coverage Diff @@
## main #205 +/- ##
==========================================
- Coverage 80.59% 80.48% -0.11%
==========================================
Files 85 85
Lines 11407 11474 +67
==========================================
+ Hits 9193 9235 +42
- Misses 2214 2239 +25
Pull request overview
Adds native multimodal (image/audio) input routing for LiteRT-LM .litertlm bundles by translating LlamaImageContent / LlamaAudioContent parts into LiteRT-LM Conversation API message JSON, and updates docs/tests to reflect the new bundle-native media flow (distinct from llama.cpp mmproj).
Changes:
- Route supported native media shapes (local `path` + encoded `bytes`/blob) through LiteRT-LM conversation message JSON, and reject unsupported shapes early (remote URLs, raw PCM samples, raw RGB-with-dimensions).
- Configure LiteRT-LM engine settings for image-bearing requests (`max_num_images`) and expose `LiteRtLmMediaInput` / `LiteRtLmMediaType` for advanced runtime callers.
- Add/extend unit tests and update README/website docs/changelog to document LiteRT-LM's bundle-native multimodal behavior.
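The early rejection of unsupported media shapes described in the list above can be pictured with a small Dart sketch. It is an illustration only: `MediaPart` and `rejectionReason` are hypothetical names invented here, and the real `LlamaImageContent` / `LlamaAudioContent` classes and the PR's actual validation code may be shaped differently.

```dart
import 'dart:typed_data';

/// Hypothetical stand-in for an image or audio content part.
/// Not the real LlamaImageContent / LlamaAudioContent API.
class MediaPart {
  const MediaPart({
    this.path,
    this.url,
    this.encodedBytes,
    this.pcmSamples,
    this.width,
    this.height,
  });

  final String? path; // local file path
  final String? url; // remote URL
  final Uint8List? encodedBytes; // encoded file bytes (e.g. PNG, WAV)
  final Float32List? pcmSamples; // raw decoded audio samples
  final int? width; // set only for raw RGB buffers
  final int? height; // set only for raw RGB buffers
}

/// Returns null when the part has a supported shape, otherwise a short
/// reason that a caller could raise before generation starts.
String? rejectionReason(MediaPart part) {
  if (part.url != null) return 'remote URLs are not supported';
  if (part.pcmSamples != null) return 'raw PCM samples are not supported';
  if (part.width != null || part.height != null) {
    return 'raw RGB buffers with dimensions are not supported';
  }
  if (part.path == null && part.encodedBytes == null) {
    return 'media part needs a local path or encoded bytes';
  }
  return null;
}

void main() {
  final local = MediaPart(path: '/tmp/cat.png');
  final remote = MediaPart(url: 'https://example.com/cat.png');
  print(rejectionReason(local)); // prints "null"
  print(rejectionReason(remote)); // prints "remote URLs are not supported"
}
```

Returning a reason string instead of throwing keeps the check easy to unit test; a service layer could turn a non-null reason into whatever error type it already uses.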
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| website/docs/platforms/support-matrix.md | Updates support matrix narrative to reflect native LiteRT-LM media input support and remaining limitations. |
| website/docs/guides/multimodal.md | Documents the distinct GGUF projector flow vs .litertlm bundle-native media flow, with a LiteRT-LM example. |
| website/docs/guides/backend-selection.md | Updates capability table and guidance to describe LiteRT-LM native media inputs (and web limitations). |
| test/unit/backends/litert_lm/litert_lm_service_test.dart | Adds tests covering routing of local image parts and pre-generation rejection of unsupported media shapes. |
| test/unit/backends/litert_lm/litert_lm_runtime_test.dart | Adds serialization tests for LiteRtLmMediaInput and validates maxNumImages argument handling. |
| README.md | Updates LiteRT-LM limitations/notes and adds explicit guidance for .litertlm native media parts. |
| lib/src/backends/litert_lm/litert_lm_service.dart | Converts content parts into LiteRT-LM media inputs, validates shapes, and plumbs maxNumImages + media inputs into runtime generation. |
| lib/src/backends/litert_lm/litert_lm_runtime.dart | Introduces LiteRtLmMediaInput/LiteRtLmMediaType, builds Conversation API message JSON with media items, and adds optional visual token budget plumbing. |
| lib/src/backends/litert_lm/litert_lm_runtime_stub.dart | Mirrors new runtime-facing media types/signatures on non-FFI platforms. |
| lib/llamadart.dart | Exports LiteRtLmMediaInput / LiteRtLmMediaType from the public API surface. |
| doc/litert_lm_templates.md | Notes how LiteRT-LM media parts are sent via Conversation API JSON and differ from llama.cpp mmproj. |
| CHANGELOG.md | Records the LiteRT-LM native multimodal input feature and the supported/unsupported shapes. |
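As a rough picture of what the service and runtime rows in the table describe, the sketch below counts image inputs (one plausible way to derive a `maxNumImages` value) and serializes media items into a message map. Everything in it is an assumption made for illustration: `MediaType`, `MediaInput`, `countImages`, `buildUserMessage`, and the JSON keys (`role`, `content`, `type`, `path`, `blob`, `text`) are placeholders, not the actual `LiteRtLmMediaInput` API or the real LiteRT-LM Conversation API schema.

```dart
import 'dart:convert';
import 'dart:typed_data';

/// Hypothetical media kinds; the real LiteRtLmMediaType may differ.
enum MediaType { image, audio }

/// Hypothetical media input carrying either a local path or encoded bytes.
class MediaInput {
  const MediaInput({required this.type, this.path, this.bytes});

  final MediaType type;
  final String? path;
  final Uint8List? bytes;

  /// Placeholder JSON shape: encoded bytes are sent as a base64 string.
  Map<String, Object?> toJson() => {
        'type': type.name,
        if (path != null) 'path': path,
        if (bytes != null) 'blob': base64Encode(bytes!),
      };
}

/// Number of image inputs in a request, usable as an image-count limit.
int countImages(List<MediaInput> inputs) =>
    inputs.where((m) => m.type == MediaType.image).length;

/// Builds one user message with media items followed by the text part.
String buildUserMessage(String text, List<MediaInput> media) => jsonEncode({
      'role': 'user',
      'content': [
        for (final m in media) m.toJson(),
        {'type': 'text', 'text': text},
      ],
    });

void main() {
  final media = [
    MediaInput(type: MediaType.image, path: '/tmp/cat.png'),
    MediaInput(type: MediaType.audio, bytes: Uint8List.fromList([1, 2, 3])),
  ];
  print(countImages(media)); // prints "1"
  print(buildUserMessage('Describe this.', media));
}
```

Deriving the image count from the request itself means text-only requests yield zero, which matches the PR's description of configuring the setting only for image-bearing requests.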
Refs #189
Summary
- […] `LlamaImageContent`/`LlamaAudioContent` path and encoded-byte inputs through the existing Conversation API message JSON
- […] `litert_lm_engine_settings_set_max_num_images` for image-bearing requests and exposes `LiteRtLmMediaInput`/`LiteRtLmMediaType` for advanced runtime callers
- […] `Float32List` audio samples
- […] `mmproj` lifecycle

Scope notes
- […] `loadMultimodalProjector`, `supportsVision`, or `supportsAudio` for `.litertlm` bundles. LiteRT-LM media is bundle-native and does not use an external projector path; reporting true capabilities still needs a reliable runtime/model probe.
- […] `litert_lm_conversation_optional_args_set_visual_token_budget`; this PR adds the binding path but does not add a public `GenerationParams` knob.
- […] `@litert-lm/core` in llamadart.

Validation
- `dart analyze lib/llamadart.dart lib/src/backends/litert_lm/litert_lm_runtime.dart lib/src/backends/litert_lm/litert_lm_runtime_stub.dart lib/src/backends/litert_lm/litert_lm_service.dart test/unit/backends/litert_lm/litert_lm_service_test.dart test/unit/backends/litert_lm/litert_lm_runtime_test.dart`
- `dart test test/unit/backends/litert_lm/litert_lm_service_test.dart test/unit/backends/litert_lm/litert_lm_runtime_test.dart`
- `./tool/docs/validate_links.sh`
- `git diff --check`