feat: support image attachments on user messages #572
Add multimodal image input across the chat surfaces:

- New ImageAttachment type carried on user messages, converted to AI SDK image parts at the provider boundary (data URL accepted by Anthropic, Google, and OpenAI-compatible providers).
- Clipboard paste (Ctrl+V) and dragged or typed image file paths become attachments; image path tokens are stripped from the message text.
- ACP image content blocks are collected as attachments; unsupported media types and audio are noted rather than silently dropped.
- modelSupportsVision() heuristic drives a non-blocking warning when the active model likely cannot see images (the image is still sent).
- UserInput shows pending attachments with Ctrl+X to remove the last; UserMessage shows an attached-image count.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
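For readers skimming the thread, the provider-boundary conversion described in this commit can be pictured roughly as below. This is only a sketch: the field names on ImageAttachment, the ContentPart shape, and the helper names toDataUrl and buildUserContent are assumptions for illustration, not the code in this PR.

```typescript
// Hypothetical attachment shape; the PR's real ImageAttachment fields may differ.
interface ImageAttachment {
  mediaType: string; // e.g. "image/png"
  base64: string; // raw base64 payload, without a "data:" prefix
}

// Illustrative content-part shape (an assumption, not the SDK's exported type).
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image"; image: string };

// Build a data URL of the form data:<mediaType>;base64,<payload>.
function toDataUrl(att: ImageAttachment): string {
  return `data:${att.mediaType};base64,${att.base64}`;
}

// Turn message text plus attachments into an ordered list of parts:
// the text first (if any), then one image part per attachment.
function buildUserContent(
  text: string,
  attachments: ImageAttachment[],
): ContentPart[] {
  const parts: ContentPart[] = [];
  if (text.length > 0) {
    parts.push({ type: "text", text });
  }
  for (const att of attachments) {
    parts.push({ type: "image", image: toDataUrl(att) });
  }
  return parts;
}
```

Keeping the attachment as raw base64 until this point means the rest of the submit chain never needs to know about URL formats.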
Resolves the CodeQL "incomplete string escaping" alert: the test built its escaped path by replacing spaces only, leaving any pre-existing backslash unescaped. Escape backslashes first, then spaces, so the encoding is complete for any path. Behavior is unchanged for the temp paths under test. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
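The ordering point in this commit is easy to get backwards, so here is a small sketch of it. The helper names escapePath and unescapePath are hypothetical and not taken from the test file; the only claim is that escaping backslashes before spaces makes the encoding reversible.

```typescript
// Escape backslashes first, then spaces. Doing it in this order means the
// backslashes introduced for spaces are never themselves re-escaped.
function escapePath(path: string): string {
  return path.replace(/\\/g, "\\\\").replace(/ /g, "\\ ");
}

// Inverse: every backslash escapes exactly the one character after it.
function unescapePath(escaped: string): string {
  return escaped.replace(/\\(.)/g, "$1");
}
```

With spaces-only escaping, a path that already contains a backslash followed by a space cannot be told apart from an escaped space once encoded, which is what the alert was flagging.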
Hey @ragini-pandey - this is a brilliant PR. Thank you for building this. It works well and below are mostly some quality of life comments :)
Love this though. Cannot wait to merge! Let me know if there are any other questions or thoughts!
Hey @ragini-pandey - just wondered if my above feedback was all okay for you? Happy to jump in and help if needs be. Looking forward to merging this :)
- Drop the 🖼 emoji from the "image attached" label (use ■ glyph)
- Recognise unquoted macOS dragged paths with backslash-escaped spaces
- Remove the per-attach "model may not support images" warning and the now-dead modelSupportsVision/visionSupported plumbing
- Log a debug line naming the missing clipboard tool (osascript / wl-paste / xclip / powershell) so Ctrl+V is not a silent no-op
- Skip http(s) URLs in extractImageReferences before touching the FS

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
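Two of the review fixes above (backslash-escaped dragged paths, and skipping http(s) URLs before any filesystem access) can be sketched as a pure tokenising step. This is an illustration under assumptions: findImagePathCandidates is a hypothetical name, it does not touch the filesystem, and it is not the PR's extractImageReferences.

```typescript
// Extensions treated as images in this sketch.
const IMAGE_EXT = /\.(png|jpe?g|gif|webp)$/i;

// Return path-like tokens that look like local image files.
// Handles "double quoted", 'single quoted', and unquoted tokens in which
// a backslash escapes the next character (as in macOS dragged paths).
function findImagePathCandidates(input: string): string[] {
  const tokenRe = /"([^"]+)"|'([^']+)'|((?:\\.|[^\s\\])+)/g;
  const out: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = tokenRe.exec(input)) !== null) {
    // Quoted tokens are taken verbatim; unquoted ones are unescaped.
    const candidate =
      m[1] ?? m[2] ?? (m[3] ?? "").replace(/\\(.)/g, "$1");
    // Skip remote URLs before any filesystem check would happen.
    if (/^https?:\/\//i.test(candidate)) continue;
    if (IMAGE_EXT.test(candidate)) out.push(candidate);
  }
  return out;
}
```

A real implementation would presumably follow this with an existence check on disk, which is exactly the step the URL guard is meant to avoid for remote links.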
Thank you @will-lamerton! 🙏 All of these made sense.
Thanks again for the thorough review!
…hments # Conflicts: # source/components/user-input.tsx
- New docs/features/image-attachments.md: clipboard paste (Ctrl+V), drag-and-drop, typed paths (quoted/unquoted/escaped), Ctrl+X to remove, supported formats (PNG/JPEG/GIF/WebP, ≤10 MB), and per-platform clipboard tool requirements
- Add image bindings to the keyboard-shortcuts reference
- Add an "Attaching Images" subsection and reference-table row to the features index
- Add a CHANGELOG entry for the feature

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
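The documented limits (PNG/JPEG/GIF/WebP, ≤10 MB) amount to a simple pre-attach check, sketched below. The names checkImage and MAX_IMAGE_BYTES are hypothetical, and reading "10 MB" as 10 × 1024 × 1024 bytes is an assumption; the docs may mean a decimal megabyte.

```typescript
// Assumption: "10 MB" interpreted as 10 MiB.
const MAX_IMAGE_BYTES = 10 * 1024 * 1024;

// Extension to media type for the four documented formats.
const MEDIA_TYPES = new Map<string, string>([
  ["png", "image/png"],
  ["jpg", "image/jpeg"],
  ["jpeg", "image/jpeg"],
  ["gif", "image/gif"],
  ["webp", "image/webp"],
]);

type ImageCheck =
  | { ok: true; mediaType: string }
  | { ok: false; reason: string };

// Decide whether a file may be attached, based on name and size only.
function checkImage(fileName: string, sizeBytes: number): ImageCheck {
  const ext = fileName.slice(fileName.lastIndexOf(".") + 1).toLowerCase();
  const mediaType = MEDIA_TYPES.get(ext);
  if (mediaType === undefined) {
    return { ok: false, reason: `unsupported format: ${ext}` };
  }
  if (sizeBytes > MAX_IMAGE_BYTES) {
    return { ok: false, reason: "file exceeds the 10 MB limit" };
  }
  return { ok: true, mediaType };
}
```

Returning a reason string instead of throwing fits the "noted rather than silently dropped" behaviour described earlier in the thread.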
Thanks for these changes - brilliant PR :)
Description
Adds multimodal image input to user messages across all chat surfaces (interactive TUI, VS Code prompt path, and ACP). Images are carried as base64 on the internal Message type and converted to AI SDK image parts (a data: URL) at the provider boundary, which Anthropic, Google, and OpenAI-compatible providers all accept.

Highlights:
- ImageAttachment type threaded through the submit chain (useChatHandler, useAppHandlers, app-util, message-builder, chat-input/user-input).
- Clipboard paste via Ctrl+V (macOS osascript, Linux wl-paste/xclip, Windows PowerShell).
- modelSupportsVision() heuristic drives a non-blocking warning when the active model likely can't see images (the image is still sent).
- UserInput lists pending attachments (Ctrl+X removes the last); UserMessage shows an attached-image count.

Type of Change
Testing
Automated Tests
- .spec.ts/tsx files
- pnpm test:all completes successfully

Manual Testing
Checklist
Screen.Recording.2026-06-14.at.6.40.47.PM.mov