Speed up CropView rotation: GPU-resident source + per-generation content bake#308
Merged
Conversation
The editing source was a CPU-backed `CIImage(cgImage:)`, so Core Image re-uploaded its bitmap CPU->GPU on every canvas render. A device Time Profiler trace of CropView rotation showed that upload (`CIMetalTextureSetBytes` / `replaceRegion:withBytes:`) dominating the main thread (~40% of all main-thread time): zoom/pan/rotation invalidate the canvas's viewport cache every frame and force a fresh render of the source. `EditingSourcePreparation.makeGPUResidentSource` now uploads the oriented source into a persistent private MTLTexture once at load (background queue) and returns a texture-backed `CIImage`, so every downstream consumer samples a GPU-resident texture. The texture depth adapts to the source: `rgba8Unorm` for an 8-bit source (the common case -- lossless, half the memory), `rgba16Float` for >8-bit / HDR. This re-does what #297 removed, but avoids that PR's downsides (MTKTextureLoader 8-bit quantization, no 16-bit support, y-flip bugs): it renders through a CIContext in the source's own color space, and the CIContext.render(to:) / CIImage(mtlTexture:) round-trip is flip-free. `EditingImageWarmUp` is deleted -- building the texture warms the Core Image pipeline. EditingSourceTextureTests verifies pixel-equivalence to `CIImage(cgImage:).oriented()` across 8 orientations x {8,16}-bit. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
After the source upload was eliminated, the device trace showed the remaining per-frame cost was Core Image re-evaluating the blur-heavy `adjusted` graph on every rotation frame (`create_intermediate` / `GetSurfaceFromCache`: IOSurface allocation under a barrier lock). The effect content is invariant during a gesture -- only the viewport moves -- but the viewport-keyed prepared-layers cache missed every frame. `EditingCanvasContentBake.bake` renders the `adjusted` (local-effect) layer into a capped (2560) texture once per render-images generation; the composite path resamples it per frame. The cache is invalidated only on setRenderImages / setViewportCachedSourceEnabled, never on viewport changes. `base` (global pointwise effects on the now-GPU-resident source) stays a live graph: cheap to re-evaluate, and not baking it halves the held texture memory. Measured on device (normalized ms per wall-second): canvas draw 102 -> 26, CI graph eval 70 -> 3, IOSurface alloc + barrier 58 -> 0. EditingCanvasContentBakeTests verifies the bake is a faithful drop-in (same extent / coordinates, within float tolerance). The UIKit-layout cost that now dominates rotation, and the remaining bake-memory levers, are tracked in docs/editing-engine-todo.md. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
muukii
added a commit
that referenced
this pull request
Jun 17, 2026
## Summary Fixes a regression where, after changing the zoom in **Crop** mode and switching to **Blur** (tool/mask) mode, the crop output rendered **small and pinned to the top-left**. Toggling the Crop/Blur tabs again corrected it; zooming the other way produced the inverse misplacement. ## Root cause `makeToolCropDisplayViewport` derives the Metal canvas placement from **presentation-layer** conversions (`currentLayerRect(... usesPresentationLayers:)`), gated only on `toolSurface.isInteractiveZoomGestureActive == false`. This was introduced in #309 to make the mask canvas follow the post-release zoom bounce. The Crop→Blur switch applies the tool viewport **once, synchronously**, in the same runloop turn that `updateToolScrollGeometry` just reconfigured the freshly-un-hidden tool scroll view (non-animated `setZoomScale`, `centerContentInViewport`, etc.): `setFeatureFocus` → `applySurfaceMode(syncsToolViewportFromCrop: true)` → `updateToolScrollGeometry` (model set synchronously) → `updateToolCropDisplayViewport()` (one-shot, no display link). At that instant no pinch is active, so `usesPresentationLayers == true`, but Core Animation has not yet committed the new geometry to the presentation layers — they still hold the **previous** Blur session's geometry. The one-shot samples that stale transform → small + top-left placement, and because no display link runs afterward, the stale frame sticks until the next mode switch (the "switch back and forth fixes it" symptom). `#308` (GPU-resident source / per-generation content bake), originally suspected, is **not** involved: it never touches `CropView.swift`, and `EditingCanvasContentBake` maps the bake back to the original `extent`, so it changes layer *content*, never viewport *placement*. ## Fix Presentation-layer reads are only meaningful while the viewport display link is actively re-sampling an in-flight bounce. Gate them on whether that link is running: ```swift let usesPresentationLayers = toolSurface.viewportRendering.isRunning && toolSurface.isInteractiveZoomGestureActive == false ``` This flips **only** the link-idle one-shot paths (i.e. the mode switch) back to the model layers, which the synchronous reconfigure already made authoritative. Every display-link-driven path (interactive pinch, post-release bounce, deceleration) is unchanged, so #309's bounce tracking still works. ## Scope / notes - Scoped to the **tool** path only. `makeCropDisplayViewport` has the same presentation-read pattern, but the crop path is not a reported repro and its animator-driven rotation computes the viewport inside a `UIViewPropertyAnimator` block (where presentation reads are intended), so it is intentionally left untouched. - The tool path's animator case early-returns (the tool surface is hidden in Crop mode), so the gate is safe there. ## Test plan - [x] `xcodebuild -scheme BrightroomUI -destination 'platform=iOS Simulator,name=iPhone 17 Pro'` builds. - [x] Verified on device: Crop zoom change → switch to Blur now displays the crop output correctly (no small/top-left misplacement, no need to toggle tabs). - [ ] Editing canvas renders black on the Simulator, so visual confirmation is device-only. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Two performance fixes for slow CropView rotation, found and verified with on-device Instruments Time Profiler traces. Both are behind pixel-equivalence tests (the editing canvas renders black on the Simulator, so correctness is guarded via CIContext read-back, and the felt win was confirmed on device).
1. Make the editing source GPU-resident (adaptive bit depth)
The editing source was a CPU-backed
CIImage(cgImage:), so Core Image re-uploaded its bitmap CPU→GPU on every canvas render. During rotation the viewport cache invalidates every frame, so the trace showedCIMetalTextureSetBytes/replaceRegion:withBytes:consuming ~40% of all main-thread time.EditingSourcePreparation.makeGPUResidentSourceuploads the oriented source into a persistent privateMTLTextureonce at load (background queue) and returns a texture-backedCIImage. The texture depth adapts to the source:rgba8Unormfor an 8-bit source (the common case — lossless, half the memory),rgba16Floatfor >8-bit / HDR.This re-does what #297 removed, but avoids that PR's downsides (
MTKTextureLoader8-bit quantization / no 16-bit / y-flip) by rendering through aCIContextin the source's own color space — theCIContext.render(to:)/CIImage(mtlTexture:)round-trip is flip-free.EditingImageWarmUpis deleted (building the texture warms the pipeline).2. Bake the local-effect layer once per generation
With the upload gone, the trace's remaining per-frame cost was Core Image re-evaluating the blur-heavy
adjustedgraph every frame (create_intermediate/GetSurfaceFromCache— IOSurface allocation under a barrier lock). The effect content is invariant during a gesture; only the viewport moves.EditingCanvasContentBake.bakerenders theadjustedlayer into a capped (2560) texture once per render-images generation; the composite path resamples it per frame. The cache is invalidated only onsetRenderImages/setViewportCachedSourceEnabled, never on viewport changes.base(global pointwise effects on the now-GPU-resident source) stays a live graph — cheap to re-evaluate, and not baking it halves the held texture memory.Measured (device, normalized ms per wall-second)
MTKView.draw)tile_node_graph)CA::Transaction::commitdropped 158 → 90. The new dominant rotation cost is UIKit layout (updateUIViewController→updateCropLayout), tracked as a follow-up.Memory
Adaptive source depth + baking only
adjustedkeep the footprint bounded. For an 8-bit source: plain crop/rotate holds just the ~26MB source texture (≈ the pre-change CGImage footprint); a blur-mask gesture adds one ~52MBadjustedbake. Theadjustedbake staysrgba16Floatdeliberately — it is the extended-linear-Display-P3 intermediate, where 8-bit linear bands and can't carry exposure overshoot / wide gamut.Test plan
xcodebuild -scheme BrightroomUIbuilds.EditingSourceTextureTests— texture-backed source is pixel-equivalent toCIImage(cgImage:).oriented()across 8 orientations × {8,16}-bit (16 cases).EditingCanvasContentBakeTests— bake is a faithful drop-in (offset extent, asymmetric content, within float tolerance; cap respected).baseun-baked did not re-introduce per-frame cost (expected fine — pointwise on a GPU texture).Follow-ups (docs/editing-engine-todo.md item 7)
🤖 Generated with Claude Code