Skip to content

Speed up CropView rotation: GPU-resident source + per-generation content bake#308

Merged
muukii merged 2 commits into
v5from
claude/interesting-bouman-60b4e0
Jun 17, 2026
Merged

Speed up CropView rotation: GPU-resident source + per-generation content bake#308
muukii merged 2 commits into
v5from
claude/interesting-bouman-60b4e0

Conversation

@muukii

@muukii muukii commented Jun 17, 2026

Copy link
Copy Markdown
Collaborator

Summary

Two performance fixes for slow CropView rotation, found and verified with on-device Instruments Time Profiler traces. Both are behind pixel-equivalence tests (the editing canvas renders black on the Simulator, so correctness is guarded via CIContext read-back, and the felt win was confirmed on device).

1. Make the editing source GPU-resident (adaptive bit depth)

The editing source was a CPU-backed CIImage(cgImage:), so Core Image re-uploaded its bitmap CPU→GPU on every canvas render. During rotation the viewport cache invalidates every frame, so the trace showed CIMetalTextureSetBytes / replaceRegion:withBytes: consuming ~40% of all main-thread time.

EditingSourcePreparation.makeGPUResidentSource uploads the oriented source into a persistent private MTLTexture once at load (background queue) and returns a texture-backed CIImage. The texture depth adapts to the source: rgba8Unorm for an 8-bit source (the common case — lossless, half the memory), rgba16Float for >8-bit / HDR.

This re-does what #297 removed, but avoids that PR's downsides (MTKTextureLoader 8-bit quantization / no 16-bit / y-flip) by rendering through a CIContext in the source's own color space — the CIContext.render(to:) / CIImage(mtlTexture:) round-trip is flip-free. EditingImageWarmUp is deleted (building the texture warms the pipeline).

2. Bake the local-effect layer once per generation

With the upload gone, the trace's remaining per-frame cost was Core Image re-evaluating the blur-heavy adjusted graph every frame (create_intermediate / GetSurfaceFromCache — IOSurface allocation under a barrier lock). The effect content is invariant during a gesture; only the viewport moves.

EditingCanvasContentBake.bake renders the adjusted layer into a capped (2560) texture once per render-images generation; the composite path resamples it per frame. The cache is invalidated only on setRenderImages / setViewportCachedSourceEnabled, never on viewport changes. base (global pointwise effects on the now-GPU-resident source) stays a live graph — cheap to re-evaluate, and not baking it halves the held texture memory.

Measured (device, normalized ms per wall-second)

Main-thread cost before after
Canvas draw (MTKView.draw) 102 26
CI graph eval (tile_node_graph) 70 3
IOSurface alloc + barrier lock 58 0
Per-frame source upload (eliminated in step 1) 0

CA::Transaction::commit dropped 158 → 90. The new dominant rotation cost is UIKit layout (updateUIViewControllerupdateCropLayout), tracked as a follow-up.

Memory

Adaptive source depth + baking only adjusted keep the footprint bounded. For an 8-bit source: plain crop/rotate holds just the ~26MB source texture (≈ the pre-change CGImage footprint); a blur-mask gesture adds one ~52MB adjusted bake. The adjusted bake stays rgba16Float deliberately — it is the extended-linear-Display-P3 intermediate, where 8-bit linear bands and can't carry exposure overshoot / wide gamut.

Test plan

  • xcodebuild -scheme BrightroomUI builds.
  • EditingSourceTextureTests — texture-backed source is pixel-equivalent to CIImage(cgImage:).oriented() across 8 orientations × {8,16}-bit (16 cases).
  • EditingCanvasContentBakeTests — bake is a faithful drop-in (offset extent, asymmetric content, within float tolerance; cap respected).
  • Regression-green: ColorContract, CropSurfaceBlurRenderPath, CommittedMaskParametricParity, LocalAdjustmentRendering/MaskOrientation, EditingPreviewExportParity, MaskedPreviewExportScaleConsistency, RendererOrientation.
  • Device re-trace to confirm keeping base un-baked did not re-introduce per-frame cost (expected fine — pointwise on a GPU texture).

Follow-ups (docs/editing-engine-todo.md item 7)

  • UIKit per-frame layout during the angle drag is now the dominant rotation cost.
  • Remaining bake-memory levers (resolution cap, confirmed-SDR formats) if needed.

🤖 Generated with Claude Code

muukii and others added 2 commits June 18, 2026 02:15
The editing source was a CPU-backed `CIImage(cgImage:)`, so Core Image
re-uploaded its bitmap CPU->GPU on every canvas render. A device Time Profiler
trace of CropView rotation showed that upload (`CIMetalTextureSetBytes` /
`replaceRegion:withBytes:`) dominating the main thread (~40% of all main-thread
time): zoom/pan/rotation invalidate the canvas's viewport cache every frame and
force a fresh render of the source.

`EditingSourcePreparation.makeGPUResidentSource` now uploads the oriented source
into a persistent private MTLTexture once at load (background queue) and returns
a texture-backed `CIImage`, so every downstream consumer samples a GPU-resident
texture. The texture depth adapts to the source: `rgba8Unorm` for an 8-bit
source (the common case -- lossless, half the memory), `rgba16Float` for
>8-bit / HDR.

This re-does what #297 removed, but avoids that PR's downsides (MTKTextureLoader
8-bit quantization, no 16-bit support, y-flip bugs): it renders through a
CIContext in the source's own color space, and the CIContext.render(to:) /
CIImage(mtlTexture:) round-trip is flip-free. `EditingImageWarmUp` is deleted --
building the texture warms the Core Image pipeline.

EditingSourceTextureTests verifies pixel-equivalence to
`CIImage(cgImage:).oriented()` across 8 orientations x {8,16}-bit.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
After the source upload was eliminated, the device trace showed the remaining
per-frame cost was Core Image re-evaluating the blur-heavy `adjusted` graph on
every rotation frame (`create_intermediate` / `GetSurfaceFromCache`: IOSurface
allocation under a barrier lock). The effect content is invariant during a
gesture -- only the viewport moves -- but the viewport-keyed prepared-layers
cache missed every frame.

`EditingCanvasContentBake.bake` renders the `adjusted` (local-effect) layer into
a capped (2560) texture once per render-images generation; the composite path
resamples it per frame. The cache is invalidated only on setRenderImages /
setViewportCachedSourceEnabled, never on viewport changes. `base` (global
pointwise effects on the now-GPU-resident source) stays a live graph: cheap to
re-evaluate, and not baking it halves the held texture memory.

Measured on device (normalized ms per wall-second): canvas draw 102 -> 26, CI
graph eval 70 -> 3, IOSurface alloc + barrier 58 -> 0.

EditingCanvasContentBakeTests verifies the bake is a faithful drop-in (same
extent / coordinates, within float tolerance). The UIKit-layout cost that now
dominates rotation, and the remaining bake-memory levers, are tracked in
docs/editing-engine-todo.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@muukii muukii merged commit 03adf1b into v5 Jun 17, 2026
2 checks passed
@muukii muukii deleted the claude/interesting-bouman-60b4e0 branch June 17, 2026 17:21
muukii added a commit that referenced this pull request Jun 17, 2026
## Summary

Fixes a regression where, after changing the zoom in **Crop** mode and
switching to **Blur** (tool/mask) mode, the crop output rendered **small
and pinned to the top-left**. Toggling the Crop/Blur tabs again
corrected it; zooming the other way produced the inverse misplacement.

## Root cause

`makeToolCropDisplayViewport` derives the Metal canvas placement from
**presentation-layer** conversions (`currentLayerRect(...
usesPresentationLayers:)`), gated only on
`toolSurface.isInteractiveZoomGestureActive == false`. This was
introduced in #309 to make the mask canvas follow the post-release zoom
bounce.

The Crop→Blur switch applies the tool viewport **once, synchronously**,
in the same runloop turn that `updateToolScrollGeometry` just
reconfigured the freshly-un-hidden tool scroll view (non-animated
`setZoomScale`, `centerContentInViewport`, etc.):

`setFeatureFocus` → `applySurfaceMode(syncsToolViewportFromCrop: true)`
→ `updateToolScrollGeometry` (model set synchronously) →
`updateToolCropDisplayViewport()` (one-shot, no display link).

At that instant no pinch is active, so `usesPresentationLayers == true`,
but Core Animation has not yet committed the new geometry to the
presentation layers — they still hold the **previous** Blur session's
geometry. The one-shot samples that stale transform → small + top-left
placement, and because no display link runs afterward, the stale frame
sticks until the next mode switch (the "switch back and forth fixes it"
symptom).

`#308` (GPU-resident source / per-generation content bake), originally
suspected, is **not** involved: it never touches `CropView.swift`, and
`EditingCanvasContentBake` maps the bake back to the original `extent`,
so it changes layer *content*, never viewport *placement*.

## Fix

Presentation-layer reads are only meaningful while the viewport display
link is actively re-sampling an in-flight bounce. Gate them on whether
that link is running:

```swift
let usesPresentationLayers = toolSurface.viewportRendering.isRunning
  && toolSurface.isInteractiveZoomGestureActive == false
```

This flips **only** the link-idle one-shot paths (i.e. the mode switch)
back to the model layers, which the synchronous reconfigure already made
authoritative. Every display-link-driven path (interactive pinch,
post-release bounce, deceleration) is unchanged, so #309's bounce
tracking still works.

## Scope / notes

- Scoped to the **tool** path only. `makeCropDisplayViewport` has the
same presentation-read pattern, but the crop path is not a reported
repro and its animator-driven rotation computes the viewport inside a
`UIViewPropertyAnimator` block (where presentation reads are intended),
so it is intentionally left untouched.
- The tool path's animator case early-returns (the tool surface is
hidden in Crop mode), so the gate is safe there.

## Test plan

- [x] `xcodebuild -scheme BrightroomUI -destination 'platform=iOS
Simulator,name=iPhone 17 Pro'` builds.
- [x] Verified on device: Crop zoom change → switch to Blur now displays
the crop output correctly (no small/top-left misplacement, no need to
toggle tabs).
- [ ] Editing canvas renders black on the Simulator, so visual
confirmation is device-only.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant