Skip to content

refactor: split CPUBackend::Render() and CUDABackend::Render()#53

Merged
Cle2ment merged 5 commits into
masterfrom
fix/issue-48-split-large-functions
Jun 18, 2026
Merged

refactor: split CPUBackend::Render() and CUDABackend::Render()#53
Cle2ment merged 5 commits into
masterfrom
fix/issue-48-split-large-functions

Conversation

@Cle2ment

Copy link
Copy Markdown
Owner

Closes #48. Extract RenderISPC/RenderCPUFallback (CPUBackend) and UploadSceneAndBVHIfChanged/CopyInteropToVulkan (CUDABackend).

@github-actions

Copy link
Copy Markdown
Contributor

PR Review: refactor/split-large-functions (#48)

Overall: Clean, mechanical refactoring. No behavioral changes introduced. One pre-existing optimization opportunity noted. Ready to merge as-is.


Issues Found

1. m_AccumFrameIndex hoisted correctly — subtle improvement
CPUBackend.cpp:50: Previously set only inside the #else branch (CPU fallback); now set unconditionally before branching. The ISPC path uses frameIndex directly as a parameter, so no functional change — but this fixes a latent asymmetry where m_AccumFrameIndex was stale during ISPC rendering. Good catch.

2. BVH SoA packing runs every frame in RenderISPC() (pre-existing)
CPUBackend.cpp:135-162: BVH node data is packed into SoA arrays every frame without a dirty check. Materials/spheres are guarded by m_LastISPCSceneVersion (line 93), but BVH packing is not. Since m_BVH only rebuilds on scene version change, this is wasted work on most frames. Extracting RenderISPC makes this more noticeable but doesn't introduce the issue.

Suggestion (follow-up, not blocking): Guard BVH packing with the same scene version check, or track a separate BVH version.

// BVH packing only when BVH actually changed
if (scene.Version != m_LastISPCSceneVersion) { /* BVH packing here */ }

3. Minor formatting in CUDABackend.h:70-71
No blank line between the CopyInteropToVulkan() declaration and the // ── CUDA state ── comment block that belongs to member variables below. The comments visually collide:

	void CopyInteropToVulkan();
	// ── CUDA state ──          ← reads as belonging to the function above

Add a blank line between them for visual separation.

4. reinterpret_cast improvement (line 214)
The C-style cast (cudaStream_t) was replaced with reinterpret_cast<cudaStream_t> — consistent with PR #18's cast elimination standards.


Verdict

No bugs, no regressions, good extraction boundaries. The three new private methods (RenderISPC, RenderCPUFallback, UploadSceneAndBVHIfChanged, CopyInteropToVulkan) each have a single clear responsibility. PR description matches the actual changes.

New%20session%20-%202026-06-18T17%3A01%3A21.369Z
opencode session  |  github run

@github-actions

Copy link
Copy Markdown
Contributor

My review aligns with the existing PR comment. Here's my analysis:

Verdict: Clean refactoring. No bugs, no regressions.

Good:

Pre-existing issue (now more visible):

BVH SoA packing in RenderISPC() (CPUBackend.cpp:135-162) runs every frame without a dirty check. Materials/spheres are guarded by m_LastISPCSceneVersion at line 93, but BVH packing is not. Since m_LastBvhSceneVersion and m_LastISPCSceneVersion are always in sync (both track scene.Version), the fix is a one-line move — hoist the BVH packing block inside the existing if (scene.Version != m_LastISPCSceneVersion) guard.

Minor nit: RenderCPUFallback() lines 225-226 have inconsistent indentation compared to line 223 (2 extra spaces). Pre-existing, but now more noticeable since it's its own function.

New%20session%20-%202026-06-18T17%3A07%3A01.876Z
opencode session  |  github run

Cle2ment added 2 commits June 19, 2026 01:30
…#48)

CPUBackend: extract RenderISPC() and RenderCPUFallback() private methods.
CUDABackend: extract UploadSceneAndBVHIfChanged() and CopyInteropToVulkan().

Render() sizes: CPUBackend 205→30 lines, CUDABackend 192→110 lines.
@Cle2ment Cle2ment force-pushed the fix/issue-48-split-large-functions branch from cf0beb7 to c97c698 Compare June 18, 2026 17:30
@github-actions

Copy link
Copy Markdown
Contributor

PR Review: fix/issue-48-split-large-functions

Verdict: Clean, mechanical refactoring. No regressions. Ready to merge.


Positive observations

  • m_AccumFrameIndex hoisted correctly (CPUBackend.cpp:50). Previously set only inside the #else fallback branch; now set unconditionally before branching. The ISPC path used frameIndex directly as a parameter so no behavioral change, but this fixes a latent asymmetry where m_AccumFrameIndex was stale during ISPC rendering. Good catch.

  • reinterpret_cast fix (CUDABackend.cpp:214). C-style (cudaStream_t) replaced with reinterpret_cast<cudaStream_t>, consistent with PR refactor: phase_6_code_quality — C_style_casts_to_named_iota_constexpr_centralized_constants #18's cast elimination standards.

  • Extraction boundaries are well-chosen. Each new private method (RenderISPC, RenderCPUFallback, UploadSceneAndBVHIfChanged, CopyInteropToVulkan) has a single, clear responsibility.

  • Blank line fix in commit c97c698 — the formatting issue between CopyInteropToVulkan() and the // ── CUDA state ── comment is resolved.


Issues found

1. Indentation regression in RenderCPUFallback()CPUBackend.cpp:225-226

Lines 225-226 lost one level of indentation during extraction. They sit at the accumulatedColor /= level but should be nested inside the const glm::vec4 color = PerPixel(...) lambda body, matching the indentation of lines 220-222:

				const glm::vec4 color = PerPixel(x, y);
				// ...
				m_AccumulationData[idx] += color;           // correct (5 levels)
				glm::vec4 accumulatedColor = m_AccumulationData[idx];  // correct
				accumulatedColor /= static_cast<float>(m_AccumFrameIndex); // correct

		accumulatedColor = glm::clamp(...);                         // WRONG (4 levels, should be 5)
		outputBuffer[idx] = PathTracerCore::ConvertToRGBA(...);   // WRONG

The old code wrapped these across multiple lines which masked the indentation: accumulatedColor = glm::clamp( on one line and the lines below were visually nested. When compressed to single lines, the indent wasn't adjusted. Fix: add one more indent level to lines 225-226.

2. BVH SoA packing runs every frame in RenderISPC()CPUBackend.cpp:135-162 (pre-existing, now more visible)

BVH node data is packed into SoA arrays unconditionally every frame. Materials/spheres are guarded by m_LastISPCSceneVersion (line 93), but BVH packing is outside that guard. Since m_LastBvhSceneVersion and m_LastISPCSceneVersion both track scene.Version and are always in sync, the fix is a one-line move: hoist the BVH packing block inside the existing if (scene.Version != m_LastISPCSceneVersion) check at line 93. Not blocking, but worth a follow-up.


Summary

No bugs, no behavioral regressions. Fix the indentation on lines 225-226 and optionally guard BVH packing with the scene version check for a minor perf win. Otherwise good to go.

New%20session%20-%202026-06-18T17%3A31%3A08.431Z
opencode session  |  github run

…n (CI fix)

After extracting RenderCPUFallback(), outputBuffer became a function parameter instead of being in the outer scope. The inner lambda at line 215 needs explicit capture.
@github-actions

Copy link
Copy Markdown
Contributor

I've read all four changed files. Here's my review:

Verdict: Clean mechanical refactoring. Two real issues, one pre-existing improvement opportunity.

1. Indentation regression in RenderCPUFallback()CPUBackend.cpp:225-226

Lines 225-226 have 2 tabs (4 levels) of indentation when they should have 6 tabs (6 levels) to match the surrounding lambda body:

// line 223 (correct — 6 tabs)
						accumulatedColor /= static_cast<float>(m_AccumFrameIndex);

// lines 225-226 (wrong — 2 tabs, should be 6)
		accumulatedColor = glm::clamp(accumulatedColor, glm::vec4(0.0f), glm::vec4(1.0f));
		outputBuffer[idx] = PathTracerCore::ConvertToRGBA(accumulatedColor);

These were likely continuation lines in the original code that got flattened during extraction without adjusting the indent. Compare with the non-multithreaded path (lines 237-238) which has correct indentation.

2. BVH SoA packing runs every frame in RenderISPC()CPUBackend.cpp:135-162 (pre-existing)

BVH node data is packed unconditionally every frame. Material/sphere packing is guarded by m_LastISPCSceneVersion (line 93), but BVH packing sits outside that guard. Since m_BVH.Build() only runs on scene version change (lines 51-55), BVH data is identical on consecutive frames — this is wasted work. The fix is a one-line move: hoist lines 135-162 inside the if (scene.Version != m_LastISPCSceneVersion) block at line 93.

Positive observations

  • m_AccumFrameIndex hoisted correctly (line 50). Previously set only inside the #else fallback branch; now set unconditionally. Fixes a latent asymmetry where the member was stale during ISPC rendering.
  • Extraction boundaries are well-chosen. Each new private method has a single clear responsibility.
  • reinterpret_cast use (lines 61, 198, 214) is consistent with PR refactor: phase_6_code_quality — C_style_casts_to_named_iota_constexpr_centralized_constants #18 cast elimination standards.
  • Blank line between CopyInteropToVulkan() and // ── CUDA state ── (CUDABackend.h:71) is present — earlier reviewer comment noted this was missing but it was already fixed.

Summary

One real bug (indentation) and one follow-up optimization (BVH packing guard). No behavioral regressions. Good to merge with the indentation fix.

New%20session%20-%202026-06-18T17%3A42%3A33.616Z
opencode session  |  github run

@github-actions

Copy link
Copy Markdown
Contributor

PR Review: fix/issue-48-split-large-functions (#48)

Verdict: Clean refactoring. One real indentation bug, one pre-existing perf issue. No behavioral regressions.


Confirmed Issues

1. Indentation regression in RenderCPUFallback()CPUBackend.cpp:225-226

Lines 225-226 lost one indentation level during extraction. They sit at 2 tabs but should be at 6 tabs to match the surrounding lambda body (compare lines 222-223 which are at 6 tabs, and the non-multithreaded path at lines 237-240 which is correctly indented):

// line 223 (correct — 6 tabs from margin)
						accumulatedColor /= static_cast<float>(m_AccumFrameIndex);

// lines 225-226 (wrong — 2 tabs, should be 6 tabs)
		accumulatedColor = glm::clamp(accumulatedColor, glm::vec4(0.0f), glm::vec4(1.0f));
		outputBuffer[idx] = PathTracerCore::ConvertToRGBA(accumulatedColor);

Likely caused by flattening continuation lines during extraction. Fix: add indentation to match lines 222-223.

2. BVH SoA packing runs every frame in RenderISPC()CPUBackend.cpp:135-162 (pre-existing)

BVH node data is packed into SoA arrays unconditionally every frame. Material/sphere packing is guarded by m_LastISPCSceneVersion (line 93), but BVH packing is outside. Since m_BVH.Build() only runs on version change (line 51-55), BVH data is identical frame-to-frame. Fix: hoist lines 135-162 inside the if (scene.Version != m_LastISPCSceneVersion) guard.


Positive Observations

  • m_AccumFrameIndex hoisted correctly (line 50) — was only set in the #else fallback branch; now set unconditionally before branching. Fixes a latent asymmetry where the ISPC path left the member stale.
  • Extraction boundaries are well-chosenRender() went from ~100 lines to ~20 in both backends. Each new private method has a single responsibility.
  • reinterpret_cast usage (lines 61, 198, 214 in CUDABackend.cpp) consistent with PR refactor: phase_6_code_quality — C_style_casts_to_named_iota_constexpr_centralized_constants #18 cast elimination standards.
  • Blank line at CUDABackend.h:71 between CopyInteropToVulkan() and comment block already present (fixed in follow-up commit c97c698).
  • Lambda captures resolved correctly across the two CI-fix commits (786807d, 221190a) for MSVC nested lambda rules.

Summary

Fix the indentation on lines 225-226 before merging. The BVH packing optimization can be a follow-up. Otherwise, this is a clean extract-method refactoring with no regressions.

New%20session%20-%202026-06-18T17%3A48%3A12.191Z
opencode session  |  github run

…h scene version (review feedback)

Lines 225-226 had wrong indent level (5 instead of 6 tabs).
BVH SoA packing now inside if(scene.Version != m_LastISPCSceneVersion),
avoiding wasted work on consecutive frames when scene hasn't changed.
@github-actions

Copy link
Copy Markdown
Contributor

PR Review: fix/issue-48-split-large-functions (#48)

Verdict: Clean refactoring. All reviewer-flagged issues already fixed in follow-up commits. Ready to merge.

What was done

Extracted 4 private methods from two large Render() functions:

  • CPUBackend: RenderISPC() and RenderCPUFallback()
  • CUDABackend: UploadSceneAndBVHIfChanged() and CopyInteropToVulkan()

Issues found and already fixed

All issues identified in the PR review comments have been addressed across 4 fixup commits:

Issue Commit Status
Blank line between CopyInteropToVulkan() and // ── CUDA state ── in CUDABackend.h c97c698 Fixed
Lambda missing outputBuffer capture (MSVC CI failure) 786807d, 221190a Fixed
Indentation regression at CPUBackend.cpp:225-226 (2 tabs → 6 tabs) 2a7685b Fixed
BVH SoA packing outside scene version guard in RenderISPC() 2a7685b Fixed

Positive observations

  • m_AccumFrameIndex hoisting (CPUBackend.cpp:50) — previously set only in the #else fallback branch; now set unconditionally before branching. Fixes a latent asymmetry where the member was stale during ISPC rendering.
  • Extraction boundaries are well-chosen — each of the 4 new methods has a single, clear responsibility.
  • reinterpret_cast usage (CUDABackend.cpp:61, 198, 214) consistent with PR refactor: phase_6_code_quality — C_style_casts_to_named_iota_constexpr_centralized_constants #18 cast elimination standards.

No remaining issues

The current tip (2a7685b) has zero bugs, zero regressions, and clean formatting. No changes needed.

New%20session%20-%202026-06-18T17%3A52%3A44.442Z
opencode session  |  github run

@Cle2ment Cle2ment merged commit 9c82953 into master Jun 18, 2026
6 checks passed
@Cle2ment Cle2ment deleted the fix/issue-48-split-large-functions branch June 18, 2026 18:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Phase 16.9-16.10: Split oversized Render() functions

1 participant