Fix/perf 3bc capture coalesce #22
Open
jzmiller1 wants to merge 5 commits into
…rsect
The Roche-limit computation in planetesimals_intersect cloned both `p`
and `prev_p` purely to satisfy the borrow checker — only three scalar
fields (mass, mass, radius) are read from the clones. Planetesimal
carries Vec<Planetesimal> moons + Vec<Ring> rings, so each clone is a
deep tree copy. In hot post-accretion loops this is per-iteration waste.
Replace the two clones with direct scalar extraction:
    let (larger_mass, smaller_mass, smaller_radius) = if p.mass >= prev_p.mass {
        (p.mass, prev_p.mass, prev_p.radius)
    } else {
        (prev_p.mass, p.mass, p.radius)
    };
Note the third argument: the original code passed `&smaller.radius`
(the smaller body's radius, matching roche_limit_au's documented
contract that the third arg is the moon/smaller body's radius). The
refactor preserves this — using smaller_radius rather than the larger
body's radius keeps the physics identical.
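As a minimal sketch of why the clones were unnecessary: `f64` fields are `Copy`, so they can be read through shared references without touching the moon or ring vectors. The trimmed-down `Planetesimal` and the helper name `roche_inputs` below are hypothetical stand-ins, not the project's actual definitions.

```rust
// Hypothetical, trimmed-down stand-in for the real Planetesimal struct.
#[allow(dead_code)]
#[derive(Clone, Debug)]
struct Planetesimal {
    mass: f64,
    radius: f64,
    moons: Vec<Planetesimal>, // the deep tree that made each clone expensive
}

/// Returns (larger_mass, smaller_mass, smaller_radius) without cloning
/// either body. `p` counts as the larger body on equal mass (`>=`).
fn roche_inputs(p: &Planetesimal, prev_p: &Planetesimal) -> (f64, f64, f64) {
    if p.mass >= prev_p.mass {
        (p.mass, prev_p.mass, prev_p.radius)
    } else {
        (prev_p.mass, p.mass, p.radius)
    }
}

fn main() {
    let p = Planetesimal { mass: 2.0, radius: 10.0, moons: vec![] };
    let prev_p = Planetesimal { mass: 1.0, radius: 5.0, moons: vec![] };
    let (larger_mass, smaller_mass, smaller_radius) = roche_inputs(&p, &prev_p);
    println!("{} {} {}", larger_mass, smaller_mass, smaller_radius);
}
```

Whichever argument order is used, the third value is always the radius of the lighter body.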
For the capture_moon call site, use std::mem::swap with a `>=` tiebreak
to orient the bodies in place: after the swap, prev_p holds the larger
body. This matches the original `match p.mass >= prev_p.mass { ... }`
ordering exactly (p wins on equal mass).
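The in-place orientation could look roughly like the following. `Body` and `orient_larger_into_prev` are hypothetical names used only for illustration; the real call site works on `Planetesimal` values.

```rust
use std::mem;

// Hypothetical minimal body type for illustration.
#[derive(Debug)]
struct Body {
    id: String,
    mass: f64,
}

/// After this call `prev_p` holds the larger body and `p` the smaller one.
/// On equal mass, `p` is treated as the larger body (the `>=` tiebreak),
/// so the two are swapped in that case as well.
fn orient_larger_into_prev(p: &mut Body, prev_p: &mut Body) {
    if p.mass >= prev_p.mass {
        mem::swap(p, prev_p); // swaps the two values in place, no clone
    }
}

fn main() {
    let mut p = Body { id: "p".to_string(), mass: 3.0 };
    let mut prev_p = Body { id: "prev_p".to_string(), mass: 3.0 };
    orient_larger_into_prev(&mut p, &mut prev_p);
    println!("larger slot now holds {}", prev_p.id);
}
```

`std::mem::swap` exchanges the two values through their `&mut` references, so no heap data is copied regardless of how many moons either body carries.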
All 16 fixture tests remain bit-identical after the refactor — zero
fixture drift under cargo test, as the clone-elimination itself does
not change the arithmetic order.
Two coupled hot-path fixes that build on the Roche-scalar refactor:
Fix 3B (capture_moon ownership): the function previously cloned both
the larger and smaller bodies internally, even after the Roche-scalar
fix removed the caller-side clones. With heavily-mooned planets these
are deep tree clones (Vec<Planetesimal> moons + Vec<Ring> rings each).
Change the signature to take owned (Planetesimal, Planetesimal) values
and modify the larger body in place. Only moon.id (a String) is
cloned for the event message — not the struct.
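A rough sketch of the ownership-taking shape described above. The struct, the return type, and the event text are assumptions made for illustration; the real `capture_moon` also updates orbital elements, which is omitted here.

```rust
// Hypothetical, trimmed-down stand-in for the real Planetesimal struct.
#[derive(Debug)]
struct Planetesimal {
    id: String,
    moons: Vec<Planetesimal>,
}

/// Consumes both bodies. The larger body is mutated in place and returned
/// together with an event message (the message format is an assumption).
fn capture_moon(mut larger: Planetesimal, smaller: Planetesimal) -> (Planetesimal, String) {
    let moon_id = smaller.id.clone(); // the only clone: a String, not the struct
    larger.moons.push(smaller); // the moon is moved into the Vec, not copied
    let event = format!("{} captured {}", larger.id, moon_id);
    (larger, event)
}

fn main() {
    let planet = Planetesimal { id: "planet".to_string(), moons: vec![] };
    let moon = Planetesimal { id: "moon".to_string(), moons: vec![] };
    let (planet, event) = capture_moon(planet, moon);
    println!("{} ({} moon(s))", event, planet.moons.len());
}
```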
Add a pub(crate) Planetesimal::placeholder() constructor: a zero-
initialized stub used solely by `std::mem::replace` to extract owned
values from &mut Planetesimal slots in planetesimals_intersect.
The placeholder is overwritten in the same statement that creates it
and is never observed. Visibility is pub(crate) so it cannot leak
into serialized output or downstream APIs.
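The extraction pattern might be sketched as follows. Everything except `std::mem::replace` is a simplified stand-in: the struct fields, the body of `capture_moon`, and the wrapper name `capture_via_slots` are assumptions.

```rust
use std::mem;

// Hypothetical, trimmed-down stand-in for the real Planetesimal struct.
#[derive(Debug, PartialEq)]
struct Planetesimal {
    id: String,
    moons: Vec<Planetesimal>,
}

impl Planetesimal {
    /// Empty stub used only as a `mem::replace` filler.
    pub(crate) fn placeholder() -> Self {
        Planetesimal { id: String::new(), moons: Vec::new() }
    }
}

// Simplified stand-in: the real function also recomputes orbits.
fn capture_moon(mut larger: Planetesimal, smaller: Planetesimal) -> Planetesimal {
    larger.moons.push(smaller);
    larger
}

/// Moves both bodies out of their `&mut` slots without cloning.
fn capture_via_slots(p: &mut Planetesimal, prev_p: &mut Planetesimal) {
    let smaller = mem::replace(p, Planetesimal::placeholder());
    // The right-hand side runs first, so the placeholder written into
    // `*prev_p` is overwritten by the result within this one statement.
    *prev_p = capture_moon(mem::replace(prev_p, Planetesimal::placeholder()), smaller);
}

fn main() {
    let mut p = Planetesimal { id: "moon".to_string(), moons: vec![] };
    let mut prev_p = Planetesimal { id: "planet".to_string(), moons: vec![] };
    capture_via_slots(&mut p, &mut prev_p);
    println!("{} now has {} moon(s)", prev_p.id, prev_p.moons.len());
}
```

In this sketch the stub does remain in `*p` after the call, so the caller has to treat that slot as dead.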
Fix 3C (coalesce in-place): the function allocated a shadow Vec and
cloned every Planetesimal on every call, regardless of whether any
coalescence actually occurred. Most calls on the post-accretion hot
path have no intersections. Replace with an in-place loop that uses
Vec::remove(i) only on actual coalescence events:
    while i < planets.len() {
        if check_orbits_intersect(...) {
            let mut p = planets.remove(i);
            planetesimals_intersect(&mut p, &mut planets[i - 1], ...);
            // don't advance i — merged body may intersect next one
        } else {
            i += 1;
        }
    }
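A self-contained version of the same loop shape, for readers who want to run it. The `Body` type, the fixed 0.1 separation threshold in `orbits_intersect`, and the mass-weighted merge in `merge_into` are invented for illustration and are not the project's physics; starting `i` at 1 is an assumption implied by the `planets[i - 1]` access.

```rust
// Hypothetical simplified body: semi-major axis and mass only.
#[derive(Debug, Clone, PartialEq)]
struct Body {
    a: f64,
    mass: f64,
}

// Stand-in intersection test: neighbours closer than 0.1 are merged.
fn orbits_intersect(inner: &Body, outer: &Body) -> bool {
    (outer.a - inner.a).abs() < 0.1
}

// Stand-in merge: sum the masses, mass-weight the axis.
fn merge_into(absorbed: Body, survivor: &mut Body) {
    let total = survivor.mass + absorbed.mass;
    survivor.a = (survivor.a * survivor.mass + absorbed.a * absorbed.mass) / total;
    survivor.mass = total;
}

/// Walks the list once, removing a body only when it actually merges.
fn coalesce_in_place(planets: &mut Vec<Body>) {
    let mut i = 1;
    while i < planets.len() {
        if orbits_intersect(&planets[i - 1], &planets[i]) {
            let absorbed = planets.remove(i); // shifts the tail left by one
            merge_into(absorbed, &mut planets[i - 1]);
            // Do not advance i: the merged body may intersect the next one.
        } else {
            i += 1;
        }
    }
}

fn main() {
    let mut planets = vec![
        Body { a: 1.0, mass: 1.0 },
        Body { a: 1.05, mass: 1.0 },
        Body { a: 3.0, mass: 1.0 },
    ];
    coalesce_in_place(&mut planets);
    println!("{} bodies remain", planets.len());
}
```

`Vec::remove` shifts every later element, so each removal is linear in the tail length; that cost is only paid on an actual coalescence, while the no-intersection path touches nothing.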
Preserved from upstream commit 7351622: `planet.b = semi_minor_axis(
planet.a, planet.e)` in capture_moon, plus the float_to_precision
wrappers on new_axis/new_eccn, plus `m.b = m.a * (1.0 - m.e^2).sqrt()`
in the moon orbit loop.
All 16 fixture tests pass bit-identically after the refactor; no
fixture regeneration is needed. 4 new tests added:
coalesce_produces_correct_count, capture_moon_adds_moon,
no_spurious_coalescence, same_seed_same_system_after_perf_fixes.
# Conflicts: # src/structs/system.rs
The merge of master (post PR 1 + PR 2) changed events_log from &mut AccreteEvents to Option<&mut AccreteEvents>. The capture_moon call site at planetesimals_intersect was missed during conflict resolution and moved the Option instead of reborrowing it, leaving the subsequent coalesce_planetesimals and moons_to_rings calls trying to use a moved value. Add .as_deref_mut() to match the surrounding calls.
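The move-versus-reborrow distinction can be reproduced in isolation. In this sketch `AccreteEvents` is reduced to a hypothetical single-field struct, and `log_event` and `run_steps` are illustrative helpers that do not exist in the project; only `Option::as_deref_mut` is the real standard-library call.

```rust
// Hypothetical stand-in for the real AccreteEvents type.
#[derive(Default, Debug)]
struct AccreteEvents {
    messages: Vec<String>,
}

fn log_event(events_log: Option<&mut AccreteEvents>, msg: &str) {
    if let Some(log) = events_log {
        log.messages.push(msg.to_string());
    }
}

fn run_steps(mut events_log: Option<&mut AccreteEvents>) {
    // `Option<&mut T>` is not `Copy`: passing `events_log` itself here would
    // move it, and the later calls would fail with "use of moved value".
    // `as_deref_mut()` yields a fresh, shorter-lived `Option<&mut T>` instead.
    log_event(events_log.as_deref_mut(), "capture_moon");
    log_event(events_log.as_deref_mut(), "coalesce_planetesimals");
    // The final use may consume the original Option.
    log_event(events_log, "moons_to_rings");
}

fn main() {
    let mut log = AccreteEvents::default();
    run_steps(Some(&mut log));
    run_steps(None); // logging disabled: every call is a no-op
    println!("{} events recorded", log.messages.len());
}
```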
…alesce # Conflicts: # src/structs/system.rs
Problem
After #21 stripped the call-site clones in `planetesimals_intersect`, two deep-clone hot spots remained in the post-accretion path:

- `capture_moon` still cloned both arguments internally. For heavily-mooned planets late in bombardment, the larger-body clone is the dominant per-iteration cost. Every captured satellite gets duplicated just to update one orbit.
- `coalesce_planetesimals` allocated a shadow `Vec` and cloned every `Planetesimal` on every call, regardless of whether any orbits actually intersected. `coalesce_planetesimals` runs once per planet per bombardment iteration (and recursively on moons after each capture). In the common case, with no intersection, every clone is unneeded.

This was caught in the same profiling pass as #21 (2.56M-system Rayon batch on 12-core), where these two paths together dominated the remaining clone cost after the Roche-limit refactor.

Change

Fix 3B: `capture_moon` ownership. Change the signature to take owned `(Planetesimal, Planetesimal)` and modify the larger body in place. Only `moon.id` (a `String`) is cloned, for the event message, not the struct. The float-precision wrappers and the `planet.b = semi_minor_axis(planet.a, planet.e)` line added in commit `7351622` are preserved (the moon orbit loop's `m.b = m.a * (1.0 - m.e^2).sqrt()` line as well).

Caller in `planetesimals_intersect`. Take owned values out of the `&mut Planetesimal` slots via `std::mem::replace` with a placeholder. The placeholder left in `*p` is dropped at scope end: the slot is either a `Vec::remove` local (from `coalesce_planetesimals`) or a freshly-allocated `outer_body` local (from `post_accretion`), and it is never read after the branch. `*prev_p` is overwritten by the `capture_moon` result in the same statement.

New `Planetesimal::placeholder()` constructor. A zero-initialized stub used solely by the `mem::replace` extraction above. Visibility is `pub(crate)` so it cannot leak into serialized output or downstream APIs. Never observed.

Fix 3C: `coalesce_planetesimals` in place. Replace the shadow-`Vec` + clone-everything loop with an in-place walk that only removes a body when an actual intersection occurs. No `Vec` allocation, no clones in the common case.

Four new tests are added (`coalesce_produces_correct_count`, `capture_moon_adds_moon`, `no_spurious_coalescence`, `same_seed_same_system_after_perf_fixes`) covering the merged-count invariant and same-seed reproducibility across the refactor.

Impact

- … `same_seed_same_system_after_perf_fixes`.
- … `3A + 3B + 3C` to roughly a 23% improvement (1362s → 1052s) on the full batch. This means this PR alone accounts for the gap from the ~15% of #21 (perf(system): eliminate two Planetesimal clones in planetesimals_inte…) to the full ~23%. Combined with the opt-out from the `events_log` PR (also in this series), the same fork's run dropped to 333s. Those numbers are reference-fork data, not re-measured for this exact patch; the structural costs being eliminated are the same.
- `Planetesimal::placeholder()` is `pub(crate)`, so it is not part of the public API. It exists only as a `mem::replace` filler and is overwritten in the same statement that creates it. No risk of leaking a zero-`id` planet into serialized output or downstream code.
- `7351622` precision behavior: `float_to_precision(new_axis)`, `float_to_precision(new_eccn)`, `planet.b = semi_minor_axis(planet.a, planet.e)`, and the moon-loop `m.b = m.a * (1.0 - m.e^2).sqrt()` are all retained. Earlier internal variants of this refactor dropped those lines; this PR does not.
- `capture_moon` and `coalesce_planetesimals` are both private (`fn`, no `pub`). Their signatures change, but no downstream caller can observe it. `Planetesimal::placeholder()` is `pub(crate)`. Net public-API delta: none.