Skip to content

[Medium] Trace accumulation is O(steps^2) on the generic and stale-async execution paths #36

Description

@gratus907

Severity: Medium (performance)

Trace event accumulation is O(steps²) on the generic sync and stale-async execution paths, because each step copies the entire event tuple.

Locationssrc/variopt/study/execution.py:646, src/variopt/study/stale_async.py:394

trace = trace.append(event)  # Trace.append copies the full tuple each call

Scenario

A 10k-step, batch_size=1 run performs ~5·10^7 tuple-element copies purely for tracing. Profiling of cheap objectives on the generic path shows trace append dominating wall time.

Notably, the recently added cheap-scalar-sequential fast path (execution.py:217,301) already avoids this by buffering trace_events in a plain list and constructing the Trace once at report assembly — the general and stale-async paths still pay the quadratic cost this fast path was specifically built to avoid.

Fix direction

Accumulate TraceEvents in a list during the run loop and construct Trace once at report-assembly time, as the fast path already does.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions