perf(gen_build): memoize classpath-files for is-aoted? #91

Open
english wants to merge 2 commits into griffinbank:main from english:series/gen-srcs-opt-classpath-memo

@english commented Apr 4, 2026

Depends on #90

Memoises classpath file listing in the is-aoted? path to avoid repeated scans of the same classpath
entries.

Change

  • add memoised wrapper for classpath-files
  • use memoised lookup in is-aoted?
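
For illustration, a minimal sketch of what a memoised wrapper of this kind could look like. The in-memory `fake-classpath`, the `scan-count` counter, the helper `ns->init-class`, and the bodies of `classpath-files` and `is-aoted?` are assumptions made for the example and are not taken from gen_build; only the use of `clojure.core/memoize` and the linear `some` scan follow the description in this PR.

```clojure
(require '[clojure.string :as str])

;; Hypothetical stand-in for a real classpath: entry path -> files inside it.
(def fake-classpath
  {"lib/a.jar" ["foo/core.clj" "foo/core__init.class" "foo/util.clj"]})

;; Counts how often the (expensive) listing actually runs.
(def scan-count (atom 0))

(defn classpath-files
  "Hypothetical listing function: returns the files of one classpath entry."
  [path]
  (swap! scan-count inc)
  (get fake-classpath path []))

;; Memoised wrapper: repeated calls with the same path reuse the first result.
(def classpath-files-memo (memoize classpath-files))

(defn ns->init-class
  "AOT-compiled namespace foo.my-ns produces foo/my_ns__init.class."
  [ns-sym]
  (-> (str ns-sym)
      (str/replace "-" "_")
      (str/replace "." "/")
      (str "__init.class")))

(defn is-aoted?
  "Assumed shape of the check: linear scan over the memoised listing."
  [path ns-sym]
  (boolean (some #(= (ns->init-class ns-sym) %)
                 (classpath-files-memo path))))

;; Two lookups against the same entry trigger only one listing.
(is-aoted? "lib/a.jar" 'foo.core)
(is-aoted? "lib/a.jar" 'foo.util)
```

In this sketch the listing is cached per path, but each call still walks the cached list with `some`.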

Benchmark

examples/gen_srcs_bench (Hyperfine, 10 runs, prepare: delete src/**/BUILD.bazel)

  • before median: 10.113s (mean 11.905s ± 6.529s)
  • after median: 9.017s (mean 9.399s ± 1.827s)
  • change (median): -10.84%
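
The median change is the relative difference between the two medians. A small sketch of that arithmetic, where `pct-change` is a helper name introduced only for this example:

```clojure
(defn pct-change
  "Relative change from `before` to `after`, in percent, rounded to 2 decimals."
  [before after]
  (let [raw (* 100.0 (/ (- after before) before))]
    (/ (Math/round (double (* 100.0 raw))) 100.0)))

;; Medians reported above: 10.113s before, 9.017s after.
(pct-change 10.113 9.017)
```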

@english requested a review from a team April 4, 2026 16:19
@english marked this pull request as ready for review April 4, 2026 18:35

@arohner left a comment

I'm surprised this works. AFAICT jar-classes is only called in one place, ->class->jar, and ->class->jar is only called in one place, srcs. Are there duplicate jars on the classpath?

english and others added 2 commits April 7, 2026 09:22
Benchmark (examples/gen_srcs_bench, hyperfine, 10 runs, prepare: delete src/**/BUILD.bazel):
- before median: 10.113s (mean 11.905s ± 6.529s)
- after median: 9.017s (mean 9.399s ± 1.827s)
- change (median): -10.84%

The previous memoize approach cached the file listing per path but
still did O(n) linear scans via `some` on every call. Restructure to
build a set once per classpath entry and pass it through
should-compile-namespace?/is-aoted?, replacing the linear scan with
O(1) `contains?` lookups.

Benchmark (examples/gen_srcs_bench, hyperfine, 5 runs):
- baseline median:      13.155s
- memoize median:       10.899s (-17%)
- restructured median:   6.636s (-50%)

(written by Claude)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@english commented Apr 7, 2026

not duplicate jars: it's the same jar/directory scanned once per namespace within it. ->dep-ns->label iterates each jar/directory on the classpath. for each one, it discovers all namespaces inside it, then calls should-compile-namespace? → is-aoted? → classpath-files(path) per namespace. so if a jar contains N namespaces, we were listing the same jar's contents N times.

I've pushed a commit that restructures instead of memoising: build the file listing as a set once per classpath entry, pass it through. also replaces the (some #(= ...) <list>) with (contains? set ...) which gets us O(1) lookups instead of O(n) linear scans.
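
a rough sketch of the restructured shape, for illustration only. the function bodies, the in-memory `fake-classpath`, the `scan-count` counter, and the helpers `ns->init-class` and `namespaces-to-compile` are assumptions made for the example, not the actual gen_build code; the point is only that the set is built once per classpath entry and then passed down.

```clojure
(require '[clojure.string :as str])

;; Hypothetical stand-in for a real classpath: entry path -> files inside it.
(def fake-classpath
  {"lib/a.jar" ["foo/core.clj" "foo/core__init.class"
                "foo/util.clj" "foo/my_ns.clj"]})

;; Counts how often the (expensive) listing actually runs.
(def scan-count (atom 0))

(defn classpath-files
  "Hypothetical listing function: returns the files of one classpath entry."
  [path]
  (swap! scan-count inc)
  (get fake-classpath path []))

(defn ns->init-class
  "AOT-compiled namespace foo.my-ns produces foo/my_ns__init.class."
  [ns-sym]
  (-> (str ns-sym)
      (str/replace "-" "_")
      (str/replace "." "/")
      (str "__init.class")))

(defn is-aoted?
  "Takes the prebuilt set instead of a path; hash lookup, no linear scan."
  [file-set ns-sym]
  (contains? file-set (ns->init-class ns-sym)))

(defn should-compile-namespace?
  "Assumed rule for this sketch: compile whatever is not already AOT-compiled."
  [file-set ns-sym]
  (not (is-aoted? file-set ns-sym)))

(defn namespaces-to-compile
  "Lists the entry once, builds the set once, reuses it for every namespace."
  [path ns-syms]
  (let [file-set (set (classpath-files path))]
    (filterv #(should-compile-namespace? file-set %) ns-syms)))

;; Three namespaces in one entry, one listing of that entry.
(namespaces-to-compile "lib/a.jar" '[foo.core foo.util foo.my-ns])
```

with N namespaces in an entry, this sketch lists the entry once and does N set lookups, instead of N listings each followed by a linear scan.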

benchmark (examples/gen_srcs_bench, hyperfine, 5 runs):

with --prepare 'find src -name BUILD.bazel -type f -delete' (full regeneration):

  • baseline median: 13.155s
  • memoise median: 10.899s (-17%)
  • restructured median: 6.636s (-50%)

without prepare (no-op, BUILD files already exist):

  • baseline median: 3.049s
  • memoise median: 2.714s (-11%)
  • restructured median: 2.560s (-16%)

(written by Claude)

@english force-pushed the series/gen-srcs-opt-classpath-memo branch from bdd6b16 to 22a7b97 April 7, 2026 09:10
@english requested a review from arohner April 7, 2026 09:15