Add Gazelle Clojure plugin (babashka-based parser) #100
Draft
miridius wants to merge 5 commits into
e60ac8a to
a84aa7d
Compare
miridius
added a commit
that referenced
this pull request
May 18, 2026
Bugs fixed:
- parse-ns-form / read-ns-from-jar-entry now use edamame/parse-string-all so a leading (set! *warn-on-reflection* true) before (ns ...) no longer hides the namespace.
- gen-dir's rollup-rules :lib-deps now uses emitted rule :name attrs rather than path basenames, so an ns-binary-meta :name override still produces valid deps.
- gazelle/clojureparser.Runner.Shutdown sets dead=true to short-circuit a subsequent Parse that would otherwise race a closed stdin.
- receive() copies the scanner buffer before returning so callers can hold the slice safely.
- find-output-base throws on non-zero bazel info exit instead of returning empty and propagating a misleading "@deps/BUILD.bazel not found under /external" later.
- handle-parse throws when no source-path matches rel-dir (was silently emitting rules with resource_strip_prefix="").
- Cache transit-read is wrapped so a corrupt cache file (truncated transit, partial-write from a killed prior run) triggers a clean rebuild instead of crashing handle-init.
- applyAttr rejects non-integer floats for int Bazel attrs (would have silently truncated).
- ClojureExtensionDirective renamed to ClojureEnabledDirective so the Go-side name matches the user-facing directive value.

Test improvements:
- parse-deps-build-multi-aot-entries now builds a real jar so all three AOT namespaces actually appear in clj-ns->label (was a smoke test).
- resolve-deps-build-override + probe-bzlmod-deps-build extracted as testable helpers; new tests for canonical / apparent / missing branches and the override existence checks.
- New tests for ApparentLoads (remapped module, missing module fatal, Loads/ApparentLoads delegation) and subdirHasClojureFiles walkErr fatal path.
- TestImportsRuleNsHit now asserts ImportSpec.Lang.
- TestGenerateRulesFatalsOnParserDeath pins the actual fatal message.
- Exception-chain ExecutionException-without-cause now asserts the wrapper message is preserved.
- fatal-error-detection JDK-class-hierarchy tautology removed.

Code quality:
- Dead `resolved` atom in resolve-ns-deps dropped.
- basename / file-ext delegate to babashka.fs/strip-ext and /extension.
- rule-spec->wire uses update-keys.
- handle-parse threads rel-dir into resolve-ns-deps so the unresolved-requires warning shows which directory.

Comments and docs:
- Go comments referencing rule construction now point at gazelle_server.bb's ns-rules, not gen-build / gen_build.clj.
- DepsEdn() docstring corrected (absolute path, not workspace-relative).
- Platform-keys cross-reference fixed to Platform* constants.
- Cache key docstring at top of gazelle_server.bb names all three inputs (BUILD content + format version + no-aot set).
- "pick the platform-appropriate one and stop" replaced with the actual behaviour (both labels emitted, map semantics dedupe).
- emdashes in docstrings / println output replaced with parens per project convention.
- "Bazel built-ins" comment for java_library qualified.
- clojure_test rule kind declares MergeableAttrs for env/tags/jvm_flags/size/timeout so user edits aren't clobbered.

Test infrastructure:
- bb tests guard their entry point with (when (= *file* ...)) so load-file callers don't trigger System/exit.
- CircleCI installs babashka and sets GAZELLE_INTEGRATION_TEST_REQUIRED=1 so the bb-side integration tests can't silently no-op in CI.

PR description (#100):
- Drop "13 bb-side unit tests" (actual count is 42).
- Clarify relation to #98 / #99 (this PR is standalone; parity claims assume those land).
- "Alternative to gen_srcs" rather than "Replaces".
032e41f to
bedd65e
Compare
A Gazelle Clojure language plugin keeping `BUILD.bazel` files in sync with Clojure source. Bundled into `gazelle_bin`; the Go plugin spawns a `bb` subprocess (fetched via `rules_multitool`) that parses `.clj` / `.cljc` / `.cljs` files and `@deps/BUILD.bazel` over a newline-JSON wire protocol.

Why bb
- Cold start ~30ms vs ~1s for a JVM-based parser; no daemon needed for incremental use.
- Substantially faster full-repo regen than `gen_srcs`, plus a sub-second path-scoped mode that `gen_srcs` can't service at all.
- edamame picks up reader-conditional / macro-heavy CLJS namespaces in jar contents that `clojure.tools.reader` silently dropped.

How it works
- Long-lived `bb` subprocess speaking newline-JSON. On `init` it parses `@deps/BUILD.bazel` and caches the per-jar ns scan to disk; cache key mixes BUILD content, cache-format version, and `:bazel :no-aot` so any of those changing invalidates.
- bb's ns-rules mirrors `cljs.analyzer/aliasable-clj-ns?`: a `clojure.X` require from a CLJS source rewrites to `cljs.X` when the original has no CLJS-loadable form and the replacement does.
- Gazelle merges with hand-written `BUILD` rules cleanly; only the rules_clojure load line is touched.

Co-authored-by: Daniel Compton <desk+github@danielcompton.net>
Co-authored-by: Claude <noreply@anthropic.com>
bedd65e to
0d8133f
Compare
`parse-deps-build` only matched buildifier-canonical blocks where `(` and `)` sit on their own lines. rules_clojure's @deps extension emits the compact form (open-paren + first arg on one line, close on the last arg line); every block silently failed to match, producing an empty dep_ns_labels map and a wall of bogus 'unresolved' warnings for every require in any consumer with a substantial CLJS surface. Broaden the open regex to allow content after the paren, and let block-end match any line whose trailing `)` closes the call. Pinned both shapes with a compact-format `java_import` test and a compact-format `clojure_library` AOT test.
Closure stdlib namespaces (goog.string, goog.object, ...) live as raw JavaScript inside the closure-library jar, with no `(ns ...)` form for scan-jar to index. A CLJS file requiring goog.string would generate a BUILD with no dep entry for it, leaving the cljs compiler unable to link. cljs.core's wrapper label (org_clojure_clojurescript) transitively depends on closure-library, so routing any goog.* require to that label gets the goog code on the compile classpath without a fragile hardcoded closure-library label name.
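A minimal sketch of that routing rule, written in Go for illustration only. The function name `resolveRequire`, the map shape, and the `@deps//:` label prefix are assumptions; only the `org_clojure_clojurescript` wrapper name and the `goog.*` fallback come from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// cljsWrapperLabel is an assumed spelling of the ClojureScript wrapper label;
// the commit message only names org_clojure_clojurescript.
const cljsWrapperLabel = "@deps//:org_clojure_clojurescript"

// resolveRequire (hypothetical) looks a required namespace up in the scanned
// ns -> label map first, then falls back to the ClojureScript wrapper for
// Closure namespaces, which have no (ns ...) form for the jar scan to index.
func resolveRequire(ns string, depLabels map[string]string) (string, bool) {
	if label, ok := depLabels[ns]; ok {
		return label, true
	}
	if ns == "goog" || strings.HasPrefix(ns, "goog.") {
		return cljsWrapperLabel, true
	}
	return "", false
}

func main() {
	labels := map[string]string{"cljs.core": cljsWrapperLabel}
	for _, ns := range []string{"cljs.core", "goog.string", "my.missing"} {
		label, ok := resolveRequire(ns, labels)
		fmt.Println(ns, label, ok)
	}
}
```

Checking the scanned map before the prefix rule means an explicit entry for a `goog.*` namespace, if one ever appears, would still win over the fallback.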
Empty was unset, so a clojure_library whose .clj was deleted kept its BUILD entry across runs. Add orphan stubs; lift test_ns / main_class into MergeableAttrs so IsEmpty fires after merge.
Background
Bazel reads `BUILD.bazel` files to figure out what to build. In a large Clojure repo those files describe every `clojure_library` / `clojure_test` / `clojure_binary` target plus the deps between them. Hand-maintaining them at scale is impractical, so we generate them from the source tree.
We've been doing that with `gen_srcs` (a standalone `bazel run` target). It walks the repo, parses every Clojure namespace, computes deps, writes BUILD files. It works, but it's outside the normal Bazel lifecycle: no incremental updates, no path-scoped runs, no per-package directives, no interop with other languages also generating BUILD files in the same repo.
This PR adds an experimental Gazelle plugin as an alternative. `gen_srcs` continues to exist until the Gazelle plugin is deemed stable (we'd eventually drop it).
What's Gazelle?
Gazelle is a BUILD-file generator that runs as a Bazel build target. You write a language plugin (in Go, Gazelle's own implementation language) that tells it how to find your source files and what rules to emit; Gazelle handles the rest (directory walk, BUILD-file parsing, rule merging, cross-package dep resolution). Most modern Bazel rule sets ship a Gazelle plugin (Go, Java, Python, Rust, …) so users can keep their BUILD files in sync with one command.
Why a subprocess?
Gazelle plugins are Go code. But namespace parsing belongs in Clojure (`edamame` already handles reader conditionals, `(ns ...)` forms, splice forms, etc.). There aren't good Clojure parsers for Go, plus it's MUCH easier for us to maintain Clojure code.
The plugin follows the Java Gazelle plugin architecture: the Go side is glue, the actual parsing lives in a long-running Clojure subprocess it talks to over stdio.
Why babashka, not JVM Clojure?
bb's ~30ms cold start (vs ~1s for the JVM) makes path-scoped runs viable: a per-file-save gazelle invocation completing in sub-second time vs. multi-second for the JVM variant.
The cost is that `bb` can't load `tools.deps` (Java reflection blocks AOT), so the bb side has its own rule-construction code (`gazelle_server.bb`) rather than reusing `rules-clojure.gen-build`. The two implementations are kept in sync via a shared parity fixture (`test/rules_clojure/rollup_rules_fixtures.edn`) loaded by both sides' tests.
    flowchart LR
        Bazel[bazel run //gazelle:gazelle_bin] --> Plugin[Go plugin<br/>gazelle/]
        Plugin <-->|JSON lines<br/>on stdio| Server[bb subprocess<br/>gazelle_server.bb]
        Plugin --> Builds[BUILD.bazel files]
Life of a Gazelle run
Gazelle walks the repo top-down, calling hook points on every package. The plugin implements all four:
- […] `# gazelle:` directives. On the root call, auto-discover `deps.edn` and boot the bb subprocess.
- […] `{kind, attrs}` specs into `*rule.Rule`.
- […] `clojure_library`, walk its `:require`s and fill in `:deps` against Gazelle's cross-package index.
Wire protocol
The bb server speaks newline-delimited JSON on stdio. One request per line, one response per line.
- `init`: […] `dep_ns_labels` per platform), `deps_bazel` overrides, `source_paths`, `ignore_paths`
- `parse`: […] `NamespaceInfo` per basename group + the `__clj_lib` / `__clj_files` rollup rules
Stdio (not gRPC / sockets / etc.) because zero setup, crash-safe (subprocess dies, Go side sees EOF and `log.Fatalf`s), easy to debug (`bb gazelle_server.bb` and paste JSON at it).
What lives where
Rule construction is bb-side (`gazelle_server.bb`'s `ns-rules`): AOT-vs-plain decisions (`:bazel/clojure_library` ns-meta), test attr passthrough (`:bazel/clojure_test` for size/tags/timeout), `:require` → dep label mapping, `clojure_binary` for `:bazel/clojure_binary`, `java_library` for `.js` siblings, and rollup composition. `ns-rules` returns `[{:type :clojure_library :attrs {...}} ...]`; Go translates verbatim into `*rule.Rule`.
The Go side does three things:
- […] `processExitInfo`.
- […] `{:type :clojure_library :attrs {:name "core" …}}` → `rule.NewRule("clojure_library", "core")` + `r.SetAttr(...)`.
- […] `clojure_library`: intra-repo index first (matches `(:require [my.foo])` to `//src/my:foo` when another package generated it), then `init`'s `dep_ns_labels` per platform, plus per-target overrides from `deps_bazel`.
Static deps that don't need Gazelle's index (`org_clojure_clojure`, import-deps, gen-class-deps, ns-library-meta extras) are pre-merged bb-side and seeded into the dep set from the rule's existing `:deps`; Resolve only adds what genuinely needs the cross-package index.
Configuration
Per-package directives in BUILD-file comments:
Important
Migrating from `gen_srcs`? `gen_srcs` takes aliases via CLI args. The plugin reads them from `# gazelle:clojure_aliases :a,:b,...` in the root `BUILD.bazel`. If unset, the plugin defaults to every alias in `deps.edn` (matching `gen_srcs`'s typical `deps.install(aliases = [...])` invocation pattern).
Failure semantics
An empty `GenerateResult` for a previously-rule-bearing package looks to Gazelle like "delete every rule". A green run that wipes the build graph is worse than a noisy exit, so the plugin fails loud on:
- […] `subdirHasClojureFiles` (permission denied, broken symlink, etc.)
- […] `RuleKind` enum)
- […] `dep_ns_labels`
- […] `.clj` / `.cljs`, or `NamespaceInfo` mixing the Clojure-group and JS-only shapes
- […] `rules_clojure` `bazel_dep` in `MODULE.bazel`
bb side: the request loop catches `Throwable` (not just `Exception`) so OOM / `VirtualMachineError` can't silently kill the subprocess; non-fatal errors return a `{type:"error", message: <full cause chain>}` envelope so the actionable root cause surfaces.
Subdir rollup
Each package emits a `__clj_lib` / `__clj_files` rollup that aggregates its own rules plus the rollups of any Clojure-bearing subdirectories. Naively that's an O(n) `WalkDir` per package, which is quadratic across the tree. Gazelle's bottom-up walk lets the plugin record `hasClojureContent[rel]` for each visited package and consult it as an O(1) lookup when the parent gets generated (falling back to the on-disk walk for any subdir we haven't visited yet; this is defensive and shouldn't happen in normal Gazelle ordering).
Intermediate-only directories (no direct `.clj` / `.cljs` / `.cljc` / `.js` files, but Clojure-bearing subdirs) still emit rollup rules so consumers of `//foo:__clj_lib` keep working when `foo/` is an aggregator with code only in `foo/bar/`.
Tests
- `//test/rules_clojure:gazelle-server-bb-test` wraps bb-side unit tests covering ns parsing (reader conditionals, splice forms), libspec shapes, `@deps/BUILD.bazel` multi-line parsing, rule rollup, cache invariants (corrupt sha, corrupt transit), `find-output-base` cache validation, scan-jar `.cljc` / `.cljs` paths, `parse-group` failure modes, the `-main` request loop, and the `handle-init` source-path tiebreaker.
- `//gazelle:gazelle_test` is the Go-side `Configure` / `GenerateRules` / `Resolve` suite, including an end-to-end `GenerateRules` test against a stubbed bb script that exercises the full Configure → GenerateRules → Resolve pipeline.
- `//gazelle/clojureparser:clojureparser_test` holds the wire-protocol round-trip tests, which use a self-contained tempdir fixture (tiny `deps.edn` + empty `@deps/BUILD.bazel` set via `GAZELLE_DEPS_BUILD`).
Cross-process parity between `gazelle_server.bb`'s `rollup-rules` and `gen_build.clj`'s `rollup-rules` is pinned by `test/rules_clojure/rollup_rules_fixtures.edn`, loaded by tests on both sides.
Relation to #84
Supersedes #84 (the JVM-Clojure parser variant). That PR can be closed once this one is reviewed.