Code Organization
This page describes Hydra's code organization pattern used across multiple language implementations.
Hydra uses a consistent separation between hand-written and generated code organized into three top-level directories.
- packages/ holds a package's DSL-based module definitions, plus source-language helpers used to write them.
- heads/ holds per-host runtimes that run those modules after they have been translated to a target language.
- dist/ holds generated and copied artifacts, never edited by hand.
The test for whether a file belongs in packages/ or heads/: does it describe (or help describe) Hydra modules,
or does it run them after translation?
Description goes in packages/; running goes in heads/.
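As a rough illustration of the split, the sketch below maps a repository-relative path to the role of its top-level directory. The function name `top_level_role` and the role labels are made up for this example and are not part of Hydra's tooling. The real test is about what a file does, which a path check can only approximate.

```python
from pathlib import PurePosixPath

# Roles of the three top-level directories, paraphrased from the list above.
ROLES = {
    "packages": "describes Hydra modules (hand-written)",
    "heads": "runs translated modules (hand-written)",
    "dist": "generated or copied (never edited by hand)",
}


def top_level_role(path: str) -> str:
    """Return the role implied by the first component of a repo-relative path."""
    parts = PurePosixPath(path).parts
    if parts and parts[0] in ROLES:
        return ROLES[parts[0]]
    return "outside the packages/heads/dist split"


print(top_level_role("packages/hydra-kernel/src/main/haskell/"))
print(top_level_role("heads/java/src/main/java/hydra/lib/"))
print(top_level_role("dist/python/hydra-kernel/src/main/python/hydra/core.py"))
```

Paths under other directories, such as demos/, fall through to the last branch.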
A package's source-language helpers (such as extra DSLs convenient for specifying that package's module definitions)
live alongside the package's DSL sources in packages/, written in the same language as the sources.
In some cases these helpers can be exported and reused by other packages written in the same source language.
The Hydra repository is not a place for general-purpose utilities written in a specific host language.
Host-specific code that is not part of writing or running Hydra modules belongs elsewhere.
The one deliberate exception is bindings/, which holds host-specific third-party integrations —
adapters and utilities that connect Hydra to external systems in a particular host language.
Code in bindings/ is not for runtime or bootstrapping and is not subject to the packages/ vs heads/ rule.
bindings/ is the third structural category alongside packages/ and heads/,
introduced after the rollup-everything-into-hydra-java design proved unworkable.
The rules:
- Each binding is a hand-written Maven/PyPI/etc. artifact (no DSL definition, no JSON pipeline, not in hydra.json's package list).
- Each binding depends on exactly one Hydra package (e.g., hydra-rdf4j depends on hydra-rdf) and optionally on the third-party library it wraps.
- Bindings are independently versioned and publishable. In a multi-project Gradle build they participate as project(':hydra-rdf4j') references; downstream consumers pull the published artifact.
- Bindings are not consumed by the bootstrap demo or by any Hydra package. They sit at the leaves of the dependency graph, not in the spine.
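The dependency rules above can be expressed as a small check. This is a sketch under assumptions: the `Binding` record, the two helper functions, and the sample package names passed to them are hypothetical and do not mirror hydra.json or any real build file.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Binding:
    """Hypothetical description of a binding; not a real Hydra data structure."""
    name: str
    depends_on: List[str] = field(default_factory=list)


def check_binding(binding: Binding, hydra_packages: Set[str]) -> List[str]:
    """Return rule violations for one binding (an empty list means it passes)."""
    problems = []
    # A binding is hand-written, so it must not appear in the Hydra package list.
    if binding.name in hydra_packages:
        problems.append(f"{binding.name} is listed as a Hydra package")
    # Exactly one dependency on a Hydra package; third-party libraries are not counted.
    hydra_deps = [d for d in binding.depends_on if d in hydra_packages]
    if len(hydra_deps) != 1:
        problems.append(
            f"{binding.name} depends on {len(hydra_deps)} Hydra packages, expected 1"
        )
    return problems


def packages_using_bindings(
    package_deps: Dict[str, List[str]], binding_names: Set[str]
) -> List[str]:
    """Return Hydra packages that depend on a binding, which the rules forbid."""
    return sorted(
        package
        for package, deps in package_deps.items()
        if any(dep in binding_names for dep in deps)
    )


packages = {"hydra-kernel", "hydra-rdf", "hydra-pg"}
print(check_binding(Binding("hydra-rdf4j", ["hydra-rdf", "rdf4j"]), packages))
```

The second helper covers the "leaves of the dependency graph" rule: it flags any Hydra package whose dependency list mentions a binding.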
Two flavors of binding exist:
- Third-party adapters - wrap an external library against a Hydra package (e.g., hydra-rdf4j connects hydra.rdf.syntax.* to Eclipse rdf4j; hydra-neo4j parses Cypher/GQL via ANTLR and converts to hydra.pg.query.*). Most bindings are this shape.
- Per-package host DSL helpers - hand-written host-language code that provides DSL surface for a Hydra package, with no third-party dependency (e.g., hydra-pg-dsl provides Java fluent builders for hydra.pg.{model,query}). These exist in bindings/ rather than heads/<host>/ because they're tied to one Hydra package, not to the host language's Hydra runtime.
If hand-written host-language code wants to depend on a third-party library
(rdf4j, ANTLR, Neo4j, Apache Jena, TinkerPop, etc.) or wants to provide
Java/Python/etc. DSL surface for a specific Hydra package, that code belongs
in a binding, not in a heads/<lang>/ runtime. The runtime stays
third-party-free except for host stdlib + minimal build tooling.
- packages/ - DSL source packages. Most are Haskell-based, but hydra-java and hydra-python are now host-language-native (Java and Python sources respectively); see the per-package notes below.
  - packages/hydra-kernel/ - Kernel type and term modules (the heart of Hydra, written in the Haskell-based Hydra DSL)
  - packages/hydra-haskell/ - Haskell coder DSL sources (Haskell-based)
  - packages/hydra-java/ - Java coder DSL sources (Java-based as of 0.15; legacy Haskell sources retained as backup until 0.16)
  - packages/hydra-python/ - Python coder DSL sources (Python-based as of 0.15; legacy Haskell sources retained as backup until 0.16)
  - packages/hydra-scala/, packages/hydra-lisp/ - Per-target coder DSL sources (Haskell-based)
  - packages/hydra-pg/ - Property graph models and coders (Pg, Cypher, Tinkerpop, Graphviz)
  - packages/hydra-rdf/ - RDF, SHACL, OWL, ShEx, and XML schema models
  - packages/hydra-ext/ - Long-tail extension coders (Avro, Protobuf, GraphQL, Cpp, Csharp, Go, Rust, TypeScript, Yaml, ...)
  - packages/hydra-bench/ - Synthetic inference benchmark workloads (opt-in via bin/sync-bench.sh)
  - packages/hydra-coq/, packages/hydra-typescript/, packages/hydra-go/, packages/hydra-wasm/ - Additional targets (Coq complete; TypeScript complete per #126; Go and Wasm are "head buds" with partial runtimes)
- heads/ - Hand-written runtime code per language
  - heads/haskell/ - Haskell primitives, DSL helpers, code generation utilities, tests
  - heads/java/ - Java primitives, utilities, framework classes, tests
  - heads/python/ - Python primitives, DSL utilities, tests
  - heads/scala/ - Scala primitives, tests
  - heads/lisp/ - Per-dialect Lisp runtimes (clojure/, scheme/, common-lisp/, emacs-lisp/) sharing the hydra-lisp coder
  - heads/typescript/ - TypeScript runtime (#126)
  - heads/go/, heads/wasm/ - Head buds; partial runtimes pending completion
- dist/ - Generated code per language
  - dist/haskell/hydra-kernel/ - Generated Haskell kernel
  - dist/java/hydra-kernel/ - Generated Java kernel
  - dist/python/hydra-kernel/ - Generated Python kernel
  - dist/scala/hydra-kernel/ - Generated Scala kernel
  - dist/typescript/hydra-kernel/ - Generated TypeScript kernel
  - dist/go/hydra-kernel/ - Generated Go kernel (head bud, kernel only)
  - dist/clojure/, dist/scheme/, dist/common-lisp/, dist/emacs-lisp/ - Per-dialect Lisp kernels
  - dist/json/ - JSON kernel modules (canonical interchange format; tracked in git)
Only dist/json/ and dist/haskell/ are tracked in git; the rest regenerate from dist/json/ on demand.
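A small sketch of the tracking rule stated above, assuming repository-relative POSIX paths. The helper name `is_tracked_dist_path` is hypothetical and is not part of the repository's tooling.

```python
from pathlib import PurePosixPath

# Per the note above, only these dist/ subtrees are kept in git.
TRACKED_DIST_TARGETS = {"json", "haskell"}


def is_tracked_dist_path(path: str) -> bool:
    """True if a path under dist/ belongs to a subtree that is tracked in git."""
    parts = PurePosixPath(path).parts
    if len(parts) < 2 or parts[0] != "dist":
        raise ValueError(f"not a dist/ path: {path}")
    return parts[1] in TRACKED_DIST_TARGETS


print(is_tracked_dist_path("dist/json/"))
print(is_tracked_dist_path("dist/java/hydra-kernel/"))
```

Anything for which this returns False is expected to be regenerated from dist/json/ rather than checked out.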
This separation serves several purposes:
- Clear distinction - Easy to identify what is hand-written vs. generated
- Multi-language parity - Same DSL sources generate Haskell, Java, Python, Scala, and Lisp implementations
- Reproducibility - Generated code can be recreated from sources at any time
- Version control - Sources and the tracked generated code (dist/json/ and dist/haskell/) are checked in, enabling:
  - Tracking changes and reviewing diffs
  - Understanding the impact of DSL changes
  - Bisecting regressions across generations
- Separation of concerns - Language-specific runtime code stays in heads/, while the kernel remains pure in dist/
All generated files include a header comment indicating they should not be manually edited. For example in Java:

    // Note: this is an automatically generated file. Do not edit.

Or in Haskell:

    -- Note: this is an automatically generated file. Do not edit.

Generated code should be regenerated whenever DSL sources change. See the specific README files for each package:
- Hydra-Haskell README - Haskell code generation
- Hydra-PG README - Property graph coders and models
- Hydra-Java README - Building Java artifacts
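The generated-file header quoted earlier lends itself to a mechanical check, for example in a pre-commit hook that warns about hand edits. The sketch below is an assumption-laden illustration: the `looks_generated` helper is hypothetical, it assumes the header sits within the first few lines, and the sample file contents are invented.

```python
from typing import Iterable

# Header text quoted above; only the comment prefix differs by language.
GENERATED_MARKER = "Note: this is an automatically generated file. Do not edit."


def looks_generated(lines: Iterable[str], max_lines: int = 5) -> bool:
    """True if one of the first max_lines lines carries the generated-file marker."""
    for index, line in enumerate(lines):
        if index >= max_lines:
            break
        if GENERATED_MARKER in line:
            return True
    return False


# Invented sample contents, used only to exercise the helper.
java_source = ["// " + GENERATED_MARKER, "", "package hydra.core;"]
haskell_source = ["-- " + GENERATED_MARKER, "", "module Hydra.Core where"]
hand_written = ["package hydra.lib;", "", "public class Example {}"]

print(looks_generated(java_source), looks_generated(hand_written))
```

Matching on the marker text rather than on the comment prefix lets one check cover every target language.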
Generated tests follow the same pattern as generated main code:
- dist/<lang>/hydra-kernel/src/test/ - Generated test code
  - Common test suite ensuring parity across implementations
  - Generated from the same test specifications
  - Validates that all language implementations behave identically
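To illustrate the idea of a shared suite, the toy sketch below runs one list of test cases against several stand-in implementations and reports any that disagree with the expected result. Everything here is hypothetical: the case format, the `run_suite` helper, and the two "implementations" are plain Python functions, whereas Hydra generates real test code per target language from its test specifications.

```python
from typing import Callable, Dict, List, Tuple

# One shared specification: (input, expected output) pairs.
CASES: List[Tuple[str, str]] = [("abc", "ABC"), ("Hydra", "HYDRA"), ("", "")]


def run_suite(implementations: Dict[str, Callable[[str], str]]) -> List[str]:
    """Run every case against every implementation; return failure descriptions."""
    failures = []
    for name, impl in implementations.items():
        for given, expected in CASES:
            actual = impl(given)
            if actual != expected:
                failures.append(
                    f"{name}: {given!r} -> {actual!r}, expected {expected!r}"
                )
    return failures


# Two stand-ins for independently written implementations of the same function.
implementations = {
    "impl_a": str.upper,
    "impl_b": lambda s: "".join(c.upper() for c in s),
}
print(run_suite(implementations))  # an empty list means the implementations agree
```

Because the cases live in one place, adding a case exercises every implementation at once, which is the property the generated suites rely on.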
Each Hydra package adapts this pattern to its language and purpose:
The Haskell implementation serves as the bootstrapping implementation for the entire Hydra project.
Hand-written sources are split between packages/hydra-kernel/ (kernel type and term specifications),
packages/hydra-haskell/ (Haskell coder DSL sources), and heads/haskell/ (runtime: primitives,
DSL helpers, code generation drivers, tests). Generated code lives under dist/haskell/.
- packages/hydra-kernel/src/main/haskell/ contains:
  - Kernel type and term DSL specifications (Hydra/Sources/Kernel/Types/, Hydra/Sources/Kernel/Terms/)
  - Canonical primitive registry: one PrimitiveDefinition-emitting module per hydra.lib.<sub> namespace (Hydra/Sources/Kernel/Lib/)
  - Host-side primitive bindings: pairs primitive names with native impls (Hydra/Sources/Libraries.hs)
- packages/hydra-haskell/src/main/haskell/ contains:
  - Haskell coder DSL sources (Hydra/Sources/Haskell/)
- heads/haskell/src/main/haskell/ contains:
  - DSL helpers and wrappers (Hydra/Dsl/Meta/)
  - Native primitive implementations (Hydra/Haskell/Lib/)
  - Code generation utilities (Hydra/Generation.hs)
- dist/haskell/hydra-kernel/src/main/haskell/ contains:
  - Complete generated kernel implementation
  - Generated DSL modules (Hydra/Dsl/) with constructors, accessors, and updaters for all Hydra types
  - Generated from DSL sources via writeHaskell and writeDslHaskell
See Hydra-Haskell README for details.
The Java implementation provides a Java API for Hydra with the same kernel semantics.
The Java coder DSL sources are themselves written in Java (as of 0.15);
hand-written runtime lives under heads/java/; generated code under dist/java/hydra-kernel/.
- packages/hydra-java/src/main/java/hydra/sources/java/ contains:
  - The Java coder DSL sources: Syntax.java, Language.java, Coder.java, Serde.java, Names.java, Utils.java, Environment.java, Testing.java
  - Support classes (JavaHelpers.java, SourceDsl.java)
  - These are the source of truth for hydra.java.* modules; the self-host entry point is bin/generate-hydra-java-from-java.sh. A legacy Haskell-DSL copy of the same modules still lives under packages/hydra-java/src/main/haskell/Hydra/Sources/Java/ and produces byte-identical output. It will be dropped before 0.16; the main sync sequence will switch over to the Java-native pipeline in the meantime.
- heads/java/src/main/java/ contains:
  - Hand-written primitive function implementations (hydra/lib/)
  - Core utilities (hydra/util/)
  - Framework classes (hydra/tools/)
  - Core algorithms (Rewriting.java, Reduction.java)
  - The native Java DSL → JSON driver (hydra/UpdateJavaJson.java)
  - Language-specific parsers
- dist/java/hydra-kernel/src/main/java/ contains:
  - Generated Java code from Hydra DSL sources
  - Core types (hydra/core/)
  - Graph and module structures
  - Type adapters and computational abstractions
  - Generated via writeJava in heads/haskell
The Java implementation uses the visitor pattern to represent algebraic data types.
See Hydra-Java README for details.
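The visitor encoding mentioned above can be sketched as follows. The sketch is in Python rather than Java, so it shows only the shape of the pattern, not the generated Java API; the `Shape`, `Circle`, `Rectangle`, `ShapeVisitor`, and `Area` names are invented for illustration.

```python
import math
from abc import ABC, abstractmethod
from dataclasses import dataclass


class ShapeVisitor(ABC):
    """One method per variant of the sum type."""

    @abstractmethod
    def visit_circle(self, circle: "Circle") -> float: ...

    @abstractmethod
    def visit_rectangle(self, rectangle: "Rectangle") -> float: ...


class Shape(ABC):
    """The sum type: each variant routes itself to the matching visitor method."""

    @abstractmethod
    def accept(self, visitor: ShapeVisitor) -> float: ...


@dataclass(frozen=True)
class Circle(Shape):
    radius: float

    def accept(self, visitor: ShapeVisitor) -> float:
        return visitor.visit_circle(self)


@dataclass(frozen=True)
class Rectangle(Shape):
    width: float
    height: float

    def accept(self, visitor: ShapeVisitor) -> float:
        return visitor.visit_rectangle(self)


class Area(ShapeVisitor):
    """A function over the sum type, written as one case per variant."""

    def visit_circle(self, circle: Circle) -> float:
        return math.pi * circle.radius ** 2

    def visit_rectangle(self, rectangle: Rectangle) -> float:
        return rectangle.width * rectangle.height


print(Rectangle(2.0, 3.0).accept(Area()))  # 6.0
```

A visitor that omits a case cannot be instantiated, which gives roughly the exhaustiveness guarantee that pattern matching provides in Haskell.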
Extension modules are organized into four domain-specific packages:
- packages/hydra-pg/ - Property graph models, coders, and related tools. See the Hydra-PG README.
  - PG data model (Hydra/Sources/Pg/)
  - GraphSON, GQL, Cypher, TinkerPop syntax models
  - Graphviz support
- packages/hydra-rdf/ - RDF, SHACL, OWL, ShEx, and XML schema models. See the Hydra-RDF README.
  - RDF syntax model (Hydra/Sources/Rdf/)
  - SHACL model and coder (Hydra/Sources/Shacl/)
  - OWL 2 syntax model (Hydra/Sources/Owl/)
- packages/hydra-ext/ - Long-tail extension coders
  - Avro, Protobuf, GraphQL, Pegasus
  - Cpp, Csharp, Go, Rust, TypeScript syntax models
  - Kusto, Delta, Datalog, JSON Schema, YAML, and other miscellaneous models
- packages/hydra-bench/ - Synthetic inference benchmark workloads (hydra.bench.*). See the Hydra-Bench README. Deliberately stress-shaped; not regenerated by the default sync. Use bin/sync-bench.sh to refresh on demand before running bin/run-inference-bench.sh.
Generated code for these packages lives in dist/*/hydra-pg/, dist/*/hydra-rdf/,
dist/*/hydra-ext/, and dist/*/hydra-bench/.
Demos are in demos/ at the repository root.
The Python implementation uses the same pattern as other implementations.
The Python coder DSL sources are themselves written in Python (as of 0.15);
hand-written runtime lives under heads/python/; generated code under dist/python/hydra-kernel/.
- packages/hydra-python/src/main/python/hydra/sources/python/ contains:
  - The Python coder DSL sources: syntax.py, language.py, coder.py, serde.py, names.py, utils.py, environment.py, testing.py
  - Support modules (_python_helpers.py, _kernel_refs.py)
  - These are the source of truth for hydra.python.* modules; the self-host entry point is bin/generate-hydra-python-from-python.sh. A legacy Haskell-DSL copy of the same modules still lives under packages/hydra-python/src/main/haskell/Hydra/Sources/Python/ and produces byte-identical output. It will be dropped before 0.16; the main sync sequence will switch over to the Python-native pipeline in the meantime.
- heads/python/src/main/python/ contains:
  - Hand-written primitive implementations (hydra/lib/)
  - DSL utilities (hydra/dsl/)
  - Language-specific parsers and extensions
- dist/python/hydra-kernel/src/main/python/ contains:
  - Generated Python code from Hydra DSL sources
  - Core types (hydra/core.py)
  - Graph and module structures (hydra/graph.py, hydra/module.py)
  - Type inference and checking (hydra/inference.py, hydra/checking.py)
  - Term transformations (hydra/reduction.py, hydra/rewriting.py, hydra/hoisting.py)
  - Generated via writePython in heads/haskell
- dist/python/hydra-kernel/src/test/python/ contains:
  - Generated test suite ensuring parity with Haskell, Java, Python, Scala, and Lisp
  - Generation tests (terms generated to Python and executed)
See Hydra-Python README for details.
- Implementation - Detailed implementation guide including the bootstrap process
- Testing - Common test suite and language-specific testing
- Concepts - Core concepts and design principles