Testing
Hydra maintains a comprehensive test suite to ensure correctness and parity across all language implementations. Testing is organized into two categories: the common test suite (shared across all implementations) and language-specific tests.
The common test suite (hydra.test.testSuite) is designed to run identically in each Hydra implementation.
Passing all test cases in the common test suite is the criterion for a true Hydra implementation,
ensuring that all implementations behave identically
and can interoperate in heterogeneous environments.
To run tests, you need:
- Hydra built locally (see main README)
- For Haskell: Stack installed
- For Java: JDK 11+ and Gradle
- For Python: Python 3.12+ and pytest
This guide is for:
- Contributors adding new tests to the common suite
- Developers implementing new language backends
- Anyone debugging test failures
See also:
- Test Suite Architecture - Detailed architecture of the test kernel and module structure
- Implementation - How tests are code-generated
- Concepts - Understanding Flow and primitives
The common test suite is part of Hydra's "test kernel" and is written in the Hydra DSL at Hydra/Sources/Test. Like the rest of Hydra's kernel, these test definitions are code-generated into Haskell, Java, Python, Scala, TypeScript, and the Lisp dialects using Hydra's own coders.
Hydra's Common Test Suite gives rise to two kinds of tests: kernel tests and generation tests.
Both are derived from the same suite of test cases (defined in Hydra/Sources/Test/),
but they are instantiated differently to test different modes of operation:
- Kernel tests validate that Hydra's runtime works correctly. The test cases are code-generated into each target language, and each implementation runs them against its own kernel (primitives, type checker, reducer, etc.). This answers: "Does the Hydra kernel behave correctly in this language?"
- Generation tests validate that Hydra's code generators produce correct output. The same test cases are used to generate code in each target language, then verify that the generated code compiles and produces the expected results. This answers: "Does code generated by Hydra work correctly in the target language?"
In other words, kernel tests exercise Hydra-as-interpreter, while generation tests exercise Hydra-as-compiler.
| Aspect | Kernel Tests | Generation Tests |
|---|---|---|
| Purpose | Test the Hydra runtime | Test code generation |
| Source | Common test suite DSL | Common test suite DSL |
| Instantiation | Generated test harness runs against kernel | Generated target-language code is compiled and executed |
| Regeneration | bin/sync.sh (or per-language bin/sync-<lang>.sh) emits both kernel-test and generation-test artifacts to dist/<lang>/hydra-kernel/src/test/ | Same command as kernel tests |
Both test types are generated to dist/<lang>/hydra-kernel/src/test/
but serve complementary purposes in ensuring Hydra's correctness across both modes of operation.
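The difference between the two modes can be illustrated with a toy model. The sketch below is not Hydra code: `PrimCase`, `KERNEL`, and `TEMPLATES` are hypothetical stand-ins invented for illustration. It checks one test case twice, once by evaluating it in-process (kernel-test style) and once by emitting source code and executing the result (generation-test style).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrimCase:
    """Hypothetical stand-in for one primitive test case."""
    name: str
    primitive: str
    argument: list
    expected: list


# Toy "kernel": primitive names mapped to native implementations.
KERNEL = {"lists.reverse": lambda xs: list(reversed(xs))}

# Toy "coder": primitive names mapped to target-language source templates.
TEMPLATES = {"lists.reverse": "list(reversed({arg}))"}


def run_kernel_test(case: PrimCase) -> bool:
    """Interpreter mode: evaluate the case against the in-process kernel."""
    return KERNEL[case.primitive](case.argument) == case.expected


def generate_test_source(case: PrimCase) -> str:
    """Compiler mode, step 1: emit standalone source code for the case."""
    expr = TEMPLATES[case.primitive].format(arg=repr(case.argument))
    return f"def generated_test():\n    return {expr} == {case.expected!r}\n"


def run_generation_test(case: PrimCase) -> bool:
    """Compiler mode, step 2: load the emitted source and execute it."""
    namespace: dict = {}
    exec(generate_test_source(case), namespace)
    return namespace["generated_test"]()


case = PrimCase("basic list", "lists.reverse", [1, 2, 3, 4, 5], [5, 4, 3, 2, 1])
print(run_kernel_test(case), run_generation_test(case))
```

Both checks start from the same case definition. A bug in the toy kernel would fail only the first check, and a bug in the template would fail only the second, which is why the two kinds of test complement each other.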
The common test suite currently includes:
Tests for Hydra's standard library functions in hydra.lib:
List primitives (Test/Lib/Lists.hs):
- apply, bind, concat, cons, drop, elem
- filter, find, foldl, foldr, group, intercalate, intersperse
- length, map, maybeAt, maybeHead, maybeInit, maybeLast, maybeTail
- nub, null, partition, pure, replicate, reverse
- singleton, sort, span, take, transpose, uncons, zip
Example test cases:
- lists.maybeAt: Access elements at specific indices; Nothing on out-of-bounds
- lists.map: Apply functions over lists
- lists.concat: Concatenate nested lists
- lists.reverse: Reverse list order
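As an illustration of the behavior the `lists.maybeAt` cases describe, here is a minimal Python sketch. The function name `maybe_at`, its argument order, and the treatment of negative indices as out of bounds are assumptions made for this example; they are not taken from Hydra's Python implementation.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def maybe_at(index: int, items: Sequence[T]) -> Optional[T]:
    """Return items[index], or None when the index is out of bounds.

    Negative indices are rejected here (an assumption), unlike Python's
    native indexing, which would count from the end of the sequence.
    """
    if 0 <= index < len(items):
        return items[index]
    return None


print(maybe_at(1, [10, 20, 30]))  # in bounds
print(maybe_at(3, [10, 20, 30]))  # out of bounds
```

Using `None` for the missing case keeps the sketch short, but it cannot distinguish "no element" from a stored `None`; an implementation with a dedicated optional type would not have that ambiguity.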
String primitives (Test/Lib/Strings.hs):
- cat, cat2, fromList, intercalate
- length, lines, maybeCharAt, null, splitOn, toList
- toLower, toUpper, unlines
Example test cases:
- Unicode handling: "ñ世🌍" (combining characters, multi-byte)
- Empty string edge cases
- Control characters and special characters
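Unicode cases like these are useful for cross-language parity because the "length" of a string depends on what is counted. The sketch below uses only Python's standard library to count the sample string in several ways. Which of these counts Hydra's string primitives specify is not stated here, so treat this as background rather than a description of Hydra's behavior.

```python
import unicodedata

# "ñ世🌍" written with explicit escapes, using a precomposed ñ (U+00F1).
sample = "\u00f1\u4e16\U0001F30D"

code_points = len(sample)                           # Unicode code points
utf8_bytes = len(sample.encode("utf-8"))            # bytes in UTF-8
utf16_units = len(sample.encode("utf-16-le")) // 2  # UTF-16 code units

# NFD normalization splits ñ into "n" plus a combining tilde (U+0303).
decomposed = unicodedata.normalize("NFD", sample)

print(code_points, utf8_bytes, utf16_units, len(decomposed))
```

A language whose native string length counts UTF-16 code units (as Java's `String.length()` does) reports a different number than one that counts code points, so a shared test case pins down which answer every implementation must give.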
Case convention conversion tests (Test/Formatting.hs):
- lower_snake_case ↔ UPPER_SNAKE_CASE ↔ camelCase ↔ PascalCase
- Handles numbers and edge cases: "a_hello_world_42_a42_42a_b"
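To show why an input with embedded numbers makes a good edge case, here is a small Python sketch of snake-case conversions. The helper names are hypothetical, the rules (split on underscores, capitalize the first character of each word) are a simplification, and the outputs are those of this sketch, not necessarily the results Hydra's test cases expect.

```python
def split_snake(name: str) -> list[str]:
    """Split a snake_case name (lower or upper) into lowercase words."""
    return [part.lower() for part in name.split("_") if part]


def _capitalize(word: str) -> str:
    # Uppercase only the first character; digits are left unchanged.
    return word[:1].upper() + word[1:]


def to_camel(words: list[str]) -> str:
    if not words:
        return ""
    return words[0] + "".join(_capitalize(w) for w in words[1:])


def to_pascal(words: list[str]) -> str:
    return "".join(_capitalize(w) for w in words)


def to_upper_snake(words: list[str]) -> str:
    return "_".join(w.upper() for w in words)


words = split_snake("a_hello_world_42_a42_42a_b")
print(to_camel(words))
print(to_pascal(words))
print(to_upper_snake(words))
```

In this sketch the camelCase form runs the numeric words together ("42", "A42", "42a" become "42A4242a"), so the original word boundaries cannot be recovered from it. That loss of information is the kind of behavior a shared test case has to fix across implementations.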
Tests for Hydra's type inference system (Test/Inference):
- AlgebraicTypes.hs: Sum types, product types, union/record inference
- NominalTypes.hs: Named type definitions
- Fundamentals.hs: Basic inference cases
- Classes.hs: Type-class constraint propagation (equality, ordering)
- AlgorithmW.hs: Algorithm W implementation tests
- KernelExamples.hs: Real kernel code examples
- Failures.hs: Expected failure cases
The common test suite supports multiple test case types:
- Evaluation tests: Verify that a term reduces to an expected result
- Case conversion tests: Verify string case convention transformations
- Inference tests: Verify that type inference produces the expected type
- Inference failure tests: Verify that type inference fails as expected for invalid terms
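One way to picture these four kinds is as a tagged union with one checking rule per variant. The Python sketch below is illustrative only: the class names, fields, and the toy `toy_reduce` / `toy_infer` functions are hypothetical and much simpler than Hydra's generated test model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvaluationCase:
    term: object
    expected: object


@dataclass(frozen=True)
class CaseConversionCase:
    source: str
    expected: str


@dataclass(frozen=True)
class InferenceCase:
    term: object
    expected_type: str


@dataclass(frozen=True)
class InferenceFailureCase:
    term: object


class InferenceError(Exception):
    """Raised by the toy inferencer when a term has no type."""


def check(case, reduce: Callable, convert: Callable, infer: Callable) -> bool:
    """Dispatch on the case kind and report whether the case passes."""
    if isinstance(case, EvaluationCase):
        return reduce(case.term) == case.expected
    if isinstance(case, CaseConversionCase):
        return convert(case.source) == case.expected
    if isinstance(case, InferenceCase):
        return infer(case.term) == case.expected_type
    if isinstance(case, InferenceFailureCase):
        try:
            infer(case.term)
        except InferenceError:
            return True  # failing is the expected outcome
        return False
    raise TypeError(f"unknown test case kind: {type(case).__name__}")


def toy_reduce(term):
    op, left, right = term
    if op == "add":
        return left + right
    raise ValueError(f"unknown operator: {op}")


def toy_infer(term):
    if isinstance(term, int):
        return "int32"
    if isinstance(term, str):
        return "string"
    raise InferenceError(f"cannot infer a type for {term!r}")


cases = [
    EvaluationCase(("add", 1, 2), 3),
    CaseConversionCase("hello_world", "HELLO_WORLD"),
    InferenceCase(42, "int32"),
    InferenceFailureCase(None),
]
results = [check(c, toy_reduce, str.upper, toy_infer) for c in cases]
print(results)
```

Note that an inference failure case passes only when inference raises; a term that unexpectedly type-checks counts as a test failure.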
Tests are defined in the Hydra DSL in packages/hydra-kernel/src/main/haskell/Hydra/Sources/Test/:

    -- Example from Test/Lib/Lists.hs
    listsReverse = TestGroup "reverse" Nothing [] [
        test "basic list" [1, 2, 3, 4, 5] [5, 4, 3, 2, 1],
        test "single element" [42] [42],
        test "empty list" [] []]
      where
        test name input output =
          primCase name _lists_reverse [intList input] (intList output)

Test definitions are generated into each target language using Hydra's coders:
- Haskell: dist/haskell/hydra-kernel/src/test/haskell/Hydra/Test/TestSuite.hs
- Java: dist/java/hydra-kernel/src/test/java/hydra/test/
- Python: dist/python/hydra-kernel/src/test/python/hydra/test/
Regenerating tests: kernel-test and generation-test artifacts are produced by the
standard sync pipeline. Running bin/sync-haskell.sh (or the full bin/sync.sh) emits
both kinds via bootstrap-from-json, writing to dist/<lang>/hydra-kernel/src/test/.
For a Haskell-only refresh of the generation tests after editing test sources:

    heads/haskell/bin/update-generation-tests.sh

See Test Generation Architecture below for details on test sources.
Each language has a test runner that executes the generated test suite:
Haskell (TestSuiteSpec.hs):
- Uses HSpec framework
- Runs: `stack test` in `heads/haskell/`
- Interactive: `stack ghci hydra:lib hydra:hydra-test`, then `Test.Hspec.hspec Hydra.TestSuiteSpec.spec`
Java (TestSuiteRunner.java):
- Uses JUnit 5 with parameterized tests
- Runs: `./gradlew test` from the root directory
- Executes evaluation tests via term reduction
Python (test_suite_runner.py):
- Uses pytest framework
- Runs: `pytest src/test/python/test_suite_runner.py` in `heads/python/`
- Dynamically generates pytest test functions from the test suite
- Includes evaluation tests via term reduction
The common test suite ensures that all Hydra implementations behave identically:
- Same test definitions: All implementations run exactly the same tests (generated from one source)
- Same primitive functions: Each language implements the same set of primitives with identical behavior
- Same reduction semantics: Term evaluation produces identical results across languages
- Continuous validation: Tests run on every build to catch regressions
This is critical for heterogeneous environments like Apache TinkerPop where the same logic must execute identically across multiple languages.
Test cases live in DSL-defined modules under
packages/hydra-kernel/src/main/haskell/Hydra/Sources/Test/.
Each test module is a regular Hydra module that constructs TestCase /
TestGroup values; the same module is code-generated into every
target language alongside the kernel, and each language's runner walks
the resulting TestSuite tree to execute the cases.
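The tree walk that a runner performs can be sketched in a few lines of Python. The `Group` and `Case` classes below are hypothetical simplifications; their fields (name, subgroups, cases) are inferred from the shape of the Haskell `TestGroup` example and are assumptions, not the generated Hydra types.

```python
from dataclasses import dataclass, field
from typing import Iterator


@dataclass
class Case:
    name: str
    actual: object    # in a real runner this would be computed by the kernel
    expected: object


@dataclass
class Group:
    name: str
    subgroups: list["Group"] = field(default_factory=list)
    cases: list[Case] = field(default_factory=list)


def walk(group: Group, prefix: tuple = ()) -> Iterator[tuple[str, Case]]:
    """Yield (test id, case) pairs depth-first, with ids built from the path."""
    path = prefix + (group.name,)
    for case in group.cases:
        yield " > ".join(path + (case.name,)), case
    for sub in group.subgroups:
        yield from walk(sub, path)


def run(suite: Group) -> dict[str, bool]:
    """Execute every case in the tree and record pass/fail per test id."""
    return {test_id: case.actual == case.expected for test_id, case in walk(suite)}


suite = Group("lists", subgroups=[
    Group("reverse", cases=[
        Case("basic list", list(reversed([1, 2, 3])), [3, 2, 1]),
        Case("empty list", list(reversed([])), []),
    ]),
])
for test_id, passed in run(suite).items():
    print(test_id, passed)
```

A flat list of (id, case) pairs like the one `walk` produces is also a convenient input for a framework's parameterized-test mechanism, such as `pytest.mark.parametrize` with its `ids` argument. Whether the actual runners are structured this way is not something this sketch claims.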
The kernel test modules cover:
- hydra.test.checking.*: type checker tests (algebraic types, collections, fundamentals, nominal types, advanced cases, expected failures).
- hydra.test.inference.*: type inference tests (algorithm W, algebraic types, classes, collection terms, fundamentals, kernel examples, nominal types, expected failures).
- hydra.test.lib.*: primitive-library tests (per-namespace).
- Topical suites: hydra.test.formatting, hydra.test.generation, hydra.test.hoisting, hydra.test.json.*, hydra.test.ordering, hydra.test.reduction, hydra.test.rewriting, hydra.test.serialization, hydra.test.sorting, hydra.test.strip, hydra.test.substitution, hydra.test.unification, hydra.test.validate.*, hydra.test.variables.
- The umbrella hydra.test.testSuite module aggregates every test group above into a single TestSuite tree.
Adding a new test case means adding a TestCase/TestGroup value to
one of these modules (or creating a new module and wiring it into
hydra.test.testSuite). After regeneration, every language picks up
the new case via the next sync. See the
Extending tests recipe
for step-by-step instructions.
This section provides quick-start instructions for running tests in each language. For complete setup instructions (installing dependencies, configuring your environment), see the language-specific READMEs linked below.
See Hydra-Haskell README for full setup instructions.
Run all tests (kernel tests + generation tests + language-specific tests):

    cd heads/haskell
    stack test

Interactive testing (useful for debugging):

    stack ghci hydra:lib hydra:hydra-test

    -- Run kernel tests
    Test.Hspec.hspec Hydra.TestSuiteSpec.spec
    -- Run generation tests
    Test.Hspec.hspec Generation.Spec.spec

Regenerate tests after modifying test sources:

    # Regenerate Haskell kernel + generation tests via the standard sync pipeline
    heads/haskell/bin/sync-haskell.sh  # or bin/sync-haskell.sh from worktree root
    # Generation tests only (e.g. for fast iteration on hspec specs)
    heads/haskell/bin/update-generation-tests.sh
    # Run updated tests
    stack test

Test directory structure:

    heads/haskell/
    ├── src/test/haskell/              # Hand-written test infrastructure
    │   ├── Spec.hs                    # Entry point (hspec-discover)
    │   └── Hydra/TestSuiteSpec.hs     # Kernel test runner
    dist/haskell/hydra-kernel/
    ├── src/test/haskell/              # Generated tests
    │   ├── Hydra/Test/                # Kernel test data (TestGroup structures)
    │   └── Generation/                # Generation tests (executable specs)

See Hydra-Java README for full setup instructions.
Run all tests (kernel tests + generation tests + Java-specific tests, from the repository root):

    ./gradlew :hydra-java:test

Run a specific test class:

    ./gradlew test --tests "hydra.VisitorTest"

Regenerate tests (from the worktree root):

    ./bin/sync-java.sh

This wraps bin/sync.sh --hosts java --targets java, which regenerates
Java code across every package (kernel modules, eval libs, coders,
tests) and runs the Java test suite.
For manual generation, see the Hydra-Java README.

Test directory structure:

    heads/java/
    ├── src/test/java/                     # Hand-written test infrastructure
    │   ├── hydra/TestSuiteRunner.java     # Kernel test runner (JUnit 5)
    │   └── hydra/VisitorTest.java         # Java-specific tests
    dist/java/hydra-kernel/
    ├── src/test/java/                     # Generated tests
    │   ├── hydra/test/                    # Kernel test data (TestGroup structures)
    │   └── generation/                    # Generation tests (executable JUnit tests)

See Hydra-Python README for full setup instructions.
Run all tests (kernel tests + Python-specific tests):

    cd heads/python
    pytest

Run only the common test suite (kernel tests):

    pytest src/test/python/test_suite_runner.py

Run generation tests (tests generated code correctness):

    pytest ../../dist/python/hydra-kernel/src/test/python/generation/

Run a specific test file:

    pytest src/test/python/test_grammar.py

Generate a categorized summary report:

    python src/test/python/test_summary_report.py

Regenerate tests (from the worktree root):

    ./bin/sync-python.sh

This wraps bin/sync.sh --hosts python --targets python, which
regenerates Python code across every package (kernel modules, eval
libs, coders, tests) and runs the pytest suite.

For manual regeneration using bootstrap-from-json:

    cd heads/haskell
    stack build hydra:exe:bootstrap-from-json
    stack exec bootstrap-from-json -- --target python --include-coders --include-tests --include-gentests +RTS -K256M -A32M -RTS

Known limitations: The Python coder does not yet support all literal types (e.g., float32),
so some generation tests may fail to generate.
Run the generation script to see the current status.

Test directory structure:

    heads/python/
    ├── src/test/python/               # Hand-written tests
    │   ├── test_suite_runner.py       # Kernel test runner
    │   ├── test_grammar.py            # Python-specific tests
    │   └── ...
    dist/python/hydra-kernel/
    ├── src/test/python/               # Generated tests
    │   ├── hydra/test/                # Kernel test data (TestGroup structures)
    │   └── generation/                # Generation tests (pytest test files)
    │       └── hydra/test/lib/        # e.g., test_chars.py, test_lists.py

In addition to the common test suite, each implementation has language-specific tests:
Located in heads/haskell/src/test/haskell:
- Haskell coder tests
- DSL tests
- Property-based tests (QuickCheck)
- Integration tests
Generation tests (in dist/haskell/hydra-kernel/src/test/haskell/Generation/) verify Haskell code generation.
Located in heads/java/src/test/java:
- Java coder tests
- Serialization tests
- Integration tests
- Performance tests
Located in heads/python/src/test/python:
- test_python.py - Python-specific functionality
- test_json.py - JSON encoding/decoding
- test_grammar.py - Grammar parsing
- test_generated_code.py - Generated code validation
Run with: `pytest` in the heads/python/ directory
- Create test definitions in packages/hydra-kernel/src/main/haskell/Hydra/Sources/Test/
  - Add to existing test groups or create new ones
  - Use the testing DSL (TestGroup, primCase, etc.)
- Register tests in TestSuite.hs
  - Add test group binding
  - Include in appropriate parent group
- Regenerate tests for all implementations via the standard sync pipeline:

      # Single language (host == target)
      bin/sync-haskell.sh   # also bin/sync-java.sh, bin/sync-python.sh, ...
      # Full matrix (every host × every target)
      bin/sync.sh --hosts all --targets all
      # Triad (Haskell, Java, Python)
      bin/sync-default.sh

  Each invocation emits both kernel-test data and executable generation tests under dist/<lang>/hydra-kernel/src/test/.
- Update Haskell generation tests in isolation (faster iteration):

      heads/haskell/bin/update-generation-tests.sh

- Run tests in each language to verify
See also:
- DSL guide - Learn the testing DSL syntax (TestGroup, primCase, etc.)
- Developer recipes - Step-by-step guides for common tasks
Add tests directly to the language-specific test directory using that language's testing framework.
The common test suite currently provides:
- ✅ List primitives: ~30 functions with multiple test cases each
- ✅ String primitives: ~13 functions including Unicode handling
- ✅ Case conversion: All major case conventions
- ✅ Type inference: Basic inference, algebraic types, nominal types, failures
- 🚧 Future expansion: More primitive functions, coders, adapters, and kernel functionality
The test suite is continuously expanding to cover more of Hydra's functionality.
- Hydra-Haskell Testing - Haskell-specific test instructions
- Hydra-Java Testing - Java test setup
- Hydra-Python Testing - Python test environment