174 changes: 125 additions & 49 deletions DEVELOPMENT.md
@@ -1,82 +1,158 @@
# cljs repl
# Development Guide

```shell
npx shadow-cljs watch dev
```
This document describes how to set up your development environment and contribute to the project.

Observe browser window open with a message like:
> Code entered in a browser-repl prompt will be evaluated here.
## Prerequisites

Then connect and select the appropriate shadow REPL.
- **Java 21+** (uses virtual threads)
- **Clojure CLI** 1.12+
- **Node.js** (for ClojureScript tests)
- **Docker** + **Docker Compose** (for integration tests with PostgreSQL and FoundationDB)

```clojure
(require '[shadow.cljs.devtools.api :as shadow])
(shadow/browser-repl)
```
## Quick Start

# doc
```bash
# Install dependencies and run tests
bin/kaocha
```

```shell
npx shadow-cljs watch doc
...
shadow-cljs - HTTP server available at http://localhost:8000
#open the browser
## Project Structure

# or
npx shadow-cljs compile doc
python -m http.server --directory public
```
intemporal/
├── src/intemporal/ # Main source code
│ ├── core.cljc # Public API
│ ├── protocol.cljc # Core protocols (IStore, etc.)
│ ├── store.cljc # In-memory store
│ ├── store/ # JDBC and FDB stores
│ └── internal/ # Internal implementation
├── test/ # Tests
├── dev/ # Development utilities
└── resources/migrations/ # Database migrations
```

# cljs repl
## Database Setup

For integration and chaos tests, start the databases:

```bash
docker compose up -d postgresql foundation
```
clj -A:dev:doc:cljs
```

# Tests
- **PostgreSQL** on port 5432 — `jdbc:postgresql://localhost:5432/root?user=root&password=root`
- **FoundationDB** on port 4500 — cluster file at `docker/fdb.cluster`

Override the Postgres URL with `DATABASE_URL` (kaocha store/integration tests) or
`POSTGRES_JDBC_URI` (the chaos harness) if your setup differs.

```shell
## Running Tests

```bash
# Everything: JVM + ClojureScript
bin/kaocha

# Fast JVM tests, skips ^:integration (no DB needed)
bin/kaocha :in-memory

# JVM tests incl. ^:integration (needs PostgreSQL + FoundationDB)
bin/kaocha :test

# ClojureScript tests (Node)
bin/kaocha :test-cljs

# or run everything
bin/run-coverage
# Focus a single namespace (use hyphens, not underscores)
bin/kaocha :test --focus intemporal.tests.signal-test
```

# focusing
./bin/kaocha :test --focus intemporal.tests.signal-test
## Jepsen / Chaos Tests

# cljs focus is a bit different
./bin/kaocha :test-cljs --focus 'cljs:intemporal.tests.signal-test'
There are **two** distinct things under the "jepsen" name.

```
### 1. Per-scenario bug guard tests — `test/intemporal/tests/jepsen/`

# CI runs
Deterministic single-JVM tests, one namespace per known failure mode, each exercising
InMemory + JDBC + FDB. They double as regression guards: the test for a *fixed* bug asserts
the correct behaviour, while the test for an *unfixed* bug asserts the buggy behaviour it
still exhibits.

Install earthly: https://earthly.dev
| Namespace | Bug (see `improvements.md`) | State |
|---|---|---|
| `bug-1-1-test` | Lost wake on signal across pods | buggy (Phase C) |
| `bug-1-2-test` | Concurrent same-seq write corruption | buggy (Phase C) |
| `bug-1-3-test` | No recovery poller on restart | buggy (Phase C) |
| `bug-2-1-test` | Register-then-consume signal race | **fixed** (Phase A) |
| `bug-2-3-test` | Cancel can't reach a sleeper | **fixed** (Phase A) |

```bash
# in-memory variants only (no DB)
bin/kaocha :in-memory --focus intemporal.tests.jepsen.bug-2-1-test \
--focus intemporal.tests.jepsen.bug-2-3-test

# all three stores (start PG + FDB first)
docker compose up -d postgresql foundation
bin/kaocha :test --focus intemporal.tests.jepsen.bug-1-1-test \
--focus intemporal.tests.jepsen.bug-1-2-test \
--focus intemporal.tests.jepsen.bug-1-3-test \
--focus intemporal.tests.jepsen.bug-2-1-test \
--focus intemporal.tests.jepsen.bug-2-3-test
```
earthly -P -i +test
```

# Check FDB is working for your architecture
`racing_store.clj` is a shared `IStore` wrapper that pins the executing thread inside the
signal consume/register window, so `bug-2-1` reproduces its race deterministically on every run.

### 2. Forked-JVM chaos harness — `test/intemporal/jepsen/`

Boots N worker JVMs against one Postgres, drives a submit/signal/cancel generator and a
nemesis that SIGKILL/SIGTERMs and restarts workers, then checks invariants after a quiesce
phase. This is the integration vehicle for the Phase C multi-pod work. Full design:
[test/intemporal/jepsen/README.md](test/intemporal/jepsen/README.md).

```bash
docker compose up -d postgresql

# default chaos run: 4 workers, 120s active, 90s grace
clojure -X:dev:jdbc:jepsen intemporal.jepsen.runner/run :workers 4 :duration 120

```shell
# no-kill baseline (should pass all checkers)
clojure -X:dev:jdbc:jepsen intemporal.jepsen.runner/run :workers 4 :duration 60 :no-kill true

$ JAVA_OPTS="-DFDB_LIBRARY_PATH_FDB_C=/usr/local/lib/libfdb_c.dylib -DFDB_LIBRARY_PATH_FDB_JAVA=/usr/local/lib/libfdb_java.jnilib" clj -A:fdb:jdbc
# aggressive
clojure -X:dev:jdbc:jepsen intemporal.jepsen.runner/run \
:workers 6 :duration 180 :nemesis-min-ms 1500 :nemesis-jitter-ms 3000 :min-alive 1 :grace-s 120
```

The runner forks workers via the `:jepsen-worker` alias; both `:jepsen` and `:jepsen-worker`
are defined in `deps.edn`. The Postgres URL comes from `POSTGRES_JDBC_URI` (default localhost).
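
The env-var-with-default lookup described above can be sketched as follows. This is only an illustration of the pattern: `default-jdbc-url` and `jdbc-url` are hypothetical names, the default is assumed to be the URL listed under Database Setup, and the harness's actual lookup code is not shown in this document.

```clojure
;; Hypothetical sketch: resolve a JDBC URL from an environment variable,
;; falling back to the local docker-compose default when it is unset.
(def default-jdbc-url
  "jdbc:postgresql://localhost:5432/root?user=root&password=root")

(defn jdbc-url
  "Returns the value of env-var if it is set, otherwise default-jdbc-url."
  [env-var]
  (or (System/getenv env-var) default-jdbc-url))

;; With POSTGRES_JDBC_URI unset this returns the localhost default.
(jdbc-url "POSTGRES_JDBC_URI")
```

Exporting `POSTGRES_JDBC_URI` in the shell before launching the runner is then enough to point a run at a different Postgres instance.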

### Standalone bug reproducer

(import 'com.apple.foundationdb.JNIUtil)
(let [method (.getDeclaredMethod com.apple.foundationdb.JNIUtil "loadLibrary" (into-array Class [String]))]
(.setAccessible method true)
(.invoke method com.apple.foundationdb.JNIUtil (object-array ["fdb_java"]))
(.invoke method com.apple.foundationdb.JNIUtil (object-array ["fdb_c"])))
`dev/verify_bugs.clj` runs all five scenarios against JDBC + FDB and prints a pass/fail
report — a quick end-to-end smoke check:

```bash
clojure -X:dev:jdbc:fdb verify-bugs/run
```

# Telemetry
### Known flaky test

# Get the OT javaagent
`intemporal.tests.replay-check-test/test-log-once-workflow` can fail under full-suite load
(`run-once` persists its dedup marker lazily; parallel `async`/`join-all` can re-run the
thunk). It is **pre-existing** (reproduces on pre-Phase-A commits) and unrelated to the
signal/cancel work. It passes reliably in isolation.

```shell
wget --content-disposition https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases/download/v2.21.0/opentelemetry-javaagent.jar
## REPL Development

```bash
clojure -A:dev # REPL with dev + test deps
clojure -A:dev:jdbc # + PostgreSQL/JDBC
clojure -A:dev:fdb # + FoundationDB
clojure -M:nrepl # nREPL server on port 7888
```
Run with the `dev` profile to activate the java agent.

## Code Style

- Follow standard Clojure conventions
- Use `kebab-case` for functions and variables
- Keep functions small and focused
- Write tests for new functionality
- File names use underscores (`signal_test.clj`); namespaces use hyphens (`signal-test`)
- Always pass `--color=never` to `grep`
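
The file-name versus namespace rule above follows Clojure's own munging convention. A minimal sketch, where `ns->path` is a hypothetical helper written for illustration and not part of the project:

```clojure
(require '[clojure.string :as str])

(defn ns->path
  "Maps a namespace symbol to the relative source path Clojure expects:
  hyphens become underscores, dots become directory separators."
  [ns-sym]
  (str (str/replace (namespace-munge ns-sym) "." "/") ".clj"))

(ns->path 'intemporal.tests.signal-test)
;; => "intemporal/tests/signal_test.clj"
```

This is why `--focus` takes the hyphenated namespace even though the file on disk uses an underscore.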
50 changes: 50 additions & 0 deletions README.md
@@ -55,6 +55,56 @@ Examples:
(println result)))
```

### Saga / compensations

Create a saga with `intemporal/saga`, register a compensation for each step *after*
it succeeds with `intemporal/add-compensation`, and roll back from a catch block
with `intemporal/compensate`. Compensations run in reverse registration order
(LIFO). A step that fails before its `add-compensation` registers nothing to undo.
Compensations should themselves call activity stubs so they are durable and
replay-safe.

Both real failures and workflow cancellation flow through the catch, so the one
idiom rolls back in either case. Catch `Exception`: the engine's normal
control-flow *suspensions* subclass `Error`, so they are excluded automatically
and propagate to the engine untouched.

```clojure
(defn booking-saga [order]
(let [saga (intemporal/saga)
book-hotel (intemporal/stub #'book-hotel)
book-flight (intemporal/stub #'book-flight)
charge-card (intemporal/stub #'charge-card)
cancel-hotel (intemporal/stub #'cancel-hotel)
cancel-flight (intemporal/stub #'cancel-flight)]
(try
(let [h (book-hotel order)
_ (intemporal/add-compensation saga #(cancel-hotel h))]
(let [f (book-flight order)
_ (intemporal/add-compensation saga #(cancel-flight f))]
;; if charge-card throws, the catch runs compensate -> cancel-flight then
;; cancel-hotel (LIFO) -> then rethrows so the workflow finalizes :failed
(charge-card order)
:booked))
(catch Exception e
(intemporal/compensate saga)
(throw e)))))
```
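
The reverse-order (LIFO) rollback can be illustrated without the library at all. The sketch below is a plain in-memory stand-in, assuming nothing about how `intemporal/saga` is implemented; `new-saga`, `add-compensation!` and `compensate!` are hypothetical names, and none of the durability or replay behaviour is modelled.

```clojure
;; Hypothetical stand-in for a saga: an atom holding a list of thunks.
(defn new-saga [] (atom ()))

;; conj on a list prepends, so the most recently registered thunk comes first.
(defn add-compensation! [saga f] (swap! saga conj f))

;; Run every registered compensation, newest first.
(defn compensate! [saga]
  (doseq [f @saga] (f)))

(def undone (atom []))

(let [saga (new-saga)]
  (add-compensation! saga #(swap! undone conj :cancel-hotel))
  (add-compensation! saga #(swap! undone conj :cancel-flight))
  (compensate! saga))

@undone
;; => [:cancel-flight :cancel-hotel]
```

A step that throws before its compensation is registered never reaches `add-compensation!`, so nothing for that step is ever in the list.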

Cancellation is a catchable `Exception`, so any `(catch Exception ...)` in a
workflow will intercept it — that is what lets a cancelled saga roll back.
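
On the JVM, the `Exception`/`Error` split can be checked with a small standalone sketch. It uses a plain `java.lang.Error` as a stand-in for a suspension, which is an assumption made for illustration; the engine's actual suspension classes are not shown here, and `run-step` is a hypothetical helper.

```clojure
(defn run-step
  "Runs thunk; rolls back on Exception, lets any Error propagate."
  [thunk]
  (try
    (thunk)
    (catch Exception e
      [:rolled-back (ex-message e)])))

;; An ordinary failure is an Exception, so the catch intercepts it.
(run-step #(throw (ex-info "card declined" {})))
;; => [:rolled-back "card declined"]

;; An Error is not an Exception, so it passes straight through run-step.
(try
  (run-step #(throw (Error. "suspend")))
  (catch Error _ :propagated))
;; => :propagated
```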

In **ClojureScript** there is no `Error`/`Exception` split (everything is a
`js/Error`), so `(catch :default e)` would also catch suspensions. There, rethrow
them explicitly:

```clojure
(catch :default e
(when (intemporal/suspension? e) (throw e)) ;; engine control flow
(intemporal/compensate saga)
(throw e))
```

# TODO

- [X] Activities + Workflows
12 changes: 11 additions & 1 deletion deps.edn
@@ -55,4 +55,14 @@
:ns-default build}

:test {:jvm-opts ["--enable-native-access=ALL-UNNAMED"]
:main-opts ["-m" "kaocha.runner"]}}}
:main-opts ["-m" "kaocha.runner"]}

;; Run the chaos harness against a live Postgres instance:
;; clojure -X:dev:jdbc:jepsen intemporal.jepsen.runner/run
:jepsen {:extra-paths ["test" "test/resources"]
:jvm-opts ["--enable-native-access=ALL-UNNAMED"]
:main-opts ["-m" "intemporal.jepsen.runner"]}

;; Entry point for forked worker JVMs launched by db.clj/fork!
:jepsen-worker {:extra-paths ["test" "test/resources"]
:jvm-opts ["--enable-native-access=ALL-UNNAMED"]}}}