From cd323937b123582b5554d40eec82af91f6281ddd Mon Sep 17 00:00:00 2001 From: Jakub Dzikowski Date: Wed, 13 May 2026 17:58:37 +0200 Subject: [PATCH 01/14] Fix changelog Signed-off-by: Jakub Dzikowski --- CHANGELOG.md | 16 +++++----------- 1 file changed, 5 insertions(+), 11 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index be0622c6..4e5d9c1c 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,23 +1,17 @@ # Unreleased -- **Language:** `for in { … }` in workflows and rules iterates newline-delimited lines of a string binding. Newlines normalize `\r\n` to `\n`; a single trailing empty segment from a final newline is omitted. Lines are not trimmed and empty interior lines are still iterated unless the body skips them (e.g. `if line != "" { … }`). Documented in `docs/language.md`. -- **Tests / QA:** Unit tests for string line splitting (`src/runtime/string-lines.test.ts`); E2E `e2e/tests/135_for_string_lines.sh`. - # 0.9.4 ## Summary -Maintenance and simplification: -- **Breaking:** Inbox dispatch is sequential only (parallel config/env removed). Stricter grammar: multiline `config` blocks only; no one-line braced workflows; no semicolon-separated statements in workflow/rule bodies. -- **Runtime:** Single-line shell steps run in the Node runtime (`sh -c`); script capture only on success; async `run` + `recover` return propagation fixed; mock prompts use JSON arm dispatch and an in-memory response queue; inbox artifact files are written only when a route consumes the channel. -- **CLI / install:** Failure footers use the **last** failed step in `run_summary.jsonl`; curl install ships `package.json` so stable installs resolve the correct default Docker image tag. -- **Language:** RHS bare identifiers and bare dotted identifiers are treated as interpolation sugar where applicable. -- **Library:** `artifacts.save(paths)` in single-argument form (path or newline-separated list); `git format-patch` workflows use `--stdout` so patch bytes are captured. -- **Repo:** `node-workflow-runtime` split into arg-parser, event-emitter, and mock modules; test directories consolidated under `integration/`, `test-fixtures/`, `test-infra/`; `JAIPH_TEST_MODE` no longer suppresses stderr events in runtime code (constructor option instead). -- **Docs / DX:** Agent-proxy design note; explicit parse error for `test` blocks outside `*.test.jh`; architecture/inbox corrections; getting-started shortened. +- Feature: `for in { ... }` loop. +- Simplifying: Sequential inbox only; stricter grammar (multiline `config`, no one-line braced workflows, no `;` in workflow/rule bodies). +- Hardening, test refactoring and bug fixes ## All changes +- **Language:** `for in { … }` in workflows and rules iterates newline-delimited lines of a string binding. Newlines normalize `\r\n` to `\n`; a single trailing empty segment from a final newline is omitted. Lines are not trimmed and empty interior lines are still iterated unless the body skips them (e.g. `if line != "" { … }`). Documented in `docs/language.md`. +- **Tests / QA:** Unit tests for string line splitting (`src/runtime/string-lines.test.ts`); E2E `e2e/tests/135_for_string_lines.sh`. - **Breaking — Language:** Inline one-line `config { k = v }` is removed — only the multiline `config {\n … \n}` form parses (matches documented grammar). The formatter no longer emits compact inline `config`, which would be invalid input. Examples such as `examples/async.jh` were migrated. - **Breaking — Language:** Single-line `workflow name() { stmt }` braced form removed; workflow and rule bodies require one statement per line as in the grammar. - **Breaking — Language:** Semicolons no longer separate statements in workflow/rule bodies (`splitStatementsOnSemicolons` remains for `match` arms). Multiple statements on one line joined by `;` must be split across lines. From 918c5934a3b18e08fd6f34ea06f502d605a18c24 Mon Sep 17 00:00:00 2001 From: Jakub Dzikowski Date: Thu, 14 May 2026 16:28:09 +0200 Subject: [PATCH 02/14] Perf: parallelize jaiph install missing-library clones Replace the sequential execSync clone loop in src/cli/commands/install.ts with a small bounded-concurrency executor (default 4 in flight) using spawn("git", ["clone", "--depth", "1", ...]) so independent network and process latency overlap when several libraries are missing. The user contract is unchanged: warm-path libraries (target directory exists and --force is absent) still skip without invoking git for both explicit args and restore-from-lock; failed clones still exit non-zero and do not produce a lock entry; restore-from-lock still does not invent new lock entries. runInstall is now async and exposes injectable CloneRunner and concurrency options for testing. Tests cover concurrent overlap, warm- path skipping for explicit args and restore, invalid-remote and unknown- ref failure paths, and mixed success/failure lockfile bookkeeping. --- CHANGELOG.md | 2 + QUEUE.md | 43 +++---- docs/cli.md | 6 +- docs/libraries.md | 4 +- src/cli/commands/install.test.ts | 181 +++++++++++++++++++++++++++++- src/cli/commands/install.ts | 185 ++++++++++++++++++++++--------- src/cli/index.ts | 2 +- 7 files changed, 342 insertions(+), 81 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 4e5d9c1c..fc2cbf1c 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,7 @@ # Unreleased +- **Performance — `jaiph install` parallelism:** Missing-library clones now run in parallel through a small bounded-concurrency executor (default 4 in flight), replacing the previous sequential `execSync` loop. The user contract is unchanged: warm-path libraries (target directory exists and `--force` is absent) still skip without invoking `git` for both explicit args and restore-from-lock; failed clones still exit non-zero and do not produce a lock entry; restore-from-lock still does not invent new lock entries. The default clone runner now uses `spawn("git", ["clone", "--depth", "1", …])` so multiple clones can overlap network and process latency. `runInstall` is now `async` and exposes injectable `CloneRunner` / `concurrency` options for testing. Tests cover concurrent overlap (peak in-flight ≥ 2), warm-path skipping for explicit args and restore, invalid-remote and unknown-ref failure paths, mixed success/failure lockfile bookkeeping, and the existing corrupt/missing-lockfile behavior. Docs updated in `docs/cli.md` and `docs/libraries.md`. + # 0.9.4 ## Summary diff --git a/QUEUE.md b/QUEUE.md index 72264987..58a76a9e 100644 --- a/QUEUE.md +++ b/QUEUE.md @@ -13,38 +13,39 @@ Process rules: *** -## Performance — investigate and fix slow installation +## Performance — remove redundant local workflow-start work #dev-ready -**Goal** -`jaiph install` (and related dependency or bootstrap steps) feels unreasonably slow; find the dominant cost and improve it without weakening reproducibility (lockfile, shallow clone behavior, etc.). - -**Scope** +**Problem** +The default local `jaiph run ` path does redundant startup work before the first useful workflow event: -* Profile or instrument the install path (git clone, lockfile I/O, post-install) and document the top 1–3 contributors to latency. -* Implement targeted fixes (e.g. avoid redundant work, reduce subprocess churn, cache safely) and verify wall-clock improvement on a cold and warm run where applicable. +* `src/cli/commands/run.ts` parses the entry file to read metadata/config and print the banner. +* `buildScripts()` walks and parses the transitive `.jh` module set to emit script bodies. +* The spawned `src/runtime/kernel/node-workflow-runner.ts` then calls `buildRuntimeGraph()`, which reads and parses the import closure again before constructing `NodeWorkflowRuntime`. -**Acceptance criteria** - -* A short note in the commit or PR description states what was slow and what changed, with before/after rough timings on the same machine. -* `jaiph install` behavior remains correct: same lockfile semantics and failure modes for bad URLs or missing refs. -* `npm test` passes. - -*** - -## Performance — investigate and fix slow workflow start (initial 2–4 s lag) +For small workflows this duplicate parse/graph setup is a plausible source of the observed 2-4 second lag. Optimize this path before chasing Docker, raw mode, or external subprocess costs. **Goal** -When starting workflows (e.g. `jaiph run` / first step), users observe a 2–4 second delay before useful work; reduce that lag or explain and eliminate unnecessary startup work (JIT, imports, process spawn, discovery). +Reduce cold-start latency for default local `jaiph run ` by eliminating avoidable repeated `.jh` reads/parses between CLI compile prep and the runtime graph used by `NodeWorkflowRuntime`. **Scope** -* Reproduce the lag with a minimal `.jh` workflow; trace Node startup, module load, and runtime init (`NodeWorkflowRuntime` and friends). -* Address fixable costs (e.g. defer heavy work, lazy imports, avoid redundant file scans) without changing user-visible workflow semantics. +* In scope: non-Docker, non-`--raw` `jaiph run ` from the host CLI through the spawned Node workflow runner. +* Out of scope: `jaiph run --raw`, Docker startup/image prep, prompt provider latency, shell command runtime, and bootstrap install performance. +* Prefer one shared module-graph/compile-prep representation over separate ad hoc caches. If serialization is used to cross the process boundary, keep it internal and deterministic. +* Preserve user-visible run semantics: banner, hooks, run artifacts, summaries, return values, exit codes, and `__JAIPH_EVENT__` handling must remain compatible with current behavior. + +**Measurement notes** + +* Use a minimal workflow and one imported-module workflow as repro cases. +* Measure time from CLI process start to the first parsed `__JAIPH_EVENT__` line on stderr. If an implementation chooses a different first-event marker, define it in the PR or commit message. +* Record before/after timings on the same machine. These timings are evidence for the optimization, not acceptance criteria. **Acceptance criteria** -* Documented repro (command + minimal file) and what was measured (time to first event / first step). -* Measurable reduction in the cold-start path on a representative case, or a clear justification if the lag is irreducible (e.g. external subprocess). +* A unit or integration test proves the default local run path does not read/parse the entry module once in the parent and then re-read/re-parse the same module in the child to build the runtime graph. The test must fail if the old `run.ts` + `buildScripts()` + `node-workflow-runner.ts` duplicate parse pattern returns. +* A test with at least one imported `.jh` module proves the optimized graph/compile-prep path preserves cross-module workflow, rule, and script resolution. +* Existing local run behavior remains covered: a minimal workflow still emits the expected start/end events, writes run artifacts/summary metadata, returns the workflow return value, and exits with the correct status. +* The change does not alter `jaiph run --raw` or Docker launch behavior; add a focused test or assertion if shared launch code is touched. * `npm test` passes. *** diff --git a/docs/cli.md b/docs/cli.md index b956872f..77e4dae2 100644 --- a/docs/cli.md +++ b/docs/cli.md @@ -348,9 +348,11 @@ jaiph install [--force] **With arguments** — clone each repo into `.jaiph/libs//` (shallow: `--depth 1`) and upsert the entry in `.jaiph/libs.lock`. The library name is derived from the URL: last path segment, stripped of `.git` suffix (e.g. `github.com/you/queue-lib.git` → `queue-lib`). Version pinning is usually written as **`https://…/name.git@`**; other URL shapes with a trailing **`@ref`** are also accepted when the parser can split URL and version unambiguously. -**Without arguments** — restore all libraries from `.jaiph/libs.lock`. Useful after cloning a project or in CI. If the lockfile exists but lists **no** libraries, the command prints `No libs in lockfile.` and exits **0**. +**Without arguments** — restore all libraries from `.jaiph/libs.lock`. Useful after cloning a project or in CI. If the lockfile exists but lists **no** libraries, the command prints `No libs in lockfile.` and exits **0**. Restore mode does **not** invent new lock entries — the lockfile is read but not rewritten. -If `.jaiph/libs//` already exists, the library is skipped. Use **`--force`** (anywhere in the argument list) to delete and re-clone. +If `.jaiph/libs//` already exists, the library is skipped without invoking `git` (warm path) — both for explicit arguments and for restore-from-lock. Use **`--force`** (anywhere in the argument list) to delete and re-clone. + +**Parallel clones.** Missing libraries are cloned concurrently with a small bounded-concurrency executor (default **4 in flight**); the warm-path skip runs in a pre-pass before any clone work starts. Independent network/process latency therefore overlaps when several libraries are missing. Failures from individual clones still propagate: any non-zero clone exits the command non-zero, and failed libraries are **not** added to `.jaiph/libs.lock`. Successful and warm-skipped libraries are upserted as before. **Lockfile format** (`.jaiph/libs.lock`): diff --git a/docs/libraries.md b/docs/libraries.md index 4e65e296..484bfeb3 100644 --- a/docs/libraries.md +++ b/docs/libraries.md @@ -53,7 +53,9 @@ jaiph install https://github.com/you/queue-lib.git@v1.0 jaiph install ``` -`jaiph install` writes **`.jaiph/libs.lock`** under the workspace root. Commit the lockfile; add **`.jaiph/libs/`** to `.gitignore` if you do not want vendored clones in version control. If **`.jaiph/libs//`** already exists, the clone is skipped unless you pass **`--force`** (URL / `@ref` parsing: [CLI — `jaiph install`](cli.md#jaiph-install)). +`jaiph install` writes **`.jaiph/libs.lock`** under the workspace root. Commit the lockfile; add **`.jaiph/libs/`** to `.gitignore` if you do not want vendored clones in version control. If **`.jaiph/libs//`** already exists, the clone is skipped without invoking `git` unless you pass **`--force`** (URL / `@ref` parsing: [CLI — `jaiph install`](cli.md#jaiph-install)). + +Missing libraries are cloned **concurrently** (default 4 in flight), so restoring or installing several repositories at once does not pay full network/process latency one repo at a time. Failed clones still exit the command non-zero and do not produce a lock entry. Restore-from-lock (`jaiph install` with no args) does not invent new lock entries. See [CLI — `jaiph install`](cli.md#jaiph-install) for the full contract. The clone directory name is **`deriveLibName(url)`** (last path segment, **`.git`** stripped), so imports use that segment as **`lib-name`**. diff --git a/src/cli/commands/install.test.ts b/src/cli/commands/install.test.ts index ad4d1a20..d11e15f8 100644 --- a/src/cli/commands/install.test.ts +++ b/src/cli/commands/install.test.ts @@ -1,10 +1,10 @@ import test from "node:test"; import assert from "node:assert/strict"; -import { mkdirSync, writeFileSync, rmSync } from "node:fs"; +import { existsSync, mkdirSync, readFileSync, rmSync, writeFileSync } from "node:fs"; import { join } from "node:path"; import { execSync } from "node:child_process"; import { tmpdir } from "node:os"; -import { parseUrlAndVersion } from "./install"; +import { parseUrlAndVersion, runInstall, type CloneRunner, type CloneOutcome, type InstallSpec } from "./install"; const CLI_PATH = join(__dirname, "../../../src/cli.js"); @@ -83,3 +83,180 @@ test("install: missing lockfile shows no libs message", () => { cleanup(dir); } }); + +test("install: missing libraries clone concurrently", async () => { + const dir = makeTempProject(); + try { + let active = 0; + let maxActive = 0; + const cloneRunner: CloneRunner = async (spec: InstallSpec): Promise => { + active += 1; + maxActive = Math.max(maxActive, active); + // Mimic git clone side effect so the lib directory is materialized. + mkdirSync(spec.libDir, { recursive: true }); + await new Promise((resolve) => setTimeout(resolve, 30)); + active -= 1; + return { spec, ok: true }; + }; + + const code = await runInstall( + [ + "https://example.com/alpha.git", + "https://example.com/beta.git", + "https://example.com/gamma.git", + ], + { cwd: dir, cloneRunner, concurrency: 4 }, + ); + + assert.equal(code, 0); + assert.ok(maxActive >= 2, `expected overlapping clones; observed peak ${maxActive}`); + + const lock = JSON.parse(readFileSync(join(dir, ".jaiph", "libs.lock"), "utf8")) as { + libs: { name: string }[]; + }; + assert.deepEqual( + lock.libs.map((e) => e.name).sort(), + ["alpha", "beta", "gamma"], + "all three should land in the lockfile", + ); + } finally { + cleanup(dir); + } +}); + +test("install: explicit warm path skips existing directories without invoking git", async () => { + const dir = makeTempProject(); + try { + const libDir = join(dir, ".jaiph", "libs", "alpha"); + mkdirSync(libDir, { recursive: true }); + writeFileSync(join(libDir, "sentinel"), "warm\n", "utf8"); + + let callCount = 0; + const cloneRunner: CloneRunner = async (spec) => { + callCount += 1; + return { spec, ok: true }; + }; + + const code = await runInstall(["https://example.com/alpha.git"], { cwd: dir, cloneRunner }); + + assert.equal(code, 0); + assert.equal(callCount, 0, "cloneRunner must not be called when target dir exists and --force is absent"); + assert.equal(readFileSync(join(libDir, "sentinel"), "utf8"), "warm\n"); + } finally { + cleanup(dir); + } +}); + +test("install: restore-from-lock warm path skips existing directories without invoking git", async () => { + const dir = makeTempProject(); + try { + const lockPath = join(dir, ".jaiph", "libs.lock"); + mkdirSync(join(dir, ".jaiph"), { recursive: true }); + writeFileSync( + lockPath, + JSON.stringify({ + libs: [ + { name: "alpha", url: "https://example.com/alpha.git" }, + { name: "beta", url: "https://example.com/beta.git" }, + ], + }) + "\n", + "utf8", + ); + const alphaDir = join(dir, ".jaiph", "libs", "alpha"); + const betaDir = join(dir, ".jaiph", "libs", "beta"); + mkdirSync(alphaDir, { recursive: true }); + mkdirSync(betaDir, { recursive: true }); + writeFileSync(join(alphaDir, "sentinel"), "alpha-warm\n", "utf8"); + writeFileSync(join(betaDir, "sentinel"), "beta-warm\n", "utf8"); + + let callCount = 0; + const cloneRunner: CloneRunner = async (spec) => { + callCount += 1; + return { spec, ok: true }; + }; + + const code = await runInstall([], { cwd: dir, cloneRunner }); + + assert.equal(code, 0); + assert.equal(callCount, 0, "cloneRunner must not be called for restore-from-lock warm path"); + // restore-from-lock with no args must not invent new lock entries; pre-existing two stay. + const lock = JSON.parse(readFileSync(lockPath, "utf8")) as { libs: { name: string }[] }; + assert.deepEqual(lock.libs.map((e) => e.name).sort(), ["alpha", "beta"]); + assert.equal(readFileSync(join(alphaDir, "sentinel"), "utf8"), "alpha-warm\n"); + assert.equal(readFileSync(join(betaDir, "sentinel"), "utf8"), "beta-warm\n"); + } finally { + cleanup(dir); + } +}); + +test("install: invalid remote/path failure exits non-zero and does not lock the failed lib", async () => { + const dir = makeTempProject(); + try { + const bogus = join(dir, "does-not-exist-bogus-remote"); + const code = await runInstall([bogus], { cwd: dir }); + + assert.notEqual(code, 0, "invalid remote/path must exit non-zero"); + const lockPath = join(dir, ".jaiph", "libs.lock"); + assert.ok(existsSync(lockPath), "lockfile is written but should not contain failed entries"); + const lock = JSON.parse(readFileSync(lockPath, "utf8")) as { libs: { name: string }[] }; + assert.equal(lock.libs.length, 0, "failed clone must not produce a lock entry"); + assert.ok( + !existsSync(join(dir, ".jaiph", "libs", "does-not-exist-bogus-remote")), + "no lib directory should remain after a failed clone", + ); + } finally { + cleanup(dir); + } +}); + +test("install: unknown ref failure exits non-zero and does not lock the failed lib", async () => { + const dir = makeTempProject(); + try { + // Create a local repo with one commit so clone-from-path is valid, but the ref is not. + const remoteDir = join(dir, "remote-repo"); + mkdirSync(remoteDir, { recursive: true }); + execSync("git init", { cwd: remoteDir, stdio: "pipe" }); + writeFileSync(join(remoteDir, "README"), "hi\n", "utf8"); + execSync("git add README", { cwd: remoteDir, stdio: "pipe" }); + execSync( + `git -c user.email=test@example.com -c user.name=test commit -m init`, + { cwd: remoteDir, stdio: "pipe" }, + ); + + const code = await runInstall([`${remoteDir}@nonexistent-ref-xyz`], { cwd: dir }); + + assert.notEqual(code, 0, "unknown ref must exit non-zero"); + const lockPath = join(dir, ".jaiph", "libs.lock"); + assert.ok(existsSync(lockPath)); + const lock = JSON.parse(readFileSync(lockPath, "utf8")) as { libs: { name: string }[] }; + assert.equal(lock.libs.length, 0, "unknown-ref clone must not produce a lock entry"); + } finally { + cleanup(dir); + } +}); + +test("install: mixed success and failure locks only the successful libs", async () => { + const dir = makeTempProject(); + try { + const cloneRunner: CloneRunner = async (spec) => { + if (spec.name === "bad") { + return { spec, ok: false, message: "simulated failure" }; + } + mkdirSync(spec.libDir, { recursive: true }); + return { spec, ok: true }; + }; + + const code = await runInstall( + ["https://example.com/good.git", "https://example.com/bad.git", "https://example.com/also-good.git"], + { cwd: dir, cloneRunner, concurrency: 4 }, + ); + + assert.notEqual(code, 0, "any failure must propagate non-zero exit"); + const lock = JSON.parse(readFileSync(join(dir, ".jaiph", "libs.lock"), "utf8")) as { + libs: { name: string }[]; + }; + assert.deepEqual(lock.libs.map((e) => e.name).sort(), ["also-good", "good"]); + } finally { + cleanup(dir); + } +}); diff --git a/src/cli/commands/install.ts b/src/cli/commands/install.ts index 2c7254ff..013980ad 100644 --- a/src/cli/commands/install.ts +++ b/src/cli/commands/install.ts @@ -1,6 +1,6 @@ import { existsSync, mkdirSync, readFileSync, rmSync, writeFileSync } from "node:fs"; -import { join, resolve } from "node:path"; -import { execSync } from "node:child_process"; +import { join } from "node:path"; +import { spawn } from "node:child_process"; import { colorPalette } from "../shared/errors"; import { detectWorkspaceRoot } from "../shared/paths"; @@ -14,6 +14,29 @@ interface LockFile { libs: LockEntry[]; } +export interface InstallSpec { + name: string; + url: string; + version?: string; + libDir: string; +} + +export interface CloneOutcome { + spec: InstallSpec; + ok: boolean; + message?: string; +} + +export type CloneRunner = (spec: InstallSpec) => Promise; + +export interface RunInstallOptions { + cwd?: string; + cloneRunner?: CloneRunner; + concurrency?: number; +} + +const DEFAULT_CONCURRENCY = 4; + function deriveLibName(url: string): string { const lastSegment = url.split("/").pop() ?? url; return lastSegment.replace(/\.git$/, ""); @@ -53,80 +76,134 @@ function upsertLockEntry(lock: LockFile, entry: LockEntry): void { } } -function cloneLib( - url: string, - version: string | undefined, - targetDir: string, - force: boolean, - palette: ReturnType, -): boolean { - const name = deriveLibName(url); - const libDir = join(targetDir, name); - - if (existsSync(libDir)) { - if (force) { - rmSync(libDir, { recursive: true, force: true }); - } else { - process.stdout.write(`${palette.dim}▸ ${name} already exists, skipping (use --force to re-clone)${palette.reset}\n`); - return true; +function specToLockEntry(spec: InstallSpec): LockEntry { + return { name: spec.name, url: spec.url, ...(spec.version ? { version: spec.version } : {}) }; +} + +/** Default clone runner: `git clone --depth 1 [--branch ] ` via spawn. */ +function gitCloneRunner(spec: InstallSpec): Promise { + return new Promise((done) => { + const args = ["clone", "--depth", "1"]; + if (spec.version) { + args.push("--branch", spec.version); } - } + args.push(spec.url, spec.libDir); + const child = spawn("git", args, { stdio: ["ignore", "pipe", "pipe"] }); + let stderr = ""; + child.stderr.on("data", (chunk: Buffer) => { + stderr += chunk.toString(); + }); + child.on("error", (err) => { + done({ spec, ok: false, message: err.message }); + }); + child.on("close", (code) => { + if (code === 0) { + done({ spec, ok: true }); + } else { + const tail = stderr.trim().split(/\r?\n/).filter(Boolean).pop(); + done({ spec, ok: false, message: tail ?? `git clone exited with code ${code}` }); + } + }); + }); +} - const branchFlag = version ? ` --branch ${version}` : ""; - const cmd = `git clone --depth 1${branchFlag} ${url} ${libDir}`; - try { - execSync(cmd, { stdio: "pipe" }); - process.stdout.write(`${palette.green}✓ Installed ${name}${version ? ` @ ${version}` : ""}${palette.reset}\n`); - return true; - } catch (err) { - const msg = err instanceof Error ? err.message : String(err); - process.stderr.write(`Failed to install ${name}: ${msg}\n`); - return false; - } +async function runWithConcurrency(items: T[], limit: number, fn: (item: T) => Promise): Promise { + const results = new Array(items.length); + let next = 0; + const worker = async (): Promise => { + while (true) { + const i = next++; + if (i >= items.length) return; + results[i] = await fn(items[i]!); + } + }; + const workerCount = Math.max(1, Math.min(limit, items.length)); + await Promise.all(Array.from({ length: workerCount }, () => worker())); + return results; } -export function runInstall(rest: string[]): number { +export async function runInstall(rest: string[], opts: RunInstallOptions = {}): Promise { const palette = colorPalette(); const force = rest.includes("--force"); const args = rest.filter((a) => a !== "--force"); - const workspaceRoot = detectWorkspaceRoot(process.cwd()); + const cwd = opts.cwd ?? process.cwd(); + const workspaceRoot = detectWorkspaceRoot(cwd); const libsDir = join(workspaceRoot, ".jaiph", "libs"); const lockPath = join(workspaceRoot, ".jaiph", "libs.lock"); + const cloneRunner = opts.cloneRunner ?? gitCloneRunner; + const concurrency = Math.max(1, opts.concurrency ?? DEFAULT_CONCURRENCY); mkdirSync(libsDir, { recursive: true }); - // No args: restore from lockfile - if (args.length === 0) { - const lock = readLockFile(lockPath); + const isRestoreFromLock = args.length === 0; + let lock: LockFile; + let specs: InstallSpec[]; + + if (isRestoreFromLock) { + lock = readLockFile(lockPath); if (lock.libs.length === 0) { process.stdout.write("No libs in lockfile.\n"); return 0; } process.stdout.write(`\nRestoring ${lock.libs.length} lib(s) from lockfile\n\n`); - let ok = true; - for (const entry of lock.libs) { - if (!cloneLib(entry.url, entry.version, libsDir, force, palette)) { - ok = false; + specs = lock.libs.map((e) => ({ + name: e.name, + url: e.url, + version: e.version, + libDir: join(libsDir, e.name), + })); + } else { + process.stdout.write("\n"); + lock = readLockFile(lockPath); + specs = args.map((a) => { + const { url, version } = parseUrlAndVersion(a); + const name = deriveLibName(url); + return { name, url, version, libDir: join(libsDir, name) }; + }); + } + + // Plan phase: skip warm-path libs without invoking the cloner; queue the rest. + const skipped: InstallSpec[] = []; + const jobs: InstallSpec[] = []; + for (const spec of specs) { + if (existsSync(spec.libDir)) { + if (force) { + rmSync(spec.libDir, { recursive: true, force: true }); + jobs.push(spec); + } else { + process.stdout.write(`${palette.dim}▸ ${spec.name} already exists, skipping (use --force to re-clone)${palette.reset}\n`); + skipped.push(spec); } + } else { + jobs.push(spec); } - process.stdout.write("\n"); - return ok ? 0 : 1; } - // Install each specified lib - process.stdout.write("\n"); - const lock = readLockFile(lockPath); - let ok = true; - for (const arg of args) { - const { url, version } = parseUrlAndVersion(arg); - const name = deriveLibName(url); - if (!cloneLib(url, version, libsDir, force, palette)) { - ok = false; - continue; + const outcomes = await runWithConcurrency(jobs, concurrency, cloneRunner); + + let allOk = true; + for (const outcome of outcomes) { + if (outcome.ok) { + const v = outcome.spec.version ? ` @ ${outcome.spec.version}` : ""; + process.stdout.write(`${palette.green}✓ Installed ${outcome.spec.name}${v}${palette.reset}\n`); + } else { + allOk = false; + process.stderr.write(`Failed to install ${outcome.spec.name}: ${outcome.message ?? "unknown error"}\n`); + } + } + + if (!isRestoreFromLock) { + for (const spec of skipped) { + upsertLockEntry(lock, specToLockEntry(spec)); + } + for (const outcome of outcomes) { + if (outcome.ok) { + upsertLockEntry(lock, specToLockEntry(outcome.spec)); + } } - upsertLockEntry(lock, { name, url, ...(version ? { version } : {}) }); + writeLockFile(lockPath, lock); } - writeLockFile(lockPath, lock); + process.stdout.write("\n"); - return ok ? 0 : 1; + return allOk ? 0 : 1; } diff --git a/src/cli/index.ts b/src/cli/index.ts index 3248529e..dbdecf1b 100644 --- a/src/cli/index.ts +++ b/src/cli/index.ts @@ -42,7 +42,7 @@ export async function main(argv: string[]): Promise { return runFormat(rest); } if (cmd === "install") { - return runInstall(rest); + return await runInstall(rest); } if (cmd === "compile") { return runCompile(rest); From e8e42e7bce9c6a4adb7cdf024a4a1be764e6fca3 Mon Sep 17 00:00:00 2001 From: Jakub Dzikowski Date: Thu, 14 May 2026 16:54:19 +0200 Subject: [PATCH 03/14] Perf: single-parse compile prep for local jaiph run MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The default local `jaiph run ` path no longer parses the entry module twice and no longer re-parses the full import closure inside the spawned `node-workflow-runner` child. A new `prepareCompile` (`src/transpile/compile-prep.ts`) walks the entry plus its transitive `.jh` imports exactly once and returns a `CompilePrep` record (`{ entryFile, workspaceRoot, astByFile }`). `src/cli/commands/run.ts` reuses the entry AST for the banner, passes the prep into `buildScripts(..., prep)` so emission skips per-file reads/parses, and writes a deterministic JSON snapshot to `/.jaiph-compile-prep.json`. The spawned runner reads it via the internal `JAIPH_COMPILE_PREP_FILE` env var and forwards the deserialized prep to `buildRuntimeGraph`, which consumes the cached `Map` instead of re-walking the import closure on disk. The env var is set only for non-Docker host runs; `jaiph run --raw`, `jaiph test`, and Docker launches keep their existing parse paths. User-visible run semantics — banner, hooks, run artifacts, summaries, return values, exit codes, and `__JAIPH_EVENT__` streaming — are unchanged. New tests corrupt every source file on disk after `prepareCompile`, then exercise `buildScripts` + `buildRuntimeGraph` to prove no second parse happens, and cover cross-module workflow/rule/script resolution plus the serialize → deserialize → graph round-trip across the parent → child process boundary. Co-Authored-By: Claude Opus 4.7 (1M context) --- CHANGELOG.md | 1 + QUEUE.md | 37 --- docs/architecture.md | 55 ++++- docs/cli.md | 7 +- src/cli/commands/run.ts | 16 +- src/runtime/kernel/graph.ts | 55 +++-- src/runtime/kernel/node-workflow-runner.ts | 5 +- src/transpile/build.ts | 10 +- src/transpile/compile-prep.test.ts | 265 +++++++++++++++++++++ src/transpile/compile-prep.ts | 69 ++++++ src/transpiler.ts | 35 ++- 11 files changed, 477 insertions(+), 78 deletions(-) create mode 100644 src/transpile/compile-prep.test.ts create mode 100644 src/transpile/compile-prep.ts diff --git a/CHANGELOG.md b/CHANGELOG.md index fc2cbf1c..4f8780d6 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,6 @@ # Unreleased +- **Performance — `jaiph run` local single-parse compile prep:** The default local `jaiph run ` path no longer parses the entry module twice and no longer re-parses the full import closure inside the spawned `node-workflow-runner` child. A new `prepareCompile` (`src/transpile/compile-prep.ts`) walks the entry plus its transitive `.jh` imports exactly once and returns a `CompilePrep` record (`{ entryFile, workspaceRoot, astByFile }`). `src/cli/commands/run.ts` reuses the entry AST for `metadataToConfig` (no second parse for the banner), passes the prep into `buildScripts(..., prep)` so `emitScriptsForModule` skips per-file `readFileSync` + `parsejaiph`, and writes a deterministic JSON snapshot to `/.jaiph-compile-prep.json` via `writeCompilePrep`. The spawned runner reads it through the new internal env var `JAIPH_COMPILE_PREP_FILE` and forwards the deserialized prep to `buildRuntimeGraph(entry, workspaceRoot, prep)`, which now consumes the cached `Map` instead of re-walking the import closure on disk. `attachScriptImportStubs` is factored out of `graph.ts` and is idempotent across cached and uncached paths. The env var is set **only** for non-Docker host runs (when `JAIPH_DOCKER_ENABLED` is off); `jaiph run --raw`, `jaiph test`, and Docker launches do not set it and keep their existing parse paths. User-visible run semantics — banner, hooks, run artifacts, `run_summary.jsonl`, return values, exit codes, and `__JAIPH_EVENT__` streaming — are unchanged. New tests in `src/transpile/compile-prep.test.ts` corrupt every source file on disk after `prepareCompile`, then call `buildScripts` + `buildRuntimeGraph` to prove no second parse happens; they also cover cross-module workflow/rule/script resolution, a three-module closure, and the serialize → deserialize → graph round-trip used to cross the parent → child process boundary. Docs updated in `docs/architecture.md` and `docs/cli.md`. - **Performance — `jaiph install` parallelism:** Missing-library clones now run in parallel through a small bounded-concurrency executor (default 4 in flight), replacing the previous sequential `execSync` loop. The user contract is unchanged: warm-path libraries (target directory exists and `--force` is absent) still skip without invoking `git` for both explicit args and restore-from-lock; failed clones still exit non-zero and do not produce a lock entry; restore-from-lock still does not invent new lock entries. The default clone runner now uses `spawn("git", ["clone", "--depth", "1", …])` so multiple clones can overlap network and process latency. `runInstall` is now `async` and exposes injectable `CloneRunner` / `concurrency` options for testing. Tests cover concurrent overlap (peak in-flight ≥ 2), warm-path skipping for explicit args and restore, invalid-remote and unknown-ref failure paths, mixed success/failure lockfile bookkeeping, and the existing corrupt/missing-lockfile behavior. Docs updated in `docs/cli.md` and `docs/libraries.md`. # 0.9.4 diff --git a/QUEUE.md b/QUEUE.md index 58a76a9e..aa0fb185 100644 --- a/QUEUE.md +++ b/QUEUE.md @@ -12,40 +12,3 @@ Process rules: 6. **Acceptance criteria are non-negotiable.** A task is not done until every acceptance bullet is verified by a test that fails when the contract is violated. "It works on my machine" or "the existing tests pass" is not acceptance. *** - -## Performance — remove redundant local workflow-start work #dev-ready - -**Problem** -The default local `jaiph run ` path does redundant startup work before the first useful workflow event: - -* `src/cli/commands/run.ts` parses the entry file to read metadata/config and print the banner. -* `buildScripts()` walks and parses the transitive `.jh` module set to emit script bodies. -* The spawned `src/runtime/kernel/node-workflow-runner.ts` then calls `buildRuntimeGraph()`, which reads and parses the import closure again before constructing `NodeWorkflowRuntime`. - -For small workflows this duplicate parse/graph setup is a plausible source of the observed 2-4 second lag. Optimize this path before chasing Docker, raw mode, or external subprocess costs. - -**Goal** -Reduce cold-start latency for default local `jaiph run ` by eliminating avoidable repeated `.jh` reads/parses between CLI compile prep and the runtime graph used by `NodeWorkflowRuntime`. - -**Scope** - -* In scope: non-Docker, non-`--raw` `jaiph run ` from the host CLI through the spawned Node workflow runner. -* Out of scope: `jaiph run --raw`, Docker startup/image prep, prompt provider latency, shell command runtime, and bootstrap install performance. -* Prefer one shared module-graph/compile-prep representation over separate ad hoc caches. If serialization is used to cross the process boundary, keep it internal and deterministic. -* Preserve user-visible run semantics: banner, hooks, run artifacts, summaries, return values, exit codes, and `__JAIPH_EVENT__` handling must remain compatible with current behavior. - -**Measurement notes** - -* Use a minimal workflow and one imported-module workflow as repro cases. -* Measure time from CLI process start to the first parsed `__JAIPH_EVENT__` line on stderr. If an implementation chooses a different first-event marker, define it in the PR or commit message. -* Record before/after timings on the same machine. These timings are evidence for the optimization, not acceptance criteria. - -**Acceptance criteria** - -* A unit or integration test proves the default local run path does not read/parse the entry module once in the parent and then re-read/re-parse the same module in the child to build the runtime graph. The test must fail if the old `run.ts` + `buildScripts()` + `node-workflow-runner.ts` duplicate parse pattern returns. -* A test with at least one imported `.jh` module proves the optimized graph/compile-prep path preserves cross-module workflow, rule, and script resolution. -* Existing local run behavior remains covered: a minimal workflow still emits the expected start/end events, writes run artifacts/summary metadata, returns the workflow return value, and exits with the correct status. -* The change does not alter `jaiph run --raw` or Docker launch behavior; add a focused test or assertion if shared launch code is touched. -* `npm test` passes. - -*** diff --git a/docs/architecture.md b/docs/architecture.md index 55e9ff50..46ae80ef 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -19,7 +19,7 @@ For **how to contribute** — branches, test layers, E2E assertion policy, and b Workflow authors write `.jh` / `.test.jh` modules. The toolchain turns those files into **validated** modules plus **extracted script files**, then the **same AST interpreter** runs workflows whether you use local `jaiph run`, Docker, or `jaiph test`. -1. Parse source into AST (the CLI parses once up front for `jaiph run` metadata such as `runtime` config; `buildRuntimeGraph` and transpilation use the same parser on disk contents). +1. Parse source into AST. For the default local `jaiph run ` path, the CLI walks the entry plus its transitive `.jh` import closure **once** through **`prepareCompile`** (`src/transpile/compile-prep.ts`) and reuses that **`CompilePrep`** for the banner (`metadataToConfig`), for **`buildScripts`** (script-body extraction), and — across the parent → child process boundary — for **`buildRuntimeGraph`** in the spawned runner (see [Local single-parse compile prep](#local-single-parse-compile-prep) and the sequence diagram below). Other paths (`jaiph run --raw`, Docker `jaiph run`, `jaiph test`, `jaiph compile`) keep their existing parser calls and re-read `.jh` sources on demand. 2. **Compile-time** validation (`validateReferences`, invoked from **`emitScriptsForModule`** / **`buildScripts()`**) runs before script extraction, not inside `buildRuntimeGraph()` (the graph loader only parses modules and follows imports). The **`jaiph compile`** command walks the same import closure but runs **`validateReferences` only**: it parses each reachable module on disk and **does not** emit **`scripts/`** (no **`buildScriptFiles`** / **`buildScripts`**), **does not** invoke **`buildRuntimeGraph()`**, and never spawns the workflow runner (`src/cli/commands/compile.ts`). For a **directory** argument it discovers `*.jh` via `walkjhFiles`, which **skips** `*.test.jh`; to validate a test module, pass that file explicitly. Imported modules in the closure are still validated recursively either way. 3. **CLI** (`dist/src/cli.js` via npm, or a **Bun-compiled** `dist/jaiph` binary) prepares script executables (scripts-only), then spawns a **detached child** that loads **`node-workflow-runner.js`**. That child calls `buildRuntimeGraph()` and runs **`NodeWorkflowRuntime`**. The child’s interpreter is **`process.execPath`** of the CLI process (Node when you run `node dist/src/cli.js`, the standalone Bun binary when you run `dist/jaiph`). Script steps execute as managed subprocesses; prompt, inbox I/O, and event/summary emission are handled by the kernel under `src/runtime/kernel/`. 4. Stream live events to the CLI and persist durable run artifacts. @@ -47,6 +47,7 @@ All orchestration — local `jaiph run`, `jaiph test`, and **Docker `jaiph run`* - **Transpiler (`src/transpiler.ts`, `src/transpile/*`)** - **`emitScriptsForModule`** parses, runs **`validateReferences`**, and **`buildScriptFiles`** — the only compile path for `jaiph run` / `jaiph test` — **persists only atomic `script` files** under `scripts/`. **`buildScripts()`** can also take a **directory** of non-test `*.jh` modules (`src/transpile/build.ts` uses `walkjhFiles`); the **`jaiph run`** and **`jaiph test`** commands always pass a **single entry file** (`.jh` or `*.test.jh`). Inline scripts (`` run `body`(args) ``) are also emitted as `scripts/__inline_` with deterministic hash-based names (`inlineScriptName` in `src/inline-script-name.ts`). There is no workflow-level bash emission. + - Both **`buildScripts()`** and **`emitScriptsForModule`** accept an optional **`CompilePrep`** parameter. When supplied, the transitive-module list comes from the pre-parsed cache instead of re-walking the import closure, and `validateReferences` reads its `readFile` / `parse` callbacks against that same cache so each reachable module is parsed exactly once per `jaiph run` (see [Local single-parse compile prep](#local-single-parse-compile-prep)). - **Node Workflow Runtime (`src/runtime/kernel/node-workflow-runtime.ts`)** - `NodeWorkflowRuntime` interprets the AST directly: walks workflow steps, manages scope/variables, delegates prompt and script execution to kernel helpers, handles channels/inbox/dispatch, owns the frame stack and heartbeat, and writes run artifacts. @@ -54,7 +55,7 @@ All orchestration — local `jaiph run`, `jaiph test`, and **Docker `jaiph run`* - **`runtime-arg-parser.ts`** — stateless interpolation and call-argument parsing (`interpolate`, `parseInlineCaptureCall`, `commaArgsToInterpolated`, `parseArgsRaw`, `parseInlineScriptAt`, `parseManagedArgAt`, `parseArgTokens`, `stripOuterQuotes`, `parsePromptSchema`, `sanitizeName`, `nowIso`) plus shared constants and the `ParsedArgToken` / `PromptSchemaField` types. Direct unit tests live in `runtime-arg-parser.test.ts`. - **`runtime-event-emitter.ts`** — `RuntimeEventEmitter` owns **`__JAIPH_EVENT__`** writes on stderr (step/log traffic when not suppressed), **`run_summary.jsonl`** appends for the wider timeline (including workflow/prompt records that are summary-first), plus step/prompt sequence counters. Constructed with `{ runId, runDir, env, getFrameStack, getAsyncIndices, suppressLiveEvents? }`; the runtime delegates structured emission to it. The optional `suppressLiveEvents` flag (forwarded from `NodeWorkflowRuntime`'s `suppressLiveEvents` option) skips the live stderr **`__JAIPH_EVENT__`** lines while **`appendRunSummaryLine`** keeps updating **`run_summary.jsonl`** — used by in-process callers like the test runner that share stderr with `node --test` reporter output. The CLI's spawned `node-workflow-runner` child does not set it, so production runs stream events to stderr as before. - **`runtime-mock.ts`** — `executeMockBodyDef` and `executeMockShellBody` for `*.test.jh` workflow/rule/script mocks. Shell-kind mocks run `bash -c`; steps-kind mocks dispatch back into the runtime via an `executeStepsBack` callback so the body runs against the full step interpreter. - - `buildRuntimeGraph()` (`graph.ts`) loads reachable modules with **`parsejaiph` only** (import closure); it does **not** run `validateReferences`. Cross-module refs are resolved from that graph at runtime. For **`script import`** declarations, `buildRuntimeGraph()` injects synthetic `ScriptDef` stubs (`graph.ts`) so reference resolution matches the validated compile path without re-reading external script bodies at graph-build time. + - `buildRuntimeGraph()` (`graph.ts`) loads reachable modules with **`parsejaiph` only** (import closure); it does **not** run `validateReferences`. Cross-module refs are resolved from that graph at runtime. For **`script import`** declarations, `buildRuntimeGraph()` injects synthetic `ScriptDef` stubs (`graph.ts`) so reference resolution matches the validated compile path without re-reading external script bodies at graph-build time. The function also accepts an optional **`CompilePrep`**: when supplied, every reachable module is taken from the cache and no `.jh` file is read from disk in the runner. The stub-injection helper (`attachScriptImportStubs`) is idempotent so cached and uncached paths produce the same node shape. - **Node Test Runner (`src/runtime/kernel/node-test-runner.ts`)** - Executes `*.test.jh` test blocks using `NodeWorkflowRuntime` with mock support (mock prompts, mock workflow/rule/script bodies). Pure Node harness — no Bash test transpilation. @@ -69,6 +70,23 @@ All orchestration — local `jaiph run`, `jaiph test`, and **Docker `jaiph run`* - Parses mount specs, resolves Docker config (image, network, timeout), and builds the `docker run` invocation when the CLI enables **Docker sandboxing** for `jaiph run` (environment-driven; there is no `jaiph run --docker` flag — see [Sandboxing](sandboxing.md)). The container runs the same `node-workflow-runner` entry as local execution. The default image is the official `ghcr.io/jaiphlang/jaiph-runtime` GHCR image; every selected image must already contain `jaiph` (no auto-install or derived-image build at runtime). Image preparation (`prepareImage`) runs before the CLI banner: it checks whether the image is local, pulls with `--quiet` if needed (short status lines on stderr instead of Docker’s default pull UI), and verifies that `jaiph` exists in the image. `spawnDockerProcess` does not pull or verify — it receives a pre-resolved image. The spawn call uses `stdio: ["ignore", "pipe", "pipe"]` — stdin is ignored so the Docker CLI does not block on stdin EOF, which would stall event streaming and hang the host CLI after the container exits. - **Workspace immutability:** Docker runs cannot modify the host workspace. The host checkout is mounted read-only; `/jaiph/workspace` is a sandbox-local copy-on-write overlay discarded on exit. The only host-writable path is `/jaiph/run` (run artifacts). Workflows that need to capture workspace changes should write files (for example a `git diff` into a temp path) and publish them with `artifacts.save()`. See [Sandboxing](sandboxing.md) for the full contract and [Libraries — `jaiphlang/artifacts`](libraries.md#jaiphlangartifacts--publishing-files-out-of-the-sandbox). +## Local single-parse compile prep +{: #local-single-parse-compile-prep} + +The default local `jaiph run ` path uses one shared module-graph representation across the parent CLI → child runner boundary so each reachable `.jh` is parsed exactly **once** per run. + +- **`prepareCompile(entryFile, workspaceRoot)`** (`src/transpile/compile-prep.ts`) walks the entry plus its transitive `import` edges through `resolveImportPath` and returns a **`CompilePrep`** record: `{ entryFile, workspaceRoot, astByFile: Map }`. **`jaiphlang/`** library imports resolve through the same workspace fallback as the rest of the toolchain. +- **`src/cli/commands/run.ts`** calls `prepareCompile` once after path normalization. The entry AST is reused for **`metadataToConfig(mod.metadata)`** (banner / `runtime` config) — no separate `parsejaiph(readFileSync(...))` for metadata. The same prep is passed to **`buildScripts(input, outDir, workspaceRoot, prep)`** so `emitScriptsForModule` skips `readFileSync` + `parsejaiph` per module; `validateReferences` runs against the cached AST via injected `readFile` / `parse` callbacks. +- **Process boundary.** The CLI serializes the prep with **`writeCompilePrep`** to **`/.jaiph-compile-prep.json`** (deterministic JSON: entries sorted by absolute path; ASTs included verbatim). It points the spawned **`node-workflow-runner.js`** at the file through the internal env var **`JAIPH_COMPILE_PREP_FILE`**. The runner reads it back with **`readCompilePrep`** and passes the result to **`buildRuntimeGraph(entry, workspaceRoot, prep)`**, which builds the `RuntimeGraph` from the cached `Map` instead of re-reading `.jh` files. Cross-module workflow / rule / script resolution and `script import` stub injection match the on-disk parse path. +- **Scope of the optimization.** `JAIPH_COMPILE_PREP_FILE` is set **only** when the host CLI spawns the local **`node-workflow-runner.js`** child with Docker sandboxing disabled (`dockerConfigForBanner.enabled === false`). It is **not** set on these paths, which keep their existing parse calls: + - **`jaiph run --raw`** — `runWorkflowRaw` (`src/cli/commands/run.ts`) calls `parsejaiph` / `buildScripts` directly without a prep cache; the runner uses inherited stdio and never reads this env var. + - **Docker `jaiph run`** — the host writes the prep file under `outDir`, but skips the env var because the inner container command is `jaiph run --raw …` and the host bind-mount layout does not plumb the cache file inside the container. + - **`jaiph test`** — `runTestFile` keeps its own one-time `buildRuntimeGraph(testFileAbs)` per test file (see [Test runner integration](#test-runner-integration-testjh-in-the-kernel)). + + When the env var is absent the runner falls back to the disk-walk parse path, preserving prior behavior. + +User-visible contracts (banner, hooks, run artifacts, `run_summary.jsonl`, `return_value.txt`, exit codes, `__JAIPH_EVENT__` streaming) are unchanged. + ## Runtime vs CLI responsibilities ### Runtime responsibilities (Node workflow runtime) @@ -149,7 +167,8 @@ flowchart TD VAL --> EMIT end - CLI -->|jaiph run| BS1[buildScripts] + CLI -->|jaiph run| CP1[prepareCompile entry + closure] + CP1 --> BS1[buildScripts prep] BS1 --> Transpile CLI -->|jaiph test| BS2[buildScripts(entry .test.jh)] @@ -158,8 +177,9 @@ flowchart TD Transpile -->|jaiph run local| RW[Node workflow runner child] Transpile -->|jaiph run Docker| DC[Container runs node-workflow-runner] + CP1 -. JAIPH_COMPILE_PREP_FILE (local non-Docker only) .-> RW - RW --> G[buildRuntimeGraph parse-only + imports] + RW --> G[buildRuntimeGraph parse-only or cached prep] G --> GRAPH[RuntimeGraph] RW --> RT[NodeWorkflowRuntime] RT --> GRAPH @@ -193,21 +213,26 @@ Interactive **`jaiph run`** (no **`--raw`**): banner, progress tree, hooks, and sequenceDiagram participant User participant CLI as CLI jaiph run - participant Prep as buildScripts + participant CP as prepareCompile + participant Prep as buildScripts(prep) participant TF as emitScriptsForModule per module participant Runner as node-workflow-runner - participant Graph as buildRuntimeGraph + participant Graph as buildRuntimeGraph(prep) participant Runtime as NodeWorkflowRuntime participant Kernel as JS kernel participant Report as Artifacts (.jaiph/runs) User->>CLI: jaiph run main.jh args... - Note over CLI: parse once for metadata config only - CLI->>Prep: buildScripts(input) - Prep->>TF: loop: parse + validateReferences + emit + CLI->>CP: prepareCompile(entry, workspace) + CP-->>CLI: CompilePrep (astByFile) + Note over CLI: reuse entry AST for metadataToConfig / banner + CLI->>Prep: buildScripts(input, outDir, workspace, prep) + Prep->>TF: loop: validateReferences + emit (cached AST) TF-->>Prep: scripts/ atomic only Prep-->>CLI: scriptsDir + env JAIPH_SCRIPTS - alt local + alt local (non-Docker) + CLI->>CLI: writeCompilePrep(/.jaiph-compile-prep.json) + Note over CLI: set JAIPH_COMPILE_PREP_FILE on child env CLI->>Runner: spawn detached node-workflow-runner else Docker CLI->>CLI: prepareImage (pull --quiet + verify jaiph) @@ -215,7 +240,13 @@ sequenceDiagram CLI->>Runner: spawn container running node-workflow-runner Note over CLI: CLI parses events on stderr only end - Runner->>Graph: buildRuntimeGraph(sourceAbs) parse-only + alt JAIPH_COMPILE_PREP_FILE set (local non-Docker) + Runner->>Runner: readCompilePrep(file) + Runner->>Graph: buildRuntimeGraph(sourceAbs, workspace, prep) + Note over Graph: no .jh re-reads + else absent (Docker / --raw / test runner) + Runner->>Graph: buildRuntimeGraph(sourceAbs) parse-only + end Graph-->>Runner: RuntimeGraph Runner->>Runtime: runDefault(run args) Runtime->>Kernel: prompt / managed scripts / emit / inbox @@ -264,7 +295,7 @@ sequenceDiagram ## Summary -- `.jh` / `*.test.jh` share parser/AST; **compile-time** validation runs in **`emitScriptsForModule`** during **`buildScripts`**. **`buildRuntimeGraph`** loads modules with **parse-only** imports. +- `.jh` / `*.test.jh` share parser/AST; **compile-time** validation runs in **`emitScriptsForModule`** during **`buildScripts`**. **`buildRuntimeGraph`** loads modules with **parse-only** imports — or, on the default local **`jaiph run`** path, from a shared **`CompilePrep`** the parent CLI built with **`prepareCompile`** and handed across the process boundary through **`JAIPH_COMPILE_PREP_FILE`** (see [Local single-parse compile prep](#local-single-parse-compile-prep)). - **`jaiph compile`** walks import closures with **`validateReferences` only**, and exits — no **`scripts/`** emission (**no **`buildScriptFiles`** / **`buildScripts`**), no **`buildRuntimeGraph()`**, no runner spawn. Directory discovery omits **`*.test.jh`** unless you pass a test file explicitly. - **Node-only runtime:** all execution — local `jaiph run`, Docker `jaiph run`, and `jaiph test` — goes through `NodeWorkflowRuntime`. Docker containers run `node-workflow-runner` with the compiled JS tree and scripts mounted, using the same semantics as local execution. - **CLI** owns launch, observation, hooks (except **`jaiph run --raw`**), and runtime preparation (`buildScripts`). **`jaiph run --raw`** still emits **`__JAIPH_EVENT__`** on stderr from the runtime; the CLI does not attach the interactive progress/hooks pipeline. **`jaiph test`** passes **`suppressLiveEvents: true`** into **`NodeWorkflowRuntime`** so **`RuntimeEventEmitter`** skips writing those live stderr lines while **`run_summary.jsonl`** still records workflow traffic where the emitter appends it. diff --git a/docs/cli.md b/docs/cli.md index 77e4dae2..e0898212 100644 --- a/docs/cli.md +++ b/docs/cli.md @@ -94,9 +94,11 @@ If a `.jh` file is executable and has `#!/usr/bin/env jaiph`, you can run it dir ### Compile-time and process model -The CLI runs `buildScripts()`, which walks the entry file and its import closure. Each reachable module is parsed and `validateReferences` runs before script files are written. Unrelated `.jh` files on disk are not read. +The default `jaiph run` path parses each reachable `.jh` **once**. The CLI calls **`prepareCompile`** (`src/transpile/compile-prep.ts`) to walk the entry plus its transitive `import` closure, producing a **`CompilePrep`** record (`{ entryFile, workspaceRoot, astByFile }`). The entry AST is reused for the banner (`metadataToConfig`), and the same prep is passed to **`buildScripts(input, outDir, workspaceRoot, prep)`** so `emitScriptsForModule` runs `validateReferences` and writes atomic `script` files **without** re-reading or re-parsing any module. Unrelated `.jh` files on disk are not read. -After validation, the CLI spawns the Node workflow runner as a detached child. The runner loads the graph with `buildRuntimeGraph()` (parse-only imports; no `validateReferences` here) and executes `NodeWorkflowRuntime`. Prompt steps, script subprocesses, inbox dispatch, and event emission are handled in the runtime kernel — workflows and rules are interpreted in-process; only `script` steps spawn a managed shell. The CLI listens on stderr for `__JAIPH_EVENT__` JSON lines, the single event channel for all execution modes. Stdout carries only plain script output, forwarded to the terminal as-is. +After validation, the CLI spawns the Node workflow runner as a detached child. For local (non-Docker) runs the CLI serializes the prep to `/.jaiph-compile-prep.json` with `writeCompilePrep` and points the child at it through the internal env var **`JAIPH_COMPILE_PREP_FILE`**. The runner deserializes the file and passes the cached `CompilePrep` to `buildRuntimeGraph(sourceFile, workspaceRoot, prep)`, which builds the `RuntimeGraph` from the cached `Map` instead of re-reading `.jh` sources. When the env var is absent — Docker `jaiph run`, `jaiph run --raw`, or any other caller — the runner falls back to the on-disk parse path (`buildRuntimeGraph` reads each module via `parsejaiph`). Either path runs `NodeWorkflowRuntime` with the same `RuntimeGraph` shape — `buildRuntimeGraph` still does **not** run `validateReferences`. Prompt steps, script subprocesses, inbox dispatch, and event emission are handled in the runtime kernel — workflows and rules are interpreted in-process; only `script` steps spawn a managed shell. The CLI listens on stderr for `__JAIPH_EVENT__` JSON lines, the single event channel for all execution modes. Stdout carries only plain script output, forwarded to the terminal as-is. + +For the full data flow across the parent → child process boundary, see [Architecture — Local single-parse compile prep](architecture.md#local-single-parse-compile-prep). ### Run progress and tree output @@ -421,6 +423,7 @@ These variables apply to `jaiph run` and workflow execution. Variables marked ** - `JAIPH_META_FILE` — path to the run metadata file (under the CLI’s build output directory for that invocation). Set on the **detached workflow child** only; the parent strips any inherited value so leftover exports do not collide. The runner writes `run_dir=` / `summary_file=` lines for the host to read after exit. - `JAIPH_SOURCE_ABS` — absolute path to the entry `.jh` file; set by the CLI for **`jaiph run`** before spawn. Required by the runner (local and Docker). - `JAIPH_SCRIPTS` — directory containing emitted **`script`** files for this run; set after **`buildScripts()`**. Any **`JAIPH_SCRIPTS`** exported in the parent shell is cleared before launch so nested toolchains do not point at the wrong tree. +- `JAIPH_COMPILE_PREP_FILE` — absolute path to a `CompilePrep` JSON snapshot (`/.jaiph-compile-prep.json`) the CLI wrote with `writeCompilePrep`. Set by the CLI **only** for the default local (non-Docker, non-`--raw`) `jaiph run` path so the spawned `node-workflow-runner.js` builds the runtime graph from the cached ASTs instead of re-reading the import closure. The file is internal and may move; do not depend on its path or contents. When the variable is absent (Docker `jaiph run`, `jaiph run --raw`, `jaiph test`), `buildRuntimeGraph()` falls back to parsing `.jh` from disk. See [Architecture — Local single-parse compile prep](architecture.md#local-single-parse-compile-prep). - `JAIPH_RUN_DIR`, `JAIPH_RUN_ID`, `JAIPH_RUN_SUMMARY_FILE` — for a normal (**non-raw**) **`jaiph run`**, the host generates **`JAIPH_RUN_ID`** once per invocation (UUID), passes it through to the detached child (and into Docker when sandboxed), and Docker failure-path discovery can match summaries by this id. The runtime uses **`JAIPH_RUN_ID`** as the stable run identifier; if it is absent, the runtime may assign its own UUID. **`JAIPH_RUN_DIR`** and **`JAIPH_RUN_SUMMARY_FILE`** are set inside the runner once the UTC run directory exists. - `JAIPH_SOURCE_FILE` — set automatically by the CLI to the entry file **basename**. Used to name run directories (see [Architecture — Durable artifact layout](architecture.md#durable-artifact-layout)). diff --git a/src/cli/commands/run.ts b/src/cli/commands/run.ts index c741b8f5..52aaf5cd 100644 --- a/src/cli/commands/run.ts +++ b/src/cli/commands/run.ts @@ -11,6 +11,7 @@ import { dirname, join, resolve, extname } from "node:path"; import { basename } from "node:path"; import { parsejaiph } from "../../parser"; import { buildScripts } from "../../transpiler"; +import { prepareCompile, writeCompilePrep } from "../../transpile/compile-prep"; import { metadataToConfig } from "../../config"; import { buildStepDisplayParamPairs, formatNamedParamsForDisplay } from "./format-params.js"; import { @@ -80,7 +81,8 @@ export async function runWorkflow(rest: string[]): Promise { } const hooksConfig = loadMergedHooks(workspaceRoot); - const mod = parsejaiph(readFileSync(inputAbs, "utf8"), inputAbs); + const prep = prepareCompile(inputAbs, workspaceRoot); + const mod = prep.astByFile.get(inputAbs)!; const effectiveConfig = metadataToConfig(mod.metadata); const outDir = target ? resolve(target) : mkdtempSync(join(tmpdir(), "jaiph-run-")); @@ -111,8 +113,18 @@ export async function runWorkflow(rest: string[]): Promise { dockerConfigForBanner.enabled, sandboxModeForBanner, ); - const { scriptsDir } = buildScripts(inputAbs, outDir, workspaceRoot); + const { scriptsDir } = buildScripts(inputAbs, outDir, workspaceRoot, prep); runtimeEnv.JAIPH_SCRIPTS = scriptsDir; + // Cache file consumed by the spawned runner (or container) so the runtime + // graph reuses these ASTs instead of re-parsing every reachable module. + // Docker mounts the workspace read-only, so place the cache under outDir, + // which the host already arranges for the container side via its existing + // sandbox layout. For local runs the runner reads the path directly. + const prepFile = join(outDir, ".jaiph-compile-prep.json"); + writeCompilePrep(prepFile, prep); + if (!dockerConfigForBanner.enabled) { + runtimeEnv.JAIPH_COMPILE_PREP_FILE = prepFile; + } const metaFile = join(outDir, `.jaiph-run-meta-${Date.now()}-${process.pid}.txt`); const emitter = createRunEmitter(); diff --git a/src/runtime/kernel/graph.ts b/src/runtime/kernel/graph.ts index c2839db1..01d2c8b2 100644 --- a/src/runtime/kernel/graph.ts +++ b/src/runtime/kernel/graph.ts @@ -1,6 +1,7 @@ import { readFileSync } from "node:fs"; import { resolve } from "node:path"; import { parsejaiph } from "../../parser"; +import type { CompilePrep } from "../../transpile/compile-prep"; import type { RuleDef, ScriptDef, WorkflowDef, WorkflowRefDef, RuleRefDef, jaiphModule } from "../../types"; import { resolveImportPath } from "../../transpile/resolve"; @@ -30,29 +31,53 @@ export interface ResolvedScript { script: ScriptDef; } -function buildNode(filePath: string, workspaceRoot?: string): RuntimeModuleNode { - const ast = parsejaiph(readFileSync(filePath, "utf8"), filePath); +/** Inject `ScriptDef` stubs for `import script` declarations so `resolveScriptRef` finds them. Idempotent. */ +function attachScriptImportStubs(ast: jaiphModule): void { + if (!ast.scriptImports) return; + for (const si of ast.scriptImports) { + if (ast.scripts.some((s) => s.name === si.alias)) continue; + ast.scripts.push({ + name: si.alias, + comments: [], + body: "", + bodyKind: "fenced", + loc: si.loc, + }); + } +} + +function nodeFromAst(filePath: string, ast: jaiphModule, workspaceRoot?: string): RuntimeModuleNode { const imports = new Map(); for (const imp of ast.imports) { imports.set(imp.alias, resolveImportPath(filePath, imp.path, workspaceRoot)); } - // Synthesise ScriptDef stubs for script imports so resolveScriptRef finds them. - if (ast.scriptImports) { - for (const si of ast.scriptImports) { - ast.scripts.push({ - name: si.alias, - comments: [], - body: "", - bodyKind: "fenced", - loc: si.loc, - }); - } - } + attachScriptImportStubs(ast); return { filePath, ast, imports }; } -export function buildRuntimeGraph(entryFile: string, workspaceRoot?: string): RuntimeGraph { +function buildNode(filePath: string, workspaceRoot?: string): RuntimeModuleNode { + const ast = parsejaiph(readFileSync(filePath, "utf8"), filePath); + return nodeFromAst(filePath, ast, workspaceRoot); +} + +/** + * When `prep` is supplied, every reachable module is taken from the pre-parsed + * cache and no `.jh` files are read from disk. The cache is shared with the + * parent CLI's `buildScripts` so each module is parsed exactly once per run. + */ +export function buildRuntimeGraph( + entryFile: string, + workspaceRoot?: string, + prep?: CompilePrep, +): RuntimeGraph { const entry = resolve(entryFile); + if (prep) { + const modules = new Map(); + for (const [filePath, ast] of prep.astByFile) { + modules.set(filePath, nodeFromAst(filePath, ast, workspaceRoot)); + } + return { entryFile: entry, modules }; + } const modules = new Map(); const queue: string[] = [entry]; while (queue.length > 0) { diff --git a/src/runtime/kernel/node-workflow-runner.ts b/src/runtime/kernel/node-workflow-runner.ts index a3432c4a..5a55b3a7 100644 --- a/src/runtime/kernel/node-workflow-runner.ts +++ b/src/runtime/kernel/node-workflow-runner.ts @@ -1,5 +1,6 @@ import { basename, dirname, join } from "node:path"; import { writeFileSync } from "node:fs"; +import { readCompilePrep } from "../../transpile/compile-prep"; import { buildRuntimeGraph } from "./graph"; import { NodeWorkflowRuntime } from "./node-workflow-runtime"; @@ -28,7 +29,9 @@ async function main(): Promise { process.env.JAIPH_SCRIPTS = join(dirname(builtScript), "scripts"); } const workspaceRoot = process.env.JAIPH_WORKSPACE || undefined; - const graph = buildRuntimeGraph(sourceFile, workspaceRoot); + const prepFile = process.env.JAIPH_COMPILE_PREP_FILE; + const prep = prepFile ? readCompilePrep(prepFile) : undefined; + const graph = buildRuntimeGraph(sourceFile, workspaceRoot, prep); const runtime = new NodeWorkflowRuntime(graph, { env: process.env, cwd: process.cwd() }); const status = workflowName === "default" ? await runtime.runDefault(runArgs) : 1; writeFileSync( diff --git a/src/transpile/build.ts b/src/transpile/build.ts index cbe4d478..0b49e88f 100644 --- a/src/transpile/build.ts +++ b/src/transpile/build.ts @@ -1,6 +1,7 @@ import { chmodSync, mkdirSync, readFileSync, readdirSync, statSync, writeFileSync } from "node:fs"; import { dirname, extname, join, parse, relative, resolve } from "node:path"; import { parsejaiph } from "../parser"; +import type { CompilePrep } from "./compile-prep"; import type { ScriptArtifact } from "./emit-script"; import { JAIPH_EXT_REGEX, resolveImportPath } from "./resolve"; @@ -115,13 +116,16 @@ export function collectTransitiveJhModules(entrypoint: string, workspaceRoot?: s } /** - * Writes extracted `script` bodies to `/scripts`. + * Writes extracted `script` bodies to `/scripts`. When `prep` is + * supplied, the transitive-module list comes from the pre-parsed cache instead + * of re-walking and re-parsing the import closure. */ export function buildScripts( inputPath: string, targetDir: string | undefined, emitScriptsFn: (file: string, root: string) => ScriptArtifact[], workspaceRoot?: string, + prep?: CompilePrep, ): { scriptsDir: string } { const absInput = resolve(inputPath); const inputStat = statSync(absInput); @@ -130,7 +134,9 @@ export function buildScripts( ensureDir(outRoot); const entrypointFile = inputStat.isFile() ? absInput : null; - const files = entrypointFile ? collectTransitiveJhModules(entrypointFile, workspaceRoot) : walkjhFiles(rootDir); + const files = prep + ? [...prep.astByFile.keys()].sort() + : entrypointFile ? collectTransitiveJhModules(entrypointFile, workspaceRoot) : walkjhFiles(rootDir); const scriptsRoot = join(outRoot, "scripts"); ensureDir(scriptsRoot); diff --git a/src/transpile/compile-prep.test.ts b/src/transpile/compile-prep.test.ts new file mode 100644 index 00000000..f96388c6 --- /dev/null +++ b/src/transpile/compile-prep.test.ts @@ -0,0 +1,265 @@ +import { mkdtempSync, readdirSync, rmSync, writeFileSync } from "node:fs"; +import { tmpdir } from "node:os"; +import { join } from "node:path"; +import { test } from "node:test"; +import assert from "node:assert/strict"; + +import { buildScripts } from "../transpiler"; +import { buildRuntimeGraph, resolveScriptRef, resolveWorkflowRef } from "../runtime/kernel/graph"; +import { + prepareCompile, + serializeCompilePrep, + deserializeCompilePrep, +} from "./compile-prep"; + +function write(filePath: string, content: string): void { + writeFileSync(filePath, content, "utf8"); +} + +/** + * Acceptance criterion 1: the default local run path must not parse the entry + * module in the parent and then re-parse the same module in the child to build + * the runtime graph. + * + * Strategy: after `prepareCompile` parses every reachable `.jh`, we corrupt + * each file's contents to junk that the parser would reject. If `buildScripts` + * (parent) or `buildRuntimeGraph` (child) re-reads/re-parses any module, the + * call throws and the test fails. The old `run.ts` + `buildScripts()` + + * `node-workflow-runner.ts` duplicate-parse pattern is exactly what would + * fail here. + */ +test("compile-prep: buildScripts + buildRuntimeGraph reuse pre-parsed ASTs and never re-read .jh after prepare", () => { + const dir = mkdtempSync(join(tmpdir(), "jaiph-prep-noreparse-")); + try { + const main = join(dir, "main.jh"); + const lib = join(dir, "lib.jh"); + write( + lib, + [ + "rule check() {", + ' log "ok"', + "}", + "script helper = `echo hi`", + "workflow inner() {", + " echo ok", + "}", + "", + ].join("\n"), + ); + write( + main, + [ + 'import "./lib.jh" as lib', + "script local_script = `echo local`", + "workflow default() {", + " run lib.inner()", + "}", + "", + ].join("\n"), + ); + + const prep = prepareCompile(main); + assert.equal(prep.astByFile.size, 2); + assert.ok(prep.astByFile.has(main)); + assert.ok(prep.astByFile.has(lib)); + + // Corrupt source contents. Files still exist (so existsSync passes), but + // any new parse call would throw a parse error. + write(main, "!!! invalid jaiph syntax !!!\n"); + write(lib, "!!! invalid jaiph syntax !!!\n"); + + const outDir = mkdtempSync(join(tmpdir(), "jaiph-prep-out-")); + try { + const { scriptsDir } = buildScripts(main, outDir, undefined, prep); + const emitted = readdirSync(scriptsDir).sort(); + assert.deepEqual(emitted, ["helper", "local_script"]); + + const graph = buildRuntimeGraph(main, undefined, prep); + assert.equal(graph.modules.size, 2); + const inner = resolveWorkflowRef(graph, main, { + value: "lib.inner", + loc: { line: 1, col: 1 }, + }); + assert.equal(inner?.workflow.name, "inner"); + const helper = resolveScriptRef(graph, main, "lib.helper"); + assert.equal(helper?.script.name, "helper"); + } finally { + rmSync(outDir, { recursive: true, force: true }); + } + } finally { + rmSync(dir, { recursive: true, force: true }); + } +}); + +/** + * Acceptance criterion 2: the optimized graph/compile-prep path preserves + * cross-module workflow, rule, and script resolution. + */ +test("compile-prep: cross-module workflow, rule, and script resolution survives the optimized path", () => { + const dir = mkdtempSync(join(tmpdir(), "jaiph-prep-crossmod-")); + try { + const main = join(dir, "main.jh"); + const lib = join(dir, "lib.jh"); + write( + lib, + [ + "rule check() {", + ' log "ok"', + "}", + "script helper = `echo hi`", + "workflow inner() {", + " echo ok", + "}", + "", + ].join("\n"), + ); + write( + main, + [ + 'import "./lib.jh" as lib', + "rule local_check() {", + ' log "local"', + "}", + "script local_script = `echo local`", + "workflow default() {", + " run lib.inner()", + "}", + "", + ].join("\n"), + ); + + const prep = prepareCompile(main); + const outDir = mkdtempSync(join(tmpdir(), "jaiph-prep-out2-")); + try { + const { scriptsDir } = buildScripts(main, outDir, undefined, prep); + const emitted = readdirSync(scriptsDir).sort(); + assert.deepEqual(emitted, ["helper", "local_script"]); + + const graph = buildRuntimeGraph(main, undefined, prep); + const localWf = resolveWorkflowRef(graph, main, { + value: "default", + loc: { line: 1, col: 1 }, + }); + assert.equal(localWf?.workflow.name, "default"); + const importedWf = resolveWorkflowRef(graph, main, { + value: "lib.inner", + loc: { line: 1, col: 1 }, + }); + assert.equal(importedWf?.workflow.name, "inner"); + const localScript = resolveScriptRef(graph, main, "local_script"); + assert.equal(localScript?.script.name, "local_script"); + const importedScript = resolveScriptRef(graph, main, "lib.helper"); + assert.equal(importedScript?.script.name, "helper"); + } finally { + rmSync(outDir, { recursive: true, force: true }); + } + } finally { + rmSync(dir, { recursive: true, force: true }); + } +}); + +/** + * Cross-process boundary: the parent serializes the prep, the child + * deserializes it and reuses every AST. Asserts the JSON format is + * round-trippable so the worker can rebuild the graph without re-parsing. + */ +test("compile-prep: serialize round-trip preserves the import closure for the child runner", () => { + const dir = mkdtempSync(join(tmpdir(), "jaiph-prep-roundtrip-")); + try { + const main = join(dir, "main.jh"); + const lib = join(dir, "lib.jh"); + write( + lib, + [ + "workflow inner() {", + " echo ok", + "}", + "", + ].join("\n"), + ); + write( + main, + [ + 'import "./lib.jh" as lib', + "workflow default() {", + " run lib.inner()", + "}", + "", + ].join("\n"), + ); + + const prep = prepareCompile(main); + const serialized = serializeCompilePrep(prep); + // Corrupt source contents so any deserialized-path consumer that tries to + // re-parse would fail loudly. Files still exist so existsSync passes. + write(main, "!!! invalid !!!\n"); + write(lib, "!!! invalid !!!\n"); + const round = deserializeCompilePrep(serialized); + assert.equal(round.astByFile.size, 2); + const graph = buildRuntimeGraph(main, undefined, round); + const importedWf = resolveWorkflowRef(graph, main, { + value: "lib.inner", + loc: { line: 1, col: 1 }, + }); + assert.equal(importedWf?.workflow.name, "inner"); + } finally { + rmSync(dir, { recursive: true, force: true }); + } +}); + +/** + * Three-module closure: prove the optimization scales beyond the direct + * import case in the acceptance criteria. + */ +test("compile-prep: handles a 3-module closure with one shared parse", () => { + const dir = mkdtempSync(join(tmpdir(), "jaiph-prep-three-")); + try { + const main = join(dir, "main.jh"); + const libA = join(dir, "a.jh"); + const libB = join(dir, "b.jh"); + write(libA, "workflow a() {\n echo ok\n}\n"); + write( + libB, + [ + 'import "./a.jh" as a', + "workflow b() {", + " run a.a()", + "}", + "", + ].join("\n"), + ); + write( + main, + [ + 'import "./b.jh" as b', + "workflow default() {", + " run b.b()", + "}", + "", + ].join("\n"), + ); + + const prep = prepareCompile(main); + assert.equal(prep.astByFile.size, 3); + + // Corrupt every source: any downstream re-parse would now fail. + write(main, "!!! invalid !!!\n"); + write(libA, "!!! invalid !!!\n"); + write(libB, "!!! invalid !!!\n"); + + const outDir = mkdtempSync(join(tmpdir(), "jaiph-prep-three-out-")); + try { + buildScripts(main, outDir, undefined, prep); + const graph = buildRuntimeGraph(main, undefined, prep); + const bRef = resolveWorkflowRef(graph, main, { value: "b.b", loc: { line: 1, col: 1 } }); + assert.equal(bRef?.workflow.name, "b"); + // Resolve transitively into a.jh via b's imports. + const bNode = graph.modules.get(libB)!; + assert.equal(bNode.imports.get("a"), libA); + } finally { + rmSync(outDir, { recursive: true, force: true }); + } + } finally { + rmSync(dir, { recursive: true, force: true }); + } +}); diff --git a/src/transpile/compile-prep.ts b/src/transpile/compile-prep.ts new file mode 100644 index 00000000..dcfdbf2e --- /dev/null +++ b/src/transpile/compile-prep.ts @@ -0,0 +1,69 @@ +import { readFileSync, writeFileSync } from "node:fs"; +import { resolve } from "node:path"; +import { parsejaiph } from "../parser"; +import { resolveImportPath } from "./resolve"; +import type { jaiphModule } from "../types"; + +/** + * One-shot parse of a `.jh` entry plus its transitive import closure. Reused by + * `buildScripts` (validation + script emit) and `buildRuntimeGraph` (runtime + * dispatch) so each reachable module is parsed exactly once per `jaiph run`, + * even across the parent-CLI → child-runner process boundary. + */ +export interface CompilePrep { + entryFile: string; + workspaceRoot?: string; + /** AST for every reachable module, keyed by absolute path. */ + astByFile: Map; +} + +export function prepareCompile(entryFile: string, workspaceRoot?: string): CompilePrep { + const entry = resolve(entryFile); + const astByFile = new Map(); + const queue: string[] = [entry]; + while (queue.length > 0) { + const current = queue.shift()!; + if (astByFile.has(current)) continue; + const ast = parsejaiph(readFileSync(current, "utf8"), current); + astByFile.set(current, ast); + for (const imp of ast.imports) { + const importedFile = resolveImportPath(current, imp.path, workspaceRoot); + if (!astByFile.has(importedFile)) queue.push(importedFile); + } + } + return { entryFile: entry, workspaceRoot, astByFile }; +} + +/** Stable JSON encoding for cross-process transfer. */ +export function serializeCompilePrep(prep: CompilePrep): string { + const entries = [...prep.astByFile.entries()]; + entries.sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0)); + return JSON.stringify({ + entryFile: prep.entryFile, + workspaceRoot: prep.workspaceRoot ?? null, + modules: entries.map(([file, ast]) => ({ file, ast })), + }); +} + +export function deserializeCompilePrep(content: string): CompilePrep { + const obj = JSON.parse(content) as { + entryFile: string; + workspaceRoot: string | null; + modules: Array<{ file: string; ast: jaiphModule }>; + }; + const astByFile = new Map(); + for (const m of obj.modules) astByFile.set(m.file, m.ast); + return { + entryFile: obj.entryFile, + workspaceRoot: obj.workspaceRoot ?? undefined, + astByFile, + }; +} + +export function writeCompilePrep(filePath: string, prep: CompilePrep): void { + writeFileSync(filePath, serializeCompilePrep(prep), "utf8"); +} + +export function readCompilePrep(filePath: string): CompilePrep { + return deserializeCompilePrep(readFileSync(filePath, "utf8")); +} diff --git a/src/transpiler.ts b/src/transpiler.ts index 86ab5141..9b493ac1 100644 --- a/src/transpiler.ts +++ b/src/transpiler.ts @@ -2,23 +2,39 @@ import { existsSync, readFileSync } from "node:fs"; import { dirname } from "node:path"; import { parsejaiph } from "./parser"; import { buildScripts as buildScriptsImpl, walkTestFiles } from "./transpile/build"; +import type { CompilePrep } from "./transpile/compile-prep"; import { buildScriptFiles, type ScriptArtifact } from "./transpile/emit-script"; import { resolveImportPath, workflowSymbolForFile } from "./transpile/resolve"; import { resolveScriptImportPath, validateReferences } from "./transpile/validate"; export { resolveImportPath, workflowSymbolForFile } from "./transpile/resolve"; export type { ScriptArtifact } from "./transpile/emit-script"; +export type { CompilePrep } from "./transpile/compile-prep"; /** * Parse, validate, and extract per-`script` bash files for one module (no workflow bash emission). + * When `prep` is supplied, reuses already-parsed ASTs instead of re-reading from disk. */ -export function emitScriptsForModule(inputFile: string, rootDir: string, workspaceRoot?: string): ScriptArtifact[] { - const ast = parsejaiph(readFileSync(inputFile, "utf8"), inputFile); +export function emitScriptsForModule( + inputFile: string, + rootDir: string, + workspaceRoot?: string, + prep?: CompilePrep, +): ScriptArtifact[] { + const cachedAst = prep?.astByFile.get(inputFile); + const ast = cachedAst ?? parsejaiph(readFileSync(inputFile, "utf8"), inputFile); + const readFile = prep + ? (path: string): string => (prep.astByFile.has(path) ? "" : readFileSync(path, "utf8")) + : (path: string): string => readFileSync(path, "utf8"); + const parse = prep + ? (content: string, filePath: string) => + prep.astByFile.get(filePath) ?? parsejaiph(content, filePath) + : parsejaiph; validateReferences(ast, { resolveImportPath, existsSync, - readFile: (path: string) => readFileSync(path, "utf8"), - parse: parsejaiph, + readFile, + parse, workspaceRoot, }); const workflowSymbol = workflowSymbolForFile(inputFile, rootDir); @@ -41,7 +57,12 @@ export function emitScriptsForModule(inputFile: string, rootDir: string, workspa export { walkTestFiles }; -export function buildScripts(inputPath: string, targetDir?: string, workspaceRoot?: string): { scriptsDir: string } { - const emitFn = (file: string, root: string) => emitScriptsForModule(file, root, workspaceRoot); - return buildScriptsImpl(inputPath, targetDir, emitFn, workspaceRoot); +export function buildScripts( + inputPath: string, + targetDir?: string, + workspaceRoot?: string, + prep?: CompilePrep, +): { scriptsDir: string } { + const emitFn = (file: string, root: string) => emitScriptsForModule(file, root, workspaceRoot, prep); + return buildScriptsImpl(inputPath, targetDir, emitFn, workspaceRoot, prep); } From d6a6fdd86de8756a2a6ec475a107f1b11b6dcf69 Mon Sep 17 00:00:00 2001 From: Jakub Dzikowski Date: Fri, 15 May 2026 10:52:03 +0200 Subject: [PATCH 04/14] design: parser & compiler simplification plan + dev-ready queue Add design/2026-05-15-parser-compiler-simplification.md documenting five load-bearing refactors (ModuleGraph, Expr-AST collapse, visitor-table validator, unified catch/recover, tokenizer + RD parser) plus five secondary improvements (Trivia/CST split, typed Arg[], single-pass validator walk, Diagnostics collector, validator/runtime decoupling). Append all ten as standalone, #dev-ready tasks to QUEUE.md in the recommended implementation order. Also drop .jaiph/language_redesign_spec.md (superseded). Co-Authored-By: Claude Opus 4.7 (1M context) --- .jaiph/language_redesign_spec.md | 800 ------------------ QUEUE.md | 300 +++++++ ...26-05-15-parser-compiler-simplification.md | 347 ++++++++ 3 files changed, 647 insertions(+), 800 deletions(-) delete mode 100644 .jaiph/language_redesign_spec.md create mode 100644 design/2026-05-15-parser-compiler-simplification.md diff --git a/.jaiph/language_redesign_spec.md b/.jaiph/language_redesign_spec.md deleted file mode 100644 index d33cbb83..00000000 --- a/.jaiph/language_redesign_spec.md +++ /dev/null @@ -1,800 +0,0 @@ -# Execution-Boundary Rework Specification - -## Core Problem - -Jaiph blends declarative orchestration with raw shell in workflows and rules. That blurs side-effect boundaries, blocks runtime portability (Go/Rust), and weakens sandbox control. - -Target: one strict boundary. Orchestration constructs orchestrate. A dedicated script construct executes. No exceptions. - -## Design Decisions (Locked) - -These are not options. Implementation starts from this table. - -| # | Decision | -|---|----------| -| 1 | Orchestration constructs (`workflow`, `rule`) contain **zero raw shell**. | -| 2 | Execution construct (`script`) is a **standalone executable** — bash by default, any language via custom shebang. | -| 3 | Construct name is **`script`** (not `function` or `bash`). | -| 4 | Variable declarations use **`const`** in orchestration, **`local`** in scripts. | -| 5 | Rules get **structured keyword parsing** (same model as workflows, restricted subset). | -| 6 | Every shell operation requires a **named `script`**. No anonymous bash blocks. | -| 7 | Scripts: **standard exit semantics** (exit code via `return N`/`exit N`, values via stdout). | -| 8 | Workflows/rules: **`return "value"`** for values, **`fail "reason"`** for explicit failures. | -| 9 | **One-shot cutover.** No compatibility mode, no deprecation warnings. | -| 10 | Scripts run in **full isolation** — only positional args, no inherited variables. | -| 11 | **No script-to-script calls.** Scripts are atomic. Composition happens in orchestration. | -| 12 | Shared utility code lives in **shared bash libraries** (sourced explicitly in bash scripts), not in Jaiph script cross-calls. | -| 13 | `if` uses **brace syntax** (`if ... { } else { }`), **`not`** for negation, **`else if`** for chaining. No `then`/`fi`/`elif`. | -| 14 | Scripts transpile to **separate executable files** with `+x` permission. | -| 15 | Default shebang is `#!/usr/bin/env bash`. User can provide a custom shebang as the first line of the script body (e.g. `#!/usr/bin/env node`). | -| 16 | Workflows, rules, and scripts support **named parameters** in declarations. Positional `$1`/`$2` boilerplate is eliminated. | - -## Legality Matrix - -### `workflow` - -| Construct | Allowed | Syntax | -|-----------|---------|--------| -| config | Yes | `config { key = "value" }` | -| const | Yes | `const name = "value"` / `const name = run ref` / `const name = ensure ref` / `const name = prompt "text"` | -| run | Yes | `run ref [args]` / `run ref [args] &` (async) | -| ensure | Yes | `ensure ref [args]` / `ensure ref [args] recover { ... }` | -| prompt | Yes | `prompt "text"` / `const name = prompt "text"` / `const name = prompt "text" returns '{ ... }'` | -| log | Yes | `log "message"` | -| logerr | Yes | `logerr "message"` | -| return | Yes | `return "value"` / `return $var` | -| fail | Yes | `fail "reason"` | -| if | Yes | `if [not] ensure ref { ... } [else if ...] [else { ... }]` / `if [not] run ref { ... }` | -| route | Yes | `channel -> ref1, ref2` | -| send | Yes | `channel <- "value"` / `channel <- $var` / `channel <- run ref` | -| wait | Yes | `wait` (waits for async `run` steps) | -| Raw shell | **No** | Hard parser error with rewrite guidance | - -### `rule` - -| Construct | Allowed | Syntax | -|-----------|---------|--------| -| const | Yes | `const name = "value"` / `const name = run ref` / `const name = ensure ref` (no `prompt` capture) | -| ensure | Yes | `ensure ref [args]` — other rules only, **no `recover`** | -| run | Yes | `run ref [args]` — **scripts only**, not workflows | -| log | Yes | `log "message"` | -| logerr | Yes | `logerr "message"` | -| return | Yes | `return "value"` / `return $var` | -| fail | Yes | `fail "reason"` | -| if | Yes | `if [not] ensure ref { ... }` / `if [not] run ref { ... }` (run targets scripts only) | -| prompt | **No** | Rules don't interact with AI | -| route / send | **No** | Rules don't use channels | -| async (`&`, `wait`) | **No** | | -| recover (in `ensure`) | **No** | Not in rule-to-rule calls | -| Raw shell | **No** | Hard parser error | - -### `script` - -| Construct | Allowed | Syntax | -|-----------|---------|--------| -| Custom shebang | Yes | `#!/usr/bin/env node` (first line of body; omit for default `#!/usr/bin/env bash`) | -| All body content | Yes | Full language content matching the shebang (bash by default) | -| Nested bash functions | Yes (bash) | `helper() { ... }` (internal to the script body) | -| Shared bash via workspace lib dir | **No** | Use `import script`, a sibling module, or inline bash in a `script` block — `JAIPH_LIB` is not provided | -| `return N` / `exit N` | Yes (bash) | Exit code (integer only) | -| stdout (`echo`, `printf`) | Yes | Value output mechanism | -| `local` | Yes (bash) | Bash variable declarations | -| Other Jaiph script calls | **No** | Scripts are atomic; compose in orchestration | -| `run`, `ensure`, `prompt` | **No** | Hard parser error (bash scripts only; skipped for custom shebangs) | -| `return "value"` | **No** | Use `echo` for values, `return 0` for success (bash scripts only) | -| `fail`, `const`, `log`, `logerr` | **No** | Jaiph keywords, not available in scripts (bash scripts only; skipped for custom shebangs) | -| Parent scope variables | **No** | Full isolation — only positional args | - -**Jaiph keyword guard**: for bash scripts (no shebang or `#!/usr/bin/env bash`), the parser rejects Jaiph-level keywords (`run`, `ensure`, `fail`, `const`, `log`, `logerr`, `prompt`) in the body. For custom shebangs (e.g. `#!/usr/bin/env node`), the guard is skipped — the user owns the body entirely. - -## Named Parameters - -All constructs support named parameters in their declarations: - -``` -workflow implement(task, role_name) { ... } -rule ensure_is_number(value) { ... } -script check_hash(file_path, expected_hash) { ... } -``` - -**Semantics:** - -- Parameters are available as named local variables inside the construct body. -- For workflows/rules: the transpiler emits `local task="$1"; local role_name="$2"` at the top of the function body. -- For bash scripts: the transpiler prepends `local file_path="$1"; local expected_hash="$2"` to the script file. For non-bash shebangs, named params are documentary only (the language uses its own argv mechanism). -- **Optional/default parameters**: `workflow deploy(env, version, dry_run = "false")` transpiles to `local dry_run="${3:-false}"`. -- Both positional and named calling conventions are valid at call sites: - - `run implement "$task" "$role_name"` — positional, mapped by declaration order. - - `run implement task="$task" role_name="$role_name"` — named (already partially supported via `parseParamKeysFromArgs`). -- **Arity validation**: the validator can check call sites against the declaration. `run implement` with zero args when `implement` declares two required params is a validation error. -- **Parentheses are optional**: `workflow default() { ... }` (no params) remains valid. Constructs with params use `name(params) { ... }`. - -## Script Isolation and Transpilation Model - -Scripts execute in **full isolation**. They receive only their positional arguments. No inherited variables from the orchestration scope, module-level constants, or other scripts' state. - -### Transpilation to separate files - -Each `script` block transpiles to a **standalone executable file** in the build output: - -``` -build/ - scripts/ - check_is_number # #!/usr/bin/env bash, +x - check_json_schema # #!/usr/bin/env node, +x - select_role # #!/usr/bin/env bash, +x - module_name.sh # orchestration (workflows + rules) -``` - -The transpiler: -1. Extracts each `script` body verbatim -2. Prepends the shebang (user-provided or default `#!/usr/bin/env bash`) -3. Writes to `build/scripts/` with `chmod +x` -4. In the module `.sh`, script calls become: `"$JAIPH_SCRIPTS/" "$@"` - -The runtime sets `$JAIPH_SCRIPTS` to the build output scripts directory. - -### Shebang syntax - -The first non-empty line of the script body is checked for `#!`. If present, it becomes the file's shebang. If absent, `#!/usr/bin/env bash` is used. - -``` -script check_json() { - #!/usr/bin/env node - const data = JSON.parse(process.argv[2]); - process.exit(data.valid ? 0 : 1); -} - -script check_is_number() { - [[ "$1" =~ ^[0-9]+$ ]] -} -``` - -### Data flow - -**Data flow is always explicit**: -- **Input**: named parameters (declared in signature) or positional arguments (`$1`, `$2`, ...). Named params are syntactic sugar — they transpile to positional arg assignments. -- **Output**: stdout (value), stderr (diagnostics), exit code (success/failure) -- **No side channel**: scripts cannot read `const` variables from workflows/rules - -### Shared utility code (bash scripts only) - -Scripts cannot call other Jaiph scripts. Factor repeated bash into **`import script "./helper.sh" as helper`** (path relative to the `.jh` file), another `.jh` module, or a small extra `script` in the same module. Do not use a workspace-wide bash drop directory outside the compiler model. - -Non-bash scripts use their language's own module system for shared code. - -## Semantics: Values, Returns, Failures - -### Scripts (isolated, standalone executables) - -Values are passed via **stdout**. Caller captures with `const result = run script_name`. - -Exit code determines success/failure: `return 0` / `exit 0` = success, `return 1` / `exit 1` = failure. - -The existing `jaiph::set_return_value` mechanism is **removed** from script transpilation. `return "$string"` in a bash script body is a **parser error** (bash `return` only accepts integers). - -### Workflows - -`return "value"` passes a value to the caller via the Jaiph runtime (not stdout). - -`fail "reason"` terminates the workflow with a non-zero exit and logs the reason to stderr. An unrecovered `ensure` failure also terminates the workflow. - -Exit code: 0 on natural completion or `return`. Non-zero on `fail` or unrecovered failure. - -### Rules - -`return "value"` passes a value to the caller. Captured by `const result = ensure rule_name`. - -`fail "reason"` causes the rule to fail. In the caller, this triggers a `recover` block (if present) or aborts. - -A rule that completes without hitting `fail` passes. - -### `fail` vs script failure - -| Context | How to fail | How to return a value | -|---------|-------------|----------------------| -| `script` | `return 1` / `exit 1` | `echo "value"` (stdout) | -| `workflow` | `fail "reason"` | `return "value"` | -| `rule` | `fail "reason"` | `return "value"` | - -## Migration Examples - -### Rule: raw shell → structured - -Before: - -``` -rule ensure_is_number() { - if ! [[ "$1" =~ ^[0-9]+$ ]]; then - echo "Expected a non-negative integer, got: $1" >&2 - exit 1 - fi -} -``` - -After: - -``` -script check_is_number(value) { - [[ "$value" =~ ^[0-9]+$ ]] -} - -rule ensure_is_number(value) { - if not run check_is_number "$value" { - fail "Expected a non-negative integer, got: $value" - } -} -``` - -### Workflow: inline shell → named script - -Before: - -``` -workflow default() { - n="${1:-10}" - ensure ensure_is_number "$n" - result = run fib "$n" - log "$result" -} -``` - -After: - -``` -workflow default(n = "10") { - ensure ensure_is_number "$n" - const result = run fib "$n" - log "$result" -} -``` - -### Script: return value via stdout (not `jaiph::set_return_value`) - -Before: - -``` -function fib() { - local result - result="$(fib_impl "$n")" - return "$result" -} -``` - -After: - -``` -script fib() { - fib_impl() { - local x="$1" - if [ "$x" -le 1 ]; then - echo "$x" - return 0 - fi - local a b - a="$(fib_impl "$((x - 1))")" - b="$(fib_impl "$((x - 2))")" - echo "$((a + b))" - } - fib_impl "$1" -} -``` - -All data is internal. Caller captures via `const result = run fib "$n"`. - -### Polyglot script: Node.js validation - -``` -script validate_json_schema(schema_path, data_path) { - #!/usr/bin/env node - const Ajv = require('ajv'); - const fs = require('fs'); - const ajv = new Ajv(); - const schema = JSON.parse(fs.readFileSync(process.argv[2], 'utf8')); - const data = JSON.parse(fs.readFileSync(process.argv[3], 'utf8')); - const valid = ajv.validate(schema, data); - if (!valid) { - console.error(JSON.stringify(ajv.errors)); - process.exit(1); - } -} - -workflow validate_config() { - ensure config_file_exists - const result = run validate_json_schema "schema.json" "config.json" - log "Config validated successfully" -} -``` - -### Prompt with `returns` + value dispatch (engineer.jh pattern) - -Before: - -``` -local role_surgical = "..." -local role_reductionist = "..." - -workflow implement() { - local role_name="$2" - local role - if [ "$role_name" = "surgical" ]; then - role="$role_surgical" - elif [ "$role_name" = "reductionist" ]; then - role="$role_reductionist" - fi - prompt "$role ..." -} -``` - -After: - -``` -script select_role(role_name) { - local role_surgical=' - You are a surgical engineer. ... - ' - local role_reductionist=' - You are a reductionist engineer. ... - ' - - case "$role_name" in - surgical) echo "$role_surgical" ;; - reductionist) echo "$role_reductionist" ;; - *) echo "Unknown role: $role_name" >&2; return 1 ;; - esac -} - -workflow implement(task, role_name) { - const role = run select_role "$role_name" - - prompt " - $role - ... - $task - " -} -``` - -Role data is internal to the script. Orchestration only passes the role name and receives the resolved text. Full isolation — script has zero knowledge of caller scope. - -### Send operator - -Before: - -``` -workflow scanner() { - findings <- echo "Found 3 issues in auth module" -} -``` - -After: - -``` -workflow scanner() { - findings <- "Found 3 issues in auth module" -} -``` - -### Rule with value return - -Before: - -``` -rule echo_line() { - echo "this goes to logs only" - return "captured-value" -} -``` - -After: - -``` -script echo_impl() { - echo "this goes to logs only" >&2 -} - -rule echo_line() { - run echo_impl - return "captured-value" -} -``` - -## Pattern Catalog: .jaiph/ and e2e/ audit - -Every `.jh` file was scanned. Below are all patterns found that require migration, grouped by category. - -### P1: Raw shell in workflows (every .jaiph/ file) - -**Files**: queue.jh, docs_parity.jh, simplifier.jh, architect_review.jh, ensure_ci_passes.jh, qa.jh, git.jh, log_keyword.jh, nested_run.jh, workflow_greeting.jh, prompt_unmatched.jh, rule_pass.jh, assign_capture.jh - -**Examples**: `echo "..."`, `printf`, `mkdir -p`, `rm -f`, `exit 0`, `exit 1`, `test -n`, bare assignment (`dataset="testdata"`) - -**Migration**: each becomes a named `script` or a `const` declaration. `exit 0` → `return` (early success). `exit 1` → `fail "reason"`. - -### P2: Raw shell in rules (every rule) - -**Files**: git.jh (`git rev-parse`, `test -z "$(git status)"`), queue.jh (`echo | grep -q`), ensure_ci_passes.jh (`npm run test:ci`), docs_parity.jh (`test -f`, `while IFS= read`), simplifier.jh, say_hello.jh, say_hello_json.jh, current_branch.jh - -**Migration**: shell logic moves to scripts. Rules become structured: `run` the script, `if`/`fail` on the result. - -### P3: Iteration in workflows - -**Files**: architect_review.jh (`while IFS= read -r header; do ... done <<< "$headers"`), docs_parity.jh (`for f in docs/*.md`, `for f in "${docs_md_files[@]}"`). - -**Problem**: the loop body contains orchestration keywords (`run`, `ensure`, `prompt`, `log`). Cannot be pushed to a script. - -**Resolution**: use **workflow recursion**. Extract per-item logic into a workflow, then recurse over the list. Split newline-delimited lists with tiny `script` steps (e.g. `printf '%s\n' "$1" | head -n 1` / `tail -n +2`) or `import script`. - -``` -script list_docs_files() { - for f in docs/*.md; do - echo "$f" - done -} - -workflow process_docs_recursive(file, remaining) { - run docs_page "$file" - - if run has_value "$remaining" { - const next = run first_line "$remaining" - const rest = run rest_lines "$remaining" - run process_docs_recursive "$next" "$rest" - } -} - -workflow default() { - const docs_files = run list_docs_files - const first = run first_line "$docs_files" - const rest = run rest_lines "$docs_files" - run process_docs_recursive "$first" "$rest" -} -``` - -**Future feature: `each` modifier.** Planned syntax sugar that replaces the recursion boilerplate: - -``` -run docs_page each $docs_files -``` - -`each` is a modifier on `run`/`ensure` that calls the target once per newline-delimited item. No loop body, no mutable state, no break/continue. Backward-compatible addition — does not block v1. - -### P4: Bash arrays in workflows - -**File**: docs_parity.jh — builds arrays dynamically (`local files=()`, `files+=("$f")`), passes them as args (`"${files[@]}"`). - -**Resolution**: avoid arrays in orchestration. Represent lists as newline-delimited strings. Scripts that need to process multiple items receive them as a single string argument. Glob expansion (`docs/*.md`) stays in scripts. - -### P5: Mutable variables in workflows - -**File**: architect_review.jh — `local failed=0` then `failed=1` inside a loop to track whether any task failed. - -**Resolution**: restructure to avoid mutable state. The per-item workflow performs side effects (marking tasks). After recursion completes, re-check the final state: - -``` -workflow review_single_task(header) { - const task = run queue.get_task_by_header "$header" - - if run is_dev_ready "$task" { - log "Already dev-ready: $header" - return - } - - const verdict = run review_task "$task" - if run matches "$verdict" "dev-ready" { - run queue.mark_task_dev_ready "$header" - log "Marked dev-ready: $header" - } else { - log "Needs work: $header" - } -} - -workflow default() { - const headers = run queue.get_all_task_headers - # recurse over headers (or use `each` when available) - ... - - const remaining = run queue.count_not_ready - if not run is_zero "$remaining" { - fail "One or more tasks need work" - } -} -``` - -No mutable counter. The source of truth is the queue state, not a variable. - -### P6: String comparison in workflows (SPEC GAP) - -**Files**: architect_review.jh (`[[ "$verdict" == "dev-ready" ]]`), engineer.jh (role name dispatch), git.jh (`[ -z "$role_name" ]`). - -**Resolution**: push to scripts. - -``` -script matches(a, b) { - [ "$a" = "$b" ] -} - -script has_value(val) { - [ -n "$val" ] -} - -if run matches "$verdict" "dev-ready" { - ... -} -``` - -These are small, reusable utility scripts in the same module (or behind `import script`). - -### P7: `return "$(command)"` in scripts (Jaiph value return) - -**Files**: queue.jh (`return "$(awk ...)"`), docs_parity.jh (`return "$(git diff ...)"`), simplifier.jh (same pattern). - -**Migration**: replace `return "$(command)"` with direct stdout passthrough: - -Before: `return "$(awk '/^## /{print}' "$queue_file")"` - -After: `awk '/^## /{print}' "$queue_file"` (just let stdout flow) - -### P8: `logerr` in rules - -**Files**: say_hello.jh, say_hello_json.jh — `logerr "message"` inside raw shell rule body. - -**Migration**: under structured rules, `logerr` becomes a Jaiph keyword (already in legality matrix): - -``` -rule name_was_provided(name) { - if not run has_value "$name" { - logerr "You didn't provide your name :(" - fail "name argument required" - } -} -``` - -### P9: `ensure` with `recover` containing shell - -**File**: ensure_ci_passes.jh — `recover` block contains `echo "$1" > "$ci_log_file"`, shell conditionals, and a `prompt`. - -**Migration**: shell in recover body moves to scripts. `prompt` stays (recover body follows workflow rules): - -``` -script save_ci_log(content, path) { - echo "$content" > "$path" -} - -script ci_log_exists(path) { - [ -s "$path" ] -} - -workflow ensure_ci_passes() { - const ci_log_file = ".jaiph/tmp/ensure_ci_passes.last.log" - run mkdir_p ".jaiph/tmp" - - ensure ci_passes recover { - run save_ci_log "$1" "$ci_log_file" - if not run ci_log_exists "$ci_log_file" { - fail "ci failure log is empty at $ci_log_file" - } - prompt "Fix failing CI... log at: $ci_log_file" - } - - run rm_file "$ci_log_file" -} -``` - -### P10: Shell variable expansion in `const` RHS - -**Files**: multiple — `"${1:-10}"`, `"${1:-}"`, `"${task%%$'\n'*}"`. - -**Ruling**: simple interpolation (`$var`, `"${var:-default}"`) is allowed in `const` RHS — these are value lookups, not computation. Bash string operations (`${var%%pattern}`, `${var//old/new}`) are computation — push to a script. - -| Allowed in `const` RHS | Not allowed (use script) | -|------------------------|---------------------------| -| `"$var"` | `"${var%%pattern}"` | -| `"${var:-default}"` | `"${var//old/new}"` | -| `"${var:+alt}"` | `"${#var}"` | -| `"literal"` | `$(command)` | - -### P11: Script-to-script calls - -**File**: docs_parity.jh — rule `only_expected_docs_changed_after_prompt` calls script `is_allowed_file` directly. - -**Migration**: under full isolation + no script-to-script calls, inline the logic or add a dedicated `import script` helper: - -``` -script check_only_expected_changed(allowed, changed) { - while IFS= read -r f; do - [ -z "$f" ] && continue - if [[ $'\n'"$allowed"$'\n' != *$'\n'"$f"$'\n'* ]]; then - echo "Unexpected file changed: $f" >&2 - return 1 - fi - done <<< "$changed" -} -``` - -## Implementation Plan - -### Phase 0: Architectural prep (before breaking changes) - -**0a. Refactor `validate.ts` — collapse duplicate ref resolution** -- Merge `validateRuleRef`, `validateWorkflowRef`, `validateRunInRuleRef`, `validateRunTargetRef`, `validateBareSendSymbol` into one generic `validateRef(ref, allowedKinds, context)` function -- Target: 788 → ~400 lines -- Zero behavior change - -**0b. Split `emit-workflow.ts` — separate emitters** -- Extract script emission into `emit-script.ts` -- Extract rule emission into `emit-rule.ts` -- `emit-workflow.ts` becomes orchestration-only assembly -- Creates natural seam for Phase 3 (separate script files) - -### Phase 1: Language additions (no breaking changes) - -**1a. Add `fail` keyword** -- AST: new `WorkflowStepDef` variant `{ type: "fail"; message: string; loc: SourceLoc }` -- Parser: recognize `fail "reason"` in `workflows.ts` -- Transpiler: emit `echo "reason" >&2; exit 1` - -**1b. Add `const` declaration** -- AST: new step type `{ type: "const"; name: string; value: ConstValue; loc: SourceLoc }` where `ConstValue` is string-expr | run-capture | ensure-capture | prompt-capture -- Parser: `const name = ...` with RHS dispatch -- Transpiler: emit `local name; name="value"` or appropriate capture form - -**1c. Formalize `wait` as keyword** -- AST: new variant `{ type: "wait"; loc: SourceLoc }` -- Parser: recognize `wait` in workflows (currently falls through to shell) -- Transpiler: emit `wait` - -**1d. Switch `if` to brace syntax** -- Parser: recognize `if [not] ensure/run ref { ... } [else if ...] [else { ... }]` -- Keep old `if ... then ... fi` working during Phase 1 (dual parsing) -- Transpiler: both forms emit the same bash - -### Phase 2: Rule parser rewrite - -**2a. Restructure `RuleDef`** -- Change `RuleDef.commands: string[]` → `RuleDef.steps: RuleStepDef[]` (or reuse `WorkflowStepDef` subset) -- Rewrite `rules.ts` with keyword-aware parsing (mirror `workflows.ts` structure) -- Port existing rule tests first, then validate structured output - -**2b. Update rule emission** -- `emit-workflow.ts`: handle structured rule steps instead of opaque command strings - -### Phase 3: `function` → `script` rename and separate file transpilation - -**3a. Rename keyword** -- Parser: accept `script` keyword instead of `function` -- AST: rename `FunctionDef` → `ScriptDef`, add `shebang?: string` field -- `jaiphModule`: rename `functions` → `scripts` -- Update all validator references - -**3b. Add shebang extraction** -- Parser: check first non-empty line of script body for `#!` -- If present, store in `ScriptDef.shebang` and exclude from body commands -- If absent, `shebang` remains `undefined` (default `#!/usr/bin/env bash`) - -**3c. Conditional keyword guard** -- For bash scripts (no shebang or bash shebang): keep existing Jaiph keyword rejection -- For custom shebangs: skip keyword guard entirely - -**3d. Emit scripts as separate files** -- Change `emitWorkflow` return type: `{ module: string; scripts: ScriptFile[] }` where `ScriptFile = { name: string; content: string; shebang: string }` -- Module `.sh` calls scripts via `"$JAIPH_SCRIPTS/" "$@"` -- `build.ts`: write script files with `chmod +x`, set `$JAIPH_SCRIPTS` - -**3e. Update all first-party `.jh` files** -- Rename `function` → `script` in all `.jaiph/*.jh` files -- Rename in all `e2e/*.jh` fixtures -- Update test fixtures and golden outputs - -**3f. Named parameters** -- Parser: recognize `name(param1, param2)` and `name(param1, param2 = "default")` in workflow, rule, and script declarations -- AST: add `params?: Array<{ name: string; default?: string }>` to `WorkflowDef`, `RuleDef`, `ScriptDef` -- Transpiler: for workflows/rules, emit `local param1="$1"; local param2="$2"` (or `"${2:-default}"` for defaults) at the top of the function body. For bash scripts, prepend the same to the script file. For non-bash scripts, params are documentary only. -- Validator: check call-site arity against declared params. Missing required args = validation error. Extra args beyond declared params = validation warning. -- Update all first-party `.jh` files to use named params where applicable -- Parentheses optional when no params: `workflow default() { ... }` remains valid - -### Phase 4: Script isolation - -**4a. Implement full isolation for script execution** -- Scripts run as separate processes (inherent from separate files + exec) -- Only positional args available (inherent from separate executable) -- Set `$JAIPH_SCRIPTS` and `$JAIPH_WORKSPACE` for script steps (no workspace bash lib dir) - -**4b. Reject script-to-script calls** -- Parser/validator: detect when a script body references another Jaiph script name -- Error: `"scripts cannot call other Jaiph scripts; use import script, inline bash, or compose in a workflow"` - -### Phase 5: Remove shell (breaking changes) - -**5a. Remove shell fallback from workflow parser** -- `workflows.ts`: delete the catch-all `type: "shell"` codepath -- Remove `shellAccumulator` / `braceDepthDelta` shell accumulation -- Emit parser error: `"raw shell is not allowed in workflow; extract to a script"` - -**5b. Remove shell fallback from rule parser** -- Same treatment after Phase 2 - -**5c. Remove old `if` syntax** -- Drop `if ... then ... fi` / `elif` parsing -- Only accept brace syntax with `not` / `else if` - -**5d. Enforce pure output in scripts** -- `scripts.ts`: reject `return "value"` (non-integer return) -- Remove `jaiph::set_return_value` from script transpilation - -**5e. Update send operator** -- Accept `"value"` / `$var` / `run ref` as RHS -- Reject raw shell command as RHS - -### Phase 6: Migrate all first-party code - -- Rewrite all `e2e/*.jh` fixtures -- Rewrite all `.jaiph/*.jh` workflows -- Factor repeated bash into `import script` or extra `script` blocks in the same module (P6, P11) -- Update test fixtures and golden transpilation outputs -- Update docs and README examples - -### Phase 7: Ship - -- Hard parser errors on all legacy syntax -- Error messages include rewrite examples -- Full e2e + golden snapshot CI gate -- Zero P0 parser/runtime failures before merge - -## Code Changes Required - -| File | Change | -|------|--------| -| `src/types.ts` | Rename `FunctionDef` → `ScriptDef`, add `shebang?: string`, add `params?: ParamDef[]`. Rename `jaiphModule.functions` → `jaiphModule.scripts`. Add `params?: ParamDef[]` to `WorkflowDef`, `RuleDef`. Add `fail`, `wait`, `const` step types. Change `RuleDef.commands` → `RuleDef.steps`. Remove `shell` condition kind from `if`. Add `not` / brace-style `if` AST. | -| `src/parser.ts` | Replace `function` keyword detection with `script`. Rename `parseFunctionBlock` → `parseScriptBlock`. | -| `src/parse/functions.ts` → `src/parse/scripts.ts` | Rename file. Update regex to match `script` keyword. Add shebang extraction. Conditional keyword guard (skip for custom shebangs). Parse named params in signature. | -| `src/parse/workflows.ts` | Remove shell fallback, shell accumulator. Add `fail`, `const`, `wait` parsing. Replace `if ... then ... fi` with brace syntax. | -| `src/parse/rules.ts` | Full rewrite: keyword-aware structured parser mirroring workflow parser. | -| `src/transpile/emit-workflow.ts` | Split: extract script emission to `emit-script.ts`, rule emission to `emit-rule.ts`. Change return type to include script files. Remove `jaiph::set_return_value` from script paths. | -| `src/transpile/emit-script.ts` | **New file.** Emit standalone script files with shebang + body. | -| `src/transpile/emit-rule.ts` | **New file.** Rule emission extracted from `emit-workflow.ts`. | -| `src/transpile/emit-steps.ts` | Remove `emitShellStep` for workflows. Add `emitFailStep`, `emitConstStep`, `emitWaitStep`. | -| `src/transpile/build.ts` | Handle new `emitWorkflow` return shape. Write script files with `chmod +x`. Set `$JAIPH_SCRIPTS` path. | -| `src/transpile/validate.ts` | Collapse duplicate ref resolution. Rename `function` → `script` in errors/lookups. Allow `run` in rules (scripts only). Remove shell-condition validation. Add script isolation validation. | -| `src/transpile/shell-jaiph-guard.ts` | Scope down — only applies to bash scripts now. | -| `e2e/*.jh` | Rewrite all fixtures to new syntax. | -| `.jaiph/*.jh` | Rewrite all workflows to new syntax. | -| `test/fixtures/**` | Update golden transpilation outputs. | -| `docs/*` | Update grammar, getting-started, CLI docs for `script` keyword and shebang. | - -## Risks - -| Risk | Impact | Mitigation | -|------|--------|------------| -| Wide breakage: all raw-shell workflows/rules fail at parse time | High | Single branch, full e2e gate, no merge without 100% pass | -| Rule parser rewrite introduces regressions | High | Port existing rule tests before rewriting parser | -| Ergonomic cost of named scripts for trivial shell | Medium | Accepted tradeoff — boundary clarity > brevity | -| `fail` interacts badly with `recover` | Medium | Explicit test: `ensure rule_with_fail recover { ... }` must trigger recover | -| `const` scoping conflicts with bash `local` | Low | `const` is parser-level immutability; transpiles to `local` | -| Return semantics confusion during migration | Medium | Parser errors guide users: `"return 'value' not allowed in script; use echo"` | -| Script isolation perf overhead (fork+exec per call) | Medium | Measure fork cost; scripts are already logically isolated. Optimize hot paths if needed | -| Users want a global bash grab-bag | Medium | `import script` + small modules; no `JAIPH_LIB` | -| `.jaiph/` workflow migration is large (9 files) | High | Migrate in parallel with parser changes; each file is independently testable | -| Separate file management complexity | Medium | Deterministic naming (`scripts/`), cleanup on rebuild | -| Custom shebang scripts may have missing dependencies | Low | Not Jaiph's problem — user owns their runtime. Document clearly | - -## Success Criteria - -- 100% first-party `.jh` files parse under new grammar -- 100% e2e pass under new runtime -- Zero `type: "shell"` steps in workflow/rule AST output -- `fail` triggers `recover` correctly in `ensure` blocks -- Script bodies reject `return "value"`, `fail`, `const`, other Jaiph keywords (bash scripts only) -- Script bodies reject calls to other Jaiph scripts -- Scripts execute as separate files with correct shebang and `+x` -- Custom shebang scripts (e.g. `#!/usr/bin/env node`) work end-to-end -- Scripts execute in full isolation (no inherited variables) -- `const` declarations work in workflows and rules with all RHS forms -- `if` brace syntax works with `not` and `else if` -- Parser errors for raw shell include actionable rewrite examples -- `jaiph::set_return_value` removed from script transpilation paths -- `validate.ts` under 500 lines after dedup -- `emit-workflow.ts` handles only orchestration; script/rule emission in separate files -- Named parameters work in workflow, rule, and script declarations -- Default parameter values work: `workflow deploy(env, dry_run = "false")` -- Arity validation catches missing required args at call sites diff --git a/QUEUE.md b/QUEUE.md index aa0fb185..f7fd8fa9 100644 --- a/QUEUE.md +++ b/QUEUE.md @@ -12,3 +12,303 @@ Process rules: 6. **Acceptance criteria are non-negotiable.** A task is not done until every acceptance bullet is verified by a test that fails when the contract is violated. "It works on my machine" or "the existing tests pass" is not acceptance. *** + +## Promote `CompilePrep` to a first-class `ModuleGraph` and make the parser I/O-pure #dev-ready + +**Design reference:** `design/2026-05-15-parser-compiler-simplification.md` § Refactor 5. + +**Why:** Three different traversal strategies exist for "the set of modules in this build" — the validator recursively re-reads + re-parses imports via `ValidateContext` callbacks (`src/transpile/validate.ts`), `emitScriptsForModule` (`src/transpiler.ts`) re-wraps the same callbacks with an optional `prep` cache, and `buildScripts` (`src/transpile/build.ts`) walks the file system directly. `compile-prep` already proved the right model — pre-parse all reachable modules once, hand them to validator and emitter — but it is an optimization, not the path. + +**Scope:** + +- Introduce `ModuleGraph` (generalization of `CompilePrep`) as the single representation of "all modules reachable from an entry point, parsed once." +- `parsejaiph(source, filePath)` must remain a pure function `(string, string) => jaiphModule`. No fs calls reachable from `parsejaiph`. +- `validate(graph)` and `emit(graph, outDir)` must operate entirely in-memory. The `ValidateContext` callback shape (`resolveImportPath`, `existsSync`, `readFile`, `parse`, `workspaceRoot`) is removed. +- A single discovery routine (`loadModuleGraph(entry, workspaceRoot?)`) replaces `collectTransitiveJhModules`, the cache-population logic in `compile-prep.ts`, and the bespoke re-parse paths inside `validateReferences` / `emitScriptsForModule`. +- The `prep?` optional parameter on `emitScriptsForModule` and `buildScripts` goes away; both take a `ModuleGraph`. +- LSP / single-file edits and full compiles must share the same pipeline — only the graph root differs. + +**Acceptance criteria** (each verified by a test that fails when violated): + +1. `parsejaiph` cannot reach `fs`. A unit test stubs `node:fs` to throw on any call and parses every fixture in `test-fixtures/` and `examples/`; all must succeed. +2. `validate(graph)` and `emit(graph, outDir)` cannot reach `fs` for source/AST reads (writing emitted scripts is allowed inside `emit`). A unit test stubs `fs.readFileSync`/`fs.existsSync` to throw on any `.jh` path and runs the full pipeline against `test-fixtures/`; all must succeed. +3. `ValidateContext` is deleted from `src/transpile/validate.ts`; `validateReferences` takes a `ModuleGraph` (or equivalent) only. +4. Each `.jh` source file in a compile is parsed exactly once. A test instruments `parsejaiph` with a call counter and asserts no duplicate parses across the full pipeline for at least one fixture with transitive imports. +5. `npm test` and `npm run build` pass. The full golden corpus (`src/transpile/compiler-golden.test.ts`, `src/transpile/compiler-edge.acceptance.test.ts`) passes byte-for-byte against emitted output. +6. The CLI entry points (`src/cli.ts`, `src/cli/`) and `e2e` tests pass unchanged from a user perspective. + +**Out of scope:** changes to the AST shape (Refactor 3), the validator switch structure (Refactor 4), the parser internals (Refactors 1 & 2), and any surface syntax. + +*** + +## Split source-fidelity data from the semantic AST into a Trivia / CST layer #dev-ready + +**Design reference:** `design/2026-05-15-parser-compiler-simplification.md` § Appendix A. + +**Why:** `WorkflowStepDef` and `jaiphModule` today carry roughly ten fields whose only consumer is the formatter: `leadingComments`, `configLeadingComments`, `trailingTopLevelComments`, `configBodySequence`, `topLevelOrder`, `bareSource`, the `tripleQuoted` flags on literal/return/log/fail/send/const, `bodyKind`, `bodyIdentifier`. Every validator/emitter path has to ignore or thread these through unchanged. Pulling them out before the AST is collapsed (next task) lets the new `Expr` shape be designed against the *semantic* core only. + +**Scope:** + +- Introduce a `Trivia` layer (parallel map keyed by node id, or a CST node with both a semantic and a syntactic side) that owns all source-fidelity data currently on the AST. +- Every formatter-only field listed above is removed from `WorkflowStepDef`, `jaiphModule`, `ConstRhs`, `SendRhsDef`, and any other AST type, and re-homed in `Trivia`. +- `parsejaiph` returns `{ ast, trivia }` (or equivalent) instead of a single fat AST. +- The formatter is rewritten to read from `Trivia` alongside the AST. No other consumer (validator, emitter, transpiler, runtime) reads `Trivia` at all. +- Round-trip behavior is bit-for-bit identical for every fixture under `test-fixtures/` and `examples/`. + +**Acceptance criteria** (each verified by a test): + +1. None of the listed fields appear on any `WorkflowStepDef` variant, `jaiphModule`, `ConstRhs`, `SendRhsDef`, or other semantic AST type. A type-level test fails if any of them reappears. +2. Validator and emitter source files do not reference `Trivia` or its fields. A grep test fails if they do. +3. Formatter round-trip is bit-for-bit on every fixture under `test-fixtures/` and `examples/`. Add an explicit test that parses → formats → parses → formats and asserts both formatted outputs match. +4. `npm test` passes, including formatter round-trip tests and the golden corpus. +5. `npm run build` passes; TypeScript strict-mode errors are zero. + +**Out of scope:** the `Expr` collapse (next task) — this refactor only relocates source-fidelity fields, it does not change the semantic AST's shape. Surface syntax. + +**Dependency:** Refactor 5 (ModuleGraph, previous task) should be complete first so the parser is already I/O-pure when its return shape changes. + +*** + +## Collapse `bareIdentifierArgs` into a typed `Arg[]` on every call site #dev-ready + +**Design reference:** `design/2026-05-15-parser-compiler-simplification.md` § Appendix D. + +**Why:** Every call-bearing AST node carries both `args: string` (raw text) and `bareIdentifierArgs: string[]` (a re-parse of which args happened to be bare identifiers). Validator must remember to check both. Emitter does its own re-parse of `args` because it doesn't trust either field alone. The dual representation is also why the validator has a `validateBareIdentifierArgs` helper called by hand at every site. + +**Scope:** + +- Introduce a typed `Arg` sum and replace the `args: string` + `bareIdentifierArgs?: string[]` pair on every call-bearing node: + + ```ts + type Arg = + | { kind: "literal"; raw: string } // "..." / ${var} / etc., as authored + | { kind: "var"; name: string }; // bare identifier reference + + // Call-bearing nodes carry args: Arg[]. No second field. + ``` + +- Parser does the bare-identifier classification once, at parse time. Validator and emitter consume `Arg[]` directly; no re-parse of `args` anywhere downstream. +- Affected nodes (non-exhaustive): every `WorkflowStepDef` variant with a call (`run`, `ensure`, `return.managed`, `log.managed`, `logerr.managed`, `send.rhs`), every `ConstRhs` capture variant. +- `validateBareIdentifierArgs` is deleted; its logic moves into the per-step validator that already walks the call. + +**Acceptance criteria** (each verified by a test): + +1. The field `bareIdentifierArgs` does not appear in any AST type definition under `src/types.ts`. A type-level test fails if it reappears. +2. No production code under `src/parse/` or `src/transpile/` re-parses the `args` string into bare-identifier components. A grep test fails if `args` is split on `,` or scanned char-by-char outside the tokenizer/parser. +3. `validateBareIdentifierArgs` is deleted; `validate.ts` contains no equivalent helper. A grep test fails if it reappears. +4. The full golden corpus passes byte-for-byte: `npm test`, including all `validate-*.test.ts` files and the golden corpus. +5. `npm run build` passes; TypeScript strict-mode errors are zero. + +**Out of scope:** the full `Expr` collapse (next task). Surface syntax. This refactor only changes how call arguments are represented; the call-bearing nodes themselves stay where they are. + +**Dependency:** None hard, but easier after the Trivia split (previous task) because the AST is otherwise stable. + +*** + +## Collapse the AST around a single `Expr` type, eliminating the three "managed call" encodings #dev-ready + +**Design reference:** `design/2026-05-15-parser-compiler-simplification.md` § Refactor 3. + +**Why:** The concept "a managed call that yields a value" is encoded three different ways in `src/types.ts`: as a statement (`{ type: "run", workflow, args }`), as a const RHS (`{ kind: "run_capture", ref, args }`), and as a `managed:` sidecar on `return`/`log`/`logerr` with a placeholder string (e.g. `value: "__match__"`, `value: "run inline_script"`). Inline scripts add a fourth (`run_inline_script_capture`). The same is true for `prompt`, `match`, and `ensure` captures. Validator, formatter, and emitter all have to know about the dual representation. + +**Scope:** + +- Introduce a single `Expr` sum type (or equivalent) used everywhere a value can appear: + + ```ts + type Expr = + | { kind: "literal"; raw: string } + | { kind: "var"; name: string; field?: string } + | { kind: "call"; callee: Ref; args: Arg[] } + | { kind: "ensure_call"; callee: Ref; args: Arg[] } + | { kind: "inline_script"; lang?: string; body: string; args?: Arg[] } + | { kind: "prompt"; body: Expr; returns?: Schema } + | { kind: "match"; subject: Expr; arms: MatchArm[] }; + ``` + +- Replace `ConstRhs` with `Expr`. +- Replace `SendRhsDef` with `Expr` (plus the channel arrow itself). +- `ReturnStep`, `LogStep`, `LogerrStep` become `{ value | message: Expr }`. The placeholder strings `"__match__"`, `"run inline_script"`, etc. are deleted. +- The `managed:` sidecar field is deleted from `WorkflowStepDef`. +- `WorkflowStepDef` ends up with ~7 variants (down from 14). +- All references to the deleted shapes in parser, validator, emitter, and formatter are migrated. + +**Acceptance criteria** (each verified by a test): + +1. The string literals `"__match__"`, `"run inline_script"`, and any other AST placeholder strings are absent from `src/`. Add a meta-test (e.g. a `grep` test) that fails if any reappear. +2. `WorkflowStepDef` has at most 8 variants. Add a type-level test (e.g. an exhaustive `switch` in a compile-time assertion file) that fails if a new variant is silently added. +3. `ConstRhs` and `SendRhsDef` are deleted as separate types; their fields are reachable via `Expr`. A test asserting the export surface of `src/types.ts` fails when those symbols reappear. +4. Every existing parser path that produced a `managed:` sidecar now produces an `Expr` node, and a new parser test asserts the AST shape directly for `return run …`, `return ensure …`, `return match … { … }`, `return run \`…\`(…)`, `log run \`…\`(…)`, and `const x = prompt …`. +5. `npm test` passes. The golden corpus (`compiler-golden.test.ts`, `compiler-edge.acceptance.test.ts`) passes byte-for-byte against emitted bash output. The formatter round-trip tests pass byte-for-byte against source. +6. `npm run build` passes; TypeScript strict-mode errors are zero. + +**Out of scope:** surface syntax, the validator's structural rewrite (Refactor 4), parser internals (Refactors 1 & 2). This refactor is purely an AST + producer/consumer migration. + +**Dependency:** The Trivia/CST split and `Arg[]` collapse (two previous tasks) should be complete first so the new `Expr` shape is designed against the semantic core only. + +*** + +## Fold the validator's pre-passes (knownVars / promptSchemas / immutableBindings) into a single workflow walk #dev-ready + +**Design reference:** `design/2026-05-15-parser-compiler-simplification.md` § Appendix C. + +**Why:** `src/transpile/validate.ts` walks each workflow's step tree at least three times before its main check loop runs: `collectKnownVars`, `collectPromptSchemas`, `validateImmutableBindings`. Each re-implements the same recursion over if/for_lines/catch/recover with subtly different rules — bug-fixes to "what counts as a binding here" land in 2–3 walkers. + +**Scope:** + +- Replace the three pre-passes with a single visitor that descends the workflow once, accumulating `{ knownVars, promptSchemas, bindings }` as it goes. +- The main per-step validator runs in the same descent (or as a second pass over the accumulated state), but the *structural* recursion over if/for_lines/catch/recover happens exactly once. +- All existing validation rules and error messages are preserved bit-for-bit. + +**Acceptance criteria** (each verified by a test): + +1. `collectKnownVars`, `collectPromptSchemas`, and `validateImmutableBindings` are deleted as separate functions. A grep test fails if they reappear by name. +2. There is exactly one recursion over workflow/rule step trees in `src/transpile/validate.ts`. A test counts recursive helpers that walk `WorkflowStepDef[]` and asserts ≤ 1. +3. Every existing `E_VALIDATE` error message and location is preserved bit-for-bit. Snapshot test across every `validate-*.test.ts` fixture. +4. `npm test` passes, including all `validate-*.test.ts` files and the golden corpus. + +**Out of scope:** the visitor-table refactor (Refactor 4, two tasks ahead). Changes to validation rules. + +**Dependency:** The `Expr` collapse (previous task) should be complete first. + +*** + +## Replace fail-fast errors with a Diagnostics collector that aggregates per compile #dev-ready + +**Design reference:** `design/2026-05-15-parser-compiler-simplification.md` § Appendix B. + +**Why:** Today `fail()` (in `src/parse/core.ts`) and `jaiphError()` (in `src/errors.ts`) both throw on the first error. Users fix one error, recompile, fix the next, recompile. The validator also pre-orders some checks defensively because it knows it will only get to surface one error. A diagnostics collector lets the parser and validator append errors and the run report the full set at the end. + +**Scope:** + +- Introduce `class Diagnostics { errors: JaiphDiagnostic[]; add(...); hasFatal(): boolean; report(): never | void }` (or equivalent). +- Parser and validator append diagnostics instead of throwing for non-fatal errors. A "fatal" tier remains for cases where continuing would produce garbage AST (unterminated triple-quote, unterminated brace block). +- At the end of a compile, `Diagnostics.report()` either prints all collected errors sorted by file/line and exits non-zero, or returns cleanly. The CLI surfaces the full set instead of just the first. +- Existing call sites of `fail()` / `jaiphError()` migrate to `diagnostics.add(...)` where the error is recoverable. + +**Acceptance criteria** (each verified by a test): + +1. A fixture containing **N ≥ 3 independent errors** (e.g. an undefined channel, a duplicate import alias, and an unknown ref in a `run` call) reports all N errors in one compile, not just the first. Add a test that asserts the full set is reported in source order. +2. The existing single-error tests still pass: every `parse-*.test.ts` and `validate-*.test.ts` fixture that asserts a specific `{ message, line, col, code }` still gets exactly that error (now the only one in `Diagnostics`). +3. `fail()` and `jaiphError()` throwing call-sites are reduced to a documented "fatal" subset (count it in the test). Non-fatal call-sites use the collector. +4. CLI exit code on any non-empty `Diagnostics` is non-zero. Add an `e2e` or CLI test. +5. `npm test` and `npm run build` pass. + +**Out of scope:** changing what counts as an error (the *what*) — this refactor only changes the *how*. LSP integration (a follow-up). + +**Dependency:** None hard, but cheapest to do immediately before the visitor-table validator refactor (next task), since the new visitor's per-step entry/exit is the natural place to plug in the collector. + +*** + +## Replace the 1,441-line validator switch with a per-step visitor table indexed by scope #dev-ready + +**Design reference:** `design/2026-05-15-parser-compiler-simplification.md` § Refactor 4. + +**Why:** `src/transpile/validate.ts` is one function with two near-identical inner walkers (`validateRuleStep` ~250 lines, `validateStep` ~350 lines). Each step type's validation is written twice with subtle differences, and the 5-check sequence (`validateNoShellRedirection` → `validateNestedManagedCallArgs` → `validateRef` → `validateArity` → `validateBareIdentifierArgs`) is repeated by hand at 6+ sites per side — at least 12 places to keep in sync. + +**Scope:** + +- Replace the two inner walkers with a single AST visitor parameterized by a `Scope` value: + - `Scope` carries `allow: Set`, `refSpec: RefSpec`, and any other rule-vs-workflow differences. + - A `VALIDATORS: Record` table holds one validator per step type, written once. + - `validateCallStep("run" | "ensure")` is a single helper invoked by both `run` and `ensure` validators with different ref-spec / arity-kind arguments. +- The 5-check sequence is encapsulated in one helper (`validateManagedCallShape` or similar) invoked from each call-bearing validator. +- "Is this step allowed in this scope?" becomes a single set-lookup at the top of the visitor, not three throw sites. +- All existing error messages and error codes (`E_VALIDATE`, etc.) are preserved verbatim — both content and source location (line/col) must match what users see today. + +**Acceptance criteria** (each verified by a test): + +1. `src/transpile/validate.ts` is at most 700 lines (down from 1,441). Add a CI check (or test) that fails if it exceeds the bound. +2. `validateReferences` contains exactly one step-walking function. A grep test fails if a second walker is introduced. +3. Every `E_VALIDATE` error message and error location produced today is produced bit-for-bit by the new code. Add a snapshot-style test over every `validate-*.test.ts` fixture asserting `{ message, line, col, code }` matches the pre-refactor output. +4. Adding a new step type requires adding exactly one row to `VALIDATORS` and (if needed) updating the `Scope.allow` sets. Add a test that introduces a synthetic step type behind a test-only flag and asserts the validator rejects it with a single expected message until the row is added. +5. `npm test` passes (all of `validate-immutable-bindings.test.ts`, `validate-managed-calls.test.ts`, `validate-match.test.ts`, `validate-prompt-schema.test.ts`, `validate-ref-resolution.test.ts`, `validate-run-async.test.ts`, `validate-string.test.ts`, `validate-substitution.test.ts`, `validate-type-crossing.test.ts`, plus the golden corpus). + +**Out of scope:** changes to validation rules (the *what*) — this refactor only changes the *how*. Parser changes. AST changes (Refactor 3 must already be merged). + +**Dependency:** Refactor 3 (Expr collapse) and the single-pass-walk + Diagnostics tasks (previous two) must be complete first; otherwise the new visitor still needs to special-case the `managed:` sidecar and the pre-pass-walker pattern. + +*** + +## Decouple the validator from runtime semantics #dev-ready + +**Design reference:** `design/2026-05-15-parser-compiler-simplification.md` § Appendix E. + +**Why:** `src/transpile/validate.ts` imports `tripleQuotedRawForRuntime` from `src/runtime/orchestration-text.ts` so it can compute "what the runtime will see" when validating string content. That is a one-way dependency from compile-time on runtime semantics — a layering inversion that will keep biting if the runtime grows more such helpers. + +**Scope:** + +- Move the canonicalization of triple-quoted strings (currently `tripleQuotedRawForRuntime`) into a parser-side helper (e.g. `src/parse/triple-quote.ts:canonicalizeTripleQuotedString`). +- The validator imports from `src/parse/`, not `src/runtime/`. +- The runtime, if it still needs the same canonical form at runtime, imports from `src/parse/` as well (or the canonical form is baked in at compile time by the emitter). +- Any other `validate*.ts → runtime/*` imports get the same treatment. + +**Acceptance criteria** (each verified by a test): + +1. No file under `src/transpile/` imports from `src/runtime/`. A grep test fails if any such import appears. +2. The canonical string for every triple-quoted form in `test-fixtures/` and `examples/` is bit-for-bit unchanged before and after the move. A test compares pre/post output for every fixture. +3. `npm test` passes, including the golden corpus and all `validate-string.test.ts` cases. +4. `npm run build` passes; TypeScript strict-mode errors are zero. + +**Out of scope:** rethinking what the canonical form *is*. This refactor only relocates the helper. + +**Dependency:** None. + +*** + +## Unify `catch` and `recover` parsing into a single attached-block routine #dev-ready + +**Design reference:** `design/2026-05-15-parser-compiler-simplification.md` § Refactor 2. + +**Why:** `src/parse/steps.ts` contains three near-identical 100+ line functions — `parseEnsureStep`, `parseRunCatchStep`, `parseRunRecoverStep` — that parse the same syntactic shape (` (binding) { body } | single-stmt`) and differ only in which host step they decorate and the literal keyword. Their body parser, `parseCatchStatement` (~280 lines), re-implements a stripped-down version of `parseBlockStatement` with diverging coverage. + +**Scope:** + +- Replace `parseEnsureStep`, `parseRunCatchStep`, `parseRunRecoverStep`, and `parseCatchStatement` with: + - `parseAttachedBlock(keyword: "catch" | "recover", host: WorkflowStepDef)` returning `{ bindings, body: WorkflowStepDef[] }`. + - A body parsed by the **same** `parseBlockStatement` used at the top level — no mini parser. +- All four functions and any helpers that exist only to serve them are deleted from `src/parse/steps.ts`. +- "Is this statement allowed inside a catch/recover body?" is a validator concern after this refactor, not enforced by which mini-parser branches happen to fire. + +**Acceptance criteria** (each verified by a test): + +1. `src/parse/steps.ts` is at most 200 lines (down from 757), and contains no function whose name matches `/parse(Run)?(Catch|Recover|EnsureStep)/`. A grep/size test fails if either bound is violated. +2. `parseBlockStatement` is the single entry point for any statement appearing inside a catch or recover body. Add a test that introduces a new statement form (behind a test-only flag) and asserts it is accepted identically at top level and inside `catch (e) { … }` and `recover(e) { … }` without parser changes inside the catch/recover code path. +3. Every existing parse error message and location related to `catch` / `recover` (bindings missing, too many bindings, unterminated block, etc.) is preserved bit-for-bit. Snapshot test over `parse-*.test.ts` fixtures. +4. The full parser/validator/emitter golden corpus passes byte-for-byte: `npm test`, including `parse-steps.test.ts`, `parse-bare-call.test.ts`, `parse-run-async.test.ts`, `compiler-golden.test.ts`, `compiler-edge.acceptance.test.ts`. + +**Out of scope:** the wider tokenizer rewrite (next task) — this task explicitly stays on the line-walking parser, since the goal is incremental simplification. Validator changes beyond minor message preservation. + +**Dependency:** Refactor 3 (AST collapse) should be complete first so the unified parser emits `Expr` nodes directly. If it is not, this task may proceed but must avoid introducing new producers of the deprecated `managed:` sidecar. + +*** + +## Replace the line-by-line ad-hoc parser with a tokenizer + recursive-descent parser #dev-ready + +**Design reference:** `design/2026-05-15-parser-compiler-simplification.md` § Refactor 1. + +**Why:** The current parser walks `lines: string[]`, returns `{ step, nextIdx }` from every routine, and dispatches statements via a long cascade of `startsWith` + regex in `parseBlockStatement` (`src/parse/workflow-brace.ts:102-615`). Order matters — `"run async "` before `"run "`, etc. Quote/triple-quote/backtick/fence/brace state is re-implemented from scratch in at least seven independent scanners across `src/parse/`. Adding a new keyword or fixing a string-aware scanner means changes in multiple places. + +**Scope:** + +- Introduce a tokenizer (`src/parse/tokenize.ts` or similar) that owns *all* scanning state: identifiers, keywords, string literals (single + triple-quoted), backtick bodies, fenced code blocks, line comments, braces, parens, the send arrow `<-`, the match arm arrow `=>`, etc. +- Introduce a recursive-descent parser that consumes the token stream and dispatches via a `STATEMENT: Record` table. +- All ad-hoc scanners in `src/parse/` are deleted: `splitCatchStatements` (if still present), `splitStatementsOnSemicolons`, `matchSendOperator`, `hasUnquotedSendArrow`, `indexOfClosingDoubleQuote`, `stripQuotedArgContent`, `parseSendRhs`'s internal scanner, and any `inDoubleQuote` / `inTripleQuote` / `braceDepth` state machines outside the tokenizer. +- Surface syntax is unchanged. Error messages and error locations are preserved bit-for-bit where the existing tests assert them, and at minimum match in `code` + `line` + `col` everywhere else. +- Staging: it is acceptable (and recommended) to land the new parser behind a flag, run both parsers on the golden corpus in CI, diff their ASTs, and remove the old parser only once the diff is empty. + +**Acceptance criteria** (each verified by a test): + +1. `src/parse/` is at most 4,000 lines total (down from ~8,150), excluding test files. A CI check fails if exceeded. +2. The substrings `inDoubleQuote`, `inTripleQuote`, `braceDepth` appear only inside the tokenizer module. A grep test fails if any of those state-tracking idioms appear in other files under `src/parse/` or `src/transpile/`. +3. `parseBlockStatement` (or whatever the equivalent dispatcher is in the new parser) dispatches via a table, not a cascade. The size of any single function in `src/parse/` is bounded — no function exceeds 120 lines. A test computing function lengths fails if exceeded. +4. Every existing parse-error location and message asserted by `src/parse/parse-*.test.ts` matches verbatim. Add a snapshot test that re-emits `{ code, message, line, col }` for every error fixture and fails on any diff. +5. Adding a new top-level keyword (e.g. a synthetic `noop` for the test) requires changes in exactly two files (the tokenizer's keyword set + the `STATEMENT` table). A test introduces a synthetic keyword behind a flag and asserts it parses without touching any other file. +6. The full golden corpus passes byte-for-byte: `npm test`, including `compiler-golden.test.ts`, `compiler-edge.acceptance.test.ts`, all `parse-*.test.ts` files, and the formatter round-trip tests. +7. `npm run build` passes; TypeScript strict-mode errors are zero. + +**Out of scope:** adopting a parser generator (the grammar is small and the line-oriented language sensibility maps cleanly to a hand-written tokenizer). Surface syntax changes. Runtime / `runtime/` changes. + +**Dependency:** All previous tasks (Refactors 5, 3, 4, 2 plus all five appendix tasks) should be complete first so the new parser only has to target one AST shape and the validator does not need to special-case parser quirks during the transition. + +*** diff --git a/design/2026-05-15-parser-compiler-simplification.md b/design/2026-05-15-parser-compiler-simplification.md new file mode 100644 index 00000000..f2d2d09d --- /dev/null +++ b/design/2026-05-15-parser-compiler-simplification.md @@ -0,0 +1,347 @@ +# Parser & Compiler Simplification — design doc + +*Five refactors to compress `src/parse/` and `src/transpile/` by roughly a third, make the AST a clean sum type, and turn "add a new step or keyword" into a one-place change.* + +**Status:** design — ready for implementation +**Date (UTC):** 2026-05-15 + +--- + +## Problem + +The parser and compiler work, and the golden-test corpus (`src/transpile/compiler-golden.test.ts`, `src/transpile/compiler-edge.acceptance.test.ts`) pins their behavior tightly. But the code has accumulated: + +- Parallel cascades of `startsWith` + regex dispatch (`src/parse/workflow-brace.ts`, 615 lines). +- Seven independent copies of the same quote-aware scanner (`splitCatchStatements`, `splitStatementsOnSemicolons`, `matchSendOperator`, `hasUnquotedSendArrow`, `indexOfClosingDoubleQuote`, `stripQuotedArgContent`, the scanner inside `parseSendRhs`). +- Three near-identical 100+ line catch/recover parsers (`parseEnsureStep`, `parseRunCatchStep`, `parseRunRecoverStep` in `src/parse/steps.ts`) plus a mini parser (`parseCatchStatement`) that re-implements `parseBlockStatement`. +- An AST in which "managed call that yields a value" has **three different encodings** (`run_capture` const RHS; statement form; `managed:` sidecar on `return`/`log`/`logerr` with a placeholder `value: "__match__"` string). +- A 1,441-line `validate.ts` with two near-identical step walkers (`validateRuleStep`, `validateStep`) that each manually repeat the 5-check sequence (`validateNoShellRedirection` → `validateNestedManagedCallArgs` → `validateRef` → `validateArity` → `validateBareIdentifierArgs`) at ~6 sites per side. +- Three different traversal strategies for "the set of modules in this build": the validator recursively re-reads + re-parses imports via `ValidateContext` callbacks; `emitScriptsForModule` wraps the same callbacks with a `prep` cache; `buildScripts` walks the file system directly. + +None of this is broken. All of it makes the code expensive to change and easy to break in subtle ways (e.g. a fix to triple-quote-aware splitting has to be applied in 2–4 places, and divergence between them isn't always caught by the existing tests). + +The five refactors below address the structural issues, in the order I recommend implementing them. + +--- + +## Refactor 1 — Real tokenizer instead of line-walking + regex cascades + +**Touches:** `src/parser.ts`, `src/parse/workflow-brace.ts` (615 lines), `src/parse/steps.ts` (757 lines), `src/parse/statement-split.ts` (304 lines), `src/parse/core.ts` (scanner helpers). + +### Current shape + +The parser walks `lines: string[]` and every routine returns `{ step, nextIdx }`. Statement dispatch is a long cascade of `startsWith` + regex in `parseBlockStatement` (`src/parse/workflow-brace.ts:102-615`). Order matters — `"run async "` must be tested before `"run "`, `"prompt "` before bare assignment, etc. Adding a new keyword means finding the right slot in the cascade. + +Quote-aware string scanning is re-implemented from scratch in at least seven places (grep `inDoubleQuote`, `inTripleQuote`, `braceDepth` across `src/parse/`). Each copy has slightly different rules for escaping, triple-quotes, and brace nesting. + +```ts +// Today (src/parse/workflow-brace.ts): +if (inner.startsWith("run async ")) { /* 40 lines */ } +if (inner.startsWith("run ")) { /* 50 lines */ } +if (inner.startsWith("ensure ")) { ... } +if (inner.startsWith("log ")) { ... } +// ... 14 more branches +``` + +### Proposed shape + +A tokenizer that owns string/triple-quote/backtick/fence/comment/brace state, plus a recursive-descent parser that consumes a token stream and dispatches via table lookup. + +```ts +// Proposed: +const tokens = tokenize(source); // single source of truth for scanning +const ast = parseModule(tokens); // recursive descent + +const STATEMENT: Record = { + run: parseRunStatement, + ensure: parseEnsureStatement, + log: parseLogStatement, + // ... +}; +``` + +### Net effect + +- One canonical scanner instead of seven. +- A new statement form becomes a one-file change (add a row to `STATEMENT`). +- Expected reduction: **~1,500 lines** in `src/parse/`. + +### Constraints + +- Must pass the full existing golden test corpus byte-for-byte. +- Staged behind a flag (run both parsers, diff ASTs in CI) during transition is acceptable. + +--- + +## Refactor 2 — Unify `catch` / `recover` / inline-block parsing + +**Touches:** `src/parse/steps.ts` — `parseEnsureStep` (130 lines), `parseRunCatchStep` (110 lines), `parseRunRecoverStep` (110 lines), `parseCatchStatement` (280 lines). + +### Current shape + +Three near-identical 100+ line functions parse the same syntactic shape: + +``` + (binding) { body } | single-stmt +``` + +They differ in only two things: which host step they decorate (`ensure` vs `run`) and the literal keyword (`catch` vs `recover`). + +The body parser inside them, `parseCatchStatement` (`src/parse/steps.ts:89-389`), is itself a stripped-down copy of `parseBlockStatement`. The two diverge in subtle ways — e.g. `parseCatchStatement` handles return/fail/run/ensure/prompt/log via slightly different regexes than the main path. + +### Proposed shape + +```ts +function parseAttachedBlock( + keyword: "catch" | "recover", + host: WorkflowStepDef, +): { bindings: { failure: string }; body: WorkflowStepDef[] }; + +// Body parsed by the SAME parseStatement used at the top level. +``` + +### Net effect + +- One body parser instead of two. +- "Is this statement allowed inside a catch?" becomes a validator concern (Refactor 4), not something the parser enforces by what each mini-routine happens to recognize. +- Expected reduction: **~400 lines**. + +--- + +## Refactor 3 — One `Call` / `Expr` shape, not three "managed" encodings + +**Touches:** `src/types.ts` — `WorkflowStepDef` (14 variants), `ConstRhs` (6 kinds), `SendRhsDef` (5 kinds). + +### Current shape + +The same concept — "a managed call that yields a value" — is encoded three different ways depending on where it appears: + +```ts +// As a statement: +{ type: "run", workflow, args, ... } + +// As a const RHS: +{ kind: "run_capture", ref, args, ... } + +// As a return / log / logerr value: +{ + type: "return", + value: "__match__", // placeholder string for the formatter + managed: { kind: "match", match }, +} +``` + +The `return + managed` form is the worst offender. It stores placeholder strings (`"__match__"`, `"run inline_script"`, `"run foo(...)"`) so the formatter has something to print, while the real semantic payload lives in `managed`. Validator and emitter both have to know about the dual representation. Inline scripts add a fourth variant — `run_inline_script_capture` — that is yet another form of the same idea. + +### Proposed shape + +```ts +type Expr = + | { kind: "literal"; raw: string; tripleQuoted?: boolean } + | { kind: "var"; name: string; field?: string } + | { kind: "call"; callee: Ref; args: Arg[]; bareIdentifierArgs?: string[] } + | { kind: "ensure_call"; callee: Ref; args: Arg[]; bareIdentifierArgs?: string[] } + | { kind: "inline_script"; lang?: string; body: string; args?: string } + | { kind: "prompt"; body: Expr; returns?: Schema } + | { kind: "match"; subject: Expr; arms: MatchArm[] }; + +// Everywhere a value can appear, it is now an Expr: +type ConstRhs = Expr; +type SendRhs = Expr | ChannelArrow; +type ReturnStep = { type: "return"; value: Expr; loc: SourceLoc }; +type LogStep = { type: "log"; message: Expr; loc: SourceLoc }; +``` + +### Net effect + +- `WorkflowStepDef` drops from ~14 → ~7 variants. +- Validator's per-step duplication of "is there a managed call here?" disappears — one `validateExpr` recursion handles it. +- The placeholder-string + sidecar pattern goes away entirely. + +### Migration note + +This is a breaking AST change, but the on-disk surface syntax does not move. The hard-rewrite policy (per `QUEUE.md`) allows this. Golden tests must pass byte-for-byte against the emitted bash output; the AST shape they pin (if any) is internal and is allowed to change. + +--- + +## Refactor 4 — Validator as a visitor table, not a 1,441-line switch + +**Touches:** `src/transpile/validate.ts` (1,441 lines, one function). + +### Current shape + +`validateReferences` contains two near-identical inner functions — `validateRuleStep` (~250 lines) and `validateStep` (~350 lines) — each a big switch over step types. They differ in three things: + +1. Which step types are allowed (`prompt` / `send` are rejected in rules). +2. Which ref-expectation spec is used (`RULE_REF_EXPECT` vs `RUN_TARGET_REF_EXPECT`). +3. Whether the scope is workflow-wide or rule-wide. + +Each step type's validation is written twice with subtle differences. The 5-check sequence (`validateNoShellRedirection` → `validateNestedManagedCallArgs` → `validateRef` → `validateArity` → `validateBareIdentifierArgs`) is repeated by hand at 6+ sites per side, which means at least 12 places to keep in sync. + +### Proposed shape + +```ts +const VALIDATORS: Record = { + ensure: validateCallStep("ensure"), + run: validateCallStep("run"), + prompt: validatePrompt, + log: validateMessageStep("log"), + send: validateSend, + // ... +}; + +const SCOPE = { + workflow: { allow: ALL, refSpec: workflowRefs }, + rule: { allow: ALL.minus(["prompt","send"]), refSpec: ruleRefs }, +}; + +walk(ast, (step, ctx) => { + if (!ctx.scope.allow.has(step.type)) reject(step); + VALIDATORS[step.type](step, ctx); +}); +``` + +### Net effect + +- Each check (redirection, nested-managed, ref, arity, bare-args) is written once. +- "Is this step allowed here?" is a one-line set lookup, not three throw sites. +- Expected reduction: **~500–700 lines**. + +--- + +## Refactor 5 — Promote `CompilePrep` to a first-class `ModuleGraph` + +**Touches:** `src/transpile/compile-prep.ts`, `src/transpiler.ts`, `src/transpile/build.ts`, `src/transpile/validate.ts`. + +### Current shape + +The parser is intended to be pure (`source → AST`), but in practice the validator takes a `ValidateContext`: + +```ts +interface ValidateContext { + resolveImportPath: (fromFile, importPath, ws?) => string; + existsSync: (path) => boolean; + readFile: (path) => string; + parse: (content, filePath) => jaiphModule; + workspaceRoot?: string; +} +``` + +…so it can recursively read + re-parse imported modules. `emitScriptsForModule` then re-wraps those same callbacks with an optional `prep` cache. `buildScripts` walks the file system on its own. There are three different traversal strategies for "the set of modules in this build." + +`compile-prep` already proved the right model — pre-parse all reachable modules once, hand them to validator and emitter. It just isn't the only path. + +### Proposed shape + +```ts +// Pipeline: +const graph = loadModuleGraph(entry, workspaceRoot); // discover + parse-all +validate(graph); // pure, in-memory +emit(graph, outDir); // pure, in-memory + +// parsejaiph(source, file): jaiphModule — now I/O-pure. +// validate, emit never touch disk. +``` + +### Net effect + +- Parser becomes I/O-pure (easier to fuzz, easier to test). +- Validator drops its `ValidateContext` shape. +- Build, validate, and emit all read from one place. +- Same path serves single-file LSP edits (graph rooted at one file) and full compile (graph rooted at workspace root). +- Expected reduction: **~300 lines**. + +--- + +## Ordering rationale + +1. **Refactor 5 (ModuleGraph) first.** Mechanical, low-risk, unblocks the rest by making the parser pure. Existing acceptance tests pin behavior. +2. **Refactor 3 (Expr collapse) next.** Doing this before tokenizing means the new parser only has to target one expression shape. +3. **Refactor 4 (visitor-table validator).** With a simpler AST, this is straight refactoring against the golden corpus. +4. **Refactor 2 (unify catch/recover).** Cheap win, drops ~400 lines. +5. **Refactor 1 (tokenizer + RD parser) last.** Biggest change. Should sit on top of a cleaned-up AST and a pure pipeline so it can be staged behind a flag and run side-by-side with the old parser against the golden corpus. + +## Out of scope + +- **Parser generator.** The grammar is small and the line-oriented sensibility of the language (triple-quoted blocks, fence blocks, comments-on-their-own-line) maps cleanly to a hand-written tokenizer. +- **Surface syntax changes.** None of these refactors are user-visible. The golden test corpus pins behavior. +- **Runtime.** The bash emitter and `runtime/` stay put. + +--- + +## Appendix — Secondary improvements (A–E) + +The five refactors above are the load-bearing changes. The five below are smaller in scope but each addresses a real structural issue that the top 5 do not fully solve on their own. Where a secondary item is coupled to a top-5 refactor, the ordering rationale below makes the dependency explicit. + +### A — Split source-fidelity data from the semantic AST (CST / trivia layer) + +**Touches:** `src/types.ts`, plus every parser/formatter/validator/emitter consumer. + +`WorkflowStepDef` and `jaiphModule` today carry roughly ten fields that exist *only* so the formatter can round-trip: `leadingComments`, `configLeadingComments`, `trailingTopLevelComments`, `configBodySequence`, `topLevelOrder`, `bareSource`, the `tripleQuoted` flags on `literal`/`return`/`log`/`fail`/`send`/`const`, `bodyKind`, `bodyIdentifier`. Every consumer that does *not* care about formatting (validator, emitter) has to either ignore them or thread them through unchanged. + +**Proposed:** introduce a parallel `Trivia` map (keyed by node id) or a separate CST layer that owns the source-fidelity data. The semantic AST stops carrying it; formatter reads from `Trivia` alongside the AST. + +**Why it is appendix-only:** it changes most of the AST consumers, but the change is mechanical once the boundary is drawn. Biggest payoff if scheduled **before** Refactor 3, so the `Expr` shape is decided after the source-fidelity fields have been pulled out and the semantic core is visible. + +### B — Diagnostics collector instead of fail-fast error reporting + +**Touches:** `src/parse/core.ts` (`fail`), `src/errors.ts` (`jaiphError`), every call site in `src/parse/` and `src/transpile/`. + +Today `fail()` and `jaiphError()` both throw on the first error. A user fixes one error, recompiles, fixes the next, recompiles, etc. This is also the reason for some defensive ordering inside the validator — it tries to surface the "most useful" error first because it knows it will only get to surface one. + +**Proposed:** introduce a `Diagnostics` collector. Parser and validator append errors instead of throwing; the compile run reports the full set at the end (sorted by file/line). A "fatal" tier still exists for cases where continuing would produce garbage. + +**Why it is appendix-only:** almost zero marginal cost if done as part of Refactor 4 (visitor-table validator), since the new visitor already needs a unified entry/exit per step. Doing it standalone is also fine but touches more files. + +### C — Single-pass workflow walk + +**Touches:** `src/transpile/validate.ts`. + +The validator walks each workflow's step tree at least three times before its main check loop runs: `collectKnownVars`, `collectPromptSchemas`, `validateImmutableBindings`. Each walks the same nested step structure (if/for_lines/catch/recover) with subtly different recursion rules. Bug-fixes to "what counts as a binding here" land in 2–3 walkers. + +**Proposed:** one visitor that accumulates `{knownVars, promptSchemas, bindings}` as it descends, and the main per-step validator runs after (or during) that single descent. + +**Why it is appendix-only:** falls out naturally inside Refactor 4. Doing it separately is a fine ~50-line refactor. + +### D — Collapse `bareIdentifierArgs` into a typed `Arg[]` + +**Touches:** `src/types.ts`, `src/parse/core.ts` (`parseCallRef`), validator and emitter. + +Today every call-bearing node carries both `args: string` (raw text) and `bareIdentifierArgs: string[]` (a re-parse of which arguments happened to be bare identifiers). The validator must remember to check `bareIdentifierArgs` exists at each call site. The emitter has to do its own re-parse of `args` because it doesn't trust either field alone. + +**Proposed:** + +```ts +type Arg = + | { kind: "literal"; raw: string } + | { kind: "var"; name: string }; + +// Calls carry args: Arg[]. No second field. No re-parsing downstream. +``` + +**Why it is appendix-only:** can be done inside Refactor 3 (it is part of the same "single AST shape per concept" story) or as a standalone task. Standalone is cleaner if Refactor 3 is otherwise too large. + +### E — Decouple the validator from the runtime + +**Touches:** `src/transpile/validate.ts` (the `import { tripleQuotedRawForRuntime } from "../runtime/orchestration-text"` at the top), `src/runtime/orchestration-text.ts`. + +The validator imports a runtime helper (`tripleQuotedRawForRuntime`) so it can compute "what the runtime will see" when reporting errors. That is a one-way dependency from compile-time on runtime semantics. The right direction is the opposite: the parser/validator decides the canonical string, and the runtime consumes that decision. + +**Proposed:** move the canonicalization into a parser-side helper (e.g. `src/parse/triple-quote.ts:canonicalizeTripleQuotedString`). The runtime imports *that* instead of the validator importing a runtime function. + +**Why it is appendix-only:** small surface (one helper, ~30 lines), but it removes a layering inversion that will keep biting if the runtime grows more such helpers. + +### Ordering with the top 5 + +``` +1. Refactor 5 (ModuleGraph) +2. A (CST/trivia split) ← before Refactor 3 to settle AST shape +3. D (typed Arg[]) ← can fold into Refactor 3 if scoped slightly wider +4. Refactor 3 (Expr collapse) +5. C (single-pass workflow walk) ← prep for validator +6. B (Diagnostics collector) ← prep for validator +7. Refactor 4 (visitor-table validator) +8. E (decouple validator/runtime) +9. Refactor 2 (unify catch/recover) +10. Refactor 1 (tokenizer + RD parser) +``` From 12c7e5a5bbea703d2f087bab4b04d20f9ab0e990 Mon Sep 17 00:00:00 2001 From: Jakub Dzikowski Date: Sat, 16 May 2026 11:13:41 +0200 Subject: [PATCH 05/14] Refactor: promote CompilePrep to ModuleGraph with I/O-pure pipeline MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Replace three divergent module-discovery strategies (validator callbacks, transpiler re-parse paths, and the file-system walk in buildScripts) with a single ModuleGraph representation. Both validate(graph) and emit(graph, outDir) now operate entirely in-memory; ValidateContext and the optional prep cache are gone. parsejaiph is provably fs-free, enforced by a stub-fs test against the full fixture corpus. LSP edits and full compiles share the pipeline — only the graph root differs. --- CHANGELOG.md | 2 +- QUEUE.md | 28 --- README.md | 2 +- docs/architecture.md | 98 ++++---- docs/cli.md | 8 +- docs/contributing.md | 2 +- docs/grammar.md | 2 +- docs/language.md | 2 +- docs/libraries.md | 2 +- docs/testing.md | 4 +- src/cli/commands/compile.ts | 41 +-- src/cli/commands/run.ts | 18 +- src/cli/commands/test.ts | 12 +- src/runtime/kernel/graph.ts | 66 ++--- src/runtime/kernel/node-test-runner.test.ts | 9 +- src/runtime/kernel/node-test-runner.ts | 17 +- src/runtime/kernel/node-workflow-runner.ts | 8 +- src/transpile/build.ts | 84 ++++--- src/transpile/compile-prep.ts | 69 ------ src/transpile/emit-from-graph.ts | 38 +++ ...pile-prep.test.ts => module-graph.test.ts} | 112 ++++----- src/transpile/module-graph.ts | 118 +++++++++ src/transpile/pipeline-io-purity.test.ts | 233 ++++++++++++++++++ src/transpile/validate.ts | 47 ++-- src/transpiler.ts | 78 +++--- test-infra/compiler-test-runner.ts | 14 +- 26 files changed, 679 insertions(+), 435 deletions(-) delete mode 100644 src/transpile/compile-prep.ts create mode 100644 src/transpile/emit-from-graph.ts rename src/transpile/{compile-prep.test.ts => module-graph.test.ts} (59%) create mode 100644 src/transpile/module-graph.ts create mode 100644 src/transpile/pipeline-io-purity.test.ts diff --git a/CHANGELOG.md b/CHANGELOG.md index 4f8780d6..17e086ac 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,6 +1,6 @@ # Unreleased -- **Performance — `jaiph run` local single-parse compile prep:** The default local `jaiph run ` path no longer parses the entry module twice and no longer re-parses the full import closure inside the spawned `node-workflow-runner` child. A new `prepareCompile` (`src/transpile/compile-prep.ts`) walks the entry plus its transitive `.jh` imports exactly once and returns a `CompilePrep` record (`{ entryFile, workspaceRoot, astByFile }`). `src/cli/commands/run.ts` reuses the entry AST for `metadataToConfig` (no second parse for the banner), passes the prep into `buildScripts(..., prep)` so `emitScriptsForModule` skips per-file `readFileSync` + `parsejaiph`, and writes a deterministic JSON snapshot to `/.jaiph-compile-prep.json` via `writeCompilePrep`. The spawned runner reads it through the new internal env var `JAIPH_COMPILE_PREP_FILE` and forwards the deserialized prep to `buildRuntimeGraph(entry, workspaceRoot, prep)`, which now consumes the cached `Map` instead of re-walking the import closure on disk. `attachScriptImportStubs` is factored out of `graph.ts` and is idempotent across cached and uncached paths. The env var is set **only** for non-Docker host runs (when `JAIPH_DOCKER_ENABLED` is off); `jaiph run --raw`, `jaiph test`, and Docker launches do not set it and keep their existing parse paths. User-visible run semantics — banner, hooks, run artifacts, `run_summary.jsonl`, return values, exit codes, and `__JAIPH_EVENT__` streaming — are unchanged. New tests in `src/transpile/compile-prep.test.ts` corrupt every source file on disk after `prepareCompile`, then call `buildScripts` + `buildRuntimeGraph` to prove no second parse happens; they also cover cross-module workflow/rule/script resolution, a three-module closure, and the serialize → deserialize → graph round-trip used to cross the parent → child process boundary. Docs updated in `docs/architecture.md` and `docs/cli.md`. +- **Refactor — `ModuleGraph` is the single representation of "all `.jh` modules reachable from an entry point, parsed once":** The previous three traversal strategies for compile-time module discovery (validator re-reading imports through `ValidateContext`, `emitScriptsForModule` re-wrapping the same callbacks with an optional `prep` cache, and `buildScripts` walking the filesystem directly) collapse to one path. `parsejaiph(source, filePath)` is now strictly I/O-pure — it can no longer reach `fs`. The single discovery routine `loadModuleGraph(entry, workspaceRoot?)` (`src/transpile/module-graph.ts`) walks the entry plus its transitive `import` closure and returns `{ entryFile, workspaceRoot?, modules: Map }`; every other compile-time consumer takes the graph and never re-reads `.jh` from disk. `validateReferences(graph)` and `emitScriptsForModuleFromGraph(graph, file, rootDir)` operate entirely in-memory. The `ValidateContext` interface (`resolveImportPath` / `existsSync` / `readFile` / `parse` / `workspaceRoot` callbacks) is deleted from `src/transpile/validate.ts`; the validator consumes the graph and uses `existsSync` only to resolve `import script` paths (non-`.jh` bodies). `CompilePrep` / `prepareCompile` / `writeCompilePrep` / `readCompilePrep` and the optional `prep?` parameter on `emitScriptsForModule` / `buildScripts` are gone; `buildScripts(input, outDir, ws?)` now loads a graph internally and `buildScriptsFromGraph(graph, outDir, rootDir?)` is the entry point for callers that already loaded one. `buildRuntimeGraph` accepts either an entry file path (legacy) or an already-loaded `ModuleGraph` — `RuntimeGraph` is a type alias for `ModuleGraph` (the only "all reachable modules" representation in the codebase). The cross-process cache file moves to `/.jaiph-module-graph.json` (deterministic JSON: entries sorted by absolute path, ASTs included verbatim) via `writeModuleGraph` / `readModuleGraph`, and the internal env var the spawned `node-workflow-runner.js` reads is renamed `JAIPH_MODULE_GRAPH_FILE` (replacing `JAIPH_COMPILE_PREP_FILE`). Scope of the env-var hand-off is unchanged: set only for the default local non-Docker `jaiph run` path; `jaiph run --raw`, `jaiph test`, and Docker launches fall back to `loadModuleGraph` from the source file. User-visible contracts — banner, hooks, run artifacts, `run_summary.jsonl`, `return_value.txt`, exit codes, `__JAIPH_EVENT__` streaming, CLI usage, and the full golden corpus (`compiler-golden.test.ts`, `compiler-edge.acceptance.test.ts`) — are unchanged byte-for-byte. New tests (`src/transpile/module-graph.test.ts`, `src/transpile/pipeline-io-purity.test.ts`) stub `node:fs` to throw on any `.jh` read and run the full pipeline against `test-fixtures/` to pin the I/O-purity invariant; another test instruments `parsejaiph` with a call counter to assert no duplicate parses across `loadModuleGraph` → `validateReferences` → `emit` → `buildRuntimeGraph` for fixtures with transitive imports. `src/transpile/compile-prep.ts` and `compile-prep.test.ts` are removed. Docs updated in `docs/architecture.md`, `docs/cli.md`, and `docs/testing.md`. Implements `design/2026-05-15-parser-compiler-simplification.md` § Refactor 5. - **Performance — `jaiph install` parallelism:** Missing-library clones now run in parallel through a small bounded-concurrency executor (default 4 in flight), replacing the previous sequential `execSync` loop. The user contract is unchanged: warm-path libraries (target directory exists and `--force` is absent) still skip without invoking `git` for both explicit args and restore-from-lock; failed clones still exit non-zero and do not produce a lock entry; restore-from-lock still does not invent new lock entries. The default clone runner now uses `spawn("git", ["clone", "--depth", "1", …])` so multiple clones can overlap network and process latency. `runInstall` is now `async` and exposes injectable `CloneRunner` / `concurrency` options for testing. Tests cover concurrent overlap (peak in-flight ≥ 2), warm-path skipping for explicit args and restore, invalid-remote and unknown-ref failure paths, mixed success/failure lockfile bookkeeping, and the existing corrupt/missing-lockfile behavior. Docs updated in `docs/cli.md` and `docs/libraries.md`. # 0.9.4 diff --git a/QUEUE.md b/QUEUE.md index f7fd8fa9..4be2bce2 100644 --- a/QUEUE.md +++ b/QUEUE.md @@ -13,34 +13,6 @@ Process rules: *** -## Promote `CompilePrep` to a first-class `ModuleGraph` and make the parser I/O-pure #dev-ready - -**Design reference:** `design/2026-05-15-parser-compiler-simplification.md` § Refactor 5. - -**Why:** Three different traversal strategies exist for "the set of modules in this build" — the validator recursively re-reads + re-parses imports via `ValidateContext` callbacks (`src/transpile/validate.ts`), `emitScriptsForModule` (`src/transpiler.ts`) re-wraps the same callbacks with an optional `prep` cache, and `buildScripts` (`src/transpile/build.ts`) walks the file system directly. `compile-prep` already proved the right model — pre-parse all reachable modules once, hand them to validator and emitter — but it is an optimization, not the path. - -**Scope:** - -- Introduce `ModuleGraph` (generalization of `CompilePrep`) as the single representation of "all modules reachable from an entry point, parsed once." -- `parsejaiph(source, filePath)` must remain a pure function `(string, string) => jaiphModule`. No fs calls reachable from `parsejaiph`. -- `validate(graph)` and `emit(graph, outDir)` must operate entirely in-memory. The `ValidateContext` callback shape (`resolveImportPath`, `existsSync`, `readFile`, `parse`, `workspaceRoot`) is removed. -- A single discovery routine (`loadModuleGraph(entry, workspaceRoot?)`) replaces `collectTransitiveJhModules`, the cache-population logic in `compile-prep.ts`, and the bespoke re-parse paths inside `validateReferences` / `emitScriptsForModule`. -- The `prep?` optional parameter on `emitScriptsForModule` and `buildScripts` goes away; both take a `ModuleGraph`. -- LSP / single-file edits and full compiles must share the same pipeline — only the graph root differs. - -**Acceptance criteria** (each verified by a test that fails when violated): - -1. `parsejaiph` cannot reach `fs`. A unit test stubs `node:fs` to throw on any call and parses every fixture in `test-fixtures/` and `examples/`; all must succeed. -2. `validate(graph)` and `emit(graph, outDir)` cannot reach `fs` for source/AST reads (writing emitted scripts is allowed inside `emit`). A unit test stubs `fs.readFileSync`/`fs.existsSync` to throw on any `.jh` path and runs the full pipeline against `test-fixtures/`; all must succeed. -3. `ValidateContext` is deleted from `src/transpile/validate.ts`; `validateReferences` takes a `ModuleGraph` (or equivalent) only. -4. Each `.jh` source file in a compile is parsed exactly once. A test instruments `parsejaiph` with a call counter and asserts no duplicate parses across the full pipeline for at least one fixture with transitive imports. -5. `npm test` and `npm run build` pass. The full golden corpus (`src/transpile/compiler-golden.test.ts`, `src/transpile/compiler-edge.acceptance.test.ts`) passes byte-for-byte against emitted output. -6. The CLI entry points (`src/cli.ts`, `src/cli/`) and `e2e` tests pass unchanged from a user perspective. - -**Out of scope:** changes to the AST shape (Refactor 3), the validator switch structure (Refactor 4), the parser internals (Refactors 1 & 2), and any surface syntax. - -*** - ## Split source-fidelity data from the semantic AST into a Trivia / CST layer #dev-ready **Design reference:** `design/2026-05-15-parser-compiler-simplification.md` § Appendix A. diff --git a/README.md b/README.md index baeb4b2c..085c2706 100644 --- a/README.md +++ b/README.md @@ -31,7 +31,7 @@ - **Parser** (`src/parser.ts`, `src/parse/*`) — `.jh` / `.test.jh` → AST. - **Validator** (`src/transpile/validate.ts`) — imports and symbol references at compile time. - **Transpiler** (`src/transpile/*`) — emits atomic `script` files under `scripts/` only (no workflow-level shell). -- **Node workflow runtime** (`src/runtime/kernel/node-workflow-runtime.ts`, `graph.ts`) — interprets the AST; `buildRuntimeGraph()` is parse-only across imports. +- **Node workflow runtime** (`src/runtime/kernel/node-workflow-runtime.ts`, `graph.ts`) — interprets the AST; `buildRuntimeGraph(graph)` consumes the `ModuleGraph` produced by `loadModuleGraph` (no filesystem reads). - **Node test runner** (`src/runtime/kernel/node-test-runner.ts`) — `*.test.jh` blocks with mocks. - **JS kernel** (`src/runtime/kernel/`) — prompts, managed scripts, `__JAIPH_EVENT__`, inbox, mocks. Diagrams, runtime contracts, on-disk artifact layout, and distribution: **[Architecture](docs/architecture.md)**. Test layers and E2E policy: **[Contributing](docs/contributing.md)**. diff --git a/docs/architecture.md b/docs/architecture.md index 46ae80ef..d6f9a666 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -19,8 +19,8 @@ For **how to contribute** — branches, test layers, E2E assertion policy, and b Workflow authors write `.jh` / `.test.jh` modules. The toolchain turns those files into **validated** modules plus **extracted script files**, then the **same AST interpreter** runs workflows whether you use local `jaiph run`, Docker, or `jaiph test`. -1. Parse source into AST. For the default local `jaiph run ` path, the CLI walks the entry plus its transitive `.jh` import closure **once** through **`prepareCompile`** (`src/transpile/compile-prep.ts`) and reuses that **`CompilePrep`** for the banner (`metadataToConfig`), for **`buildScripts`** (script-body extraction), and — across the parent → child process boundary — for **`buildRuntimeGraph`** in the spawned runner (see [Local single-parse compile prep](#local-single-parse-compile-prep) and the sequence diagram below). Other paths (`jaiph run --raw`, Docker `jaiph run`, `jaiph test`, `jaiph compile`) keep their existing parser calls and re-read `.jh` sources on demand. -2. **Compile-time** validation (`validateReferences`, invoked from **`emitScriptsForModule`** / **`buildScripts()`**) runs before script extraction, not inside `buildRuntimeGraph()` (the graph loader only parses modules and follows imports). The **`jaiph compile`** command walks the same import closure but runs **`validateReferences` only**: it parses each reachable module on disk and **does not** emit **`scripts/`** (no **`buildScriptFiles`** / **`buildScripts`**), **does not** invoke **`buildRuntimeGraph()`**, and never spawns the workflow runner (`src/cli/commands/compile.ts`). For a **directory** argument it discovers `*.jh` via `walkjhFiles`, which **skips** `*.test.jh`; to validate a test module, pass that file explicitly. Imported modules in the closure are still validated recursively either way. +1. Parse source into AST. Every CLI path walks the entry plus its transitive `.jh` import closure **once** through **`loadModuleGraph`** (`src/transpile/module-graph.ts`) and reuses that **`ModuleGraph`** for the banner (`metadataToConfig`), validation (**`validateReferences(graph)`**), script-body extraction (**`buildScriptsFromGraph`**), and — across the parent → child process boundary on the default local `jaiph run` — for **`buildRuntimeGraph(graph)`** in the spawned runner (see [Local module graph](#local-module-graph) and the sequence diagram below). `parsejaiph(source, filePath)` is I/O-pure; `validate` and `emit` operate entirely on the in-memory graph and never re-read `.jh` files. The only fs entry point that reads `.jh` sources is `loadModuleGraph`. +2. **Compile-time** validation (`validateReferences(graph)`, invoked from **`emitScriptsForModuleFromGraph`** / **`buildScriptsFromGraph()`**) runs before script extraction. The validator consumes the in-memory graph; imported ASTs are looked up by absolute path and never re-read from disk. The **`jaiph compile`** command walks the same import closure but runs **`validateReferences` only**: it builds a graph per entry, validates it, and **does not** emit **`scripts/`**, **does not** invoke **`buildRuntimeGraph()`**, and never spawns the workflow runner (`src/cli/commands/compile.ts`). For a **directory** argument it discovers `*.jh` via `walkjhFiles`, which **skips** `*.test.jh`; to validate a test module, pass that file explicitly. Imported modules in the closure are still validated recursively either way. 3. **CLI** (`dist/src/cli.js` via npm, or a **Bun-compiled** `dist/jaiph` binary) prepares script executables (scripts-only), then spawns a **detached child** that loads **`node-workflow-runner.js`**. That child calls `buildRuntimeGraph()` and runs **`NodeWorkflowRuntime`**. The child’s interpreter is **`process.execPath`** of the CLI process (Node when you run `node dist/src/cli.js`, the standalone Bun binary when you run `dist/jaiph`). Script steps execute as managed subprocesses; prompt, inbox I/O, and event/summary emission are handled by the kernel under `src/runtime/kernel/`. 4. Stream live events to the CLI and persist durable run artifacts. @@ -46,8 +46,8 @@ All orchestration — local `jaiph run`, `jaiph test`, and **Docker `jaiph run`* - Resolves imports and symbol references; emits deterministic compile-time errors. Import resolution (`resolveImportPath` in `transpile/resolve.ts`) checks relative paths first, then falls back to project-scoped libraries under `/.jaiph/libs/` — the workspace root is threaded through all compilation call sites. Export visibility is enforced by `validateRef` in `validate-ref-resolution.ts`: if an imported module declares any `export`, only exported names are reachable through the import alias. - **Transpiler (`src/transpiler.ts`, `src/transpile/*`)** - - **`emitScriptsForModule`** parses, runs **`validateReferences`**, and **`buildScriptFiles`** — the only compile path for `jaiph run` / `jaiph test` — **persists only atomic `script` files** under `scripts/`. **`buildScripts()`** can also take a **directory** of non-test `*.jh` modules (`src/transpile/build.ts` uses `walkjhFiles`); the **`jaiph run`** and **`jaiph test`** commands always pass a **single entry file** (`.jh` or `*.test.jh`). Inline scripts (`` run `body`(args) ``) are also emitted as `scripts/__inline_` with deterministic hash-based names (`inlineScriptName` in `src/inline-script-name.ts`). There is no workflow-level bash emission. - - Both **`buildScripts()`** and **`emitScriptsForModule`** accept an optional **`CompilePrep`** parameter. When supplied, the transitive-module list comes from the pre-parsed cache instead of re-walking the import closure, and `validateReferences` reads its `readFile` / `parse` callbacks against that same cache so each reachable module is parsed exactly once per `jaiph run` (see [Local single-parse compile prep](#local-single-parse-compile-prep)). + - **`emitScriptsForModuleFromGraph`** validates one module against the graph and runs **`buildScriptFiles`** — the only compile path for `jaiph run` / `jaiph test` — **persists only atomic `script` files** under `scripts/`. **`buildScripts(input, outDir, ws?)`** is the path-based wrapper used by tests and the directory walk; it loads a `ModuleGraph` and delegates. **`buildScriptsFromGraph(graph, outDir)`** is the graph-based entry point used by `jaiph run` / `jaiph test`, which already loaded the graph. Inline scripts (`` run `body`(args) ``) are also emitted as `scripts/__inline_` with deterministic hash-based names (`inlineScriptName` in `src/inline-script-name.ts`). There is no workflow-level bash emission. + - The pipeline contract is `loadModuleGraph` → `validateReferences(graph)` → `emit(graph, outDir)`. `parsejaiph` is I/O-pure; `validate` and `emit` never touch `.jh` on disk. Each reachable module is parsed exactly once per `jaiph run` (see [Local module graph](#local-module-graph)). - **Node Workflow Runtime (`src/runtime/kernel/node-workflow-runtime.ts`)** - `NodeWorkflowRuntime` interprets the AST directly: walks workflow steps, manages scope/variables, delegates prompt and script execution to kernel helpers, handles channels/inbox/dispatch, owns the frame stack and heartbeat, and writes run artifacts. @@ -55,7 +55,7 @@ All orchestration — local `jaiph run`, `jaiph test`, and **Docker `jaiph run`* - **`runtime-arg-parser.ts`** — stateless interpolation and call-argument parsing (`interpolate`, `parseInlineCaptureCall`, `commaArgsToInterpolated`, `parseArgsRaw`, `parseInlineScriptAt`, `parseManagedArgAt`, `parseArgTokens`, `stripOuterQuotes`, `parsePromptSchema`, `sanitizeName`, `nowIso`) plus shared constants and the `ParsedArgToken` / `PromptSchemaField` types. Direct unit tests live in `runtime-arg-parser.test.ts`. - **`runtime-event-emitter.ts`** — `RuntimeEventEmitter` owns **`__JAIPH_EVENT__`** writes on stderr (step/log traffic when not suppressed), **`run_summary.jsonl`** appends for the wider timeline (including workflow/prompt records that are summary-first), plus step/prompt sequence counters. Constructed with `{ runId, runDir, env, getFrameStack, getAsyncIndices, suppressLiveEvents? }`; the runtime delegates structured emission to it. The optional `suppressLiveEvents` flag (forwarded from `NodeWorkflowRuntime`'s `suppressLiveEvents` option) skips the live stderr **`__JAIPH_EVENT__`** lines while **`appendRunSummaryLine`** keeps updating **`run_summary.jsonl`** — used by in-process callers like the test runner that share stderr with `node --test` reporter output. The CLI's spawned `node-workflow-runner` child does not set it, so production runs stream events to stderr as before. - **`runtime-mock.ts`** — `executeMockBodyDef` and `executeMockShellBody` for `*.test.jh` workflow/rule/script mocks. Shell-kind mocks run `bash -c`; steps-kind mocks dispatch back into the runtime via an `executeStepsBack` callback so the body runs against the full step interpreter. - - `buildRuntimeGraph()` (`graph.ts`) loads reachable modules with **`parsejaiph` only** (import closure); it does **not** run `validateReferences`. Cross-module refs are resolved from that graph at runtime. For **`script import`** declarations, `buildRuntimeGraph()` injects synthetic `ScriptDef` stubs (`graph.ts`) so reference resolution matches the validated compile path without re-reading external script bodies at graph-build time. The function also accepts an optional **`CompilePrep`**: when supplied, every reachable module is taken from the cache and no `.jh` file is read from disk in the runner. The stub-injection helper (`attachScriptImportStubs`) is idempotent so cached and uncached paths produce the same node shape. + - `buildRuntimeGraph()` (`graph.ts`) accepts either an entry file path (legacy) or an already-loaded `ModuleGraph` and returns the runtime-ready view by injecting `ScriptDef` stubs for **`script import`** declarations so reference resolution matches the validated compile path without re-reading external script bodies. Cross-module refs are resolved from that graph at runtime. `RuntimeGraph` is a type alias for `ModuleGraph` — there is one canonical "all reachable modules" representation. The stub-injection helper (`attachScriptImportStubs`) is idempotent. - **Node Test Runner (`src/runtime/kernel/node-test-runner.ts`)** - Executes `*.test.jh` test blocks using `NodeWorkflowRuntime` with mock support (mock prompts, mock workflow/rule/script bodies). Pure Node harness — no Bash test transpilation. @@ -70,18 +70,18 @@ All orchestration — local `jaiph run`, `jaiph test`, and **Docker `jaiph run`* - Parses mount specs, resolves Docker config (image, network, timeout), and builds the `docker run` invocation when the CLI enables **Docker sandboxing** for `jaiph run` (environment-driven; there is no `jaiph run --docker` flag — see [Sandboxing](sandboxing.md)). The container runs the same `node-workflow-runner` entry as local execution. The default image is the official `ghcr.io/jaiphlang/jaiph-runtime` GHCR image; every selected image must already contain `jaiph` (no auto-install or derived-image build at runtime). Image preparation (`prepareImage`) runs before the CLI banner: it checks whether the image is local, pulls with `--quiet` if needed (short status lines on stderr instead of Docker’s default pull UI), and verifies that `jaiph` exists in the image. `spawnDockerProcess` does not pull or verify — it receives a pre-resolved image. The spawn call uses `stdio: ["ignore", "pipe", "pipe"]` — stdin is ignored so the Docker CLI does not block on stdin EOF, which would stall event streaming and hang the host CLI after the container exits. - **Workspace immutability:** Docker runs cannot modify the host workspace. The host checkout is mounted read-only; `/jaiph/workspace` is a sandbox-local copy-on-write overlay discarded on exit. The only host-writable path is `/jaiph/run` (run artifacts). Workflows that need to capture workspace changes should write files (for example a `git diff` into a temp path) and publish them with `artifacts.save()`. See [Sandboxing](sandboxing.md) for the full contract and [Libraries — `jaiphlang/artifacts`](libraries.md#jaiphlangartifacts--publishing-files-out-of-the-sandbox). -## Local single-parse compile prep -{: #local-single-parse-compile-prep} +## Local module graph +{: #local-module-graph} -The default local `jaiph run ` path uses one shared module-graph representation across the parent CLI → child runner boundary so each reachable `.jh` is parsed exactly **once** per run. +The toolchain has one canonical representation — **`ModuleGraph`** — for "all `.jh` modules reachable from an entry point, parsed once." The same graph is used by the validator, the script emitter, and the runtime; on the default local `jaiph run` path it also crosses the parent CLI → child runner boundary so each reachable `.jh` is parsed exactly **once** per run. -- **`prepareCompile(entryFile, workspaceRoot)`** (`src/transpile/compile-prep.ts`) walks the entry plus its transitive `import` edges through `resolveImportPath` and returns a **`CompilePrep`** record: `{ entryFile, workspaceRoot, astByFile: Map }`. **`jaiphlang/`** library imports resolve through the same workspace fallback as the rest of the toolchain. -- **`src/cli/commands/run.ts`** calls `prepareCompile` once after path normalization. The entry AST is reused for **`metadataToConfig(mod.metadata)`** (banner / `runtime` config) — no separate `parsejaiph(readFileSync(...))` for metadata. The same prep is passed to **`buildScripts(input, outDir, workspaceRoot, prep)`** so `emitScriptsForModule` skips `readFileSync` + `parsejaiph` per module; `validateReferences` runs against the cached AST via injected `readFile` / `parse` callbacks. -- **Process boundary.** The CLI serializes the prep with **`writeCompilePrep`** to **`/.jaiph-compile-prep.json`** (deterministic JSON: entries sorted by absolute path; ASTs included verbatim). It points the spawned **`node-workflow-runner.js`** at the file through the internal env var **`JAIPH_COMPILE_PREP_FILE`**. The runner reads it back with **`readCompilePrep`** and passes the result to **`buildRuntimeGraph(entry, workspaceRoot, prep)`**, which builds the `RuntimeGraph` from the cached `Map` instead of re-reading `.jh` files. Cross-module workflow / rule / script resolution and `script import` stub injection match the on-disk parse path. -- **Scope of the optimization.** `JAIPH_COMPILE_PREP_FILE` is set **only** when the host CLI spawns the local **`node-workflow-runner.js`** child with Docker sandboxing disabled (`dockerConfigForBanner.enabled === false`). It is **not** set on these paths, which keep their existing parse calls: - - **`jaiph run --raw`** — `runWorkflowRaw` (`src/cli/commands/run.ts`) calls `parsejaiph` / `buildScripts` directly without a prep cache; the runner uses inherited stdio and never reads this env var. - - **Docker `jaiph run`** — the host writes the prep file under `outDir`, but skips the env var because the inner container command is `jaiph run --raw …` and the host bind-mount layout does not plumb the cache file inside the container. - - **`jaiph test`** — `runTestFile` keeps its own one-time `buildRuntimeGraph(testFileAbs)` per test file (see [Test runner integration](#test-runner-integration-testjh-in-the-kernel)). +- **`loadModuleGraph(entryFile, workspaceRoot?)`** (`src/transpile/module-graph.ts`) walks the entry plus its transitive `import` edges through `resolveImportPath` and returns `{ entryFile, workspaceRoot?, modules: Map }> }`. **`jaiphlang/`** library imports resolve through the same workspace fallback as the rest of the toolchain. This is the **only** routine that reads `.jh` sources from disk; `parsejaiph(source, filePath)` itself is I/O-pure. +- **`src/cli/commands/run.ts`** calls `loadModuleGraph` once after path normalization. The entry AST is reused for **`metadataToConfig(mod.metadata)`** (banner / `runtime` config). The same graph is passed to **`buildScriptsFromGraph(graph, outDir)`**, which calls `emitScriptsForModuleFromGraph` per reachable module; `validateReferences(graph)` runs against the in-memory ASTs. +- **Process boundary.** The CLI serializes the graph with **`writeModuleGraph`** to **`/.jaiph-module-graph.json`** (deterministic JSON: entries sorted by absolute path; ASTs included verbatim). It points the spawned **`node-workflow-runner.js`** at the file through the internal env var **`JAIPH_MODULE_GRAPH_FILE`**. The runner reads it back with **`readModuleGraph`** and passes the result to **`buildRuntimeGraph(graph)`**, which produces the runtime view (with `script import` stub injection) without touching disk. Cross-module workflow / rule / script resolution matches the on-disk load path. +- **Scope of the env-var hand-off.** `JAIPH_MODULE_GRAPH_FILE` is set **only** when the host CLI spawns the local **`node-workflow-runner.js`** child with Docker sandboxing disabled (`dockerConfigForBanner.enabled === false`). It is **not** set on these paths, which load the graph from disk inside the runner instead: + - **`jaiph run --raw`** — `runWorkflowRaw` (`src/cli/commands/run.ts`) calls `buildScripts` directly without writing the graph file; the runner uses inherited stdio and falls back to `loadModuleGraph` from the source file. + - **Docker `jaiph run`** — the host writes the graph file under `outDir`, but skips the env var because the inner container command is `jaiph run --raw …` and the host bind-mount layout does not plumb the cache file inside the container. + - **`jaiph test`** — `runSingleTestFile` builds the graph in `src/cli/commands/test.ts` and threads it through `runTestFile(graph, ...)` directly (no env var needed; same process). When the env var is absent the runner falls back to the disk-walk parse path, preserving prior behavior. @@ -137,9 +137,9 @@ Channels are validated at compile time (`validateReferences` / send RHS rules) a ## Test runner integration (`*.test.jh` in the kernel) -**How** `jaiph test` wires into the same stack as `jaiph run`: `*.test.jh` files are parsed in the CLI; `runTestFile()` drives blocks in-process. **`buildRuntimeGraph(testFile)`** is called **once per `runTestFile` invocation** and the resulting graph is reused across all blocks and `test_run_workflow` steps (the import closure is constant for a given test file within a single process run). Each `test_run_workflow` step resolves mocks against that cached graph, then constructs `NodeWorkflowRuntime` with `mockBodies` / mock prompt env, passing **`suppressLiveEvents: true`** so **`RuntimeEventEmitter`** skips writing **`__JAIPH_EVENT__`** lines to **stderr** while still appending **`run_summary.jsonl`** for that run. Without this flag, every workflow event would print to the test process's stderr and swamp `node --test` reporter output. Mock prompts, workflows, rules, and scripts are supported through the runtime's mock infrastructure. +**How** `jaiph test` wires into the same stack as `jaiph run`: `runSingleTestFile` (`src/cli/commands/test.ts`) calls `loadModuleGraph(testFileAbs, workspaceRoot)` once, then threads the resulting `ModuleGraph` through `buildScriptsFromGraph(graph, tmpDir)` and `runTestFile(graph, …)`. `runTestFile` calls `buildRuntimeGraph(graph)` once per file and the runtime view is reused across all blocks and `test_run_workflow` steps (the import closure is constant for a given test file within a single process run). Each `test_run_workflow` step resolves mocks against that runtime view, then constructs `NodeWorkflowRuntime` with `mockBodies` / mock prompt env, passing **`suppressLiveEvents: true`** so **`RuntimeEventEmitter`** skips writing **`__JAIPH_EVENT__`** lines to **stderr** while still appending **`run_summary.jsonl`** for that run. Without this flag, every workflow event would print to the test process's stderr and swamp `node --test` reporter output. Mock prompts, workflows, rules, and scripts are supported through the runtime's mock infrastructure. -Before that, the CLI prepares script executables via **`buildScripts(testFileAbs, tmpDir, workspaceRoot)`** — the same **`buildScripts`** helper as `jaiph run`, with the **test file as the entrypoint**. That walks the test module and its **import closure** (transitive `import` edges), runs **`validateReferences`** / **`emitScriptsForModule`** per reachable file, and writes `scripts/` so imported workflows have paths under `JAIPH_SCRIPTS`. Unrelated `*.jh` files elsewhere in the repo are not compiled unless imported. +The `buildScriptsFromGraph` call writes `scripts/` so imported workflows have paths under `JAIPH_SCRIPTS`. Unrelated `*.jh` files elsewhere in the repo are not compiled unless imported. Authoring rules, fixtures, and mock syntax for `*.test.jh` are documented in [Testing](testing.md), not here. @@ -158,28 +158,27 @@ The progress UI combines a **static** step tree derived from the workflow AST (` flowchart TD U[User / CI] --> CLI[CLI: Node or Bun jaiph] - subgraph Transpile["Per-module: emitScriptsForModule()"] - PARSE[parsejaiph] + subgraph Transpile["Per-module: emitScriptsForModuleFromGraph()"] VAL[validateReferences] EMIT[Emit atomic script files under scripts/] - PARSE --> VAL VAL -->|compile errors| ERR[Deterministic compile errors] VAL --> EMIT end - CLI -->|jaiph run| CP1[prepareCompile entry + closure] - CP1 --> BS1[buildScripts prep] + CLI -->|jaiph run| LMG1[loadModuleGraph entry + closure] + LMG1 --> BS1[buildScriptsFromGraph] BS1 --> Transpile - CLI -->|jaiph test| BS2[buildScripts(entry .test.jh)] + CLI -->|jaiph test| LMG2[loadModuleGraph(entry .test.jh)] + LMG2 --> BS2[buildScriptsFromGraph] BS2 --> Transpile - BS2 --> TR[Node Test Runner in-process] + LMG2 --> TR[Node Test Runner in-process] Transpile -->|jaiph run local| RW[Node workflow runner child] Transpile -->|jaiph run Docker| DC[Container runs node-workflow-runner] - CP1 -. JAIPH_COMPILE_PREP_FILE (local non-Docker only) .-> RW + LMG1 -. JAIPH_MODULE_GRAPH_FILE (local non-Docker only) .-> RW - RW --> G[buildRuntimeGraph parse-only or cached prep] + RW --> G[buildRuntimeGraph from graph] G --> GRAPH[RuntimeGraph] RW --> RT[NodeWorkflowRuntime] RT --> GRAPH @@ -213,26 +212,26 @@ Interactive **`jaiph run`** (no **`--raw`**): banner, progress tree, hooks, and sequenceDiagram participant User participant CLI as CLI jaiph run - participant CP as prepareCompile - participant Prep as buildScripts(prep) - participant TF as emitScriptsForModule per module + participant Load as loadModuleGraph + participant Prep as buildScriptsFromGraph + participant TF as emitScriptsForModuleFromGraph per module participant Runner as node-workflow-runner - participant Graph as buildRuntimeGraph(prep) + participant Graph as buildRuntimeGraph(graph) participant Runtime as NodeWorkflowRuntime participant Kernel as JS kernel participant Report as Artifacts (.jaiph/runs) User->>CLI: jaiph run main.jh args... - CLI->>CP: prepareCompile(entry, workspace) - CP-->>CLI: CompilePrep (astByFile) + CLI->>Load: loadModuleGraph(entry, workspace) + Load-->>CLI: ModuleGraph (modules map) Note over CLI: reuse entry AST for metadataToConfig / banner - CLI->>Prep: buildScripts(input, outDir, workspace, prep) - Prep->>TF: loop: validateReferences + emit (cached AST) + CLI->>Prep: buildScriptsFromGraph(graph, outDir) + Prep->>TF: loop: validateModule + emit (in-memory AST) TF-->>Prep: scripts/ atomic only Prep-->>CLI: scriptsDir + env JAIPH_SCRIPTS alt local (non-Docker) - CLI->>CLI: writeCompilePrep(/.jaiph-compile-prep.json) - Note over CLI: set JAIPH_COMPILE_PREP_FILE on child env + CLI->>CLI: writeModuleGraph(/.jaiph-module-graph.json) + Note over CLI: set JAIPH_MODULE_GRAPH_FILE on child env CLI->>Runner: spawn detached node-workflow-runner else Docker CLI->>CLI: prepareImage (pull --quiet + verify jaiph) @@ -240,12 +239,13 @@ sequenceDiagram CLI->>Runner: spawn container running node-workflow-runner Note over CLI: CLI parses events on stderr only end - alt JAIPH_COMPILE_PREP_FILE set (local non-Docker) - Runner->>Runner: readCompilePrep(file) - Runner->>Graph: buildRuntimeGraph(sourceAbs, workspace, prep) + alt JAIPH_MODULE_GRAPH_FILE set (local non-Docker) + Runner->>Runner: readModuleGraph(file) + Runner->>Graph: buildRuntimeGraph(graph) Note over Graph: no .jh re-reads else absent (Docker / --raw / test runner) - Runner->>Graph: buildRuntimeGraph(sourceAbs) parse-only + Runner->>Runner: loadModuleGraph(sourceAbs, workspace) + Runner->>Graph: buildRuntimeGraph(graph) end Graph-->>Runner: RuntimeGraph Runner->>Runtime: runDefault(run args) @@ -265,20 +265,20 @@ sequenceDiagram sequenceDiagram participant User participant CLI as CLI jaiph test - participant Parser as parsejaiph - participant Prep as buildScripts(test file) + participant Load as loadModuleGraph + participant Prep as buildScriptsFromGraph participant TestRunner as runTestFile / runTestBlock - participant Graph as buildRuntimeGraph + participant Graph as buildRuntimeGraph(graph) participant Runtime as NodeWorkflowRuntime participant Report as Artifacts User->>CLI: jaiph test flow.test.jh - CLI->>Parser: parse test file - Parser-->>CLI: jaiphModule + tests[] blocks - CLI->>Prep: buildScripts(test path, tmp) import closure + CLI->>Load: loadModuleGraph(test file, workspace) + Load-->>CLI: ModuleGraph (entry + import closure) + CLI->>Prep: buildScriptsFromGraph(graph, tmp) Prep-->>CLI: scriptsDir - CLI->>TestRunner: runTestFile(test path workspace scriptsDir blocks) - TestRunner->>Graph: buildRuntimeGraph(test file) once per file + CLI->>TestRunner: runTestFile(graph, workspace, scriptsDir, blocks) + TestRunner->>Graph: buildRuntimeGraph(graph) once per file Graph-->>TestRunner: RuntimeGraph cached loop each test block TestRunner->>TestRunner: mocks / shell steps / expectations @@ -295,7 +295,7 @@ sequenceDiagram ## Summary -- `.jh` / `*.test.jh` share parser/AST; **compile-time** validation runs in **`emitScriptsForModule`** during **`buildScripts`**. **`buildRuntimeGraph`** loads modules with **parse-only** imports — or, on the default local **`jaiph run`** path, from a shared **`CompilePrep`** the parent CLI built with **`prepareCompile`** and handed across the process boundary through **`JAIPH_COMPILE_PREP_FILE`** (see [Local single-parse compile prep](#local-single-parse-compile-prep)). +- `.jh` / `*.test.jh` share parser/AST. The pipeline is **`loadModuleGraph` → `validateReferences(graph)` → `emit(graph, outDir)`**; `parsejaiph` is I/O-pure and `validate` / `emit` operate entirely in-memory. **`buildRuntimeGraph`** consumes the same `ModuleGraph` (loaded in the runner from disk or — on the default local **`jaiph run`** path — deserialized from the parent CLI's graph file via **`JAIPH_MODULE_GRAPH_FILE`**; see [Local module graph](#local-module-graph)). - **`jaiph compile`** walks import closures with **`validateReferences` only**, and exits — no **`scripts/`** emission (**no **`buildScriptFiles`** / **`buildScripts`**), no **`buildRuntimeGraph()`**, no runner spawn. Directory discovery omits **`*.test.jh`** unless you pass a test file explicitly. - **Node-only runtime:** all execution — local `jaiph run`, Docker `jaiph run`, and `jaiph test` — goes through `NodeWorkflowRuntime`. Docker containers run `node-workflow-runner` with the compiled JS tree and scripts mounted, using the same semantics as local execution. - **CLI** owns launch, observation, hooks (except **`jaiph run --raw`**), and runtime preparation (`buildScripts`). **`jaiph run --raw`** still emits **`__JAIPH_EVENT__`** on stderr from the runtime; the CLI does not attach the interactive progress/hooks pipeline. **`jaiph test`** passes **`suppressLiveEvents: true`** into **`NodeWorkflowRuntime`** so **`RuntimeEventEmitter`** skips writing those live stderr lines while **`run_summary.jsonl`** still records workflow traffic where the emitter appends it. diff --git a/docs/cli.md b/docs/cli.md index e0898212..658f8d8b 100644 --- a/docs/cli.md +++ b/docs/cli.md @@ -94,11 +94,11 @@ If a `.jh` file is executable and has `#!/usr/bin/env jaiph`, you can run it dir ### Compile-time and process model -The default `jaiph run` path parses each reachable `.jh` **once**. The CLI calls **`prepareCompile`** (`src/transpile/compile-prep.ts`) to walk the entry plus its transitive `import` closure, producing a **`CompilePrep`** record (`{ entryFile, workspaceRoot, astByFile }`). The entry AST is reused for the banner (`metadataToConfig`), and the same prep is passed to **`buildScripts(input, outDir, workspaceRoot, prep)`** so `emitScriptsForModule` runs `validateReferences` and writes atomic `script` files **without** re-reading or re-parsing any module. Unrelated `.jh` files on disk are not read. +The default `jaiph run` path parses each reachable `.jh` **once**. The CLI calls **`loadModuleGraph`** (`src/transpile/module-graph.ts`) to walk the entry plus its transitive `import` closure, producing a **`ModuleGraph`** record (`{ entryFile, workspaceRoot?, modules: Map }`). `parsejaiph(source, filePath)` is itself I/O-pure — `loadModuleGraph` is the only routine that reads `.jh` sources from disk. The entry AST is reused for the banner (`metadataToConfig`), and the same graph is passed to **`buildScriptsFromGraph(graph, outDir)`**, which calls `emitScriptsForModuleFromGraph` per reachable module and writes atomic `script` files. `validateReferences(graph)` runs against the in-memory ASTs — neither validation nor emission re-reads `.jh` files. Unrelated `.jh` files on disk are not read. -After validation, the CLI spawns the Node workflow runner as a detached child. For local (non-Docker) runs the CLI serializes the prep to `/.jaiph-compile-prep.json` with `writeCompilePrep` and points the child at it through the internal env var **`JAIPH_COMPILE_PREP_FILE`**. The runner deserializes the file and passes the cached `CompilePrep` to `buildRuntimeGraph(sourceFile, workspaceRoot, prep)`, which builds the `RuntimeGraph` from the cached `Map` instead of re-reading `.jh` sources. When the env var is absent — Docker `jaiph run`, `jaiph run --raw`, or any other caller — the runner falls back to the on-disk parse path (`buildRuntimeGraph` reads each module via `parsejaiph`). Either path runs `NodeWorkflowRuntime` with the same `RuntimeGraph` shape — `buildRuntimeGraph` still does **not** run `validateReferences`. Prompt steps, script subprocesses, inbox dispatch, and event emission are handled in the runtime kernel — workflows and rules are interpreted in-process; only `script` steps spawn a managed shell. The CLI listens on stderr for `__JAIPH_EVENT__` JSON lines, the single event channel for all execution modes. Stdout carries only plain script output, forwarded to the terminal as-is. +After validation, the CLI spawns the Node workflow runner as a detached child. For local (non-Docker) runs the CLI serializes the graph to `/.jaiph-module-graph.json` with `writeModuleGraph` (deterministic JSON: entries sorted by absolute path, ASTs included verbatim) and points the child at it through the internal env var **`JAIPH_MODULE_GRAPH_FILE`**. The runner deserializes the file with `readModuleGraph` and passes the result to `buildRuntimeGraph(graph)`, which produces the `RuntimeGraph` (a type alias for `ModuleGraph`) by injecting `ScriptDef` stubs for `import script` declarations — without touching disk. When the env var is absent — Docker `jaiph run`, `jaiph run --raw`, `jaiph test`, or any other caller — the runner falls back to `loadModuleGraph(sourceFile, workspaceRoot)` on the source file. Either path runs `NodeWorkflowRuntime` with the same `RuntimeGraph` shape — `buildRuntimeGraph` still does **not** run `validateReferences`. Prompt steps, script subprocesses, inbox dispatch, and event emission are handled in the runtime kernel — workflows and rules are interpreted in-process; only `script` steps spawn a managed shell. The CLI listens on stderr for `__JAIPH_EVENT__` JSON lines, the single event channel for all execution modes. Stdout carries only plain script output, forwarded to the terminal as-is. -For the full data flow across the parent → child process boundary, see [Architecture — Local single-parse compile prep](architecture.md#local-single-parse-compile-prep). +For the full data flow across the parent → child process boundary, see [Architecture — Local module graph](architecture.md#local-module-graph). ### Run progress and tree output @@ -423,7 +423,7 @@ These variables apply to `jaiph run` and workflow execution. Variables marked ** - `JAIPH_META_FILE` — path to the run metadata file (under the CLI’s build output directory for that invocation). Set on the **detached workflow child** only; the parent strips any inherited value so leftover exports do not collide. The runner writes `run_dir=` / `summary_file=` lines for the host to read after exit. - `JAIPH_SOURCE_ABS` — absolute path to the entry `.jh` file; set by the CLI for **`jaiph run`** before spawn. Required by the runner (local and Docker). - `JAIPH_SCRIPTS` — directory containing emitted **`script`** files for this run; set after **`buildScripts()`**. Any **`JAIPH_SCRIPTS`** exported in the parent shell is cleared before launch so nested toolchains do not point at the wrong tree. -- `JAIPH_COMPILE_PREP_FILE` — absolute path to a `CompilePrep` JSON snapshot (`/.jaiph-compile-prep.json`) the CLI wrote with `writeCompilePrep`. Set by the CLI **only** for the default local (non-Docker, non-`--raw`) `jaiph run` path so the spawned `node-workflow-runner.js` builds the runtime graph from the cached ASTs instead of re-reading the import closure. The file is internal and may move; do not depend on its path or contents. When the variable is absent (Docker `jaiph run`, `jaiph run --raw`, `jaiph test`), `buildRuntimeGraph()` falls back to parsing `.jh` from disk. See [Architecture — Local single-parse compile prep](architecture.md#local-single-parse-compile-prep). +- `JAIPH_MODULE_GRAPH_FILE` — absolute path to a `ModuleGraph` JSON snapshot (`/.jaiph-module-graph.json`) the CLI wrote with `writeModuleGraph`. Set by the CLI **only** for the default local (non-Docker, non-`--raw`) `jaiph run` path so the spawned `node-workflow-runner.js` builds the runtime graph from the cached ASTs instead of re-reading the import closure. The file is internal and may move; do not depend on its path or contents. When the variable is absent (Docker `jaiph run`, `jaiph run --raw`, `jaiph test`), `buildRuntimeGraph()` falls back to `loadModuleGraph` on disk. See [Architecture — Local module graph](architecture.md#local-module-graph). - `JAIPH_RUN_DIR`, `JAIPH_RUN_ID`, `JAIPH_RUN_SUMMARY_FILE` — for a normal (**non-raw**) **`jaiph run`**, the host generates **`JAIPH_RUN_ID`** once per invocation (UUID), passes it through to the detached child (and into Docker when sandboxed), and Docker failure-path discovery can match summaries by this id. The runtime uses **`JAIPH_RUN_ID`** as the stable run identifier; if it is absent, the runtime may assign its own UUID. **`JAIPH_RUN_DIR`** and **`JAIPH_RUN_SUMMARY_FILE`** are set inside the runner once the UTC run directory exists. - `JAIPH_SOURCE_FILE` — set automatically by the CLI to the entry file **basename**. Used to name run directories (see [Architecture — Durable artifact layout](architecture.md#durable-artifact-layout)). diff --git a/docs/contributing.md b/docs/contributing.md index fbbd1422..15f54ffe 100644 --- a/docs/contributing.md +++ b/docs/contributing.md @@ -9,7 +9,7 @@ redirect_from: Contributor docs answer a narrow question: **where changes belong**, **how to run the same checks CI runs**, and **which test layer** should encode a behavior change. -At a high level, Jaiph is built as described in [Architecture](architecture.md) — transpile path (`emitScriptsForModule`, `buildScripts`), parse-only **`buildRuntimeGraph()`**, **`jaiph compile`** (validate-only), **`NodeWorkflowRuntime`**, artifact layout, and Docker helper contracts. Treat that page as authoritative for pipelines and boundaries; if anything here diverges from it or from the implementation, prefer **architecture + source**. +At a high level, Jaiph is built as described in [Architecture](architecture.md) — single-graph transpile path (`loadModuleGraph` → `validateReferences(graph)` → `emitScriptsForModuleFromGraph` / `buildScriptsFromGraph`), graph-consuming **`buildRuntimeGraph(graph)`**, **`jaiph compile`** (validate-only), **`NodeWorkflowRuntime`**, artifact layout, and Docker helper contracts. Treat that page as authoritative for pipelines and boundaries; if anything here diverges from it or from the implementation, prefer **architecture + source**. For workflow syntax, library usage, tooling setup, and grammar details, see [Language](language.md), [Setup](setup.md), [Grammar](grammar.md), and the overview in [Getting Started](getting-started.md). diff --git a/docs/grammar.md b/docs/grammar.md index 36c95ff8..9de30995 100644 --- a/docs/grammar.md +++ b/docs/grammar.md @@ -20,7 +20,7 @@ Jaiph source files (`.jh`) combine a small orchestration language with shell exe This guide answers three questions for workflow authors: 1. **What can appear in a `.jh` file?** — Top-level imports, config, channels, module `const` bindings, scripts, rules, and workflows; execution constructs (`run`, `ensure`, `prompt`, control flow, channels) live in workflow and rule bodies with different restrictions. -2. **Where is it enforced?** — The parser (`src/parser.ts`, `src/parse/*`) builds the AST; **`validateReferences`** (`src/transpile/validate.ts`) rejects invalid references, arity, and disallowed constructs before **`emitScriptsForModule`** extracts **`script`** bodies to `scripts/`. The **Node workflow runtime** interprets everything else from the AST ([Architecture](architecture.md)). +2. **Where is it enforced?** — The parser (`src/parser.ts`, `src/parse/*`) builds the AST; **`validateReferences(graph)`** (`src/transpile/validate.ts`) rejects invalid references, arity, and disallowed constructs before **`emitScriptsForModuleFromGraph`** extracts **`script`** bodies to `scripts/`. The **Node workflow runtime** interprets everything else from the AST ([Architecture](architecture.md)). 3. **How do scripts relate to Jaiph?** — Only **`script`** definitions and inline **`run \`…\`()` / `run ```…```()`** bodies become executable files under `scripts/`; they run as child processes while workflows and rules stay in the interpreter. The sections below go from **values and declarations** through **steps**, **scripts**, **interpolation**, then **formal notes** (lexical, EBNF, validation catalog). diff --git a/docs/language.md b/docs/language.md index 8d6c789d..bef4727e 100644 --- a/docs/language.md +++ b/docs/language.md @@ -751,7 +751,7 @@ If the inline capture fails, the enclosing step fails. Nested inline captures ar ## Script isolation -**Emitted script files** do not embed module `const` values or other Jaiph “shims” — the transpiler writes the authored body plus a shebang (see `emitScriptsForModule` / `emit-script.ts`). Anything a script needs from the module must be passed as **positional arguments** (`$1`, `$2`, …), read from paths under `JAIPH_WORKSPACE`, or live in shared script sources (`import script`). +**Emitted script files** do not embed module `const` values or other Jaiph “shims” — the transpiler writes the authored body plus a shebang (see `emitScriptsForModuleFromGraph` / `emit-script.ts`). Anything a script needs from the module must be passed as **positional arguments** (`$1`, `$2`, …), read from paths under `JAIPH_WORKSPACE`, or live in shared script sources (`import script`). **Subprocess environment (`NodeWorkflowRuntime`):** Managed **script** steps (`run` on a named script, script import, or inline `` `…` `` / fenced body), and **workflow inline shell** lines, all use the same **`scope.env`**: the runner’s `process.env` as adjusted by Jaiph (for example `JAIPH_SCRIPTS`, `JAIPH_WORKSPACE`, `JAIPH_RUN_DIR`, `JAIPH_ARTIFACTS_DIR`, prompt-related `JAIPH_AGENT_*` when set, and keys derived from `config { … }`). It is **not** reset to a small fixed allowlist; anything visible to the workflow runner is visible to child processes unless your deployment strips the parent environment. diff --git a/docs/libraries.md b/docs/libraries.md index 484bfeb3..07ceee0d 100644 --- a/docs/libraries.md +++ b/docs/libraries.md @@ -27,7 +27,7 @@ Implications: - **Imports without `/`** — e.g. **`import "submod"`** — only relative-to-file lookup is attempted; there is **no** library fallback under `.jaiph/libs/` even if a matching folder name exists. - **`jaiph compile`** runs the same **`validateReferences`** check as **`jaiph run`** but does not emit **`scripts/`** or invoke **`buildRuntimeGraph()`** ([Architecture — Summary](architecture.md#summary)). -**Workspace root:** whatever the invoking CLI path passes into **`emitScriptsForModule`** / **`validateReferences`**: +**Workspace root:** whatever the invoking CLI path passes into **`loadModuleGraph`** (the single discovery routine consumed by **`validateReferences`** / **`emitScriptsForModuleFromGraph`**): - **`jaiph run`** and **`jaiph test`** on an explicit **`*.jh` / `*.test.jh`** file use **`detectWorkspaceRoot(dirname(entry))`** (same predicate for both commands). - **`jaiph test`** with **no** file argument discovers tests under **`detectWorkspaceRoot(process.cwd())`** (`src/cli/commands/test.ts`). diff --git a/docs/testing.md b/docs/testing.md index 106b4b0b..4dbcdde9 100644 --- a/docs/testing.md +++ b/docs/testing.md @@ -232,8 +232,8 @@ When all tests pass: `✓ N test(s) passed`. Exit status is 0 on full success, n The CLI parses each test file and passes `test "…" { … }` blocks to `runTestFile()` (`src/runtime/kernel/node-test-runner.ts`). That path aligns with [Architecture — Test runner integration](architecture.md#test-runner-integration-testjh-in-the-kernel): -1. **`buildScripts(testFileAbs, tmpDir, workspaceRoot)`** — same helper as `jaiph run`, with the **test file as the entrypoint** (`test.ts` calls it with the absolute path to the `*.test.jh` file). For a file entrypoint, the transpiler walks the test module and every file reachable by transitive **`import`** (see `collectTransitiveJhModules` in `src/transpile/build.ts`); it runs `validateReferences` / `emitScriptsForModule` per file and writes atomic **`script`** files into a temp `scripts/` tree. (If `buildScripts` were ever given a **directory** entrypoint, directory walks skip `*.test.jh` files — that is not how `jaiph test` invokes it.) -2. **`buildRuntimeGraph(testFileAbs, workspaceRoot)`** — called **once per test file**; the same graph is reused for every `test` block in that file and for every `run` step inside them. +1. **`loadModuleGraph(testFileAbs, workspaceRoot)`** + **`buildScriptsFromGraph(graph, tmpDir)`** — `runSingleTestFile` (`src/cli/commands/test.ts`) loads the `ModuleGraph` for the test file (entry plus transitive `import` closure, parsed once via `loadModuleGraph` in `src/transpile/module-graph.ts`), then hands it to `buildScriptsFromGraph`, which calls `emitScriptsForModuleFromGraph` per reachable module and writes atomic **`script`** files into a temp `scripts/` tree. Validation runs in-memory against the graph; no `.jh` is re-read after the initial load. +2. **`buildRuntimeGraph(graph)`** — called **once per test file** with the same `ModuleGraph`; the resulting runtime view is reused for every `test` block in that file and for every `run` step inside them. 3. For each block, a fresh temp layout sets env vars (below); workflows run in **`NodeWorkflowRuntime`**, not in a detached child. There is no Bash transpilation of full workflows on this path — only extracted `script` bodies are shell, same as production. The import graph is fixed for a single `jaiph test` process; **mutating imported `*.jh` on disk between blocks** is not a supported use case. diff --git a/src/cli/commands/compile.ts b/src/cli/commands/compile.ts index a375bfaa..8f7ed48e 100644 --- a/src/cli/commands/compile.ts +++ b/src/cli/commands/compile.ts @@ -1,11 +1,9 @@ -import { existsSync, readFileSync, statSync } from "node:fs"; +import { existsSync, statSync } from "node:fs"; import { dirname, resolve } from "node:path"; -import { parsejaiph } from "../../parser"; +import { loadModuleGraph } from "../../transpile/module-graph"; import { validateReferences } from "../../transpile/validate"; -import { resolveImportPath } from "../../transpile/resolve"; -import { collectTransitiveJhModules, walkjhFiles } from "../../transpile/build"; +import { walkjhFiles } from "../../transpile/build"; import { detectWorkspaceRoot } from "../shared/paths"; -import type { ValidateContext } from "../../transpile/validate"; export interface CompileDiagnostic { file: string; @@ -29,16 +27,6 @@ export function diagnosticFromThrown(err: unknown): CompileDiagnostic | null { }; } -function makeValidateContext(workspaceRoot?: string): ValidateContext { - return { - resolveImportPath, - existsSync, - readFile: (path: string) => readFileSync(path, "utf8"), - parse: parsejaiph, - workspaceRoot, - }; -} - function printUsage(): void { process.stderr.write( "Usage: jaiph compile [--json] [--workspace ] ...\n\n" + @@ -83,7 +71,7 @@ export function runCompile(args: string[]): number { return 1; } - const filesToValidate = new Set(); + const entries: Array<{ file: string; workspaceRoot: string }> = []; try { for (const p of paths) { @@ -97,15 +85,11 @@ export function runCompile(args: string[]): number { throw new Error(`compile expects .jh files: ${p}`); } const wr = workspaceFlag ?? detectWorkspaceRoot(dirname(abs)); - for (const f of collectTransitiveJhModules(abs, wr)) { - filesToValidate.add(f); - } + entries.push({ file: abs, workspaceRoot: wr }); } else if (st.isDirectory()) { const wr = workspaceFlag ?? detectWorkspaceRoot(abs); for (const entry of walkjhFiles(abs)) { - for (const f of collectTransitiveJhModules(entry, wr)) { - filesToValidate.add(f); - } + entries.push({ file: entry, workspaceRoot: wr }); } } else { throw new Error(`not a file or directory: ${p}`); @@ -128,17 +112,16 @@ export function runCompile(args: string[]): number { return 1; } - const sorted = [...filesToValidate].sort(); const seen = new Set(); - - for (const file of sorted) { + for (const { file, workspaceRoot } of entries) { if (seen.has(file)) continue; seen.add(file); - const wr = workspaceFlag ?? detectWorkspaceRoot(dirname(file)); - const ctx = makeValidateContext(wr); try { - const ast = parsejaiph(readFileSync(file, "utf8"), file); - validateReferences(ast, ctx); + const graph = loadModuleGraph(file, workspaceRoot); + validateReferences(graph); + // Mark every reachable module as already validated so a directory walk + // does not double-validate shared imports. + for (const reachable of graph.modules.keys()) seen.add(reachable); } catch (err) { const d = diagnosticFromThrown(err); if (json) { diff --git a/src/cli/commands/run.ts b/src/cli/commands/run.ts index 52aaf5cd..1d8ee0c9 100644 --- a/src/cli/commands/run.ts +++ b/src/cli/commands/run.ts @@ -10,8 +10,8 @@ import { tmpdir } from "node:os"; import { dirname, join, resolve, extname } from "node:path"; import { basename } from "node:path"; import { parsejaiph } from "../../parser"; -import { buildScripts } from "../../transpiler"; -import { prepareCompile, writeCompilePrep } from "../../transpile/compile-prep"; +import { buildScripts, buildScriptsFromGraph } from "../../transpiler"; +import { loadModuleGraph, writeModuleGraph } from "../../transpile/module-graph"; import { metadataToConfig } from "../../config"; import { buildStepDisplayParamPairs, formatNamedParamsForDisplay } from "./format-params.js"; import { @@ -81,8 +81,8 @@ export async function runWorkflow(rest: string[]): Promise { } const hooksConfig = loadMergedHooks(workspaceRoot); - const prep = prepareCompile(inputAbs, workspaceRoot); - const mod = prep.astByFile.get(inputAbs)!; + const graph = loadModuleGraph(inputAbs, workspaceRoot); + const mod = graph.modules.get(inputAbs)!.ast; const effectiveConfig = metadataToConfig(mod.metadata); const outDir = target ? resolve(target) : mkdtempSync(join(tmpdir(), "jaiph-run-")); @@ -113,17 +113,17 @@ export async function runWorkflow(rest: string[]): Promise { dockerConfigForBanner.enabled, sandboxModeForBanner, ); - const { scriptsDir } = buildScripts(inputAbs, outDir, workspaceRoot, prep); + const { scriptsDir } = buildScriptsFromGraph(graph, outDir); runtimeEnv.JAIPH_SCRIPTS = scriptsDir; - // Cache file consumed by the spawned runner (or container) so the runtime + // Serialized module graph consumed by the spawned runner so the runtime // graph reuses these ASTs instead of re-parsing every reachable module. // Docker mounts the workspace read-only, so place the cache under outDir, // which the host already arranges for the container side via its existing // sandbox layout. For local runs the runner reads the path directly. - const prepFile = join(outDir, ".jaiph-compile-prep.json"); - writeCompilePrep(prepFile, prep); + const graphFile = join(outDir, ".jaiph-module-graph.json"); + writeModuleGraph(graphFile, graph); if (!dockerConfigForBanner.enabled) { - runtimeEnv.JAIPH_COMPILE_PREP_FILE = prepFile; + runtimeEnv.JAIPH_MODULE_GRAPH_FILE = graphFile; } const metaFile = join(outDir, `.jaiph-run-meta-${Date.now()}-${process.pid}.txt`); diff --git a/src/cli/commands/test.ts b/src/cli/commands/test.ts index 340e7c88..f7e8bb21 100644 --- a/src/cli/commands/test.ts +++ b/src/cli/commands/test.ts @@ -1,14 +1,13 @@ import { mkdtempSync, - readFileSync, rmSync, statSync, } from "node:fs"; import { tmpdir } from "node:os"; import { dirname, join, resolve, extname } from "node:path"; import { basename } from "node:path"; -import { buildScripts, walkTestFiles } from "../../transpiler"; -import { parsejaiph } from "../../parser"; +import { buildScriptsFromGraph, walkTestFiles } from "../../transpiler"; +import { loadModuleGraph } from "../../transpile/module-graph"; import { jaiphError } from "../../errors"; import { detectWorkspaceRoot } from "../shared/paths"; import { parseArgs } from "../shared/usage"; @@ -76,7 +75,8 @@ export async function runSingleTestFile( workspaceRoot: string, _runArgs: string[], ): Promise { - const ast = parsejaiph(readFileSync(testFileAbs, "utf8"), testFileAbs); + const graph = loadModuleGraph(testFileAbs, workspaceRoot); + const ast = graph.modules.get(graph.entryFile)!.ast; if (!ast.tests || ast.tests.length === 0) { throw jaiphError(ast.filePath, 1, 1, "E_PARSE", "test file must contain at least one test block"); } @@ -85,8 +85,8 @@ export async function runSingleTestFile( const outDir = mkdtempSync(join(tmpdir(), "jaiph-test-")); try { /** Only compile the test module and its imports — not every `.jh` under the workspace. */ - const { scriptsDir } = buildScripts(testFileAbs, outDir, workspaceRoot); - return await runTestFile(testFileAbs, workspaceRoot, scriptsDir, ast.tests); + const { scriptsDir } = buildScriptsFromGraph(graph, outDir); + return await runTestFile(graph, workspaceRoot, scriptsDir, ast.tests); } finally { rmSync(outDir, { recursive: true, force: true }); } diff --git a/src/runtime/kernel/graph.ts b/src/runtime/kernel/graph.ts index 01d2c8b2..b5a896a9 100644 --- a/src/runtime/kernel/graph.ts +++ b/src/runtime/kernel/graph.ts @@ -1,20 +1,9 @@ -import { readFileSync } from "node:fs"; import { resolve } from "node:path"; -import { parsejaiph } from "../../parser"; -import type { CompilePrep } from "../../transpile/compile-prep"; +import { loadModuleGraph, type ModuleGraph, type ModuleNode } from "../../transpile/module-graph"; import type { RuleDef, ScriptDef, WorkflowDef, WorkflowRefDef, RuleRefDef, jaiphModule } from "../../types"; -import { resolveImportPath } from "../../transpile/resolve"; -export interface RuntimeModuleNode { - filePath: string; - ast: jaiphModule; - imports: Map; -} - -export interface RuntimeGraph { - entryFile: string; - modules: Map; -} +export type RuntimeModuleNode = ModuleNode; +export type RuntimeGraph = ModuleGraph; export interface ResolvedWorkflow { filePath: string; @@ -46,50 +35,23 @@ function attachScriptImportStubs(ast: jaiphModule): void { } } -function nodeFromAst(filePath: string, ast: jaiphModule, workspaceRoot?: string): RuntimeModuleNode { - const imports = new Map(); - for (const imp of ast.imports) { - imports.set(imp.alias, resolveImportPath(filePath, imp.path, workspaceRoot)); - } - attachScriptImportStubs(ast); - return { filePath, ast, imports }; -} - -function buildNode(filePath: string, workspaceRoot?: string): RuntimeModuleNode { - const ast = parsejaiph(readFileSync(filePath, "utf8"), filePath); - return nodeFromAst(filePath, ast, workspaceRoot); -} - /** - * When `prep` is supplied, every reachable module is taken from the pre-parsed - * cache and no `.jh` files are read from disk. The cache is shared with the - * parent CLI's `buildScripts` so each module is parsed exactly once per run. + * Adapt a {@link ModuleGraph} for runtime dispatch by injecting `ScriptDef` + * stubs for `import script` declarations so `resolveScriptRef` lookups + * succeed for cross-module script imports. The injection mutates the AST + * in-place; the helper is idempotent so repeated calls are safe. */ export function buildRuntimeGraph( - entryFile: string, + source: string | ModuleGraph, workspaceRoot?: string, - prep?: CompilePrep, ): RuntimeGraph { - const entry = resolve(entryFile); - if (prep) { - const modules = new Map(); - for (const [filePath, ast] of prep.astByFile) { - modules.set(filePath, nodeFromAst(filePath, ast, workspaceRoot)); - } - return { entryFile: entry, modules }; - } - const modules = new Map(); - const queue: string[] = [entry]; - while (queue.length > 0) { - const current = queue.shift()!; - if (modules.has(current)) continue; - const node = buildNode(current, workspaceRoot); - modules.set(current, node); - for (const imported of node.imports.values()) { - if (!modules.has(imported)) queue.push(imported); - } + const graph = typeof source === "string" + ? loadModuleGraph(source, workspaceRoot) + : source; + for (const node of graph.modules.values()) { + attachScriptImportStubs(node.ast); } - return { entryFile: entry, modules }; + return graph; } export function lookupWorkflow(graph: RuntimeGraph, fromFile: string, ref: WorkflowRefDef): WorkflowDef | null { diff --git a/src/runtime/kernel/node-test-runner.test.ts b/src/runtime/kernel/node-test-runner.test.ts index 8f276006..cc36d5bf 100644 --- a/src/runtime/kernel/node-test-runner.test.ts +++ b/src/runtime/kernel/node-test-runner.test.ts @@ -4,6 +4,7 @@ import { join } from "node:path"; import { test } from "node:test"; import assert from "node:assert/strict"; import { runTestFile } from "./node-test-runner"; +import { loadModuleGraph } from "../../transpile/module-graph"; import type { SourceLoc } from "../../types"; const loc: SourceLoc = { line: 1, col: 1 }; @@ -35,7 +36,7 @@ test "block B" { // Before this change, buildRuntimeGraph would be called once per // test_run_workflow step (2 calls). After caching, it is called once. // We verify behavioral correctness: both blocks pass with the shared graph. - const exitCode = await runTestFile(testFile, dir, scriptsDir, [ + const exitCode = await runTestFile(loadModuleGraph(testFile, dir), dir, scriptsDir, [ { description: "block A", loc, steps: [{ type: "test_run_workflow" as const, workflowRef: "greet", args: [], loc }], @@ -75,7 +76,7 @@ test "const drives mock and expect" { `, ); - const exitCode = await runTestFile(testFile, dir, scriptsDir, [ + const exitCode = await runTestFile(loadModuleGraph(testFile, dir), dir, scriptsDir, [ { description: "const drives mock and expect", loc, steps: [ @@ -119,7 +120,7 @@ test "undefined const ref" { `, ); - const exitCode = await runTestFile(testFile, dir, scriptsDir, [ + const exitCode = await runTestFile(loadModuleGraph(testFile, dir), dir, scriptsDir, [ { description: "undefined const ref", loc, steps: [ @@ -161,7 +162,7 @@ test "no implicit response" { `, ); - const exitCode = await runTestFile(testFile, dir, scriptsDir, [ + const exitCode = await runTestFile(loadModuleGraph(testFile, dir), dir, scriptsDir, [ { description: "no implicit response", loc, steps: [ diff --git a/src/runtime/kernel/node-test-runner.ts b/src/runtime/kernel/node-test-runner.ts index 4e7fd597..0e5604fb 100644 --- a/src/runtime/kernel/node-test-runner.ts +++ b/src/runtime/kernel/node-test-runner.ts @@ -2,6 +2,7 @@ import { mkdtempSync, rmSync, readdirSync, readFileSync } from "node:fs"; import { tmpdir } from "node:os"; import { basename, join } from "node:path"; import { buildRuntimeGraph, resolveWorkflowRef, resolveRuleRef, resolveScriptRef, type RuntimeGraph } from "./graph"; +import type { ModuleGraph } from "../../transpile/module-graph"; import { NodeWorkflowRuntime, type MockBodyDef } from "./node-workflow-runtime"; import type { MockPromptArm } from "./mock"; import type { TestBlockDef, TestStepDef } from "../../types"; @@ -256,11 +257,12 @@ async function runTestBlock( } export async function runTestFile( - testFileAbs: string, + moduleGraph: ModuleGraph, workspaceRoot: string, scriptsDir: string, blocks: TestBlockDef[], ): Promise { + const testFileAbs = moduleGraph.entryFile; const bold = "\x1b[1m"; const reset = "\x1b[0m"; const red = "\x1b[31m"; @@ -291,12 +293,13 @@ export async function runTestFile( process.stdout.write(`${bold}testing${reset} ${displayName}\n`); - // Build the runtime graph once for the entire test file. - // The graph depends only on testFileAbs and its import closure, which are - // constant across all blocks and steps within a single runTestFile call. - // If a future test step mutates imported files on disk mid-run, a manual - // rebuild would be needed — but that is not a supported pattern today. - const graph = buildRuntimeGraph(testFileAbs, workspaceRoot); + // Build the runtime view of the already-loaded module graph once for the + // entire test file. The graph depends only on testFileAbs and its import + // closure, which are constant across all blocks and steps within a single + // runTestFile call. If a future test step mutates imported files on disk + // mid-run, a manual rebuild would be needed — but that is not a supported + // pattern today. + const graph = buildRuntimeGraph(moduleGraph, workspaceRoot); let total = 0; let failed = 0; diff --git a/src/runtime/kernel/node-workflow-runner.ts b/src/runtime/kernel/node-workflow-runner.ts index 5a55b3a7..870c13e3 100644 --- a/src/runtime/kernel/node-workflow-runner.ts +++ b/src/runtime/kernel/node-workflow-runner.ts @@ -1,6 +1,6 @@ import { basename, dirname, join } from "node:path"; import { writeFileSync } from "node:fs"; -import { readCompilePrep } from "../../transpile/compile-prep"; +import { loadModuleGraph, readModuleGraph } from "../../transpile/module-graph"; import { buildRuntimeGraph } from "./graph"; import { NodeWorkflowRuntime } from "./node-workflow-runtime"; @@ -29,9 +29,9 @@ async function main(): Promise { process.env.JAIPH_SCRIPTS = join(dirname(builtScript), "scripts"); } const workspaceRoot = process.env.JAIPH_WORKSPACE || undefined; - const prepFile = process.env.JAIPH_COMPILE_PREP_FILE; - const prep = prepFile ? readCompilePrep(prepFile) : undefined; - const graph = buildRuntimeGraph(sourceFile, workspaceRoot, prep); + const graphFile = process.env.JAIPH_MODULE_GRAPH_FILE; + const moduleGraph = graphFile ? readModuleGraph(graphFile) : loadModuleGraph(sourceFile, workspaceRoot); + const graph = buildRuntimeGraph(moduleGraph); const runtime = new NodeWorkflowRuntime(graph, { env: process.env, cwd: process.cwd() }); const status = workflowName === "default" ? await runtime.runDefault(runArgs) : 1; writeFileSync( diff --git a/src/transpile/build.ts b/src/transpile/build.ts index 0b49e88f..4000d897 100644 --- a/src/transpile/build.ts +++ b/src/transpile/build.ts @@ -1,9 +1,9 @@ -import { chmodSync, mkdirSync, readFileSync, readdirSync, statSync, writeFileSync } from "node:fs"; +import { chmodSync, mkdirSync, readdirSync, statSync, writeFileSync } from "node:fs"; import { dirname, extname, join, parse, relative, resolve } from "node:path"; -import { parsejaiph } from "../parser"; -import type { CompilePrep } from "./compile-prep"; -import type { ScriptArtifact } from "./emit-script"; -import { JAIPH_EXT_REGEX, resolveImportPath } from "./resolve"; +import { emitScriptsForModuleFromGraph } from "./emit-from-graph"; +import type { ModuleGraph } from "./module-graph"; +import { loadModuleGraph } from "./module-graph"; +import { JAIPH_EXT_REGEX } from "./resolve"; function ensureDir(path: string): void { mkdirSync(path, { recursive: true }); @@ -96,58 +96,70 @@ export function walkTestFiles(inputPath: string): string[] { return files; } -/** Entry `.jh` plus all files reachable via `import` (transitive), sorted. */ -export function collectTransitiveJhModules(entrypoint: string, workspaceRoot?: string): string[] { - const visited = new Set(); - const queue = [entrypoint]; - while (queue.length > 0) { - const file = queue.pop()!; - if (visited.has(file)) continue; - visited.add(file); - const ast = parsejaiph(readFileSync(file, "utf8"), file); - for (const imp of ast.imports) { - const importedFile = resolveImportPath(file, imp.path, workspaceRoot); - if (!visited.has(importedFile)) queue.push(importedFile); - } - } - const files = [...visited]; - files.sort(); - return files; -} - /** - * Writes extracted `script` bodies to `/scripts`. When `prep` is - * supplied, the transitive-module list comes from the pre-parsed cache instead - * of re-walking and re-parsing the import closure. + * Path-based entry point. Loads a `ModuleGraph` and writes extracted `script` + * bodies under `/scripts`. For a directory input, every non-test + * `.jh` becomes its own root: each rooted graph is loaded and emitted. The + * directory walk preserves the historical multi-entry validation semantics + * for `jaiph compile ` and the integration test corpus. */ export function buildScripts( inputPath: string, targetDir: string | undefined, - emitScriptsFn: (file: string, root: string) => ScriptArtifact[], workspaceRoot?: string, - prep?: CompilePrep, ): { scriptsDir: string } { const absInput = resolve(inputPath); const inputStat = statSync(absInput); const rootDir = inputStat.isDirectory() ? absInput : dirname(absInput); const outRoot = resolve(targetDir ?? rootDir); ensureDir(outRoot); + const scriptsRoot = join(outRoot, "scripts"); + ensureDir(scriptsRoot); + + if (inputStat.isFile()) { + const graph = loadModuleGraph(absInput, workspaceRoot); + emitGraphInto(graph, rootDir, scriptsRoot); + return { scriptsDir: scriptsRoot }; + } - const entrypointFile = inputStat.isFile() ? absInput : null; - const files = prep - ? [...prep.astByFile.keys()].sort() - : entrypointFile ? collectTransitiveJhModules(entrypointFile, workspaceRoot) : walkjhFiles(rootDir); + for (const entry of walkjhFiles(absInput)) { + const graph = loadModuleGraph(entry, workspaceRoot); + emitGraphInto(graph, rootDir, scriptsRoot); + } + return { scriptsDir: scriptsRoot }; +} + +/** + * Graph-based entry point. The caller has already built a `ModuleGraph` (the + * default `jaiph run` path); emit every reachable module's scripts into + * `/scripts` without re-parsing anything. `rootDir` defaults to + * the entry's parent directory so symbol prefixes match the path-based form. + */ +export function buildScriptsFromGraph( + graph: ModuleGraph, + targetDir: string, + rootDir?: string, +): { scriptsDir: string } { + const outRoot = resolve(targetDir); + ensureDir(outRoot); const scriptsRoot = join(outRoot, "scripts"); ensureDir(scriptsRoot); + const resolvedRoot = resolve(rootDir ?? dirname(graph.entryFile)); + emitGraphInto(graph, resolvedRoot, scriptsRoot); + return { scriptsDir: scriptsRoot }; +} +function emitGraphInto(graph: ModuleGraph, rootDir: string, scriptsRoot: string): void { + const files = [...graph.modules.keys()].sort(); for (const file of files) { - const scripts = emitScriptsFn(file, rootDir); + const scripts = emitScriptsForModuleFromGraph(graph, file, rootDir); for (const s of scripts) { const scriptPath = join(scriptsRoot, s.name); writeFileSync(scriptPath, s.content, "utf8"); chmodSync(scriptPath, 0o755); } } - - return { scriptsDir: scriptsRoot }; } + +// Re-export so `jaiph compile` can use the centralized regex. +export { JAIPH_EXT_REGEX }; diff --git a/src/transpile/compile-prep.ts b/src/transpile/compile-prep.ts deleted file mode 100644 index dcfdbf2e..00000000 --- a/src/transpile/compile-prep.ts +++ /dev/null @@ -1,69 +0,0 @@ -import { readFileSync, writeFileSync } from "node:fs"; -import { resolve } from "node:path"; -import { parsejaiph } from "../parser"; -import { resolveImportPath } from "./resolve"; -import type { jaiphModule } from "../types"; - -/** - * One-shot parse of a `.jh` entry plus its transitive import closure. Reused by - * `buildScripts` (validation + script emit) and `buildRuntimeGraph` (runtime - * dispatch) so each reachable module is parsed exactly once per `jaiph run`, - * even across the parent-CLI → child-runner process boundary. - */ -export interface CompilePrep { - entryFile: string; - workspaceRoot?: string; - /** AST for every reachable module, keyed by absolute path. */ - astByFile: Map; -} - -export function prepareCompile(entryFile: string, workspaceRoot?: string): CompilePrep { - const entry = resolve(entryFile); - const astByFile = new Map(); - const queue: string[] = [entry]; - while (queue.length > 0) { - const current = queue.shift()!; - if (astByFile.has(current)) continue; - const ast = parsejaiph(readFileSync(current, "utf8"), current); - astByFile.set(current, ast); - for (const imp of ast.imports) { - const importedFile = resolveImportPath(current, imp.path, workspaceRoot); - if (!astByFile.has(importedFile)) queue.push(importedFile); - } - } - return { entryFile: entry, workspaceRoot, astByFile }; -} - -/** Stable JSON encoding for cross-process transfer. */ -export function serializeCompilePrep(prep: CompilePrep): string { - const entries = [...prep.astByFile.entries()]; - entries.sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0)); - return JSON.stringify({ - entryFile: prep.entryFile, - workspaceRoot: prep.workspaceRoot ?? null, - modules: entries.map(([file, ast]) => ({ file, ast })), - }); -} - -export function deserializeCompilePrep(content: string): CompilePrep { - const obj = JSON.parse(content) as { - entryFile: string; - workspaceRoot: string | null; - modules: Array<{ file: string; ast: jaiphModule }>; - }; - const astByFile = new Map(); - for (const m of obj.modules) astByFile.set(m.file, m.ast); - return { - entryFile: obj.entryFile, - workspaceRoot: obj.workspaceRoot ?? undefined, - astByFile, - }; -} - -export function writeCompilePrep(filePath: string, prep: CompilePrep): void { - writeFileSync(filePath, serializeCompilePrep(prep), "utf8"); -} - -export function readCompilePrep(filePath: string): CompilePrep { - return deserializeCompilePrep(readFileSync(filePath, "utf8")); -} diff --git a/src/transpile/emit-from-graph.ts b/src/transpile/emit-from-graph.ts new file mode 100644 index 00000000..805e7dc9 --- /dev/null +++ b/src/transpile/emit-from-graph.ts @@ -0,0 +1,38 @@ +import { readFileSync } from "node:fs"; +import type { ModuleGraph } from "./module-graph"; +import { buildScriptFiles, type ScriptArtifact } from "./emit-script"; +import { workflowSymbolForFile } from "./resolve"; +import { resolveScriptImportPath, validateModule } from "./validate"; + +/** + * Parse, validate, and extract per-`script` bash files for one module in the + * graph. Operates entirely on in-memory ASTs from `graph`; `.jh` files are + * never re-read. External `import script` bodies still come from disk (they + * are not `.jh`). + */ +export function emitScriptsForModuleFromGraph( + graph: ModuleGraph, + inputFile: string, + rootDir: string, +): ScriptArtifact[] { + const node = graph.modules.get(inputFile); + if (!node) { + throw new Error(`emitScriptsForModule: ${inputFile} is not in the graph`); + } + const ast = node.ast; + validateModule(ast, graph); + const workflowSymbol = workflowSymbolForFile(inputFile, rootDir); + const importedWorkflowSymbols = new Map(); + for (const [alias, importedFile] of node.imports) { + importedWorkflowSymbols.set(alias, workflowSymbolForFile(importedFile, rootDir)); + } + let resolvedScriptImports: Map | undefined; + if (ast.scriptImports && ast.scriptImports.length > 0) { + resolvedScriptImports = new Map(); + for (const si of ast.scriptImports) { + const resolved = resolveScriptImportPath(ast.filePath, si.path); + resolvedScriptImports.set(si.alias, readFileSync(resolved, "utf8")); + } + } + return buildScriptFiles(ast, importedWorkflowSymbols, workflowSymbol, resolvedScriptImports); +} diff --git a/src/transpile/compile-prep.test.ts b/src/transpile/module-graph.test.ts similarity index 59% rename from src/transpile/compile-prep.test.ts rename to src/transpile/module-graph.test.ts index f96388c6..09b39999 100644 --- a/src/transpile/compile-prep.test.ts +++ b/src/transpile/module-graph.test.ts @@ -4,32 +4,27 @@ import { join } from "node:path"; import { test } from "node:test"; import assert from "node:assert/strict"; -import { buildScripts } from "../transpiler"; +import { buildScriptsFromGraph } from "../transpiler"; import { buildRuntimeGraph, resolveScriptRef, resolveWorkflowRef } from "../runtime/kernel/graph"; import { - prepareCompile, - serializeCompilePrep, - deserializeCompilePrep, -} from "./compile-prep"; + loadModuleGraph, + serializeModuleGraph, + deserializeModuleGraph, +} from "./module-graph"; function write(filePath: string, content: string): void { writeFileSync(filePath, content, "utf8"); } /** - * Acceptance criterion 1: the default local run path must not parse the entry - * module in the parent and then re-parse the same module in the child to build - * the runtime graph. - * - * Strategy: after `prepareCompile` parses every reachable `.jh`, we corrupt - * each file's contents to junk that the parser would reject. If `buildScripts` - * (parent) or `buildRuntimeGraph` (child) re-reads/re-parses any module, the - * call throws and the test fails. The old `run.ts` + `buildScripts()` + - * `node-workflow-runner.ts` duplicate-parse pattern is exactly what would - * fail here. + * Acceptance criterion 4 from the parser-simplification design: each `.jh` + * source file in a compile is parsed exactly once. After `loadModuleGraph` + * walks the entry plus its transitive imports, neither `buildScripts` nor + * `buildRuntimeGraph` may re-read a `.jh` source — verified by corrupting + * every file post-load and asserting the pipeline still succeeds. */ -test("compile-prep: buildScripts + buildRuntimeGraph reuse pre-parsed ASTs and never re-read .jh after prepare", () => { - const dir = mkdtempSync(join(tmpdir(), "jaiph-prep-noreparse-")); +test("module-graph: buildScripts + buildRuntimeGraph reuse pre-parsed ASTs and never re-read .jh after load", () => { + const dir = mkdtempSync(join(tmpdir(), "jaiph-graph-noreparse-")); try { const main = join(dir, "main.jh"); const lib = join(dir, "lib.jh"); @@ -58,30 +53,30 @@ test("compile-prep: buildScripts + buildRuntimeGraph reuse pre-parsed ASTs and n ].join("\n"), ); - const prep = prepareCompile(main); - assert.equal(prep.astByFile.size, 2); - assert.ok(prep.astByFile.has(main)); - assert.ok(prep.astByFile.has(lib)); + const graph = loadModuleGraph(main); + assert.equal(graph.modules.size, 2); + assert.ok(graph.modules.has(main)); + assert.ok(graph.modules.has(lib)); // Corrupt source contents. Files still exist (so existsSync passes), but // any new parse call would throw a parse error. write(main, "!!! invalid jaiph syntax !!!\n"); write(lib, "!!! invalid jaiph syntax !!!\n"); - const outDir = mkdtempSync(join(tmpdir(), "jaiph-prep-out-")); + const outDir = mkdtempSync(join(tmpdir(), "jaiph-graph-out-")); try { - const { scriptsDir } = buildScripts(main, outDir, undefined, prep); + const { scriptsDir } = buildScriptsFromGraph(graph, outDir); const emitted = readdirSync(scriptsDir).sort(); assert.deepEqual(emitted, ["helper", "local_script"]); - const graph = buildRuntimeGraph(main, undefined, prep); - assert.equal(graph.modules.size, 2); - const inner = resolveWorkflowRef(graph, main, { + const runtime = buildRuntimeGraph(graph); + assert.equal(runtime.modules.size, 2); + const inner = resolveWorkflowRef(runtime, main, { value: "lib.inner", loc: { line: 1, col: 1 }, }); assert.equal(inner?.workflow.name, "inner"); - const helper = resolveScriptRef(graph, main, "lib.helper"); + const helper = resolveScriptRef(runtime, main, "lib.helper"); assert.equal(helper?.script.name, "helper"); } finally { rmSync(outDir, { recursive: true, force: true }); @@ -92,11 +87,11 @@ test("compile-prep: buildScripts + buildRuntimeGraph reuse pre-parsed ASTs and n }); /** - * Acceptance criterion 2: the optimized graph/compile-prep path preserves - * cross-module workflow, rule, and script resolution. + * Cross-module workflow, rule, and script resolution survives the graph + * pipeline. */ -test("compile-prep: cross-module workflow, rule, and script resolution survives the optimized path", () => { - const dir = mkdtempSync(join(tmpdir(), "jaiph-prep-crossmod-")); +test("module-graph: cross-module workflow, rule, and script resolution", () => { + const dir = mkdtempSync(join(tmpdir(), "jaiph-graph-crossmod-")); try { const main = join(dir, "main.jh"); const lib = join(dir, "lib.jh"); @@ -128,27 +123,27 @@ test("compile-prep: cross-module workflow, rule, and script resolution survives ].join("\n"), ); - const prep = prepareCompile(main); - const outDir = mkdtempSync(join(tmpdir(), "jaiph-prep-out2-")); + const graph = loadModuleGraph(main); + const outDir = mkdtempSync(join(tmpdir(), "jaiph-graph-out2-")); try { - const { scriptsDir } = buildScripts(main, outDir, undefined, prep); + const { scriptsDir } = buildScriptsFromGraph(graph, outDir); const emitted = readdirSync(scriptsDir).sort(); assert.deepEqual(emitted, ["helper", "local_script"]); - const graph = buildRuntimeGraph(main, undefined, prep); - const localWf = resolveWorkflowRef(graph, main, { + const runtime = buildRuntimeGraph(graph); + const localWf = resolveWorkflowRef(runtime, main, { value: "default", loc: { line: 1, col: 1 }, }); assert.equal(localWf?.workflow.name, "default"); - const importedWf = resolveWorkflowRef(graph, main, { + const importedWf = resolveWorkflowRef(runtime, main, { value: "lib.inner", loc: { line: 1, col: 1 }, }); assert.equal(importedWf?.workflow.name, "inner"); - const localScript = resolveScriptRef(graph, main, "local_script"); + const localScript = resolveScriptRef(runtime, main, "local_script"); assert.equal(localScript?.script.name, "local_script"); - const importedScript = resolveScriptRef(graph, main, "lib.helper"); + const importedScript = resolveScriptRef(runtime, main, "lib.helper"); assert.equal(importedScript?.script.name, "helper"); } finally { rmSync(outDir, { recursive: true, force: true }); @@ -159,12 +154,12 @@ test("compile-prep: cross-module workflow, rule, and script resolution survives }); /** - * Cross-process boundary: the parent serializes the prep, the child + * Cross-process boundary: the parent serializes the graph, the child * deserializes it and reuses every AST. Asserts the JSON format is - * round-trippable so the worker can rebuild the graph without re-parsing. + * round-trippable so the runner can rebuild the graph without re-parsing. */ -test("compile-prep: serialize round-trip preserves the import closure for the child runner", () => { - const dir = mkdtempSync(join(tmpdir(), "jaiph-prep-roundtrip-")); +test("module-graph: serialize round-trip preserves the import closure for the child runner", () => { + const dir = mkdtempSync(join(tmpdir(), "jaiph-graph-roundtrip-")); try { const main = join(dir, "main.jh"); const lib = join(dir, "lib.jh"); @@ -188,16 +183,16 @@ test("compile-prep: serialize round-trip preserves the import closure for the ch ].join("\n"), ); - const prep = prepareCompile(main); - const serialized = serializeCompilePrep(prep); + const graph = loadModuleGraph(main); + const serialized = serializeModuleGraph(graph); // Corrupt source contents so any deserialized-path consumer that tries to // re-parse would fail loudly. Files still exist so existsSync passes. write(main, "!!! invalid !!!\n"); write(lib, "!!! invalid !!!\n"); - const round = deserializeCompilePrep(serialized); - assert.equal(round.astByFile.size, 2); - const graph = buildRuntimeGraph(main, undefined, round); - const importedWf = resolveWorkflowRef(graph, main, { + const round = deserializeModuleGraph(serialized); + assert.equal(round.modules.size, 2); + const runtime = buildRuntimeGraph(round); + const importedWf = resolveWorkflowRef(runtime, main, { value: "lib.inner", loc: { line: 1, col: 1 }, }); @@ -211,8 +206,8 @@ test("compile-prep: serialize round-trip preserves the import closure for the ch * Three-module closure: prove the optimization scales beyond the direct * import case in the acceptance criteria. */ -test("compile-prep: handles a 3-module closure with one shared parse", () => { - const dir = mkdtempSync(join(tmpdir(), "jaiph-prep-three-")); +test("module-graph: handles a 3-module closure with one shared parse", () => { + const dir = mkdtempSync(join(tmpdir(), "jaiph-graph-three-")); try { const main = join(dir, "main.jh"); const libA = join(dir, "a.jh"); @@ -239,22 +234,21 @@ test("compile-prep: handles a 3-module closure with one shared parse", () => { ].join("\n"), ); - const prep = prepareCompile(main); - assert.equal(prep.astByFile.size, 3); + const graph = loadModuleGraph(main); + assert.equal(graph.modules.size, 3); // Corrupt every source: any downstream re-parse would now fail. write(main, "!!! invalid !!!\n"); write(libA, "!!! invalid !!!\n"); write(libB, "!!! invalid !!!\n"); - const outDir = mkdtempSync(join(tmpdir(), "jaiph-prep-three-out-")); + const outDir = mkdtempSync(join(tmpdir(), "jaiph-graph-three-out-")); try { - buildScripts(main, outDir, undefined, prep); - const graph = buildRuntimeGraph(main, undefined, prep); - const bRef = resolveWorkflowRef(graph, main, { value: "b.b", loc: { line: 1, col: 1 } }); + buildScriptsFromGraph(graph, outDir); + const runtime = buildRuntimeGraph(graph); + const bRef = resolveWorkflowRef(runtime, main, { value: "b.b", loc: { line: 1, col: 1 } }); assert.equal(bRef?.workflow.name, "b"); - // Resolve transitively into a.jh via b's imports. - const bNode = graph.modules.get(libB)!; + const bNode = runtime.modules.get(libB)!; assert.equal(bNode.imports.get("a"), libA); } finally { rmSync(outDir, { recursive: true, force: true }); diff --git a/src/transpile/module-graph.ts b/src/transpile/module-graph.ts new file mode 100644 index 00000000..f896a07e --- /dev/null +++ b/src/transpile/module-graph.ts @@ -0,0 +1,118 @@ +import { existsSync, readFileSync, writeFileSync } from "node:fs"; +import { resolve } from "node:path"; +import { jaiphError } from "../errors"; +import { parsejaiph } from "../parser"; +import { resolveImportPath } from "./resolve"; +import type { jaiphModule } from "../types"; + +/** + * `ModuleGraph` is the single representation of "all `.jh` modules reachable + * from an entry point, parsed once." `loadModuleGraph` is the only routine + * that reads and parses `.jh` sources; `validateReferences` and the script + * emitter both consume the graph without touching the filesystem for source + * or AST reads. + */ + +export interface ModuleNode { + filePath: string; + ast: jaiphModule; + /** alias → resolved absolute path of imported `.jh` module */ + imports: Map; +} + +export interface ModuleGraph { + entryFile: string; + workspaceRoot?: string; + modules: Map; +} + +function buildNode(filePath: string, ast: jaiphModule, workspaceRoot?: string): ModuleNode { + const imports = new Map(); + for (const imp of ast.imports) { + imports.set(imp.alias, resolveImportPath(filePath, imp.path, workspaceRoot)); + } + return { filePath, ast, imports }; +} + +/** + * Walks the entry plus its transitive `.jh` import closure. Each reachable + * file is read from disk and parsed exactly once. Import paths are resolved + * via {@link resolveImportPath} so library fallbacks behave as elsewhere in + * the toolchain. Missing imports are not surfaced here; the validator + * reports `E_IMPORT_NOT_FOUND` once it inspects the graph. + */ +export function loadModuleGraph(entryFile: string, workspaceRoot?: string): ModuleGraph { + const entry = resolve(entryFile); + const modules = new Map(); + type QueueEntry = { file: string; importer?: { file: string; alias: string; loc: { line: number; col: number } } }; + const queue: QueueEntry[] = [{ file: entry }]; + while (queue.length > 0) { + const { file: current, importer } = queue.shift()!; + if (modules.has(current)) continue; + if (!existsSync(current)) { + if (importer) { + throw jaiphError( + importer.file, + importer.loc.line, + importer.loc.col, + "E_IMPORT_NOT_FOUND", + `import "${importer.alias}" resolves to missing file "${current}"`, + ); + } + throw jaiphError(current, 1, 1, "E_IMPORT_NOT_FOUND", `entry file not found: "${current}"`); + } + const ast = parsejaiph(readFileSync(current, "utf8"), current); + const node = buildNode(current, ast, workspaceRoot); + modules.set(current, node); + for (const imp of ast.imports) { + const resolved = node.imports.get(imp.alias)!; + if (!modules.has(resolved)) { + queue.push({ file: resolved, importer: { file: current, alias: imp.alias, loc: imp.loc } }); + } + } + } + return { entryFile: entry, workspaceRoot, modules }; +} + +/** Build a graph from an already-parsed AST plus its workspace-resolved imports. Used by the cross-process deserializer. */ +export function moduleGraphFromAsts( + entryFile: string, + astByFile: Map, + workspaceRoot?: string, +): ModuleGraph { + const modules = new Map(); + for (const [filePath, ast] of astByFile) { + modules.set(filePath, buildNode(filePath, ast, workspaceRoot)); + } + return { entryFile: resolve(entryFile), workspaceRoot, modules }; +} + +/** Stable JSON encoding for cross-process transfer (entries sorted by absolute path). */ +export function serializeModuleGraph(graph: ModuleGraph): string { + const entries = [...graph.modules.entries()]; + entries.sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0)); + return JSON.stringify({ + entryFile: graph.entryFile, + workspaceRoot: graph.workspaceRoot ?? null, + modules: entries.map(([file, node]) => ({ file, ast: node.ast })), + }); +} + +export function deserializeModuleGraph(content: string): ModuleGraph { + const obj = JSON.parse(content) as { + entryFile: string; + workspaceRoot: string | null; + modules: Array<{ file: string; ast: jaiphModule }>; + }; + const astByFile = new Map(); + for (const m of obj.modules) astByFile.set(m.file, m.ast); + return moduleGraphFromAsts(obj.entryFile, astByFile, obj.workspaceRoot ?? undefined); +} + +export function writeModuleGraph(filePath: string, graph: ModuleGraph): void { + writeFileSync(filePath, serializeModuleGraph(graph), "utf8"); +} + +export function readModuleGraph(filePath: string): ModuleGraph { + return deserializeModuleGraph(readFileSync(filePath, "utf8")); +} diff --git a/src/transpile/pipeline-io-purity.test.ts b/src/transpile/pipeline-io-purity.test.ts new file mode 100644 index 00000000..8603ec45 --- /dev/null +++ b/src/transpile/pipeline-io-purity.test.ts @@ -0,0 +1,233 @@ +import { mkdtempSync, readdirSync, readFileSync, rmSync } from "node:fs"; +import { tmpdir } from "node:os"; +import { extname, join, resolve } from "node:path"; +import { test } from "node:test"; +import assert from "node:assert/strict"; + +import { parsejaiph } from "../parser"; +import { loadModuleGraph } from "./module-graph"; +import { validateReferences } from "./validate"; +import { buildScriptsFromGraph } from "../transpiler"; + +// `require("node:fs")` returns the real, mutable module exports; the +// TypeScript-emitted `__importStar` wrapper used by `import * as fs` builds a +// separate getter-only object that defeats monkey-patching, so the purity +// guards below patch through `require` instead. +// eslint-disable-next-line @typescript-eslint/no-var-requires +const realFs: typeof import("node:fs") = require("node:fs"); + +/** Parser fixtures — exercised stand-alone (parse only; broken imports are fine here). */ +const PARSER_FIXTURE_ROOTS = [ + resolve(process.cwd(), "test-fixtures/golden-ast/fixtures"), + resolve(process.cwd(), "test-fixtures/sample-build/fixtures"), + resolve(process.cwd(), "examples"), +]; + +/** + * Pipeline fixtures — must have a self-contained import closure so + * `loadModuleGraph` + `validateReferences` + emit can run end-to-end. + * `test-fixtures/golden-ast` is excluded because its `imports.jh` fixture + * references a stub `lib.jh` that does not ship alongside it. + */ +const PIPELINE_FIXTURE_ROOTS = [ + resolve(process.cwd(), "test-fixtures/sample-build/fixtures"), + resolve(process.cwd(), "examples"), +]; + +function listJhFiles(dir: string): string[] { + const out: string[] = []; + const stack = [dir]; + while (stack.length > 0) { + const current = stack.pop()!; + for (const entry of readdirSync(current, { withFileTypes: true })) { + const full = join(current, entry.name); + if (entry.isDirectory()) stack.push(full); + else if (entry.isFile() && extname(entry.name) === ".jh") out.push(full); + } + } + return out; +} + +/** + * Acceptance criterion 1: `parsejaiph(source, filePath)` is I/O-pure. With + * every fs entry point stubbed to throw for the duration of the call, + * parsing every fixture must still succeed because the parser never reaches + * `node:fs` at all. + */ +test("parser-io-purity: parsejaiph never touches node:fs for any fixture", () => { + const fixtures: Array<{ file: string; content: string }> = []; + for (const root of PARSER_FIXTURE_ROOTS) { + for (const file of listJhFiles(root)) { + fixtures.push({ file, content: readFileSync(file, "utf8") }); + } + } + assert.ok(fixtures.length > 0, "expected to find .jh fixtures to parse"); + + for (const { file, content } of fixtures) { + const guard = installFsGuard(() => true); + try { + const ast = parsejaiph(content, file); + assert.equal(ast.filePath, file, `parse produced unexpected filePath for ${file}`); + } finally { + guard.restore(); + } + } +}); + +/** + * Acceptance criterion 2: once the module graph is loaded, neither + * `validate(graph)` nor `emit(graph, outDir)` may reach the filesystem for + * `.jh` source or AST reads. Writing emitted bash files is allowed. + * + * The test loads each fixture (fs is unstubbed during load), then stubs + * `fs.readFileSync` / `fs.existsSync` to throw on any `.jh` path, and runs + * `validateReferences(graph)` plus a full script emit. Both must succeed. + */ +test("pipeline-io-purity: validate(graph) and emit(graph, outDir) never read .jh from disk", () => { + const entries: string[] = []; + for (const root of PIPELINE_FIXTURE_ROOTS) { + for (const file of listJhFiles(root)) { + // Skip *.test.jh — those are exercised by the test-runner path; the + // graph pipeline still loads them but they share the same purity + // guarantees and lengthen the test for no extra coverage. + if (file.endsWith(".test.jh")) continue; + entries.push(file); + } + } + assert.ok(entries.length > 0, "expected to find .jh fixtures"); + + for (const entry of entries) { + const graph = loadModuleGraph(entry); + const outDir = mkdtempSync(join(tmpdir(), "jaiph-emit-purity-")); + const guard = installFsGuard((path) => extname(path) === ".jh"); + try { + validateReferences(graph); + buildScriptsFromGraph(graph, outDir); + } finally { + guard.restore(); + rmSync(outDir, { recursive: true, force: true }); + } + } +}); + +/** + * Acceptance criterion 4: each `.jh` source file in a compile is parsed + * exactly once. The test creates a graph with transitive imports + * (entry → lib → leaf), counts `parsejaiph` invocations across + * `loadModuleGraph` + `validateReferences` + `buildScriptsFromGraph`, and + * asserts the count equals the number of unique modules. + */ +test("parse-once: full pipeline calls parsejaiph exactly once per reachable .jh module", () => { + const dir = mkdtempSync(join(tmpdir(), "jaiph-parse-once-")); + try { + const entry = join(dir, "main.jh"); + const libA = join(dir, "a.jh"); + const libB = join(dir, "b.jh"); + require("node:fs").writeFileSync(libA, "workflow a() {\n echo ok\n}\n", "utf8"); + require("node:fs").writeFileSync( + libB, + ['import "./a.jh" as a', "workflow b() {", " run a.a()", "}", ""].join("\n"), + "utf8", + ); + require("node:fs").writeFileSync( + entry, + ['import "./b.jh" as b', "workflow default() {", " run b.b()", "}", ""].join("\n"), + "utf8", + ); + + const counter = installParseCounter(); + try { + const graph = loadModuleGraph(entry); + validateReferences(graph); + const outDir = mkdtempSync(join(tmpdir(), "jaiph-parse-once-out-")); + try { + buildScriptsFromGraph(graph, outDir); + } finally { + rmSync(outDir, { recursive: true, force: true }); + } + assert.equal(graph.modules.size, 3); + assert.equal( + counter.byFile.size, + 3, + `expected 3 unique files parsed, got ${[...counter.byFile.keys()].join(", ")}`, + ); + for (const [file, count] of counter.byFile) { + assert.equal(count, 1, `file ${file} parsed ${count} times (expected 1)`); + } + } finally { + counter.restore(); + } + } finally { + rmSync(dir, { recursive: true, force: true }); + } +}); + +interface FsGuard { + restore(): void; +} + +/** + * Replace `fs.readFileSync`, `fs.existsSync`, `fs.statSync` so they throw + * when `shouldBlock(path)` returns true. Patching is done against the real + * `require("node:fs")` exports because the TS `__importStar` wrapper used + * by `import * as fs` returns getter-only properties. + */ +function installFsGuard(shouldBlock: (path: string) => boolean): FsGuard { + const orig = { + readFileSync: realFs.readFileSync, + existsSync: realFs.existsSync, + statSync: realFs.statSync, + }; + const guardCall = (name: string, path: unknown): void => { + if (typeof path !== "string") return; + if (shouldBlock(path)) { + throw new Error(`fs.${name} blocked by purity guard: ${path}`); + } + }; + const mutable = realFs as unknown as Record; + mutable.readFileSync = (path: unknown, opts?: unknown) => { + guardCall("readFileSync", path); + return orig.readFileSync(path as Parameters[0], opts as Parameters[1]); + }; + mutable.existsSync = (path: unknown) => { + guardCall("existsSync", path); + return orig.existsSync(path as Parameters[0]); + }; + mutable.statSync = (path: unknown, opts?: unknown) => { + guardCall("statSync", path); + return orig.statSync(path as Parameters[0], opts as Parameters[1]); + }; + return { + restore(): void { + mutable.readFileSync = orig.readFileSync; + mutable.existsSync = orig.existsSync; + mutable.statSync = orig.statSync; + }, + }; +} + +interface ParseCounter { + byFile: Map; + restore(): void; +} + +/** + * Replace the exported `parsejaiph` on the module so every call goes through + * a counting wrapper. Works because TypeScript's CJS output rewrites named + * imports as property reads against the module's exports object. + */ +function installParseCounter(): ParseCounter { + const parserMod = require("../parser") as { parsejaiph: typeof parsejaiph }; + const original = parserMod.parsejaiph; + const byFile = new Map(); + parserMod.parsejaiph = function counting(source: string, filePath: string) { + byFile.set(filePath, (byFile.get(filePath) ?? 0) + 1); + return original(source, filePath); + } as typeof parsejaiph; + return { + byFile, + restore(): void { + parserMod.parsejaiph = original; + }, + }; +} diff --git a/src/transpile/validate.ts b/src/transpile/validate.ts index 30627918..1a8ba196 100644 --- a/src/transpile/validate.ts +++ b/src/transpile/validate.ts @@ -1,6 +1,8 @@ +import { existsSync } from "node:fs"; import { dirname, resolve } from "node:path"; import { jaiphError } from "../errors"; import type { jaiphModule, MatchExprDef, WorkflowStepDef } from "../types"; +import type { ModuleGraph } from "./module-graph"; import type { SubstitutionValidateEnv } from "./validate-substitution"; import { validateManagedWorkflowShell } from "./validate-substitution"; import type { RefResolutionContext, RefTargetKind } from "./validate-ref-resolution"; @@ -28,14 +30,6 @@ import { dedentCommonLeadingWhitespace } from "../parse/dedent"; import { matchSendOperator } from "../parse/core"; import { tripleQuotedRawForRuntime } from "../runtime/orchestration-text"; -export interface ValidateContext { - resolveImportPath: (fromFile: string, importPath: string, workspaceRoot?: string) => string; - existsSync: (path: string) => boolean; - readFile: (path: string) => string; - parse: (content: string, filePath: string) => jaiphModule; - workspaceRoot?: string; -} - /** True when `<-` appears outside quotes (same idea as `matchSendOperator`). */ function hasUnquotedSendArrow(line: string): boolean { let inSingleQuote = false; @@ -492,7 +486,19 @@ export function resolveScriptImportPath(fromFile: string, importPath: string): s return resolve(dirname(fromFile), importPath); } -export function validateReferences(ast: jaiphModule, ctx: ValidateContext): void { +/** Validate every module in the graph. Equivalent to `validateModule` per entry, plus de-dup. */ +export function validateReferences(graph: ModuleGraph): void { + for (const node of graph.modules.values()) { + validateModule(node.ast, graph); + } +} + +/** + * Validate one module's references against the graph. Imported ASTs are read + * from `graph.modules` — no `.jh` filesystem access. `existsSync` is used + * only for `import script` paths, which point at non-`.jh` script bodies. + */ +export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { const localChannels = new Set(ast.channels.map((c) => c.name)); const localRules = new Set(ast.rules.map((r) => r.name)); const localWorkflows = new Set(ast.workflows.map((w) => w.name)); @@ -500,11 +506,13 @@ export function validateReferences(ast: jaiphModule, ctx: ValidateContext): void const importsByAlias = new Map(); const importedAstCache = new Map(); - // Validate script imports: resolve paths and check existence. + // Validate script imports: resolve paths and check existence. These point + // at non-`.jh` script bodies (resolved + emitted later), so `existsSync` is + // allowed here under acceptance criterion 2. if (ast.scriptImports) { for (const si of ast.scriptImports) { const resolved = resolveScriptImportPath(ast.filePath, si.path); - if (!ctx.existsSync(resolved)) { + if (!existsSync(resolved)) { throw jaiphError( ast.filePath, si.loc.line, @@ -517,6 +525,7 @@ export function validateReferences(ast: jaiphModule, ctx: ValidateContext): void } } + const node = graph.modules.get(ast.filePath); for (const imp of ast.imports) { if (importsByAlias.has(imp.alias)) { throw jaiphError( @@ -527,9 +536,19 @@ export function validateReferences(ast: jaiphModule, ctx: ValidateContext): void `duplicate import alias "${imp.alias}"`, ); } - const resolved = ctx.resolveImportPath(ast.filePath, imp.path, ctx.workspaceRoot); + const resolved = node?.imports.get(imp.alias); + if (!resolved) { + throw jaiphError( + ast.filePath, + imp.loc.line, + imp.loc.col, + "E_IMPORT_NOT_FOUND", + `import "${imp.alias}" could not be resolved`, + ); + } importsByAlias.set(imp.alias, resolved); - if (!ctx.existsSync(resolved)) { + const importedAst = graph.modules.get(resolved)?.ast; + if (!importedAst) { throw jaiphError( ast.filePath, imp.loc.line, @@ -538,7 +557,7 @@ export function validateReferences(ast: jaiphModule, ctx: ValidateContext): void `import "${imp.alias}" resolves to missing file "${resolved}"`, ); } - importedAstCache.set(resolved, ctx.parse(ctx.readFile(resolved), resolved)); + importedAstCache.set(resolved, importedAst); } const refCtx: RefResolutionContext = { diff --git a/src/transpiler.ts b/src/transpiler.ts index 9b493ac1..d6ceba0b 100644 --- a/src/transpiler.ts +++ b/src/transpiler.ts @@ -1,68 +1,52 @@ -import { existsSync, readFileSync } from "node:fs"; -import { dirname } from "node:path"; -import { parsejaiph } from "./parser"; -import { buildScripts as buildScriptsImpl, walkTestFiles } from "./transpile/build"; -import type { CompilePrep } from "./transpile/compile-prep"; -import { buildScriptFiles, type ScriptArtifact } from "./transpile/emit-script"; -import { resolveImportPath, workflowSymbolForFile } from "./transpile/resolve"; -import { resolveScriptImportPath, validateReferences } from "./transpile/validate"; +import type { ModuleGraph } from "./transpile/module-graph"; +import { loadModuleGraph } from "./transpile/module-graph"; +import { buildScripts as buildScriptsImpl, buildScriptsFromGraph as buildScriptsFromGraphImpl, walkTestFiles } from "./transpile/build"; +import { emitScriptsForModuleFromGraph } from "./transpile/emit-from-graph"; +import type { ScriptArtifact } from "./transpile/emit-script"; export { resolveImportPath, workflowSymbolForFile } from "./transpile/resolve"; export type { ScriptArtifact } from "./transpile/emit-script"; -export type { CompilePrep } from "./transpile/compile-prep"; +export type { ModuleGraph, ModuleNode } from "./transpile/module-graph"; +export { loadModuleGraph } from "./transpile/module-graph"; +export { emitScriptsForModuleFromGraph } from "./transpile/emit-from-graph"; /** - * Parse, validate, and extract per-`script` bash files for one module (no workflow bash emission). - * When `prep` is supplied, reuses already-parsed ASTs instead of re-reading from disk. + * Path-based wrapper for callers that don't already have a graph (tests and + * legacy entry points). Loads a single-entry graph and emits scripts for the + * entry module. Imported modules are validated transitively as part of the + * shared graph but their script bodies are not emitted from this call. */ export function emitScriptsForModule( inputFile: string, rootDir: string, workspaceRoot?: string, - prep?: CompilePrep, ): ScriptArtifact[] { - const cachedAst = prep?.astByFile.get(inputFile); - const ast = cachedAst ?? parsejaiph(readFileSync(inputFile, "utf8"), inputFile); - const readFile = prep - ? (path: string): string => (prep.astByFile.has(path) ? "" : readFileSync(path, "utf8")) - : (path: string): string => readFileSync(path, "utf8"); - const parse = prep - ? (content: string, filePath: string) => - prep.astByFile.get(filePath) ?? parsejaiph(content, filePath) - : parsejaiph; - validateReferences(ast, { - resolveImportPath, - existsSync, - readFile, - parse, - workspaceRoot, - }); - const workflowSymbol = workflowSymbolForFile(inputFile, rootDir); - const importedWorkflowSymbols = new Map(); - for (const imp of ast.imports) { - const importedFile = resolveImportPath(ast.filePath, imp.path, workspaceRoot); - importedWorkflowSymbols.set(imp.alias, workflowSymbolForFile(importedFile, rootDir)); - } - // Resolve script imports: read external script files so they are emitted as artifacts. - let resolvedScriptImports: Map | undefined; - if (ast.scriptImports && ast.scriptImports.length > 0) { - resolvedScriptImports = new Map(); - for (const si of ast.scriptImports) { - const resolved = resolveScriptImportPath(ast.filePath, si.path); - resolvedScriptImports.set(si.alias, readFileSync(resolved, "utf8")); - } - } - return buildScriptFiles(ast, importedWorkflowSymbols, workflowSymbol, resolvedScriptImports); + const graph = loadModuleGraph(inputFile, workspaceRoot); + return emitScriptsForModuleFromGraph(graph, graph.entryFile, rootDir); } export { walkTestFiles }; +/** + * Path-based wrapper. Loads the module graph and emits per-script bash files + * for every reachable module (file entry) or every non-test `.jh` under the + * directory (directory entry). Kept for tests and the `jaiph test` path. + */ export function buildScripts( inputPath: string, targetDir?: string, workspaceRoot?: string, - prep?: CompilePrep, ): { scriptsDir: string } { - const emitFn = (file: string, root: string) => emitScriptsForModule(file, root, workspaceRoot, prep); - return buildScriptsImpl(inputPath, targetDir, emitFn, workspaceRoot, prep); + return buildScriptsImpl(inputPath, targetDir, workspaceRoot); +} + +/** + * Graph-based entry point. Used by `jaiph run` where the parent CLI already + * built the graph and wants to skip a second discovery walk. + */ +export function buildScriptsFromGraph( + graph: ModuleGraph, + targetDir: string, +): { scriptsDir: string } { + return buildScriptsFromGraphImpl(graph, targetDir); } diff --git a/test-infra/compiler-test-runner.ts b/test-infra/compiler-test-runner.ts index 7db6c0cd..8302b7fe 100644 --- a/test-infra/compiler-test-runner.ts +++ b/test-infra/compiler-test-runner.ts @@ -1,11 +1,10 @@ import test from "node:test"; import assert from "node:assert/strict"; -import { readFileSync, writeFileSync, mkdtempSync, rmSync, readdirSync, existsSync } from "node:fs"; +import { readFileSync, writeFileSync, mkdtempSync, rmSync, readdirSync } from "node:fs"; import { join, resolve } from "node:path"; import { tmpdir } from "node:os"; -import { parsejaiph } from "../src/parser"; +import { loadModuleGraph } from "../src/transpile/module-graph"; import { validateReferences } from "../src/transpile/validate"; -import { resolveImportPath } from "../src/transpile/resolve"; // --- txtar parser --- @@ -119,13 +118,8 @@ function runTestCase(tc: TxtarTestCase): void { let caughtError: Error | undefined; try { - const ast = parsejaiph(readFileSync(entryPath, "utf8"), entryPath); - validateReferences(ast, { - resolveImportPath, - existsSync: (p: string) => existsSync(p), - readFile: (p: string) => readFileSync(p, "utf8"), - parse: parsejaiph, - }); + const graph = loadModuleGraph(entryPath); + validateReferences(graph); } catch (err) { caughtError = err as Error; } From f8fb2388082084f25c61a922f1827cf9337a4478 Mon Sep 17 00:00:00 2001 From: Jakub Dzikowski Date: Sat, 16 May 2026 12:06:59 +0200 Subject: [PATCH 06/14] Refactor: split source-fidelity data into a Trivia / CST layer MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Around ten formatter-only fields — leadingComments on imports / script imports / channels / const decls / test blocks, configLeadingComments, trailingTopLevelComments, configBodySequence, topLevelOrder, bareSource on return, tripleQuoted flags on literal/return/log/logerr/fail/send/ const, and prompt / script bodyKind / bodyIdentifier discriminators — are removed from jaiphModule, WorkflowStepDef, ConstRhs, SendRhsDef, WorkflowMetadata, ImportDef, ScriptImportDef, ChannelDef, ScriptDef, and TestBlockDef and re-homed in a new parallel Trivia store (src/parse/trivia.ts) keyed by AST-node identity plus a small ModuleTrivia record. The parser exposes parsejaiphWithTrivia → {ast, trivia}; legacy parsejaiph drops trivia for validator / transpiler / runtime / loadModuleGraph. The formatter (emitModule(ast, trivia, opts?)) is Trivia's only consumer. New tests pin the invariants: trivia-ast-shape.test.ts (AC1, type-level), trivia-grep.test.ts (AC2), and roundtrip.test.ts (AC3, parse → format → parse → format bit-for-bit on every fixture under examples/ and test-fixtures/golden-ast/fixtures/). Golden AST fixtures regenerated to drop the moved fields. User-visible contracts (CLI behavior, format round-trip, run artifacts, banner, hooks, exit codes, __JAIPH_EVENT__ streaming) are unchanged. Implements design/2026-05-15-parser-compiler-simplification.md § Appendix A. Co-Authored-By: Claude Opus 4.7 (1M context) --- CHANGELOG.md | 1 + QUEUE.md | 28 --- docs/architecture.md | 11 +- docs/contributing.md | 1 + src/cli/commands/format.ts | 8 +- src/format/emit.test.ts | 10 +- src/format/emit.ts | 181 ++++++++++-------- src/format/roundtrip.test.ts | 73 +++++++ src/parse/const-rhs.ts | 34 ++-- src/parse/metadata.ts | 8 +- src/parse/parse-interpreter-tags.test.ts | 22 +-- src/parse/parse-metadata.test.ts | 6 +- src/parse/parse-prompt.test.ts | 63 +++--- src/parse/parse-steps.test.ts | 2 - src/parse/prompt.ts | 61 +++--- src/parse/rules.ts | 4 +- src/parse/scripts.ts | 34 ++-- src/parse/send-rhs.ts | 8 +- src/parse/steps.ts | 46 +++-- src/parse/tests.ts | 10 +- src/parse/triple-quote.ts | 10 + src/parse/trivia-ast-shape.test.ts | 92 +++++++++ src/parse/trivia-grep.test.ts | 49 +++++ src/parse/trivia.ts | 78 ++++++++ src/parse/workflow-brace.ts | 66 ++++--- src/parse/workflows.ts | 3 + src/parser.ts | 51 +++-- src/runtime/kernel/graph.ts | 1 - src/runtime/kernel/node-workflow-runtime.ts | 29 +-- src/runtime/orchestration-text.ts | 10 +- src/transpile/validate-ref-resolution.test.ts | 4 +- src/transpile/validate-string.ts | 21 +- src/transpile/validate.ts | 74 ++++--- src/types.ts | 49 +---- .../golden-ast/expected/brace-if.json | 20 -- .../golden-ast/expected/imports.json | 6 - test-fixtures/golden-ast/expected/log.json | 6 - .../golden-ast/expected/match-multiline.json | 6 - test-fixtures/golden-ast/expected/match.json | 6 - test-fixtures/golden-ast/expected/params.json | 15 -- .../golden-ast/expected/prompt-capture.json | 7 - .../golden-ast/expected/run-ensure.json | 19 -- .../golden-ast/expected/script-defs.json | 21 -- 43 files changed, 721 insertions(+), 533 deletions(-) create mode 100644 src/format/roundtrip.test.ts create mode 100644 src/parse/trivia-ast-shape.test.ts create mode 100644 src/parse/trivia-grep.test.ts create mode 100644 src/parse/trivia.ts diff --git a/CHANGELOG.md b/CHANGELOG.md index 17e086ac..ebca52e1 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,6 @@ # Unreleased +- **Refactor — Split source-fidelity data from the semantic AST into a `Trivia` (CST) layer:** Around ten fields whose only consumer was the formatter — `leadingComments` on imports / script imports / channels / `const` decls / `test` blocks, `configLeadingComments`, `trailingTopLevelComments`, `configBodySequence` (both module- and workflow-scoped), `topLevelOrder`, `bareSource` on `return`, the `tripleQuoted` flags on `literal` / `return` / `log` / `logerr` / `fail` / `send` / `const`, and the prompt / script `bodyKind` / `bodyIdentifier` discriminators — are removed from `jaiphModule`, `WorkflowStepDef`, `ConstRhs`, `SendRhsDef`, `WorkflowMetadata`, `ImportDef`, `ScriptImportDef`, `ChannelDef`, `ScriptDef`, and `TestBlockDef`, and re-homed in a new parallel `Trivia` store (`src/parse/trivia.ts`) keyed by AST-node identity (per-node `WeakMap`) plus a small `ModuleTrivia` record for module-level data. The parser exposes `parsejaiphWithTrivia(source, filePath) → { ast, trivia }`; the legacy `parsejaiph(source, filePath)` is now a thin wrapper that drops trivia for callers that don't care (validator, transpiler, runtime, `loadModuleGraph`). The formatter (`emitModule(ast, trivia, opts?)`) is the only consumer of `Trivia`; validator, emitter, transpiler, and runtime never import from `src/parse/trivia.ts`. New tests pin the invariants: `src/parse/trivia-ast-shape.test.ts` is a compile-time assertion (with runtime echo) that none of the listed fields reappear on any semantic AST type (AC1); `src/parse/trivia-grep.test.ts` greps validator and emitter source files and fails if any of them references `Trivia` / `createTrivia` / `NodeTrivia` / `ModuleTrivia` or imports from `parse/trivia` (AC2); `src/format/roundtrip.test.ts` walks every `.jh` under `examples/` and `test-fixtures/golden-ast/fixtures/` and asserts `parse → format → parse → format` converges bit-for-bit (AC3). Golden AST fixtures under `test-fixtures/golden-ast/expected/` are regenerated to drop the moved fields. User-visible contracts (CLI behavior, `jaiph format` round-trip, run artifacts, banner, hooks, exit codes, `__JAIPH_EVENT__` streaming) are unchanged. `npm test` and `npm run build` pass with zero TypeScript strict-mode errors (AC4 / AC5). Out of scope: the `Expr` collapse — this refactor only relocates source-fidelity fields without changing the semantic AST's shape. Docs updated in `docs/architecture.md` (new **Trivia / CST layer** section with anchor `#trivia-cst-layer`, plus updated **Parser**, **AST / Types**, and **Formatter** bullets) and `docs/contributing.md` (new row in the test-layer table). Implements `design/2026-05-15-parser-compiler-simplification.md` § Appendix A. - **Refactor — `ModuleGraph` is the single representation of "all `.jh` modules reachable from an entry point, parsed once":** The previous three traversal strategies for compile-time module discovery (validator re-reading imports through `ValidateContext`, `emitScriptsForModule` re-wrapping the same callbacks with an optional `prep` cache, and `buildScripts` walking the filesystem directly) collapse to one path. `parsejaiph(source, filePath)` is now strictly I/O-pure — it can no longer reach `fs`. The single discovery routine `loadModuleGraph(entry, workspaceRoot?)` (`src/transpile/module-graph.ts`) walks the entry plus its transitive `import` closure and returns `{ entryFile, workspaceRoot?, modules: Map }`; every other compile-time consumer takes the graph and never re-reads `.jh` from disk. `validateReferences(graph)` and `emitScriptsForModuleFromGraph(graph, file, rootDir)` operate entirely in-memory. The `ValidateContext` interface (`resolveImportPath` / `existsSync` / `readFile` / `parse` / `workspaceRoot` callbacks) is deleted from `src/transpile/validate.ts`; the validator consumes the graph and uses `existsSync` only to resolve `import script` paths (non-`.jh` bodies). `CompilePrep` / `prepareCompile` / `writeCompilePrep` / `readCompilePrep` and the optional `prep?` parameter on `emitScriptsForModule` / `buildScripts` are gone; `buildScripts(input, outDir, ws?)` now loads a graph internally and `buildScriptsFromGraph(graph, outDir, rootDir?)` is the entry point for callers that already loaded one. `buildRuntimeGraph` accepts either an entry file path (legacy) or an already-loaded `ModuleGraph` — `RuntimeGraph` is a type alias for `ModuleGraph` (the only "all reachable modules" representation in the codebase). The cross-process cache file moves to `/.jaiph-module-graph.json` (deterministic JSON: entries sorted by absolute path, ASTs included verbatim) via `writeModuleGraph` / `readModuleGraph`, and the internal env var the spawned `node-workflow-runner.js` reads is renamed `JAIPH_MODULE_GRAPH_FILE` (replacing `JAIPH_COMPILE_PREP_FILE`). Scope of the env-var hand-off is unchanged: set only for the default local non-Docker `jaiph run` path; `jaiph run --raw`, `jaiph test`, and Docker launches fall back to `loadModuleGraph` from the source file. User-visible contracts — banner, hooks, run artifacts, `run_summary.jsonl`, `return_value.txt`, exit codes, `__JAIPH_EVENT__` streaming, CLI usage, and the full golden corpus (`compiler-golden.test.ts`, `compiler-edge.acceptance.test.ts`) — are unchanged byte-for-byte. New tests (`src/transpile/module-graph.test.ts`, `src/transpile/pipeline-io-purity.test.ts`) stub `node:fs` to throw on any `.jh` read and run the full pipeline against `test-fixtures/` to pin the I/O-purity invariant; another test instruments `parsejaiph` with a call counter to assert no duplicate parses across `loadModuleGraph` → `validateReferences` → `emit` → `buildRuntimeGraph` for fixtures with transitive imports. `src/transpile/compile-prep.ts` and `compile-prep.test.ts` are removed. Docs updated in `docs/architecture.md`, `docs/cli.md`, and `docs/testing.md`. Implements `design/2026-05-15-parser-compiler-simplification.md` § Refactor 5. - **Performance — `jaiph install` parallelism:** Missing-library clones now run in parallel through a small bounded-concurrency executor (default 4 in flight), replacing the previous sequential `execSync` loop. The user contract is unchanged: warm-path libraries (target directory exists and `--force` is absent) still skip without invoking `git` for both explicit args and restore-from-lock; failed clones still exit non-zero and do not produce a lock entry; restore-from-lock still does not invent new lock entries. The default clone runner now uses `spawn("git", ["clone", "--depth", "1", …])` so multiple clones can overlap network and process latency. `runInstall` is now `async` and exposes injectable `CloneRunner` / `concurrency` options for testing. Tests cover concurrent overlap (peak in-flight ≥ 2), warm-path skipping for explicit args and restore, invalid-remote and unknown-ref failure paths, mixed success/failure lockfile bookkeeping, and the existing corrupt/missing-lockfile behavior. Docs updated in `docs/cli.md` and `docs/libraries.md`. diff --git a/QUEUE.md b/QUEUE.md index 4be2bce2..a5940a72 100644 --- a/QUEUE.md +++ b/QUEUE.md @@ -13,34 +13,6 @@ Process rules: *** -## Split source-fidelity data from the semantic AST into a Trivia / CST layer #dev-ready - -**Design reference:** `design/2026-05-15-parser-compiler-simplification.md` § Appendix A. - -**Why:** `WorkflowStepDef` and `jaiphModule` today carry roughly ten fields whose only consumer is the formatter: `leadingComments`, `configLeadingComments`, `trailingTopLevelComments`, `configBodySequence`, `topLevelOrder`, `bareSource`, the `tripleQuoted` flags on literal/return/log/fail/send/const, `bodyKind`, `bodyIdentifier`. Every validator/emitter path has to ignore or thread these through unchanged. Pulling them out before the AST is collapsed (next task) lets the new `Expr` shape be designed against the *semantic* core only. - -**Scope:** - -- Introduce a `Trivia` layer (parallel map keyed by node id, or a CST node with both a semantic and a syntactic side) that owns all source-fidelity data currently on the AST. -- Every formatter-only field listed above is removed from `WorkflowStepDef`, `jaiphModule`, `ConstRhs`, `SendRhsDef`, and any other AST type, and re-homed in `Trivia`. -- `parsejaiph` returns `{ ast, trivia }` (or equivalent) instead of a single fat AST. -- The formatter is rewritten to read from `Trivia` alongside the AST. No other consumer (validator, emitter, transpiler, runtime) reads `Trivia` at all. -- Round-trip behavior is bit-for-bit identical for every fixture under `test-fixtures/` and `examples/`. - -**Acceptance criteria** (each verified by a test): - -1. None of the listed fields appear on any `WorkflowStepDef` variant, `jaiphModule`, `ConstRhs`, `SendRhsDef`, or other semantic AST type. A type-level test fails if any of them reappears. -2. Validator and emitter source files do not reference `Trivia` or its fields. A grep test fails if they do. -3. Formatter round-trip is bit-for-bit on every fixture under `test-fixtures/` and `examples/`. Add an explicit test that parses → formats → parses → formats and asserts both formatted outputs match. -4. `npm test` passes, including formatter round-trip tests and the golden corpus. -5. `npm run build` passes; TypeScript strict-mode errors are zero. - -**Out of scope:** the `Expr` collapse (next task) — this refactor only relocates source-fidelity fields, it does not change the semantic AST's shape. Surface syntax. - -**Dependency:** Refactor 5 (ModuleGraph, previous task) should be complete first so the parser is already I/O-pure when its return shape changes. - -*** - ## Collapse `bareIdentifierArgs` into a typed `Arg[]` on every call site #dev-ready **Design reference:** `design/2026-05-15-parser-compiler-simplification.md` § Appendix D. diff --git a/docs/architecture.md b/docs/architecture.md index d6f9a666..7c7c1874 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -36,11 +36,16 @@ All orchestration — local `jaiph run`, `jaiph test`, and **Docker `jaiph run`* - Parses runtime events and renders progress (except `--raw`); dispatches hooks. - **Parser (`src/parser.ts`, `src/parse/*`)** - - Converts `.jh`/`.test.jh` into `jaiphModule` AST. + - Converts `.jh`/`.test.jh` into a **semantic AST** (`jaiphModule`) plus a parallel **`Trivia`** store of source-fidelity data. `parsejaiphWithTrivia(source, filePath)` returns `{ ast, trivia }`; the legacy `parsejaiph(source, filePath)` is a thin wrapper that returns only the `ast` for consumers that don't need round-trip data. Both entry points are I/O-pure. - Reusable primitives: `parseFencedBlock()` (`src/parse/fence.ts`) handles triple-backtick fenced bodies with optional lang tokens for scripts and inline scripts. `parseTripleQuoteBlock()` (`src/parse/triple-quote.ts`) handles `"""..."""` blocks for prompts, `const`, `log`, `logerr`, `fail`, `return`, and `send` — all positions where multiline strings appear. - **AST / Types (`src/types.ts`)** - - Shared compile-time schema (`jaiphModule`, step defs, test defs, hook payload types). + - Shared compile-time schema (`jaiphModule`, step defs, test defs, hook payload types). The semantic AST carries **only** what the validator, emitter, transpiler, and runtime need; surface-form data that exists purely to round-trip the formatter (leading comments on imports / channels / `const` / `test` blocks, top-level emit order, `config` body sequence, `"""..."""` flags on `literal` / `return` / `log` / `logerr` / `fail` / `send` / `const`, the `bareSource` of `return `, and prompt / script `bodyKind` discriminators) lives in **`Trivia`** instead — see [Trivia (CST layer)](#trivia-cst-layer). + +- **Trivia / CST layer (`src/parse/trivia.ts`)** + {: #trivia-cst-layer} + - `Trivia` is a parallel store keyed by AST-node identity (per-node via `WeakMap`) and a small `ModuleTrivia` record for module-level data. The parser builds it alongside the AST; **only the formatter reads it**. Validator, emitter, transpiler, and runtime never import from `src/parse/trivia.ts` — a grep test (`src/parse/trivia-grep.test.ts`) pins this invariant by rejecting any reference to `Trivia` / `createTrivia` / `NodeTrivia` / `ModuleTrivia` from validator and emitter source files. + - A separate type-shape test (`src/parse/trivia-ast-shape.test.ts`) asserts at compile time that none of the formatter-only fields reappear on `jaiphModule`, `ImportDef`, `ScriptImportDef`, `ChannelDef`, `TestBlockDef`, `WorkflowMetadata`, `ScriptDef`, or any `WorkflowStepDef` / `ConstRhs` / `SendRhsDef` variant. - **Validator (`src/transpile/validate.ts`)** - Resolves imports and symbol references; emits deterministic compile-time errors. Import resolution (`resolveImportPath` in `transpile/resolve.ts`) checks relative paths first, then falls back to project-scoped libraries under `/.jaiph/libs/` — the workspace root is threaded through all compilation call sites. Export visibility is enforced by `validateRef` in `validate-ref-resolution.ts`: if an imported module declares any `export`, only exported names are reachable through the import alias. @@ -64,7 +69,7 @@ All orchestration — local `jaiph run`, `jaiph test`, and **Docker `jaiph run`* - Prompt execution (`prompt.ts`), streaming parse (`stream-parser.ts`), schema (`schema.ts`), **`mock.ts`** (sequential prompt responses / mock-arm dispatch from test env JSON), **`runtime-mock.ts`** (mock workflow/rule/script **bodies** for `*.test.jh`), **`emit.ts`** (durable **`run_summary.jsonl`** helpers — `appendRunSummaryLine`, `formatUtcTimestamp` — consumed by `RuntimeEventEmitter`), **`workflow-launch.ts`** (spawn contract). **`RuntimeEventEmitter`** (`runtime-event-emitter.ts`) owns live **`__JAIPH_EVENT__`** lines on stderr and coordinates summary writes plus step/prompt sequence counters. Script subprocesses are launched directly from `NodeWorkflowRuntime`. - **Formatter (`src/format/emit.ts`)** - - `jaiph format` rewrites `.jh` / `.test.jh` files into canonical style. Pure AST→text emitter; no side-effects beyond file writes. + - `jaiph format` rewrites `.jh` / `.test.jh` files into canonical style. `emitModule(ast, trivia, opts?)` reads the semantic AST together with the parallel **`Trivia`** store ([Trivia (CST layer)](#trivia-cst-layer)) to round-trip leading comments, top-level order, `config` body sequence, `"""..."""` and `bareSource` forms, and prompt / script body discriminators. Pure data→text emitter; no side-effects beyond file writes. Round-trip is bit-for-bit on every fixture under `examples/` and `test-fixtures/golden-ast/fixtures/` — pinned by `src/format/roundtrip.test.ts`, which asserts `parse → format → parse → format` converges in one step on every fixture. - **Docker runtime helper (`src/runtime/docker.ts`)** - Parses mount specs, resolves Docker config (image, network, timeout), and builds the `docker run` invocation when the CLI enables **Docker sandboxing** for `jaiph run` (environment-driven; there is no `jaiph run --docker` flag — see [Sandboxing](sandboxing.md)). The container runs the same `node-workflow-runner` entry as local execution. The default image is the official `ghcr.io/jaiphlang/jaiph-runtime` GHCR image; every selected image must already contain `jaiph` (no auto-install or derived-image build at runtime). Image preparation (`prepareImage`) runs before the CLI banner: it checks whether the image is local, pulls with `--quiet` if needed (short status lines on stderr instead of Docker’s default pull UI), and verifies that `jaiph` exists in the image. `spawnDockerProcess` does not pull or verify — it receives a pre-resolved image. The spawn call uses `stdio: ["ignore", "pipe", "pipe"]` — stdin is ignored so the Docker CLI does not block on stdin EOF, which would stall event streaming and hang the host CLI after the container exits. diff --git a/docs/contributing.md b/docs/contributing.md index 15f54ffe..793d0bea 100644 --- a/docs/contributing.md +++ b/docs/contributing.md @@ -102,6 +102,7 @@ Jaiph uses several test layers. Each layer catches a different class of bug. Use | **Module tests** | `src/**/*.test.ts` (colocated) | Bugs in pure functions (event parsing, param formatting, path resolution, config merging) | The function is self-contained, takes input and returns output, no I/O | | **Compiler acceptance tests** | `src/transpile/*.acceptance.test.ts` (colocated) | Cross-module compiler behavior: validation errors, resolution, and other cases that need a temp project tree or subprocess | You need a deterministic error string, multi-file `buildScripts`, or behavior that does not fit a tiny golden snippet | | **Compiler golden tests** | `src/transpile/compiler-golden.test.ts` (colocated) | Regressions in the parser, validation messages, and scripts-only extraction (`buildScriptFiles` in `emit-script.ts`) — expectations are inline in the test file | You changed the parser, validator, or script extraction and need to lock an exact error string, extracted script shape, or corpus behavior | +| **Trivia / formatter round-trip** | `src/parse/trivia-ast-shape.test.ts`, `src/parse/trivia-grep.test.ts`, `src/format/roundtrip.test.ts` | Source-fidelity invariants: no trivia fields on semantic AST types (compile-time), validator/emitter sources do not reference `Trivia`, and `parse → format → parse → format` is bit-for-bit on every fixture under `examples/` and `test-fixtures/golden-ast/fixtures/` | You changed the parser, formatter, AST types, or anything that touches source-fidelity round-trip (see [Architecture — Trivia (CST layer)](architecture.md#trivia-cst-layer)) | | **Compiler tests (txtar)** | `test-fixtures/compiler-txtar/*.txt` | Parse and validate outcomes — success, parse errors, validation errors — using language-agnostic txtar fixtures (hundreds of `===` cases across the four `*.txt` files) | You want a portable test case that can be reused by alternative compiler implementations; the test is a `.jh` input paired with an expected outcome | | **Golden AST tests** | `test-fixtures/golden-ast/fixtures/*.jh` + `test-fixtures/golden-ast/expected/*.json` | Parse tree shape for successful parses — serialized to deterministic JSON with locations stripped (9 fixtures: e.g. imports, brace-if, log, match and match-multiline, params, prompt-capture, run-ensure, script-defs) | You changed the parser and need to verify the AST structure hasn't drifted; txtar tests only check pass/fail, goldens lock in the actual tree shape | | **Integration tests** | `integration/*.test.ts`, `integration/sample-build/*.test.ts` | Process-level integration behavior: signal handling, TTY rendering, run summary structure, sample builds | The test spans multiple modules or requires subprocess/PTY harnesses | diff --git a/src/cli/commands/format.ts b/src/cli/commands/format.ts index 05162bef..8a9b6aa8 100644 --- a/src/cli/commands/format.ts +++ b/src/cli/commands/format.ts @@ -1,6 +1,6 @@ import { readFileSync, writeFileSync } from "node:fs"; import { resolve } from "node:path"; -import { parsejaiph } from "../../parser"; +import { parsejaiphWithTrivia } from "../../parser"; import { emitModule } from "../../format/emit"; export function runFormat(args: string[]): number { @@ -52,16 +52,16 @@ export function runFormat(args: string[]): number { const firstLine = source.split(/\r?\n/, 1)[0]; const shebang = firstLine.startsWith("#!") ? firstLine : null; - let mod; + let parsed; try { - mod = parsejaiph(source, abs); + parsed = parsejaiphWithTrivia(source, abs); } catch (err) { const msg = err instanceof Error ? err.message : String(err); process.stderr.write(`parse error: ${msg}\n`); return 1; } - let formatted = emitModule(mod, { indent }); + let formatted = emitModule(parsed.ast, parsed.trivia, { indent }); if (shebang) { formatted = shebang + "\n\n" + formatted; } diff --git a/src/format/emit.test.ts b/src/format/emit.test.ts index 450b827f..7262f79d 100644 --- a/src/format/emit.test.ts +++ b/src/format/emit.test.ts @@ -1,11 +1,11 @@ import { describe, it } from "node:test"; import assert from "node:assert/strict"; -import { parsejaiph } from "../parser"; +import { parsejaiphWithTrivia } from "../parser"; import { emitModule } from "./emit"; function roundTrip(source: string, filePath = "test.jh"): string { - const mod = parsejaiph(source, filePath); - return emitModule(mod); + const { ast, trivia } = parsejaiphWithTrivia(source, filePath); + return emitModule(ast, trivia); } describe("emitModule", () => { @@ -166,8 +166,8 @@ describe("emitModule", () => { "}", "", ].join("\n"); - const mod = parsejaiph(input, "test.jh"); - assert.equal(emitModule(mod, { indent: 4 }), expected); + const { ast, trivia } = parsejaiphWithTrivia(input, "test.jh"); + assert.equal(emitModule(ast, trivia, { indent: 4 }), expected); }); it("reorders out-of-order definitions to canonical order", () => { diff --git a/src/format/emit.ts b/src/format/emit.ts index f1315f22..9ed3827c 100644 --- a/src/format/emit.ts +++ b/src/format/emit.ts @@ -14,6 +14,7 @@ import type { TopLevelEmitOrder, } from "../types"; import { parseCallRef } from "../parse/core"; +import { createTrivia, type NodeTrivia, type Trivia } from "../parse/trivia"; export interface EmitOptions { indent: number; @@ -21,6 +22,11 @@ export interface EmitOptions { const DEFAULT_OPTIONS: EmitOptions = { indent: 2 }; +/** Lookup helper: trivia entry for a node, with safe empty default. */ +function tn(trivia: Trivia, node: object): NodeTrivia { + return trivia.getNode(node) ?? {}; +} + /** When `topLevelOrder` is missing (hand-built AST), match pre–source-order emit behavior. */ function legacyTopLevelOrder(mod: jaiphModule): TopLevelEmitOrder[] { const o: TopLevelEmitOrder[] = []; @@ -36,14 +42,30 @@ function legacyTopLevelOrder(mod: jaiphModule): TopLevelEmitOrder[] { return o; } -function topLevelOrderForEmit(mod: jaiphModule): TopLevelEmitOrder[] { - if (mod.topLevelOrder && mod.topLevelOrder.length > 0) return mod.topLevelOrder; +function topLevelOrderForEmit(mod: jaiphModule, trivia: Trivia): TopLevelEmitOrder[] { + const order = trivia.getModule().topLevelOrder; + if (order && order.length > 0) return order; return legacyTopLevelOrder(mod); } -export function emitModule(mod: jaiphModule, opts: EmitOptions = DEFAULT_OPTIONS): string { +export function emitModule( + mod: jaiphModule, + triviaOrOpts: Trivia | EmitOptions = createTrivia(), + optsArg?: EmitOptions, +): string { + // Backwards-compatible: callers may pass (mod, opts) when they don't care about trivia. + let trivia: Trivia; + let opts: EmitOptions; + if (triviaOrOpts instanceof Object && "indent" in triviaOrOpts && !("getModule" in triviaOrOpts)) { + trivia = createTrivia(); + opts = triviaOrOpts as EmitOptions; + } else { + trivia = triviaOrOpts as Trivia; + opts = optsArg ?? DEFAULT_OPTIONS; + } const sections: string[] = []; const pad = " ".repeat(opts.indent); + const modTrivia = trivia.getModule(); // Shebang — we don't store it in the AST, so the caller must prepend it if needed. // (handled by the format command reading the first line of the original source) @@ -51,16 +73,14 @@ export function emitModule(mod: jaiphModule, opts: EmitOptions = DEFAULT_OPTIONS const importLines: string[] = []; if (mod.scriptImports) { for (const si of mod.scriptImports) { - if (si.leadingComments?.length) { - importLines.push(emitCommentBlock(si.leadingComments)); - } + const lc = tn(trivia, si).leadingComments; + if (lc?.length) importLines.push(emitCommentBlock(lc)); importLines.push(`import script "${si.path}" as ${si.alias}`); } } for (const imp of mod.imports) { - if (imp.leadingComments?.length) { - importLines.push(emitCommentBlock(imp.leadingComments)); - } + const lc = tn(trivia, imp).leadingComments; + if (lc?.length) importLines.push(emitCommentBlock(lc)); importLines.push(`import "${imp.path}" as ${imp.alias}`); } if (importLines.length > 0) { @@ -68,17 +88,16 @@ export function emitModule(mod: jaiphModule, opts: EmitOptions = DEFAULT_OPTIONS } if (mod.metadata) { - if (mod.configLeadingComments?.length) { - sections.push(emitCommentBlock(mod.configLeadingComments)); + if (modTrivia.configLeadingComments?.length) { + sections.push(emitCommentBlock(modTrivia.configLeadingComments)); } - sections.push(emitConfig(mod.metadata, pad)); + sections.push(emitConfig(mod.metadata, pad, trivia)); } const channelLines: string[] = []; for (const ch of mod.channels) { - if (ch.leadingComments?.length) { - channelLines.push(emitCommentBlock(ch.leadingComments)); - } + const lc = tn(trivia, ch).leadingComments; + if (lc?.length) channelLines.push(emitCommentBlock(lc)); channelLines.push(emitChannel(ch)); } if (channelLines.length > 0) { @@ -87,7 +106,7 @@ export function emitModule(mod: jaiphModule, opts: EmitOptions = DEFAULT_OPTIONS const exportedNames = new Set(mod.exports); - for (const item of topLevelOrderForEmit(mod)) { + for (const item of topLevelOrderForEmit(mod, trivia)) { if (item.kind === "env") { const env = mod.envDecls![item.index]; const envLines: string[] = []; @@ -99,12 +118,12 @@ export function emitModule(mod: jaiphModule, opts: EmitOptions = DEFAULT_OPTIONS continue; } if (item.kind === "rule") { - sections.push(emitRule(mod.rules[item.index], pad, exportedNames.has(mod.rules[item.index].name))); + sections.push(emitRule(mod.rules[item.index], pad, exportedNames.has(mod.rules[item.index].name), trivia)); continue; } if (item.kind === "script") { sections.push( - emitScript(mod.scripts[item.index], pad, exportedNames.has(mod.scripts[item.index].name)), + emitScript(mod.scripts[item.index], pad, exportedNames.has(mod.scripts[item.index].name), trivia), ); continue; } @@ -114,15 +133,16 @@ export function emitModule(mod: jaiphModule, opts: EmitOptions = DEFAULT_OPTIONS mod.workflows[item.index], pad, exportedNames.has(mod.workflows[item.index].name), + trivia, ), ); continue; } - sections.push(emitTestBlock(mod.tests![item.index], pad)); + sections.push(emitTestBlock(mod.tests![item.index], pad, trivia)); } - if (mod.trailingTopLevelComments?.length) { - sections.push(emitCommentBlock(mod.trailingTopLevelComments)); + if (modTrivia.trailingTopLevelComments?.length) { + sections.push(emitCommentBlock(modTrivia.trailingTopLevelComments)); } return sections.join("\n\n") + "\n"; @@ -185,10 +205,11 @@ function emitConfigKeyLines(meta: WorkflowMetadata, key: string, pad: string): s } } -function emitConfig(meta: WorkflowMetadata, pad: string): string { +function emitConfig(meta: WorkflowMetadata, pad: string, trivia: Trivia): string { const lines: string[] = ["config {"]; - if (meta.configBodySequence?.length) { - for (const part of meta.configBodySequence) { + const seq = trivia.getNode(meta)?.configBodySequence; + if (seq?.length) { + for (const part of seq) { if (part.kind === "comment") { lines.push(`${pad}${part.text}`); } else { @@ -255,22 +276,23 @@ function emitCommentBlock(comments: string[]): string { return emitComments(comments).join("\n"); } -function emitRule(rule: RuleDef, pad: string, exported: boolean): string { +function emitRule(rule: RuleDef, pad: string, exported: boolean, trivia: Trivia): string { const lines: string[] = []; lines.push(...emitComments(rule.comments)); const paramStr = `(${rule.params.join(", ")})`; const prefix = exported ? "export " : ""; lines.push(`${prefix}rule ${rule.name}${paramStr} {`); - lines.push(...emitSteps(rule.steps, pad, pad)); + lines.push(...emitSteps(rule.steps, pad, pad, trivia)); lines.push("}"); return lines.join("\n"); } -function emitScript(script: ScriptDef, _pad: string, exported: boolean): string { +function emitScript(script: ScriptDef, _pad: string, exported: boolean, trivia: Trivia): string { const lines: string[] = []; lines.push(...emitComments(script.comments)); const prefix = exported ? "export " : ""; - if (script.bodyKind === "fenced" || script.lang || script.body.includes("\n")) { + const bodyKind = tn(trivia, script).scriptBodyKind; + if (bodyKind === "fenced" || script.lang || script.body.includes("\n")) { const langTag = script.lang ?? ""; lines.push(`${prefix}script ${script.name} = \`\`\`${langTag}`); for (const bl of script.body.split("\n")) { @@ -283,7 +305,7 @@ function emitScript(script: ScriptDef, _pad: string, exported: boolean): string return lines.join("\n"); } -function emitWorkflow(wf: WorkflowDef, pad: string, exported: boolean): string { +function emitWorkflow(wf: WorkflowDef, pad: string, exported: boolean, trivia: Trivia): string { const lines: string[] = []; lines.push(...emitComments(wf.comments)); @@ -292,13 +314,13 @@ function emitWorkflow(wf: WorkflowDef, pad: string, exported: boolean): string { lines.push(`${prefix}workflow ${wf.name}${paramStr} {`); if (wf.metadata) { - const configLines = emitConfig(wf.metadata, pad); + const configLines = emitConfig(wf.metadata, pad, trivia); for (const cl of configLines.split("\n")) { lines.push(`${pad}${cl}`); } } - lines.push(...emitSteps(wf.steps, pad, pad)); + lines.push(...emitSteps(wf.steps, pad, pad, trivia)); lines.push("}"); return lines.join("\n"); @@ -329,10 +351,10 @@ function emitLogMessageRhs(message: string): string { return JSON.stringify(message); } -function emitSteps(steps: WorkflowStepDef[], pad: string, currentIndent: string): string[] { +function emitSteps(steps: WorkflowStepDef[], pad: string, currentIndent: string, trivia: Trivia): string[] { const lines: string[] = []; for (const step of steps) { - lines.push(...emitStep(step, pad, currentIndent)); + lines.push(...emitStep(step, pad, currentIndent, trivia)); } return lines; } @@ -470,9 +492,10 @@ function emitMatchArm(arm: import("../types").MatchArmDef, armIndent: string, bo return [`${armIndent}${patStr} => ${arm.body}`]; } -function emitStep(step: WorkflowStepDef, pad: string, currentIndent: string): string[] { +function emitStep(step: WorkflowStepDef, pad: string, currentIndent: string, trivia: Trivia): string[] { const lines: string[] = []; const ci = currentIndent; + const stepTrivia = tn(trivia, step); switch (step.type) { case "blank_line": @@ -499,12 +522,12 @@ function emitStep(step: WorkflowStepDef, pad: string, currentIndent: string): st const b = step.catch.bindings; const bindStr = `(${b.failure})`; if ("single" in step.catch) { - const recoverLines = emitStep(step.catch.single, pad, ""); + const recoverLines = emitStep(step.catch.single, pad, "", trivia); const recoverText = recoverLines.map((l) => l.trim()).join("\n"); lines.push(`${ci}${capture}ensure ${ref} catch ${bindStr} ${recoverText}`); } else { lines.push(`${ci}${capture}ensure ${ref} catch ${bindStr} {`); - lines.push(...emitSteps(step.catch.block, pad, ci + pad)); + lines.push(...emitSteps(step.catch.block, pad, ci + pad, trivia)); lines.push(`${ci}}`); } } else { @@ -521,24 +544,24 @@ function emitStep(step: WorkflowStepDef, pad: string, currentIndent: string): st const b = step.recover.bindings; const bindStr = `(${b.failure})`; if ("single" in step.recover) { - const recoverLines = emitStep(step.recover.single, pad, ""); + const recoverLines = emitStep(step.recover.single, pad, "", trivia); const recoverText = recoverLines.map((l) => l.trim()).join("\n"); lines.push(`${ci}${capture}run ${asyncPrefix}${ref} recover ${bindStr} ${recoverText}`); } else { lines.push(`${ci}${capture}run ${asyncPrefix}${ref} recover ${bindStr} {`); - lines.push(...emitSteps(step.recover.block, pad, ci + pad)); + lines.push(...emitSteps(step.recover.block, pad, ci + pad, trivia)); lines.push(`${ci}}`); } } else if (step.catch) { const b = step.catch.bindings; const bindStr = `(${b.failure})`; if ("single" in step.catch) { - const recoverLines = emitStep(step.catch.single, pad, ""); + const recoverLines = emitStep(step.catch.single, pad, "", trivia); const recoverText = recoverLines.map((l) => l.trim()).join("\n"); lines.push(`${ci}${capture}run ${asyncPrefix}${ref} catch ${bindStr} ${recoverText}`); } else { lines.push(`${ci}${capture}run ${asyncPrefix}${ref} catch ${bindStr} {`); - lines.push(...emitSteps(step.catch.block, pad, ci + pad)); + lines.push(...emitSteps(step.catch.block, pad, ci + pad, trivia)); lines.push(`${ci}}`); } } else { @@ -566,10 +589,12 @@ function emitStep(step: WorkflowStepDef, pad: string, currentIndent: string): st case "prompt": { const capture = step.captureName ? `${step.captureName} = ` : ""; const returns = step.returns ? ` returns "${step.returns}"` : ""; - if (step.bodyKind === "identifier" && step.bodyIdentifier) { - lines.push(`${ci}${capture}prompt ${step.bodyIdentifier}${returns}`); - } else if (step.bodyKind === "triple_quoted") { - const inner = step.raw.slice(1, -1).replace(/\\"/g, '"').replace(/\\\\/g, "\\"); + const bodyKind = stepTrivia.bodyKind; + const bodyIdentifier = stepTrivia.bodyIdentifier; + if (bodyKind === "identifier" && bodyIdentifier) { + lines.push(`${ci}${capture}prompt ${bodyIdentifier}${returns}`); + } else if (bodyKind === "triple_quoted") { + const inner = stepTrivia.rawBody ?? step.raw.slice(1, -1).replace(/\\"/g, '"').replace(/\\\\/g, "\\"); lines.push(`${ci}${capture}prompt """`); for (const bl of inner.split("\n")) { lines.push(bl); @@ -585,7 +610,8 @@ function emitStep(step: WorkflowStepDef, pad: string, currentIndent: string): st } case "const": { - lines.push(`${ci}${emitConstStep(step.name, step.value)}`); + const valueTrivia = tn(trivia, step.value); + lines.push(`${ci}${emitConstStep(step.name, step.value, valueTrivia)}`); // Handle multi-line inline script capture body if (step.value.kind === "run_inline_script_capture" && (step.value.lang || step.value.body.includes("\n"))) { @@ -596,8 +622,8 @@ function emitStep(step: WorkflowStepDef, pad: string, currentIndent: string): st lines.push(`${ci}\`\`\`(${argsStr})`); } // Handle multi-line triple-quoted prompt capture body - if (step.value.kind === "prompt_capture" && step.value.bodyKind === "triple_quoted") { - const inner = step.value.raw.slice(1, -1).replace(/\\"/g, '"').replace(/\\\\/g, "\\"); + if (step.value.kind === "prompt_capture" && valueTrivia.bodyKind === "triple_quoted") { + const inner = valueTrivia.rawBody ?? step.value.raw.slice(1, -1).replace(/\\"/g, '"').replace(/\\\\/g, "\\"); for (const bl of inner.split("\n")) { lines.push(bl); } @@ -614,9 +640,8 @@ function emitStep(step: WorkflowStepDef, pad: string, currentIndent: string): st lines.push(`${ci}}`); } // Handle multi-line triple-quoted expr (const name = """...""") - if (step.value.kind === "expr" && step.value.bashRhs.startsWith('"') && - step.value.bashRhs.endsWith('"') && step.value.bashRhs.includes("\n")) { - const inner = step.value.bashRhs.slice(1, -1).replace(/\\"/g, '"').replace(/\\\\/g, "\\"); + if (step.value.kind === "expr" && valueTrivia.tripleQuoted) { + const inner = valueTrivia.rawBody ?? step.value.bashRhs.slice(1, -1).replace(/\\"/g, '"').replace(/\\\\/g, "\\"); for (const bl of inner.split("\n")) { lines.push(bl); } @@ -626,8 +651,8 @@ function emitStep(step: WorkflowStepDef, pad: string, currentIndent: string): st } case "fail": { - if (step.message.includes("\n")) { - const inner = step.message.slice(1, -1).replace(/\\"/g, '"').replace(/\\\\/g, "\\"); + if (stepTrivia.tripleQuoted) { + const inner = stepTrivia.rawBody ?? step.message.slice(1, -1).replace(/\\"/g, '"').replace(/\\\\/g, "\\"); lines.push(`${ci}fail """`); for (const bl of inner.split("\n")) { lines.push(bl); @@ -642,9 +667,10 @@ function emitStep(step: WorkflowStepDef, pad: string, currentIndent: string): st case "log": if (step.managed?.kind === "run_inline_script") { lines.push(...emitInlineScriptLines(`${ci}log run`, step.managed.body, step.managed.lang, step.managed.args, step.managed.bareIdentifierArgs, ci)); - } else if (step.message.includes("\n")) { + } else if (stepTrivia.tripleQuoted) { + const inner = stepTrivia.rawBody ?? step.message; lines.push(`${ci}log """`); - for (const bl of step.message.split("\n")) { + for (const bl of inner.split("\n")) { lines.push(bl); } lines.push(`${ci}"""`); @@ -656,9 +682,10 @@ function emitStep(step: WorkflowStepDef, pad: string, currentIndent: string): st case "logerr": if (step.managed?.kind === "run_inline_script") { lines.push(...emitInlineScriptLines(`${ci}logerr run`, step.managed.body, step.managed.lang, step.managed.args, step.managed.bareIdentifierArgs, ci)); - } else if (step.message.includes("\n")) { + } else if (stepTrivia.tripleQuoted) { + const inner = stepTrivia.rawBody ?? step.message; lines.push(`${ci}logerr """`); - for (const bl of step.message.split("\n")) { + for (const bl of inner.split("\n")) { lines.push(bl); } lines.push(`${ci}"""`); @@ -682,10 +709,10 @@ function emitStep(step: WorkflowStepDef, pad: string, currentIndent: string): st } else if (step.managed.kind === "run_inline_script") { lines.push(...emitInlineScriptLines(`${ci}return run`, step.managed.body, step.managed.lang, step.managed.args, step.managed.bareIdentifierArgs, ci)); } - } else if (step.bareSource) { - lines.push(`${ci}return ${step.bareSource}`); - } else if (step.value.includes("\n")) { - const inner = step.value.slice(1, -1).replace(/\\"/g, '"').replace(/\\\\/g, "\\"); + } else if (stepTrivia.bareSource) { + lines.push(`${ci}return ${stepTrivia.bareSource}`); + } else if (stepTrivia.tripleQuoted) { + const inner = stepTrivia.rawBody ?? step.value.slice(1, -1).replace(/\\"/g, '"').replace(/\\\\/g, "\\"); lines.push(`${ci}return """`); for (const bl of inner.split("\n")) { lines.push(bl); @@ -698,8 +725,9 @@ function emitStep(step: WorkflowStepDef, pad: string, currentIndent: string): st } case "send": { - if (step.rhs.kind === "literal" && step.rhs.token.includes("\n")) { - const inner = step.rhs.token.slice(1, -1).replace(/\\"/g, '"').replace(/\\\\/g, "\\"); + const rhsTrivia = tn(trivia, step.rhs); + if (step.rhs.kind === "literal" && rhsTrivia.tripleQuoted) { + const inner = rhsTrivia.rawBody ?? step.rhs.token.slice(1, -1).replace(/\\"/g, '"').replace(/\\\\/g, "\\"); lines.push(`${ci}${step.channel} <- """`); for (const bl of inner.split("\n")) { lines.push(bl); @@ -727,14 +755,14 @@ function emitStep(step: WorkflowStepDef, pad: string, currentIndent: string): st ? `"${step.operand.value}"` : `/${step.operand.source}/`; lines.push(`${ci}if ${step.subject} ${step.operator} ${operandStr} {`); - lines.push(...emitSteps(step.body, pad, ci + pad)); + lines.push(...emitSteps(step.body, pad, ci + pad, trivia)); lines.push(`${ci}}`); break; } case "for_lines": { lines.push(`${ci}for ${step.iterVar} in ${step.sourceVar} {`); - lines.push(...emitSteps(step.body, pad, ci + pad)); + lines.push(...emitSteps(step.body, pad, ci + pad, trivia)); lines.push(`${ci}}`); break; } @@ -743,10 +771,10 @@ function emitStep(step: WorkflowStepDef, pad: string, currentIndent: string): st return lines; } -function emitConstStep(name: string, value: ConstRhs): string { +function emitConstStep(name: string, value: ConstRhs, valueTrivia: NodeTrivia): string { switch (value.kind) { case "expr": - if (value.bashRhs.startsWith('"') && value.bashRhs.endsWith('"') && value.bashRhs.includes("\n")) { + if (valueTrivia.tripleQuoted) { // Multi-line: caller handles remaining lines return `const ${name} = """`; } @@ -759,10 +787,10 @@ function emitConstStep(name: string, value: ConstRhs): string { return `const ${name} = ensure ${emitRef(value.ref, value.args, value.bareIdentifierArgs)}`; case "prompt_capture": { const returns = value.returns ? ` returns "${value.returns}"` : ""; - if (value.bodyKind === "identifier" && value.bodyIdentifier) { - return `const ${name} = prompt ${value.bodyIdentifier}${returns}`; + if (valueTrivia.bodyKind === "identifier" && valueTrivia.bodyIdentifier) { + return `const ${name} = prompt ${valueTrivia.bodyIdentifier}${returns}`; } - if (value.bodyKind === "triple_quoted") { + if (valueTrivia.bodyKind === "triple_quoted") { // Multi-line: caller handles remaining lines return `const ${name} = prompt """`; } @@ -798,20 +826,21 @@ function emitSendRhs(rhs: SendRhsDef): string { } } -function emitTestBlock(test: TestBlockDef, pad: string): string { +function emitTestBlock(test: TestBlockDef, pad: string, trivia: Trivia): string { const lines: string[] = []; - if (test.leadingComments?.length) { - lines.push(...emitComments(test.leadingComments)); + const lc = tn(trivia, test).leadingComments; + if (lc?.length) { + lines.push(...emitComments(lc)); } lines.push(`test "${test.description}" {`); for (const step of test.steps) { - lines.push(...emitTestStep(step, pad)); + lines.push(...emitTestStep(step, pad, trivia)); } lines.push("}"); return lines.join("\n"); } -function emitTestStep(step: TestStepDef, pad: string): string[] { +function emitTestStep(step: TestStepDef, pad: string, trivia: Trivia): string[] { switch (step.type) { case "comment": return [`${pad}${step.text}`]; @@ -852,14 +881,14 @@ function emitTestStep(step: TestStepDef, pad: string): string[] { case "test_mock_workflow": { const paramStr = `(${step.params.join(", ")})`; const lines = [`${pad}mock workflow ${step.ref}${paramStr} {`]; - lines.push(...emitSteps(step.steps, pad, pad + pad)); + lines.push(...emitSteps(step.steps, pad, pad + pad, trivia)); lines.push(`${pad}}`); return lines; } case "test_mock_rule": { const paramStr = `(${step.params.join(", ")})`; const lines = [`${pad}mock rule ${step.ref}${paramStr} {`]; - lines.push(...emitSteps(step.steps, pad, pad + pad)); + lines.push(...emitSteps(step.steps, pad, pad + pad, trivia)); lines.push(`${pad}}`); return lines; } diff --git a/src/format/roundtrip.test.ts b/src/format/roundtrip.test.ts new file mode 100644 index 00000000..0acc3ed3 --- /dev/null +++ b/src/format/roundtrip.test.ts @@ -0,0 +1,73 @@ +import test from "node:test"; +import assert from "node:assert/strict"; +import { readFileSync, readdirSync, statSync } from "node:fs"; +import { join, resolve } from "node:path"; +import { parsejaiphWithTrivia } from "../parser"; +import { emitModule } from "./emit"; + +// Tests run from dist/src/format/roundtrip.test.js, so repo root is four levels up. +const repoRoot = resolve(__dirname, "../../.."); + +function findjhFiles(root: string): string[] { + const out: string[] = []; + const stack = [root]; + while (stack.length > 0) { + const dir = stack.pop()!; + let entries: string[]; + try { + entries = readdirSync(dir); + } catch { + continue; + } + for (const e of entries) { + const p = join(dir, e); + let s; + try { + s = statSync(p); + } catch { + continue; + } + if (s.isDirectory()) { + stack.push(p); + } else if (p.endsWith(".jh") && !p.endsWith(".broken.jh")) { + // Skip *.test.jh? We include them — they're also DSL. + out.push(p); + } + } + } + return out.sort(); +} + +const fixtureRoots = [ + join(repoRoot, "examples"), + join(repoRoot, "test-fixtures/golden-ast/fixtures"), +]; + +const allFixtures: string[] = []; +for (const root of fixtureRoots) { + allFixtures.push(...findjhFiles(root)); +} + +if (allFixtures.length === 0) { + test("AC3: round-trip fixtures present", () => { + assert.fail("expected at least one .jh fixture under examples/ and test-fixtures/"); + }); +} + +for (const file of allFixtures) { + const rel = file.replace(repoRoot + "/", ""); + test(`AC3: parse → format → parse → format is bit-for-bit on ${rel}`, () => { + const source = readFileSync(file, "utf8"); + // First pass: parse and format. + const first = parsejaiphWithTrivia(source, file); + const formatted1 = emitModule(first.ast, first.trivia); + // Second pass: parse the formatted output and format again. + const second = parsejaiphWithTrivia(formatted1, file); + const formatted2 = emitModule(second.ast, second.trivia); + assert.equal( + formatted2, + formatted1, + `second formatting diverged from first for ${rel}`, + ); + }); +} diff --git a/src/parse/const-rhs.ts b/src/parse/const-rhs.ts index 4d528718..20ca1a4f 100644 --- a/src/parse/const-rhs.ts +++ b/src/parse/const-rhs.ts @@ -1,6 +1,7 @@ import type { ConstRhs, RuleRefDef, WorkflowRefDef } from "../types"; +import { createTrivia, type Trivia } from "./trivia"; import { fail, parseCallRef, rejectTrailingContent } from "./core"; -import { parseTripleQuoteBlock, tripleQuoteBodyToRaw } from "./triple-quote"; +import { dedentTripleQuotedBody, parseTripleQuoteBlock, tripleQuoteBodyToRaw } from "./triple-quote"; import { parseAnonymousInlineScript } from "./inline-script"; import { parsePromptStep } from "./prompt"; import { parseMatchExpr } from "./match"; @@ -58,6 +59,7 @@ export function parseConstRhs( col: number, forRule: boolean, constName: string, + trivia: Trivia = createTrivia(), ): { value: ConstRhs; nextLineIdx: number } { const head = rhs.trimStart(); if (head.startsWith("prompt ")) { @@ -67,22 +69,26 @@ export function parseConstRhs( const innerRaw = lines[lineIdx]; const promptCol = innerRaw.indexOf("prompt") + 1; const promptArg = rhs.slice(rhs.indexOf("prompt") + "prompt".length).trimStart(); - const result = parsePromptStep(filePath, lines, lineIdx, promptArg, promptCol, constName); + const result = parsePromptStep(filePath, lines, lineIdx, promptArg, promptCol, constName, trivia); const st = result.step; if (st.type !== "prompt" || st.captureName !== constName) { fail(filePath, "const ... = prompt internal parse error", lineNo, col); } - return { - value: { - kind: "prompt_capture", - raw: st.raw, - loc: st.loc, - returns: st.returns, - ...(st.bodyKind ? { bodyKind: st.bodyKind } : {}), - ...(st.bodyIdentifier ? { bodyIdentifier: st.bodyIdentifier } : {}), - }, - nextLineIdx: result.nextLineIdx, + const promptTrivia = trivia.getNode(st); + const value: ConstRhs = { + kind: "prompt_capture", + raw: st.raw, + loc: st.loc, + returns: st.returns, }; + if (promptTrivia) { + trivia.setNode(value, { + ...(promptTrivia.bodyKind ? { bodyKind: promptTrivia.bodyKind } : {}), + ...(promptTrivia.bodyIdentifier ? { bodyIdentifier: promptTrivia.bodyIdentifier } : {}), + ...(promptTrivia.rawBody !== undefined ? { rawBody: promptTrivia.rawBody } : {}), + }); + } + return { value, nextLineIdx: result.nextLineIdx }; } if (head.startsWith("run ")) { const rest = head.slice("run ".length).trim(); @@ -168,7 +174,9 @@ export function parseConstRhs( tqLines[lineIdx] = head; const { body, nextIdx, afterClose } = parseTripleQuoteBlock(filePath, tqLines, lineIdx); if (afterClose) fail(filePath, 'unexpected content after closing """', nextIdx); - return { value: { kind: "expr", bashRhs: tripleQuoteBodyToRaw(body), tripleQuoted: true }, nextLineIdx: nextIdx - 1 }; + const value: ConstRhs = { kind: "expr", bashRhs: tripleQuoteBodyToRaw(dedentTripleQuotedBody(body)) }; + trivia.setNode(value, { tripleQuoted: true, rawBody: body }); + return { value, nextLineIdx: nextIdx - 1 }; } const callLike = head.includes("(") ? parseCallRef(head.trimEnd()) : null; if (callLike) { diff --git a/src/parse/metadata.ts b/src/parse/metadata.ts index 240a230e..0c913ba6 100644 --- a/src/parse/metadata.ts +++ b/src/parse/metadata.ts @@ -1,4 +1,5 @@ -import type { ConfigBodyPart, WorkflowMetadata } from "../types"; +import type { WorkflowMetadata } from "../types"; +import type { Trivia, ConfigBodyPart } from "./trivia"; import { colFromRaw, fail } from "./core"; const ALLOWED_KEYS = new Set([ @@ -176,6 +177,7 @@ export function parseConfigBlock( filePath: string, lines: string[], startIndex: number, + trivia?: Trivia, ): { metadata: WorkflowMetadata; nextIndex: number } { const openLineNo = startIndex + 1; const rawOpen = lines[startIndex]; @@ -202,8 +204,8 @@ export function parseConfigBlock( continue; } if (line === "}") { - if (bodySequence.length > 0) { - out.configBodySequence = bodySequence; + if (bodySequence.length > 0 && trivia) { + trivia.setNode(out, { configBodySequence: bodySequence }); } idx += 1; return { metadata: out, nextIndex: idx }; diff --git a/src/parse/parse-interpreter-tags.test.ts b/src/parse/parse-interpreter-tags.test.ts index 78327093..e09829fc 100644 --- a/src/parse/parse-interpreter-tags.test.ts +++ b/src/parse/parse-interpreter-tags.test.ts @@ -1,50 +1,50 @@ import test from "node:test"; import assert from "node:assert/strict"; -import { parsejaiph } from "../parser"; +import { parsejaiph, parsejaiphWithTrivia } from "../parser"; // === Accepted: fenced block with lang tag === test("fenced block with python3 lang tag parses correctly", () => { - const mod = parsejaiph('script transform = ```python3\nprint("hi")\n```', "test.jh"); + const { ast: mod, trivia } = parsejaiphWithTrivia('script transform = ```python3\nprint("hi")\n```', "test.jh"); assert.equal(mod.scripts.length, 1); assert.equal(mod.scripts[0].name, "transform"); assert.equal(mod.scripts[0].lang, "python3"); assert.equal(mod.scripts[0].body, 'print("hi")'); - assert.equal(mod.scripts[0].bodyKind, "fenced"); + assert.equal(trivia.getNode(mod.scripts[0])?.scriptBodyKind, "fenced"); }); test("fenced block with node lang tag parses correctly", () => { - const mod = parsejaiph("script transform = ```node\nconsole.log('hi');\n```", "test.jh"); + const { ast: mod, trivia } = parsejaiphWithTrivia("script transform = ```node\nconsole.log('hi');\n```", "test.jh"); assert.equal(mod.scripts.length, 1); assert.equal(mod.scripts[0].name, "transform"); assert.equal(mod.scripts[0].lang, "node"); assert.equal(mod.scripts[0].body, "console.log('hi');"); - assert.equal(mod.scripts[0].bodyKind, "fenced"); + assert.equal(trivia.getNode(mod.scripts[0])?.scriptBodyKind, "fenced"); }); test("any arbitrary lang tag is valid (no allowlist)", () => { - const mod = parsejaiph("script run_deno = ```deno\nconsole.log('hi');\n```", "test.jh"); + const { ast: mod, trivia } = parsejaiphWithTrivia("script run_deno = ```deno\nconsole.log('hi');\n```", "test.jh"); assert.equal(mod.scripts.length, 1); assert.equal(mod.scripts[0].lang, "deno"); - assert.equal(mod.scripts[0].bodyKind, "fenced"); + assert.equal(trivia.getNode(mod.scripts[0])?.scriptBodyKind, "fenced"); }); // === Accepted: plain script without lang tag === test("plain script without lang tag has no lang", () => { - const mod = parsejaiph('script setup = `echo hello`', "test.jh"); + const { ast: mod, trivia } = parsejaiphWithTrivia('script setup = `echo hello`', "test.jh"); assert.equal(mod.scripts[0].lang, undefined); assert.equal(mod.scripts[0].body, "echo hello"); - assert.equal(mod.scripts[0].bodyKind, "backtick"); + assert.equal(trivia.getNode(mod.scripts[0])?.scriptBodyKind, "backtick"); }); // === Accepted: manual shebang in fenced body (no lang tag) === test("manual shebang in fenced body without lang tag works", () => { - const mod = parsejaiph('script analyze = ```\n#!/usr/bin/env ruby\nputs "hi"\n```', "test.jh"); + const { ast: mod, trivia } = parsejaiphWithTrivia('script analyze = ```\n#!/usr/bin/env ruby\nputs "hi"\n```', "test.jh"); assert.equal(mod.scripts[0].lang, undefined); assert.equal(mod.scripts[0].body, '#!/usr/bin/env ruby\nputs "hi"'); - assert.equal(mod.scripts[0].bodyKind, "fenced"); + assert.equal(trivia.getNode(mod.scripts[0])?.scriptBodyKind, "fenced"); }); // === Rejected: both fence tag and manual shebang === diff --git a/src/parse/parse-metadata.test.ts b/src/parse/parse-metadata.test.ts index a83332c9..45a9a438 100644 --- a/src/parse/parse-metadata.test.ts +++ b/src/parse/parse-metadata.test.ts @@ -2,6 +2,7 @@ import test from "node:test"; import assert from "node:assert/strict"; import { parseConfigBlock } from "./metadata"; import { parsejaiph } from "../parser"; +import { createTrivia } from "./trivia"; test("parseConfigBlock: parses minimal config with one key", () => { const lines = [ @@ -132,9 +133,10 @@ test("parseConfigBlock: skips empty lines and comments", () => { "", "}", ]; - const { metadata } = parseConfigBlock("test.jh", lines, 0); + const trivia = createTrivia(); + const { metadata } = parseConfigBlock("test.jh", lines, 0, trivia); assert.equal(metadata.agent?.command, "claude"); - assert.deepEqual(metadata.configBodySequence, [ + assert.deepEqual(trivia.getNode(metadata)?.configBodySequence, [ { kind: "comment", text: "# this is a comment" }, { kind: "assign", key: "agent.command" }, ]); diff --git a/src/parse/parse-prompt.test.ts b/src/parse/parse-prompt.test.ts index a546b297..3ef93cbd 100644 --- a/src/parse/parse-prompt.test.ts +++ b/src/parse/parse-prompt.test.ts @@ -1,12 +1,15 @@ import test from "node:test"; import assert from "node:assert/strict"; import { parsePromptStep } from "./prompt"; +import { createTrivia } from "./trivia"; + +const trivia = createTrivia(); // === parsePromptStep: single-line string literal === test("parsePromptStep: parses simple single-line prompt", () => { const lines = [' prompt "Hello world"']; - const result = parsePromptStep("test.jh", lines, 0, '"Hello world"', 3); + const result = parsePromptStep("test.jh", lines, 0, '"Hello world"', 3, undefined, trivia); assert.equal(result.step.type, "prompt"); assert.equal(result.step.raw, '"Hello world"'); assert.equal(result.step.loc.line, 1); @@ -14,24 +17,24 @@ test("parsePromptStep: parses simple single-line prompt", () => { assert.equal(result.step.captureName, undefined); assert.equal(result.step.returns, undefined); if (result.step.type === "prompt") { - assert.equal(result.step.bodyKind, "string"); + assert.equal(trivia.getNode(result.step)?.bodyKind, "string"); } }); test("parsePromptStep: parses captured prompt", () => { const lines = [' answer = prompt "What?"']; - const result = parsePromptStep("test.jh", lines, 0, '"What?"', 3, "answer"); + const result = parsePromptStep("test.jh", lines, 0, '"What?"', 3, "answer", trivia); assert.equal(result.step.type, "prompt"); assert.equal(result.step.raw, '"What?"'); assert.equal(result.step.captureName, "answer"); if (result.step.type === "prompt") { - assert.equal(result.step.bodyKind, "string"); + assert.equal(trivia.getNode(result.step)?.bodyKind, "string"); } }); test("parsePromptStep: parses prompt with returns schema (double-quoted)", () => { const lines = [' prompt "Classify" returns "{ type: string }"']; - const result = parsePromptStep("test.jh", lines, 0, '"Classify" returns "{ type: string }"', 3); + const result = parsePromptStep("test.jh", lines, 0, '"Classify" returns "{ type: string }"', 3, undefined, trivia); assert.equal(result.step.type, "prompt"); assert.equal(result.step.raw, '"Classify"'); assert.equal(result.step.returns, "{ type: string }"); @@ -40,7 +43,7 @@ test("parsePromptStep: parses prompt with returns schema (double-quoted)", () => test("parsePromptStep: rejects single-quoted returns schema", () => { const lines = [" prompt \"Classify\" returns '{ type: string }'"]; assert.throws( - () => parsePromptStep("test.jh", lines, 0, "\"Classify\" returns '{ type: string }'", 3), + () => parsePromptStep("test.jh", lines, 0, "\"Classify\" returns '{ type: string }'", 3, undefined, trivia), /single-quoted strings are not supported/, ); }); @@ -53,7 +56,7 @@ test("parsePromptStep: multiline quoted prompt throws with clear error", () => { ' world"', ]; assert.throws( - () => parsePromptStep("test.jh", lines, 0, '"Hello', 3), + () => parsePromptStep("test.jh", lines, 0, '"Hello', 3, undefined, trivia), /multiline prompt strings are no longer supported/, ); }); @@ -62,11 +65,11 @@ test("parsePromptStep: multiline quoted prompt throws with clear error", () => { test("parsePromptStep: parses bare identifier prompt", () => { const lines = [' prompt myVar']; - const result = parsePromptStep("test.jh", lines, 0, "myVar", 3); + const result = parsePromptStep("test.jh", lines, 0, "myVar", 3, undefined, trivia); assert.equal(result.step.type, "prompt"); if (result.step.type === "prompt") { - assert.equal(result.step.bodyKind, "identifier"); - assert.equal(result.step.bodyIdentifier, "myVar"); + assert.equal(trivia.getNode(result.step)?.bodyKind, "identifier"); + assert.equal(trivia.getNode(result.step)?.bodyIdentifier, "myVar"); assert.equal(result.step.raw, '"${myVar}"'); assert.equal(result.step.returns, undefined); } @@ -74,23 +77,23 @@ test("parsePromptStep: parses bare identifier prompt", () => { test("parsePromptStep: parses identifier prompt with returns", () => { const lines = [' prompt myVar returns "{ type: string }"']; - const result = parsePromptStep("test.jh", lines, 0, 'myVar returns "{ type: string }"', 3); + const result = parsePromptStep("test.jh", lines, 0, 'myVar returns "{ type: string }"', 3, undefined, trivia); assert.equal(result.step.type, "prompt"); if (result.step.type === "prompt") { - assert.equal(result.step.bodyKind, "identifier"); - assert.equal(result.step.bodyIdentifier, "myVar"); + assert.equal(trivia.getNode(result.step)?.bodyKind, "identifier"); + assert.equal(trivia.getNode(result.step)?.bodyIdentifier, "myVar"); assert.equal(result.step.returns, "{ type: string }"); } }); test("parsePromptStep: parses captured identifier prompt", () => { const lines = [' answer = prompt text']; - const result = parsePromptStep("test.jh", lines, 0, "text", 3, "answer"); + const result = parsePromptStep("test.jh", lines, 0, "text", 3, "answer", trivia); assert.equal(result.step.type, "prompt"); assert.equal(result.step.captureName, "answer"); if (result.step.type === "prompt") { - assert.equal(result.step.bodyKind, "identifier"); - assert.equal(result.step.bodyIdentifier, "text"); + assert.equal(trivia.getNode(result.step)?.bodyKind, "identifier"); + assert.equal(trivia.getNode(result.step)?.bodyIdentifier, "text"); } }); @@ -103,10 +106,10 @@ test("parsePromptStep: parses triple-quoted block prompt", () => { 'Analyze the following: ${input}', '"""', ]; - const result = parsePromptStep("test.jh", lines, 0, '"""', 3); + const result = parsePromptStep("test.jh", lines, 0, '"""', 3, undefined, trivia); assert.equal(result.step.type, "prompt"); if (result.step.type === "prompt") { - assert.equal(result.step.bodyKind, "triple_quoted"); + assert.equal(trivia.getNode(result.step)?.bodyKind, "triple_quoted"); // raw contains the body wrapped in quotes for runtime interpolation assert.ok(result.step.raw.includes("You are a helpful assistant.")); assert.ok(result.step.raw.includes("${input}")); @@ -119,11 +122,11 @@ test("parsePromptStep: parses captured triple-quoted block prompt", () => { 'Hello multiline', '"""', ]; - const result = parsePromptStep("test.jh", lines, 0, '"""', 3, "answer"); + const result = parsePromptStep("test.jh", lines, 0, '"""', 3, "answer", trivia); assert.equal(result.step.type, "prompt"); assert.equal(result.step.captureName, "answer"); if (result.step.type === "prompt") { - assert.equal(result.step.bodyKind, "triple_quoted"); + assert.equal(trivia.getNode(result.step)?.bodyKind, "triple_quoted"); } }); @@ -134,10 +137,10 @@ test("parsePromptStep: triple-quoted block may be followed by returns on the nex '"""', 'returns "{ role: string }"', ]; - const result = parsePromptStep("test.jh", lines, 0, '"""', 3, "answer"); + const result = parsePromptStep("test.jh", lines, 0, '"""', 3, "answer", trivia); assert.equal(result.step.type, "prompt"); if (result.step.type === "prompt") { - assert.equal(result.step.bodyKind, "triple_quoted"); + assert.equal(trivia.getNode(result.step)?.bodyKind, "triple_quoted"); assert.equal(result.step.returns, "{ role: string }"); } assert.equal(result.nextLineIdx, 3); @@ -149,10 +152,10 @@ test("parsePromptStep: triple-quoted block may close with returns on same line", "Hello", '""" returns "{ role: string }"', ]; - const result = parsePromptStep("test.jh", lines, 0, '"""', 3, "answer"); + const result = parsePromptStep("test.jh", lines, 0, '"""', 3, "answer", trivia); assert.equal(result.step.type, "prompt"); if (result.step.type === "prompt") { - assert.equal(result.step.bodyKind, "triple_quoted"); + assert.equal(trivia.getNode(result.step)?.bodyKind, "triple_quoted"); assert.equal(result.step.returns, "{ role: string }"); } assert.equal(result.nextLineIdx, 2); @@ -165,7 +168,7 @@ test("parsePromptStep: unterminated triple-quoted block throws", () => { 'no closing triple-quote', ]; assert.throws( - () => parsePromptStep("test.jh", lines, 0, '"""', 3), + () => parsePromptStep("test.jh", lines, 0, '"""', 3, undefined, trivia), /unterminated triple-quoted block/, ); }); @@ -179,7 +182,7 @@ test("parsePromptStep: triple-backtick fence is rejected with guidance", () => { '```', ]; assert.throws( - () => parsePromptStep("test.jh", lines, 0, "```", 3), + () => parsePromptStep("test.jh", lines, 0, "```", 3, undefined, trivia), /prompt blocks use triple quotes.*triple backticks are for scripts/, ); }); @@ -189,7 +192,7 @@ test("parsePromptStep: triple-backtick fence is rejected with guidance", () => { test("parsePromptStep: unterminated single-line string throws", () => { const lines = [' prompt "Hello']; assert.throws( - () => parsePromptStep("test.jh", lines, 0, '"Hello', 3), + () => parsePromptStep("test.jh", lines, 0, '"Hello', 3, undefined, trivia), /multiline prompt strings are no longer supported/, ); }); @@ -197,7 +200,7 @@ test("parsePromptStep: unterminated single-line string throws", () => { test("parsePromptStep: invalid text after prompt string throws", () => { const lines = [' prompt "Hello" garbage']; assert.throws( - () => parsePromptStep("test.jh", lines, 0, '"Hello" garbage', 3), + () => parsePromptStep("test.jh", lines, 0, '"Hello" garbage', 3, undefined, trivia), /expected keyword "returns"/, ); }); @@ -205,14 +208,14 @@ test("parsePromptStep: invalid text after prompt string throws", () => { test("parsePromptStep: unterminated returns schema throws", () => { const lines = [' prompt "Hello" returns "{ type: string']; assert.throws( - () => parsePromptStep("test.jh", lines, 0, '"Hello" returns "{ type: string', 3), + () => parsePromptStep("test.jh", lines, 0, '"Hello" returns "{ type: string', 3, undefined, trivia), /unterminated returns schema/, ); }); test("parsePromptStep: returns with double-quoted schema", () => { const lines = [' prompt "Classify" returns "{ type: string }"']; - const result = parsePromptStep("test.jh", lines, 0, '"Classify" returns "{ type: string }"', 3); + const result = parsePromptStep("test.jh", lines, 0, '"Classify" returns "{ type: string }"', 3, undefined, trivia); assert.equal(result.step.type, "prompt"); if (result.step.type === "prompt") { assert.equal(result.step.returns, "{ type: string }"); diff --git a/src/parse/parse-steps.test.ts b/src/parse/parse-steps.test.ts index c4a20985..d4c39c5e 100644 --- a/src/parse/parse-steps.test.ts +++ b/src/parse/parse-steps.test.ts @@ -148,7 +148,6 @@ test("parseEnsureStep: multiline catch block with triple-quoted prompt", () => { const p = step.catch.block[1]; assert.equal(p.type, "prompt"); if (p.type === "prompt") { - assert.equal(p.bodyKind, "triple_quoted"); assert.ok(p.raw.includes("fix CI")); } assert.equal(step.catch.block[2].type, "run"); @@ -279,7 +278,6 @@ test("parsejaiph: workflow with ensure catch and multiline triple-quoted prompt" const p = ensureStep.catch.block[0]; assert.equal(p.type, "prompt"); if (p.type === "prompt") { - assert.equal(p.bodyKind, "triple_quoted"); assert.ok(p.raw.includes("hello")); } } diff --git a/src/parse/prompt.ts b/src/parse/prompt.ts index 8ce101fc..0f51b4d6 100644 --- a/src/parse/prompt.ts +++ b/src/parse/prompt.ts @@ -1,6 +1,7 @@ import type { WorkflowStepDef } from "../types"; +import { createTrivia, type Trivia } from "./trivia"; import { fail, hasUnescapedClosingQuote, indexOfClosingDoubleQuote } from "./core"; -import { parseTripleQuoteBlock, tripleQuoteBodyToRaw } from "./triple-quote"; +import { dedentTripleQuotedBody, parseTripleQuoteBlock, tripleQuoteBodyToRaw } from "./triple-quote"; /** * Prompt body source tag stored in the AST. @@ -181,6 +182,7 @@ export function parsePromptStep( promptArg: string, promptCol: number, captureName?: string, + trivia: Trivia = createTrivia(), ): { step: WorkflowStepDef; nextLineIdx: number } { const lineNo = lineIdx + 1; @@ -214,8 +216,9 @@ export function parsePromptStep( tripleQuoteLineIdx, ); - // Wrap body in quotes so the runtime's interpolateWithCaptures can process ${} vars - const raw = tripleQuoteBodyToRaw(body); + // Wrap body in quotes so the runtime's interpolateWithCaptures can process ${} vars. + // Apply the same dedent at parse time so the runtime no longer needs a tripleQuoted flag. + const raw = tripleQuoteBodyToRaw(dedentTripleQuotedBody(body)); const linesForReturns = lines.length === 0 ? tqLines : lines; let returnsSchema: string | undefined = returnsOnClosingLine; @@ -235,15 +238,16 @@ export function parsePromptStep( } } + const step = { + type: "prompt" as const, + raw, + loc: { line: lineNo, col: promptCol }, + ...(captureName ? { captureName } : {}), + ...(returnsSchema !== undefined ? { returns: returnsSchema } : {}), + }; + trivia.setNode(step, { bodyKind: "triple_quoted", rawBody: body }); return { - step: { - type: "prompt", - raw, - bodyKind: "triple_quoted", - loc: { line: lineNo, col: promptCol }, - ...(captureName ? { captureName } : {}), - ...(returnsSchema !== undefined ? { returns: returnsSchema } : {}), - }, + step, nextLineIdx: consumeEndIdx - 1, }; } @@ -263,15 +267,16 @@ export function parsePromptStep( lines, lineIdx, ); + const step = { + type: "prompt" as const, + raw: promptRaw, + loc: { line: lineNo, col: promptCol }, + ...(captureName ? { captureName } : {}), + ...(returnsSchema !== undefined ? { returns: returnsSchema } : {}), + }; + trivia.setNode(step, { bodyKind: "string" }); return { - step: { - type: "prompt", - raw: promptRaw, - bodyKind: "string", - loc: { line: lineNo, col: promptCol }, - ...(captureName ? { captureName } : {}), - ...(returnsSchema !== undefined ? { returns: returnsSchema } : {}), - }, + step, nextLineIdx: nextIndex - 1, }; } @@ -299,16 +304,16 @@ export function parsePromptStep( // Store as "${identifier}" so the runtime interpolates the variable const raw = `"\${${identifier}}"`; + const step = { + type: "prompt" as const, + raw, + loc: { line: lineNo, col: promptCol }, + ...(captureName ? { captureName } : {}), + ...(returnsSchema !== undefined ? { returns: returnsSchema } : {}), + }; + trivia.setNode(step, { bodyKind: "identifier", bodyIdentifier: identifier }); return { - step: { - type: "prompt", - raw, - bodyKind: "identifier", - bodyIdentifier: identifier, - loc: { line: lineNo, col: promptCol }, - ...(captureName ? { captureName } : {}), - ...(returnsSchema !== undefined ? { returns: returnsSchema } : {}), - }, + step, nextLineIdx: nextIndex - 1, }; } diff --git a/src/parse/rules.ts b/src/parse/rules.ts index 81466f77..6b681c83 100644 --- a/src/parse/rules.ts +++ b/src/parse/rules.ts @@ -1,4 +1,5 @@ import type { RuleDef } from "../types"; +import { createTrivia, type Trivia } from "./trivia"; import { braceDepthDelta, colFromRaw, fail, parseParamList, stripQuotes } from "./core"; import { parseBlockStatement } from "./workflow-brace"; @@ -7,6 +8,7 @@ export function parseRuleBlock( lines: string[], startIndex: number, pendingComments: string[], + trivia: Trivia = createTrivia(), ): { rule: RuleDef; nextIndex: number; exported: boolean } { const lineNo = startIndex + 1; const raw = lines[startIndex]; @@ -133,7 +135,7 @@ export function parseRuleBlock( } continue; } - const st = parseBlockStatement(filePath, lines, i, { forRule: true }); + const st = parseBlockStatement(filePath, lines, i, trivia, { forRule: true }); if (st.step.type !== "shell") { flushCommand(); rule.steps.push(st.step); diff --git a/src/parse/scripts.ts b/src/parse/scripts.ts index 2ea92056..cc2f7e67 100644 --- a/src/parse/scripts.ts +++ b/src/parse/scripts.ts @@ -1,4 +1,5 @@ import type { ScriptDef } from "../types"; +import { createTrivia, type Trivia } from "./trivia"; import { fail, parseSingleBacktickBody } from "./core"; import { parseFencedBlock } from "./fence"; @@ -42,6 +43,7 @@ export function parseScriptBlock( lines: string[], startIndex: number, pendingComments: string[], + trivia: Trivia = createTrivia(), ): { scriptDef: ScriptDef; nextIndex: number; exported: boolean } { const lineNo = startIndex + 1; const raw = lines[startIndex]; @@ -100,15 +102,16 @@ export function parseScriptBlock( ); } + const scriptDef: ScriptDef = { + name: scriptName, + comments: pendingComments, + body, + ...(lang ? { lang } : {}), + loc: { line: lineNo, col: 1 }, + }; + trivia.setNode(scriptDef, { scriptBodyKind: "fenced" }); return { - scriptDef: { - name: scriptName, - comments: pendingComments, - body, - ...(lang ? { lang } : {}), - bodyKind: "fenced", - loc: { line: lineNo, col: 1 }, - }, + scriptDef, nextIndex: nextIdx, exported: isExported, }; @@ -124,14 +127,15 @@ export function parseScriptBlock( validateScriptBodyNoInterpolation(body, filePath, lineNo, 1); + const scriptDef: ScriptDef = { + name: scriptName, + comments: pendingComments, + body, + loc: { line: lineNo, col: 1 }, + }; + trivia.setNode(scriptDef, { scriptBodyKind: "backtick" }); return { - scriptDef: { - name: scriptName, - comments: pendingComments, - body, - bodyKind: "backtick", - loc: { line: lineNo, col: 1 }, - }, + scriptDef, nextIndex: startIndex + 1, exported: isExported, }; diff --git a/src/parse/send-rhs.ts b/src/parse/send-rhs.ts index 77f4e929..50d5e6f1 100644 --- a/src/parse/send-rhs.ts +++ b/src/parse/send-rhs.ts @@ -1,6 +1,7 @@ import type { SendRhsDef, WorkflowRefDef } from "../types"; +import { createTrivia, type Trivia } from "./trivia"; import { fail, hasUnescapedClosingQuote, indexOfClosingDoubleQuote, isRef, parseCallRef, rejectTrailingContent } from "./core"; -import { parseTripleQuoteBlock, tripleQuoteBodyToRaw } from "./triple-quote"; +import { dedentTripleQuotedBody, parseTripleQuoteBlock, tripleQuoteBodyToRaw } from "./triple-quote"; const SEND_RHS_HINT = 'send right-hand side must be a quoted string ("..."), a variable ($name or ${...}), or "run [args]" — not raw shell; use a script or use const'; @@ -13,6 +14,7 @@ export function parseSendRhs( col: number, lines?: string[], idx?: number, + trivia: Trivia = createTrivia(), ): { rhs: SendRhsDef; nextIdx: number } { const t = rhs.trim(); const defaultNext = (idx ?? lineNo - 1) + 1; @@ -24,7 +26,9 @@ export function parseSendRhs( tqLines[idx] = t; const { body, nextIdx, afterClose } = parseTripleQuoteBlock(filePath, tqLines, idx); if (afterClose) fail(filePath, 'unexpected content after closing """', nextIdx); - return { rhs: { kind: "literal", token: tripleQuoteBodyToRaw(body), tripleQuoted: true }, nextIdx }; + const rhsNode: SendRhsDef = { kind: "literal", token: tripleQuoteBodyToRaw(dedentTripleQuotedBody(body)) }; + trivia.setNode(rhsNode, { tripleQuoted: true, rawBody: body }); + return { rhs: rhsNode, nextIdx }; } if (t.startsWith('"')) { if (!hasUnescapedClosingQuote(t, 1)) { diff --git a/src/parse/steps.ts b/src/parse/steps.ts index 4a6cf130..01ebbd19 100644 --- a/src/parse/steps.ts +++ b/src/parse/steps.ts @@ -1,4 +1,5 @@ import type { WorkflowStepDef } from "../types"; +import { createTrivia, type Trivia } from "./trivia"; import { parseConstRhs } from "./const-rhs"; import { fail, indexOfClosingDoubleQuote, isRef, parseCallRef, parseLogMessageRhs, rejectTrailingContent } from "./core"; import { parseAnonymousInlineScript } from "./inline-script"; @@ -91,6 +92,7 @@ function parseCatchStatement( lineNo: number, col: number, stmt: string, + trivia: Trivia, ): WorkflowStepDef { const t = stmt.trim(); if (!t) { @@ -148,12 +150,15 @@ function parseCatchStatement( : isBare ? bareIdentifierToQuotedString(retVal) : retVal; - return { + const step: WorkflowStepDef = { type: "return", value, loc: { line: lineNo, col }, - ...(isBareDotted || isBare ? { bareSource: retVal.trim() } : {}), }; + if (isBareDotted || isBare) { + trivia.setNode(step, { bareSource: retVal.trim() }); + } + return step; } if (/^fail\s+/.test(t)) { const arg = t.slice("fail".length).trimStart(); @@ -172,7 +177,7 @@ function parseCatchStatement( const name = constMatch[1]; const rhs = constMatch[2].trim(); const syntheticLines = [t]; - const { value } = parseConstRhs(filePath, syntheticLines, 0, rhs, lineNo, col, false, name); + const { value } = parseConstRhs(filePath, syntheticLines, 0, rhs, lineNo, col, false, name, trivia); return { type: "const", name, @@ -230,7 +235,7 @@ function parseCatchStatement( if (after.startsWith("{") && after.endsWith("}")) { const blockContent = after.slice(1, -1).trim(); const stmts = splitCatchStatements(blockContent); - const blockSteps = stmts.map((s) => parseCatchStatement(filePath, lineNo, col, s)); + const blockSteps = stmts.map((s) => parseCatchStatement(filePath, lineNo, col, s, trivia)); return { type: "run", workflow: { value: callPart.ref, loc: { line: lineNo, col } }, @@ -240,7 +245,7 @@ function parseCatchStatement( }; } if (!after.startsWith("{") && after) { - const singleStep = parseCatchStatement(filePath, lineNo, col, after); + const singleStep = parseCatchStatement(filePath, lineNo, col, after, trivia); return { type: "run", workflow: { value: callPart.ref, loc: { line: lineNo, col } }, @@ -270,7 +275,7 @@ function parseCatchStatement( if (after.startsWith("{") && after.endsWith("}")) { const blockContent = after.slice(1, -1).trim(); const stmts = splitCatchStatements(blockContent); - const blockSteps = stmts.map((s) => parseCatchStatement(filePath, lineNo, col, s)); + const blockSteps = stmts.map((s) => parseCatchStatement(filePath, lineNo, col, s, trivia)); return { type: "run", workflow: { value: callPart.ref, loc: { line: lineNo, col } }, @@ -280,7 +285,7 @@ function parseCatchStatement( }; } if (!after.startsWith("{") && after) { - const singleStep = parseCatchStatement(filePath, lineNo, col, after); + const singleStep = parseCatchStatement(filePath, lineNo, col, after, trivia); return { type: "run", workflow: { value: callPart.ref, loc: { line: lineNo, col } }, @@ -322,7 +327,7 @@ function parseCatchStatement( if (after.startsWith("{") && after.endsWith("}")) { const blockContent = after.slice(1, -1).trim(); const stmts = splitCatchStatements(blockContent); - const blockSteps = stmts.map((s) => parseCatchStatement(filePath, lineNo, col, s)); + const blockSteps = stmts.map((s) => parseCatchStatement(filePath, lineNo, col, s, trivia)); return { type: "ensure", ref: { value: callPart.ref, loc: { line: lineNo, col } }, @@ -332,7 +337,7 @@ function parseCatchStatement( }; } if (!after.startsWith("{") && after) { - const singleStep = parseCatchStatement(filePath, lineNo, col, after); + const singleStep = parseCatchStatement(filePath, lineNo, col, after, trivia); return { type: "ensure", ref: { value: callPart.ref, loc: { line: lineNo, col } }, @@ -370,7 +375,7 @@ function parseCatchStatement( if (t.startsWith("prompt ")) { return parsePromptStep( filePath, [], lineNo - 1, t.slice("prompt ".length).trimStart(), - col + t.indexOf("prompt"), + col + t.indexOf("prompt"), undefined, trivia, ).step; } if (t.startsWith("log ") || t === "log") { @@ -400,6 +405,7 @@ export function parseEnsureStep( innerRaw: string, ensureBody: string, captureName?: string, + trivia: Trivia = createTrivia(), ): { step: WorkflowStepDef; nextIdx: number } { const catchIdx = ensureBody.indexOf(" catch "); const ensureCol = innerRaw.indexOf("ensure") + 1; @@ -499,7 +505,7 @@ export function parseEnsureStep( if (statements.length === 0) { fail(filePath, "catch block must contain at least one statement", innerNo, catchCol); } - const blockSteps = statements.map((s) => parseCatchStatement(filePath, innerNo, 1, s)); + const blockSteps = statements.map((s) => parseCatchStatement(filePath, innerNo, 1, s, trivia)); return { step: { ...base, catch: { block: blockSteps, bindings } }, nextIdx: closeLineIdx }; } @@ -513,7 +519,7 @@ export function parseEnsureStep( if (statements.length === 0) { fail(filePath, "catch block must contain at least one statement", innerNo, catchCol); } - const blockSteps = statements.map((s) => parseCatchStatement(filePath, innerNo, catchCol, s)); + const blockSteps = statements.map((s) => parseCatchStatement(filePath, innerNo, catchCol, s, trivia)); return { step: { ...base, catch: { block: blockSteps, bindings } }, nextIdx: idx }; } @@ -521,7 +527,7 @@ export function parseEnsureStep( fail(filePath, "catch requires a body after bindings", innerNo, catchCol); } - const singleStep = parseCatchStatement(filePath, innerNo, catchCol, afterBindings); + const singleStep = parseCatchStatement(filePath, innerNo, catchCol, afterBindings, trivia); return { step: { ...base, catch: { single: singleStep, bindings } }, nextIdx: idx }; } @@ -537,6 +543,7 @@ export function parseRunRecoverStep( innerRaw: string, runBody: string, captureName?: string, + trivia: Trivia = createTrivia(), ): { step: WorkflowStepDef; nextIdx: number } | null { // Match ` recover(`, ` recover `, or ` recover` at end of line const recoverMatch = runBody.match(/ recover(?=[\s(]|$)/); @@ -615,7 +622,7 @@ export function parseRunRecoverStep( if (statements.length === 0) { fail(filePath, "recover block must contain at least one statement", innerNo, recoverCol); } - const blockSteps = statements.map((s) => parseCatchStatement(filePath, innerNo, 1, s)); + const blockSteps = statements.map((s) => parseCatchStatement(filePath, innerNo, 1, s, trivia)); return { step: { ...base, recover: { block: blockSteps, bindings } }, nextIdx: closeLineIdx }; } @@ -629,7 +636,7 @@ export function parseRunRecoverStep( if (statements.length === 0) { fail(filePath, "recover block must contain at least one statement", innerNo, recoverCol); } - const blockSteps = statements.map((s) => parseCatchStatement(filePath, innerNo, recoverCol, s)); + const blockSteps = statements.map((s) => parseCatchStatement(filePath, innerNo, recoverCol, s, trivia)); return { step: { ...base, recover: { block: blockSteps, bindings } }, nextIdx: idx }; } @@ -637,7 +644,7 @@ export function parseRunRecoverStep( fail(filePath, "recover requires a body after bindings", innerNo, recoverCol); } - const singleStep = parseCatchStatement(filePath, innerNo, recoverCol, afterBindings); + const singleStep = parseCatchStatement(filePath, innerNo, recoverCol, afterBindings, trivia); return { step: { ...base, recover: { single: singleStep, bindings } }, nextIdx: idx }; } @@ -653,6 +660,7 @@ export function parseRunCatchStep( innerRaw: string, runBody: string, captureName?: string, + trivia: Trivia = createTrivia(), ): { step: WorkflowStepDef; nextIdx: number } | null { const catchIdx = runBody.indexOf(" catch "); if (catchIdx === -1) return null; @@ -730,7 +738,7 @@ export function parseRunCatchStep( if (statements.length === 0) { fail(filePath, "catch block must contain at least one statement", innerNo, catchCol); } - const blockSteps = statements.map((s) => parseCatchStatement(filePath, innerNo, 1, s)); + const blockSteps = statements.map((s) => parseCatchStatement(filePath, innerNo, 1, s, trivia)); return { step: { ...base, catch: { block: blockSteps, bindings } }, nextIdx: closeLineIdx }; } @@ -744,7 +752,7 @@ export function parseRunCatchStep( if (statements.length === 0) { fail(filePath, "catch block must contain at least one statement", innerNo, catchCol); } - const blockSteps = statements.map((s) => parseCatchStatement(filePath, innerNo, catchCol, s)); + const blockSteps = statements.map((s) => parseCatchStatement(filePath, innerNo, catchCol, s, trivia)); return { step: { ...base, catch: { block: blockSteps, bindings } }, nextIdx: idx }; } @@ -752,6 +760,6 @@ export function parseRunCatchStep( fail(filePath, "catch requires a body after bindings", innerNo, catchCol); } - const singleStep = parseCatchStatement(filePath, innerNo, catchCol, afterBindings); + const singleStep = parseCatchStatement(filePath, innerNo, catchCol, afterBindings, trivia); return { step: { ...base, catch: { single: singleStep, bindings } }, nextIdx: idx }; } diff --git a/src/parse/tests.ts b/src/parse/tests.ts index 0771a0bc..3d69c32e 100644 --- a/src/parse/tests.ts +++ b/src/parse/tests.ts @@ -1,4 +1,5 @@ import type { MatchArmDef, TestBlockDef, WorkflowStepDef } from "../types"; +import { createTrivia, type Trivia } from "./trivia"; import { colFromRaw, fail, hasUnescapedClosingQuote, isRef, parseParamList, stripQuotes } from "./core"; import { parseMatchArms } from "./match"; import { parseBraceBlockBody } from "./workflow-brace"; @@ -99,7 +100,7 @@ export function parseTestBlock( filePath: string, lines: string[], startIndex: number, - leadingComments?: string[], + trivia: Trivia = createTrivia(), ): { testBlock: TestBlockDef; nextIndex: number } { const lineNo = startIndex + 1; const raw = lines[startIndex]; @@ -115,9 +116,6 @@ export function parseTestBlock( steps: [], loc: { line: lineNo, col: raw.indexOf("test") + 1 }, }; - if (leadingComments && leadingComments.length > 0) { - testBlock.leadingComments = [...leadingComments]; - } let i = startIndex + 1; for (; i < lines.length; i += 1) { @@ -183,7 +181,7 @@ export function parseTestBlock( rejectOldMockSyntax(filePath, inner, "workflow", innerNo, col); const mockWfHeader = parseMockHeader(filePath, inner, "mock workflow ", innerNo, col); if (mockWfHeader) { - const { steps, nextIdx } = parseBraceBlockBody(filePath, lines, i + 1, innerNo, { forRule: false }); + const { steps, nextIdx } = parseBraceBlockBody(filePath, lines, i + 1, innerNo, trivia, { forRule: false }); testBlock.steps.push({ type: "test_mock_workflow", ref: mockWfHeader.ref, params: mockWfHeader.params, steps, loc }); i = nextIdx - 1; continue; @@ -193,7 +191,7 @@ export function parseTestBlock( rejectOldMockSyntax(filePath, inner, "rule", innerNo, col); const mockRuleHeader = parseMockHeader(filePath, inner, "mock rule ", innerNo, col); if (mockRuleHeader) { - const { steps, nextIdx } = parseBraceBlockBody(filePath, lines, i + 1, innerNo, { forRule: true }); + const { steps, nextIdx } = parseBraceBlockBody(filePath, lines, i + 1, innerNo, trivia, { forRule: true }); testBlock.steps.push({ type: "test_mock_rule", ref: mockRuleHeader.ref, params: mockRuleHeader.params, steps, loc }); i = nextIdx - 1; continue; diff --git a/src/parse/triple-quote.ts b/src/parse/triple-quote.ts index 4856acbf..e1a13b8d 100644 --- a/src/parse/triple-quote.ts +++ b/src/parse/triple-quote.ts @@ -1,3 +1,4 @@ +import { dedentCommonLeadingWhitespace } from "./dedent"; import { fail } from "./core"; /** Per language.md: trim blank lines adjacent to opening/closing `"""` only — do not dedent inner margin. */ @@ -58,6 +59,15 @@ export function tripleQuoteBodyToRaw(body: string): string { return `"${body.replace(/\\/g, "\\\\").replace(/"/g, '\\"')}"`; } +/** + * Apply common-leading-whitespace dedent to a triple-quoted body. The parser + * applies this so the semantic AST string carries the runtime-ready form; + * runtime & validator stop needing a `tripleQuoted` flag. + */ +export function dedentTripleQuotedBody(body: string): string { + return dedentCommonLeadingWhitespace(body); +} + /** * Helper for step parsers: when a step argument starts with `"""`, splice it back * onto the source line and parse the triple-quoted block. Errors if any content diff --git a/src/parse/trivia-ast-shape.test.ts b/src/parse/trivia-ast-shape.test.ts new file mode 100644 index 00000000..458cd209 --- /dev/null +++ b/src/parse/trivia-ast-shape.test.ts @@ -0,0 +1,92 @@ +import test from "node:test"; +import assert from "node:assert/strict"; +import type { + ChannelDef, + ConstRhs, + ImportDef, + ScriptDef, + ScriptImportDef, + SendRhsDef, + TestBlockDef, + WorkflowMetadata, + WorkflowStepDef, + jaiphModule, +} from "../types"; + +/** + * AC1: trivia / source-fidelity fields must not live on semantic AST types. + * + * Each helper below assigns an object literal with the field that *used* to + * exist; if anyone re-adds the field to the public type, the literal type + * widens, the type assertion below fails, and TypeScript breaks compilation — + * which is what the criterion asks for. + */ + +type HasField = T extends Record ? true : false; + +// jaiphModule must not carry: configLeadingComments, trailingTopLevelComments, topLevelOrder. +const _moduleNoConfigLeading: HasField = false; +const _moduleNoTrailing: HasField = false; +const _moduleNoTopLevelOrder: HasField = false; + +// ImportDef / ScriptImportDef / ChannelDef / TestBlockDef must not carry leadingComments. +const _importNoLeading: HasField = false; +const _scriptImportNoLeading: HasField = false; +const _channelNoLeading: HasField = false; +const _testBlockNoLeading: HasField = false; + +// WorkflowMetadata must not carry configBodySequence. +const _metaNoConfigSeq: HasField = false; + +// ScriptDef must not carry bodyKind. +const _scriptNoBodyKind: HasField = false; + +// Pick concrete variants out of WorkflowStepDef and assert no trivia fields. +type LogStep = Extract; +type LogerrStep = Extract; +type FailStep = Extract; +type ReturnStep = Extract; +type PromptStep = Extract; + +const _logNoTripleQuoted: HasField = false; +const _logerrNoTripleQuoted: HasField = false; +const _failNoTripleQuoted: HasField = false; +const _returnNoTripleQuoted: HasField = false; +const _returnNoBareSource: HasField = false; +const _promptNoBodyKind: HasField = false; +const _promptNoBodyIdentifier: HasField = false; + +// ConstRhs.expr must not carry tripleQuoted. +type ConstExpr = Extract; +type ConstPromptCapture = Extract; +const _constExprNoTripleQuoted: HasField = false; +const _constPromptNoBodyKind: HasField = false; +const _constPromptNoBodyIdentifier: HasField = false; + +// SendRhsDef literal must not carry tripleQuoted. +type SendLiteral = Extract; +const _sendLiteralNoTripleQuoted: HasField = false; + +// Reference the symbols so they are not tree-shaken or marked unused. +test("AC1: no trivia fields on semantic AST types", () => { + assert.equal(_moduleNoConfigLeading, false); + assert.equal(_moduleNoTrailing, false); + assert.equal(_moduleNoTopLevelOrder, false); + assert.equal(_importNoLeading, false); + assert.equal(_scriptImportNoLeading, false); + assert.equal(_channelNoLeading, false); + assert.equal(_testBlockNoLeading, false); + assert.equal(_metaNoConfigSeq, false); + assert.equal(_scriptNoBodyKind, false); + assert.equal(_logNoTripleQuoted, false); + assert.equal(_logerrNoTripleQuoted, false); + assert.equal(_failNoTripleQuoted, false); + assert.equal(_returnNoTripleQuoted, false); + assert.equal(_returnNoBareSource, false); + assert.equal(_promptNoBodyKind, false); + assert.equal(_promptNoBodyIdentifier, false); + assert.equal(_constExprNoTripleQuoted, false); + assert.equal(_constPromptNoBodyKind, false); + assert.equal(_constPromptNoBodyIdentifier, false); + assert.equal(_sendLiteralNoTripleQuoted, false); +}); diff --git a/src/parse/trivia-grep.test.ts b/src/parse/trivia-grep.test.ts new file mode 100644 index 00000000..7b409b27 --- /dev/null +++ b/src/parse/trivia-grep.test.ts @@ -0,0 +1,49 @@ +import test from "node:test"; +import assert from "node:assert/strict"; +import { readFileSync } from "node:fs"; +import { resolve, join } from "node:path"; + +// Tests run from dist/src/parse/, so repo root is three levels up. +const repoRoot = resolve(__dirname, "../../.."); + +/** Validator and emitter source files that must not reference Trivia. */ +const PROTECTED_FILES = [ + "src/transpile/validate.ts", + "src/transpile/validate-string.ts", + "src/transpile/validate-prompt-schema.ts", + "src/transpile/validate-ref-resolution.ts", + "src/transpile/validate-substitution.ts", + "src/transpile/validate-match.test.ts", + "src/transpile/emit-script.ts", + "src/transpile/emit-from-graph.ts", +]; + +test("AC2: validator and emitter sources do not import Trivia", () => { + for (const rel of PROTECTED_FILES) { + const abs = join(repoRoot, rel); + let content: string; + try { + content = readFileSync(abs, "utf8"); + } catch { + // File doesn't exist in this checkout — skip rather than fail. + continue; + } + // No imports from the trivia module. + assert.equal( + /from\s+["'][^"']*\/parse\/trivia["']/.test(content), + false, + `${rel} imports from parse/trivia — validator/emitter must not read Trivia`, + ); + // No reference to the Trivia identifier or its node-trivia fields. + const forbidden = ["Trivia", "createTrivia", "NodeTrivia", "ModuleTrivia"]; + for (const sym of forbidden) { + // Word boundary on each side. + const re = new RegExp(`\\b${sym}\\b`); + assert.equal( + re.test(content), + false, + `${rel} references ${sym} — validator/emitter must not see Trivia`, + ); + } + } +}); diff --git a/src/parse/trivia.ts b/src/parse/trivia.ts new file mode 100644 index 00000000..06bd14f3 --- /dev/null +++ b/src/parse/trivia.ts @@ -0,0 +1,78 @@ +import type { TopLevelEmitOrder } from "../types"; + +/** One line inside `config { }`: comment or assignment (formatter round-trip order). */ +export type ConfigBodyPart = + | { kind: "comment"; text: string } + | { kind: "assign"; key: string }; + +/** + * Per-node source-fidelity data. Each field is optional; presence indicates a + * particular surface form chosen by the author that the formatter needs to + * round-trip. The validator/emitter never look at this map. + * + * - `tripleQuoted`: the literal/return/log/logerr/fail/send/const was written + * as `"""..."""`. The AST string is the *dedented* form (so runtime & + * validator don't need this flag); the original raw body is in `rawBody`. + * - `rawBody`: original triple-quoted body (without surrounding `"""`), used + * by the formatter to re-emit the author's exact indentation. + * - `bareSource`: `return foo` and `return foo.bar` sugar — formatter + * re-emits the bare form instead of `"${foo}"`. + * - `bodyKind` (prompt): `"string" | "identifier" | "triple_quoted"`. + * - `bodyIdentifier` (prompt): identifier name when `bodyKind === "identifier"`. + * - `scriptBodyKind` (script): `"backtick" | "fenced"`. + * - `leadingComments`: `#` lines immediately before an import / channel / + * test block / env decl. + */ +export interface NodeTrivia { + tripleQuoted?: boolean; + rawBody?: string; + bareSource?: string; + bodyKind?: "string" | "identifier" | "triple_quoted"; + bodyIdentifier?: string; + scriptBodyKind?: "backtick" | "fenced"; + leadingComments?: string[]; + /** Order and comment lines inside `config { … }`; keyed on the metadata object. */ + configBodySequence?: ConfigBodyPart[]; +} + +/** Module-level source-fidelity data not tied to a specific node. */ +export interface ModuleTrivia { + configLeadingComments?: string[]; + configBodySequence?: ConfigBodyPart[]; + trailingTopLevelComments?: string[]; + topLevelOrder?: TopLevelEmitOrder[]; +} + +/** + * Trivia store. The parser builds it alongside the semantic AST and returns + * both via `parsejaiph`. The formatter reads it; nobody else does. + */ +export class Trivia { + private nodes = new WeakMap(); + private moduleData: ModuleTrivia = {}; + + setNode(node: object, info: NodeTrivia): void { + const existing = this.nodes.get(node); + if (existing) { + Object.assign(existing, info); + } else { + this.nodes.set(node, { ...info }); + } + } + + getNode(node: object): NodeTrivia | undefined { + return this.nodes.get(node); + } + + setModule(info: Partial): void { + Object.assign(this.moduleData, info); + } + + getModule(): ModuleTrivia { + return this.moduleData; + } +} + +export function createTrivia(): Trivia { + return new Trivia(); +} diff --git a/src/parse/workflow-brace.ts b/src/parse/workflow-brace.ts index 485d1c10..f0a52e26 100644 --- a/src/parse/workflow-brace.ts +++ b/src/parse/workflow-brace.ts @@ -1,4 +1,5 @@ import type { WorkflowMetadata, WorkflowStepDef } from "../types"; +import { createTrivia, type Trivia } from "./trivia"; import { colFromRaw, fail, @@ -9,7 +10,7 @@ import { parseLogMessageRhs, rejectTrailingContent, } from "./core"; -import { consumeTripleQuotedArg, tripleQuoteBodyToRaw } from "./triple-quote"; +import { consumeTripleQuotedArg, dedentTripleQuotedBody, tripleQuoteBodyToRaw } from "./triple-quote"; import { parseConstRhs } from "./const-rhs"; import { parseAnonymousInlineScript } from "./inline-script"; import { parseConfigBlock } from "./metadata"; @@ -37,6 +38,7 @@ export function parseBraceBlockBody( lines: string[], startIdx: number, openerLineNo: number, + trivia: Trivia = createTrivia(), opts?: BlockParseOpts, ): { steps: WorkflowStepDef[]; nextIdx: number } { const steps: WorkflowStepDef[] = []; @@ -72,7 +74,7 @@ export function parseBraceBlockBody( if (hadNonCommentStep) { fail(filePath, "config block inside workflow must appear before any steps", innerNo); } - const { metadata, nextIndex } = parseConfigBlock(filePath, lines, idx); + const { metadata, nextIndex } = parseConfigBlock(filePath, lines, idx, trivia); opts.onConfigBlock(metadata, innerNo); idx = nextIndex; continue; @@ -89,7 +91,7 @@ export function parseBraceBlockBody( ); } hadNonCommentStep = true; - const one = parseBlockStatement(filePath, lines, idx, opts); + const one = parseBlockStatement(filePath, lines, idx, trivia, opts); steps.push(one.step); idx = one.nextIdx; } @@ -103,6 +105,7 @@ export function parseBlockStatement( filePath: string, lines: string[], idx: number, + trivia: Trivia = createTrivia(), opts?: BlockParseOpts, ): { step: WorkflowStepDef; nextIdx: number } { const innerRaw = lines[idx]; @@ -145,7 +148,7 @@ export function parseBlockStatement( fail(filePath, `operator "${operator}" requires a regex operand (/pattern/), not a string`, innerNo, ifLoc.col); } - const { steps: body, nextIdx } = parseBraceBlockBody(filePath, lines, idx + 1, innerNo); + const { steps: body, nextIdx } = parseBraceBlockBody(filePath, lines, idx + 1, innerNo, trivia); return { step: { type: "if", subject, operator, operand, body, loc: ifLoc }, nextIdx, @@ -166,7 +169,7 @@ export function parseBlockStatement( const iterVar = forHead[1]; const sourceVar = forHead[2]; const forLoc = { line: innerNo, col: innerRaw.indexOf("for") + 1 }; - const { steps: body, nextIdx } = parseBraceBlockBody(filePath, lines, idx + 1, innerNo, opts); + const { steps: body, nextIdx } = parseBraceBlockBody(filePath, lines, idx + 1, innerNo, trivia, opts); return { step: { type: "for_lines", iterVar, sourceVar, body, loc: forLoc }, nextIdx, @@ -186,7 +189,7 @@ export function parseBlockStatement( const name = constMatch[1]; const rhs = constMatch[2].trim(); const { value, nextLineIdx } = parseConstRhs( - filePath, lines, idx, rhs, innerNo, innerRaw.indexOf(rhs) + 1, forRule, name, + filePath, lines, idx, rhs, innerNo, innerRaw.indexOf(rhs) + 1, forRule, name, trivia, ); const nextLine = nextLineIdx > idx ? nextLineIdx + 1 : idx + 1; return { @@ -201,11 +204,10 @@ export function parseBlockStatement( const failCol = innerRaw.indexOf("fail") + 1; if (arg.startsWith('"""')) { const { body, nextIdx } = consumeTripleQuotedArg(filePath, lines, idx, arg); - const message = tripleQuoteBodyToRaw(body); - return { - step: { type: "fail", message, tripleQuoted: true, loc: { line: innerNo, col: failCol } }, - nextIdx, - }; + const message = tripleQuoteBodyToRaw(dedentTripleQuotedBody(body)); + const step = { type: "fail" as const, message, loc: { line: innerNo, col: failCol } }; + trivia.setNode(step, { tripleQuoted: true, rawBody: body }); + return { step, nextIdx }; } if (!arg.startsWith('"')) { fail(filePath, 'fail must match: fail "" or fail """..."""', innerNo, failCol); @@ -232,7 +234,7 @@ export function parseBlockStatement( const ensureBody = inner.slice("ensure ".length).trim(); const r = parseEnsureStep( filePath, lines, idx, innerNo, innerRaw, - ensureBody, + ensureBody, undefined, trivia, ); return { step: r.step, nextIdx: r.nextIdx + 1 }; } @@ -243,7 +245,7 @@ export function parseBlockStatement( fail(filePath, "run async is not supported with inline scripts", innerNo, innerRaw.indexOf("run") + 1); } // run async ... recover(name) { ... } - const recoverResult = parseRunRecoverStep(filePath, lines, idx, innerNo, innerRaw, runBody); + const recoverResult = parseRunRecoverStep(filePath, lines, idx, innerNo, innerRaw, runBody, undefined, trivia); if (recoverResult && recoverResult.step.type === "run") { return { step: { ...recoverResult.step, async: true }, @@ -251,7 +253,7 @@ export function parseBlockStatement( }; } // run async ... catch(name) { ... } - const catchResult = parseRunCatchStep(filePath, lines, idx, innerNo, innerRaw, runBody); + const catchResult = parseRunCatchStep(filePath, lines, idx, innerNo, innerRaw, runBody, undefined, trivia); if (catchResult && catchResult.step.type === "run") { return { step: { ...catchResult.step, async: true }, @@ -298,12 +300,12 @@ export function parseBlockStatement( fail(filePath, 'inline script syntax has changed: use run `body`(args) instead of run script(args) "body"', innerNo); } // Check for run ... recover (loop semantics) - const recoverResult = parseRunRecoverStep(filePath, lines, idx, innerNo, innerRaw, runBody); + const recoverResult = parseRunRecoverStep(filePath, lines, idx, innerNo, innerRaw, runBody, undefined, trivia); if (recoverResult) { return { step: recoverResult.step, nextIdx: recoverResult.nextIdx + 1 }; } // Check for run ... catch - const catchResult = parseRunCatchStep(filePath, lines, idx, innerNo, innerRaw, runBody); + const catchResult = parseRunCatchStep(filePath, lines, idx, innerNo, innerRaw, runBody, undefined, trivia); if (catchResult) { return { step: catchResult.step, nextIdx: catchResult.nextIdx + 1 }; } @@ -342,7 +344,7 @@ export function parseBlockStatement( if (inner.startsWith("prompt ")) { const promptCol = innerRaw.indexOf("prompt") + 1; const promptArg = innerRaw.slice(innerRaw.indexOf("prompt") + "prompt".length).trimStart(); - const result = parsePromptStep(filePath, lines, idx, promptArg, promptCol); + const result = parsePromptStep(filePath, lines, idx, promptArg, promptCol, undefined, trivia); return { step: result.step, nextIdx: result.nextLineIdx + 1 }; } @@ -392,7 +394,9 @@ export function parseBlockStatement( } if (logArg.startsWith('"""')) { const { body, nextIdx } = consumeTripleQuotedArg(filePath, lines, idx, logArg); - return { step: { type: "log", message: body, tripleQuoted: true, loc: { line: innerNo, col: logCol } }, nextIdx }; + const step = { type: "log" as const, message: dedentTripleQuotedBody(body), loc: { line: innerNo, col: logCol } }; + trivia.setNode(step, { tripleQuoted: true, rawBody: body }); + return { step, nextIdx }; } if (logArg.startsWith('"') && !hasUnescapedClosingQuote(logArg, 1)) { fail(filePath, 'multiline strings use triple quotes: log """..."""', innerNo, logCol); @@ -428,7 +432,9 @@ export function parseBlockStatement( } if (logerrArg.startsWith('"""')) { const { body, nextIdx } = consumeTripleQuotedArg(filePath, lines, idx, logerrArg); - return { step: { type: "logerr", message: body, tripleQuoted: true, loc: { line: innerNo, col: logerrCol } }, nextIdx }; + const step = { type: "logerr" as const, message: dedentTripleQuotedBody(body), loc: { line: innerNo, col: logerrCol } }; + trivia.setNode(step, { tripleQuoted: true, rawBody: body }); + return { step, nextIdx }; } if (logerrArg.startsWith('"') && !hasUnescapedClosingQuote(logerrArg, 1)) { fail(filePath, 'multiline strings use triple quotes: logerr """..."""', innerNo, logerrCol); @@ -455,10 +461,13 @@ export function parseBlockStatement( // return """...""" if (returnValue.startsWith('"""')) { const { body, nextIdx } = consumeTripleQuotedArg(filePath, lines, idx, returnValue); - return { - step: { type: "return", value: tripleQuoteBodyToRaw(body), tripleQuoted: true, loc: retLoc }, - nextIdx, + const step = { + type: "return" as const, + value: tripleQuoteBodyToRaw(dedentTripleQuotedBody(body)), + loc: retLoc, }; + trivia.setNode(step, { tripleQuoted: true, rawBody: body }); + return { step, nextIdx }; } // return match var { ... } const returnMatchHead = returnValue.match(/^match\s+(.+?)\s*\{\s*$/); @@ -561,13 +570,12 @@ export function parseBlockStatement( : isBare ? bareIdentifierToQuotedString(returnValue) : returnValue; + const step = { type: "return" as const, value, loc: retLoc }; + if (isBareDotted || isBare) { + trivia.setNode(step, { bareSource: returnValue.trim() }); + } return { - step: { - type: "return", - value, - loc: retLoc, - ...(isBareDotted || isBare ? { bareSource: returnValue.trim() } : {}), - }, + step, nextIdx: idx + 1, }; } @@ -592,7 +600,7 @@ export function parseBlockStatement( } const arrowIdx = inner.indexOf("<-"); const rhsCol = arrowIdx >= 0 ? arrowIdx + 3 : 1; - const { rhs, nextIdx: sendNextIdx } = parseSendRhs(filePath, sendMatch.rhsText, innerNo, rhsCol, lines, idx); + const { rhs, nextIdx: sendNextIdx } = parseSendRhs(filePath, sendMatch.rhsText, innerNo, rhsCol, lines, idx, trivia); return { step: { type: "send", diff --git a/src/parse/workflows.ts b/src/parse/workflows.ts index 3ec9156f..d972d133 100644 --- a/src/parse/workflows.ts +++ b/src/parse/workflows.ts @@ -1,4 +1,5 @@ import type { WorkflowDef } from "../types"; +import { createTrivia, type Trivia } from "./trivia"; import { fail, parseParamList } from "./core"; import { parseBraceBlockBody } from "./workflow-brace"; @@ -7,6 +8,7 @@ export function parseWorkflowBlock( lines: string[], startIndex: number, pendingComments: string[], + trivia: Trivia = createTrivia(), ): { workflow: WorkflowDef; nextIndex: number; exported: boolean } { const lineNo = startIndex + 1; const rawDecl = lines[startIndex]; @@ -58,6 +60,7 @@ export function parseWorkflowBlock( lines, startIndex + 1, lineNo, + trivia, { forRule: false, preserveBlankLines: true, diff --git a/src/parser.ts b/src/parser.ts index 15696835..bc3379d1 100644 --- a/src/parser.ts +++ b/src/parser.ts @@ -1,4 +1,5 @@ -import { jaiphModule } from "./types"; +import { jaiphModule, TopLevelEmitOrder } from "./types"; +import { Trivia, createTrivia } from "./parse/trivia"; import { fail } from "./parse/core"; import { parseChannelLine } from "./parse/channels"; import { parseEnvDecl } from "./parse/env"; @@ -9,7 +10,17 @@ import { parseScriptBlock } from "./parse/scripts"; import { parseWorkflowBlock } from "./parse/workflows"; import { parseTestBlock } from "./parse/tests"; +export interface ParseResult { + ast: jaiphModule; + trivia: Trivia; +} + export function parsejaiph(source: string, filePath: string): jaiphModule { + return parsejaiphWithTrivia(source, filePath).ast; +} + +export function parsejaiphWithTrivia(source: string, filePath: string): ParseResult { + const trivia = createTrivia(); const lines = source.split(/\r?\n/); const mod: jaiphModule = { filePath, @@ -19,8 +30,8 @@ export function parsejaiph(source: string, filePath: string): jaiphModule { rules: [], scripts: [], workflows: [], - topLevelOrder: [], }; + const topLevelOrder: TopLevelEmitOrder[] = []; let i = 0; let pendingTopLevelComments: string[] = []; @@ -48,10 +59,10 @@ export function parsejaiph(source: string, filePath: string): jaiphModule { fail(filePath, "duplicate config block (only one allowed per file)", lineNo, 1); } if (pendingTopLevelComments.length > 0) { - mod.configLeadingComments = [...pendingTopLevelComments]; + trivia.setModule({ configLeadingComments: [...pendingTopLevelComments] }); pendingTopLevelComments = []; } - const { metadata, nextIndex } = parseConfigBlock(filePath, lines, i - 1); + const { metadata, nextIndex } = parseConfigBlock(filePath, lines, i - 1, trivia); mod.metadata = metadata; i = nextIndex; continue; @@ -60,7 +71,7 @@ export function parsejaiph(source: string, filePath: string): jaiphModule { if (line.startsWith("import script ")) { const si = parseScriptImportLine(filePath, line, raw, lineNo); if (pendingTopLevelComments.length > 0) { - si.leadingComments = [...pendingTopLevelComments]; + trivia.setNode(si, { leadingComments: [...pendingTopLevelComments] }); pendingTopLevelComments = []; } if (!mod.scriptImports) mod.scriptImports = []; @@ -71,7 +82,7 @@ export function parsejaiph(source: string, filePath: string): jaiphModule { if (line.startsWith("import ")) { const imp = parseImportLine(filePath, line, raw, lineNo); if (pendingTopLevelComments.length > 0) { - imp.leadingComments = [...pendingTopLevelComments]; + trivia.setNode(imp, { leadingComments: [...pendingTopLevelComments] }); pendingTopLevelComments = []; } mod.imports.push(imp); @@ -81,7 +92,7 @@ export function parsejaiph(source: string, filePath: string): jaiphModule { if (line.startsWith("channel ")) { const ch = parseChannelLine(filePath, line, raw, lineNo); if (pendingTopLevelComments.length > 0) { - ch.leadingComments = [...pendingTopLevelComments]; + trivia.setNode(ch, { leadingComments: [...pendingTopLevelComments] }); pendingTopLevelComments = []; } mod.channels.push(ch); @@ -99,11 +110,14 @@ export function parsejaiph(source: string, filePath: string): jaiphModule { filePath, lines, i - 1, - pendingTopLevelComments.length > 0 ? [...pendingTopLevelComments] : undefined, + trivia, ); + if (pendingTopLevelComments.length > 0) { + trivia.setNode(testBlock, { leadingComments: [...pendingTopLevelComments] }); + } pendingTopLevelComments = []; mod.tests.push(testBlock); - mod.topLevelOrder!.push({ kind: "test", index: mod.tests.length - 1 }); + topLevelOrder.push({ kind: "test", index: mod.tests.length - 1 }); i = nextIndex; continue; } @@ -118,43 +132,43 @@ export function parsejaiph(source: string, filePath: string): jaiphModule { mod.envDecls = []; } mod.envDecls.push(envDecl); - mod.topLevelOrder!.push({ kind: "env", index: mod.envDecls.length - 1 }); + topLevelOrder.push({ kind: "env", index: mod.envDecls.length - 1 }); i = nextIndex; continue; } if (/^(export\s+)?rule\s/.test(line)) { - const { rule, nextIndex, exported } = parseRuleBlock(filePath, lines, i - 1, pendingTopLevelComments); + const { rule, nextIndex, exported } = parseRuleBlock(filePath, lines, i - 1, pendingTopLevelComments, trivia); pendingTopLevelComments = []; if (exported) { mod.exports.push(rule.name); } mod.rules.push(rule); - mod.topLevelOrder!.push({ kind: "rule", index: mod.rules.length - 1 }); + topLevelOrder.push({ kind: "rule", index: mod.rules.length - 1 }); i = nextIndex; continue; } if (/^(export\s+)?script\s/.test(line)) { - const { scriptDef, nextIndex, exported } = parseScriptBlock(filePath, lines, i - 1, pendingTopLevelComments); + const { scriptDef, nextIndex, exported } = parseScriptBlock(filePath, lines, i - 1, pendingTopLevelComments, trivia); pendingTopLevelComments = []; if (exported) { mod.exports.push(scriptDef.name); } mod.scripts.push(scriptDef); - mod.topLevelOrder!.push({ kind: "script", index: mod.scripts.length - 1 }); + topLevelOrder.push({ kind: "script", index: mod.scripts.length - 1 }); i = nextIndex; continue; } if (/^(export\s+)?workflow\s/.test(line)) { - const { workflow, nextIndex, exported } = parseWorkflowBlock(filePath, lines, i - 1, pendingTopLevelComments); + const { workflow, nextIndex, exported } = parseWorkflowBlock(filePath, lines, i - 1, pendingTopLevelComments, trivia); pendingTopLevelComments = []; if (exported) { mod.exports.push(workflow.name); } mod.workflows.push(workflow); - mod.topLevelOrder!.push({ kind: "workflow", index: mod.workflows.length - 1 }); + topLevelOrder.push({ kind: "workflow", index: mod.workflows.length - 1 }); i = nextIndex; continue; } @@ -162,8 +176,9 @@ export function parsejaiph(source: string, filePath: string): jaiphModule { fail(filePath, `unsupported top-level statement: ${line}`, lineNo); } + trivia.setModule({ topLevelOrder }); if (pendingTopLevelComments.length > 0) { - mod.trailingTopLevelComments = [...pendingTopLevelComments]; + trivia.setModule({ trailingTopLevelComments: [...pendingTopLevelComments] }); } // Unified namespace: imports, channels, rules, workflows, scripts, and consts all share one name space. @@ -189,5 +204,5 @@ export function parsejaiph(source: string, filePath: string): jaiphModule { } } - return mod; + return { ast: mod, trivia }; } diff --git a/src/runtime/kernel/graph.ts b/src/runtime/kernel/graph.ts index b5a896a9..73022f0f 100644 --- a/src/runtime/kernel/graph.ts +++ b/src/runtime/kernel/graph.ts @@ -29,7 +29,6 @@ function attachScriptImportStubs(ast: jaiphModule): void { name: si.alias, comments: [], body: "", - bodyKind: "fenced", loc: si.loc, }); } diff --git a/src/runtime/kernel/node-workflow-runtime.ts b/src/runtime/kernel/node-workflow-runtime.ts index d6c91545..7ef18adc 100644 --- a/src/runtime/kernel/node-workflow-runtime.ts +++ b/src/runtime/kernel/node-workflow-runtime.ts @@ -12,10 +12,7 @@ import { buildStepDisplayParamPairs } from "../../cli/commands/format-params.js" import { resolveRuleRef, resolveScriptRef, resolveWorkflowRef, type RuntimeGraph } from "./graph"; import type { WorkflowMetadata } from "../../types"; import { extractJson, validateFields } from "./schema"; -import { - plainMultilineOrchestrationForRuntime, - tripleQuotedRawForRuntime, -} from "../orchestration-text"; +import { tripleQuotedRawForRuntime } from "../orchestration-text"; import { commaArgsToInterpolated, interpolate, @@ -529,8 +526,7 @@ export class NodeWorkflowRuntime { if (result.status !== 0) return this.mergeStepResult(accOut, accErr, result); message = result.returnValue ?? result.output.trim(); } else { - const raw = step.tripleQuoted ? plainMultilineOrchestrationForRuntime(step.message) : step.message; - const ir = await this.interpolateWithCaptures(raw, scope); + const ir = await this.interpolateWithCaptures(step.message, scope); if (!ir.ok) return this.mergeStepResult(accOut, accErr, ir.result); message = ir.value; } @@ -546,8 +542,7 @@ export class NodeWorkflowRuntime { continue; } if (step.type === "fail") { - const failMsg = step.tripleQuoted ? tripleQuotedRawForRuntime(step.message) : step.message; - const failIr = await this.interpolateWithCaptures(failMsg, scope); + const failIr = await this.interpolateWithCaptures(step.message, scope); if (!failIr.ok) return this.mergeStepResult(accOut, accErr, failIr.result); const message = failIr.value; return this.mergeStepResult(accOut, accErr, { status: 1, output: "", error: message }); @@ -588,8 +583,7 @@ export class NodeWorkflowRuntime { return this.mergeStepResult(accOut, accErr, { status: 0, output: "", error: "", returnValue }); } // Match Bash semantics: return "$var" should return var value, not literal quotes. - const retRaw = step.tripleQuoted ? tripleQuotedRawForRuntime(step.value) : step.value; - const retIr = await this.interpolateWithCaptures(retRaw, scope); + const retIr = await this.interpolateWithCaptures(step.value, scope); if (!retIr.ok) return this.mergeStepResult(accOut, accErr, retIr.result); returnValue = stripOuterQuotes(retIr.value); return this.mergeStepResult(accOut, accErr, { status: 0, output: "", error: "", returnValue }); @@ -605,9 +599,7 @@ export class NodeWorkflowRuntime { } let payload = ""; if (step.rhs.kind === "literal") { - const sendTok = - step.rhs.tripleQuoted ? tripleQuotedRawForRuntime(step.rhs.token) : step.rhs.token; - const sendIr = await this.interpolateWithCaptures(sendTok, scope); + const sendIr = await this.interpolateWithCaptures(step.rhs.token, scope); if (!sendIr.ok) return this.mergeStepResult(accOut, accErr, sendIr.result); payload = stripOuterQuotes(sendIr.value); } else if (step.rhs.kind === "var") { @@ -673,16 +665,14 @@ export class NodeWorkflowRuntime { error: 'prompt with "returns" schema must capture to a variable', }); } - const r = await this.runPromptStep(scope, step.raw, step.bodyKind, step.returns, step.captureName, io); + const r = await this.runPromptStep(scope, step.raw, step.returns, step.captureName, io); accOut += r.output; if (!r.ok) return this.mergeStepResult(accOut, accErr, r.result); continue; } if (step.type === "const") { if (step.value.kind === "expr") { - const exprRhs = - step.value.tripleQuoted ? tripleQuotedRawForRuntime(step.value.bashRhs) : step.value.bashRhs; - const exprIr = await this.interpolateWithCaptures(exprRhs, scope); + const exprIr = await this.interpolateWithCaptures(step.value.bashRhs, scope); if (!exprIr.ok) return this.mergeStepResult(accOut, accErr, exprIr.result); scope.vars.set(step.name, stripOuterQuotes(exprIr.value)); continue; @@ -733,7 +723,6 @@ export class NodeWorkflowRuntime { const r = await this.runPromptStep( scope, step.value.raw, - step.value.bodyKind, step.value.returns, step.name, io, @@ -1091,13 +1080,11 @@ export class NodeWorkflowRuntime { private async runPromptStep( scope: Scope, raw: string, - bodyKind: "string" | "identifier" | "triple_quoted" | undefined, returns: string | undefined, captureName: string | undefined, io: StepIO | undefined, ): Promise<{ ok: true; output: string } | { ok: false; result: StepResult; output: string }> { - const promptRaw = bodyKind === "triple_quoted" ? tripleQuotedRawForRuntime(raw) : raw; - const promptIr = await this.interpolateWithCaptures(promptRaw, scope); + const promptIr = await this.interpolateWithCaptures(raw, scope); if (!promptIr.ok) return { ok: false, result: promptIr.result, output: "" }; let promptText = promptIr.value; const promptConfig = resolveConfig(scope.env); diff --git a/src/runtime/orchestration-text.ts b/src/runtime/orchestration-text.ts index 0940e27b..f31d9af1 100644 --- a/src/runtime/orchestration-text.ts +++ b/src/runtime/orchestration-text.ts @@ -7,16 +7,12 @@ function unescapeDslDoubleQuotedInner(inner: string): string { } /** - * Values stored as `tripleQuoteBodyToRaw(parsedBody)` keep source indentation for the formatter. - * At runtime, apply common-leading-whitespace removal (same as historical parse-time dedent). + * Apply common-leading-whitespace dedent to a `tripleQuoteBodyToRaw`-encoded + * value. Still used for match-arm bodies (which carry their own + * `tripleQuotedBody` flag and are not part of the trivia split). */ export function tripleQuotedRawForRuntime(raw: string): string { if (raw.length < 2 || raw[0] !== '"' || raw[raw.length - 1] !== '"') return raw; const inner = unescapeDslDoubleQuotedInner(raw.slice(1, -1)); return tripleQuoteBodyToRaw(dedentCommonLeadingWhitespace(inner)); } - -/** Plain multiline text from `log """…"""` / `logerr` / `fail` (no surrounding quotes in AST). */ -export function plainMultilineOrchestrationForRuntime(text: string): string { - return dedentCommonLeadingWhitespace(text); -} diff --git a/src/transpile/validate-ref-resolution.test.ts b/src/transpile/validate-ref-resolution.test.ts index a45329f3..42234774 100644 --- a/src/transpile/validate-ref-resolution.test.ts +++ b/src/transpile/validate-ref-resolution.test.ts @@ -58,7 +58,7 @@ test("lookupKind: finds workflow", () => { test("lookupKind: finds script", () => { const mod = minimalModule({ - scripts: [{ name: "build_it", comments: [], body: "", bodyKind: "backtick" as const, loc: { line: 1, col: 1 } }], + scripts: [{ name: "build_it", comments: [], body: "", loc: { line: 1, col: 1 } }], }); assert.equal(lookupKind(mod, "build_it"), "script"); }); @@ -241,7 +241,7 @@ test("validateRef: bare_send_rhs rejects local workflow", () => { test("validateRef: bare_send_rhs rejects local script", () => { const mod = minimalModule({ - scripts: [{ name: "build", comments: [], body: "", bodyKind: "backtick" as const, loc: { line: 1, col: 1 } }], + scripts: [{ name: "build", comments: [], body: "", loc: { line: 1, col: 1 } }], }); const ctx = makeCtx(); assert.throws( diff --git a/src/transpile/validate-string.ts b/src/transpile/validate-string.ts index f6cdff05..34777e53 100644 --- a/src/transpile/validate-string.ts +++ b/src/transpile/validate-string.ts @@ -11,7 +11,6 @@ import { jaiphError } from "../errors"; import { parseCallRef } from "../parse/core"; -import { dedentCommonLeadingWhitespace } from "../parse/dedent"; /** * Check for shell fallback/expansion syntax inside ${...} blocks. @@ -298,15 +297,15 @@ export function validatePromptString( filePath: string, line: number, col: number, - opts?: { tripleQuoted?: boolean }, ): void { - let content = stripDoubleQuotes(raw); - if (opts?.tripleQuoted) content = dedentCommonLeadingWhitespace(content); + const content = stripDoubleQuotes(raw); validateJaiphStringContent(content, filePath, line, col, "prompt"); } /** - * Validate a log/logerr message (inner content without quotes). + * Validate a log/logerr message (inner content without quotes). Triple-quoted + * messages arrive pre-dedented from the parser, so this validator no longer + * needs to know about that distinction. */ export function validateLogString( message: string, @@ -314,10 +313,8 @@ export function validateLogString( line: number, col: number, keyword: string, - opts?: { tripleQuoted?: boolean }, ): void { - const text = opts?.tripleQuoted ? dedentCommonLeadingWhitespace(message) : message; - validateJaiphStringContent(text, filePath, line, col, keyword); + validateJaiphStringContent(message, filePath, line, col, keyword); } /** @@ -328,10 +325,8 @@ export function validateFailString( filePath: string, line: number, col: number, - opts?: { tripleQuoted?: boolean }, ): void { - let content = stripDoubleQuotes(message); - if (opts?.tripleQuoted) content = dedentCommonLeadingWhitespace(content); + const content = stripDoubleQuotes(message); validateJaiphStringContent(content, filePath, line, col, "fail"); } @@ -343,11 +338,9 @@ export function validateReturnString( filePath: string, line: number, col: number, - opts?: { tripleQuoted?: boolean }, ): void { if (value.startsWith('"')) { - let content = stripDoubleQuotes(value); - if (opts?.tripleQuoted) content = dedentCommonLeadingWhitespace(content); + const content = stripDoubleQuotes(value); validateJaiphStringContent(content, filePath, line, col, "return"); } } diff --git a/src/transpile/validate.ts b/src/transpile/validate.ts index 1a8ba196..0bd0aff8 100644 --- a/src/transpile/validate.ts +++ b/src/transpile/validate.ts @@ -26,7 +26,6 @@ import { extractDotFieldRefs, } from "./validate-string"; import { validatePromptReturnsSchema, validatePromptStepReturns } from "./validate-prompt-schema"; -import { dedentCommonLeadingWhitespace } from "../parse/dedent"; import { matchSendOperator } from "../parse/core"; import { tripleQuotedRawForRuntime } from "../runtime/orchestration-text"; @@ -611,10 +610,13 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { return m?.[1]; }; - /** Inner string for validation: same margin removal as runtime for `"""` orchestration text. */ - const semanticQuotedOrchestrationInner = (dqRaw: string, tripleQuoted: boolean): string => { - if (!tripleQuoted) return stripDQ(dqRaw); - return stripDQ(tripleQuotedRawForRuntime(dqRaw)); + /** Inner string for validation. Triple-quoted bodies are pre-dedented by the parser. */ + const semanticQuotedOrchestrationInner = (dqRaw: string): string => stripDQ(dqRaw); + + /** Detect `prompt ` form from raw `"${identifier}"` shape. */ + const promptBareIdentifier = (raw: string): string | undefined => { + const m = raw.match(/^"\$\{([A-Za-z_][A-Za-z0-9_]*)\}"$/); + return m?.[1]; }; /** Parse field names from a returns schema string like '{ name: string, age: number }'. */ @@ -762,8 +764,8 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { return; } if (s.type === "fail") { - validateFailString(s.message, ast.filePath, s.loc.line, s.loc.col, { tripleQuoted: s.tripleQuoted }); - const failInner = semanticQuotedOrchestrationInner(s.message, s.tripleQuoted === true); + validateFailString(s.message, ast.filePath, s.loc.line, s.loc.col); + const failInner = semanticQuotedOrchestrationInner(s.message); validateRuleStringCaptures(failInner, s.loc); validateSimpleInterpolationIdentifiers( failInner, @@ -781,8 +783,8 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { } if (s.type === "log") { if (s.managed?.kind === "run_inline_script") return; // inline script — no ref to validate - validateLogString(s.message, ast.filePath, s.loc.line, s.loc.col, "log", { tripleQuoted: s.tripleQuoted }); - const logRuleInner = s.tripleQuoted ? dedentCommonLeadingWhitespace(s.message) : s.message; + validateLogString(s.message, ast.filePath, s.loc.line, s.loc.col, "log"); + const logRuleInner = s.message; validateRuleStringCaptures(logRuleInner, s.loc); validateSimpleInterpolationIdentifiers( logRuleInner, @@ -800,10 +802,8 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { } if (s.type === "logerr") { if (s.managed?.kind === "run_inline_script") return; // inline script — no ref to validate - validateLogString(s.message, ast.filePath, s.loc.line, s.loc.col, "logerr", { - tripleQuoted: s.tripleQuoted, - }); - const logerrRuleInner = s.tripleQuoted ? dedentCommonLeadingWhitespace(s.message) : s.message; + validateLogString(s.message, ast.filePath, s.loc.line, s.loc.col, "logerr"); + const logerrRuleInner = s.message; validateRuleStringCaptures(logerrRuleInner, s.loc); validateSimpleInterpolationIdentifiers( logerrRuleInner, @@ -840,9 +840,9 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { } // run_inline_script — no ref to validate } else { - validateReturnString(s.value, ast.filePath, s.loc.line, s.loc.col, { tripleQuoted: s.tripleQuoted }); + validateReturnString(s.value, ast.filePath, s.loc.line, s.loc.col); if (s.value.startsWith('"')) { - const retRuleInner = semanticQuotedOrchestrationInner(s.value, s.tripleQuoted === true); + const retRuleInner = semanticQuotedOrchestrationInner(s.value); validateRuleStringCaptures(retRuleInner, s.loc); validateSimpleInterpolationIdentifiers( retRuleInner, @@ -1108,14 +1108,13 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { return; } if (s.type === "prompt") { - if (s.bodyKind === "identifier" && s.bodyIdentifier && localScripts.has(s.bodyIdentifier)) { - throw jaiphError(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", `scripts are not promptable; "${s.bodyIdentifier}" is a script — use a string const instead`); + const promptIdent = promptBareIdentifier(s.raw); + if (promptIdent && localScripts.has(promptIdent)) { + throw jaiphError(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", `scripts are not promptable; "${promptIdent}" is a script — use a string const instead`); } - validatePromptString(s.raw, ast.filePath, s.loc.line, s.loc.col, { - tripleQuoted: s.bodyKind === "triple_quoted", - }); + validatePromptString(s.raw, ast.filePath, s.loc.line, s.loc.col); validatePromptStepReturns(s, ast.filePath); - const promptInner = semanticQuotedOrchestrationInner(s.raw, s.bodyKind === "triple_quoted"); + const promptInner = semanticQuotedOrchestrationInner(s.raw); validateWorkflowStringCaptures(promptInner, s.loc); validateDotFieldRefs(promptInner, s.loc, promptSchemas); validateSimpleInterpolationIdentifiers( @@ -1134,10 +1133,8 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { } if (s.type === "log") { if (s.managed?.kind === "run_inline_script") return; // inline script — no ref to validate - validateLogString(s.message, ast.filePath, s.loc.line, s.loc.col, "log", { - tripleQuoted: s.tripleQuoted, - }); - const logInner = s.tripleQuoted ? dedentCommonLeadingWhitespace(s.message) : s.message; + validateLogString(s.message, ast.filePath, s.loc.line, s.loc.col, "log"); + const logInner = s.message; validateWorkflowStringCaptures(logInner, s.loc); validateDotFieldRefs(logInner, s.loc, promptSchemas); validateSimpleInterpolationIdentifiers( @@ -1156,10 +1153,8 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { } if (s.type === "logerr") { if (s.managed?.kind === "run_inline_script") return; // inline script — no ref to validate - validateLogString(s.message, ast.filePath, s.loc.line, s.loc.col, "logerr", { - tripleQuoted: s.tripleQuoted, - }); - const logerrInner = s.tripleQuoted ? dedentCommonLeadingWhitespace(s.message) : s.message; + validateLogString(s.message, ast.filePath, s.loc.line, s.loc.col, "logerr"); + const logerrInner = s.message; validateWorkflowStringCaptures(logerrInner, s.loc); validateDotFieldRefs(logerrInner, s.loc, promptSchemas); validateSimpleInterpolationIdentifiers( @@ -1197,9 +1192,9 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { } return; } - validateReturnString(s.value, ast.filePath, s.loc.line, s.loc.col, { tripleQuoted: s.tripleQuoted }); + validateReturnString(s.value, ast.filePath, s.loc.line, s.loc.col); if (s.value.startsWith('"')) { - const retInner = semanticQuotedOrchestrationInner(s.value, s.tripleQuoted === true); + const retInner = semanticQuotedOrchestrationInner(s.value); validateWorkflowStringCaptures(retInner, s.loc); validateDotFieldRefs(retInner, s.loc, promptSchemas); validateSimpleInterpolationIdentifiers( @@ -1218,8 +1213,8 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { return; } if (s.type === "fail") { - validateFailString(s.message, ast.filePath, s.loc.line, s.loc.col, { tripleQuoted: s.tripleQuoted }); - const failWfInner = semanticQuotedOrchestrationInner(s.message, s.tripleQuoted === true); + validateFailString(s.message, ast.filePath, s.loc.line, s.loc.col); + const failWfInner = semanticQuotedOrchestrationInner(s.message); validateWorkflowStringCaptures(failWfInner, s.loc); validateDotFieldRefs(failWfInner, s.loc, promptSchemas); validateSimpleInterpolationIdentifiers( @@ -1256,16 +1251,15 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { validateBareIdentifierArgs(ast.filePath, v.ref.loc, v.bareIdentifierArgs, wfKnownVars, recoverBindings); } else if (v.kind === "prompt_capture") { - if (v.bodyKind === "identifier" && v.bodyIdentifier && localScripts.has(v.bodyIdentifier)) { - throw jaiphError(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", `scripts are not promptable; "${v.bodyIdentifier}" is a script — use a string const instead`); + const promptIdent = promptBareIdentifier(v.raw); + if (promptIdent && localScripts.has(promptIdent)) { + throw jaiphError(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", `scripts are not promptable; "${promptIdent}" is a script — use a string const instead`); } - validatePromptString(v.raw, ast.filePath, s.loc.line, s.loc.col, { - tripleQuoted: v.bodyKind === "triple_quoted", - }); + validatePromptString(v.raw, ast.filePath, s.loc.line, s.loc.col); if (v.returns !== undefined) { validatePromptReturnsSchema(v.returns, ast.filePath, s.loc.line, s.loc.col); } - const pcInner = semanticQuotedOrchestrationInner(v.raw, v.bodyKind === "triple_quoted"); + const pcInner = semanticQuotedOrchestrationInner(v.raw); validateWorkflowStringCaptures(pcInner, s.loc); validateDotFieldRefs(pcInner, s.loc, promptSchemas); validateSimpleInterpolationIdentifiers( @@ -1289,7 +1283,7 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { if (scriptName && localScripts.has(scriptName)) { throw jaiphError(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", `scripts are not values; "${scriptName}" is a script definition`); } - const exprInner = semanticQuotedOrchestrationInner(v.bashRhs, v.tripleQuoted === true); + const exprInner = semanticQuotedOrchestrationInner(v.bashRhs); validateWorkflowStringCaptures(exprInner, s.loc); validateDotFieldRefs(exprInner, s.loc, promptSchemas); validateSimpleInterpolationIdentifiers( diff --git a/src/types.ts b/src/types.ts index 61e6abff..e093e213 100644 --- a/src/types.ts +++ b/src/types.ts @@ -7,8 +7,6 @@ export interface ImportDef { path: string; alias: string; loc: SourceLoc; - /** Top-level `#` lines immediately before this import (formatter). */ - leadingComments?: string[]; } /** `import script "" as ` — binds an external script file as a local script symbol. */ @@ -18,8 +16,6 @@ export interface ScriptImportDef { /** Bound script name. */ alias: string; loc: SourceLoc; - /** Top-level `#` lines immediately before this import (formatter). */ - leadingComments?: string[]; } export interface RuleRefDef { @@ -52,16 +48,12 @@ export interface MatchExprDef { } export type ConstRhs = - | { kind: "expr"; bashRhs: string; /** `const x = """..."""` — runtime dedents margin. */ tripleQuoted?: boolean } + | { kind: "expr"; bashRhs: string } | { kind: "run_capture"; ref: WorkflowRefDef; args?: string; bareIdentifierArgs?: string[]; async?: boolean } | { kind: "ensure_capture"; ref: RuleRefDef; args?: string; bareIdentifierArgs?: string[] } | { kind: "prompt_capture"; raw: string; - /** Body source: "string" (quoted literal), "identifier" (bare var ref), "triple_quoted" (""" block). */ - bodyKind?: "string" | "identifier" | "triple_quoted"; - /** Original identifier name when bodyKind is "identifier". */ - bodyIdentifier?: string; loc: SourceLoc; returns?: string; } @@ -70,7 +62,7 @@ export type ConstRhs = /** RHS of `channel <- …` */ export type SendRhsDef = - | { kind: "literal"; token: string; /** `channel <- """..."""` — runtime dedents margin. */ tripleQuoted?: boolean } + | { kind: "literal"; token: string } | { kind: "var"; bash: string } | { kind: "run"; ref: WorkflowRefDef; args?: string; bareIdentifierArgs?: string[] } /** Parsed then rejected in validation (use `run ref` to capture a return value). */ @@ -92,8 +84,6 @@ export interface ChannelDef { name: string; routes?: WorkflowRefDef[]; loc: SourceLoc; - /** Top-level `#` lines immediately before this channel (formatter). */ - leadingComments?: string[]; } export interface WorkflowDef { @@ -114,8 +104,6 @@ export interface ScriptDef { body: string; /** Fence language tag (e.g. "python3", "node"). Maps to `#!/usr/bin/env `. */ lang?: string; - /** How the body was provided: "backtick" (single `), "fenced" (``` block). */ - bodyKind: "backtick" | "fenced"; loc: SourceLoc; } @@ -153,10 +141,6 @@ export type WorkflowStepDef = | { type: "prompt"; raw: string; - /** Body source: "string" (quoted literal), "identifier" (bare var ref), "triple_quoted" (""" block). */ - bodyKind?: "string" | "identifier" | "triple_quoted"; - /** Original identifier name when bodyKind is "identifier". */ - bodyIdentifier?: string; loc: SourceLoc; /** When set, capture prompt stdout into this variable name. */ captureName?: string; @@ -171,8 +155,6 @@ export type WorkflowStepDef = | { type: "fail"; message: string; - /** Set when `fail """..."""`; runtime dedents margin. */ - tripleQuoted?: boolean; loc: SourceLoc; } | { @@ -184,8 +166,6 @@ export type WorkflowStepDef = | { type: "log"; message: string; - /** Set when `log """..."""`; runtime dedents margin. */ - tripleQuoted?: boolean; loc: SourceLoc; /** When set, log message comes from a managed inline-script call. */ managed?: { kind: "run_inline_script"; body: string; lang?: string; args?: string; bareIdentifierArgs?: string[] }; @@ -193,8 +173,6 @@ export type WorkflowStepDef = | { type: "logerr"; message: string; - /** Set when `logerr """..."""`; runtime dedents margin. */ - tripleQuoted?: boolean; loc: SourceLoc; /** When set, logerr message comes from a managed inline-script call. */ managed?: { kind: "run_inline_script"; body: string; lang?: string; args?: string; bareIdentifierArgs?: string[] }; @@ -208,14 +186,6 @@ export type WorkflowStepDef = | { type: "return"; value: string; - /** Set when `return """..."""`; runtime dedents margin. */ - tripleQuoted?: boolean; - /** - * Original source expression when `return ` was bare-identifier - * sugar (`return response` → value `"${response}"`). Preserved so the - * formatter can emit the bare form authored by the user. - */ - bareSource?: string; loc: SourceLoc; /** When set, return value comes from a managed run/ensure/match instead of the literal `value`. */ managed?: @@ -284,8 +254,6 @@ export interface jaiphModule { filePath: string; /** Optional in-file workflow metadata (agent model, command, run options). */ metadata?: WorkflowMetadata; - /** Top-level `#` lines immediately before `config {` (formatter). */ - configLeadingComments?: string[]; imports: ImportDef[]; /** `import script "" as ` declarations. */ scriptImports?: ScriptImportDef[]; @@ -298,10 +266,6 @@ export interface jaiphModule { envDecls?: EnvDeclDef[]; /** Present only when parsing a *.test.jh file. */ tests?: TestBlockDef[]; - /** Encounter order of rule / script / workflow / env / test (excludes imports, config, channels). */ - topLevelOrder?: TopLevelEmitOrder[]; - /** Top-level `#` lines after the last declaration (formatter). */ - trailingTopLevelComments?: string[]; } /** Docker sandbox runtime configuration. */ @@ -311,11 +275,6 @@ export interface RuntimeConfig { dockerTimeoutSeconds?: number; } -/** One line inside `config { }`: comment or assignment (formatter round-trip order). */ -export type ConfigBodyPart = - | { kind: "comment"; text: string } - | { kind: "assign"; key: string }; - /** In-file workflow metadata (replaces config file for V1). */ export interface WorkflowMetadata { agent?: { @@ -329,8 +288,6 @@ export interface WorkflowMetadata { run?: { debug?: boolean; logsDir?: string; recoverLimit?: number }; runtime?: RuntimeConfig; module?: { name?: string; version?: string; description?: string }; - /** Preserves `#` lines and assignment order inside `config { }` (formatter). */ - configBodySequence?: ConfigBodyPart[]; } /** Step inside a test block. Only present when module is a test file (*.test.jh). */ @@ -397,8 +354,6 @@ export interface TestBlockDef { description: string; steps: TestStepDef[]; loc: SourceLoc; - /** Top-level `#` lines immediately before this `test` block (formatter). */ - leadingComments?: string[]; } export interface JaiphTestModule { diff --git a/test-fixtures/golden-ast/expected/brace-if.json b/test-fixtures/golden-ast/expected/brace-if.json index 1da5f6a0..aa70b932 100644 --- a/test-fixtures/golden-ast/expected/brace-if.json +++ b/test-fixtures/golden-ast/expected/brace-if.json @@ -20,35 +20,15 @@ "scripts": [ { "body": "true", - "bodyKind": "backtick", "comments": [], "name": "ok_impl" }, { "body": "printf '%s' \"$1\" > \"$2\"", - "bodyKind": "backtick", "comments": [], "name": "save" } ], - "topLevelOrder": [ - { - "index": 0, - "kind": "rule" - }, - { - "index": 0, - "kind": "script" - }, - { - "index": 1, - "kind": "script" - }, - { - "index": 0, - "kind": "workflow" - } - ], "workflows": [ { "comments": [], diff --git a/test-fixtures/golden-ast/expected/imports.json b/test-fixtures/golden-ast/expected/imports.json index b6143de6..ecd705d5 100644 --- a/test-fixtures/golden-ast/expected/imports.json +++ b/test-fixtures/golden-ast/expected/imports.json @@ -9,12 +9,6 @@ ], "rules": [], "scripts": [], - "topLevelOrder": [ - { - "index": 0, - "kind": "workflow" - } - ], "workflows": [ { "comments": [], diff --git a/test-fixtures/golden-ast/expected/log.json b/test-fixtures/golden-ast/expected/log.json index 6e7ead45..a8d99f76 100644 --- a/test-fixtures/golden-ast/expected/log.json +++ b/test-fixtures/golden-ast/expected/log.json @@ -4,12 +4,6 @@ "imports": [], "rules": [], "scripts": [], - "topLevelOrder": [ - { - "index": 0, - "kind": "workflow" - } - ], "workflows": [ { "comments": [], diff --git a/test-fixtures/golden-ast/expected/match-multiline.json b/test-fixtures/golden-ast/expected/match-multiline.json index b8bdc32a..39863b4c 100644 --- a/test-fixtures/golden-ast/expected/match-multiline.json +++ b/test-fixtures/golden-ast/expected/match-multiline.json @@ -4,12 +4,6 @@ "imports": [], "rules": [], "scripts": [], - "topLevelOrder": [ - { - "index": 0, - "kind": "workflow" - } - ], "workflows": [ { "comments": [], diff --git a/test-fixtures/golden-ast/expected/match.json b/test-fixtures/golden-ast/expected/match.json index 7d9ee26e..c64c2651 100644 --- a/test-fixtures/golden-ast/expected/match.json +++ b/test-fixtures/golden-ast/expected/match.json @@ -4,12 +4,6 @@ "imports": [], "rules": [], "scripts": [], - "topLevelOrder": [ - { - "index": 0, - "kind": "workflow" - } - ], "workflows": [ { "comments": [], diff --git a/test-fixtures/golden-ast/expected/params.json b/test-fixtures/golden-ast/expected/params.json index 30b00be5..941179de 100644 --- a/test-fixtures/golden-ast/expected/params.json +++ b/test-fixtures/golden-ast/expected/params.json @@ -22,25 +22,10 @@ "scripts": [ { "body": "echo ok", - "bodyKind": "backtick", "comments": [], "name": "checker" } ], - "topLevelOrder": [ - { - "index": 0, - "kind": "workflow" - }, - { - "index": 0, - "kind": "rule" - }, - { - "index": 0, - "kind": "script" - } - ], "workflows": [ { "comments": [], diff --git a/test-fixtures/golden-ast/expected/prompt-capture.json b/test-fixtures/golden-ast/expected/prompt-capture.json index f853797a..b9a88f9c 100644 --- a/test-fixtures/golden-ast/expected/prompt-capture.json +++ b/test-fixtures/golden-ast/expected/prompt-capture.json @@ -4,12 +4,6 @@ "imports": [], "rules": [], "scripts": [], - "topLevelOrder": [ - { - "index": 0, - "kind": "workflow" - } - ], "workflows": [ { "comments": [], @@ -20,7 +14,6 @@ "name": "answer", "type": "const", "value": { - "bodyKind": "string", "kind": "prompt_capture", "raw": "\"What is your name?\"" } diff --git a/test-fixtures/golden-ast/expected/run-ensure.json b/test-fixtures/golden-ast/expected/run-ensure.json index 0c450c19..f641a2db 100644 --- a/test-fixtures/golden-ast/expected/run-ensure.json +++ b/test-fixtures/golden-ast/expected/run-ensure.json @@ -20,29 +20,10 @@ "scripts": [ { "body": "true", - "bodyKind": "backtick", "comments": [], "name": "validator" } ], - "topLevelOrder": [ - { - "index": 0, - "kind": "rule" - }, - { - "index": 0, - "kind": "script" - }, - { - "index": 0, - "kind": "workflow" - }, - { - "index": 1, - "kind": "workflow" - } - ], "workflows": [ { "comments": [], diff --git a/test-fixtures/golden-ast/expected/script-defs.json b/test-fixtures/golden-ast/expected/script-defs.json index b72757d1..07eb7c9d 100644 --- a/test-fixtures/golden-ast/expected/script-defs.json +++ b/test-fixtures/golden-ast/expected/script-defs.json @@ -6,41 +6,20 @@ "scripts": [ { "body": "echo hello", - "bodyKind": "backtick", "comments": [], "name": "greet" }, { "body": "echo \"line 1\"\necho \"line 2\"", - "bodyKind": "fenced", "comments": [], "name": "multiline" }, { "body": "echo \"Hello ${USER}\"\necho \"${PATH:-/usr/bin}\"", - "bodyKind": "fenced", "comments": [], "name": "with_shell_expansion" } ], - "topLevelOrder": [ - { - "index": 0, - "kind": "script" - }, - { - "index": 1, - "kind": "script" - }, - { - "index": 2, - "kind": "script" - }, - { - "index": 0, - "kind": "workflow" - } - ], "workflows": [ { "comments": [], From a9ae7ff353b07c1b36390d4438903fb256ba267a Mon Sep 17 00:00:00 2001 From: Jakub Dzikowski Date: Sat, 16 May 2026 12:49:26 +0200 Subject: [PATCH 07/14] Refactor: collapse call args into typed Arg[] across AST Replace every call-bearing node's `args: string` + `bareIdentifierArgs?: string[]` pair with a single `args?: Arg[]`, where each Arg is either `{ kind: "literal"; raw }` or `{ kind: "var"; name }`. The parser does the bare-identifier classification once at parse time, and the validator and emitter consume the typed list directly without any downstream re-parse of the raw `args` string. Drops the `validateBareIdentifierArgs` helper; its scope check now lives in the per-step validator that already walks the call. Adds grep and AST-shape regression tests so neither `bareIdentifierArgs` nor an args re-parse can reappear under src/parse or src/transpile. Co-Authored-By: Claude Opus 4.7 (1M context) --- CHANGELOG.md | 1 + QUEUE.md | 36 ---- docs/architecture.md | 4 +- docs/contributing.md | 1 + docs/spec-async-handles.md | 2 +- integration/sample-build/cli-tree.test.ts | 2 +- src/format/emit.ts | 128 +++--------- src/parse/arg-ast-shape.test.ts | 57 ++++++ src/parse/arg-grep.test.ts | 67 ++++++ src/parse/const-rhs.ts | 4 - src/parse/core.ts | 91 +++++---- src/parse/inline-script.ts | 6 +- src/parse/parse-bare-call.test.ts | 5 +- src/parse/parse-const-rhs.test.ts | 2 +- src/parse/parse-core.test.ts | 37 ++-- src/parse/parse-inline-script.test.ts | 8 +- src/parse/parse-return.test.ts | 21 +- src/parse/parse-run-async.test.ts | 9 +- src/parse/parse-send-rhs.test.ts | 2 +- src/parse/parse-steps.test.ts | 2 +- src/parse/send-rhs.ts | 1 - src/parse/steps.ts | 21 +- src/parse/workflow-brace.ts | 15 +- src/runtime/kernel/node-workflow-runtime.ts | 29 +-- src/runtime/kernel/runtime-arg-parser.ts | 4 +- src/transpile/compiler-golden.test.ts | 4 +- src/transpile/validate-string.test.ts | 4 +- src/transpile/validate-string.ts | 5 +- src/transpile/validate.ts | 193 +++++++++--------- src/types.ts | 39 ++-- test-fixtures/compiler-txtar/valid.txt | 2 +- .../golden-ast/expected/brace-if.json | 12 +- 32 files changed, 420 insertions(+), 394 deletions(-) create mode 100644 src/parse/arg-ast-shape.test.ts create mode 100644 src/parse/arg-grep.test.ts diff --git a/CHANGELOG.md b/CHANGELOG.md index ebca52e1..f1a6209d 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,6 @@ # Unreleased +- **Refactor — Collapse `bareIdentifierArgs` into a typed `Arg[]` on every call site:** Every call-bearing AST node used to carry the call arguments twice — `args: string` (the raw source between the parens) and `bareIdentifierArgs?: string[]` (a re-parse of which of those arguments happened to be bare identifiers). The validator had to remember to check both fields and call a hand-rolled `validateBareIdentifierArgs` helper at every site; the emitter re-parsed `args` from scratch because it didn't trust either field on its own. Both fields are gone. The parser now classifies each argument once, at parse time, into a new typed sum `type Arg = { kind: "literal"; raw: string } | { kind: "var"; name: string }` and stores it on every call-bearing node as `args?: Arg[]`. Affected nodes: `run` / `ensure` workflow steps, `run_inline_script` steps, the `managed` sidecar on `return` / `log` / `logerr` (in all four shapes — `run`, `ensure`, `run_inline_script`, `match`), the `run_capture` / `ensure_capture` / `run_inline_script_capture` const RHS variants, and the `run` send RHS. Downstream consumers walk the typed list directly: the validator's per-call check sequence is now arity (`args.length`), shell-redirection rejection on `literal` raws, nested-unmanaged-call rejection on `literal` raws, ref resolution, and `var`-arg resolution against in-scope bindings via a new `validateArgVarRefs` (the standalone `validateBareIdentifierArgs` helper is deleted); the formatter renders each `Arg` directly (`var` → bare name, `literal` → raw) instead of re-tokenizing a `${ident}`-rewritten string; the runtime turns `Arg[]` back into the space-separated argv string via `argsToRuntimeString` in `src/parse/core.ts` (`var` → `${name}`, `literal` → raw) so the existing handle-resolution / interpolation path is unchanged. New tests pin the invariants: `src/parse/arg-ast-shape.test.ts` is a compile-time assertion that `bareIdentifierArgs` does not appear on `WorkflowStepDef` (`ensure`, `run`, `run_inline_script`, `log.managed`, `logerr.managed`, `return.managed` in `run` / `ensure` / `run_inline_script` shapes), `ConstRhs` (`run_capture`, `ensure_capture`, `run_inline_script_capture`), or the `run` `SendRhsDef` variant (AC1); `src/parse/arg-grep.test.ts` walks every non-test `.ts` under `src/parse/` and `src/transpile/` and fails if any production file matches `args.split(",")` or the bare token `bareIdentifierArgs` (AC2), and separately fails if any file under `src/transpile/` references `validateBareIdentifierArgs` (AC3). The golden compiler corpus, `validate-*.test.ts` files, and the golden AST corpus pass byte-for-byte (AC4); `npm run build` passes with zero TypeScript strict-mode errors (AC5). User-visible contracts — CLI behavior, `jaiph format` round-trip, run artifacts, banner, hooks, exit codes, `__JAIPH_EVENT__` streaming, and the full golden corpus — are unchanged. Out of scope: the full `Expr` collapse (next task) and surface syntax. Docs updated in `docs/architecture.md` (extended **AST / Types** bullet documenting the typed `Arg` sum; updated **Validator** and **Formatter** bullets to drop the dual representation), `docs/contributing.md` (new **Call-args AST shape** row in the test-layer table), and `docs/spec-async-handles.md` (replaces the stale `commaArgsToSpaced` reference with `argsToRuntimeString`). Implements `design/2026-05-15-parser-compiler-simplification.md` § Appendix D. - **Refactor — Split source-fidelity data from the semantic AST into a `Trivia` (CST) layer:** Around ten fields whose only consumer was the formatter — `leadingComments` on imports / script imports / channels / `const` decls / `test` blocks, `configLeadingComments`, `trailingTopLevelComments`, `configBodySequence` (both module- and workflow-scoped), `topLevelOrder`, `bareSource` on `return`, the `tripleQuoted` flags on `literal` / `return` / `log` / `logerr` / `fail` / `send` / `const`, and the prompt / script `bodyKind` / `bodyIdentifier` discriminators — are removed from `jaiphModule`, `WorkflowStepDef`, `ConstRhs`, `SendRhsDef`, `WorkflowMetadata`, `ImportDef`, `ScriptImportDef`, `ChannelDef`, `ScriptDef`, and `TestBlockDef`, and re-homed in a new parallel `Trivia` store (`src/parse/trivia.ts`) keyed by AST-node identity (per-node `WeakMap`) plus a small `ModuleTrivia` record for module-level data. The parser exposes `parsejaiphWithTrivia(source, filePath) → { ast, trivia }`; the legacy `parsejaiph(source, filePath)` is now a thin wrapper that drops trivia for callers that don't care (validator, transpiler, runtime, `loadModuleGraph`). The formatter (`emitModule(ast, trivia, opts?)`) is the only consumer of `Trivia`; validator, emitter, transpiler, and runtime never import from `src/parse/trivia.ts`. New tests pin the invariants: `src/parse/trivia-ast-shape.test.ts` is a compile-time assertion (with runtime echo) that none of the listed fields reappear on any semantic AST type (AC1); `src/parse/trivia-grep.test.ts` greps validator and emitter source files and fails if any of them references `Trivia` / `createTrivia` / `NodeTrivia` / `ModuleTrivia` or imports from `parse/trivia` (AC2); `src/format/roundtrip.test.ts` walks every `.jh` under `examples/` and `test-fixtures/golden-ast/fixtures/` and asserts `parse → format → parse → format` converges bit-for-bit (AC3). Golden AST fixtures under `test-fixtures/golden-ast/expected/` are regenerated to drop the moved fields. User-visible contracts (CLI behavior, `jaiph format` round-trip, run artifacts, banner, hooks, exit codes, `__JAIPH_EVENT__` streaming) are unchanged. `npm test` and `npm run build` pass with zero TypeScript strict-mode errors (AC4 / AC5). Out of scope: the `Expr` collapse — this refactor only relocates source-fidelity fields without changing the semantic AST's shape. Docs updated in `docs/architecture.md` (new **Trivia / CST layer** section with anchor `#trivia-cst-layer`, plus updated **Parser**, **AST / Types**, and **Formatter** bullets) and `docs/contributing.md` (new row in the test-layer table). Implements `design/2026-05-15-parser-compiler-simplification.md` § Appendix A. - **Refactor — `ModuleGraph` is the single representation of "all `.jh` modules reachable from an entry point, parsed once":** The previous three traversal strategies for compile-time module discovery (validator re-reading imports through `ValidateContext`, `emitScriptsForModule` re-wrapping the same callbacks with an optional `prep` cache, and `buildScripts` walking the filesystem directly) collapse to one path. `parsejaiph(source, filePath)` is now strictly I/O-pure — it can no longer reach `fs`. The single discovery routine `loadModuleGraph(entry, workspaceRoot?)` (`src/transpile/module-graph.ts`) walks the entry plus its transitive `import` closure and returns `{ entryFile, workspaceRoot?, modules: Map }`; every other compile-time consumer takes the graph and never re-reads `.jh` from disk. `validateReferences(graph)` and `emitScriptsForModuleFromGraph(graph, file, rootDir)` operate entirely in-memory. The `ValidateContext` interface (`resolveImportPath` / `existsSync` / `readFile` / `parse` / `workspaceRoot` callbacks) is deleted from `src/transpile/validate.ts`; the validator consumes the graph and uses `existsSync` only to resolve `import script` paths (non-`.jh` bodies). `CompilePrep` / `prepareCompile` / `writeCompilePrep` / `readCompilePrep` and the optional `prep?` parameter on `emitScriptsForModule` / `buildScripts` are gone; `buildScripts(input, outDir, ws?)` now loads a graph internally and `buildScriptsFromGraph(graph, outDir, rootDir?)` is the entry point for callers that already loaded one. `buildRuntimeGraph` accepts either an entry file path (legacy) or an already-loaded `ModuleGraph` — `RuntimeGraph` is a type alias for `ModuleGraph` (the only "all reachable modules" representation in the codebase). The cross-process cache file moves to `/.jaiph-module-graph.json` (deterministic JSON: entries sorted by absolute path, ASTs included verbatim) via `writeModuleGraph` / `readModuleGraph`, and the internal env var the spawned `node-workflow-runner.js` reads is renamed `JAIPH_MODULE_GRAPH_FILE` (replacing `JAIPH_COMPILE_PREP_FILE`). Scope of the env-var hand-off is unchanged: set only for the default local non-Docker `jaiph run` path; `jaiph run --raw`, `jaiph test`, and Docker launches fall back to `loadModuleGraph` from the source file. User-visible contracts — banner, hooks, run artifacts, `run_summary.jsonl`, `return_value.txt`, exit codes, `__JAIPH_EVENT__` streaming, CLI usage, and the full golden corpus (`compiler-golden.test.ts`, `compiler-edge.acceptance.test.ts`) — are unchanged byte-for-byte. New tests (`src/transpile/module-graph.test.ts`, `src/transpile/pipeline-io-purity.test.ts`) stub `node:fs` to throw on any `.jh` read and run the full pipeline against `test-fixtures/` to pin the I/O-purity invariant; another test instruments `parsejaiph` with a call counter to assert no duplicate parses across `loadModuleGraph` → `validateReferences` → `emit` → `buildRuntimeGraph` for fixtures with transitive imports. `src/transpile/compile-prep.ts` and `compile-prep.test.ts` are removed. Docs updated in `docs/architecture.md`, `docs/cli.md`, and `docs/testing.md`. Implements `design/2026-05-15-parser-compiler-simplification.md` § Refactor 5. - **Performance — `jaiph install` parallelism:** Missing-library clones now run in parallel through a small bounded-concurrency executor (default 4 in flight), replacing the previous sequential `execSync` loop. The user contract is unchanged: warm-path libraries (target directory exists and `--force` is absent) still skip without invoking `git` for both explicit args and restore-from-lock; failed clones still exit non-zero and do not produce a lock entry; restore-from-lock still does not invent new lock entries. The default clone runner now uses `spawn("git", ["clone", "--depth", "1", …])` so multiple clones can overlap network and process latency. `runInstall` is now `async` and exposes injectable `CloneRunner` / `concurrency` options for testing. Tests cover concurrent overlap (peak in-flight ≥ 2), warm-path skipping for explicit args and restore, invalid-remote and unknown-ref failure paths, mixed success/failure lockfile bookkeeping, and the existing corrupt/missing-lockfile behavior. Docs updated in `docs/cli.md` and `docs/libraries.md`. diff --git a/QUEUE.md b/QUEUE.md index a5940a72..911ae667 100644 --- a/QUEUE.md +++ b/QUEUE.md @@ -13,42 +13,6 @@ Process rules: *** -## Collapse `bareIdentifierArgs` into a typed `Arg[]` on every call site #dev-ready - -**Design reference:** `design/2026-05-15-parser-compiler-simplification.md` § Appendix D. - -**Why:** Every call-bearing AST node carries both `args: string` (raw text) and `bareIdentifierArgs: string[]` (a re-parse of which args happened to be bare identifiers). Validator must remember to check both. Emitter does its own re-parse of `args` because it doesn't trust either field alone. The dual representation is also why the validator has a `validateBareIdentifierArgs` helper called by hand at every site. - -**Scope:** - -- Introduce a typed `Arg` sum and replace the `args: string` + `bareIdentifierArgs?: string[]` pair on every call-bearing node: - - ```ts - type Arg = - | { kind: "literal"; raw: string } // "..." / ${var} / etc., as authored - | { kind: "var"; name: string }; // bare identifier reference - - // Call-bearing nodes carry args: Arg[]. No second field. - ``` - -- Parser does the bare-identifier classification once, at parse time. Validator and emitter consume `Arg[]` directly; no re-parse of `args` anywhere downstream. -- Affected nodes (non-exhaustive): every `WorkflowStepDef` variant with a call (`run`, `ensure`, `return.managed`, `log.managed`, `logerr.managed`, `send.rhs`), every `ConstRhs` capture variant. -- `validateBareIdentifierArgs` is deleted; its logic moves into the per-step validator that already walks the call. - -**Acceptance criteria** (each verified by a test): - -1. The field `bareIdentifierArgs` does not appear in any AST type definition under `src/types.ts`. A type-level test fails if it reappears. -2. No production code under `src/parse/` or `src/transpile/` re-parses the `args` string into bare-identifier components. A grep test fails if `args` is split on `,` or scanned char-by-char outside the tokenizer/parser. -3. `validateBareIdentifierArgs` is deleted; `validate.ts` contains no equivalent helper. A grep test fails if it reappears. -4. The full golden corpus passes byte-for-byte: `npm test`, including all `validate-*.test.ts` files and the golden corpus. -5. `npm run build` passes; TypeScript strict-mode errors are zero. - -**Out of scope:** the full `Expr` collapse (next task). Surface syntax. This refactor only changes how call arguments are represented; the call-bearing nodes themselves stay where they are. - -**Dependency:** None hard, but easier after the Trivia split (previous task) because the AST is otherwise stable. - -*** - ## Collapse the AST around a single `Expr` type, eliminating the three "managed call" encodings #dev-ready **Design reference:** `design/2026-05-15-parser-compiler-simplification.md` § Refactor 3. diff --git a/docs/architecture.md b/docs/architecture.md index 7c7c1874..13b8764a 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -41,6 +41,7 @@ All orchestration — local `jaiph run`, `jaiph test`, and **Docker `jaiph run`* - **AST / Types (`src/types.ts`)** - Shared compile-time schema (`jaiphModule`, step defs, test defs, hook payload types). The semantic AST carries **only** what the validator, emitter, transpiler, and runtime need; surface-form data that exists purely to round-trip the formatter (leading comments on imports / channels / `const` / `test` blocks, top-level emit order, `config` body sequence, `"""..."""` flags on `literal` / `return` / `log` / `logerr` / `fail` / `send` / `const`, the `bareSource` of `return `, and prompt / script `bodyKind` discriminators) lives in **`Trivia`** instead — see [Trivia (CST layer)](#trivia-cst-layer). + - **Call arguments are a typed sum.** Every call-bearing node — `run` / `ensure` steps and the `managed` sidecar on `return` / `log` / `logerr`, `run_capture` / `ensure_capture` / `run_inline_script_capture` const RHS, the `run` send RHS, and the `run_inline_script` step — carries `args?: Arg[]` where `Arg = { kind: "literal"; raw: string } | { kind: "var"; name: string }`. The parser classifies each argument once (a bare in-scope-style identifier becomes `var`; everything else — quoted strings, `${…}` interpolations, nested `run …` / `ensure …` calls, inline-script bodies — is stored verbatim as `literal`). There is no separate `args: string` text payload or shadow `bareIdentifierArgs: string[]` field, and no downstream consumer re-parses call arguments: the validator walks the typed list to enforce arity, reject nested unmanaged calls inside literals, and resolve `var` refs against in-scope bindings; the emitter renders by mapping each `Arg` to its source form; the runtime turns `Arg[]` back into a runtime string via `argsToRuntimeString` (`var` → `${name}`, `literal` → raw) so the existing handle-resolution / interpolation path is unchanged. - **Trivia / CST layer (`src/parse/trivia.ts`)** {: #trivia-cst-layer} @@ -49,6 +50,7 @@ All orchestration — local `jaiph run`, `jaiph test`, and **Docker `jaiph run`* - **Validator (`src/transpile/validate.ts`)** - Resolves imports and symbol references; emits deterministic compile-time errors. Import resolution (`resolveImportPath` in `transpile/resolve.ts`) checks relative paths first, then falls back to project-scoped libraries under `/.jaiph/libs/` — the workspace root is threaded through all compilation call sites. Export visibility is enforced by `validateRef` in `validate-ref-resolution.ts`: if an imported module declares any `export`, only exported names are reachable through the import alias. + - Per call site the validator runs five checks against the typed **`Arg[]`** directly — shell-redirection rejection (only `literal` args are scanned), nested-unmanaged-call rejection inside `literal` raws, ref resolution, arity (`args.length` vs declared params), and `var`-arg resolution against in-scope bindings via `validateArgVarRefs`. There is no longer a separate `validateBareIdentifierArgs` helper, and no place re-parses an `args: string` payload by splitting on commas or rescanning quotes. - **Transpiler (`src/transpiler.ts`, `src/transpile/*`)** - **`emitScriptsForModuleFromGraph`** validates one module against the graph and runs **`buildScriptFiles`** — the only compile path for `jaiph run` / `jaiph test` — **persists only atomic `script` files** under `scripts/`. **`buildScripts(input, outDir, ws?)`** is the path-based wrapper used by tests and the directory walk; it loads a `ModuleGraph` and delegates. **`buildScriptsFromGraph(graph, outDir)`** is the graph-based entry point used by `jaiph run` / `jaiph test`, which already loaded the graph. Inline scripts (`` run `body`(args) ``) are also emitted as `scripts/__inline_` with deterministic hash-based names (`inlineScriptName` in `src/inline-script-name.ts`). There is no workflow-level bash emission. @@ -69,7 +71,7 @@ All orchestration — local `jaiph run`, `jaiph test`, and **Docker `jaiph run`* - Prompt execution (`prompt.ts`), streaming parse (`stream-parser.ts`), schema (`schema.ts`), **`mock.ts`** (sequential prompt responses / mock-arm dispatch from test env JSON), **`runtime-mock.ts`** (mock workflow/rule/script **bodies** for `*.test.jh`), **`emit.ts`** (durable **`run_summary.jsonl`** helpers — `appendRunSummaryLine`, `formatUtcTimestamp` — consumed by `RuntimeEventEmitter`), **`workflow-launch.ts`** (spawn contract). **`RuntimeEventEmitter`** (`runtime-event-emitter.ts`) owns live **`__JAIPH_EVENT__`** lines on stderr and coordinates summary writes plus step/prompt sequence counters. Script subprocesses are launched directly from `NodeWorkflowRuntime`. - **Formatter (`src/format/emit.ts`)** - - `jaiph format` rewrites `.jh` / `.test.jh` files into canonical style. `emitModule(ast, trivia, opts?)` reads the semantic AST together with the parallel **`Trivia`** store ([Trivia (CST layer)](#trivia-cst-layer)) to round-trip leading comments, top-level order, `config` body sequence, `"""..."""` and `bareSource` forms, and prompt / script body discriminators. Pure data→text emitter; no side-effects beyond file writes. Round-trip is bit-for-bit on every fixture under `examples/` and `test-fixtures/golden-ast/fixtures/` — pinned by `src/format/roundtrip.test.ts`, which asserts `parse → format → parse → format` converges in one step on every fixture. + - `jaiph format` rewrites `.jh` / `.test.jh` files into canonical style. `emitModule(ast, trivia, opts?)` reads the semantic AST together with the parallel **`Trivia`** store ([Trivia (CST layer)](#trivia-cst-layer)) to round-trip leading comments, top-level order, `config` body sequence, `"""..."""` and `bareSource` forms, and prompt / script body discriminators. Call arguments render straight off the typed `Arg[]` — `var` → bare name, `literal` → raw — so the formatter no longer re-parses any args string or consults a `bareIdentifierArgs` shadow field. Pure data→text emitter; no side-effects beyond file writes. Round-trip is bit-for-bit on every fixture under `examples/` and `test-fixtures/golden-ast/fixtures/` — pinned by `src/format/roundtrip.test.ts`, which asserts `parse → format → parse → format` converges in one step on every fixture. - **Docker runtime helper (`src/runtime/docker.ts`)** - Parses mount specs, resolves Docker config (image, network, timeout), and builds the `docker run` invocation when the CLI enables **Docker sandboxing** for `jaiph run` (environment-driven; there is no `jaiph run --docker` flag — see [Sandboxing](sandboxing.md)). The container runs the same `node-workflow-runner` entry as local execution. The default image is the official `ghcr.io/jaiphlang/jaiph-runtime` GHCR image; every selected image must already contain `jaiph` (no auto-install or derived-image build at runtime). Image preparation (`prepareImage`) runs before the CLI banner: it checks whether the image is local, pulls with `--quiet` if needed (short status lines on stderr instead of Docker’s default pull UI), and verifies that `jaiph` exists in the image. `spawnDockerProcess` does not pull or verify — it receives a pre-resolved image. The spawn call uses `stdio: ["ignore", "pipe", "pipe"]` — stdin is ignored so the Docker CLI does not block on stdin EOF, which would stall event streaming and hang the host CLI after the container exits. diff --git a/docs/contributing.md b/docs/contributing.md index 793d0bea..0bb1a9d8 100644 --- a/docs/contributing.md +++ b/docs/contributing.md @@ -103,6 +103,7 @@ Jaiph uses several test layers. Each layer catches a different class of bug. Use | **Compiler acceptance tests** | `src/transpile/*.acceptance.test.ts` (colocated) | Cross-module compiler behavior: validation errors, resolution, and other cases that need a temp project tree or subprocess | You need a deterministic error string, multi-file `buildScripts`, or behavior that does not fit a tiny golden snippet | | **Compiler golden tests** | `src/transpile/compiler-golden.test.ts` (colocated) | Regressions in the parser, validation messages, and scripts-only extraction (`buildScriptFiles` in `emit-script.ts`) — expectations are inline in the test file | You changed the parser, validator, or script extraction and need to lock an exact error string, extracted script shape, or corpus behavior | | **Trivia / formatter round-trip** | `src/parse/trivia-ast-shape.test.ts`, `src/parse/trivia-grep.test.ts`, `src/format/roundtrip.test.ts` | Source-fidelity invariants: no trivia fields on semantic AST types (compile-time), validator/emitter sources do not reference `Trivia`, and `parse → format → parse → format` is bit-for-bit on every fixture under `examples/` and `test-fixtures/golden-ast/fixtures/` | You changed the parser, formatter, AST types, or anything that touches source-fidelity round-trip (see [Architecture — Trivia (CST layer)](architecture.md#trivia-cst-layer)) | +| **Call-args AST shape** | `src/parse/arg-ast-shape.test.ts`, `src/parse/arg-grep.test.ts` | Pins the typed-`Arg[]` invariant: no `bareIdentifierArgs` field on any call-bearing AST type (compile-time), no `args.split(",")` or `bareIdentifierArgs` text in production `src/parse/` or `src/transpile/` sources, and no `validateBareIdentifierArgs` helper in the validator | You changed how call arguments flow through the parser, validator, or emitter and need to confirm nothing re-introduces the parallel raw-string representation (see [Architecture — AST / Types](architecture.md#core-components)) | | **Compiler tests (txtar)** | `test-fixtures/compiler-txtar/*.txt` | Parse and validate outcomes — success, parse errors, validation errors — using language-agnostic txtar fixtures (hundreds of `===` cases across the four `*.txt` files) | You want a portable test case that can be reused by alternative compiler implementations; the test is a `.jh` input paired with an expected outcome | | **Golden AST tests** | `test-fixtures/golden-ast/fixtures/*.jh` + `test-fixtures/golden-ast/expected/*.json` | Parse tree shape for successful parses — serialized to deterministic JSON with locations stripped (9 fixtures: e.g. imports, brace-if, log, match and match-multiline, params, prompt-capture, run-ensure, script-defs) | You changed the parser and need to verify the AST structure hasn't drifted; txtar tests only check pass/fail, goldens lock in the actual tree shape | | **Integration tests** | `integration/*.test.ts`, `integration/sample-build/*.test.ts` | Process-level integration behavior: signal handling, TTY rendering, run summary structure, sample builds | The test spans multiple modules or requires subprocess/PTY harnesses | diff --git a/docs/spec-async-handles.md b/docs/spec-async-handles.md index 4d260f60..54479f6d 100644 --- a/docs/spec-async-handles.md +++ b/docs/spec-async-handles.md @@ -49,7 +49,7 @@ A handle resolves to the `run` result: workflow **`return`**, or **trimmed scrip ### Reads that force resolution -The runtime scans for `${name}` in the places below. **Call arguments:** at parse time, bare identifiers in a `run` / `ensure` argument list are rewritten to **`${name}`** (`commaArgsToSpaced` in `src/parse/core.ts`), so they go through the same `resolveHandlesInInput` path as explicit interpolation (see [Grammar — Call-site arguments](grammar.md#call-site-arguments) and [Language — `run`](language.md#run--execute-a-workflow-or-script)). +The runtime scans for `${name}` in the places below. **Call arguments:** the parser classifies each argument once into a typed `Arg` (`{ kind: "var"; name }` for bare identifiers, `{ kind: "literal"; raw }` for everything else); when the runtime needs the space-separated argv string, `argsToRuntimeString` in `src/parse/core.ts` renders each `var` as **`${name}`** and emits each `literal` verbatim, so bare-identifier args go through the same `resolveHandlesInInput` path as explicit interpolation (see [Grammar — Call-site arguments](grammar.md#call-site-arguments) and [Language — `run`](language.md#run--execute-a-workflow-or-script)). | Access pattern | Example | Forces resolution? | | --- | --- | --- | diff --git a/integration/sample-build/cli-tree.test.ts b/integration/sample-build/cli-tree.test.ts index bafc96df..30f457e4 100644 --- a/integration/sample-build/cli-tree.test.ts +++ b/integration/sample-build/cli-tree.test.ts @@ -172,7 +172,7 @@ test("jaiph run tree shows workflow params inline when run has key=value args", [ 'import "sub.jh" as sub', "workflow default() {", - ' run sub.default(path="docs/cli.md" mode="strict")', + ' run sub.default(path="docs/cli.md", mode="strict")', "}", "", ].join("\n"), diff --git a/src/format/emit.ts b/src/format/emit.ts index 9ed3827c..66175e3a 100644 --- a/src/format/emit.ts +++ b/src/format/emit.ts @@ -1,4 +1,5 @@ import type { + Arg, jaiphModule, WorkflowStepDef, ConstRhs, @@ -13,7 +14,6 @@ import type { WorkflowMetadata, TopLevelEmitOrder, } from "../types"; -import { parseCallRef } from "../parse/core"; import { createTrivia, type NodeTrivia, type Trivia } from "../parse/trivia"; export interface EmitOptions { @@ -359,99 +359,25 @@ function emitSteps(steps: WorkflowStepDef[], pad: string, currentIndent: string, return lines; } -/** Try to parse `` `body`(args) `` from the start of a string. Returns consumed length or null. */ -function parseInlineScriptArg(s: string): { body: string; innerArgs: string; consumed: number } | null { - if (!s.startsWith("`")) return null; - const closeIdx = s.indexOf("`", 1); - if (closeIdx === -1) return null; - const body = s.slice(1, closeIdx); - const afterClose = s.slice(closeIdx + 1); - if (!afterClose.startsWith("(")) return null; - let depth = 1; - let j = 1; - let inQuote: string | null = null; - while (j < afterClose.length && depth > 0) { - const ch = afterClose[j]; - if (inQuote) { - if (ch === inQuote && afterClose[j - 1] !== "\\") inQuote = null; - } else { - if (ch === '"' || ch === "'") inQuote = ch; - else if (ch === "(") depth++; - else if (ch === ")") depth--; - } - j++; - } - if (depth !== 0) return null; - const innerArgs = afterClose.slice(1, j - 1).trim(); - return { body, innerArgs, consumed: closeIdx + 1 + j }; -} - -/** Convert space-separated args back to comma-separated format with bare identifiers. */ -function formatArgs(args: string, bareIdentifierArgs?: string[]): string { - const bare = new Set(bareIdentifierArgs ?? []); - const tokens: string[] = []; - let i = 0; - while (i < args.length) { - while (i < args.length && (args[i] === " " || args[i] === "\t")) i++; - if (i >= args.length) break; - const tail = args.slice(i); - const keyword = tail.startsWith("run ") - ? "run" - : tail.startsWith("ensure ") - ? "ensure" - : null; - if (keyword) { - const afterKeyword = args.slice(i + keyword.length).trimStart(); - const skipped = args.slice(i + keyword.length).length - afterKeyword.length; - const call = parseCallRef(afterKeyword); - if (call && (call.rest.length === 0 || /^\s/.test(call.rest))) { - const consumed = afterKeyword.length - call.rest.length; - tokens.push(`${keyword} ${call.ref}(${formatArgs(call.args ?? "", call.bareIdentifierArgs)})`); - i += keyword.length + skipped + consumed; - continue; - } - // Try inline script form: run `body`(args) - if (keyword === "run") { - const inlineResult = parseInlineScriptArg(afterKeyword); - if (inlineResult) { - const formattedInner = inlineResult.innerArgs ? formatArgs(inlineResult.innerArgs) : ""; - tokens.push(`run \`${inlineResult.body}\`(${formattedInner})`); - i += keyword.length + skipped + inlineResult.consumed; - continue; - } - } - } - if (args[i] === '"') { - let j = i + 1; - while (j < args.length && !(args[j] === '"' && args[j - 1] !== "\\")) j++; - tokens.push(args.slice(i, j + 1)); - i = j + 1; - } else { - let j = i; - while (j < args.length && args[j] !== " " && args[j] !== "\t") j++; - const token = args.slice(i, j); - const m = token.match(/^\$\{([a-zA-Z_][a-zA-Z0-9_]*)\}$/); - if (m && bare.has(m[1])) { - tokens.push(m[1]); - } else { - tokens.push(token); - } - i = j; - } - } - return tokens.join(", "); +/** + * Render `Arg[]` back as comma-separated source form. Each `var` becomes the bare name + * and each `literal` is emitted as authored (already in source form, including nested + * `run …` / `ensure …` calls and inline-script bodies). + */ +function formatArgs(args: Arg[] | undefined): string { + if (!args || args.length === 0) return ""; + return args.map((a) => (a.kind === "var" ? a.name : a.raw)).join(", "); } /** Emit inline script form: `prefix \`body\`(args)` or fenced block. */ function emitInlineScriptLines( prefix: string, body: string, - lang?: string, - args?: string, - bareIdentifierArgs?: string[], + lang: string | undefined, + args: Arg[] | undefined, ci?: string, ): string[] { - const argsStr = formatArgs(args ?? "", bareIdentifierArgs); + const argsStr = formatArgs(args); if (lang || body.includes("\n")) { const langTag = lang ?? ""; const result = [`${prefix} \`\`\`${langTag}`]; @@ -464,9 +390,9 @@ function emitInlineScriptLines( return [`${prefix} \`${body}\`(${argsStr})`]; } -function emitRef(ref: { value: string }, args?: string, bareIdentifierArgs?: string[]): string { +function emitRef(ref: { value: string }, args: Arg[] | undefined): string { if (args !== undefined) { - return `${ref.value}(${formatArgs(args, bareIdentifierArgs)})`; + return `${ref.value}(${formatArgs(args)})`; } return `${ref.value}()`; } @@ -516,7 +442,7 @@ function emitStep(step: WorkflowStepDef, pad: string, currentIndent: string, tri } case "ensure": { - const ref = emitRef(step.ref, step.args, step.bareIdentifierArgs); + const ref = emitRef(step.ref, step.args); const capture = step.captureName ? `${step.captureName} = ` : ""; if (step.catch) { const b = step.catch.bindings; @@ -537,7 +463,7 @@ function emitStep(step: WorkflowStepDef, pad: string, currentIndent: string, tri } case "run": { - const ref = emitRef(step.workflow, step.args, step.bareIdentifierArgs); + const ref = emitRef(step.workflow, step.args); const capture = step.captureName ? `${step.captureName} = ` : ""; const asyncPrefix = step.async ? "async " : ""; if (step.recover) { @@ -572,7 +498,7 @@ function emitStep(step: WorkflowStepDef, pad: string, currentIndent: string, tri case "run_inline_script": { const capture = step.captureName ? `${step.captureName} = ` : ""; - const argsStr = formatArgs(step.args ?? "", step.bareIdentifierArgs); + const argsStr = formatArgs(step.args); if (step.lang || step.body.includes("\n")) { const langTag = step.lang ?? ""; lines.push(`${ci}${capture}run \`\`\`${langTag}`); @@ -618,7 +544,7 @@ function emitStep(step: WorkflowStepDef, pad: string, currentIndent: string, tri for (const bl of step.value.body.split("\n")) { lines.push(bl); } - const argsStr = formatArgs(step.value.args ?? "", step.value.bareIdentifierArgs); + const argsStr = formatArgs(step.value.args); lines.push(`${ci}\`\`\`(${argsStr})`); } // Handle multi-line triple-quoted prompt capture body @@ -666,7 +592,7 @@ function emitStep(step: WorkflowStepDef, pad: string, currentIndent: string, tri case "log": if (step.managed?.kind === "run_inline_script") { - lines.push(...emitInlineScriptLines(`${ci}log run`, step.managed.body, step.managed.lang, step.managed.args, step.managed.bareIdentifierArgs, ci)); + lines.push(...emitInlineScriptLines(`${ci}log run`, step.managed.body, step.managed.lang, step.managed.args, ci)); } else if (stepTrivia.tripleQuoted) { const inner = stepTrivia.rawBody ?? step.message; lines.push(`${ci}log """`); @@ -681,7 +607,7 @@ function emitStep(step: WorkflowStepDef, pad: string, currentIndent: string, tri case "logerr": if (step.managed?.kind === "run_inline_script") { - lines.push(...emitInlineScriptLines(`${ci}logerr run`, step.managed.body, step.managed.lang, step.managed.args, step.managed.bareIdentifierArgs, ci)); + lines.push(...emitInlineScriptLines(`${ci}logerr run`, step.managed.body, step.managed.lang, step.managed.args, ci)); } else if (stepTrivia.tripleQuoted) { const inner = stepTrivia.rawBody ?? step.message; lines.push(`${ci}logerr """`); @@ -697,9 +623,9 @@ function emitStep(step: WorkflowStepDef, pad: string, currentIndent: string, tri case "return": { if (step.managed) { if (step.managed.kind === "run") { - lines.push(`${ci}return run ${emitRef(step.managed.ref, step.managed.args, step.managed.bareIdentifierArgs)}`); + lines.push(`${ci}return run ${emitRef(step.managed.ref, step.managed.args)}`); } else if (step.managed.kind === "ensure") { - lines.push(`${ci}return ensure ${emitRef(step.managed.ref, step.managed.args, step.managed.bareIdentifierArgs)}`); + lines.push(`${ci}return ensure ${emitRef(step.managed.ref, step.managed.args)}`); } else if (step.managed.kind === "match") { lines.push(`${ci}return match ${step.managed.match.subject} {`); for (const arm of step.managed.match.arms) { @@ -707,7 +633,7 @@ function emitStep(step: WorkflowStepDef, pad: string, currentIndent: string, tri } lines.push(`${ci}}`); } else if (step.managed.kind === "run_inline_script") { - lines.push(...emitInlineScriptLines(`${ci}return run`, step.managed.body, step.managed.lang, step.managed.args, step.managed.bareIdentifierArgs, ci)); + lines.push(...emitInlineScriptLines(`${ci}return run`, step.managed.body, step.managed.lang, step.managed.args, ci)); } } else if (stepTrivia.bareSource) { lines.push(`${ci}return ${stepTrivia.bareSource}`); @@ -781,10 +707,10 @@ function emitConstStep(name: string, value: ConstRhs, valueTrivia: NodeTrivia): return `const ${name} = ${value.bashRhs}`; case "run_capture": { const asyncMod = value.async ? "async " : ""; - return `const ${name} = run ${asyncMod}${emitRef(value.ref, value.args, value.bareIdentifierArgs)}`; + return `const ${name} = run ${asyncMod}${emitRef(value.ref, value.args)}`; } case "ensure_capture": - return `const ${name} = ensure ${emitRef(value.ref, value.args, value.bareIdentifierArgs)}`; + return `const ${name} = ensure ${emitRef(value.ref, value.args)}`; case "prompt_capture": { const returns = value.returns ? ` returns "${value.returns}"` : ""; if (valueTrivia.bodyKind === "identifier" && valueTrivia.bodyIdentifier) { @@ -801,7 +727,7 @@ function emitConstStep(name: string, value: ConstRhs, valueTrivia: NodeTrivia): return `const ${name} = match ${value.match.subject} {`; } case "run_inline_script_capture": { - const argsStr = formatArgs(value.args ?? "", value.bareIdentifierArgs); + const argsStr = formatArgs(value.args); if (value.lang || value.body.includes("\n")) { const langTag = value.lang ?? ""; return `const ${name} = run \`\`\`${langTag}`; @@ -818,7 +744,7 @@ function emitSendRhs(rhs: SendRhsDef): string { case "var": return rhs.bash; case "run": - return `run ${emitRef(rhs.ref, rhs.args, rhs.bareIdentifierArgs)}`; + return `run ${emitRef(rhs.ref, rhs.args)}`; case "bare_ref": return rhs.ref.value; case "shell": diff --git a/src/parse/arg-ast-shape.test.ts b/src/parse/arg-ast-shape.test.ts new file mode 100644 index 00000000..77103ba6 --- /dev/null +++ b/src/parse/arg-ast-shape.test.ts @@ -0,0 +1,57 @@ +import test from "node:test"; +import assert from "node:assert/strict"; +import type { ConstRhs, SendRhsDef, WorkflowStepDef } from "../types"; + +/** + * AC1: `bareIdentifierArgs` must not appear on any call-bearing AST node. + * + * Each helper below probes a specific variant where the field used to live; if + * it is re-added, `HasField` widens to `true`, the type-level assertion fails, + * and TypeScript breaks compilation. + */ +type HasField = T extends Record ? true : false; + +type EnsureStep = Extract; +type RunStep = Extract; +type RunInlineScriptStep = Extract; +type LogStep = Extract; +type LogerrStep = Extract; +type ReturnStep = Extract; +type LogManaged = NonNullable; +type LogerrManaged = NonNullable; +type ReturnManaged = NonNullable; +type ReturnManagedRun = Extract; +type ReturnManagedEnsure = Extract; +type ReturnManagedInline = Extract; +type RunCapture = Extract; +type EnsureCapture = Extract; +type InlineScriptCapture = Extract; +type SendRun = Extract; + +const _ensureNoBare: HasField = false; +const _runNoBare: HasField = false; +const _inlineNoBare: HasField = false; +const _logManagedNoBare: HasField = false; +const _logerrManagedNoBare: HasField = false; +const _returnManagedRunNoBare: HasField = false; +const _returnManagedEnsureNoBare: HasField = false; +const _returnManagedInlineNoBare: HasField = false; +const _runCaptureNoBare: HasField = false; +const _ensureCaptureNoBare: HasField = false; +const _inlineCaptureNoBare: HasField = false; +const _sendRunNoBare: HasField = false; + +test("AC1: bareIdentifierArgs does not appear on any call-bearing AST type", () => { + assert.equal(_ensureNoBare, false); + assert.equal(_runNoBare, false); + assert.equal(_inlineNoBare, false); + assert.equal(_logManagedNoBare, false); + assert.equal(_logerrManagedNoBare, false); + assert.equal(_returnManagedRunNoBare, false); + assert.equal(_returnManagedEnsureNoBare, false); + assert.equal(_returnManagedInlineNoBare, false); + assert.equal(_runCaptureNoBare, false); + assert.equal(_ensureCaptureNoBare, false); + assert.equal(_inlineCaptureNoBare, false); + assert.equal(_sendRunNoBare, false); +}); diff --git a/src/parse/arg-grep.test.ts b/src/parse/arg-grep.test.ts new file mode 100644 index 00000000..ceb8a372 --- /dev/null +++ b/src/parse/arg-grep.test.ts @@ -0,0 +1,67 @@ +import test from "node:test"; +import assert from "node:assert/strict"; +import { readdirSync, readFileSync, statSync } from "node:fs"; +import { resolve, join } from "node:path"; + +// Tests run from dist/src/parse/, so repo root is three levels up. +const repoRoot = resolve(__dirname, "../../.."); + +function listTsFiles(dir: string): string[] { + const out: string[] = []; + const walk = (d: string): void => { + for (const name of readdirSync(d)) { + const abs = join(d, name); + const st = statSync(abs); + if (st.isDirectory()) { + walk(abs); + } else if (name.endsWith(".ts") && !name.endsWith(".test.ts") && !name.endsWith(".d.ts")) { + out.push(abs); + } + } + }; + walk(dir); + return out; +} + +const parseSources = listTsFiles(join(repoRoot, "src/parse")); +const transpileSources = listTsFiles(join(repoRoot, "src/transpile")); + +/** + * AC2: no production code under src/parse/ or src/transpile/ may re-parse a + * call's `args` payload into bare-identifier components. The tokenizer / parser + * builds `Arg[]` once via `commaArgsToArgList` in `src/parse/core.ts`; + * downstream consumers walk that typed list directly — no `args.split(",")`, + * no `bareIdentifierArgs` shadow field, no ad-hoc rescans. + */ +test("AC2: no args re-parse into bare-identifier components outside the tokenizer", () => { + const forbidden: RegExp[] = [ + /\bargs\.split\s*\(\s*[`'"],/, + /\bbareIdentifierArgs\b/, + ]; + for (const file of [...parseSources, ...transpileSources]) { + const content = readFileSync(file, "utf8"); + for (const re of forbidden) { + assert.equal( + re.test(content), + false, + `${file} matches forbidden args re-parse pattern ${re}`, + ); + } + } +}); + +/** + * AC3: `validateBareIdentifierArgs` is deleted. The bare-arg check folds into + * the per-step validator that already walks the call: each `Arg` of kind + * `"var"` is resolved against in-scope bindings inline. + */ +test("AC3: validateBareIdentifierArgs does not reappear in src/transpile/", () => { + for (const file of transpileSources) { + const content = readFileSync(file, "utf8"); + assert.equal( + /\bvalidateBareIdentifierArgs\b/.test(content), + false, + `${file} references validateBareIdentifierArgs — it must stay deleted`, + ); + } +}); diff --git a/src/parse/const-rhs.ts b/src/parse/const-rhs.ts index 20ca1a4f..14e97d97 100644 --- a/src/parse/const-rhs.ts +++ b/src/parse/const-rhs.ts @@ -107,7 +107,6 @@ export function parseConstRhs( return { value: { kind: "run_capture", ref, args: call.args, - ...(call.bareIdentifierArgs ? { bareIdentifierArgs: call.bareIdentifierArgs } : {}), async: true, }, nextLineIdx: lineIdx, @@ -121,7 +120,6 @@ export function parseConstRhs( body: result.body, ...(result.lang ? { lang: result.lang } : {}), args: result.args, - ...(result.bareIdentifierArgs ? { bareIdentifierArgs: result.bareIdentifierArgs } : {}), }, nextLineIdx: result.nextLineIdx - 1, }; @@ -138,7 +136,6 @@ export function parseConstRhs( return { value: { kind: "run_capture", ref, args: call.args, - ...(call.bareIdentifierArgs ? { bareIdentifierArgs: call.bareIdentifierArgs } : {}), }, nextLineIdx: lineIdx, }; @@ -156,7 +153,6 @@ export function parseConstRhs( return { value: { kind: "ensure_capture", ref, args: call.args, - ...(call.bareIdentifierArgs ? { bareIdentifierArgs: call.bareIdentifierArgs } : {}), }, nextLineIdx: lineIdx, }; diff --git a/src/parse/core.ts b/src/parse/core.ts index 0cac7c10..5f405b6e 100644 --- a/src/parse/core.ts +++ b/src/parse/core.ts @@ -1,4 +1,5 @@ import { jaiphError } from "../errors"; +import type { Arg } from "../types"; export function fail(filePath: string, message: string, lineNo: number, col = 1): never { throw jaiphError(filePath, lineNo, col, "E_PARSE", message); @@ -162,13 +163,17 @@ export function parseParamList(filePath: string, content: string, lineNo: number } /** - * Convert comma-separated call arguments to space-separated form for runtime. - * Respects quoted strings so commas inside quotes are preserved. - * Bare identifiers (valid names, not keywords) are converted to ${name} form. + * Split a comma-separated call argument list into typed `Arg[]`. + * + * Each top-level comma-separated segment is classified: + * - bare identifier (and not a Jaiph keyword): `{ kind: "var", name }` + * - anything else (quoted string, ${…}, nested `run …` / `ensure …` call, inline-script + * form, etc.): `{ kind: "literal", raw }`, stored as authored. + * + * Commas inside quoted strings are preserved (the scanner tracks quote state). */ -function commaArgsToSpaced(content: string): { spaced: string; bareIdentifiers: string[] } { - const parts: string[] = []; - const bareIdentifiers: string[] = []; +export function commaArgsToArgList(content: string): Arg[] { + const out: Arg[] = []; let current = ""; let inQuote: string | null = null; for (let j = 0; j < content.length; j++) { @@ -177,39 +182,54 @@ function commaArgsToSpaced(content: string): { spaced: string; bareIdentifiers: current += ch; if (ch === inQuote && content[j - 1] !== "\\") inQuote = null; } else if (ch === ",") { - const trimmed = current.trim(); - if (trimmed) { - if (isBareIdentifier(trimmed)) { - bareIdentifiers.push(trimmed); - parts.push(`\${${trimmed}}`); - } else { - parts.push(trimmed); - } - } + pushArg(out, current); current = ""; } else { if (ch === '"' || ch === "'") inQuote = ch; current += ch; } } - const trimmed = current.trim(); - if (trimmed) { - if (isBareIdentifier(trimmed)) { - bareIdentifiers.push(trimmed); - parts.push(`\${${trimmed}}`); - } else { - parts.push(trimmed); - } - } - return { spaced: parts.filter((p) => p).join(" "), bareIdentifiers }; + pushArg(out, current); + return out; +} + +function pushArg(out: Arg[], segment: string): void { + const trimmed = segment.trim(); + if (!trimmed) return; + out.push(isBareIdentifier(trimmed) ? { kind: "var", name: trimmed } : { kind: "literal", raw: trimmed }); +} + +/** + * Convert `Arg[]` back to the space-separated string the runtime consumes: + * - `var` → `${name}` (so runtime interpolation expands it against in-scope vars) + * - `literal` → raw as authored + * + * Empty / undefined → empty string. + */ +export function argsToRuntimeString(args: Arg[] | undefined): string { + if (!args || args.length === 0) return ""; + return args.map((a) => (a.kind === "var" ? `\${${a.name}}` : a.raw)).join(" "); +} + +/** + * Convert `Arg[]` back to comma-separated source form: + * - `var` → name (bare) + * - `literal` → raw as authored + * + * Used to populate the placeholder `value` string on managed + * `return run …` / `return ensure …` steps. Empty / undefined → empty string. + */ +export function argsToSourceForm(args: Arg[] | undefined): string { + if (!args || args.length === 0) return ""; + return args.map((a) => (a.kind === "var" ? a.name : a.raw)).join(", "); } /** * Parse a call expression `ref(args)` or `ref()` from a string. - * Returns the ref, optional args (space-separated), bare identifier names, and the rest of the string after `)`. + * Returns the ref, optional typed `Arg[]`, and the rest of the string after `)`. * Returns null if the string doesn't start with a valid call expression. */ -export function parseCallRef(s: string): { ref: string; args?: string; bareIdentifierArgs?: string[]; rest: string } | null { +export function parseCallRef(s: string): { ref: string; args?: Arg[]; rest: string } | null { const t = s.trimStart(); // Parenthesized form: ref(args) or ref() const refMatch = t.match(/^([A-Za-z_][A-Za-z0-9_]*(?:\.[A-Za-z_][A-Za-z0-9_]*)?)\(/); @@ -234,13 +254,8 @@ export function parseCallRef(s: string): { ref: string; args?: string; bareIdent const argsContent = t.slice(parenStart, i - 1).trim(); const rest = t.slice(i); if (!argsContent) return { ref, rest }; - const { spaced, bareIdentifiers } = commaArgsToSpaced(argsContent); - return { - ref, - args: spaced || undefined, - ...(bareIdentifiers.length > 0 ? { bareIdentifierArgs: bareIdentifiers } : {}), - rest, - }; + const args = commaArgsToArgList(argsContent); + return { ref, ...(args.length > 0 ? { args } : {}), rest }; } // Bare identifier form (no parens) is no longer allowed — require parentheses. return null; @@ -248,14 +263,14 @@ export function parseCallRef(s: string): { ref: string; args?: string; bareIdent /** * Parse a parenthesized argument list `(args)` or `()` at the start of a string. - * Returns args (space-separated), bare identifier names, and remaining text after `)`. - * Returns null if the string doesn't start with `(`. + * Returns typed `Arg[]` and remaining text after `)`. Returns null if the string + * doesn't start with `(`. */ -export function parseParenArgs(s: string): { args?: string; bareIdentifierArgs?: string[]; rest: string } | null { +export function parseParenArgs(s: string): { args?: Arg[]; rest: string } | null { if (!s.trimStart().startsWith("(")) return null; const result = parseCallRef(`__anon${s.trimStart()}`); if (!result) return null; - return { args: result.args, bareIdentifierArgs: result.bareIdentifierArgs, rest: result.rest }; + return { args: result.args, rest: result.rest }; } /** diff --git a/src/parse/inline-script.ts b/src/parse/inline-script.ts index c7aeebd0..aacbee67 100644 --- a/src/parse/inline-script.ts +++ b/src/parse/inline-script.ts @@ -1,12 +1,12 @@ import { fail, parseParenArgs, parseSingleBacktickBody } from "./core"; import { parseFencedBlock } from "./fence"; import { validateScriptBodyNoInterpolation } from "./scripts"; +import type { Arg } from "../types"; export interface InlineScriptParsed { body: string; lang?: string; - args?: string; - bareIdentifierArgs?: string[]; + args?: Arg[]; nextLineIdx: number; } @@ -62,7 +62,6 @@ export function parseAnonymousInlineScript( body, ...(lang ? { lang } : {}), args: argsResult.args, - ...(argsResult.bareIdentifierArgs ? { bareIdentifierArgs: argsResult.bareIdentifierArgs } : {}), nextLineIdx: nextIdx, }; } @@ -93,7 +92,6 @@ export function parseAnonymousInlineScript( return { body, args: argsResult.args, - ...(argsResult.bareIdentifierArgs ? { bareIdentifierArgs: argsResult.bareIdentifierArgs } : {}), nextLineIdx: lineIdx + 1, }; } diff --git a/src/parse/parse-bare-call.test.ts b/src/parse/parse-bare-call.test.ts index ffe4ca6c..75e89ee6 100644 --- a/src/parse/parse-bare-call.test.ts +++ b/src/parse/parse-bare-call.test.ts @@ -27,7 +27,10 @@ test("run with args and parens still works", () => { assert.equal(step.type, "run"); if (step.type === "run") { assert.equal(step.workflow.value, "deploy"); - assert.equal(step.args, '"prod" "v1"'); + assert.deepEqual(step.args, [ + { kind: "literal", raw: '"prod"' }, + { kind: "literal", raw: '"v1"' }, + ]); } }); diff --git a/src/parse/parse-const-rhs.test.ts b/src/parse/parse-const-rhs.test.ts index 333f42ab..411a2269 100644 --- a/src/parse/parse-const-rhs.test.ts +++ b/src/parse/parse-const-rhs.test.ts @@ -129,7 +129,7 @@ test("parseConstRhs: parses run capture with args", () => { assert.equal(result.value.kind, "run_capture"); if (result.value.kind === "run_capture") { assert.equal(result.value.ref.value, "my_script"); - assert.equal(result.value.args, '"arg"'); + assert.deepEqual(result.value.args, [{ kind: "literal", raw: '"arg"' }]); } }); diff --git a/src/parse/parse-core.test.ts b/src/parse/parse-core.test.ts index 6a3318ee..020353c2 100644 --- a/src/parse/parse-core.test.ts +++ b/src/parse/parse-core.test.ts @@ -198,62 +198,61 @@ test("isBareIdentifier: rejects string with spaces", () => { assert.equal(isBareIdentifier("has space"), false); }); -// === parseCallRef: bare identifiers === +// === parseCallRef: typed Arg[] classification === -test("parseCallRef: bare identifier arg is converted to interpolation form", () => { +test("parseCallRef: bare identifier becomes var arg", () => { const result = parseCallRef("foo(task)"); assert.ok(result); assert.equal(result.ref, "foo"); - assert.equal(result.args, "${task}"); - assert.deepEqual(result.bareIdentifierArgs, ["task"]); + assert.deepEqual(result.args, [{ kind: "var", name: "task" }]); }); test("parseCallRef: bare identifier mixed with quoted arg", () => { const result = parseCallRef('foo(task, "hello")'); assert.ok(result); assert.equal(result.ref, "foo"); - assert.equal(result.args, '${task} "hello"'); - assert.deepEqual(result.bareIdentifierArgs, ["task"]); + assert.deepEqual(result.args, [ + { kind: "var", name: "task" }, + { kind: "literal", raw: '"hello"' }, + ]); }); test("parseCallRef: multiple bare identifiers", () => { const result = parseCallRef("foo(task, branch_name)"); assert.ok(result); assert.equal(result.ref, "foo"); - assert.equal(result.args, "${task} ${branch_name}"); - assert.deepEqual(result.bareIdentifierArgs, ["task", "branch_name"]); + assert.deepEqual(result.args, [ + { kind: "var", name: "task" }, + { kind: "var", name: "branch_name" }, + ]); }); -test("parseCallRef: keyword arg is not treated as bare identifier", () => { +test("parseCallRef: keyword arg is stored as literal (not var)", () => { const result = parseCallRef("foo(run)"); assert.ok(result); assert.equal(result.ref, "foo"); - assert.equal(result.args, "run"); - assert.equal(result.bareIdentifierArgs, undefined); + assert.deepEqual(result.args, [{ kind: "literal", raw: "run" }]); }); -test("parseCallRef: quoted string arg is not treated as bare identifier", () => { +test("parseCallRef: quoted string arg is stored as literal", () => { const result = parseCallRef('foo("task")'); assert.ok(result); assert.equal(result.ref, "foo"); - assert.equal(result.args, '"task"'); - assert.equal(result.bareIdentifierArgs, undefined); + assert.deepEqual(result.args, [{ kind: "literal", raw: '"task"' }]); }); -test("parseCallRef: ${var} arg is not treated as bare identifier", () => { +test("parseCallRef: ${var} interpolation arg is stored as literal", () => { const result = parseCallRef("foo(${task})"); assert.ok(result); assert.equal(result.ref, "foo"); - assert.equal(result.args, "${task}"); - assert.equal(result.bareIdentifierArgs, undefined); + assert.deepEqual(result.args, [{ kind: "literal", raw: "${task}" }]); }); -test("parseCallRef: no args returns no bareIdentifierArgs", () => { +test("parseCallRef: no args returns undefined args", () => { const result = parseCallRef("foo()"); assert.ok(result); assert.equal(result.ref, "foo"); assert.equal(result.args, undefined); - assert.equal(result.bareIdentifierArgs, undefined); }); // === parseCallRef: bare identifier (no parens) — now returns null === diff --git a/src/parse/parse-inline-script.test.ts b/src/parse/parse-inline-script.test.ts index 8fae049f..f6308c5b 100644 --- a/src/parse/parse-inline-script.test.ts +++ b/src/parse/parse-inline-script.test.ts @@ -31,7 +31,10 @@ workflow default() { assert.equal(step.type, "run_inline_script"); if (step.type === "run_inline_script") { assert.equal(step.body, "echo $1"); - assert.equal(step.args, '"arg1" "arg2"'); + assert.deepEqual(step.args, [ + { kind: "literal", raw: '"arg1"' }, + { kind: "literal", raw: '"arg2"' }, + ]); } }); @@ -107,8 +110,7 @@ test("parser: rule body supports multiline fenced run ```", () => { assert.equal(step.type, "run_inline_script"); if (step.type === "run_inline_script") { assert.ok(step.body.includes('if [ -z "$1" ]')); - assert.equal(step.args, "${name}"); - assert.deepEqual(step.bareIdentifierArgs, ["name"]); + assert.deepEqual(step.args, [{ kind: "var", name: "name" }]); } }); diff --git a/src/parse/parse-return.test.ts b/src/parse/parse-return.test.ts index 3478a418..6344edf5 100644 --- a/src/parse/parse-return.test.ts +++ b/src/parse/parse-return.test.ts @@ -28,8 +28,13 @@ test("return run parses managed run call with args", () => { if (step.type === "return") { assert.ok(step.managed); assert.equal(step.managed!.kind, "run"); - assert.equal(step.managed!.ref.value, "helper"); - assert.equal(step.managed!.args, '"a" "b"'); + if (step.managed!.kind === "run") { + assert.equal(step.managed!.ref.value, "helper"); + assert.deepEqual(step.managed!.args, [ + { kind: "literal", raw: '"a"' }, + { kind: "literal", raw: '"b"' }, + ]); + } } }); @@ -73,7 +78,9 @@ test("return ensure parses managed ensure call with args", () => { if (step.type === "return") { assert.ok(step.managed); assert.equal(step.managed!.kind, "ensure"); - assert.equal(step.managed!.args, '"x"'); + if (step.managed!.kind === "ensure") { + assert.deepEqual(step.managed!.args, [{ kind: "literal", raw: '"x"' }]); + } } }); @@ -163,7 +170,7 @@ test("return run inline script with args", () => { assert.equal(step.managed!.kind, "run_inline_script"); if (step.managed!.kind === "run_inline_script") { assert.equal(step.managed!.body, "echo $1"); - assert.equal(step.managed!.args, '"x"'); + assert.deepEqual(step.managed!.args, [{ kind: "literal", raw: '"x"' }]); } } }); @@ -200,8 +207,10 @@ test("log run inline script with args", () => { if (step.type === "log") { assert.ok(step.managed); assert.equal(step.managed!.kind, "run_inline_script"); - assert.equal(step.managed!.body, "echo $1"); - assert.equal(step.managed!.args, '"x"'); + if (step.managed!.kind === "run_inline_script") { + assert.equal(step.managed!.body, "echo $1"); + assert.deepEqual(step.managed!.args, [{ kind: "literal", raw: '"x"' }]); + } } }); diff --git a/src/parse/parse-run-async.test.ts b/src/parse/parse-run-async.test.ts index 7727ae46..c6540445 100644 --- a/src/parse/parse-run-async.test.ts +++ b/src/parse/parse-run-async.test.ts @@ -20,7 +20,7 @@ test("parse: run async produces run step with async flag", () => { test("parse: run async with args", () => { const src = [ "workflow default() {", - ' run async other_wf("hello" "$x")', + ' run async other_wf("hello", "$x")', "}", ].join("\n"); const mod = parsejaiph(src, "test.jh"); @@ -28,7 +28,10 @@ test("parse: run async with args", () => { assert.equal(step.type, "run"); if (step.type === "run") { assert.equal(step.workflow.value, "other_wf"); - assert.equal(step.args, '"hello" "$x"'); + assert.deepEqual(step.args, [ + { kind: "literal", raw: '"hello"' }, + { kind: "literal", raw: '"$x"' }, + ]); assert.equal(step.async, true); } }); @@ -106,7 +109,7 @@ test("parse: const capture + run async with args", () => { assert.equal(step.value.kind, "run_capture"); if (step.value.kind === "run_capture") { assert.equal(step.value.ref.value, "other_wf"); - assert.equal(step.value.args, '"hello"'); + assert.deepEqual(step.value.args, [{ kind: "literal", raw: '"hello"' }]); assert.equal(step.value.async, true); } } diff --git a/src/parse/parse-send-rhs.test.ts b/src/parse/parse-send-rhs.test.ts index 67754ef6..f3810a9f 100644 --- a/src/parse/parse-send-rhs.test.ts +++ b/src/parse/parse-send-rhs.test.ts @@ -67,7 +67,7 @@ test("parseSendRhs: run call with args", () => { assert.equal(rhs.kind, "run"); if (rhs.kind === "run") { assert.equal(rhs.ref.value, "my_script"); - assert.equal(rhs.args, '"arg1"'); + assert.deepEqual(rhs.args, [{ kind: "literal", raw: '"arg1"' }]); } }); diff --git a/src/parse/parse-steps.test.ts b/src/parse/parse-steps.test.ts index d4c39c5e..2fd95612 100644 --- a/src/parse/parse-steps.test.ts +++ b/src/parse/parse-steps.test.ts @@ -21,7 +21,7 @@ test("parseEnsureStep: parses ensure with args", () => { const { step } = parseEnsureStep("test.jh", lines, 0, 1, lines[0], 'my_rule("arg1")'); if (step.type === "ensure") { assert.equal(step.ref.value, "my_rule"); - assert.equal(step.args, '"arg1"'); + assert.deepEqual(step.args, [{ kind: "literal", raw: '"arg1"' }]); } }); diff --git a/src/parse/send-rhs.ts b/src/parse/send-rhs.ts index 50d5e6f1..f69dc412 100644 --- a/src/parse/send-rhs.ts +++ b/src/parse/send-rhs.ts @@ -52,7 +52,6 @@ export function parseSendRhs( rhs: { kind: "run", ref, ...(call.args ? { args: call.args } : {}), - ...(call.bareIdentifierArgs ? { bareIdentifierArgs: call.bareIdentifierArgs } : {}), }, nextIdx: defaultNext, }; diff --git a/src/parse/steps.ts b/src/parse/steps.ts index 01ebbd19..62d5ec3b 100644 --- a/src/parse/steps.ts +++ b/src/parse/steps.ts @@ -1,7 +1,7 @@ import type { WorkflowStepDef } from "../types"; import { createTrivia, type Trivia } from "./trivia"; import { parseConstRhs } from "./const-rhs"; -import { fail, indexOfClosingDoubleQuote, isRef, parseCallRef, parseLogMessageRhs, rejectTrailingContent } from "./core"; +import { argsToSourceForm, fail, indexOfClosingDoubleQuote, isRef, parseCallRef, parseLogMessageRhs, rejectTrailingContent } from "./core"; import { parseAnonymousInlineScript } from "./inline-script"; import { isBareIdentifierReturn, bareIdentifierToQuotedString, isBareDottedIdentifierReturn, dottedReturnToQuotedString } from "./workflow-return-dotted"; import { parsePromptStep } from "./prompt"; @@ -115,13 +115,12 @@ function parseCatchStatement( if (call && !call.rest.trim()) { return { type: "return", - value: `run ${call.ref}(${call.args ?? ""})`, + value: `run ${call.ref}(${argsToSourceForm(call.args)})`, loc: { line: lineNo, col }, managed: { kind: "run", ref: { value: call.ref, loc: { line: lineNo, col } }, args: call.args, - ...(call.bareIdentifierArgs ? { bareIdentifierArgs: call.bareIdentifierArgs } : {}), }, }; } @@ -132,13 +131,12 @@ function parseCatchStatement( if (call && !call.rest.trim()) { return { type: "return", - value: `ensure ${call.ref}(${call.args ?? ""})`, + value: `ensure ${call.ref}(${argsToSourceForm(call.args)})`, loc: { line: lineNo, col }, managed: { kind: "ensure", ref: { value: call.ref, loc: { line: lineNo, col } }, args: call.args, - ...(call.bareIdentifierArgs ? { bareIdentifierArgs: call.bareIdentifierArgs } : {}), }, }; } @@ -213,7 +211,6 @@ function parseCatchStatement( body: result.body, ...(result.lang ? { lang: result.lang } : {}), args: result.args, - ...(result.bareIdentifierArgs ? { bareIdentifierArgs: result.bareIdentifierArgs } : {}), loc: { line: lineNo, col }, }; } @@ -240,7 +237,6 @@ function parseCatchStatement( type: "run", workflow: { value: callPart.ref, loc: { line: lineNo, col } }, args: callPart.args, - ...(callPart.bareIdentifierArgs ? { bareIdentifierArgs: callPart.bareIdentifierArgs } : {}), recover: { block: blockSteps, bindings }, }; } @@ -250,7 +246,6 @@ function parseCatchStatement( type: "run", workflow: { value: callPart.ref, loc: { line: lineNo, col } }, args: callPart.args, - ...(callPart.bareIdentifierArgs ? { bareIdentifierArgs: callPart.bareIdentifierArgs } : {}), recover: { single: singleStep, bindings }, }; } @@ -280,7 +275,6 @@ function parseCatchStatement( type: "run", workflow: { value: callPart.ref, loc: { line: lineNo, col } }, args: callPart.args, - ...(callPart.bareIdentifierArgs ? { bareIdentifierArgs: callPart.bareIdentifierArgs } : {}), catch: { block: blockSteps, bindings }, }; } @@ -290,7 +284,6 @@ function parseCatchStatement( type: "run", workflow: { value: callPart.ref, loc: { line: lineNo, col } }, args: callPart.args, - ...(callPart.bareIdentifierArgs ? { bareIdentifierArgs: callPart.bareIdentifierArgs } : {}), catch: { single: singleStep, bindings }, }; } @@ -305,7 +298,6 @@ function parseCatchStatement( type: "run", workflow: { value: call.ref, loc: { line: lineNo, col } }, args: call.args, - ...(call.bareIdentifierArgs ? { bareIdentifierArgs: call.bareIdentifierArgs } : {}), }; } } @@ -332,7 +324,6 @@ function parseCatchStatement( type: "ensure", ref: { value: callPart.ref, loc: { line: lineNo, col } }, args: callPart.args, - ...(callPart.bareIdentifierArgs ? { bareIdentifierArgs: callPart.bareIdentifierArgs } : {}), catch: { block: blockSteps, bindings }, }; } @@ -342,7 +333,6 @@ function parseCatchStatement( type: "ensure", ref: { value: callPart.ref, loc: { line: lineNo, col } }, args: callPart.args, - ...(callPart.bareIdentifierArgs ? { bareIdentifierArgs: callPart.bareIdentifierArgs } : {}), catch: { single: singleStep, bindings }, }; } @@ -357,7 +347,6 @@ function parseCatchStatement( type: "ensure", ref: { value: call.ref, loc: { line: lineNo, col } }, args: call.args, - ...(call.bareIdentifierArgs ? { bareIdentifierArgs: call.bareIdentifierArgs } : {}), }; } } @@ -432,7 +421,6 @@ export function parseEnsureStep( type: "ensure", ref: { value: call.ref, loc: { line: innerNo, col: ensureCol } }, args: call.args, - ...(call.bareIdentifierArgs ? { bareIdentifierArgs: call.bareIdentifierArgs } : {}), ...(captureName ? { captureName } : {}), }, nextIdx: idx, @@ -481,7 +469,6 @@ export function parseEnsureStep( const refLoc = { value: ref, loc: { line: innerNo, col: ensureCol } }; const base = { type: "ensure" as const, ref: refLoc, args, - ...(call.bareIdentifierArgs ? { bareIdentifierArgs: call.bareIdentifierArgs } : {}), ...(captureName ? { captureName } : {}), }; @@ -598,7 +585,6 @@ export function parseRunRecoverStep( type: "run" as const, workflow: { value: call.ref, loc: { line: innerNo, col: runCol } }, args: call.args, - ...(call.bareIdentifierArgs ? { bareIdentifierArgs: call.bareIdentifierArgs } : {}), ...(captureName ? { captureName } : {}), }; @@ -714,7 +700,6 @@ export function parseRunCatchStep( type: "run" as const, workflow: { value: call.ref, loc: { line: innerNo, col: runCol } }, args: call.args, - ...(call.bareIdentifierArgs ? { bareIdentifierArgs: call.bareIdentifierArgs } : {}), ...(captureName ? { captureName } : {}), }; diff --git a/src/parse/workflow-brace.ts b/src/parse/workflow-brace.ts index f0a52e26..6c125747 100644 --- a/src/parse/workflow-brace.ts +++ b/src/parse/workflow-brace.ts @@ -1,6 +1,7 @@ import type { WorkflowMetadata, WorkflowStepDef } from "../types"; import { createTrivia, type Trivia } from "./trivia"; import { + argsToSourceForm, colFromRaw, fail, hasUnescapedClosingQuote, @@ -273,7 +274,6 @@ export function parseBlockStatement( loc: { line: innerNo, col: innerRaw.indexOf("run") + 1 }, }, args: call.args, - ...(call.bareIdentifierArgs ? { bareIdentifierArgs: call.bareIdentifierArgs } : {}), async: true, }, nextIdx: idx + 1, @@ -290,7 +290,6 @@ export function parseBlockStatement( body: result.body, ...(result.lang ? { lang: result.lang } : {}), args: result.args, - ...(result.bareIdentifierArgs ? { bareIdentifierArgs: result.bareIdentifierArgs } : {}), loc: { line: innerNo, col: innerRaw.indexOf("run") + 1 }, }, nextIdx: result.nextLineIdx, @@ -322,7 +321,6 @@ export function parseBlockStatement( loc: { line: innerNo, col: innerRaw.indexOf("run") + 1 }, }, args: call.args, - ...(call.bareIdentifierArgs ? { bareIdentifierArgs: call.bareIdentifierArgs } : {}), }, nextIdx: idx + 1, }; @@ -383,7 +381,6 @@ export function parseBlockStatement( body: result.body, ...(result.lang ? { lang: result.lang } : {}), args: result.args, - ...(result.bareIdentifierArgs ? { bareIdentifierArgs: result.bareIdentifierArgs } : {}), }, }, nextIdx: result.nextLineIdx, @@ -421,7 +418,6 @@ export function parseBlockStatement( body: result.body, ...(result.lang ? { lang: result.lang } : {}), args: result.args, - ...(result.bareIdentifierArgs ? { bareIdentifierArgs: result.bareIdentifierArgs } : {}), }, }, nextIdx: result.nextLineIdx, @@ -498,8 +494,7 @@ export function parseBlockStatement( body: result.body, ...(result.lang ? { lang: result.lang } : {}), args: result.args, - ...(result.bareIdentifierArgs ? { bareIdentifierArgs: result.bareIdentifierArgs } : {}), - }, + }, }, nextIdx: result.nextLineIdx, }; @@ -510,11 +505,10 @@ export function parseBlockStatement( return { step: { type: "return", - value: `run ${call.ref}(${call.args ?? ""})`, + value: `run ${call.ref}(${argsToSourceForm(call.args)})`, loc: retLoc, managed: { kind: "run", ref: { value: call.ref, loc: retLoc }, args: call.args, - ...(call.bareIdentifierArgs ? { bareIdentifierArgs: call.bareIdentifierArgs } : {}), }, }, nextIdx: idx + 1, @@ -528,11 +522,10 @@ export function parseBlockStatement( return { step: { type: "return", - value: `ensure ${call.ref}(${call.args ?? ""})`, + value: `ensure ${call.ref}(${argsToSourceForm(call.args)})`, loc: retLoc, managed: { kind: "ensure", ref: { value: call.ref, loc: retLoc }, args: call.args, - ...(call.bareIdentifierArgs ? { bareIdentifierArgs: call.bareIdentifierArgs } : {}), }, }, nextIdx: idx + 1, diff --git a/src/runtime/kernel/node-workflow-runtime.ts b/src/runtime/kernel/node-workflow-runtime.ts index 7ef18adc..a557be73 100644 --- a/src/runtime/kernel/node-workflow-runtime.ts +++ b/src/runtime/kernel/node-workflow-runtime.ts @@ -5,6 +5,7 @@ import { PassThrough } from "node:stream"; import { randomUUID } from "node:crypto"; import { AsyncLocalStorage } from "node:async_hooks"; import { inlineScriptName } from "../../inline-script-name"; +import { argsToRuntimeString } from "../../parse/core"; import type { MatchExprDef, WorkflowStepDef } from "../../types"; import { executePrompt, resolveConfig, resolveModel, resolvePromptStepName } from "./prompt"; import { appendRunSummaryLine } from "./emit"; @@ -522,7 +523,7 @@ export class NodeWorkflowRuntime { let message: string; if (step.managed?.kind === "run_inline_script") { const shebang = step.managed.lang ? `#!/usr/bin/env ${step.managed.lang}` : undefined; - const result = await this.executeInlineScript(scope, step.managed.body, shebang, step.managed.args ?? ""); + const result = await this.executeInlineScript(scope, step.managed.body, shebang, argsToRuntimeString(step.managed.args)); if (result.status !== 0) return this.mergeStepResult(accOut, accErr, result); message = result.returnValue ?? result.output.trim(); } else { @@ -570,14 +571,14 @@ export class NodeWorkflowRuntime { } if (step.managed.kind === "run_inline_script") { const shebang = step.managed.lang ? `#!/usr/bin/env ${step.managed.lang}` : undefined; - const result = await this.executeInlineScript(scope, step.managed.body, shebang, step.managed.args ?? ""); + const result = await this.executeInlineScript(scope, step.managed.body, shebang, argsToRuntimeString(step.managed.args)); if (result.status !== 0) return this.mergeStepResult(accOut, accErr, result); returnValue = result.returnValue ?? result.output.trim(); return this.mergeStepResult(accOut, accErr, { status: 0, output: "", error: "", returnValue }); } const result = step.managed.kind === "run" - ? await this.executeRunRef(scope, step.managed.ref.value, step.managed.args ?? "") - : await this.executeEnsureRef(scope, step.managed.ref.value, step.managed.args ?? "", undefined); + ? await this.executeRunRef(scope, step.managed.ref.value, argsToRuntimeString(step.managed.args)) + : await this.executeEnsureRef(scope, step.managed.ref.value, argsToRuntimeString(step.managed.args), undefined); if (result.status !== 0) return this.mergeStepResult(accOut, accErr, result); returnValue = result.returnValue ?? result.output.trim(); return this.mergeStepResult(accOut, accErr, { status: 0, output: "", error: "", returnValue }); @@ -607,7 +608,7 @@ export class NodeWorkflowRuntime { if (sendHandleErr) return this.mergeStepResult(accOut, accErr, sendHandleErr); payload = interpolate(step.rhs.bash, scope.vars, scope.env); } else if (step.rhs.kind === "run") { - const runValue = await this.executeRunRef(scope, step.rhs.ref.value, step.rhs.args ?? ""); + const runValue = await this.executeRunRef(scope, step.rhs.ref.value, argsToRuntimeString(step.rhs.args)); if (runValue.status !== 0) return this.mergeStepResult(accOut, accErr, runValue); payload = runValue.returnValue ?? runValue.output.trim(); } else { @@ -679,7 +680,7 @@ export class NodeWorkflowRuntime { } if (step.value.kind === "run_capture") { const captureRef = step.value.ref.value; - const captureArgs = step.value.args ?? ""; + const captureArgs = argsToRuntimeString(step.value.args); if (step.value.async) { // Async capture: create handle, store in scope, register for join. asyncCounter += 1; @@ -702,13 +703,13 @@ export class NodeWorkflowRuntime { } if (step.value.kind === "run_inline_script_capture") { const shebang = step.value.lang ? `#!/usr/bin/env ${step.value.lang}` : undefined; - const result = await this.executeInlineScript(scope, step.value.body, shebang, step.value.args ?? ""); + const result = await this.executeInlineScript(scope, step.value.body, shebang, argsToRuntimeString(step.value.args)); if (result.status !== 0) return this.mergeStepResult(accOut, accErr, result); scope.vars.set(step.name, result.returnValue ?? result.output.trim()); continue; } if (step.value.kind === "ensure_capture") { - const ensureResult = await this.executeEnsureRef(scope, step.value.ref.value, step.value.args ?? "", undefined); + const ensureResult = await this.executeEnsureRef(scope, step.value.ref.value, argsToRuntimeString(step.value.args), undefined); if (ensureResult.status !== 0) return this.mergeStepResult(accOut, accErr, ensureResult); scope.vars.set(step.name, ensureResult.returnValue ?? ensureResult.output.trim()); continue; @@ -738,7 +739,7 @@ export class NodeWorkflowRuntime { const branchStack = [...this.getFrameStack()]; const branchIndices = [...this.getAsyncIndices(), asyncCounter]; const ref = step.workflow.value; - const argsRaw = step.args ?? ""; + const argsRaw = argsToRuntimeString(step.args); const runInBranch = (fn: () => Promise): Promise => this.asyncFrameStack.run(branchStack, () => this.asyncIndicesStorage.run(branchIndices, fn), @@ -780,12 +781,12 @@ export class NodeWorkflowRuntime { } if (step.recover) { const limit = this.resolveRecoverLimit(scope.filePath); - let lastResult = await this.executeRunRef(scope, step.workflow.value, step.args ?? ""); + let lastResult = await this.executeRunRef(scope, step.workflow.value, argsToRuntimeString(step.args)); let attempt = 1; while (lastResult.status !== 0 && attempt <= limit) { const rr = await this.runRecoverBody(scope, step.recover, `${lastResult.output}${lastResult.error}`); if (rr.status !== 0 || rr.returnValue !== undefined) return this.mergeStepResult(accOut, accErr, rr); - lastResult = await this.executeRunRef(scope, step.workflow.value, step.args ?? ""); + lastResult = await this.executeRunRef(scope, step.workflow.value, argsToRuntimeString(step.args)); attempt += 1; } if (lastResult.status === 0) { @@ -797,7 +798,7 @@ export class NodeWorkflowRuntime { } continue; } - const runResult = await this.executeRunRef(scope, step.workflow.value, step.args ?? ""); + const runResult = await this.executeRunRef(scope, step.workflow.value, argsToRuntimeString(step.args)); if (runResult.status === 0) { if (step.captureName) { scope.vars.set(step.captureName, runResult.returnValue ?? runResult.output.trim()); @@ -812,7 +813,7 @@ export class NodeWorkflowRuntime { } if (step.type === "run_inline_script") { const shebang = step.lang ? `#!/usr/bin/env ${step.lang}` : undefined; - const result = await this.executeInlineScript(scope, step.body, shebang, step.args ?? ""); + const result = await this.executeInlineScript(scope, step.body, shebang, argsToRuntimeString(step.args)); if (step.captureName && result.status === 0) { scope.vars.set(step.captureName, result.returnValue ?? result.output.trim()); } @@ -820,7 +821,7 @@ export class NodeWorkflowRuntime { continue; } if (step.type === "ensure") { - const ensureResult = await this.executeEnsureRef(scope, step.ref.value, step.args ?? "", step.catch); + const ensureResult = await this.executeEnsureRef(scope, step.ref.value, argsToRuntimeString(step.args), step.catch); if (step.captureName && ensureResult.status === 0) { scope.vars.set(step.captureName, ensureResult.returnValue ?? ensureResult.output.trim()); } diff --git a/src/runtime/kernel/runtime-arg-parser.ts b/src/runtime/kernel/runtime-arg-parser.ts index b09db127..925d9df4 100644 --- a/src/runtime/kernel/runtime-arg-parser.ts +++ b/src/runtime/kernel/runtime-arg-parser.ts @@ -5,7 +5,7 @@ * resolve interpolated strings, parse call argument lists (including managed * `run`/`ensure` and inline-script forms), and validate prompt return schemas. */ -import { parseCallRef } from "../../parse/core"; +import { argsToRuntimeString, parseCallRef } from "../../parse/core"; import { formatUtcTimestamp } from "./emit"; export const BARE_IDENT_RE = /^[A-Za-z_][A-Za-z0-9_]*$/; @@ -146,7 +146,7 @@ export function parseManagedArgAt(raw: string, start: number): { token: ParsedAr kind: "managed", managedKind: keyword, ref: call.ref, - argsRaw: call.args ?? "", + argsRaw: argsToRuntimeString(call.args), }, next: start + keyword.length + skipped + consumed, }; diff --git a/src/transpile/compiler-golden.test.ts b/src/transpile/compiler-golden.test.ts index b4c78c74..c263ff70 100644 --- a/src/transpile/compiler-golden.test.ts +++ b/src/transpile/compiler-golden.test.ts @@ -411,13 +411,13 @@ test("parser: const allows run-wrapped script call with args", () => { const step = mod.workflows[0].steps[0] as { type: string; name: string; - value: { kind: string; ref?: { value: string }; args?: string }; + value: { kind: string; ref?: { value: string }; args?: import("../types").Arg[] }; }; assert.equal(step.type, "const"); assert.equal(step.name, "x"); assert.equal(step.value.kind, "run_capture"); assert.equal(step.value.ref?.value, "some_script"); - assert.equal(step.value.args, '${arg1}'); + assert.deepEqual(step.value.args, [{ kind: "var", name: "arg1" }]); }); test("parser: const prompt capture parses", () => { diff --git a/src/transpile/validate-string.test.ts b/src/transpile/validate-string.test.ts index f2e2cc93..251b65e3 100644 --- a/src/transpile/validate-string.test.ts +++ b/src/transpile/validate-string.test.ts @@ -399,11 +399,11 @@ test("rejected: ${run ref} with unknown ref in workflow", () => { }); }); -test("extractInlineCaptures extracts run and ensure with args", () => { +test("extractInlineCaptures extracts run and ensure with typed Arg[]", () => { const { extractInlineCaptures } = require("./validate-string"); const result = extractInlineCaptures('prefix ${run greet(world)} middle ${ensure check()} suffix'); assert.deepEqual(result, [ - { kind: "run", ref: "greet", args: "${world}" }, + { kind: "run", ref: "greet", args: [{ kind: "var", name: "world" }] }, { kind: "ensure", ref: "check", args: undefined }, ]); }); diff --git a/src/transpile/validate-string.ts b/src/transpile/validate-string.ts index 34777e53..4851031c 100644 --- a/src/transpile/validate-string.ts +++ b/src/transpile/validate-string.ts @@ -11,6 +11,7 @@ import { jaiphError } from "../errors"; import { parseCallRef } from "../parse/core"; +import type { Arg } from "../types"; /** * Check for shell fallback/expansion syntax inside ${...} blocks. @@ -98,7 +99,7 @@ const INLINE_CAPTURE_RE = /\$\{(run|ensure)\s+([^}]+)\}/g; export interface InlineCapture { kind: "run" | "ensure"; ref: string; - args?: string; + args?: Arg[]; } /** Extract ${run ref [args]} and ${ensure ref [args]} from string content (unquoted). */ @@ -280,7 +281,7 @@ export function validateJaiphStringContent( ); } - if (call.args && /\$\{(?:run|ensure)\s/.test(call.args)) { + if (call.args?.some((a) => a.kind === "literal" && /\$\{(?:run|ensure)\s/.test(a.raw))) { throw jaiphError( filePath, line, col, "E_PARSE", `${context} cannot contain nested inline captures; extract to a const variable`, diff --git a/src/transpile/validate.ts b/src/transpile/validate.ts index 0bd0aff8..ae944e21 100644 --- a/src/transpile/validate.ts +++ b/src/transpile/validate.ts @@ -1,7 +1,7 @@ import { existsSync } from "node:fs"; import { dirname, resolve } from "node:path"; import { jaiphError } from "../errors"; -import type { jaiphModule, MatchExprDef, WorkflowStepDef } from "../types"; +import type { Arg, jaiphModule, MatchExprDef, WorkflowStepDef } from "../types"; import type { ModuleGraph } from "./module-graph"; import type { SubstitutionValidateEnv } from "./validate-substitution"; import { validateManagedWorkflowShell } from "./validate-substitution"; @@ -54,17 +54,22 @@ function hasUnquotedSendArrow(line: string): boolean { return false; } -/** Check if args contain unquoted shell redirection operators (>, >>, |, &). */ -function hasShellRedirection(args: string): boolean { - let inQuote = false; - for (let i = 0; i < args.length; i++) { - const ch = args[i]; - if (ch === '"' && (i === 0 || args[i - 1] !== "\\")) { - inQuote = !inQuote; - continue; - } - if (!inQuote && (ch === ">" || ch === "|" || ch === "&")) { - return true; +/** Check if any literal arg contains unquoted shell redirection operators (>, >>, |, &). */ +function hasShellRedirection(args: Arg[] | undefined): boolean { + if (!args) return false; + for (const a of args) { + if (a.kind !== "literal") continue; + let inQuote = false; + const raw = a.raw; + for (let i = 0; i < raw.length; i++) { + const ch = raw[i]; + if (ch === '"' && (i === 0 || raw[i - 1] !== "\\")) { + inQuote = !inQuote; + continue; + } + if (!inQuote && (ch === ">" || ch === "|" || ch === "&")) { + return true; + } } } return false; @@ -74,9 +79,9 @@ function validateNoShellRedirection( filePath: string, loc: { line: number; col: number }, keyword: string, - args: string | undefined, + args: Arg[] | undefined, ): void { - if (!args || !hasShellRedirection(args)) return; + if (!hasShellRedirection(args)) return; throw jaiphError( filePath, loc.line, @@ -287,30 +292,6 @@ function validateImmutableBindings( walk(steps, bound); } -/** Count the number of call arguments from a space-separated args string (respects quotes). */ -function countCallArgs(argsStr: string | undefined): number { - if (!argsStr || !argsStr.trim()) return 0; - let count = 0; - let inQuote: string | null = null; - let hasContent = false; - for (let i = 0; i < argsStr.length; i++) { - const ch = argsStr[i]; - if (inQuote) { - hasContent = true; - if (ch === inQuote && argsStr[i - 1] !== "\\") inQuote = null; - } else if (ch === '"' || ch === "'") { - hasContent = true; - inQuote = ch; - } else if (ch === " " || ch === "\t") { - if (hasContent) { count++; hasContent = false; } - } else { - hasContent = true; - } - } - if (hasContent) count++; - return count; -} - /** Look up declared params for a workflow or rule target. Returns undefined if target has no declared params. */ function lookupCalleeParams( ref: string, @@ -349,14 +330,14 @@ function validateArity( filePath: string, loc: { line: number; col: number }, ref: string, - args: string | undefined, + args: Arg[] | undefined, targetKind: "workflow" | "rule", ast: jaiphModule, refCtx: RefResolutionContext, ): void { const params = lookupCalleeParams(ref, targetKind, ast, refCtx); if (params === undefined) return; // callee not a workflow/rule in scope — skip - const argCount = countCallArgs(args); + const argCount = args?.length ?? 0; if (argCount !== params.length) { throw jaiphError( filePath, @@ -368,70 +349,58 @@ function validateArity( } } - -/** Validate bare identifier args against known variables. */ -function validateBareIdentifierArgs( +/** Check each var-arg against the in-scope bindings; recover bindings are extra names. */ +function validateArgVarRefs( filePath: string, loc: { line: number; col: number }, - bareIdentifierArgs: string[] | undefined, + args: Arg[] | undefined, knownVars: Set, - /** Extra variable names from `ensure … recover` bindings. */ recoverBindings?: Set, ): void { - if (!bareIdentifierArgs) return; - for (const name of bareIdentifierArgs) { - if (recoverBindings?.has(name)) { - continue; - } - if (!knownVars.has(name)) { - throw jaiphError( - filePath, - loc.line, - loc.col, - "E_VALIDATE", - `unknown identifier "${name}" used as bare argument; declare it with "const", use a capture, or add a workflow/rule parameter`, - ); - } + if (!args) return; + for (const a of args) { + if (a.kind !== "var") continue; + if (recoverBindings?.has(a.name)) continue; + if (knownVars.has(a.name)) continue; + throw jaiphError( + filePath, + loc.line, + loc.col, + "E_VALIDATE", + `unknown identifier "${a.name}" used as bare argument; declare it with "const", use a capture, or add a workflow/rule parameter`, + ); } } -function stripQuotedArgContent(args: string): string { - let out = ""; - let quote: "'" | '"' | null = null; - for (let i = 0; i < args.length; i += 1) { - const ch = args[i]!; - if (quote) { - if (ch === quote && args[i - 1] !== "\\") { - quote = null; - } - out += " "; - continue; - } - if (ch === "'" || ch === '"') { - quote = ch; - out += " "; - continue; - } - out += ch; +/** + * Reject nested unmanaged calls inside literal args, e.g. `outer(inner())` or `outer(\`body\`())`. + * Each literal arg is one source segment, so a nested `name(` or `` `...`( `` form is only + * valid when explicitly prefixed with `run` or `ensure`. + */ +function validateNestedManagedCallArgs( + filePath: string, + loc: { line: number; col: number }, + args: Arg[] | undefined, +): void { + if (!args) return; + for (const a of args) { + if (a.kind !== "literal") continue; + checkNestedManagedInLiteral(filePath, loc, a.raw); } - return out; } -function validateNestedManagedCallArgs( +function checkNestedManagedInLiteral( filePath: string, loc: { line: number; col: number }, - args: string | undefined, + raw: string, ): void { - if (!args) return; - const stripped = stripQuotedArgContent(args); + const stripped = stripQuotedSegmentContent(raw); const re = /\b([A-Za-z_][A-Za-z0-9_.]*)\s*\(/g; let match: RegExpExecArray | null; while ((match = re.exec(stripped)) !== null) { const before = stripped.slice(0, match.index).trimEnd(); const lastToken = before.length === 0 ? "" : before.slice(before.lastIndexOf(" ") + 1); - if (lastToken === "run" || lastToken === "ensure") { - continue; - } + if (lastToken === "run" || lastToken === "ensure") continue; throw jaiphError( filePath, loc.line, @@ -440,15 +409,12 @@ function validateNestedManagedCallArgs( `nested managed calls in argument position must be explicit; use "run ${match[1]}(...)" or "ensure ${match[1]}(...)" inside the argument list`, ); } - // Detect bare inline script calls: `body`() without preceding run/ensure const btRe = /`[^`]*`\s*\(/g; let btMatch: RegExpExecArray | null; while ((btMatch = btRe.exec(stripped)) !== null) { const before = stripped.slice(0, btMatch.index).trimEnd(); const lastToken = before.length === 0 ? "" : before.slice(before.lastIndexOf(" ") + 1); - if (lastToken === "run" || lastToken === "ensure") { - continue; - } + if (lastToken === "run" || lastToken === "ensure") continue; throw jaiphError( filePath, loc.line, @@ -459,6 +425,29 @@ function validateNestedManagedCallArgs( } } +/** Replace double/single-quoted content (and surrounding quotes) with spaces for shape scanning. */ +function stripQuotedSegmentContent(segment: string): string { + let out = ""; + let quote: "'" | '"' | null = null; + for (let i = 0; i < segment.length; i += 1) { + const ch = segment[i]!; + if (quote) { + if (ch === quote && segment[i - 1] !== "\\") { + quote = null; + } + out += " "; + continue; + } + if (ch === "'" || ch === '"') { + quote = ch; + out += " "; + continue; + } + out += ch; + } + return out; +} + /** Resolve a route target workflow ref to its declared parameter count. Returns undefined if unresolvable. */ function resolveRouteTargetParams( ref: string, @@ -721,7 +710,7 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { validateRef(s.ref, ast, refCtx, expectRuleRef); validateArity(ast.filePath, s.ref.loc, s.ref.value, s.args, "rule", ast, refCtx); - validateBareIdentifierArgs(ast.filePath, s.ref.loc, s.bareIdentifierArgs, ruleKnownVars); + validateArgVarRefs(ast.filePath, s.ref.loc, s.args, ruleKnownVars); if (s.catch) { const steps = "single" in s.catch ? [s.catch.single] : s.catch.block; const rb = new Set(); @@ -748,7 +737,7 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { validateRef(s.workflow, ast, refCtx, expectRunInRuleRef); validateArity(ast.filePath, s.workflow.loc, s.workflow.value, s.args, "workflow", ast, refCtx); - validateBareIdentifierArgs(ast.filePath, s.workflow.loc, s.bareIdentifierArgs, ruleKnownVars); + validateArgVarRefs(ast.filePath, s.workflow.loc, s.args, ruleKnownVars); if (s.catch) { const steps = "single" in s.catch ? [s.catch.single] : s.catch.block; const rb = new Set(); @@ -827,14 +816,14 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { validateRef(s.managed.ref, ast, refCtx, expectRunInRuleRef); validateArity(ast.filePath, s.managed.ref.loc, s.managed.ref.value, s.managed.args, "workflow", ast, refCtx); - validateBareIdentifierArgs(ast.filePath, s.managed.ref.loc, s.managed.bareIdentifierArgs, ruleKnownVars); + validateArgVarRefs(ast.filePath, s.managed.ref.loc, s.managed.args, ruleKnownVars); } else if (s.managed.kind === "ensure") { validateNoShellRedirection(ast.filePath, s.managed.ref.loc, "ensure", s.managed.args); validateNestedManagedCallArgs(ast.filePath, s.managed.ref.loc, s.managed.args); validateRef(s.managed.ref, ast, refCtx, expectRuleRef); validateArity(ast.filePath, s.managed.ref.loc, s.managed.ref.value, s.managed.args, "rule", ast, refCtx); - validateBareIdentifierArgs(ast.filePath, s.managed.ref.loc, s.managed.bareIdentifierArgs, ruleKnownVars); + validateArgVarRefs(ast.filePath, s.managed.ref.loc, s.managed.args, ruleKnownVars); } else if (s.managed.kind === "match") { validateMatchExpr(ast.filePath, s.managed.match, ruleKnownVars); } @@ -871,14 +860,14 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { validateRef(v.ref, ast, refCtx, expectRunInRuleRef); validateArity(ast.filePath, v.ref.loc, v.ref.value, v.args, "workflow", ast, refCtx); - validateBareIdentifierArgs(ast.filePath, v.ref.loc, v.bareIdentifierArgs, ruleKnownVars); + validateArgVarRefs(ast.filePath, v.ref.loc, v.args, ruleKnownVars); } else if (v.kind === "ensure_capture") { validateNoShellRedirection(ast.filePath, v.ref.loc, "ensure", v.args); validateNestedManagedCallArgs(ast.filePath, v.ref.loc, v.args); validateRef(v.ref, ast, refCtx, expectRuleRef); validateArity(ast.filePath, v.ref.loc, v.ref.value, v.args, "rule", ast, refCtx); - validateBareIdentifierArgs(ast.filePath, v.ref.loc, v.bareIdentifierArgs, ruleKnownVars); + validateArgVarRefs(ast.filePath, v.ref.loc, v.args, ruleKnownVars); } else if (v.kind === "prompt_capture") { throw jaiphError(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", "const ... = prompt is not allowed in rules"); } else if (v.kind === "run_inline_script_capture") { @@ -1039,7 +1028,7 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { validateRef(s.rhs.ref, ast, refCtx, expectRunTargetRef); validateArity(ast.filePath, s.rhs.ref.loc, s.rhs.ref.value, s.rhs.args, "workflow", ast, refCtx); - validateBareIdentifierArgs(ast.filePath, s.rhs.ref.loc, s.rhs.bareIdentifierArgs, wfKnownVars, recoverBindings); + validateArgVarRefs(ast.filePath, s.rhs.ref.loc, s.rhs.args, wfKnownVars, recoverBindings); } else if (s.rhs.kind === "literal") { const inner = s.rhs.token.startsWith('"') && s.rhs.token.endsWith('"') ? s.rhs.token.slice(1, -1) : s.rhs.token; @@ -1074,7 +1063,7 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { validateRef(s.ref, ast, refCtx, expectRuleRef); validateArity(ast.filePath, s.ref.loc, s.ref.value, s.args, "rule", ast, refCtx); - validateBareIdentifierArgs(ast.filePath, s.ref.loc, s.bareIdentifierArgs, wfKnownVars, recoverBindings); + validateArgVarRefs(ast.filePath, s.ref.loc, s.args, wfKnownVars, recoverBindings); if (s.catch) { const steps = "single" in s.catch ? [s.catch.single] : s.catch.block; const rb = new Set(); @@ -1092,7 +1081,7 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { validateRef(s.workflow, ast, refCtx, expectRunTargetRef); validateArity(ast.filePath, s.workflow.loc, s.workflow.value, s.args, "workflow", ast, refCtx); - validateBareIdentifierArgs(ast.filePath, s.workflow.loc, s.bareIdentifierArgs, wfKnownVars, recoverBindings); + validateArgVarRefs(ast.filePath, s.workflow.loc, s.args, wfKnownVars, recoverBindings); if (s.catch) { const steps = "single" in s.catch ? [s.catch.single] : s.catch.block; const rb = new Set(); @@ -1179,14 +1168,14 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { validateRef(s.managed.ref, ast, refCtx, expectRunTargetRef); validateArity(ast.filePath, s.managed.ref.loc, s.managed.ref.value, s.managed.args, "workflow", ast, refCtx); - validateBareIdentifierArgs(ast.filePath, s.managed.ref.loc, s.managed.bareIdentifierArgs, wfKnownVars, recoverBindings); + validateArgVarRefs(ast.filePath, s.managed.ref.loc, s.managed.args, wfKnownVars, recoverBindings); } else if (s.managed.kind === "ensure") { validateNoShellRedirection(ast.filePath, s.managed.ref.loc, "ensure", s.managed.args); validateNestedManagedCallArgs(ast.filePath, s.managed.ref.loc, s.managed.args); validateRef(s.managed.ref, ast, refCtx, expectRuleRef); validateArity(ast.filePath, s.managed.ref.loc, s.managed.ref.value, s.managed.args, "rule", ast, refCtx); - validateBareIdentifierArgs(ast.filePath, s.managed.ref.loc, s.managed.bareIdentifierArgs, wfKnownVars, recoverBindings); + validateArgVarRefs(ast.filePath, s.managed.ref.loc, s.managed.args, wfKnownVars, recoverBindings); } else if (s.managed.kind === "match") { validateMatchExpr(ast.filePath, s.managed.match, wfKnownVars); } @@ -1242,14 +1231,14 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { validateRef(v.ref, ast, refCtx, expectRunTargetRef); validateArity(ast.filePath, v.ref.loc, v.ref.value, v.args, "workflow", ast, refCtx); - validateBareIdentifierArgs(ast.filePath, v.ref.loc, v.bareIdentifierArgs, wfKnownVars, recoverBindings); + validateArgVarRefs(ast.filePath, v.ref.loc, v.args, wfKnownVars, recoverBindings); } else if (v.kind === "ensure_capture") { validateNoShellRedirection(ast.filePath, v.ref.loc, "ensure", v.args); validateNestedManagedCallArgs(ast.filePath, v.ref.loc, v.args); validateRef(v.ref, ast, refCtx, expectRuleRef); validateArity(ast.filePath, v.ref.loc, v.ref.value, v.args, "rule", ast, refCtx); - validateBareIdentifierArgs(ast.filePath, v.ref.loc, v.bareIdentifierArgs, wfKnownVars, recoverBindings); + validateArgVarRefs(ast.filePath, v.ref.loc, v.args, wfKnownVars, recoverBindings); } else if (v.kind === "prompt_capture") { const promptIdent = promptBareIdentifier(v.raw); if (promptIdent && localScripts.has(promptIdent)) { diff --git a/src/types.ts b/src/types.ts index e093e213..73080680 100644 --- a/src/types.ts +++ b/src/types.ts @@ -47,24 +47,36 @@ export interface MatchExprDef { loc: SourceLoc; } +/** + * Single call argument, classified at parse time. + * + * - `var`: a bare identifier reference (e.g. `foo(task)` → `{ kind: "var", name: "task" }`). + * The validator checks `name` against in-scope bindings; the runtime sees `${name}`. + * - `literal`: any other form (quoted string, `${…}` interpolation, nested `run …` / + * `ensure …` / inline-script call). Stored verbatim as authored, between the surrounding commas. + */ +export type Arg = + | { kind: "literal"; raw: string } + | { kind: "var"; name: string }; + export type ConstRhs = | { kind: "expr"; bashRhs: string } - | { kind: "run_capture"; ref: WorkflowRefDef; args?: string; bareIdentifierArgs?: string[]; async?: boolean } - | { kind: "ensure_capture"; ref: RuleRefDef; args?: string; bareIdentifierArgs?: string[] } + | { kind: "run_capture"; ref: WorkflowRefDef; args?: Arg[]; async?: boolean } + | { kind: "ensure_capture"; ref: RuleRefDef; args?: Arg[] } | { kind: "prompt_capture"; raw: string; loc: SourceLoc; returns?: string; } - | { kind: "run_inline_script_capture"; body: string; lang?: string; args?: string; bareIdentifierArgs?: string[] } + | { kind: "run_inline_script_capture"; body: string; lang?: string; args?: Arg[] } | { kind: "match_expr"; match: MatchExprDef }; /** RHS of `channel <- …` */ export type SendRhsDef = | { kind: "literal"; token: string } | { kind: "var"; bash: string } - | { kind: "run"; ref: WorkflowRefDef; args?: string; bareIdentifierArgs?: string[] } + | { kind: "run"; ref: WorkflowRefDef; args?: Arg[] } /** Parsed then rejected in validation (use `run ref` to capture a return value). */ | { kind: "bare_ref"; ref: WorkflowRefDef } /** Shell fragment emitted as `"$(...)"` for inbox send. */ @@ -111,8 +123,7 @@ export type WorkflowStepDef = | { type: "ensure"; ref: RuleRefDef; - args?: string; - bareIdentifierArgs?: string[]; + args?: Arg[]; /** When set, capture step stdout into this variable name. */ captureName?: string; /** When set, catch failure and run recovery body once. */ @@ -123,8 +134,7 @@ export type WorkflowStepDef = | { type: "run"; workflow: WorkflowRefDef; - args?: string; - bareIdentifierArgs?: string[]; + args?: Arg[]; /** When set, capture step stdout into this variable name. */ captureName?: string; /** When set, execute asynchronously with implicit join before workflow completes. */ @@ -168,14 +178,14 @@ export type WorkflowStepDef = message: string; loc: SourceLoc; /** When set, log message comes from a managed inline-script call. */ - managed?: { kind: "run_inline_script"; body: string; lang?: string; args?: string; bareIdentifierArgs?: string[] }; + managed?: { kind: "run_inline_script"; body: string; lang?: string; args?: Arg[] }; } | { type: "logerr"; message: string; loc: SourceLoc; /** When set, logerr message comes from a managed inline-script call. */ - managed?: { kind: "run_inline_script"; body: string; lang?: string; args?: string; bareIdentifierArgs?: string[] }; + managed?: { kind: "run_inline_script"; body: string; lang?: string; args?: Arg[] }; } | { type: "send"; @@ -189,18 +199,17 @@ export type WorkflowStepDef = loc: SourceLoc; /** When set, return value comes from a managed run/ensure/match instead of the literal `value`. */ managed?: - | { kind: "run"; ref: WorkflowRefDef; args?: string; bareIdentifierArgs?: string[] } - | { kind: "ensure"; ref: RuleRefDef; args?: string; bareIdentifierArgs?: string[] } + | { kind: "run"; ref: WorkflowRefDef; args?: Arg[] } + | { kind: "ensure"; ref: RuleRefDef; args?: Arg[] } | { kind: "match"; match: MatchExprDef } - | { kind: "run_inline_script"; body: string; lang?: string; args?: string; bareIdentifierArgs?: string[] }; + | { kind: "run_inline_script"; body: string; lang?: string; args?: Arg[] }; } | { type: "run_inline_script"; body: string; /** Fence language tag (e.g. "node", "python3"). Maps to `#!/usr/bin/env `. */ lang?: string; - args?: string; - bareIdentifierArgs?: string[]; + args?: Arg[]; captureName?: string; loc: SourceLoc; } diff --git a/test-fixtures/compiler-txtar/valid.txt b/test-fixtures/compiler-txtar/valid.txt index 06dedd39..7da6c30c 100644 --- a/test-fixtures/compiler-txtar/valid.txt +++ b/test-fixtures/compiler-txtar/valid.txt @@ -288,7 +288,7 @@ workflow other_wf(a, b) { log "ok" } workflow default() { - run async other_wf("hello" "$x") + run async other_wf("hello", "$x") } === run async with qualified ref diff --git a/test-fixtures/golden-ast/expected/brace-if.json b/test-fixtures/golden-ast/expected/brace-if.json index aa70b932..b85639c0 100644 --- a/test-fixtures/golden-ast/expected/brace-if.json +++ b/test-fixtures/golden-ast/expected/brace-if.json @@ -42,9 +42,15 @@ }, "block": [ { - "args": "${err} \"error.log\"", - "bareIdentifierArgs": [ - "err" + "args": [ + { + "kind": "var", + "name": "err" + }, + { + "kind": "literal", + "raw": "\"error.log\"" + } ], "type": "run", "workflow": { From 27ad9280c48a87271a63fbcd13431f4c96665d4f Mon Sep 17 00:00:00 2001 From: Jakub Dzikowski Date: Sat, 16 May 2026 13:56:11 +0200 Subject: [PATCH 08/14] Refactor: collapse AST around a single Expr type MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Replace the three "managed call that yields a value" encodings — the `run` statement, the `run_capture` / `ensure_capture` / `prompt_capture` / `run_inline_script_capture` / `match_expr` ConstRhs branches, and the `managed:` sidecar on `return` / `log` / `logerr` (with placeholder strings like `"__match__"` and `"run inline_script"`) — with one tagged `Expr` union used everywhere a value can appear. `ConstRhs` and `SendRhsDef` are gone; the placeholder strings are gone; the sidecar field is gone. `WorkflowStepDef` collapses from 14 variants to 8 (`exec`, `const`, `return`, `send`, `say`, `if`, `for_lines`, `trivia`), with `exec` covering the prior `run` / `ensure` / `run_inline_script` / `prompt` / `shell` / standalone `match` statement shapes and `say` covering the prior `log` / `logerr` / `fail`. Parser, validator, formatter, emitter, runtime, and golden AST fixtures are migrated in lockstep; a new `src/types-shape.test.ts` enforces the acceptance criteria (no placeholder strings, exactly 8 step variants, no exported ConstRhs / SendRhsDef). Co-Authored-By: Claude Opus 4.7 (1M context) --- CHANGELOG.md | 1 + QUEUE.md | 43 - docs/architecture.md | 12 +- docs/contributing.md | 1 + src/cli/run/progress.test.ts | 1539 ++--------------- src/cli/run/progress.ts | 209 +-- src/format/emit.ts | 513 +++--- src/parse/arg-ast-shape.test.ts | 93 +- src/parse/const-rhs.ts | 50 +- src/parse/core.ts | 13 - src/parse/parse-bare-call.test.ts | 33 +- src/parse/parse-const-rhs.test.ts | 54 +- src/parse/parse-definitions.test.ts | 13 +- src/parse/parse-inline-script.test.ts | 41 +- src/parse/parse-metadata.test.ts | 2 +- src/parse/parse-prompt.test.ts | 120 +- src/parse/parse-return.test.ts | 173 +- src/parse/parse-run-async.test.ts | 80 +- src/parse/parse-send-rhs.test.ts | 136 +- src/parse/parse-steps.test.ts | 209 +-- src/parse/prompt.ts | 83 +- src/parse/rules.ts | 25 +- src/parse/send-rhs.ts | 32 +- src/parse/steps.ts | 235 ++- src/parse/trivia-ast-shape.test.ts | 61 +- src/parse/workflow-brace.ts | 255 ++- src/parse/workflows.ts | 10 +- src/runtime/kernel/node-workflow-runtime.ts | 406 +++-- .../compiler-edge.acceptance.test.ts | 18 +- src/transpile/compiler-golden.test.ts | 94 +- src/transpile/emit-script.ts | 59 +- src/transpile/validate-prompt-schema.test.ts | 36 +- src/transpile/validate-prompt-schema.ts | 15 +- src/transpile/validate.ts | 953 ++++------ src/types-shape.test.ts | 160 ++ src/types.ts | 175 +- .../golden-ast/expected/brace-if.json | 51 +- .../golden-ast/expected/imports.json | 20 +- test-fixtures/golden-ast/expected/log.json | 24 +- .../golden-ast/expected/match-multiline.json | 11 +- test-fixtures/golden-ast/expected/match.json | 11 +- test-fixtures/golden-ast/expected/params.json | 19 +- .../golden-ast/expected/prompt-capture.json | 10 +- .../golden-ast/expected/run-ensure.json | 39 +- .../golden-ast/expected/script-defs.json | 22 +- 45 files changed, 2333 insertions(+), 3826 deletions(-) create mode 100644 src/types-shape.test.ts diff --git a/CHANGELOG.md b/CHANGELOG.md index f1a6209d..4ada53d9 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,6 @@ # Unreleased +- **Refactor — Collapse the AST around a single `Expr` type, eliminating the three "managed call" encodings:** The same concept "a managed call that yields a value" used to be encoded three different ways: as a statement (`{ type: "run", workflow, args }`), as a const RHS (`{ kind: "run_capture", ref, args }`), and as a `managed:` sidecar on `return` / `log` / `logerr` whose `value` / `message` carried a placeholder string (`"__match__"`, `"run inline_script"`, etc.). Inline scripts added a fourth (`run_inline_script_capture`); `prompt`, `match`, and `ensure` captures repeated the same dual representation. The validator, formatter, emitter, and runtime each had to handle both branches at every site. All three encodings are gone. The semantic AST now has a single `Expr` tagged union — `literal | call | ensure_call | inline_script | prompt | match | shell | bare_ref` — used everywhere a value can appear: `const name = `, `return `, `send channel <- `, the message of `log` / `logerr` / `fail`, and the body of an `exec` step (the new statement-form managed call, where the value is consumed for its side effects plus optional capture). `ConstRhs` and `SendRhsDef` are deleted as separate types. The `managed:` sidecar field is deleted from `WorkflowStepDef`. The placeholder strings `"__match__"`, `"run inline_script"`, and `"__JAIPH_MANAGED__"` no longer appear anywhere under `src/`. `WorkflowStepDef` collapses from 14 variants to **8** (`exec`, `const`, `return`, `send`, `say`, `if`, `for_lines`, `trivia`): `exec` is the new managed-statement form covering the prior `run` / `ensure` / `run_inline_script` / `prompt` / `shell` / standalone `match` cases (the discriminator now lives inside `body.kind`, with `captureName` / `catch` / `recover` as step-level attributes); `say` covers the prior `log` / `logerr` / `fail` cases (`level: "fail"` aborts the workflow with the message, otherwise the message is written to the corresponding stream); `comment` / `blank_line` collapse into a single `trivia` variant (formatter-only, skipped by validator and runtime). The parser builds `Expr` nodes directly: `parseConstRhs` returns `{ value: Expr }`; `parseSendRhs` returns `{ value: Expr }`; `parsePromptStep` returns an `exec` step whose `body` is an `Expr.prompt`; `return run …` / `return ensure …` / `return match …` / `return run \`…\`(…)` build `Expr.call` / `Expr.ensure_call` / `Expr.match` / `Expr.inline_script` directly with no sidecar; `log run \`…\`(…)` and `logerr run \`…\`(…)` build `say` steps whose `message` is an `Expr.inline_script`. Downstream consumers compress accordingly: the validator switches on the 8-variant `WorkflowStepDef.type` and the 8-kind `Expr.kind` with no "literal value vs managed sidecar" fork; the formatter renders each `Expr` through one `emitExpr` helper instead of branching on a sidecar; the runtime has one private `evaluateExpr(scope, expr, …)` dispatcher that `const` / `return` / `send` / `say` / `exec` all delegate to (which runs the managed call for `call` / `ensure_call` / `inline_script`, walks `match` arms, schema-checks `prompt`, and interpolates `literal` via `interpolateWithCaptures`); the script-emit walk in `src/transpile/emit-script.ts` finds inline-script bodies by recursing into each step's `Expr` payload rather than enumerating the four legacy carriers. New tests pin the invariants: `src/types-shape.test.ts` is a compile-time exhaustive `switch` plus runtime tuple assertion that `WorkflowStepDef` has exactly **8** variants and `Expr` has exactly **8** kinds (AC2), a `grep` over every non-test `.ts` file under `src/` that fails if any of the placeholder strings (`"__match__"`, `"run inline_script"`, `"__JAIPH_MANAGED__"`) reappear (AC1), and an export-surface check that fails if `ConstRhs` or `SendRhsDef` are re-exported from `src/types.ts` (AC3). Updated parser tests in `src/parse/parse-return.test.ts`, `src/parse/parse-const-rhs.test.ts`, `src/parse/parse-prompt.test.ts`, `src/parse/parse-send-rhs.test.ts`, `src/parse/parse-steps.test.ts`, `src/parse/parse-inline-script.test.ts`, and `src/parse/parse-bare-call.test.ts` assert the new `Expr` shape directly for `return run …`, `return ensure …`, `return match … { … }`, `return run \`…\`(…)`, `log run \`…\`(…)`, and `const x = prompt …` (AC4). The golden corpus (`src/transpile/compiler-golden.test.ts`, `src/transpile/compiler-edge.acceptance.test.ts`) passes byte-for-byte against the emitted bash output; `src/format/roundtrip.test.ts` round-trips bit-for-bit on every fixture; `npm run build` passes with zero TypeScript strict-mode errors (AC5 / AC6). Golden AST fixtures under `test-fixtures/golden-ast/expected/` are regenerated to reflect the new step shapes (`exec` wrapping every managed call, `say` replacing `log` / `logerr` / `fail`, `trivia` replacing `comment` / `blank_line`, `Expr` value/message/body payloads). User-visible contracts — CLI behavior, `jaiph format` round-trip, run artifacts, banner, hooks, exit codes, `__JAIPH_EVENT__` streaming, and the full golden corpus — are unchanged byte-for-byte. Out of scope: surface syntax, the validator's deeper structural rewrite (Refactor 4), and parser internals (Refactors 1 & 2). Docs updated in `docs/architecture.md` (rewrote the **AST / Types** bullet to describe the single `Expr` sum and the 8-variant `WorkflowStepDef`; updated **Validator**, **Formatter**, **Node Workflow Runtime**, and **Trivia / CST layer** bullets to drop the dual-representation language; rewrote the `match_expr` mention in **CLI progress reporting pipeline** to use `Expr.kind === "match"`) and `docs/contributing.md` (new **`Expr` / step-variant shape** row in the test-layer table). Implements `design/2026-05-15-parser-compiler-simplification.md` § Refactor 3. - **Refactor — Collapse `bareIdentifierArgs` into a typed `Arg[]` on every call site:** Every call-bearing AST node used to carry the call arguments twice — `args: string` (the raw source between the parens) and `bareIdentifierArgs?: string[]` (a re-parse of which of those arguments happened to be bare identifiers). The validator had to remember to check both fields and call a hand-rolled `validateBareIdentifierArgs` helper at every site; the emitter re-parsed `args` from scratch because it didn't trust either field on its own. Both fields are gone. The parser now classifies each argument once, at parse time, into a new typed sum `type Arg = { kind: "literal"; raw: string } | { kind: "var"; name: string }` and stores it on every call-bearing node as `args?: Arg[]`. Affected nodes: `run` / `ensure` workflow steps, `run_inline_script` steps, the `managed` sidecar on `return` / `log` / `logerr` (in all four shapes — `run`, `ensure`, `run_inline_script`, `match`), the `run_capture` / `ensure_capture` / `run_inline_script_capture` const RHS variants, and the `run` send RHS. Downstream consumers walk the typed list directly: the validator's per-call check sequence is now arity (`args.length`), shell-redirection rejection on `literal` raws, nested-unmanaged-call rejection on `literal` raws, ref resolution, and `var`-arg resolution against in-scope bindings via a new `validateArgVarRefs` (the standalone `validateBareIdentifierArgs` helper is deleted); the formatter renders each `Arg` directly (`var` → bare name, `literal` → raw) instead of re-tokenizing a `${ident}`-rewritten string; the runtime turns `Arg[]` back into the space-separated argv string via `argsToRuntimeString` in `src/parse/core.ts` (`var` → `${name}`, `literal` → raw) so the existing handle-resolution / interpolation path is unchanged. New tests pin the invariants: `src/parse/arg-ast-shape.test.ts` is a compile-time assertion that `bareIdentifierArgs` does not appear on `WorkflowStepDef` (`ensure`, `run`, `run_inline_script`, `log.managed`, `logerr.managed`, `return.managed` in `run` / `ensure` / `run_inline_script` shapes), `ConstRhs` (`run_capture`, `ensure_capture`, `run_inline_script_capture`), or the `run` `SendRhsDef` variant (AC1); `src/parse/arg-grep.test.ts` walks every non-test `.ts` under `src/parse/` and `src/transpile/` and fails if any production file matches `args.split(",")` or the bare token `bareIdentifierArgs` (AC2), and separately fails if any file under `src/transpile/` references `validateBareIdentifierArgs` (AC3). The golden compiler corpus, `validate-*.test.ts` files, and the golden AST corpus pass byte-for-byte (AC4); `npm run build` passes with zero TypeScript strict-mode errors (AC5). User-visible contracts — CLI behavior, `jaiph format` round-trip, run artifacts, banner, hooks, exit codes, `__JAIPH_EVENT__` streaming, and the full golden corpus — are unchanged. Out of scope: the full `Expr` collapse (next task) and surface syntax. Docs updated in `docs/architecture.md` (extended **AST / Types** bullet documenting the typed `Arg` sum; updated **Validator** and **Formatter** bullets to drop the dual representation), `docs/contributing.md` (new **Call-args AST shape** row in the test-layer table), and `docs/spec-async-handles.md` (replaces the stale `commaArgsToSpaced` reference with `argsToRuntimeString`). Implements `design/2026-05-15-parser-compiler-simplification.md` § Appendix D. - **Refactor — Split source-fidelity data from the semantic AST into a `Trivia` (CST) layer:** Around ten fields whose only consumer was the formatter — `leadingComments` on imports / script imports / channels / `const` decls / `test` blocks, `configLeadingComments`, `trailingTopLevelComments`, `configBodySequence` (both module- and workflow-scoped), `topLevelOrder`, `bareSource` on `return`, the `tripleQuoted` flags on `literal` / `return` / `log` / `logerr` / `fail` / `send` / `const`, and the prompt / script `bodyKind` / `bodyIdentifier` discriminators — are removed from `jaiphModule`, `WorkflowStepDef`, `ConstRhs`, `SendRhsDef`, `WorkflowMetadata`, `ImportDef`, `ScriptImportDef`, `ChannelDef`, `ScriptDef`, and `TestBlockDef`, and re-homed in a new parallel `Trivia` store (`src/parse/trivia.ts`) keyed by AST-node identity (per-node `WeakMap`) plus a small `ModuleTrivia` record for module-level data. The parser exposes `parsejaiphWithTrivia(source, filePath) → { ast, trivia }`; the legacy `parsejaiph(source, filePath)` is now a thin wrapper that drops trivia for callers that don't care (validator, transpiler, runtime, `loadModuleGraph`). The formatter (`emitModule(ast, trivia, opts?)`) is the only consumer of `Trivia`; validator, emitter, transpiler, and runtime never import from `src/parse/trivia.ts`. New tests pin the invariants: `src/parse/trivia-ast-shape.test.ts` is a compile-time assertion (with runtime echo) that none of the listed fields reappear on any semantic AST type (AC1); `src/parse/trivia-grep.test.ts` greps validator and emitter source files and fails if any of them references `Trivia` / `createTrivia` / `NodeTrivia` / `ModuleTrivia` or imports from `parse/trivia` (AC2); `src/format/roundtrip.test.ts` walks every `.jh` under `examples/` and `test-fixtures/golden-ast/fixtures/` and asserts `parse → format → parse → format` converges bit-for-bit (AC3). Golden AST fixtures under `test-fixtures/golden-ast/expected/` are regenerated to drop the moved fields. User-visible contracts (CLI behavior, `jaiph format` round-trip, run artifacts, banner, hooks, exit codes, `__JAIPH_EVENT__` streaming) are unchanged. `npm test` and `npm run build` pass with zero TypeScript strict-mode errors (AC4 / AC5). Out of scope: the `Expr` collapse — this refactor only relocates source-fidelity fields without changing the semantic AST's shape. Docs updated in `docs/architecture.md` (new **Trivia / CST layer** section with anchor `#trivia-cst-layer`, plus updated **Parser**, **AST / Types**, and **Formatter** bullets) and `docs/contributing.md` (new row in the test-layer table). Implements `design/2026-05-15-parser-compiler-simplification.md` § Appendix A. - **Refactor — `ModuleGraph` is the single representation of "all `.jh` modules reachable from an entry point, parsed once":** The previous three traversal strategies for compile-time module discovery (validator re-reading imports through `ValidateContext`, `emitScriptsForModule` re-wrapping the same callbacks with an optional `prep` cache, and `buildScripts` walking the filesystem directly) collapse to one path. `parsejaiph(source, filePath)` is now strictly I/O-pure — it can no longer reach `fs`. The single discovery routine `loadModuleGraph(entry, workspaceRoot?)` (`src/transpile/module-graph.ts`) walks the entry plus its transitive `import` closure and returns `{ entryFile, workspaceRoot?, modules: Map }`; every other compile-time consumer takes the graph and never re-reads `.jh` from disk. `validateReferences(graph)` and `emitScriptsForModuleFromGraph(graph, file, rootDir)` operate entirely in-memory. The `ValidateContext` interface (`resolveImportPath` / `existsSync` / `readFile` / `parse` / `workspaceRoot` callbacks) is deleted from `src/transpile/validate.ts`; the validator consumes the graph and uses `existsSync` only to resolve `import script` paths (non-`.jh` bodies). `CompilePrep` / `prepareCompile` / `writeCompilePrep` / `readCompilePrep` and the optional `prep?` parameter on `emitScriptsForModule` / `buildScripts` are gone; `buildScripts(input, outDir, ws?)` now loads a graph internally and `buildScriptsFromGraph(graph, outDir, rootDir?)` is the entry point for callers that already loaded one. `buildRuntimeGraph` accepts either an entry file path (legacy) or an already-loaded `ModuleGraph` — `RuntimeGraph` is a type alias for `ModuleGraph` (the only "all reachable modules" representation in the codebase). The cross-process cache file moves to `/.jaiph-module-graph.json` (deterministic JSON: entries sorted by absolute path, ASTs included verbatim) via `writeModuleGraph` / `readModuleGraph`, and the internal env var the spawned `node-workflow-runner.js` reads is renamed `JAIPH_MODULE_GRAPH_FILE` (replacing `JAIPH_COMPILE_PREP_FILE`). Scope of the env-var hand-off is unchanged: set only for the default local non-Docker `jaiph run` path; `jaiph run --raw`, `jaiph test`, and Docker launches fall back to `loadModuleGraph` from the source file. User-visible contracts — banner, hooks, run artifacts, `run_summary.jsonl`, `return_value.txt`, exit codes, `__JAIPH_EVENT__` streaming, CLI usage, and the full golden corpus (`compiler-golden.test.ts`, `compiler-edge.acceptance.test.ts`) — are unchanged byte-for-byte. New tests (`src/transpile/module-graph.test.ts`, `src/transpile/pipeline-io-purity.test.ts`) stub `node:fs` to throw on any `.jh` read and run the full pipeline against `test-fixtures/` to pin the I/O-purity invariant; another test instruments `parsejaiph` with a call counter to assert no duplicate parses across `loadModuleGraph` → `validateReferences` → `emit` → `buildRuntimeGraph` for fixtures with transitive imports. `src/transpile/compile-prep.ts` and `compile-prep.test.ts` are removed. Docs updated in `docs/architecture.md`, `docs/cli.md`, and `docs/testing.md`. Implements `design/2026-05-15-parser-compiler-simplification.md` § Refactor 5. diff --git a/QUEUE.md b/QUEUE.md index 911ae667..51278e3e 100644 --- a/QUEUE.md +++ b/QUEUE.md @@ -13,49 +13,6 @@ Process rules: *** -## Collapse the AST around a single `Expr` type, eliminating the three "managed call" encodings #dev-ready - -**Design reference:** `design/2026-05-15-parser-compiler-simplification.md` § Refactor 3. - -**Why:** The concept "a managed call that yields a value" is encoded three different ways in `src/types.ts`: as a statement (`{ type: "run", workflow, args }`), as a const RHS (`{ kind: "run_capture", ref, args }`), and as a `managed:` sidecar on `return`/`log`/`logerr` with a placeholder string (e.g. `value: "__match__"`, `value: "run inline_script"`). Inline scripts add a fourth (`run_inline_script_capture`). The same is true for `prompt`, `match`, and `ensure` captures. Validator, formatter, and emitter all have to know about the dual representation. - -**Scope:** - -- Introduce a single `Expr` sum type (or equivalent) used everywhere a value can appear: - - ```ts - type Expr = - | { kind: "literal"; raw: string } - | { kind: "var"; name: string; field?: string } - | { kind: "call"; callee: Ref; args: Arg[] } - | { kind: "ensure_call"; callee: Ref; args: Arg[] } - | { kind: "inline_script"; lang?: string; body: string; args?: Arg[] } - | { kind: "prompt"; body: Expr; returns?: Schema } - | { kind: "match"; subject: Expr; arms: MatchArm[] }; - ``` - -- Replace `ConstRhs` with `Expr`. -- Replace `SendRhsDef` with `Expr` (plus the channel arrow itself). -- `ReturnStep`, `LogStep`, `LogerrStep` become `{ value | message: Expr }`. The placeholder strings `"__match__"`, `"run inline_script"`, etc. are deleted. -- The `managed:` sidecar field is deleted from `WorkflowStepDef`. -- `WorkflowStepDef` ends up with ~7 variants (down from 14). -- All references to the deleted shapes in parser, validator, emitter, and formatter are migrated. - -**Acceptance criteria** (each verified by a test): - -1. The string literals `"__match__"`, `"run inline_script"`, and any other AST placeholder strings are absent from `src/`. Add a meta-test (e.g. a `grep` test) that fails if any reappear. -2. `WorkflowStepDef` has at most 8 variants. Add a type-level test (e.g. an exhaustive `switch` in a compile-time assertion file) that fails if a new variant is silently added. -3. `ConstRhs` and `SendRhsDef` are deleted as separate types; their fields are reachable via `Expr`. A test asserting the export surface of `src/types.ts` fails when those symbols reappear. -4. Every existing parser path that produced a `managed:` sidecar now produces an `Expr` node, and a new parser test asserts the AST shape directly for `return run …`, `return ensure …`, `return match … { … }`, `return run \`…\`(…)`, `log run \`…\`(…)`, and `const x = prompt …`. -5. `npm test` passes. The golden corpus (`compiler-golden.test.ts`, `compiler-edge.acceptance.test.ts`) passes byte-for-byte against emitted bash output. The formatter round-trip tests pass byte-for-byte against source. -6. `npm run build` passes; TypeScript strict-mode errors are zero. - -**Out of scope:** surface syntax, the validator's structural rewrite (Refactor 4), parser internals (Refactors 1 & 2). This refactor is purely an AST + producer/consumer migration. - -**Dependency:** The Trivia/CST split and `Arg[]` collapse (two previous tasks) should be complete first so the new `Expr` shape is designed against the semantic core only. - -*** - ## Fold the validator's pre-passes (knownVars / promptSchemas / immutableBindings) into a single workflow walk #dev-ready **Design reference:** `design/2026-05-15-parser-compiler-simplification.md` § Appendix C. diff --git a/docs/architecture.md b/docs/architecture.md index 13b8764a..1ccd06d8 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -41,15 +41,18 @@ All orchestration — local `jaiph run`, `jaiph test`, and **Docker `jaiph run`* - **AST / Types (`src/types.ts`)** - Shared compile-time schema (`jaiphModule`, step defs, test defs, hook payload types). The semantic AST carries **only** what the validator, emitter, transpiler, and runtime need; surface-form data that exists purely to round-trip the formatter (leading comments on imports / channels / `const` / `test` blocks, top-level emit order, `config` body sequence, `"""..."""` flags on `literal` / `return` / `log` / `logerr` / `fail` / `send` / `const`, the `bareSource` of `return `, and prompt / script `bodyKind` discriminators) lives in **`Trivia`** instead — see [Trivia (CST layer)](#trivia-cst-layer). - - **Call arguments are a typed sum.** Every call-bearing node — `run` / `ensure` steps and the `managed` sidecar on `return` / `log` / `logerr`, `run_capture` / `ensure_capture` / `run_inline_script_capture` const RHS, the `run` send RHS, and the `run_inline_script` step — carries `args?: Arg[]` where `Arg = { kind: "literal"; raw: string } | { kind: "var"; name: string }`. The parser classifies each argument once (a bare in-scope-style identifier becomes `var`; everything else — quoted strings, `${…}` interpolations, nested `run …` / `ensure …` calls, inline-script bodies — is stored verbatim as `literal`). There is no separate `args: string` text payload or shadow `bareIdentifierArgs: string[]` field, and no downstream consumer re-parses call arguments: the validator walks the typed list to enforce arity, reject nested unmanaged calls inside literals, and resolve `var` refs against in-scope bindings; the emitter renders by mapping each `Arg` to its source form; the runtime turns `Arg[]` back into a runtime string via `argsToRuntimeString` (`var` → `${name}`, `literal` → raw) so the existing handle-resolution / interpolation path is unchanged. + - **One `Expr` for every value position.** Anywhere a value can appear — `const name = …`, `return …`, `send channel <- …`, `log` / `logerr` / `fail` arguments, and the body of an `exec` statement — the AST stores a single tagged union: `Expr = literal | call | ensure_call | inline_script | prompt | match | shell | bare_ref`. There is **no longer** a separate `ConstRhs` union, `SendRhsDef` union, or `managed:` sidecar on `return` / `log` / `logerr` (the placeholder strings `"__match__"` / `"run inline_script"` / `"__JAIPH_MANAGED__"` are gone too — a meta-test in `src/types-shape.test.ts` fails if any reappear under `src/`). The eight `Expr` kinds: `literal` (verbatim source text — quoted string, `$var` / `${var}` form, or post-dedent triple-quoted body), `call` (managed workflow/script call; `async: true` for `run async ref(...)` capture position), `ensure_call` (managed rule call), `inline_script` (`` `body`(args) `` or fenced), `prompt` (carries the JSON-quoted body and optional flat `returns` schema), `match` (a `match { ... }` evaluated for its value), `shell` (raw shell fragment used as a managed substitution on the send RHS), and `bare_ref` (bare symbol on a send RHS — always rejected by the validator, preserved so the error message can name the symbol). + - **Eight `WorkflowStepDef` variants** (down from fourteen): `exec` (side-effecting managed call statement — was `run` / `ensure` / `run_inline_script` / `prompt` / standalone `match` / inline `shell`; the discriminator now lives inside `body.kind`, with `captureName` / `catch` / `recover` as step-level attributes); `const`, `return`, `send` (bind, propagate, or emit an `Expr`); `say` (was `log` / `logerr` / `fail` — `level: "fail"` aborts the workflow with the message, otherwise the message is written to the corresponding stream); `if` / `for_lines` (control flow, unchanged shape); `trivia` (formatter-only `comment` / `blank_line` slots — skipped by the runtime and validator). A type-level exhaustive `switch` in `src/types-shape.test.ts` pins both the step count at **8** and the `Expr` kind count at **8**. + - **Call arguments are a typed sum.** Every call-bearing `Expr` (`call`, `ensure_call`, `inline_script`) carries `args?: Arg[]` where `Arg = { kind: "literal"; raw: string } | { kind: "var"; name: string }`. The parser classifies each argument once (a bare in-scope-style identifier becomes `var`; everything else — quoted strings, `${…}` interpolations, nested `run …` / `ensure …` calls, inline-script bodies — is stored verbatim as `literal`). There is no separate `args: string` text payload or shadow `bareIdentifierArgs: string[]` field, and no downstream consumer re-parses call arguments: the validator walks the typed list to enforce arity, reject nested unmanaged calls inside literals, and resolve `var` refs against in-scope bindings; the emitter renders by mapping each `Arg` to its source form; the runtime turns `Arg[]` back into a runtime string via `argsToRuntimeString` (`var` → `${name}`, `literal` → raw) so the existing handle-resolution / interpolation path is unchanged. - **Trivia / CST layer (`src/parse/trivia.ts`)** {: #trivia-cst-layer} - `Trivia` is a parallel store keyed by AST-node identity (per-node via `WeakMap`) and a small `ModuleTrivia` record for module-level data. The parser builds it alongside the AST; **only the formatter reads it**. Validator, emitter, transpiler, and runtime never import from `src/parse/trivia.ts` — a grep test (`src/parse/trivia-grep.test.ts`) pins this invariant by rejecting any reference to `Trivia` / `createTrivia` / `NodeTrivia` / `ModuleTrivia` from validator and emitter source files. - - A separate type-shape test (`src/parse/trivia-ast-shape.test.ts`) asserts at compile time that none of the formatter-only fields reappear on `jaiphModule`, `ImportDef`, `ScriptImportDef`, `ChannelDef`, `TestBlockDef`, `WorkflowMetadata`, `ScriptDef`, or any `WorkflowStepDef` / `ConstRhs` / `SendRhsDef` variant. + - A separate type-shape test (`src/parse/trivia-ast-shape.test.ts`) asserts at compile time that none of the formatter-only fields reappear on `jaiphModule`, `ImportDef`, `ScriptImportDef`, `ChannelDef`, `TestBlockDef`, `WorkflowMetadata`, `ScriptDef`, or any `WorkflowStepDef` / `Expr` variant. (`ConstRhs` / `SendRhsDef` no longer exist — their fields live inside `Expr` — and `src/types-shape.test.ts` fails if those symbols reappear as exports of `src/types.ts`.) - **Validator (`src/transpile/validate.ts`)** - Resolves imports and symbol references; emits deterministic compile-time errors. Import resolution (`resolveImportPath` in `transpile/resolve.ts`) checks relative paths first, then falls back to project-scoped libraries under `/.jaiph/libs/` — the workspace root is threaded through all compilation call sites. Export visibility is enforced by `validateRef` in `validate-ref-resolution.ts`: if an imported module declares any `export`, only exported names are reachable through the import alias. + - The validator drives off `WorkflowStepDef.type` (8 variants) and `Expr.kind` (8 variants). For every value-bearing step (`const` / `return` / `send` / `say`) and for the body of every `exec` step, a single `validateExpr(expr, ...)` dispatcher handles the value: it routes `call` / `ensure_call` / `inline_script` to call-site validation, walks `match` arms, schema-checks `prompt`, and runs the substitution scanner on `literal` raws. There is no dual code path for "managed sidecar vs literal value" — that branch is gone. - Per call site the validator runs five checks against the typed **`Arg[]`** directly — shell-redirection rejection (only `literal` args are scanned), nested-unmanaged-call rejection inside `literal` raws, ref resolution, arity (`args.length` vs declared params), and `var`-arg resolution against in-scope bindings via `validateArgVarRefs`. There is no longer a separate `validateBareIdentifierArgs` helper, and no place re-parses an `args: string` payload by splitting on commas or rescanning quotes. - **Transpiler (`src/transpiler.ts`, `src/transpile/*`)** @@ -58,6 +61,7 @@ All orchestration — local `jaiph run`, `jaiph test`, and **Docker `jaiph run`* - **Node Workflow Runtime (`src/runtime/kernel/node-workflow-runtime.ts`)** - `NodeWorkflowRuntime` interprets the AST directly: walks workflow steps, manages scope/variables, delegates prompt and script execution to kernel helpers, handles channels/inbox/dispatch, owns the frame stack and heartbeat, and writes run artifacts. + - One private `evaluateExpr(scope, expr, …)` dispatcher handles every value position — `const` / `return` / `send` / `say` step handlers and the body of every `exec` step delegate to it. It switches on `Expr.kind` to run the managed call (`call` / `ensure_call` / `inline_script`) or `prompt`, walks a `match` expression, or interpolates a `literal` value through `interpolateWithCaptures`. There is no fan-out across "managed sidecar vs literal value" because that branch is gone from the AST. - Three sibling modules under `src/runtime/kernel/` carry concerns that used to live inline in the runtime file. Dependency direction is one-way (orchestrator → helpers/emitter/mock); no circular imports back. - **`runtime-arg-parser.ts`** — stateless interpolation and call-argument parsing (`interpolate`, `parseInlineCaptureCall`, `commaArgsToInterpolated`, `parseArgsRaw`, `parseInlineScriptAt`, `parseManagedArgAt`, `parseArgTokens`, `stripOuterQuotes`, `parsePromptSchema`, `sanitizeName`, `nowIso`) plus shared constants and the `ParsedArgToken` / `PromptSchemaField` types. Direct unit tests live in `runtime-arg-parser.test.ts`. - **`runtime-event-emitter.ts`** — `RuntimeEventEmitter` owns **`__JAIPH_EVENT__`** writes on stderr (step/log traffic when not suppressed), **`run_summary.jsonl`** appends for the wider timeline (including workflow/prompt records that are summary-first), plus step/prompt sequence counters. Constructed with `{ runId, runDir, env, getFrameStack, getAsyncIndices, suppressLiveEvents? }`; the runtime delegates structured emission to it. The optional `suppressLiveEvents` flag (forwarded from `NodeWorkflowRuntime`'s `suppressLiveEvents` option) skips the live stderr **`__JAIPH_EVENT__`** lines while **`appendRunSummaryLine`** keeps updating **`run_summary.jsonl`** — used by in-process callers like the test runner that share stderr with `node --test` reporter output. The CLI's spawned `node-workflow-runner` child does not set it, so production runs stream events to stderr as before. @@ -71,7 +75,7 @@ All orchestration — local `jaiph run`, `jaiph test`, and **Docker `jaiph run`* - Prompt execution (`prompt.ts`), streaming parse (`stream-parser.ts`), schema (`schema.ts`), **`mock.ts`** (sequential prompt responses / mock-arm dispatch from test env JSON), **`runtime-mock.ts`** (mock workflow/rule/script **bodies** for `*.test.jh`), **`emit.ts`** (durable **`run_summary.jsonl`** helpers — `appendRunSummaryLine`, `formatUtcTimestamp` — consumed by `RuntimeEventEmitter`), **`workflow-launch.ts`** (spawn contract). **`RuntimeEventEmitter`** (`runtime-event-emitter.ts`) owns live **`__JAIPH_EVENT__`** lines on stderr and coordinates summary writes plus step/prompt sequence counters. Script subprocesses are launched directly from `NodeWorkflowRuntime`. - **Formatter (`src/format/emit.ts`)** - - `jaiph format` rewrites `.jh` / `.test.jh` files into canonical style. `emitModule(ast, trivia, opts?)` reads the semantic AST together with the parallel **`Trivia`** store ([Trivia (CST layer)](#trivia-cst-layer)) to round-trip leading comments, top-level order, `config` body sequence, `"""..."""` and `bareSource` forms, and prompt / script body discriminators. Call arguments render straight off the typed `Arg[]` — `var` → bare name, `literal` → raw — so the formatter no longer re-parses any args string or consults a `bareIdentifierArgs` shadow field. Pure data→text emitter; no side-effects beyond file writes. Round-trip is bit-for-bit on every fixture under `examples/` and `test-fixtures/golden-ast/fixtures/` — pinned by `src/format/roundtrip.test.ts`, which asserts `parse → format → parse → format` converges in one step on every fixture. + - `jaiph format` rewrites `.jh` / `.test.jh` files into canonical style. `emitModule(ast, trivia, opts?)` reads the semantic AST together with the parallel **`Trivia`** store ([Trivia (CST layer)](#trivia-cst-layer)) to round-trip leading comments, top-level order, `config` body sequence, `"""..."""` and `bareSource` forms, and prompt / script body discriminators. Step emission switches on `WorkflowStepDef.type` (8 variants) and an `emitExpr` helper switches on `Expr.kind` (8 kinds) — there are no dual code paths for "managed sidecar vs literal value" because that branch was removed from the AST. Call arguments render straight off the typed `Arg[]` — `var` → bare name, `literal` → raw — so the formatter no longer re-parses any args string or consults a `bareIdentifierArgs` shadow field. Pure data→text emitter; no side-effects beyond file writes. Round-trip is bit-for-bit on every fixture under `examples/` and `test-fixtures/golden-ast/fixtures/` — pinned by `src/format/roundtrip.test.ts`, which asserts `parse → format → parse → format` converges in one step on every fixture. - **Docker runtime helper (`src/runtime/docker.ts`)** - Parses mount specs, resolves Docker config (image, network, timeout), and builds the `docker run` invocation when the CLI enables **Docker sandboxing** for `jaiph run` (environment-driven; there is no `jaiph run --docker` flag — see [Sandboxing](sandboxing.md)). The container runs the same `node-workflow-runner` entry as local execution. The default image is the official `ghcr.io/jaiphlang/jaiph-runtime` GHCR image; every selected image must already contain `jaiph` (no auto-install or derived-image build at runtime). Image preparation (`prepareImage`) runs before the CLI banner: it checks whether the image is local, pulls with `--quiet` if needed (short status lines on stderr instead of Docker’s default pull UI), and verifies that `jaiph` exists in the image. `spawnDockerProcess` does not pull or verify — it receives a pre-resolved image. The spawn call uses `stdio: ["ignore", "pipe", "pipe"]` — stdin is ignored so the Docker CLI does not block on stdin EOF, which would stall event streaming and hang the host CLI after the container exits. @@ -152,7 +156,7 @@ Authoring rules, fixtures, and mock syntax for `*.test.jh` are documented in [Te ## CLI progress reporting pipeline -The progress UI combines a **static** step tree derived from the workflow AST (`src/cli/run/progress.ts`) with **live** updates from the runtime event stream. Event wiring: `src/cli/run/events.ts` and `src/cli/run/stderr-handler.ts` parse `__JAIPH_EVENT__` lines; `src/cli/run/emitter.ts` bridges into the renderer. Line-oriented formatting (`formatStartLine`, `formatHeartbeatLine`, `formatCompletedLine`) lives primarily in `src/cli/run/display.ts`, which shares some display helpers with `progress.ts`. Async branch numbering (subscript ₁₂₃… prefixes) is driven by `async_indices` on step and log events — the runtime propagates a chain of 1-based branch indices through `AsyncLocalStorage`, and the stderr handler renders them at the appropriate indent level. `const` steps whose value is a `match_expr` are walked for nested `run`/`ensure` arms; matched targets appear as child items in the step tree (e.g. `▸ script safe_name` under the `const` row). This pipeline does not apply to **`jaiph run --raw`**. +The progress UI combines a **static** step tree derived from the workflow AST (`src/cli/run/progress.ts`) with **live** updates from the runtime event stream. Event wiring: `src/cli/run/events.ts` and `src/cli/run/stderr-handler.ts` parse `__JAIPH_EVENT__` lines; `src/cli/run/emitter.ts` bridges into the renderer. Line-oriented formatting (`formatStartLine`, `formatHeartbeatLine`, `formatCompletedLine`) lives primarily in `src/cli/run/display.ts`, which shares some display helpers with `progress.ts`. Async branch numbering (subscript ₁₂₃… prefixes) is driven by `async_indices` on step and log events — the runtime propagates a chain of 1-based branch indices through `AsyncLocalStorage`, and the stderr handler renders them at the appropriate indent level. `const` steps whose `Expr` value is `kind: "match"` are walked for nested `run` / `ensure` arms; matched targets appear as child items in the step tree (e.g. `▸ script safe_name` under the `const` row). This pipeline does not apply to **`jaiph run --raw`**. ## Distribution: Node vs Bun standalone diff --git a/docs/contributing.md b/docs/contributing.md index 0bb1a9d8..1b48ab71 100644 --- a/docs/contributing.md +++ b/docs/contributing.md @@ -104,6 +104,7 @@ Jaiph uses several test layers. Each layer catches a different class of bug. Use | **Compiler golden tests** | `src/transpile/compiler-golden.test.ts` (colocated) | Regressions in the parser, validation messages, and scripts-only extraction (`buildScriptFiles` in `emit-script.ts`) — expectations are inline in the test file | You changed the parser, validator, or script extraction and need to lock an exact error string, extracted script shape, or corpus behavior | | **Trivia / formatter round-trip** | `src/parse/trivia-ast-shape.test.ts`, `src/parse/trivia-grep.test.ts`, `src/format/roundtrip.test.ts` | Source-fidelity invariants: no trivia fields on semantic AST types (compile-time), validator/emitter sources do not reference `Trivia`, and `parse → format → parse → format` is bit-for-bit on every fixture under `examples/` and `test-fixtures/golden-ast/fixtures/` | You changed the parser, formatter, AST types, or anything that touches source-fidelity round-trip (see [Architecture — Trivia (CST layer)](architecture.md#trivia-cst-layer)) | | **Call-args AST shape** | `src/parse/arg-ast-shape.test.ts`, `src/parse/arg-grep.test.ts` | Pins the typed-`Arg[]` invariant: no `bareIdentifierArgs` field on any call-bearing AST type (compile-time), no `args.split(",")` or `bareIdentifierArgs` text in production `src/parse/` or `src/transpile/` sources, and no `validateBareIdentifierArgs` helper in the validator | You changed how call arguments flow through the parser, validator, or emitter and need to confirm nothing re-introduces the parallel raw-string representation (see [Architecture — AST / Types](architecture.md#core-components)) | +| **`Expr` / step-variant shape** | `src/types-shape.test.ts` | Pins the collapsed-AST invariants from the `Expr` refactor: exactly 8 `WorkflowStepDef` variants and exactly 8 `Expr` kinds (compile-time exhaustive switch + runtime tuple count), no AST placeholder strings (`"__match__"`, `"run inline_script"`, `"__JAIPH_MANAGED__"`) anywhere under `src/`, and `ConstRhs` / `SendRhsDef` no longer exported from `src/types.ts` | You added or renamed a step variant or `Expr` kind, or reshuffled how value positions are encoded; rerun this test to confirm nothing re-introduces the three-way "managed call" encoding (see [Architecture — AST / Types](architecture.md#core-components)) | | **Compiler tests (txtar)** | `test-fixtures/compiler-txtar/*.txt` | Parse and validate outcomes — success, parse errors, validation errors — using language-agnostic txtar fixtures (hundreds of `===` cases across the four `*.txt` files) | You want a portable test case that can be reused by alternative compiler implementations; the test is a `.jh` input paired with an expected outcome | | **Golden AST tests** | `test-fixtures/golden-ast/fixtures/*.jh` + `test-fixtures/golden-ast/expected/*.json` | Parse tree shape for successful parses — serialized to deterministic JSON with locations stripped (9 fixtures: e.g. imports, brace-if, log, match and match-multiline, params, prompt-capture, run-ensure, script-defs) | You changed the parser and need to verify the AST structure hasn't drifted; txtar tests only check pass/fail, goldens lock in the actual tree shape | | **Integration tests** | `integration/*.test.ts`, `integration/sample-build/*.test.ts` | Process-level integration behavior: signal handling, TTY rendering, run summary structure, sample builds | The test spans multiple modules or requires subprocess/PTY harnesses | diff --git a/src/cli/run/progress.test.ts b/src/cli/run/progress.test.ts index 92ab843a..6b29c01c 100644 --- a/src/cli/run/progress.test.ts +++ b/src/cli/run/progress.test.ts @@ -11,19 +11,15 @@ import { styleYellow, styleBold, } from "./progress"; -import type { jaiphModule } from "../../types"; - -function minimalModule(overrides?: Partial): jaiphModule { - return { - filePath: "test.jh", - imports: [], - channels: [], - exports: [], - rules: [], - scripts: [], - workflows: [], - ...overrides, - }; +import { parsejaiph } from "../../parser"; + +/** + * Fixtures are built by parsing real Jaiph source so test data flows through + * the same producer as production — no hand-written AST shapes to keep in + * sync with the type definitions. + */ +function modFor(source: string) { + return parsejaiph(source, "test.jh"); } // --- parseLabel --- @@ -71,22 +67,21 @@ test("formatElapsedDuration: handles sub-second", () => { // --- collectWorkflowChildren --- test("collectWorkflowChildren: returns empty for unknown workflow", () => { - const mod = minimalModule(); + const mod = modFor(`workflow default() { + log "hi" +}`); assert.deepStrictEqual(collectWorkflowChildren(mod, "missing"), []); }); -test("collectWorkflowChildren: collects run steps", () => { - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { type: "run", workflow: { value: "deploy", loc: { line: 2, col: 3 } } }, - ], - loc: { line: 1, col: 1 }, - }], - }); +test("collectWorkflowChildren: collects run step as workflow row", () => { + const mod = modFor([ + "workflow default() {", + " run deploy()", + "}", + "workflow deploy() {", + " log \"d\"", + "}", + ].join("\n")); const items = collectWorkflowChildren(mod, "default"); assert.equal(items.length, 1); assert.equal(items[0].label, "workflow deploy"); @@ -94,1407 +89,175 @@ test("collectWorkflowChildren: collects run steps", () => { }); test("collectWorkflowChildren: collects async run with prefix", () => { - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { type: "run", workflow: { value: "bg_task", loc: { line: 2, col: 3 } }, async: true }, - ], - loc: { line: 1, col: 1 }, - }], - }); - const items = collectWorkflowChildren(mod, "default"); - assert.equal(items[0].label, "async workflow bg_task"); -}); - -test("collectWorkflowChildren: collects ensure steps", () => { - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { type: "ensure", ref: { value: "check_passes", loc: { line: 2, col: 3 } } }, - ], - loc: { line: 1, col: 1 }, - }], - }); - const items = collectWorkflowChildren(mod, "default"); - assert.equal(items.length, 1); - assert.equal(items[0].label, "rule check_passes"); -}); - -test("collectWorkflowChildren: collects prompt steps", () => { - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { type: "prompt", raw: 'prompt "hello world"', loc: { line: 2, col: 3 } }, - ], - loc: { line: 1, col: 1 }, - }], - }); - const items = collectWorkflowChildren(mod, "default"); - assert.equal(items.length, 1); - assert.match(items[0].label, /^prompt "hello world"/); -}); - -test("collectWorkflowChildren: collects log steps", () => { - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { type: "log", message: "starting", loc: { line: 2, col: 3 } }, - ], - loc: { line: 1, col: 1 }, - }], - }); - const items = collectWorkflowChildren(mod, "default"); - assert.equal(items[0].label, "ℹ starting"); -}); - -test("collectWorkflowChildren: collects logerr steps", () => { - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { type: "logerr", message: "bad thing", loc: { line: 2, col: 3 } }, - ], - loc: { line: 1, col: 1 }, - }], - }); - const items = collectWorkflowChildren(mod, "default"); - assert.equal(items[0].label, "! bad thing"); -}); - -test("collectWorkflowChildren: collects send steps", () => { - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { type: "send", channel: "notify", rhs: { kind: "literal", token: "hello" }, loc: { line: 2, col: 3 } }, - ], - loc: { line: 1, col: 1 }, - }], - }); - const items = collectWorkflowChildren(mod, "default"); - assert.equal(items[0].label, "notify <- send"); -}); - -test("collectWorkflowChildren: collects fail steps", () => { - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { type: "fail", message: "broken", loc: { line: 2, col: 3 } }, - ], - loc: { line: 1, col: 1 }, - }], - }); - const items = collectWorkflowChildren(mod, "default"); - assert.equal(items[0].label, "fail broken"); -}); - -test("collectWorkflowChildren: collects const steps", () => { - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { type: "const", name: "x", value: { kind: "expr", bashRhs: "1" }, loc: { line: 2, col: 3 } }, - ], - loc: { line: 1, col: 1 }, - }], - }); - const items = collectWorkflowChildren(mod, "default"); - assert.equal(items[0].label, "const x"); -}); - - -test("collectWorkflowChildren: collects return steps", () => { - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { type: "return", value: '"done"', loc: { line: 2, col: 3 } }, - ], - loc: { line: 1, col: 1 }, - }], - }); + const mod = modFor([ + "workflow default() {", + " run async deploy()", + "}", + "workflow deploy() {", + " log \"d\"", + "}", + ].join("\n")); const items = collectWorkflowChildren(mod, "default"); - assert.equal(items[0].label, 'return "done"'); -}); - -test("collectWorkflowChildren: collects shell steps with truncation", () => { - const longCmd = "a".repeat(60); - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { type: "shell", command: longCmd, loc: { line: 2, col: 3 } }, - ], - loc: { line: 1, col: 1 }, - }], - }); - const items = collectWorkflowChildren(mod, "default"); - assert.match(items[0].label, /^\$ .{53}\.\.\./); -}); - -test("collectWorkflowChildren: skips comment steps", () => { - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { type: "comment", text: "# note", loc: { line: 2, col: 3 } }, - ], - loc: { line: 1, col: 1 }, - }], - }); - const items = collectWorkflowChildren(mod, "default"); - assert.equal(items.length, 0); -}); - -test("collectWorkflowChildren: collects channel-level route declarations", () => { - const mod = minimalModule({ - channels: [{ - name: "events", - routes: [ - { value: "handler1", loc: { line: 1, col: 20 } }, - { value: "handler2", loc: { line: 1, col: 30 } }, - ], - loc: { line: 1, col: 9 }, - }], - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [], - loc: { line: 3, col: 1 }, - }], - }); - const items = collectWorkflowChildren(mod, "default"); - assert.equal(items.length, 1); - assert.equal(items[0].label, "events -> handler1, handler2"); -}); - -// --- buildRunTreeRows --- - -test("buildRunTreeRows: root row is first", () => { - const mod = minimalModule({ - workflows: [{ name: "default", comments: [], params: [], steps: [], loc: { line: 1, col: 1 } }], - }); - const rows = buildRunTreeRows(mod); - assert.equal(rows.length, 1); - assert.equal(rows[0].rawLabel, "workflow default"); - assert.equal(rows[0].isRoot, true); -}); - -test("buildRunTreeRows: includes nested steps", () => { - const mod = minimalModule({ - workflows: [ - { - name: "default", - comments: [], - params: [], - steps: [ - { type: "run", workflow: { value: "sub", loc: { line: 2, col: 3 } } }, - ], - loc: { line: 1, col: 1 }, - }, - { - name: "sub", - comments: [], - params: [], - steps: [ - { type: "log", message: "hello", loc: { line: 5, col: 3 } }, - ], - loc: { line: 4, col: 1 }, - }, - ], - }); - const rows = buildRunTreeRows(mod); - assert.equal(rows.length, 3); - assert.equal(rows[0].rawLabel, "workflow default"); - assert.equal(rows[1].rawLabel, "workflow sub"); - assert.equal(rows[2].rawLabel, "ℹ hello"); -}); - -test("buildRunTreeRows: does not re-expand visited workflows", () => { - const mod = minimalModule({ - workflows: [ - { - name: "default", - comments: [], - params: [], - steps: [ - { type: "run", workflow: { value: "shared", loc: { line: 2, col: 3 } } }, - { type: "run", workflow: { value: "other", loc: { line: 3, col: 3 } } }, - ], - loc: { line: 1, col: 1 }, - }, - { - name: "shared", - comments: [], - params: [], - steps: [ - { type: "log", message: "in shared", loc: { line: 6, col: 3 } }, - ], - loc: { line: 5, col: 1 }, - }, - { - name: "other", - comments: [], - params: [], - steps: [ - { type: "run", workflow: { value: "shared", loc: { line: 9, col: 3 } } }, - ], - loc: { line: 8, col: 1 }, - }, - ], - }); - const rows = buildRunTreeRows(mod); - const sharedRows = rows.filter((r) => r.rawLabel === "workflow shared"); - // "shared" appears twice in the tree (once expanded, once not re-expanded) - assert.equal(sharedRows.length, 2); - // But "in shared" log only appears once (not re-expanded from "other") - const logRows = rows.filter((r) => r.rawLabel === "ℹ in shared"); - assert.equal(logRows.length, 1); -}); - -// --- formatElapsedDuration (additional) --- - -test("formatElapsedDuration: zero milliseconds", () => { - assert.equal(formatElapsedDuration(0), "0s"); -}); - -test("formatElapsedDuration: sub-second precision", () => { - assert.equal(formatElapsedDuration(50), "0.1s"); - assert.equal(formatElapsedDuration(999), "1s"); -}); - -// --- formatRunningBottomLine --- - -test("formatRunningBottomLine: contains RUNNING and workflow name", () => { - // In non-TTY test env, style functions return plain text - const result = formatRunningBottomLine("default", 1.5); - assert.ok(result.includes("RUNNING"), "should contain RUNNING"); - assert.ok(result.includes("workflow"), "should contain 'workflow'"); - assert.ok(result.includes("default"), "should contain workflow name"); - assert.ok(result.includes("1.5s"), "should contain elapsed time"); -}); - -test("formatRunningBottomLine: formats elapsed with one decimal", () => { - const result = formatRunningBottomLine("deploy", 10.0); - assert.ok(result.includes("10.0s"), "should show one decimal place"); -}); - -// --- collectWorkflowChildren: catch blocks --- - -test("collectWorkflowChildren: run step with single catch includes recovery items", () => { - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { - type: "run", - workflow: { value: "deploy", loc: { line: 2, col: 3 } }, - catch: { - single: { type: "log", message: "recovering", loc: { line: 3, col: 5 } }, - bindings: { failure: "err" }, - }, - }, - ], - loc: { line: 1, col: 1 }, - }], - }); + assert.equal(items[0].label, "async workflow deploy"); +}); + +test("collectWorkflowChildren: collects ensure step as rule row", () => { + const mod = modFor([ + "rule gate() {", + " return \"ok\"", + "}", + "workflow default() {", + " ensure gate()", + "}", + ].join("\n")); const items = collectWorkflowChildren(mod, "default"); - assert.equal(items.length, 2); - assert.equal(items[0].label, "workflow deploy"); - assert.equal(items[1].label, "ℹ recovering"); + assert.equal(items[0].label, "rule gate"); }); -test("collectWorkflowChildren: run step with block catch includes all recovery items", () => { - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { - type: "run", - workflow: { value: "deploy", loc: { line: 2, col: 3 } }, - catch: { - block: [ - { type: "log", message: "retrying", loc: { line: 3, col: 5 } }, - { type: "run", workflow: { value: "fallback", loc: { line: 4, col: 5 } } }, - ], - bindings: { failure: "err" }, - }, - }, - ], - loc: { line: 1, col: 1 }, - }], - }); +test("collectWorkflowChildren: collects prompt step with preview", () => { + const mod = modFor([ + "workflow default() {", + ' prompt "Pick one"', + "}", + ].join("\n")); const items = collectWorkflowChildren(mod, "default"); - assert.equal(items.length, 3); - assert.equal(items[0].label, "workflow deploy"); - assert.equal(items[1].label, "ℹ retrying"); - assert.equal(items[2].label, "workflow fallback"); + assert.equal(items[0].label, 'prompt "Pick one"'); }); -test("collectWorkflowChildren: ensure step with single catch includes recovery items", () => { - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { - type: "ensure", - ref: { value: "check", loc: { line: 2, col: 3 } }, - catch: { - single: { type: "run", workflow: { value: "fix_it", loc: { line: 3, col: 5 } } }, - bindings: { failure: "err" }, - }, - }, - ], - loc: { line: 1, col: 1 }, - }], - }); +test("collectWorkflowChildren: collects log / logerr / fail (say) rows", () => { + const mod = modFor([ + "workflow default() {", + ' log "ok"', + ' logerr "err"', + ' fail "boom"', + "}", + ].join("\n")); const items = collectWorkflowChildren(mod, "default"); - assert.equal(items.length, 2); - assert.equal(items[0].label, "rule check"); - assert.equal(items[1].label, "workflow fix_it"); -}); - -test("collectWorkflowChildren: ensure step with block catch includes all recovery items", () => { - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { - type: "ensure", - ref: { value: "check", loc: { line: 2, col: 3 } }, - catch: { - block: [ - { type: "log", message: "check failed", loc: { line: 3, col: 5 } }, - { type: "fail", message: "unrecoverable", loc: { line: 4, col: 5 } }, - ], - bindings: { failure: "err" }, - }, - }, - ], - loc: { line: 1, col: 1 }, - }], - }); + assert.ok(items.some((i) => i.label.startsWith("ℹ "))); + assert.ok(items.some((i) => i.label.startsWith("! "))); + assert.ok(items.some((i) => i.label.startsWith("fail "))); +}); + +test("collectWorkflowChildren: collects send step", () => { + const mod = modFor([ + "channel ch", + "workflow default() {", + ' ch <- "hi"', + "}", + ].join("\n")); const items = collectWorkflowChildren(mod, "default"); - assert.equal(items.length, 3); - assert.equal(items[0].label, "rule check"); - assert.equal(items[1].label, "ℹ check failed"); - assert.equal(items[2].label, "fail unrecoverable"); -}); - -// --- buildRunTreeRows: self-recursive workflows --- - -test("buildRunTreeRows: self-recursive workflow expands limited depth", () => { - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { type: "log", message: "iteration", loc: { line: 2, col: 3 } }, - { type: "run", workflow: { value: "default", loc: { line: 3, col: 3 } } }, - ], - loc: { line: 1, col: 1 }, - }], - }); - const rows = buildRunTreeRows(mod); - // Should have root + children, with limited recursion (not infinite) - assert.ok(rows.length >= 3, "should expand self-recursive workflow at least once"); - assert.ok(rows.length < 50, "should not expand infinitely"); - // First row is root - assert.equal(rows[0].rawLabel, "workflow default"); - assert.equal(rows[0].isRoot, true); - // Should contain "ℹ iteration" at least once - const logRows = rows.filter((r) => r.rawLabel === "ℹ iteration"); - assert.ok(logRows.length >= 1, "should show log from recursive workflow"); -}); - -test("buildRunTreeRows: workflow with two self-recursive sites", () => { - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { type: "run", workflow: { value: "default", loc: { line: 2, col: 3 } } }, - { type: "log", message: "middle", loc: { line: 3, col: 3 } }, - { type: "run", workflow: { value: "default", loc: { line: 4, col: 3 } } }, - ], - loc: { line: 1, col: 1 }, - }], - }); - const rows = buildRunTreeRows(mod); - // Should terminate without infinite expansion - assert.ok(rows.length >= 3, "should produce tree rows"); - assert.ok(rows.length < 100, "should not expand infinitely"); + assert.ok(items.some((i) => i.label === "ch <- send")); }); -// --- collectWorkflowChildren: match_expr with run/ensure arms --- - -test("collectWorkflowChildren: const with match_expr containing run arm", () => { - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { - type: "const", - name: "result", - value: { - kind: "match_expr", - match: { - subject: "x", - arms: [ - { pattern: { kind: "string_literal", value: "a" }, body: 'run deploy("a")' }, - { pattern: { kind: "wildcard" }, body: '"fallback"' }, - ], - loc: { line: 3, col: 10 }, - }, - }, - loc: { line: 3, col: 3 }, - }, - ], - loc: { line: 1, col: 1 }, - }], - }); +test("collectWorkflowChildren: collects const and return rows", () => { + const mod = modFor([ + "workflow default() {", + ' const x = "hi"', + " return x", + "}", + ].join("\n")); const items = collectWorkflowChildren(mod, "default"); - assert.equal(items.length, 2); - assert.equal(items[0].label, "const result"); - assert.equal(items[1].label, "workflow deploy"); - assert.equal(items[1].nested, "deploy"); + assert.ok(items.some((i) => i.label === "const x")); + assert.ok(items.some((i) => i.label.startsWith("return "))); }); -test("collectWorkflowChildren: const with match_expr containing ensure arm", () => { - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { - type: "const", - name: "status", - value: { - kind: "match_expr", - match: { - subject: "x", - arms: [ - { pattern: { kind: "string_literal", value: "check" }, body: 'ensure gate()' }, - { pattern: { kind: "wildcard" }, body: '"skip"' }, - ], - loc: { line: 3, col: 10 }, - }, - }, - loc: { line: 3, col: 3 }, - }, - ], - loc: { line: 1, col: 1 }, - }], - }); +test("collectWorkflowChildren: collects inline script as 'script (inline)'", () => { + const mod = modFor([ + "workflow default() {", + " run `echo hi`()", + "}", + ].join("\n")); const items = collectWorkflowChildren(mod, "default"); - assert.equal(items.length, 2); - assert.equal(items[0].label, "const status"); - assert.equal(items[1].label, "rule gate"); - assert.equal(items[1].nested, "gate"); + assert.ok(items.some((i) => i.label === "script (inline)")); }); -test("collectWorkflowChildren: const with match_expr arm with no run/ensure", () => { - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { - type: "const", - name: "val", - value: { - kind: "match_expr", - match: { - subject: "x", - arms: [ - { pattern: { kind: "string_literal", value: "a" }, body: '"hello"' }, - { pattern: { kind: "wildcard" }, body: '"default"' }, - ], - loc: { line: 3, col: 10 }, - }, - }, - loc: { line: 3, col: 3 }, - }, - ], - loc: { line: 1, col: 1 }, - }], - }); +test("collectWorkflowChildren: collects shell step with $ prefix", () => { + const mod = modFor([ + "workflow default() {", + " echo hello", + "}", + ].join("\n")); const items = collectWorkflowChildren(mod, "default"); - assert.equal(items.length, 1); - assert.equal(items[0].label, "const val"); + assert.ok(items.some((i) => i.label.startsWith("$ "))); }); -// --- collectWorkflowChildren: run_inline_script --- - -test("collectWorkflowChildren: collects run_inline_script steps", () => { - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { type: "run_inline_script", body: "echo hello", loc: { line: 2, col: 3 } }, - ], - loc: { line: 1, col: 1 }, - }], - }); +test("collectWorkflowChildren: skips trivia (comments / blank lines)", () => { + const mod = modFor([ + "workflow default() {", + " # comment", + "", + ' log "hi"', + "}", + ].join("\n")); const items = collectWorkflowChildren(mod, "default"); assert.equal(items.length, 1); - assert.equal(items[0].label, "script (inline)"); -}); - -// --- buildRunTreeRows: prefix/indentation --- - -test("buildRunTreeRows: grandchild rows are more indented than children", () => { - const mod = minimalModule({ - workflows: [ - { - name: "default", - comments: [], - params: [], - steps: [ - { type: "run", workflow: { value: "sub", loc: { line: 2, col: 3 } } }, - ], - loc: { line: 1, col: 1 }, - }, - { - name: "sub", - comments: [], - params: [], - steps: [ - { type: "log", message: "hello", loc: { line: 5, col: 3 } }, - ], - loc: { line: 4, col: 1 }, - }, - ], - }); - const rows = buildRunTreeRows(mod); - // Root and direct children share empty prefix; grandchildren are indented - assert.equal(rows[0].prefix, "", "root should have empty prefix"); - assert.equal(rows[1].prefix, "", "direct child inherits root prefix"); - assert.ok(rows[2].prefix.length > rows[1].prefix.length, "grandchild should be more indented than child"); -}); - -// --- buildRunTreeRows: cross-module imported workflows --- - -test("buildRunTreeRows: cross-module workflows are expanded from importedModules", () => { - const mainMod = minimalModule({ - imports: [{ path: "lib.jh", alias: "lib", loc: { line: 1, col: 1 } }], - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { type: "run", workflow: { value: "lib.greet", loc: { line: 3, col: 3 } } }, - ], - loc: { line: 2, col: 1 }, - }], - }); - const libMod = minimalModule({ - filePath: "lib.jh", - workflows: [{ - name: "greet", - comments: [], - params: [], - steps: [ - { type: "log", message: "hello from lib", loc: { line: 2, col: 3 } }, - ], - loc: { line: 1, col: 1 }, - }], - }); - const importedModules = new Map([["lib", libMod]]); - const rows = buildRunTreeRows(mainMod, undefined, importedModules); - // Should contain the imported workflow's children - const libLogRows = rows.filter((r) => r.rawLabel === "ℹ hello from lib"); - assert.equal(libLogRows.length, 1, "should expand imported workflow children"); -}); - -// --- formatElapsedDuration: exact boundary --- - -test("formatElapsedDuration: exactly 60000ms uses minute format", () => { - assert.equal(formatElapsedDuration(60000), "1m 0s"); -}); - -test("formatElapsedDuration: just under 60000ms uses seconds format", () => { - assert.equal(formatElapsedDuration(59999), "60s"); -}); - -// --- collectWorkflowChildren: stepFunc with symbols --- - -test("collectWorkflowChildren: run step with dotted ref populates stepFunc from symbols", () => { - const mod = minimalModule({ - imports: [{ path: "lib.jh", alias: "lib", loc: { line: 1, col: 1 } }], - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { type: "run", workflow: { value: "lib.deploy", loc: { line: 3, col: 3 } } }, - ], - loc: { line: 2, col: 1 }, - }], - }); - const symbols = new Map([["lib", "mylib"]]); - const items = collectWorkflowChildren(mod, "default", symbols); - assert.equal(items.length, 1); - assert.equal(items[0].stepFunc, "mylib::deploy"); -}); - -test("collectWorkflowChildren: run step with dotted ref falls back to alias when symbol missing", () => { - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { type: "run", workflow: { value: "lib.deploy", loc: { line: 2, col: 3 } } }, - ], - loc: { line: 1, col: 1 }, - }], - }); - const symbols = new Map(); - const items = collectWorkflowChildren(mod, "default", symbols); - assert.equal(items[0].stepFunc, "lib::deploy"); -}); - -test("collectWorkflowChildren: run step with currentSymbol populates stepFunc", () => { - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { type: "run", workflow: { value: "helper", loc: { line: 2, col: 3 } } }, - ], - loc: { line: 1, col: 1 }, - }], - }); - const items = collectWorkflowChildren(mod, "default", undefined, "main_mod"); - assert.equal(items[0].stepFunc, "main_mod::helper"); -}); - -test("collectWorkflowChildren: ensure step with dotted ref populates stepFunc from symbols", () => { - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { type: "ensure", ref: { value: "lib.check", loc: { line: 2, col: 3 } } }, - ], - loc: { line: 1, col: 1 }, - }], - }); - const symbols = new Map([["lib", "mylib"]]); - const items = collectWorkflowChildren(mod, "default", symbols); - assert.equal(items[0].stepFunc, "mylib::check"); -}); - -test("collectWorkflowChildren: ensure step with currentSymbol populates stepFunc", () => { - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { type: "ensure", ref: { value: "gate", loc: { line: 2, col: 3 } } }, - ], - loc: { line: 1, col: 1 }, - }], - }); - const items = collectWorkflowChildren(mod, "default", undefined, "main_mod"); - assert.equal(items[0].stepFunc, "main_mod::gate"); -}); - -test("collectWorkflowChildren: prompt step always has jaiph::prompt stepFunc", () => { - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { type: "prompt", raw: 'prompt "test"', loc: { line: 2, col: 3 } }, - ], - loc: { line: 1, col: 1 }, - }], - }); + assert.ok(items[0].label.startsWith("ℹ ")); +}); + +test("collectWorkflowChildren: const = match expression walks arms for run/ensure targets", () => { + const mod = modFor([ + "rule gate() {", + " return \"ok\"", + "}", + "workflow other() {", + " log \"o\"", + "}", + "workflow default(name) {", + " const result = match name {", + ' "x" => run other()', + ' _ => ensure gate()', + " }", + "}", + ].join("\n")); const items = collectWorkflowChildren(mod, "default"); - assert.equal(items[0].stepFunc, "jaiph::prompt"); + // const row + workflow other row + rule gate row + assert.ok(items.some((i) => i.label === "const result")); + assert.ok(items.some((i) => i.label.startsWith("workflow other"))); + assert.ok(items.some((i) => i.label.startsWith("rule gate"))); }); -// --- buildRunTreeRows: self-recursion depth gating --- +// --- buildRunTreeRows --- -test("buildRunTreeRows: self-recursive workflow with three sites limits expansion", () => { - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { type: "run", workflow: { value: "default", loc: { line: 2, col: 3 } } }, - { type: "log", message: "a", loc: { line: 3, col: 3 } }, - { type: "run", workflow: { value: "default", loc: { line: 4, col: 3 } } }, - { type: "log", message: "b", loc: { line: 5, col: 3 } }, - { type: "run", workflow: { value: "default", loc: { line: 6, col: 3 } } }, - ], - loc: { line: 1, col: 1 }, - }], - }); +test("buildRunTreeRows: includes root and children", () => { + const mod = modFor([ + "workflow default() {", + " run deploy()", + "}", + "workflow deploy() {", + " log \"d\"", + "}", + ].join("\n")); const rows = buildRunTreeRows(mod); - // Should terminate without infinite expansion - assert.ok(rows.length >= 3, "should produce tree rows"); - assert.ok(rows.length < 200, "should not expand infinitely"); - // Root is first + assert.ok(rows.length >= 2); assert.equal(rows[0].rawLabel, "workflow default"); - assert.equal(rows[0].isRoot, true); -}); - -// --- collectWorkflowChildren: prompt label formatting --- - -test("collectWorkflowChildren: prompt with escaped quotes in raw", () => { - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { type: "prompt", raw: 'prompt "say \\"hello\\""', loc: { line: 2, col: 3 } }, - ], - loc: { line: 1, col: 1 }, - }], - }); - const items = collectWorkflowChildren(mod, "default"); - // The escaped quotes in raw should be handled: \" → " in content, then re-escaped for display - assert.match(items[0].label, /^prompt "/); }); -test("collectWorkflowChildren: prompt with no quotes in raw", () => { - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { type: "prompt", raw: "prompt myVar", loc: { line: 2, col: 3 } }, - ], - loc: { line: 1, col: 1 }, - }], - }); - const items = collectWorkflowChildren(mod, "default"); - // No quote found, preview is empty → label is just 'prompt ""' - assert.equal(items[0].label, 'prompt ""'); -}); - -// --- styleKeywordLabel / styleDim / styleYellow / styleBold --- -// In test env (non-TTY), these return plain text. We verify the non-TTY path. - -test("styleKeywordLabel: returns plain 'kind name' in non-TTY", () => { - const result = styleKeywordLabel("workflow deploy"); - assert.equal(result, "workflow deploy"); -}); - -test("styleKeywordLabel: handles single-word label", () => { - const result = styleKeywordLabel("wait"); - assert.equal(result, "step wait"); -}); - -test("styleDim: returns plain text in non-TTY", () => { - assert.equal(styleDim("hello"), "hello"); -}); - -test("styleYellow: returns plain text in non-TTY", () => { - assert.equal(styleYellow("warning"), "warning"); -}); - -test("styleBold: returns plain text in non-TTY", () => { - assert.equal(styleBold("title"), "title"); -}); - -test("collectWorkflowChildren: prompt with long text truncated at 24 chars", () => { - const longText = "A".repeat(30); - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { type: "prompt", raw: `prompt "${longText}"`, loc: { line: 2, col: 3 } }, - ], - loc: { line: 1, col: 1 }, - }], - }); - const items = collectWorkflowChildren(mod, "default"); - assert.ok(items[0].label.includes("A".repeat(24) + "..."), "should truncate at 24 chars"); - assert.ok(!items[0].label.includes("A".repeat(25)), "should not contain more than 24 chars"); -}); - -// --- buildRunTreeRows: rootDir parameter --- - -test("buildRunTreeRows: rootDir populates symbols for imported modules", () => { - const mainMod = minimalModule({ - filePath: "/project/main.jh", - imports: [{ path: "lib.jh", alias: "lib", loc: { line: 1, col: 1 } }], - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { type: "run", workflow: { value: "lib.greet", loc: { line: 3, col: 3 } } }, - ], - loc: { line: 2, col: 1 }, - }], - }); - const libMod = minimalModule({ - filePath: "/project/lib.jh", - workflows: [{ - name: "greet", - comments: [], - params: [], - steps: [ - { type: "log", message: "hello", loc: { line: 2, col: 3 } }, - ], - loc: { line: 1, col: 1 }, - }], - }); - const importedModules = new Map([["lib", libMod]]); - const rows = buildRunTreeRows(mainMod, undefined, importedModules, "/project"); - // With rootDir, symbols should be resolved; the run step should have a stepFunc - const runRow = rows.find((r) => r.rawLabel === "workflow lib.greet"); - assert.ok(runRow, "should have the imported workflow row"); - assert.ok(runRow!.stepFunc, "stepFunc should be populated when rootDir is given"); -}); - -// --- buildRunTreeRows: custom rootLabel --- - -test("buildRunTreeRows: custom rootLabel appears in root row", () => { - const mod = minimalModule({ - workflows: [{ name: "deploy", comments: [], params: [], steps: [], loc: { line: 1, col: 1 } }], - }); - const rows = buildRunTreeRows(mod, "workflow deploy"); - assert.equal(rows.length, 1); - assert.equal(rows[0].rawLabel, "workflow deploy"); - assert.equal(rows[0].isRoot, true); -}); - -test("buildRunTreeRows: custom rootLabel with rule kind", () => { - const mod = minimalModule({ - workflows: [{ name: "check", comments: [], params: [], steps: [], loc: { line: 1, col: 1 } }], - }); - const rows = buildRunTreeRows(mod, "rule check"); - assert.equal(rows[0].rawLabel, "rule check"); - assert.equal(rows[0].isRoot, true); -}); - -test("buildRunTreeRows: custom rootLabel preserves tree children", () => { - const mod = minimalModule({ - workflows: [ - { - name: "default", - comments: [], - params: [], - steps: [ - { type: "log", message: "hello", loc: { line: 2, col: 3 } }, - ], - loc: { line: 1, col: 1 }, - }, - ], - }); - const rows = buildRunTreeRows(mod, "workflow main_entry"); - assert.equal(rows.length, 2); - assert.equal(rows[0].rawLabel, "workflow main_entry"); - assert.equal(rows[1].rawLabel, "ℹ hello"); -}); - -// --- formatRunningBottomLine: edge cases --- - -test("formatRunningBottomLine: zero elapsed time", () => { - const result = formatRunningBottomLine("test", 0.0); - assert.ok(result.includes("RUNNING"), "should contain RUNNING"); - assert.ok(result.includes("0.0s"), "should show zero time"); -}); - -test("formatRunningBottomLine: large elapsed time", () => { - const result = formatRunningBottomLine("deploy", 999.9); - assert.ok(result.includes("999.9s"), "should show large time"); -}); - -// --- collectWorkflowChildren: shell command truncation boundary --- - -test("collectWorkflowChildren: shell command at exactly 56 chars is not truncated", () => { - const cmd = "a".repeat(56); - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { type: "shell", command: cmd, loc: { line: 2, col: 3 } }, - ], - loc: { line: 1, col: 1 }, - }], - }); - const items = collectWorkflowChildren(mod, "default"); - assert.equal(items[0].label, `$ ${cmd}`, "56-char command should not be truncated"); - assert.ok(!items[0].label.includes("..."), "should not have ellipsis"); -}); - -test("collectWorkflowChildren: shell command at 57 chars is truncated", () => { - const cmd = "b".repeat(57); - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { type: "shell", command: cmd, loc: { line: 2, col: 3 } }, - ], - loc: { line: 1, col: 1 }, - }], - }); - const items = collectWorkflowChildren(mod, "default"); - assert.ok(items[0].label.includes("..."), "57-char command should be truncated"); - assert.equal(items[0].label, `$ ${"b".repeat(53)}...`); -}); - -test("collectWorkflowChildren: shell command at 1 char is not truncated", () => { - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { type: "shell", command: "x", loc: { line: 2, col: 3 } }, - ], - loc: { line: 1, col: 1 }, - }], - }); - const items = collectWorkflowChildren(mod, "default"); - assert.equal(items[0].label, "$ x"); -}); - -// --- style functions: TTY and NO_COLOR paths --- - -test("styleKeywordLabel: returns ANSI bold kind when TTY and no NO_COLOR", () => { - const origIsTTY = process.stdout.isTTY; - const origNoColor = process.env.NO_COLOR; - try { - Object.defineProperty(process.stdout, "isTTY", { value: true, writable: true, configurable: true }); - delete process.env.NO_COLOR; - const result = styleKeywordLabel("workflow deploy"); - assert.ok(result.includes("\u001b[1mworkflow\u001b[0m"), "kind should be bold in TTY mode"); - assert.ok(result.includes("deploy"), "name should be present"); - } finally { - Object.defineProperty(process.stdout, "isTTY", { value: origIsTTY, writable: true, configurable: true }); - if (origNoColor !== undefined) process.env.NO_COLOR = origNoColor; - else delete process.env.NO_COLOR; - } -}); - -test("styleKeywordLabel: returns plain text when NO_COLOR is set", () => { - const origIsTTY = process.stdout.isTTY; - const origNoColor = process.env.NO_COLOR; - try { - Object.defineProperty(process.stdout, "isTTY", { value: true, writable: true, configurable: true }); - process.env.NO_COLOR = "1"; - const result = styleKeywordLabel("workflow deploy"); - assert.equal(result, "workflow deploy", "should return plain text with NO_COLOR"); - assert.ok(!result.includes("\u001b["), "should not contain ANSI codes"); - } finally { - Object.defineProperty(process.stdout, "isTTY", { value: origIsTTY, writable: true, configurable: true }); - if (origNoColor !== undefined) process.env.NO_COLOR = origNoColor; - else delete process.env.NO_COLOR; - } -}); - -test("styleDim: returns ANSI dim when TTY and no NO_COLOR", () => { - const origIsTTY = process.stdout.isTTY; - const origNoColor = process.env.NO_COLOR; - try { - Object.defineProperty(process.stdout, "isTTY", { value: true, writable: true, configurable: true }); - delete process.env.NO_COLOR; - const result = styleDim("hello"); - assert.equal(result, "\u001b[2mhello\u001b[0m", "should wrap in dim ANSI"); - } finally { - Object.defineProperty(process.stdout, "isTTY", { value: origIsTTY, writable: true, configurable: true }); - if (origNoColor !== undefined) process.env.NO_COLOR = origNoColor; - else delete process.env.NO_COLOR; - } -}); - -test("styleDim: returns plain text when NO_COLOR is set", () => { - const origIsTTY = process.stdout.isTTY; - const origNoColor = process.env.NO_COLOR; - try { - Object.defineProperty(process.stdout, "isTTY", { value: true, writable: true, configurable: true }); - process.env.NO_COLOR = ""; - const result = styleDim("hello"); - assert.equal(result, "hello", "should return plain text with NO_COLOR"); - } finally { - Object.defineProperty(process.stdout, "isTTY", { value: origIsTTY, writable: true, configurable: true }); - if (origNoColor !== undefined) process.env.NO_COLOR = origNoColor; - else delete process.env.NO_COLOR; - } -}); - -test("styleYellow: returns ANSI yellow when TTY and no NO_COLOR", () => { - const origIsTTY = process.stdout.isTTY; - const origNoColor = process.env.NO_COLOR; - try { - Object.defineProperty(process.stdout, "isTTY", { value: true, writable: true, configurable: true }); - delete process.env.NO_COLOR; - const result = styleYellow("warning"); - assert.equal(result, "\u001b[33mwarning\u001b[0m", "should wrap in yellow ANSI"); - } finally { - Object.defineProperty(process.stdout, "isTTY", { value: origIsTTY, writable: true, configurable: true }); - if (origNoColor !== undefined) process.env.NO_COLOR = origNoColor; - else delete process.env.NO_COLOR; - } -}); - -test("styleYellow: returns plain text when NO_COLOR is set", () => { - const origIsTTY = process.stdout.isTTY; - const origNoColor = process.env.NO_COLOR; - try { - Object.defineProperty(process.stdout, "isTTY", { value: true, writable: true, configurable: true }); - process.env.NO_COLOR = "1"; - const result = styleYellow("warning"); - assert.equal(result, "warning", "should return plain text with NO_COLOR"); - } finally { - Object.defineProperty(process.stdout, "isTTY", { value: origIsTTY, writable: true, configurable: true }); - if (origNoColor !== undefined) process.env.NO_COLOR = origNoColor; - else delete process.env.NO_COLOR; - } -}); +// --- style helpers (no-color paths) --- -test("styleBold: returns ANSI bold when TTY and no NO_COLOR", () => { - const origIsTTY = process.stdout.isTTY; - const origNoColor = process.env.NO_COLOR; +test("styleKeywordLabel: returns plain text when no TTY", () => { + const prev = process.stdout.isTTY; + Object.defineProperty(process.stdout, "isTTY", { value: false, configurable: true }); try { - Object.defineProperty(process.stdout, "isTTY", { value: true, writable: true, configurable: true }); - delete process.env.NO_COLOR; - const result = styleBold("title"); - assert.equal(result, "\u001b[1mtitle\u001b[0m", "should wrap in bold ANSI"); + assert.equal(styleKeywordLabel("workflow default"), "workflow default"); } finally { - Object.defineProperty(process.stdout, "isTTY", { value: origIsTTY, writable: true, configurable: true }); - if (origNoColor !== undefined) process.env.NO_COLOR = origNoColor; - else delete process.env.NO_COLOR; + Object.defineProperty(process.stdout, "isTTY", { value: prev, configurable: true }); } }); -test("styleBold: returns plain text when not TTY", () => { - const origIsTTY = process.stdout.isTTY; - const origNoColor = process.env.NO_COLOR; +test("styleDim / styleYellow / styleBold: no-color when not TTY", () => { + const prev = process.stdout.isTTY; + Object.defineProperty(process.stdout, "isTTY", { value: false, configurable: true }); try { - Object.defineProperty(process.stdout, "isTTY", { value: false, writable: true, configurable: true }); - delete process.env.NO_COLOR; - const result = styleBold("title"); - assert.equal(result, "title", "should return plain text when not TTY"); + assert.equal(styleDim("x"), "x"); + assert.equal(styleYellow("x"), "x"); + assert.equal(styleBold("x"), "x"); } finally { - Object.defineProperty(process.stdout, "isTTY", { value: origIsTTY, writable: true, configurable: true }); - if (origNoColor !== undefined) process.env.NO_COLOR = origNoColor; - else delete process.env.NO_COLOR; + Object.defineProperty(process.stdout, "isTTY", { value: prev, configurable: true }); } }); -// --- buildRunTreeRows: selfRecursiveRunSiteCount returns 0 for missing workflow --- - -test("buildRunTreeRows: non-existent nested workflow reference is handled gracefully", () => { - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { type: "run", workflow: { value: "nonexistent", loc: { line: 2, col: 3 } } }, - ], - loc: { line: 1, col: 1 }, - }], - }); - const rows = buildRunTreeRows(mod); - // Should have root + the run step reference, but no children expanded since workflow doesn't exist - assert.equal(rows.length, 2); - assert.equal(rows[0].rawLabel, "workflow default"); - assert.equal(rows[1].rawLabel, "workflow nonexistent"); -}); - -test("collectWorkflowChildren: returns empty for workflow with no matching name", () => { - const mod = minimalModule({ - workflows: [{ - name: "other", - comments: [], - params: [], - steps: [ - { type: "log", message: "hello", loc: { line: 2, col: 3 } }, - ], - loc: { line: 1, col: 1 }, - }], - }); - const items = collectWorkflowChildren(mod, "nonexistent"); - assert.deepStrictEqual(items, []); -}); - -// --- collectWorkflowChildren: prompt with multiline whitespace raw --- - -test("collectWorkflowChildren: prompt with triple-quoted raw (no double quote) returns empty preview", () => { - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { type: "prompt", raw: 'prompt """\nHello\n"""', loc: { line: 2, col: 3 } }, - ], - loc: { line: 1, col: 1 }, - }], - }); - const items = collectWorkflowChildren(mod, "default"); - assert.equal(items.length, 1); - // The promptPreviewFromRaw picks up text between the first pair of double quotes - // In triple-quote form, first " starts at index 7, second " is immediately after → empty content - // Then third " triggers break → empty preview - assert.match(items[0].label, /^prompt "/); -}); - -// --- buildRunTreeRows: channels without routes don't produce tree nodes --- - -test("buildRunTreeRows: channel without routes adds no tree rows", () => { - const mod = minimalModule({ - channels: [{ - name: "events", - loc: { line: 1, col: 9 }, - }], - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [{ type: "log", message: "ok", loc: { line: 3, col: 3 } }], - loc: { line: 2, col: 1 }, - }], - }); - const rows = buildRunTreeRows(mod); - assert.equal(rows.length, 2); // root + log - assert.equal(rows[0].rawLabel, "workflow default"); - assert.equal(rows[1].rawLabel, "ℹ ok"); -}); - -// --- buildRunTreeRows: imported module not found falls through gracefully --- - -test("buildRunTreeRows: imported module alias not in importedModules map is not expanded", () => { - const mainMod = minimalModule({ - imports: [{ path: "lib.jh", alias: "lib", loc: { line: 1, col: 1 } }], - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { type: "run", workflow: { value: "lib.greet", loc: { line: 3, col: 3 } } }, - ], - loc: { line: 2, col: 1 }, - }], - }); - // Pass empty importedModules — alias "lib" not resolved - const importedModules = new Map(); - const rows = buildRunTreeRows(mainMod, undefined, importedModules); - // Should still have root + the run step reference, but not expanded - assert.equal(rows.length, 2); - assert.equal(rows[0].rawLabel, "workflow default"); - assert.equal(rows[1].rawLabel, "workflow lib.greet"); -}); - -// --- buildRunTreeRows: match_expr arm expansion --- - -test("buildRunTreeRows: match arm with run body expands nested workflow", () => { - const mod = minimalModule({ - workflows: [ - { - name: "default", - comments: [], - params: [], - steps: [ - { - type: "const", - name: "result", - value: { - kind: "match_expr", - match: { - subject: "x", - arms: [ - { pattern: { kind: "string_literal", value: "a" }, body: 'run deploy("a")' }, - { pattern: { kind: "wildcard" }, body: '"fallback"' }, - ], - loc: { line: 3, col: 3 }, - }, - }, - loc: { line: 2, col: 3 }, - }, - ], - loc: { line: 1, col: 1 }, - }, - { - name: "deploy", - comments: [], - params: [], - steps: [ - { type: "log", message: "deploying", loc: { line: 8, col: 3 } }, - ], - loc: { line: 7, col: 1 }, - }, - ], - }); - const rows = buildRunTreeRows(mod); - // root + const result + workflow deploy (from match arm) + log deploying (expanded) - assert.equal(rows[0].rawLabel, "workflow default"); - assert.equal(rows[1].rawLabel, "const result"); - assert.equal(rows[2].rawLabel, "workflow deploy"); - assert.equal(rows[3].rawLabel, "ℹ deploying"); - assert.equal(rows.length, 4); -}); - -test("buildRunTreeRows: match arm with ensure body shows rule in tree", () => { - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { - type: "const", - name: "status", - value: { - kind: "match_expr", - match: { - subject: "mode", - arms: [ - { pattern: { kind: "string_literal", value: "strict" }, body: 'ensure gate()' }, - { pattern: { kind: "wildcard" }, body: '"skip"' }, - ], - loc: { line: 3, col: 3 }, - }, - }, - loc: { line: 2, col: 3 }, - }, - ], - loc: { line: 1, col: 1 }, - }], - }); - const rows = buildRunTreeRows(mod); - assert.equal(rows[0].rawLabel, "workflow default"); - assert.equal(rows[1].rawLabel, "const status"); - assert.equal(rows[2].rawLabel, "rule gate"); - assert.equal(rows.length, 3); -}); - -// --- buildRunTreeRows: mixed step types in tree --- - -test("buildRunTreeRows: workflow with multiple step types produces correct tree", () => { - const mod = minimalModule({ - workflows: [{ - name: "default", - comments: [], - params: [], - steps: [ - { type: "log", message: "starting", loc: { line: 2, col: 3 } }, - { type: "run", workflow: { value: "helper", loc: { line: 3, col: 3 } } }, - { type: "ensure", ref: { value: "check", loc: { line: 4, col: 3 } } }, - { type: "send", channel: "events", rhs: { kind: "literal", token: '"data"' }, loc: { line: 5, col: 3 } }, - { type: "fail", message: '"reason"', loc: { line: 6, col: 3 } }, - ], - loc: { line: 1, col: 1 }, - }], - }); - const rows = buildRunTreeRows(mod); - assert.equal(rows[0].rawLabel, "workflow default"); - assert.equal(rows[1].rawLabel, "ℹ starting"); - assert.equal(rows[2].rawLabel, "workflow helper"); - assert.equal(rows[3].rawLabel, "rule check"); - assert.equal(rows[4].rawLabel, "events <- send"); - assert.equal(rows[5].rawLabel, 'fail "reason"'); - assert.equal(rows.length, 6); -}); - -// --- buildRunTreeRows: run with catch block in tree --- - -test("buildRunTreeRows: run with catch block shows recovery steps in tree", () => { - const mod = minimalModule({ - workflows: [ - { - name: "default", - comments: [], - params: [], - steps: [ - { - type: "run", - workflow: { value: "risky", loc: { line: 2, col: 3 } }, - catch: { - bindings: { failure: "err" }, - block: [ - { type: "log", message: "recovering", loc: { line: 4, col: 5 } }, - { type: "run", workflow: { value: "fallback", loc: { line: 5, col: 5 } } }, - ], - }, - }, - ], - loc: { line: 1, col: 1 }, - }, - { - name: "risky", - comments: [], - params: [], - steps: [{ type: "log", message: "trying", loc: { line: 8, col: 3 } }], - loc: { line: 7, col: 1 }, - }, - ], - }); - const rows = buildRunTreeRows(mod); - // root + workflow risky + log trying (expanded) + log recovering (catch) + workflow fallback (catch) - assert.equal(rows[0].rawLabel, "workflow default"); - assert.equal(rows[1].rawLabel, "workflow risky"); - // risky is expanded since it has children - assert.equal(rows[2].rawLabel, "ℹ trying"); - // catch block children - assert.equal(rows[3].rawLabel, "ℹ recovering"); - assert.equal(rows[4].rawLabel, "workflow fallback"); - assert.equal(rows.length, 5); +test("formatRunningBottomLine: renders status with elapsed", () => { + const line = formatRunningBottomLine("default", 1.5); + assert.ok(line.includes("default")); + assert.ok(line.includes("1.5s")); }); diff --git a/src/cli/run/progress.ts b/src/cli/run/progress.ts index 86aeaaa3..546c7aac 100644 --- a/src/cli/run/progress.ts +++ b/src/cli/run/progress.ts @@ -1,5 +1,5 @@ import { resolve } from "node:path"; -import { jaiphModule, type WorkflowStepDef } from "../../types"; +import { jaiphModule, type Expr, type WorkflowStepDef } from "../../types"; import { workflowSymbolForFile } from "../../transpiler"; export type TreeRow = { @@ -44,7 +44,7 @@ function selfRecursiveRunSiteCount(mod: jaiphModule, workflowName: string): numb } let count = 0; for (const step of workflow.steps) { - if (step.type === "run" && step.workflow.value === workflowName) { + if (step.type === "exec" && step.body.kind === "call" && step.body.callee.value === workflowName) { count += 1; continue; } @@ -52,6 +52,18 @@ function selfRecursiveRunSiteCount(mod: jaiphModule, workflowName: string): numb return count; } +/** Short surface label for an Expr value (used in `return` / `const` rows). */ +function exprLabel(expr: Expr): string { + if (expr.kind === "literal") return expr.raw; + if (expr.kind === "call") return `run ${expr.callee.value}(...)`; + if (expr.kind === "ensure_call") return `ensure ${expr.callee.value}(...)`; + if (expr.kind === "inline_script") return "run `...`(...)"; + if (expr.kind === "prompt") return `prompt ${expr.raw}`; + if (expr.kind === "match") return `match ${expr.match.subject}`; + if (expr.kind === "shell") return expr.command; + return expr.ref.value; +} + export function collectWorkflowChildren( mod: jaiphModule, workflowName: string, @@ -63,81 +75,77 @@ export function collectWorkflowChildren( return []; } const items: Array<{ label: string; nested?: string; stepFunc?: string }> = []; + const refStepFunc = (ref: string): string | undefined => + symbols && ref.includes(".") + ? (() => { + const dot = ref.indexOf("."); + const alias = ref.slice(0, dot); + const name = ref.slice(dot + 1); + return `${symbols.get(alias) ?? alias}::${name}`; + })() + : currentSymbol + ? `${currentSymbol}::${ref}` + : undefined; const stepToItems = (s: WorkflowStepDef): Array<{ label: string; nested?: string; stepFunc?: string }> => { - if (s.type === "run") { - const wf = s.workflow.value; - const asyncPrefix = s.async ? "async " : ""; - const stepFunc = - symbols && wf.includes(".") - ? (() => { - const dot = wf.indexOf("."); - const alias = wf.slice(0, dot); - const name = wf.slice(dot + 1); - return `${symbols.get(alias) ?? alias}::${name}`; - })() - : currentSymbol - ? `${currentSymbol}::${wf}` - : undefined; - const arr: Array<{ label: string; nested?: string; stepFunc?: string }> = [ - { label: `${asyncPrefix}workflow ${wf}`, nested: wf, stepFunc }, - ]; - if (s.recover) { - const steps = "single" in s.recover ? [s.recover.single] : s.recover.block; - for (const r of steps) { - arr.push(...stepToItems(r)); - } - } else if (s.catch) { - const steps = "single" in s.catch ? [s.catch.single] : s.catch.block; - for (const r of steps) { - arr.push(...stepToItems(r)); + if (s.type === "exec") { + const body = s.body; + if (body.kind === "call") { + const wf = body.callee.value; + const asyncPrefix = body.async ? "async " : ""; + const arr: Array<{ label: string; nested?: string; stepFunc?: string }> = [ + { label: `${asyncPrefix}workflow ${wf}`, nested: wf, stepFunc: refStepFunc(wf) }, + ]; + if (s.recover) { + const steps = "single" in s.recover ? [s.recover.single] : s.recover.block; + for (const r of steps) arr.push(...stepToItems(r)); + } else if (s.catch) { + const steps = "single" in s.catch ? [s.catch.single] : s.catch.block; + for (const r of steps) arr.push(...stepToItems(r)); } + return arr; } - return arr; - } - if (s.type === "ensure") { - const ref = s.ref.value; - const stepFunc = - symbols && ref.includes(".") - ? (() => { - const dot = ref.indexOf("."); - const alias = ref.slice(0, dot); - const name = ref.slice(dot + 1); - return `${symbols.get(alias) ?? alias}::${name}`; - })() - : currentSymbol - ? `${currentSymbol}::${ref}` - : undefined; - const arr: Array<{ label: string; nested?: string; stepFunc?: string }> = [ - { label: `rule ${ref}`, stepFunc }, - ]; - if (s.catch) { - const steps = "single" in s.catch ? [s.catch.single] : s.catch.block; - for (const r of steps) { - arr.push(...stepToItems(r)); + if (body.kind === "ensure_call") { + const ref = body.callee.value; + const arr: Array<{ label: string; nested?: string; stepFunc?: string }> = [ + { label: `rule ${ref}`, stepFunc: refStepFunc(ref) }, + ]; + if (s.catch) { + const steps = "single" in s.catch ? [s.catch.single] : s.catch.block; + for (const r of steps) arr.push(...stepToItems(r)); } + return arr; } - return arr; - } - if (s.type === "prompt") { - return [{ label: formatPromptLabel(s.raw), stepFunc: "jaiph::prompt" }]; - } - if (s.type === "log") { - return [{ label: `ℹ ${s.message}` }]; + if (body.kind === "prompt") { + return [{ label: formatPromptLabel(body.raw), stepFunc: "jaiph::prompt" }]; + } + if (body.kind === "inline_script") { + return [{ label: "script (inline)" }]; + } + if (body.kind === "shell") { + const t = body.command.trim(); + const label = t.length > 56 ? `${t.slice(0, 53)}...` : t; + return [{ label: `$ ${label}` }]; + } + if (body.kind === "match") { + // standalone match — no nested rendering + return []; + } + return []; } - if (s.type === "logerr") { - return [{ label: `! ${s.message}` }]; + if (s.type === "say") { + const msg = exprLabel(s.message); + if (s.level === "log") return [{ label: `ℹ ${msg}` }]; + if (s.level === "logerr") return [{ label: `! ${msg}` }]; + return [{ label: `fail ${msg}` }]; } if (s.type === "send") { return [{ label: `${s.channel} <- send` }]; } - if (s.type === "fail") { - return [{ label: `fail ${s.message}` }]; - } if (s.type === "const") { const constItems: Array<{ label: string; nested?: string; stepFunc?: string }> = [ { label: `const ${s.name}` }, ]; - if (s.value.kind === "match_expr") { + if (s.value.kind === "match") { for (const arm of s.value.match.arms) { const body = arm.body.trimStart(); const runM = body.match(/^run\s+([A-Za-z_][A-Za-z0-9_.]*)\(/); @@ -154,19 +162,11 @@ export function collectWorkflowChildren( return constItems; } if (s.type === "return") { - return [{ label: `return ${s.value}` }]; + return [{ label: `return ${exprLabel(s.value)}` }]; } - if (s.type === "comment") { + if (s.type === "trivia") { return []; } - if (s.type === "run_inline_script") { - return [{ label: "script (inline)" }]; - } - if (s.type === "shell") { - const t = s.command.trim(); - const label = t.length > 56 ? `${t.slice(0, 53)}...` : t; - return [{ label: `$ ${label}` }]; - } return []; }; @@ -179,68 +179,7 @@ export function collectWorkflowChildren( } for (const step of workflow.steps) { - if (step.type === "ensure") { - items.push(...stepToItems(step)); - continue; - } - if (step.type === "run") { - const wf = step.workflow.value; - const asyncPrefix = step.async ? "async " : ""; - const stepFunc = - symbols && wf.includes(".") - ? (() => { - const dot = wf.indexOf("."); - const alias = wf.slice(0, dot); - const name = wf.slice(dot + 1); - return `${symbols.get(alias) ?? alias}::${name}`; - })() - : currentSymbol - ? `${currentSymbol}::${wf}` - : undefined; - items.push(...stepToItems(step)); - continue; - } - if (step.type === "run_inline_script") { - items.push({ label: "script (inline)" }); - continue; - } - if (step.type === "prompt") { - items.push({ label: formatPromptLabel(step.raw), stepFunc: "jaiph::prompt" }); - continue; - } - if (step.type === "log") { - items.push({ label: `ℹ ${step.message}` }); - continue; - } - if (step.type === "logerr") { - items.push({ label: `! ${step.message}` }); - continue; - } - if (step.type === "send") { - items.push({ label: `${step.channel} <- send` }); - continue; - } - if (step.type === "fail") { - items.push({ label: `fail ${step.message}` }); - continue; - } - if (step.type === "const") { - items.push(...stepToItems(step)); - continue; - } - if (step.type === "return") { - items.push({ label: `return ${step.value}` }); - continue; - } - if (step.type === "comment") { - continue; - } - if (step.type === "shell") { - const t = step.command.trim(); - const label = t.length > 56 ? `${t.slice(0, 53)}...` : t; - items.push({ label: `$ ${label}` }); - continue; - } + items.push(...stepToItems(step)); } return items; } diff --git a/src/format/emit.ts b/src/format/emit.ts index 66175e3a..80b73f5b 100644 --- a/src/format/emit.ts +++ b/src/format/emit.ts @@ -1,9 +1,8 @@ import type { Arg, + Expr, jaiphModule, WorkflowStepDef, - ConstRhs, - SendRhsDef, WorkflowDef, RuleDef, ScriptDef, @@ -22,12 +21,10 @@ export interface EmitOptions { const DEFAULT_OPTIONS: EmitOptions = { indent: 2 }; -/** Lookup helper: trivia entry for a node, with safe empty default. */ function tn(trivia: Trivia, node: object): NodeTrivia { return trivia.getNode(node) ?? {}; } -/** When `topLevelOrder` is missing (hand-built AST), match pre–source-order emit behavior. */ function legacyTopLevelOrder(mod: jaiphModule): TopLevelEmitOrder[] { const o: TopLevelEmitOrder[] = []; if (mod.envDecls) { @@ -53,7 +50,6 @@ export function emitModule( triviaOrOpts: Trivia | EmitOptions = createTrivia(), optsArg?: EmitOptions, ): string { - // Backwards-compatible: callers may pass (mod, opts) when they don't care about trivia. let trivia: Trivia; let opts: EmitOptions; if (triviaOrOpts instanceof Object && "indent" in triviaOrOpts && !("getModule" in triviaOrOpts)) { @@ -67,9 +63,6 @@ export function emitModule( const pad = " ".repeat(opts.indent); const modTrivia = trivia.getModule(); - // Shebang — we don't store it in the AST, so the caller must prepend it if needed. - // (handled by the format command reading the first line of the original source) - const importLines: string[] = []; if (mod.scriptImports) { for (const si of mod.scriptImports) { @@ -148,7 +141,6 @@ export function emitModule( return sections.join("\n\n") + "\n"; } -/** Emit lines for one `key = value` inside `config { }` (matches canonical value formatting). */ function emitConfigKeyLines(meta: WorkflowMetadata, key: string, pad: string): string[] { switch (key) { case "agent.default_model": @@ -179,8 +171,6 @@ function emitConfigKeyLines(meta: WorkflowMetadata, key: string, pad: string): s if (meta.run?.recoverLimit === undefined) return []; return [`${pad}run.recover_limit = ${meta.run.recoverLimit}`]; case "runtime.docker_enabled": - // runtime.docker_enabled was removed; skip silently for back-compat with - // any cached AST that still carries the key in configBodySequence. return []; case "runtime.docker_image": if (meta.runtime?.dockerImage === undefined) return []; @@ -248,7 +238,6 @@ function emitConfig(meta: WorkflowMetadata, pad: string, trivia: Trivia): string return lines.join("\n"); } -/** Top-level `const` RHS: bare slugs, JSON string, or triple-quoted when `"` / `\\` would break double-quote round-trip. */ function emitEnvDecl(env: EnvDeclDef): string[] { if (env.value.includes("\n")) { const lines = [`const ${env.name} = """`]; @@ -271,7 +260,6 @@ function emitComments(comments: string[]): string[] { return comments.map((c) => (c.startsWith("#") ? c : `# ${c}`)); } -/** One section string: consecutive `#` lines stay single-spaced (module sections join with blank lines). */ function emitCommentBlock(comments: string[]): string { return emitComments(comments).join("\n"); } @@ -334,9 +322,8 @@ function emitChannel(ch: ChannelDef): string { return `channel ${ch.name}`; } -/** `log` / `logerr` message: bare identifier form vs JSON-string form (matches parse storage). */ -function emitLogMessageRhs(message: string): string { - // Parser stores bare `log name` as the literal string `${name}` (interpolation sentinel). +/** Bare-identifier form for `log ` / `logerr `. */ +function emitLogLiteralRhs(message: string): string { if ( message.length >= 3 && message[0] === "$" && @@ -359,17 +346,11 @@ function emitSteps(steps: WorkflowStepDef[], pad: string, currentIndent: string, return lines; } -/** - * Render `Arg[]` back as comma-separated source form. Each `var` becomes the bare name - * and each `literal` is emitted as authored (already in source form, including nested - * `run …` / `ensure …` calls and inline-script bodies). - */ function formatArgs(args: Arg[] | undefined): string { if (!args || args.length === 0) return ""; return args.map((a) => (a.kind === "var" ? a.name : a.raw)).join(", "); } -/** Emit inline script form: `prefix \`body\`(args)` or fenced block. */ function emitInlineScriptLines( prefix: string, body: string, @@ -391,10 +372,7 @@ function emitInlineScriptLines( } function emitRef(ref: { value: string }, args: Arg[] | undefined): string { - if (args !== undefined) { - return `${ref.value}(${formatArgs(args)})`; - } - return `${ref.value}()`; + return `${ref.value}(${formatArgs(args)})`; } function emitMatchPattern(p: import("../types").MatchPatternDef): string { @@ -405,7 +383,6 @@ function emitMatchPattern(p: import("../types").MatchPatternDef): string { function emitMatchArm(arm: import("../types").MatchArmDef, armIndent: string, bodyIndent: string): string[] { const patStr = emitMatchPattern(arm.pattern); - // Multiline body (triple-quoted): body stored as "line1\nline2" with outer quotes and actual newlines. if (arm.body.startsWith('"') && arm.body.endsWith('"') && arm.body.includes("\n")) { const inner = arm.body.slice(1, -1).replace(/\\"/g, '"').replace(/\\\\/g, "\\"); const lines: string[] = [`${armIndent}${patStr} => """`]; @@ -418,54 +395,156 @@ function emitMatchArm(arm: import("../types").MatchArmDef, armIndent: string, bo return [`${armIndent}${patStr} => ${arm.body}`]; } +/** + * Emit an `Expr` as it would appear after a `=` / `<-` / `return` / `log` etc. + * Multi-line value forms (inline-script fenced bodies, triple-quoted literals, + * match arm blocks, triple-quoted prompts) return additional lines via the + * `tail` array so the caller can append them at the right indent level. + */ +function emitExprFirstLine( + expr: Expr, + trivia: Trivia, + ci: string, + pad: string, +): { head: string; tail: string[] } { + const valueTrivia = tn(trivia, expr); + if (expr.kind === "literal") { + if (valueTrivia.tripleQuoted) { + const inner = valueTrivia.rawBody ?? expr.raw.slice(1, -1).replace(/\\"/g, '"').replace(/\\\\/g, "\\"); + const tail: string[] = []; + for (const bl of inner.split("\n")) tail.push(bl); + tail.push(`${ci}"""`); + return { head: '"""', tail }; + } + if (valueTrivia.bareSource) { + return { head: valueTrivia.bareSource, tail: [] }; + } + return { head: expr.raw, tail: [] }; + } + if (expr.kind === "call") { + const asyncMod = expr.async ? "async " : ""; + return { head: `run ${asyncMod}${emitRef(expr.callee, expr.args)}`, tail: [] }; + } + if (expr.kind === "ensure_call") { + return { head: `ensure ${emitRef(expr.callee, expr.args)}`, tail: [] }; + } + if (expr.kind === "inline_script") { + if (expr.lang || expr.body.includes("\n")) { + const langTag = expr.lang ?? ""; + const tail: string[] = []; + for (const bl of expr.body.split("\n")) tail.push(bl); + tail.push(`${ci}\`\`\`(${formatArgs(expr.args)})`); + return { head: `run \`\`\`${langTag}`, tail }; + } + return { head: `run \`${expr.body}\`(${formatArgs(expr.args)})`, tail: [] }; + } + if (expr.kind === "prompt") { + const returns = expr.returns ? ` returns "${expr.returns}"` : ""; + if (valueTrivia.bodyKind === "identifier" && valueTrivia.bodyIdentifier) { + return { head: `prompt ${valueTrivia.bodyIdentifier}${returns}`, tail: [] }; + } + if (valueTrivia.bodyKind === "triple_quoted") { + const inner = valueTrivia.rawBody ?? expr.raw.slice(1, -1).replace(/\\"/g, '"').replace(/\\\\/g, "\\"); + const tail: string[] = []; + for (const bl of inner.split("\n")) tail.push(bl); + tail.push(`${ci}"""`); + if (expr.returns) { + tail.push(`${ci}returns "${expr.returns}"`); + } + return { head: 'prompt """', tail }; + } + return { head: `prompt ${expr.raw}${returns}`, tail: [] }; + } + if (expr.kind === "match") { + const tail: string[] = []; + for (const arm of expr.match.arms) { + tail.push(...emitMatchArm(arm, `${ci}${pad}`, ci)); + } + tail.push(`${ci}}`); + return { head: `match ${expr.match.subject} {`, tail }; + } + if (expr.kind === "shell") { + return { head: expr.command, tail: [] }; + } + // bare_ref + return { head: expr.ref.value, tail: [] }; +} + function emitStep(step: WorkflowStepDef, pad: string, currentIndent: string, trivia: Trivia): string[] { const lines: string[] = []; const ci = currentIndent; - const stepTrivia = tn(trivia, step); - switch (step.type) { - case "blank_line": + if (step.type === "trivia") { + if (step.kind === "blank_line") { lines.push(""); - break; - - case "comment": - lines.push(`${ci}${step.text}`); - break; + } else { + lines.push(`${ci}${step.text ?? ""}`); + } + return lines; + } - case "shell": { - if (step.captureName) { - lines.push(`${ci}${step.captureName} = ${step.command}`); + if (step.type === "say") { + const message = step.message; + if (step.level === "fail") { + // fail always takes a literal message; preserve triple-quoted form when present. + const msgTrivia = tn(trivia, message); + if (message.kind === "literal" && msgTrivia.tripleQuoted) { + const inner = msgTrivia.rawBody ?? message.raw.slice(1, -1).replace(/\\"/g, '"').replace(/\\\\/g, "\\"); + lines.push(`${ci}fail """`); + for (const bl of inner.split("\n")) lines.push(bl); + lines.push(`${ci}"""`); + } else if (message.kind === "literal") { + lines.push(`${ci}fail ${message.raw}`); } else { - lines.push(`${ci}${step.command}`); + const { head, tail } = emitExprFirstLine(message, trivia, ci, pad); + lines.push(`${ci}fail ${head}`); + lines.push(...tail); } - break; + return lines; } - - case "ensure": { - const ref = emitRef(step.ref, step.args); - const capture = step.captureName ? `${step.captureName} = ` : ""; - if (step.catch) { - const b = step.catch.bindings; - const bindStr = `(${b.failure})`; - if ("single" in step.catch) { - const recoverLines = emitStep(step.catch.single, pad, "", trivia); - const recoverText = recoverLines.map((l) => l.trim()).join("\n"); - lines.push(`${ci}${capture}ensure ${ref} catch ${bindStr} ${recoverText}`); - } else { - lines.push(`${ci}${capture}ensure ${ref} catch ${bindStr} {`); - lines.push(...emitSteps(step.catch.block, pad, ci + pad, trivia)); - lines.push(`${ci}}`); - } + const verb = step.level; + if (message.kind === "inline_script") { + lines.push(...emitInlineScriptLines(`${ci}${verb} run`, message.body, message.lang, message.args, ci)); + return lines; + } + if (message.kind === "literal") { + const msgTrivia = tn(trivia, message); + if (msgTrivia.tripleQuoted) { + const inner = msgTrivia.rawBody ?? message.raw; + lines.push(`${ci}${verb} """`); + for (const bl of inner.split("\n")) lines.push(bl); + lines.push(`${ci}"""`); } else { - lines.push(`${ci}${capture}ensure ${ref}`); + lines.push(`${ci}${verb} ${emitLogLiteralRhs(message.raw)}`); } - break; + return lines; } + // Fallback for any other Expr kind (shouldn't occur per validator). + const { head, tail } = emitExprFirstLine(message, trivia, ci, pad); + lines.push(`${ci}${verb} ${head}`); + lines.push(...tail); + return lines; + } - case "run": { - const ref = emitRef(step.workflow, step.args); - const capture = step.captureName ? `${step.captureName} = ` : ""; - const asyncPrefix = step.async ? "async " : ""; + if (step.type === "shell" as never) { + // Defensive: should never appear in the new AST (shell is an exec body kind). + return lines; + } + + if (step.type === "exec") { + const body = step.body; + if (body.kind === "shell") { + if (step.captureName) { + lines.push(`${ci}${step.captureName} = ${body.command}`); + } else { + lines.push(`${ci}${body.command}`); + } + return lines; + } + const capture = step.captureName ? `${step.captureName} = ` : ""; + if (body.kind === "call") { + const ref = emitRef(body.callee, body.args); + const asyncPrefix = body.async ? "async " : ""; if (step.recover) { const b = step.recover.bindings; const bindStr = `(${b.failure})`; @@ -493,263 +572,109 @@ function emitStep(step: WorkflowStepDef, pad: string, currentIndent: string, tri } else { lines.push(`${ci}${capture}run ${asyncPrefix}${ref}`); } - break; + return lines; } - - case "run_inline_script": { - const capture = step.captureName ? `${step.captureName} = ` : ""; - const argsStr = formatArgs(step.args); - if (step.lang || step.body.includes("\n")) { - const langTag = step.lang ?? ""; - lines.push(`${ci}${capture}run \`\`\`${langTag}`); - for (const bl of step.body.split("\n")) { - lines.push(bl); - } - lines.push(`${ci}\`\`\`(${argsStr})`); - } else { - lines.push(`${ci}${capture}run \`${step.body}\`(${argsStr})`); - } - break; - } - - case "prompt": { - const capture = step.captureName ? `${step.captureName} = ` : ""; - const returns = step.returns ? ` returns "${step.returns}"` : ""; - const bodyKind = stepTrivia.bodyKind; - const bodyIdentifier = stepTrivia.bodyIdentifier; - if (bodyKind === "identifier" && bodyIdentifier) { - lines.push(`${ci}${capture}prompt ${bodyIdentifier}${returns}`); - } else if (bodyKind === "triple_quoted") { - const inner = stepTrivia.rawBody ?? step.raw.slice(1, -1).replace(/\\"/g, '"').replace(/\\\\/g, "\\"); - lines.push(`${ci}${capture}prompt """`); - for (const bl of inner.split("\n")) { - lines.push(bl); - } - lines.push(`${ci}"""`); - if (step.returns) { - lines.push(`${ci}returns "${step.returns}"`); + if (body.kind === "ensure_call") { + const ref = emitRef(body.callee, body.args); + if (step.catch) { + const b = step.catch.bindings; + const bindStr = `(${b.failure})`; + if ("single" in step.catch) { + const recoverLines = emitStep(step.catch.single, pad, "", trivia); + const recoverText = recoverLines.map((l) => l.trim()).join("\n"); + lines.push(`${ci}${capture}ensure ${ref} catch ${bindStr} ${recoverText}`); + } else { + lines.push(`${ci}${capture}ensure ${ref} catch ${bindStr} {`); + lines.push(...emitSteps(step.catch.block, pad, ci + pad, trivia)); + lines.push(`${ci}}`); } } else { - lines.push(`${ci}${capture}prompt ${step.raw}${returns}`); + lines.push(`${ci}${capture}ensure ${ref}`); } - break; + return lines; } - - case "const": { - const valueTrivia = tn(trivia, step.value); - lines.push(`${ci}${emitConstStep(step.name, step.value, valueTrivia)}`); - // Handle multi-line inline script capture body - if (step.value.kind === "run_inline_script_capture" && - (step.value.lang || step.value.body.includes("\n"))) { - for (const bl of step.value.body.split("\n")) { - lines.push(bl); - } - const argsStr = formatArgs(step.value.args); + if (body.kind === "inline_script") { + const argsStr = formatArgs(body.args); + if (body.lang || body.body.includes("\n")) { + const langTag = body.lang ?? ""; + lines.push(`${ci}${capture}run \`\`\`${langTag}`); + for (const bl of body.body.split("\n")) lines.push(bl); lines.push(`${ci}\`\`\`(${argsStr})`); - } - // Handle multi-line triple-quoted prompt capture body - if (step.value.kind === "prompt_capture" && valueTrivia.bodyKind === "triple_quoted") { - const inner = valueTrivia.rawBody ?? step.value.raw.slice(1, -1).replace(/\\"/g, '"').replace(/\\\\/g, "\\"); - for (const bl of inner.split("\n")) { - lines.push(bl); - } - lines.push(`${ci}"""`); - if (step.value.returns) { - lines.push(`${ci}returns "${step.value.returns}"`); - } - } - // Handle match expression arms and closing brace - if (step.value.kind === "match_expr") { - for (const arm of step.value.match.arms) { - lines.push(...emitMatchArm(arm, `${ci}${pad}`, ci)); - } - lines.push(`${ci}}`); - } - // Handle multi-line triple-quoted expr (const name = """...""") - if (step.value.kind === "expr" && valueTrivia.tripleQuoted) { - const inner = valueTrivia.rawBody ?? step.value.bashRhs.slice(1, -1).replace(/\\"/g, '"').replace(/\\\\/g, "\\"); - for (const bl of inner.split("\n")) { - lines.push(bl); - } - lines.push(`${ci}"""`); - } - break; - } - - case "fail": { - if (stepTrivia.tripleQuoted) { - const inner = stepTrivia.rawBody ?? step.message.slice(1, -1).replace(/\\"/g, '"').replace(/\\\\/g, "\\"); - lines.push(`${ci}fail """`); - for (const bl of inner.split("\n")) { - lines.push(bl); - } - lines.push(`${ci}"""`); - } else { - lines.push(`${ci}fail ${step.message}`); - } - break; - } - - case "log": - if (step.managed?.kind === "run_inline_script") { - lines.push(...emitInlineScriptLines(`${ci}log run`, step.managed.body, step.managed.lang, step.managed.args, ci)); - } else if (stepTrivia.tripleQuoted) { - const inner = stepTrivia.rawBody ?? step.message; - lines.push(`${ci}log """`); - for (const bl of inner.split("\n")) { - lines.push(bl); - } - lines.push(`${ci}"""`); - } else { - lines.push(`${ci}log ${emitLogMessageRhs(step.message)}`); - } - break; - - case "logerr": - if (step.managed?.kind === "run_inline_script") { - lines.push(...emitInlineScriptLines(`${ci}logerr run`, step.managed.body, step.managed.lang, step.managed.args, ci)); - } else if (stepTrivia.tripleQuoted) { - const inner = stepTrivia.rawBody ?? step.message; - lines.push(`${ci}logerr """`); - for (const bl of inner.split("\n")) { - lines.push(bl); - } - lines.push(`${ci}"""`); } else { - lines.push(`${ci}logerr ${emitLogMessageRhs(step.message)}`); + lines.push(`${ci}${capture}run \`${body.body}\`(${argsStr})`); } - break; - - case "return": { - if (step.managed) { - if (step.managed.kind === "run") { - lines.push(`${ci}return run ${emitRef(step.managed.ref, step.managed.args)}`); - } else if (step.managed.kind === "ensure") { - lines.push(`${ci}return ensure ${emitRef(step.managed.ref, step.managed.args)}`); - } else if (step.managed.kind === "match") { - lines.push(`${ci}return match ${step.managed.match.subject} {`); - for (const arm of step.managed.match.arms) { - lines.push(...emitMatchArm(arm, `${ci}${pad}`, ci)); - } - lines.push(`${ci}}`); - } else if (step.managed.kind === "run_inline_script") { - lines.push(...emitInlineScriptLines(`${ci}return run`, step.managed.body, step.managed.lang, step.managed.args, ci)); - } - } else if (stepTrivia.bareSource) { - lines.push(`${ci}return ${stepTrivia.bareSource}`); - } else if (stepTrivia.tripleQuoted) { - const inner = stepTrivia.rawBody ?? step.value.slice(1, -1).replace(/\\"/g, '"').replace(/\\\\/g, "\\"); - lines.push(`${ci}return """`); - for (const bl of inner.split("\n")) { - lines.push(bl); - } - lines.push(`${ci}"""`); - } else { - lines.push(`${ci}return ${step.value}`); - } - break; + return lines; } - - case "send": { - const rhsTrivia = tn(trivia, step.rhs); - if (step.rhs.kind === "literal" && rhsTrivia.tripleQuoted) { - const inner = rhsTrivia.rawBody ?? step.rhs.token.slice(1, -1).replace(/\\"/g, '"').replace(/\\\\/g, "\\"); - lines.push(`${ci}${step.channel} <- """`); - for (const bl of inner.split("\n")) { - lines.push(bl); - } + if (body.kind === "prompt") { + const bodyTrivia = tn(trivia, body); + const returns = body.returns ? ` returns "${body.returns}"` : ""; + if (bodyTrivia.bodyKind === "identifier" && bodyTrivia.bodyIdentifier) { + lines.push(`${ci}${capture}prompt ${bodyTrivia.bodyIdentifier}${returns}`); + } else if (bodyTrivia.bodyKind === "triple_quoted") { + const inner = bodyTrivia.rawBody ?? body.raw.slice(1, -1).replace(/\\"/g, '"').replace(/\\\\/g, "\\"); + lines.push(`${ci}${capture}prompt """`); + for (const bl of inner.split("\n")) lines.push(bl); lines.push(`${ci}"""`); + if (body.returns) lines.push(`${ci}returns "${body.returns}"`); } else { - const rhs = emitSendRhs(step.rhs); - lines.push(`${ci}${step.channel} <- ${rhs}`); + lines.push(`${ci}${capture}prompt ${body.raw}${returns}`); } - break; + return lines; } - - - case "match": { - lines.push(`${ci}match ${step.expr.subject} {`); - for (const arm of step.expr.arms) { + if (body.kind === "match") { + lines.push(`${ci}${capture}match ${body.match.subject} {`); + for (const arm of body.match.arms) { lines.push(...emitMatchArm(arm, `${ci}${pad}`, ci)); } lines.push(`${ci}}`); - break; + return lines; } + // bare_ref / literal — not valid as exec body, but handle defensively. + const { head, tail } = emitExprFirstLine(body, trivia, ci, pad); + lines.push(`${ci}${capture}${head}`); + lines.push(...tail); + return lines; + } - case "if": { - const operandStr = step.operand.kind === "string_literal" - ? `"${step.operand.value}"` - : `/${step.operand.source}/`; - lines.push(`${ci}if ${step.subject} ${step.operator} ${operandStr} {`); - lines.push(...emitSteps(step.body, pad, ci + pad, trivia)); - lines.push(`${ci}}`); - break; - } + if (step.type === "const") { + const { head, tail } = emitExprFirstLine(step.value, trivia, ci, pad); + lines.push(`${ci}const ${step.name} = ${head}`); + lines.push(...tail); + return lines; + } - case "for_lines": { - lines.push(`${ci}for ${step.iterVar} in ${step.sourceVar} {`); - lines.push(...emitSteps(step.body, pad, ci + pad, trivia)); - lines.push(`${ci}}`); - break; - } + if (step.type === "return") { + const { head, tail } = emitExprFirstLine(step.value, trivia, ci, pad); + lines.push(`${ci}return ${head}`); + lines.push(...tail); + return lines; } - return lines; -} + if (step.type === "send") { + const { head, tail } = emitExprFirstLine(step.value, trivia, ci, pad); + lines.push(`${ci}${step.channel} <- ${head}`); + lines.push(...tail); + return lines; + } -function emitConstStep(name: string, value: ConstRhs, valueTrivia: NodeTrivia): string { - switch (value.kind) { - case "expr": - if (valueTrivia.tripleQuoted) { - // Multi-line: caller handles remaining lines - return `const ${name} = """`; - } - return `const ${name} = ${value.bashRhs}`; - case "run_capture": { - const asyncMod = value.async ? "async " : ""; - return `const ${name} = run ${asyncMod}${emitRef(value.ref, value.args)}`; - } - case "ensure_capture": - return `const ${name} = ensure ${emitRef(value.ref, value.args)}`; - case "prompt_capture": { - const returns = value.returns ? ` returns "${value.returns}"` : ""; - if (valueTrivia.bodyKind === "identifier" && valueTrivia.bodyIdentifier) { - return `const ${name} = prompt ${valueTrivia.bodyIdentifier}${returns}`; - } - if (valueTrivia.bodyKind === "triple_quoted") { - // Multi-line: caller handles remaining lines - return `const ${name} = prompt """`; - } - return `const ${name} = prompt ${value.raw}${returns}`; - } - case "match_expr": { - // Multi-line format; return first line (const assignment opens the block) - return `const ${name} = match ${value.match.subject} {`; - } - case "run_inline_script_capture": { - const argsStr = formatArgs(value.args); - if (value.lang || value.body.includes("\n")) { - const langTag = value.lang ?? ""; - return `const ${name} = run \`\`\`${langTag}`; - } - return `const ${name} = run \`${value.body}\`(${argsStr})`; - } + if (step.type === "if") { + const operandStr = step.operand.kind === "string_literal" + ? `"${step.operand.value}"` + : `/${step.operand.source}/`; + lines.push(`${ci}if ${step.subject} ${step.operator} ${operandStr} {`); + lines.push(...emitSteps(step.body, pad, ci + pad, trivia)); + lines.push(`${ci}}`); + return lines; } -} -function emitSendRhs(rhs: SendRhsDef): string { - switch (rhs.kind) { - case "literal": - return rhs.token; - case "var": - return rhs.bash; - case "run": - return `run ${emitRef(rhs.ref, rhs.args)}`; - case "bare_ref": - return rhs.ref.value; - case "shell": - return rhs.command; + if (step.type === "for_lines") { + lines.push(`${ci}for ${step.iterVar} in ${step.sourceVar} {`); + lines.push(...emitSteps(step.body, pad, ci + pad, trivia)); + lines.push(`${ci}}`); + return lines; } + + return lines; } function emitTestBlock(test: TestBlockDef, pad: string, trivia: Trivia): string { diff --git a/src/parse/arg-ast-shape.test.ts b/src/parse/arg-ast-shape.test.ts index 77103ba6..3ce31de2 100644 --- a/src/parse/arg-ast-shape.test.ts +++ b/src/parse/arg-ast-shape.test.ts @@ -1,57 +1,62 @@ import test from "node:test"; import assert from "node:assert/strict"; -import type { ConstRhs, SendRhsDef, WorkflowStepDef } from "../types"; +import type { Expr, WorkflowStepDef } from "../types"; /** - * AC1: `bareIdentifierArgs` must not appear on any call-bearing AST node. + * AC1 (Refactor 3): `bareIdentifierArgs` must not appear on any call-bearing + * AST node, and the three "managed call that yields a value" encodings + * — `managed:` sidecar / `run_capture` const RHS / placeholder strings + * — have been replaced by a single `Expr` shape that carries `args: Arg[]`. * - * Each helper below probes a specific variant where the field used to live; if - * it is re-added, `HasField` widens to `true`, the type-level assertion fails, - * and TypeScript breaks compilation. + * Each helper below probes a specific Expr variant where the field used to + * live; if it is re-added, `HasField` widens to `true`, the type-level + * assertion fails, and TypeScript breaks compilation. */ type HasField = T extends Record ? true : false; -type EnsureStep = Extract; -type RunStep = Extract; -type RunInlineScriptStep = Extract; -type LogStep = Extract; -type LogerrStep = Extract; +type ExecStep = Extract; type ReturnStep = Extract; -type LogManaged = NonNullable; -type LogerrManaged = NonNullable; -type ReturnManaged = NonNullable; -type ReturnManagedRun = Extract; -type ReturnManagedEnsure = Extract; -type ReturnManagedInline = Extract; -type RunCapture = Extract; -type EnsureCapture = Extract; -type InlineScriptCapture = Extract; -type SendRun = Extract; +type SayStep = Extract; +type SendStep = Extract; +type ConstStep = Extract; -const _ensureNoBare: HasField = false; -const _runNoBare: HasField = false; -const _inlineNoBare: HasField = false; -const _logManagedNoBare: HasField = false; -const _logerrManagedNoBare: HasField = false; -const _returnManagedRunNoBare: HasField = false; -const _returnManagedEnsureNoBare: HasField = false; -const _returnManagedInlineNoBare: HasField = false; -const _runCaptureNoBare: HasField = false; -const _ensureCaptureNoBare: HasField = false; -const _inlineCaptureNoBare: HasField = false; -const _sendRunNoBare: HasField = false; +type CallExpr = Extract; +type EnsureCallExpr = Extract; +type InlineScriptExpr = Extract; +type PromptExpr = Extract; +type SendRunExpr = SendStep["value"]; +type ConstValueExpr = ConstStep["value"]; -test("AC1: bareIdentifierArgs does not appear on any call-bearing AST type", () => { - assert.equal(_ensureNoBare, false); - assert.equal(_runNoBare, false); +const _callNoBare: HasField = false; +const _ensureCallNoBare: HasField = false; +const _inlineNoBare: HasField = false; +const _promptNoBare: HasField = false; +const _sendValueNoBare: HasField = false; +const _constValueNoBare: HasField = false; + +// Managed sidecar / placeholder strings on return/log/logerr/etc. are gone: +const _returnNoManaged: HasField = false; +const _sayNoManaged: HasField = false; +const _execNoManaged: HasField = false; + +// return.value is now an Expr (not a placeholder string). +const _returnValueIsExpr: ReturnStep["value"] extends Expr ? true : false = true; +const _sayMessageIsExpr: SayStep["message"] extends Expr ? true : false = true; +const _sendValueIsExpr: SendStep["value"] extends Expr ? true : false = true; +const _constValueIsExpr: ConstStep["value"] extends Expr ? true : false = true; + +test("AC1: managed-call encodings collapsed into Expr; no `bareIdentifierArgs` on Expr", () => { + assert.equal(_callNoBare, false); + assert.equal(_ensureCallNoBare, false); assert.equal(_inlineNoBare, false); - assert.equal(_logManagedNoBare, false); - assert.equal(_logerrManagedNoBare, false); - assert.equal(_returnManagedRunNoBare, false); - assert.equal(_returnManagedEnsureNoBare, false); - assert.equal(_returnManagedInlineNoBare, false); - assert.equal(_runCaptureNoBare, false); - assert.equal(_ensureCaptureNoBare, false); - assert.equal(_inlineCaptureNoBare, false); - assert.equal(_sendRunNoBare, false); + assert.equal(_promptNoBare, false); + assert.equal(_sendValueNoBare, false); + assert.equal(_constValueNoBare, false); + assert.equal(_returnNoManaged, false); + assert.equal(_sayNoManaged, false); + assert.equal(_execNoManaged, false); + assert.equal(_returnValueIsExpr, true); + assert.equal(_sayMessageIsExpr, true); + assert.equal(_sendValueIsExpr, true); + assert.equal(_constValueIsExpr, true); }); diff --git a/src/parse/const-rhs.ts b/src/parse/const-rhs.ts index 14e97d97..19e7300e 100644 --- a/src/parse/const-rhs.ts +++ b/src/parse/const-rhs.ts @@ -1,4 +1,4 @@ -import type { ConstRhs, RuleRefDef, WorkflowRefDef } from "../types"; +import type { Expr, RuleRefDef, WorkflowRefDef } from "../types"; import { createTrivia, type Trivia } from "./trivia"; import { fail, parseCallRef, rejectTrailingContent } from "./core"; import { dedentTripleQuotedBody, parseTripleQuoteBlock, tripleQuoteBodyToRaw } from "./triple-quote"; @@ -49,6 +49,7 @@ export function validateConstBashExpr(filePath: string, expr: string, lineNo: nu /** * Parse RHS after `const name = ` (trimmed). `forRule` disallows prompt capture. + * Returns an `Expr` node — the typed value-form that replaces the legacy `ConstRhs` union. */ export function parseConstRhs( filePath: string, @@ -60,7 +61,7 @@ export function parseConstRhs( forRule: boolean, constName: string, trivia: Trivia = createTrivia(), -): { value: ConstRhs; nextLineIdx: number } { +): { value: Expr; nextLineIdx: number } { const head = rhs.trimStart(); if (head.startsWith("prompt ")) { if (forRule) { @@ -71,24 +72,22 @@ export function parseConstRhs( const promptArg = rhs.slice(rhs.indexOf("prompt") + "prompt".length).trimStart(); const result = parsePromptStep(filePath, lines, lineIdx, promptArg, promptCol, constName, trivia); const st = result.step; - if (st.type !== "prompt" || st.captureName !== constName) { + if (st.type !== "exec" || st.body.kind !== "prompt" || st.captureName !== constName) { + fail(filePath, "const ... = prompt internal parse error", lineNo, col); + } + const promptBody = st.body; + if (promptBody.kind !== "prompt") { fail(filePath, "const ... = prompt internal parse error", lineNo, col); } const promptTrivia = trivia.getNode(st); - const value: ConstRhs = { - kind: "prompt_capture", - raw: st.raw, - loc: st.loc, - returns: st.returns, - }; if (promptTrivia) { - trivia.setNode(value, { + trivia.setNode(promptBody, { ...(promptTrivia.bodyKind ? { bodyKind: promptTrivia.bodyKind } : {}), ...(promptTrivia.bodyIdentifier ? { bodyIdentifier: promptTrivia.bodyIdentifier } : {}), ...(promptTrivia.rawBody !== undefined ? { rawBody: promptTrivia.rawBody } : {}), }); } - return { value, nextLineIdx: result.nextLineIdx }; + return { value: promptBody, nextLineIdx: result.nextLineIdx }; } if (head.startsWith("run ")) { const rest = head.slice("run ".length).trim(); @@ -103,12 +102,9 @@ export function parseConstRhs( fail(filePath, "const ... = run async must target a valid reference", lineNo, col); } rejectTrailingContent(filePath, lineNo, "run async", call.rest); - const ref: WorkflowRefDef = { value: call.ref, loc: { line: lineNo, col } }; + const callee: WorkflowRefDef = { value: call.ref, loc: { line: lineNo, col } }; return { - value: { - kind: "run_capture", ref, args: call.args, - async: true, - }, + value: { kind: "call", callee, args: call.args, async: true }, nextLineIdx: lineIdx, }; } @@ -116,7 +112,7 @@ export function parseConstRhs( const result = parseAnonymousInlineScript(filePath, lines, lineIdx, rest, lineNo, col); return { value: { - kind: "run_inline_script_capture", + kind: "inline_script", body: result.body, ...(result.lang ? { lang: result.lang } : {}), args: result.args, @@ -132,11 +128,9 @@ export function parseConstRhs( fail(filePath, "const ... = run must target a valid reference", lineNo, col); } rejectTrailingContent(filePath, lineNo, "run", call.rest); - const ref: WorkflowRefDef = { value: call.ref, loc: { line: lineNo, col } }; + const callee: WorkflowRefDef = { value: call.ref, loc: { line: lineNo, col } }; return { - value: { - kind: "run_capture", ref, args: call.args, - }, + value: { kind: "call", callee, args: call.args }, nextLineIdx: lineIdx, }; } @@ -149,11 +143,9 @@ export function parseConstRhs( if (call.rest.trim()) { fail(filePath, "const ... = ensure cannot use catch", lineNo, col); } - const ref: RuleRefDef = { value: call.ref, loc: { line: lineNo, col } }; + const callee: RuleRefDef = { value: call.ref, loc: { line: lineNo, col } }; return { - value: { - kind: "ensure_capture", ref, args: call.args, - }, + value: { kind: "ensure_call", callee, args: call.args }, nextLineIdx: lineIdx, }; } @@ -162,7 +154,7 @@ export function parseConstRhs( if (constMatchHead) { const subject = constMatchHead[1].trim(); const { expr, nextIndex } = parseMatchExpr(filePath, lines, lineIdx, subject, { line: lineNo, col }); - return { value: { kind: "match_expr", match: expr }, nextLineIdx: nextIndex - 1 }; + return { value: { kind: "match", match: expr }, nextLineIdx: nextIndex - 1 }; } // const name = """...""" if (head.startsWith('"""')) { @@ -170,7 +162,7 @@ export function parseConstRhs( tqLines[lineIdx] = head; const { body, nextIdx, afterClose } = parseTripleQuoteBlock(filePath, tqLines, lineIdx); if (afterClose) fail(filePath, 'unexpected content after closing """', nextIdx); - const value: ConstRhs = { kind: "expr", bashRhs: tripleQuoteBodyToRaw(dedentTripleQuotedBody(body)) }; + const value: Expr = { kind: "literal", raw: tripleQuoteBodyToRaw(dedentTripleQuotedBody(body)) }; trivia.setNode(value, { tripleQuoted: true, rawBody: body }); return { value, nextLineIdx: nextIdx - 1 }; } @@ -186,10 +178,10 @@ export function parseConstRhs( validateConstBashExpr(filePath, head, lineNo, col); const isBareDotted = isBareDottedIdentifierReturn(head); const isBare = !isBareDotted && isBareIdentifierReturn(head); - const bashRhs = isBareDotted + const raw = isBareDotted ? dottedReturnToQuotedString(head) : isBare ? bareIdentifierToQuotedString(head) : head; - return { value: { kind: "expr", bashRhs }, nextLineIdx: lineIdx }; + return { value: { kind: "literal", raw }, nextLineIdx: lineIdx }; } diff --git a/src/parse/core.ts b/src/parse/core.ts index 5f405b6e..54c6ba71 100644 --- a/src/parse/core.ts +++ b/src/parse/core.ts @@ -211,19 +211,6 @@ export function argsToRuntimeString(args: Arg[] | undefined): string { return args.map((a) => (a.kind === "var" ? `\${${a.name}}` : a.raw)).join(" "); } -/** - * Convert `Arg[]` back to comma-separated source form: - * - `var` → name (bare) - * - `literal` → raw as authored - * - * Used to populate the placeholder `value` string on managed - * `return run …` / `return ensure …` steps. Empty / undefined → empty string. - */ -export function argsToSourceForm(args: Arg[] | undefined): string { - if (!args || args.length === 0) return ""; - return args.map((a) => (a.kind === "var" ? a.name : a.raw)).join(", "); -} - /** * Parse a call expression `ref(args)` or `ref()` from a string. * Returns the ref, optional typed `Arg[]`, and the rest of the string after `)`. diff --git a/src/parse/parse-bare-call.test.ts b/src/parse/parse-bare-call.test.ts index 75e89ee6..3209e485 100644 --- a/src/parse/parse-bare-call.test.ts +++ b/src/parse/parse-bare-call.test.ts @@ -24,10 +24,10 @@ test("run with args and parens still works", () => { "test.jh", ); const step = mod.workflows[0].steps[0]; - assert.equal(step.type, "run"); - if (step.type === "run") { - assert.equal(step.workflow.value, "deploy"); - assert.deepEqual(step.args, [ + assert.equal(step.type, "exec"); + if (step.type === "exec" && step.body.kind === "call") { + assert.equal(step.body.callee.value, "deploy"); + assert.deepEqual(step.body.args, [ { kind: "literal", raw: '"prod"' }, { kind: "literal", raw: '"v1"' }, ]); @@ -86,32 +86,38 @@ test("const x = ensure bare identifier is rejected — parentheses required", () // === return run/ensure bare identifier (no parens) now falls through === -test("return run bare identifier does not parse as managed return", () => { +test("return run bare identifier falls through to exec/shell", () => { // Without parens, "return run helper" is not recognized as a managed return - // and falls through to a shell step + // and falls through to a shell exec step const mod = parsejaiph( `workflow default() {\n return run helper\n}`, "test.jh", ); const step = mod.workflows[0].steps[0]; - assert.equal(step.type, "shell"); + assert.equal(step.type, "exec"); + if (step.type === "exec") { + assert.equal(step.body.kind, "shell"); + } }); -test("return ensure bare identifier does not parse as managed return", () => { +test("return ensure bare identifier falls through to exec/shell", () => { // Without parens, "return ensure check" is not recognized as a managed return - // and falls through to a shell step + // and falls through to a shell exec step const mod = parsejaiph( `rule check() {\n return "ok"\n}\nworkflow default() {\n return ensure check\n}`, "test.jh", ); const step = mod.workflows[0].steps[0]; - assert.equal(step.type, "shell"); + assert.equal(step.type, "exec"); + if (step.type === "exec") { + assert.equal(step.body.kind, "shell"); + } }); // === send RHS with bare identifier (no parens) === -test("channel <- run bare identifier does not parse as send with run RHS", () => { - // Without parens, the send RHS falls through to shell kind +test("channel <- run bare identifier does not parse as send with call value", () => { + // Without parens, the send RHS falls through to Expr.shell const mod = parsejaiph( [ "channel alerts", @@ -125,8 +131,7 @@ test("channel <- run bare identifier does not parse as send with run RHS", () => assert.equal(step.type, "send"); if (step.type === "send") { assert.equal(step.channel, "alerts"); - // Without parens, parseCallRef returns null, so it falls through to shell kind - assert.equal(step.rhs.kind, "shell"); + assert.equal(step.value.kind, "shell"); } }); diff --git a/src/parse/parse-const-rhs.test.ts b/src/parse/parse-const-rhs.test.ts index 411a2269..2e66723b 100644 --- a/src/parse/parse-const-rhs.test.ts +++ b/src/parse/parse-const-rhs.test.ts @@ -91,44 +91,44 @@ test("validateConstBashExpr: rejects ${var:?message} fallback", () => { // === parseConstRhs === -test("parseConstRhs: parses bash expression", () => { +test("parseConstRhs: parses literal expression", () => { const result = parseConstRhs("test.jh", ['const x = "hello"'], 0, '"hello"', 1, 1, false, "x"); - assert.equal(result.value.kind, "expr"); - if (result.value.kind === "expr") { - assert.equal(result.value.bashRhs, '"hello"'); + assert.equal(result.value.kind, "literal"); + if (result.value.kind === "literal") { + assert.equal(result.value.raw, '"hello"'); } assert.equal(result.nextLineIdx, 0); }); -test("parseConstRhs: bare identifier is sugar for interpolated string", () => { +test("parseConstRhs: bare identifier is sugar for interpolated literal", () => { const result = parseConstRhs("test.jh", ["const x = response"], 0, "response", 1, 1, false, "x"); - assert.equal(result.value.kind, "expr"); - if (result.value.kind === "expr") { - assert.equal(result.value.bashRhs, '"${response}"'); + assert.equal(result.value.kind, "literal"); + if (result.value.kind === "literal") { + assert.equal(result.value.raw, '"${response}"'); } }); -test("parseConstRhs: bare dotted identifier is sugar for interpolated string", () => { +test("parseConstRhs: bare dotted identifier is sugar for interpolated literal", () => { const result = parseConstRhs("test.jh", ["const x = response.message"], 0, "response.message", 1, 1, false, "x"); - assert.equal(result.value.kind, "expr"); - if (result.value.kind === "expr") { - assert.equal(result.value.bashRhs, '"${response.message}"'); + assert.equal(result.value.kind, "literal"); + if (result.value.kind === "literal") { + assert.equal(result.value.raw, '"${response.message}"'); } }); -test("parseConstRhs: parses run capture", () => { +test("parseConstRhs: parses run capture as Expr.call", () => { const result = parseConstRhs("test.jh", ["const x = run my_script()"], 0, "run my_script()", 1, 1, false, "x"); - assert.equal(result.value.kind, "run_capture"); - if (result.value.kind === "run_capture") { - assert.equal(result.value.ref.value, "my_script"); + assert.equal(result.value.kind, "call"); + if (result.value.kind === "call") { + assert.equal(result.value.callee.value, "my_script"); } }); -test("parseConstRhs: parses run capture with args", () => { +test("parseConstRhs: parses run capture with args as Expr.call", () => { const result = parseConstRhs("test.jh", ['const x = run my_script("arg")'], 0, 'run my_script("arg")', 1, 1, false, "x"); - assert.equal(result.value.kind, "run_capture"); - if (result.value.kind === "run_capture") { - assert.equal(result.value.ref.value, "my_script"); + assert.equal(result.value.kind, "call"); + if (result.value.kind === "call") { + assert.equal(result.value.callee.value, "my_script"); assert.deepEqual(result.value.args, [{ kind: "literal", raw: '"arg"' }]); } }); @@ -140,11 +140,11 @@ test("parseConstRhs: run without parens rejects (parens required)", () => { ); }); -test("parseConstRhs: parses ensure capture", () => { +test("parseConstRhs: parses ensure capture as Expr.ensure_call", () => { const result = parseConstRhs("test.jh", ["const x = ensure my_rule()"], 0, "ensure my_rule()", 1, 1, false, "x"); - assert.equal(result.value.kind, "ensure_capture"); - if (result.value.kind === "ensure_capture") { - assert.equal(result.value.ref.value, "my_rule"); + assert.equal(result.value.kind, "ensure_call"); + if (result.value.kind === "ensure_call") { + assert.equal(result.value.callee.value, "my_rule"); } }); @@ -176,11 +176,11 @@ test("parseConstRhs: bare call without run suggests fix", () => { ); }); -test("parseConstRhs: parses prompt capture in workflow", () => { +test("parseConstRhs: parses prompt capture as Expr.prompt", () => { const lines = [' const x = prompt "What is your name?"']; const result = parseConstRhs("test.jh", lines, 0, 'prompt "What is your name?"', 1, 1, false, "x"); - assert.equal(result.value.kind, "prompt_capture"); - if (result.value.kind === "prompt_capture") { + assert.equal(result.value.kind, "prompt"); + if (result.value.kind === "prompt") { assert.equal(result.value.raw, '"What is your name?"'); } }); diff --git a/src/parse/parse-definitions.test.ts b/src/parse/parse-definitions.test.ts index bc436efa..ecf0e4dc 100644 --- a/src/parse/parse-definitions.test.ts +++ b/src/parse/parse-definitions.test.ts @@ -205,13 +205,20 @@ test("reserved keyword as parameter name is rejected", () => { ); }); -test("log accepts a bare identifier (stored as interpolation)", () => { +test("log accepts a bare identifier (stored as interpolation Expr.literal)", () => { const mod = parsejaiph( ["workflow w() {", " log msg", "}", ""].join("\n"), "test.jh", ); - assert.equal(mod.workflows[0].steps[0].type, "log"); - assert.equal((mod.workflows[0].steps[0] as { message: string }).message, "${msg}"); + const step = mod.workflows[0].steps[0]; + assert.equal(step.type, "say"); + if (step.type === "say") { + assert.equal(step.level, "log"); + assert.equal(step.message.kind, "literal"); + if (step.message.kind === "literal") { + assert.equal(step.message.raw, "${msg}"); + } + } }); // === import script === diff --git a/src/parse/parse-inline-script.test.ts b/src/parse/parse-inline-script.test.ts index f6308c5b..474eba75 100644 --- a/src/parse/parse-inline-script.test.ts +++ b/src/parse/parse-inline-script.test.ts @@ -11,11 +11,11 @@ workflow default() { const ast = parsejaiph(src, "test.jh"); assert.equal(ast.workflows.length, 1); const step = ast.workflows[0].steps[0]; - assert.equal(step.type, "run_inline_script"); - if (step.type === "run_inline_script") { - assert.equal(step.body, "echo hello"); - assert.equal(step.lang, undefined); - assert.equal(step.args, undefined); + assert.equal(step.type, "exec"); + if (step.type === "exec" && step.body.kind === "inline_script") { + assert.equal(step.body.body, "echo hello"); + assert.equal(step.body.lang, undefined); + assert.equal(step.body.args, undefined); assert.equal(step.captureName, undefined); } }); @@ -28,10 +28,10 @@ workflow default() { `; const ast = parsejaiph(src, "test.jh"); const step = ast.workflows[0].steps[0]; - assert.equal(step.type, "run_inline_script"); - if (step.type === "run_inline_script") { - assert.equal(step.body, "echo $1"); - assert.deepEqual(step.args, [ + assert.equal(step.type, "exec"); + if (step.type === "exec" && step.body.kind === "inline_script") { + assert.equal(step.body.body, "echo $1"); + assert.deepEqual(step.body.args, [ { kind: "literal", raw: '"arg1"' }, { kind: "literal", raw: '"arg2"' }, ]); @@ -56,11 +56,8 @@ workflow default() { const ast = parsejaiph(src, "test.jh"); const step = ast.workflows[0].steps[0]; assert.equal(step.type, "const"); - if (step.type === "const") { - assert.equal(step.value.kind, "run_inline_script_capture"); - if (step.value.kind === "run_inline_script_capture") { - assert.equal(step.value.body, "echo hello"); - } + if (step.type === "const" && step.value.kind === "inline_script") { + assert.equal(step.value.body, "echo hello"); } }); @@ -74,10 +71,10 @@ test("parser: run script() with fenced block and lang tag", () => { ].join("\n"); const ast = parsejaiph(src, "test.jh"); const step = ast.workflows[0].steps[0]; - assert.equal(step.type, "run_inline_script"); - if (step.type === "run_inline_script") { - assert.equal(step.lang, "python3"); - assert.equal(step.body, "print('hello')"); + assert.equal(step.type, "exec"); + if (step.type === "exec" && step.body.kind === "inline_script") { + assert.equal(step.body.lang, "python3"); + assert.equal(step.body.body, "print('hello')"); } }); @@ -107,10 +104,10 @@ test("parser: rule body supports multiline fenced run ```", () => { const ast = parsejaiph(src, "test.jh"); assert.equal(ast.rules.length, 1); const step = ast.rules[0].steps[0]; - assert.equal(step.type, "run_inline_script"); - if (step.type === "run_inline_script") { - assert.ok(step.body.includes('if [ -z "$1" ]')); - assert.deepEqual(step.args, [{ kind: "var", name: "name" }]); + assert.equal(step.type, "exec"); + if (step.type === "exec" && step.body.kind === "inline_script") { + assert.ok(step.body.body.includes('if [ -z "$1" ]')); + assert.deepEqual(step.body.args, [{ kind: "var", name: "name" }]); } }); diff --git a/src/parse/parse-metadata.test.ts b/src/parse/parse-metadata.test.ts index 45a9a438..107114bb 100644 --- a/src/parse/parse-metadata.test.ts +++ b/src/parse/parse-metadata.test.ts @@ -272,7 +272,7 @@ test("workflow config: parses config inside workflow", () => { const mod = parsejaiph(src, "test.jh"); assert.equal(mod.workflows[0].metadata?.agent?.backend, "claude"); assert.equal(mod.workflows[0].steps.length, 1); - assert.equal(mod.workflows[0].steps[0].type, "log"); + assert.equal(mod.workflows[0].steps[0].type, "say"); }); test("workflow config: allows comments before config", () => { diff --git a/src/parse/parse-prompt.test.ts b/src/parse/parse-prompt.test.ts index 3ef93cbd..6b2ce9fd 100644 --- a/src/parse/parse-prompt.test.ts +++ b/src/parse/parse-prompt.test.ts @@ -5,39 +5,50 @@ import { createTrivia } from "./trivia"; const trivia = createTrivia(); +/** + * `parsePromptStep` now returns an `exec` step whose `body` is an `Expr.prompt`. + * The bodyKind / bodyIdentifier / rawBody trivia hangs off that inner Expr. + */ +function unwrapPrompt(step: import("../types").WorkflowStepDef): import("../types").Expr & { kind: "prompt" } { + if (step.type !== "exec" || step.body.kind !== "prompt") { + throw new Error(`expected exec step with prompt body, got ${step.type}`); + } + return step.body; +} + // === parsePromptStep: single-line string literal === test("parsePromptStep: parses simple single-line prompt", () => { const lines = [' prompt "Hello world"']; const result = parsePromptStep("test.jh", lines, 0, '"Hello world"', 3, undefined, trivia); - assert.equal(result.step.type, "prompt"); - assert.equal(result.step.raw, '"Hello world"'); - assert.equal(result.step.loc.line, 1); - assert.equal(result.step.loc.col, 3); - assert.equal(result.step.captureName, undefined); - assert.equal(result.step.returns, undefined); - if (result.step.type === "prompt") { - assert.equal(trivia.getNode(result.step)?.bodyKind, "string"); + const body = unwrapPrompt(result.step); + assert.equal(body.raw, '"Hello world"'); + assert.equal(body.loc.line, 1); + assert.equal(body.loc.col, 3); + if (result.step.type === "exec") { + assert.equal(result.step.captureName, undefined); } + assert.equal(body.returns, undefined); + assert.equal(trivia.getNode(body)?.bodyKind, "string"); }); test("parsePromptStep: parses captured prompt", () => { const lines = [' answer = prompt "What?"']; const result = parsePromptStep("test.jh", lines, 0, '"What?"', 3, "answer", trivia); - assert.equal(result.step.type, "prompt"); - assert.equal(result.step.raw, '"What?"'); - assert.equal(result.step.captureName, "answer"); - if (result.step.type === "prompt") { - assert.equal(trivia.getNode(result.step)?.bodyKind, "string"); + const body = unwrapPrompt(result.step); + assert.equal(body.raw, '"What?"'); + if (result.step.type === "exec") { + assert.equal(result.step.captureName, "answer"); } + assert.equal(trivia.getNode(body)?.bodyKind, "string"); }); test("parsePromptStep: parses prompt with returns schema (double-quoted)", () => { const lines = [' prompt "Classify" returns "{ type: string }"']; const result = parsePromptStep("test.jh", lines, 0, '"Classify" returns "{ type: string }"', 3, undefined, trivia); - assert.equal(result.step.type, "prompt"); - assert.equal(result.step.raw, '"Classify"'); - assert.equal(result.step.returns, "{ type: string }"); + const body = unwrapPrompt(result.step); + assert.equal(body.raw, '"Classify"'); + assert.equal(body.returns, "{ type: string }"); }); test("parsePromptStep: rejects single-quoted returns schema", () => { @@ -66,35 +77,31 @@ test("parsePromptStep: multiline quoted prompt throws with clear error", () => { test("parsePromptStep: parses bare identifier prompt", () => { const lines = [' prompt myVar']; const result = parsePromptStep("test.jh", lines, 0, "myVar", 3, undefined, trivia); - assert.equal(result.step.type, "prompt"); - if (result.step.type === "prompt") { - assert.equal(trivia.getNode(result.step)?.bodyKind, "identifier"); - assert.equal(trivia.getNode(result.step)?.bodyIdentifier, "myVar"); - assert.equal(result.step.raw, '"${myVar}"'); - assert.equal(result.step.returns, undefined); - } + const body = unwrapPrompt(result.step); + assert.equal(trivia.getNode(body)?.bodyKind, "identifier"); + assert.equal(trivia.getNode(body)?.bodyIdentifier, "myVar"); + assert.equal(body.raw, '"${myVar}"'); + assert.equal(body.returns, undefined); }); test("parsePromptStep: parses identifier prompt with returns", () => { const lines = [' prompt myVar returns "{ type: string }"']; const result = parsePromptStep("test.jh", lines, 0, 'myVar returns "{ type: string }"', 3, undefined, trivia); - assert.equal(result.step.type, "prompt"); - if (result.step.type === "prompt") { - assert.equal(trivia.getNode(result.step)?.bodyKind, "identifier"); - assert.equal(trivia.getNode(result.step)?.bodyIdentifier, "myVar"); - assert.equal(result.step.returns, "{ type: string }"); - } + const body = unwrapPrompt(result.step); + assert.equal(trivia.getNode(body)?.bodyKind, "identifier"); + assert.equal(trivia.getNode(body)?.bodyIdentifier, "myVar"); + assert.equal(body.returns, "{ type: string }"); }); test("parsePromptStep: parses captured identifier prompt", () => { const lines = [' answer = prompt text']; const result = parsePromptStep("test.jh", lines, 0, "text", 3, "answer", trivia); - assert.equal(result.step.type, "prompt"); - assert.equal(result.step.captureName, "answer"); - if (result.step.type === "prompt") { - assert.equal(trivia.getNode(result.step)?.bodyKind, "identifier"); - assert.equal(trivia.getNode(result.step)?.bodyIdentifier, "text"); + const body = unwrapPrompt(result.step); + if (result.step.type === "exec") { + assert.equal(result.step.captureName, "answer"); } + assert.equal(trivia.getNode(body)?.bodyKind, "identifier"); + assert.equal(trivia.getNode(body)?.bodyIdentifier, "text"); }); // === parsePromptStep: triple-quoted block === @@ -107,13 +114,10 @@ test("parsePromptStep: parses triple-quoted block prompt", () => { '"""', ]; const result = parsePromptStep("test.jh", lines, 0, '"""', 3, undefined, trivia); - assert.equal(result.step.type, "prompt"); - if (result.step.type === "prompt") { - assert.equal(trivia.getNode(result.step)?.bodyKind, "triple_quoted"); - // raw contains the body wrapped in quotes for runtime interpolation - assert.ok(result.step.raw.includes("You are a helpful assistant.")); - assert.ok(result.step.raw.includes("${input}")); - } + const body = unwrapPrompt(result.step); + assert.equal(trivia.getNode(body)?.bodyKind, "triple_quoted"); + assert.ok(body.raw.includes("You are a helpful assistant.")); + assert.ok(body.raw.includes("${input}")); }); test("parsePromptStep: parses captured triple-quoted block prompt", () => { @@ -123,11 +127,11 @@ test("parsePromptStep: parses captured triple-quoted block prompt", () => { '"""', ]; const result = parsePromptStep("test.jh", lines, 0, '"""', 3, "answer", trivia); - assert.equal(result.step.type, "prompt"); - assert.equal(result.step.captureName, "answer"); - if (result.step.type === "prompt") { - assert.equal(trivia.getNode(result.step)?.bodyKind, "triple_quoted"); + const body = unwrapPrompt(result.step); + if (result.step.type === "exec") { + assert.equal(result.step.captureName, "answer"); } + assert.equal(trivia.getNode(body)?.bodyKind, "triple_quoted"); }); test("parsePromptStep: triple-quoted block may be followed by returns on the next line", () => { @@ -138,11 +142,9 @@ test("parsePromptStep: triple-quoted block may be followed by returns on the nex 'returns "{ role: string }"', ]; const result = parsePromptStep("test.jh", lines, 0, '"""', 3, "answer", trivia); - assert.equal(result.step.type, "prompt"); - if (result.step.type === "prompt") { - assert.equal(trivia.getNode(result.step)?.bodyKind, "triple_quoted"); - assert.equal(result.step.returns, "{ role: string }"); - } + const body = unwrapPrompt(result.step); + assert.equal(trivia.getNode(body)?.bodyKind, "triple_quoted"); + assert.equal(body.returns, "{ role: string }"); assert.equal(result.nextLineIdx, 3); }); @@ -153,11 +155,9 @@ test("parsePromptStep: triple-quoted block may close with returns on same line", '""" returns "{ role: string }"', ]; const result = parsePromptStep("test.jh", lines, 0, '"""', 3, "answer", trivia); - assert.equal(result.step.type, "prompt"); - if (result.step.type === "prompt") { - assert.equal(trivia.getNode(result.step)?.bodyKind, "triple_quoted"); - assert.equal(result.step.returns, "{ role: string }"); - } + const body = unwrapPrompt(result.step); + assert.equal(trivia.getNode(body)?.bodyKind, "triple_quoted"); + assert.equal(body.returns, "{ role: string }"); assert.equal(result.nextLineIdx, 2); }); @@ -173,8 +173,6 @@ test("parsePromptStep: unterminated triple-quoted block throws", () => { ); }); -// === parsePromptStep: triple-backtick fences are rejected for prompts === - test("parsePromptStep: triple-backtick fence is rejected with guidance", () => { const lines = [ ' prompt ```', @@ -187,8 +185,6 @@ test("parsePromptStep: triple-backtick fence is rejected with guidance", () => { ); }); -// === parsePromptStep: errors === - test("parsePromptStep: unterminated single-line string throws", () => { const lines = [' prompt "Hello']; assert.throws( @@ -216,8 +212,6 @@ test("parsePromptStep: unterminated returns schema throws", () => { test("parsePromptStep: returns with double-quoted schema", () => { const lines = [' prompt "Classify" returns "{ type: string }"']; const result = parsePromptStep("test.jh", lines, 0, '"Classify" returns "{ type: string }"', 3, undefined, trivia); - assert.equal(result.step.type, "prompt"); - if (result.step.type === "prompt") { - assert.equal(result.step.returns, "{ type: string }"); - } + const body = unwrapPrompt(result.step); + assert.equal(body.returns, "{ type: string }"); }); diff --git a/src/parse/parse-return.test.ts b/src/parse/parse-return.test.ts index 6344edf5..ea40480f 100644 --- a/src/parse/parse-return.test.ts +++ b/src/parse/parse-return.test.ts @@ -2,7 +2,7 @@ import test from "node:test"; import assert from "node:assert/strict"; import { parsejaiph } from "../parser"; -test("return run parses managed run call", () => { +test("return run parses Expr.call", () => { const mod = parsejaiph( `workflow default() {\n return run helper()\n}`, "test.jh", @@ -10,31 +10,27 @@ test("return run parses managed run call", () => { const step = mod.workflows[0].steps[0]; assert.equal(step.type, "return"); if (step.type === "return") { - assert.ok(step.managed); - assert.equal(step.managed!.kind, "run"); - assert.equal(step.managed!.ref.value, "helper"); - assert.equal(step.managed!.args, undefined); - assert.equal(step.value, "run helper()"); + assert.equal(step.value.kind, "call"); + if (step.value.kind === "call") { + assert.equal(step.value.callee.value, "helper"); + assert.equal(step.value.args, undefined); + } } }); -test("return run parses managed run call with args", () => { +test("return run parses Expr.call with args", () => { const mod = parsejaiph( `workflow default() {\n return run helper("a", "b")\n}`, "test.jh", ); const step = mod.workflows[0].steps[0]; assert.equal(step.type, "return"); - if (step.type === "return") { - assert.ok(step.managed); - assert.equal(step.managed!.kind, "run"); - if (step.managed!.kind === "run") { - assert.equal(step.managed!.ref.value, "helper"); - assert.deepEqual(step.managed!.args, [ - { kind: "literal", raw: '"a"' }, - { kind: "literal", raw: '"b"' }, - ]); - } + if (step.type === "return" && step.value.kind === "call") { + assert.equal(step.value.callee.value, "helper"); + assert.deepEqual(step.value.args, [ + { kind: "literal", raw: '"a"' }, + { kind: "literal", raw: '"b"' }, + ]); } }); @@ -45,14 +41,12 @@ test("return run parses dotted ref", () => { ); const step = mod.workflows[0].steps[0]; assert.equal(step.type, "return"); - if (step.type === "return") { - assert.ok(step.managed); - assert.equal(step.managed!.kind, "run"); - assert.equal(step.managed!.ref.value, "lib.helper"); + if (step.type === "return" && step.value.kind === "call") { + assert.equal(step.value.callee.value, "lib.helper"); } }); -test("return ensure parses managed ensure call", () => { +test("return ensure parses Expr.ensure_call", () => { const mod = parsejaiph( `workflow default() {\n return ensure check()\n}`, "test.jh", @@ -60,62 +54,52 @@ test("return ensure parses managed ensure call", () => { const step = mod.workflows[0].steps[0]; assert.equal(step.type, "return"); if (step.type === "return") { - assert.ok(step.managed); - assert.equal(step.managed!.kind, "ensure"); - assert.equal(step.managed!.ref.value, "check"); - assert.equal(step.managed!.args, undefined); - assert.equal(step.value, "ensure check()"); + assert.equal(step.value.kind, "ensure_call"); + if (step.value.kind === "ensure_call") { + assert.equal(step.value.callee.value, "check"); + assert.equal(step.value.args, undefined); + } } }); -test("return ensure parses managed ensure call with args", () => { +test("return ensure parses Expr.ensure_call with args", () => { const mod = parsejaiph( `workflow default() {\n return ensure check("x")\n}`, "test.jh", ); const step = mod.workflows[0].steps[0]; assert.equal(step.type, "return"); - if (step.type === "return") { - assert.ok(step.managed); - assert.equal(step.managed!.kind, "ensure"); - if (step.managed!.kind === "ensure") { - assert.deepEqual(step.managed!.args, [{ kind: "literal", raw: '"x"' }]); - } + if (step.type === "return" && step.value.kind === "ensure_call") { + assert.deepEqual(step.value.args, [{ kind: "literal", raw: '"x"' }]); } }); -test("return run in rule parses managed run call", () => { +test("return run in rule parses Expr.call", () => { const mod = parsejaiph( `script helper = \`echo "ok"\`\nrule my_rule() {\n return run helper()\n}`, "test.jh", ); const step = mod.rules[0].steps[0]; assert.equal(step.type, "return"); - if (step.type === "return") { - assert.ok(step.managed); - assert.equal(step.managed!.kind, "run"); - assert.equal(step.managed!.ref.value, "helper"); + if (step.type === "return" && step.value.kind === "call") { + assert.equal(step.value.callee.value, "helper"); } }); -test("return ensure in rule parses managed ensure call", () => { +test("return ensure in rule parses Expr.ensure_call", () => { const mod = parsejaiph( `rule sub_rule() {\n return "ok"\n}\nrule my_rule() {\n return ensure sub_rule()\n}`, "test.jh", ); - const step = mod.rules[0].steps[1]; - // The rule that contains `return ensure sub_rule()` is my_rule (index 1) const myRule = mod.rules.find(r => r.name === "my_rule")!; const retStep = myRule.steps[0]; assert.equal(retStep.type, "return"); - if (retStep.type === "return") { - assert.ok(retStep.managed); - assert.equal(retStep.managed!.kind, "ensure"); - assert.equal(retStep.managed!.ref.value, "sub_rule"); + if (retStep.type === "return" && retStep.value.kind === "ensure_call") { + assert.equal(retStep.value.callee.value, "sub_rule"); } }); -test("return with string value has no managed field", () => { +test("return with string value is Expr.literal", () => { const mod = parsejaiph( `workflow default() {\n return "hello"\n}`, "test.jh", @@ -123,12 +107,14 @@ test("return with string value has no managed field", () => { const step = mod.workflows[0].steps[0]; assert.equal(step.type, "return"); if (step.type === "return") { - assert.equal(step.managed, undefined); - assert.equal(step.value, '"hello"'); + assert.equal(step.value.kind, "literal"); + if (step.value.kind === "literal") { + assert.equal(step.value.raw, '"hello"'); + } } }); -test("bare return has no managed field", () => { +test("bare return is Expr.literal with empty string", () => { const mod = parsejaiph( `workflow default() {\n return\n}`, "test.jh", @@ -136,25 +122,25 @@ test("bare return has no managed field", () => { const step = mod.workflows[0].steps[0]; assert.equal(step.type, "return"); if (step.type === "return") { - assert.equal(step.managed, undefined); - assert.equal(step.value, '""'); + assert.equal(step.value.kind, "literal"); + if (step.value.kind === "literal") { + assert.equal(step.value.raw, '""'); + } } }); -test("return run inline script parses managed inline script", () => { +test("return run inline script parses Expr.inline_script", () => { const mod = parsejaiph( "workflow default() {\n return run `cat report.txt`()\n}", "test.jh", ); const step = mod.workflows[0].steps[0]; assert.equal(step.type, "return"); - if (step.type === "return") { - assert.ok(step.managed); - assert.equal(step.managed!.kind, "run_inline_script"); - if (step.managed!.kind === "run_inline_script") { - assert.equal(step.managed!.body, "cat report.txt"); - assert.equal(step.managed!.args, undefined); - } + if (step.type === "return" && step.value.kind === "inline_script") { + assert.equal(step.value.body, "cat report.txt"); + assert.equal(step.value.args, undefined); + } else { + assert.fail(`expected return/inline_script, got ${step.type}`); } }); @@ -165,13 +151,9 @@ test("return run inline script with args", () => { ); const step = mod.workflows[0].steps[0]; assert.equal(step.type, "return"); - if (step.type === "return") { - assert.ok(step.managed); - assert.equal(step.managed!.kind, "run_inline_script"); - if (step.managed!.kind === "run_inline_script") { - assert.equal(step.managed!.body, "echo $1"); - assert.deepEqual(step.managed!.args, [{ kind: "literal", raw: '"x"' }]); - } + if (step.type === "return" && step.value.kind === "inline_script") { + assert.equal(step.value.body, "echo $1"); + assert.deepEqual(step.value.args, [{ kind: "literal", raw: '"x"' }]); } }); @@ -182,18 +164,20 @@ test("return bare inline script is rejected", () => { ); }); -test("log run inline script parses managed inline script", () => { +test("log run inline script parses say with inline_script message", () => { const mod = parsejaiph( "workflow default() {\n log run `cat report.txt`()\n}", "test.jh", ); const step = mod.workflows[0].steps[0]; - assert.equal(step.type, "log"); - if (step.type === "log") { - assert.ok(step.managed); - assert.equal(step.managed!.kind, "run_inline_script"); - assert.equal(step.managed!.body, "cat report.txt"); - assert.equal(step.managed!.args, undefined); + assert.equal(step.type, "say"); + if (step.type === "say") { + assert.equal(step.level, "log"); + assert.equal(step.message.kind, "inline_script"); + if (step.message.kind === "inline_script") { + assert.equal(step.message.body, "cat report.txt"); + assert.equal(step.message.args, undefined); + } } }); @@ -203,14 +187,10 @@ test("log run inline script with args", () => { "test.jh", ); const step = mod.workflows[0].steps[0]; - assert.equal(step.type, "log"); - if (step.type === "log") { - assert.ok(step.managed); - assert.equal(step.managed!.kind, "run_inline_script"); - if (step.managed!.kind === "run_inline_script") { - assert.equal(step.managed!.body, "echo $1"); - assert.deepEqual(step.managed!.args, [{ kind: "literal", raw: '"x"' }]); - } + assert.equal(step.type, "say"); + if (step.type === "say" && step.message.kind === "inline_script") { + assert.equal(step.message.body, "echo $1"); + assert.deepEqual(step.message.args, [{ kind: "literal", raw: '"x"' }]); } }); @@ -228,16 +208,15 @@ test("logerr bare inline script is rejected", () => { ); }); -test("return bare identifier is sugar for interpolated string", () => { +test("return bare identifier is sugar for interpolated literal", () => { const mod = parsejaiph( `workflow default() {\n const response = "hello"\n return response\n}`, "test.jh", ); const step = mod.workflows[0].steps[1]; assert.equal(step.type, "return"); - if (step.type === "return") { - assert.equal(step.managed, undefined); - assert.equal(step.value, '"${response}"'); + if (step.type === "return" && step.value.kind === "literal") { + assert.equal(step.value.raw, '"${response}"'); } }); @@ -258,8 +237,8 @@ test("return bare identifier in brace block (if body)", () => { if (ifStep.type === "if") { const retStep = ifStep.body[0]; assert.equal(retStep.type, "return"); - if (retStep.type === "return") { - assert.equal(retStep.value, '"${msg}"'); + if (retStep.type === "return" && retStep.value.kind === "literal") { + assert.equal(retStep.value.raw, '"${msg}"'); } } }); @@ -279,14 +258,14 @@ test("return bare identifier in catch/recover block", () => { "test.jh", ); const ensureStep = mod.workflows[0].steps[0]; - assert.equal(ensureStep.type, "ensure"); - if (ensureStep.type === "ensure") { + assert.equal(ensureStep.type, "exec"); + if (ensureStep.type === "exec" && ensureStep.body.kind === "ensure_call") { assert.ok(ensureStep.catch); const recoverSteps = "block" in ensureStep.catch! ? ensureStep.catch!.block : [ensureStep.catch!.single]; const retStep = recoverSteps[0]; assert.equal(retStep.type, "return"); - if (retStep.type === "return") { - assert.equal(retStep.value, '"${err}"'); + if (retStep.type === "return" && retStep.value.kind === "literal") { + assert.equal(retStep.value.raw, '"${err}"'); } } }); @@ -307,16 +286,14 @@ test("return run in ensure recover block", () => { "test.jh", ); const ensureStep = mod.workflows[0].steps[0]; - assert.equal(ensureStep.type, "ensure"); - if (ensureStep.type === "ensure") { + assert.equal(ensureStep.type, "exec"); + if (ensureStep.type === "exec" && ensureStep.body.kind === "ensure_call") { assert.ok(ensureStep.catch); const recoverSteps = "block" in ensureStep.catch! ? ensureStep.catch!.block : [ensureStep.catch!.single]; const retStep = recoverSteps[0]; assert.equal(retStep.type, "return"); - if (retStep.type === "return") { - assert.ok(retStep.managed); - assert.equal(retStep.managed!.kind, "run"); - assert.equal(retStep.managed!.ref.value, "helper"); + if (retStep.type === "return" && retStep.value.kind === "call") { + assert.equal(retStep.value.callee.value, "helper"); } } }); diff --git a/src/parse/parse-run-async.test.ts b/src/parse/parse-run-async.test.ts index c6540445..1c750f32 100644 --- a/src/parse/parse-run-async.test.ts +++ b/src/parse/parse-run-async.test.ts @@ -2,7 +2,7 @@ import test from "node:test"; import assert from "node:assert/strict"; import { parsejaiph } from "../parser"; -test("parse: run async produces run step with async flag", () => { +test("parse: run async produces exec/call with async flag on the body", () => { const src = [ "workflow default() {", " run async some_wf()", @@ -10,10 +10,10 @@ test("parse: run async produces run step with async flag", () => { ].join("\n"); const mod = parsejaiph(src, "test.jh"); const step = mod.workflows[0]!.steps[0]!; - assert.equal(step.type, "run"); - if (step.type === "run") { - assert.equal(step.workflow.value, "some_wf"); - assert.equal(step.async, true); + assert.equal(step.type, "exec"); + if (step.type === "exec" && step.body.kind === "call") { + assert.equal(step.body.callee.value, "some_wf"); + assert.equal(step.body.async, true); } }); @@ -25,14 +25,14 @@ test("parse: run async with args", () => { ].join("\n"); const mod = parsejaiph(src, "test.jh"); const step = mod.workflows[0]!.steps[0]!; - assert.equal(step.type, "run"); - if (step.type === "run") { - assert.equal(step.workflow.value, "other_wf"); - assert.deepEqual(step.args, [ + assert.equal(step.type, "exec"); + if (step.type === "exec" && step.body.kind === "call") { + assert.equal(step.body.callee.value, "other_wf"); + assert.deepEqual(step.body.args, [ { kind: "literal", raw: '"hello"' }, { kind: "literal", raw: '"$x"' }, ]); - assert.equal(step.async, true); + assert.equal(step.body.async, true); } }); @@ -44,10 +44,10 @@ test("parse: run async with qualified ref", () => { ].join("\n"); const mod = parsejaiph(src, "test.jh"); const step = mod.workflows[0]!.steps[0]!; - assert.equal(step.type, "run"); - if (step.type === "run") { - assert.equal(step.workflow.value, "mod.some_wf"); - assert.equal(step.async, true); + assert.equal(step.type, "exec"); + if (step.type === "exec" && step.body.kind === "call") { + assert.equal(step.body.callee.value, "mod.some_wf"); + assert.equal(step.body.async, true); } }); @@ -59,9 +59,9 @@ test("parse: regular run does not have async flag", () => { ].join("\n"); const mod = parsejaiph(src, "test.jh"); const step = mod.workflows[0]!.steps[0]!; - assert.equal(step.type, "run"); - if (step.type === "run") { - assert.equal(step.async, undefined); + assert.equal(step.type, "exec"); + if (step.type === "exec" && step.body.kind === "call") { + assert.equal(step.body.async, undefined); } }); @@ -77,7 +77,7 @@ test("parse: capture + run async is rejected without const", () => { ); }); -test("parse: const capture + run async produces run_capture with async flag", () => { +test("parse: const capture + run async produces Expr.call with async flag", () => { const src = [ "workflow default() {", " const h = run async some_wf()", @@ -86,13 +86,10 @@ test("parse: const capture + run async produces run_capture with async flag", () const mod = parsejaiph(src, "test.jh"); const step = mod.workflows[0]!.steps[0]!; assert.equal(step.type, "const"); - if (step.type === "const") { + if (step.type === "const" && step.value.kind === "call") { assert.equal(step.name, "h"); - assert.equal(step.value.kind, "run_capture"); - if (step.value.kind === "run_capture") { - assert.equal(step.value.ref.value, "some_wf"); - assert.equal(step.value.async, true); - } + assert.equal(step.value.callee.value, "some_wf"); + assert.equal(step.value.async, true); } }); @@ -105,13 +102,10 @@ test("parse: const capture + run async with args", () => { const mod = parsejaiph(src, "test.jh"); const step = mod.workflows[0]!.steps[0]!; assert.equal(step.type, "const"); - if (step.type === "const") { - assert.equal(step.value.kind, "run_capture"); - if (step.value.kind === "run_capture") { - assert.equal(step.value.ref.value, "other_wf"); - assert.deepEqual(step.value.args, [{ kind: "literal", raw: '"hello"' }]); - assert.equal(step.value.async, true); - } + if (step.type === "const" && step.value.kind === "call") { + assert.equal(step.value.callee.value, "other_wf"); + assert.deepEqual(step.value.args, [{ kind: "literal", raw: '"hello"' }]); + assert.equal(step.value.async, true); } }); @@ -123,15 +117,15 @@ test("parse: run async with recover block", () => { ].join("\n"); const mod = parsejaiph(src, "test.jh"); const step = mod.workflows[0]!.steps[0]!; - assert.equal(step.type, "run"); - if (step.type === "run") { - assert.equal(step.workflow.value, "foo"); - assert.equal(step.async, true); + assert.equal(step.type, "exec"); + if (step.type === "exec" && step.body.kind === "call") { + assert.equal(step.body.callee.value, "foo"); + assert.equal(step.body.async, true); assert.ok(step.recover); if (step.recover && "block" in step.recover) { assert.equal(step.recover.bindings.failure, "err"); assert.equal(step.recover.block.length, 1); - assert.equal(step.recover.block[0].type, "log"); + assert.equal(step.recover.block[0].type, "say"); } } }); @@ -147,9 +141,9 @@ test("parse: run async with multi-line recover block", () => { ].join("\n"); const mod = parsejaiph(src, "test.jh"); const step = mod.workflows[0]!.steps[0]!; - assert.equal(step.type, "run"); - if (step.type === "run") { - assert.equal(step.async, true); + assert.equal(step.type, "exec"); + if (step.type === "exec" && step.body.kind === "call") { + assert.equal(step.body.async, true); assert.ok(step.recover); if (step.recover && "block" in step.recover) { assert.equal(step.recover.block.length, 2); @@ -165,10 +159,10 @@ test("parse: run async with catch block", () => { ].join("\n"); const mod = parsejaiph(src, "test.jh"); const step = mod.workflows[0]!.steps[0]!; - assert.equal(step.type, "run"); - if (step.type === "run") { - assert.equal(step.workflow.value, "bar"); - assert.equal(step.async, true); + assert.equal(step.type, "exec"); + if (step.type === "exec" && step.body.kind === "call") { + assert.equal(step.body.callee.value, "bar"); + assert.equal(step.body.async, true); assert.ok(step.catch); if (step.catch && "block" in step.catch) { assert.equal(step.catch.bindings.failure, "e"); diff --git a/src/parse/parse-send-rhs.test.ts b/src/parse/parse-send-rhs.test.ts index f3810a9f..f6b7cb0e 100644 --- a/src/parse/parse-send-rhs.test.ts +++ b/src/parse/parse-send-rhs.test.ts @@ -2,16 +2,16 @@ import test from "node:test"; import assert from "node:assert/strict"; import { parseSendRhs } from "./send-rhs"; -// === parseSendRhs: empty/whitespace RHS is now rejected === +// === parseSendRhs: empty/whitespace RHS is rejected === -test("parseSendRhs: empty RHS returns forward kind", () => { +test("parseSendRhs: empty RHS throws", () => { assert.throws( () => parseSendRhs("test.jh", "", 1, 1), /send requires an explicit payload/, ); }); -test("parseSendRhs: whitespace-only RHS returns forward kind", () => { +test("parseSendRhs: whitespace-only RHS throws", () => { assert.throws( () => parseSendRhs("test.jh", " ", 1, 1), /send requires an explicit payload/, @@ -20,19 +20,19 @@ test("parseSendRhs: whitespace-only RHS returns forward kind", () => { // === parseSendRhs: literal === -test("parseSendRhs: quoted string returns literal kind", () => { - const { rhs } = parseSendRhs("test.jh", '"hello world"', 1, 1); - assert.equal(rhs.kind, "literal"); - if (rhs.kind === "literal") { - assert.equal(rhs.token, '"hello world"'); +test("parseSendRhs: quoted string returns Expr.literal", () => { + const { value } = parseSendRhs("test.jh", '"hello world"', 1, 1); + assert.equal(value.kind, "literal"); + if (value.kind === "literal") { + assert.equal(value.raw, '"hello world"'); } }); test("parseSendRhs: quoted string with escaped quote", () => { - const { rhs } = parseSendRhs("test.jh", '"say \\"hi\\""', 1, 1); - assert.equal(rhs.kind, "literal"); - if (rhs.kind === "literal") { - assert.equal(rhs.token, '"say \\"hi\\""'); + const { value } = parseSendRhs("test.jh", '"say \\"hi\\""', 1, 1); + assert.equal(value.kind, "literal"); + if (value.kind === "literal") { + assert.equal(value.raw, '"say \\"hi\\""'); } }); @@ -50,68 +50,68 @@ test("parseSendRhs: trailing content after quoted string throws", () => { ); }); -// === parseSendRhs: run === +// === parseSendRhs: call === -test("parseSendRhs: run call returns run kind", () => { - const { rhs } = parseSendRhs("test.jh", "run my_script()", 1, 5); - assert.equal(rhs.kind, "run"); - if (rhs.kind === "run") { - assert.equal(rhs.ref.value, "my_script"); - assert.equal(rhs.ref.loc.line, 1); - assert.equal(rhs.ref.loc.col, 5); +test("parseSendRhs: run call returns Expr.call", () => { + const { value } = parseSendRhs("test.jh", "run my_script()", 1, 5); + assert.equal(value.kind, "call"); + if (value.kind === "call") { + assert.equal(value.callee.value, "my_script"); + assert.equal(value.callee.loc.line, 1); + assert.equal(value.callee.loc.col, 5); } }); test("parseSendRhs: run call with args", () => { - const { rhs } = parseSendRhs("test.jh", 'run my_script("arg1")', 1, 1); - assert.equal(rhs.kind, "run"); - if (rhs.kind === "run") { - assert.equal(rhs.ref.value, "my_script"); - assert.deepEqual(rhs.args, [{ kind: "literal", raw: '"arg1"' }]); + const { value } = parseSendRhs("test.jh", 'run my_script("arg1")', 1, 1); + assert.equal(value.kind, "call"); + if (value.kind === "call") { + assert.equal(value.callee.value, "my_script"); + assert.deepEqual(value.args, [{ kind: "literal", raw: '"arg1"' }]); } }); test("parseSendRhs: run call with dotted ref", () => { - const { rhs } = parseSendRhs("test.jh", "run lib.process()", 1, 1); - assert.equal(rhs.kind, "run"); - if (rhs.kind === "run") { - assert.equal(rhs.ref.value, "lib.process"); + const { value } = parseSendRhs("test.jh", "run lib.process()", 1, 1); + assert.equal(value.kind, "call"); + if (value.kind === "call") { + assert.equal(value.callee.value, "lib.process"); } }); -// === parseSendRhs: var === +// === parseSendRhs: bare variable (`$name`) is Expr.literal in the new model === -test("parseSendRhs: simple variable returns var kind", () => { - const { rhs } = parseSendRhs("test.jh", "$myVar", 1, 1); - assert.equal(rhs.kind, "var"); - if (rhs.kind === "var") { - assert.equal(rhs.bash, "$myVar"); +test("parseSendRhs: simple variable returns Expr.literal", () => { + const { value } = parseSendRhs("test.jh", "$myVar", 1, 1); + assert.equal(value.kind, "literal"); + if (value.kind === "literal") { + assert.equal(value.raw, "$myVar"); } }); test("parseSendRhs: underscore variable", () => { - const { rhs } = parseSendRhs("test.jh", "$_name", 1, 1); - assert.equal(rhs.kind, "var"); - if (rhs.kind === "var") { - assert.equal(rhs.bash, "$_name"); + const { value } = parseSendRhs("test.jh", "$_name", 1, 1); + assert.equal(value.kind, "literal"); + if (value.kind === "literal") { + assert.equal(value.raw, "$_name"); } }); // === parseSendRhs: braced variable === -test("parseSendRhs: braced variable returns var kind", () => { - const { rhs } = parseSendRhs("test.jh", "${myVar}", 1, 1); - assert.equal(rhs.kind, "var"); - if (rhs.kind === "var") { - assert.equal(rhs.bash, "${myVar}"); +test("parseSendRhs: braced variable returns Expr.literal", () => { + const { value } = parseSendRhs("test.jh", "${myVar}", 1, 1); + assert.equal(value.kind, "literal"); + if (value.kind === "literal") { + assert.equal(value.raw, "${myVar}"); } }); test("parseSendRhs: nested braced variable", () => { - const { rhs } = parseSendRhs("test.jh", "${outer_${inner}}", 1, 1); - assert.equal(rhs.kind, "var"); - if (rhs.kind === "var") { - assert.equal(rhs.bash, "${outer_${inner}}"); + const { value } = parseSendRhs("test.jh", "${outer_${inner}}", 1, 1); + assert.equal(value.kind, "literal"); + if (value.kind === "literal") { + assert.equal(value.raw, "${outer_${inner}}"); } }); @@ -138,37 +138,37 @@ test("parseSendRhs: braced variable with command substitution throws", () => { // === parseSendRhs: bare_ref === -test("parseSendRhs: bare dotted ref returns bare_ref kind", () => { - const { rhs } = parseSendRhs("test.jh", "lib.handler", 1, 3); - assert.equal(rhs.kind, "bare_ref"); - if (rhs.kind === "bare_ref") { - assert.equal(rhs.ref.value, "lib.handler"); - assert.equal(rhs.ref.loc.line, 1); - assert.equal(rhs.ref.loc.col, 3); +test("parseSendRhs: bare dotted ref returns Expr.bare_ref", () => { + const { value } = parseSendRhs("test.jh", "lib.handler", 1, 3); + assert.equal(value.kind, "bare_ref"); + if (value.kind === "bare_ref") { + assert.equal(value.ref.value, "lib.handler"); + assert.equal(value.ref.loc.line, 1); + assert.equal(value.ref.loc.col, 3); } }); // === parseSendRhs: shell === -test("parseSendRhs: unrecognized expression returns shell kind", () => { - const { rhs } = parseSendRhs("test.jh", "echo hello | grep h", 1, 1); - assert.equal(rhs.kind, "shell"); - if (rhs.kind === "shell") { - assert.equal(rhs.command, "echo hello | grep h"); - assert.equal(rhs.loc.line, 1); - assert.equal(rhs.loc.col, 1); +test("parseSendRhs: unrecognized expression returns Expr.shell", () => { + const { value } = parseSendRhs("test.jh", "echo hello | grep h", 1, 1); + assert.equal(value.kind, "shell"); + if (value.kind === "shell") { + assert.equal(value.command, "echo hello | grep h"); + assert.equal(value.loc.line, 1); + assert.equal(value.loc.col, 1); } }); // === parseSendRhs: triple-quoted literal === -test("parseSendRhs: triple-quoted string returns literal kind", () => { +test("parseSendRhs: triple-quoted string returns Expr.literal", () => { const lines = ['ch <- """', " hello", " world", '"""']; - const { rhs, nextIdx } = parseSendRhs("test.jh", '"""', 1, 6, lines, 0); - assert.equal(rhs.kind, "literal"); - if (rhs.kind === "literal") { - assert.ok(rhs.token.includes("hello")); - assert.ok(rhs.token.includes("world")); + const { value, nextIdx } = parseSendRhs("test.jh", '"""', 1, 6, lines, 0); + assert.equal(value.kind, "literal"); + if (value.kind === "literal") { + assert.ok(value.raw.includes("hello")); + assert.ok(value.raw.includes("world")); } assert.equal(nextIdx, 4); }); diff --git a/src/parse/parse-steps.test.ts b/src/parse/parse-steps.test.ts index 2fd95612..12c2d7b7 100644 --- a/src/parse/parse-steps.test.ts +++ b/src/parse/parse-steps.test.ts @@ -3,45 +3,65 @@ import assert from "node:assert/strict"; import { parsejaiph } from "../parser"; import { parseEnsureStep, parseRunRecoverStep } from "./steps"; +/** + * Helpers to keep individual asserts terse — `parseEnsureStep` / + * `parseRunCatchStep` / `parseRunRecoverStep` all return an `exec` step whose + * body is an `Expr.call` (run) or `Expr.ensure_call` (ensure). + */ +function asEnsureExec(step: import("../types").WorkflowStepDef) { + if (step.type !== "exec" || step.body.kind !== "ensure_call") { + throw new Error(`expected exec/ensure_call step, got ${step.type}`); + } + return step; +} +function asRunExec(step: import("../types").WorkflowStepDef) { + if (step.type !== "exec" || step.body.kind !== "call") { + throw new Error(`expected exec/call step, got ${step.type}`); + } + return step; +} + // === parseEnsureStep: basic ensure without catch === test("parseEnsureStep: parses basic ensure call", () => { const lines = [" ensure my_rule()"]; const { step, nextIdx } = parseEnsureStep("test.jh", lines, 0, 1, lines[0], "my_rule()"); - assert.equal(step.type, "ensure"); - if (step.type === "ensure") { - assert.equal(step.ref.value, "my_rule"); - assert.equal(step.catch, undefined); + const e = asEnsureExec(step); + assert.equal(e.body.kind, "ensure_call"); + if (e.body.kind === "ensure_call") { + assert.equal(e.body.callee.value, "my_rule"); } + assert.equal(e.catch, undefined); assert.equal(nextIdx, 0); }); test("parseEnsureStep: parses ensure with args", () => { const lines = [' ensure my_rule("arg1")']; const { step } = parseEnsureStep("test.jh", lines, 0, 1, lines[0], 'my_rule("arg1")'); - if (step.type === "ensure") { - assert.equal(step.ref.value, "my_rule"); - assert.deepEqual(step.args, [{ kind: "literal", raw: '"arg1"' }]); + const e = asEnsureExec(step); + if (e.body.kind === "ensure_call") { + assert.equal(e.body.callee.value, "my_rule"); + assert.deepEqual(e.body.args, [{ kind: "literal", raw: '"arg1"' }]); } }); test("parseEnsureStep: parses ensure with dotted ref", () => { const lines = [" ensure lib.check()"]; const { step } = parseEnsureStep("test.jh", lines, 0, 1, lines[0], "lib.check()"); - if (step.type === "ensure") { - assert.equal(step.ref.value, "lib.check"); + const e = asEnsureExec(step); + if (e.body.kind === "ensure_call") { + assert.equal(e.body.callee.value, "lib.check"); } }); test("parseEnsureStep: parses ensure with captureName", () => { const lines = [" result = ensure my_rule()"]; const { step } = parseEnsureStep("test.jh", lines, 0, 1, lines[0], "my_rule()", "result"); - if (step.type === "ensure") { - assert.equal(step.captureName, "result"); - } + const e = asEnsureExec(step); + assert.equal(e.captureName, "result"); }); -test("parseEnsureStep: ensure without parens parses as zero-arg call", () => { +test("parseEnsureStep: ensure without parens throws", () => { const lines = [" ensure my_rule"]; assert.throws( () => parseEnsureStep("test.jh", lines, 0, 1, lines[0], "my_rule"), @@ -54,24 +74,22 @@ test("parseEnsureStep: ensure without parens parses as zero-arg call", () => { test("parseEnsureStep: parses ensure with single catch statement", () => { const lines = [' ensure my_rule() catch (failure) log "failed"']; const { step } = parseEnsureStep("test.jh", lines, 0, 1, lines[0], 'my_rule() catch (failure) log "failed"'); - if (step.type === "ensure") { - assert.ok(step.catch); - assert.equal(step.catch.bindings.failure, "failure"); - if ("single" in step.catch) { - assert.equal(step.catch.single.type, "log"); - } + const e = asEnsureExec(step); + assert.ok(e.catch); + assert.equal(e.catch!.bindings.failure, "failure"); + if (e.catch && "single" in e.catch) { + assert.equal(e.catch.single.type, "say"); } }); test("parseEnsureStep: parses ensure with catch run statement", () => { const lines = [" ensure my_rule() catch (err) run fallback()"]; const { step } = parseEnsureStep("test.jh", lines, 0, 1, lines[0], "my_rule() catch (err) run fallback()"); - if (step.type === "ensure") { - assert.ok(step.catch); - assert.equal(step.catch.bindings.failure, "err"); - if ("single" in step.catch) { - assert.equal(step.catch.single.type, "run"); - } + const e = asEnsureExec(step); + assert.ok(e.catch); + assert.equal(e.catch!.bindings.failure, "err"); + if (e.catch && "single" in e.catch) { + assert.equal(e.catch.single.type, "exec"); } }); @@ -86,10 +104,12 @@ test("parseEnsureStep: parses ensure with catch wait statement", () => { test("parseEnsureStep: parses ensure with catch fail statement", () => { const lines = [' ensure my_rule() catch (failure) fail "reason"']; const { step } = parseEnsureStep("test.jh", lines, 0, 1, lines[0], 'my_rule() catch (failure) fail "reason"'); - if (step.type === "ensure") { - assert.ok(step.catch); - if ("single" in step.catch) { - assert.equal(step.catch.single.type, "fail"); + const e = asEnsureExec(step); + assert.ok(e.catch); + if (e.catch && "single" in e.catch) { + assert.equal(e.catch.single.type, "say"); + if (e.catch.single.type === "say") { + assert.equal(e.catch.single.level, "fail"); } } }); @@ -99,13 +119,11 @@ test("parseEnsureStep: parses ensure with catch fail statement", () => { test("parseEnsureStep: parses ensure with inline catch block", () => { const lines = [' ensure my_rule() catch (failure) { log "a"; log "b" }']; const { step } = parseEnsureStep("test.jh", lines, 0, 1, lines[0], 'my_rule() catch (failure) { log "a"; log "b" }'); - if (step.type === "ensure") { - assert.ok(step.catch); - if ("block" in step.catch) { - assert.equal(step.catch.block.length, 2); - assert.equal(step.catch.block[0].type, "log"); - assert.equal(step.catch.block[1].type, "log"); - } + const e = asEnsureExec(step); + if (e.catch && "block" in e.catch) { + assert.equal(e.catch.block.length, 2); + assert.equal(e.catch.block[0].type, "say"); + assert.equal(e.catch.block[1].type, "say"); } }); @@ -119,13 +137,11 @@ test("parseEnsureStep: parses ensure with multiline catch block", () => { " }", ]; const { step, nextIdx } = parseEnsureStep("test.jh", lines, 0, 1, lines[0], "my_rule() catch (failure) {"); - if (step.type === "ensure") { - assert.ok(step.catch); - if ("block" in step.catch) { - assert.equal(step.catch.block.length, 2); - assert.equal(step.catch.block[0].type, "log"); - assert.equal(step.catch.block[1].type, "run"); - } + const e = asEnsureExec(step); + if (e.catch && "block" in e.catch) { + assert.equal(e.catch.block.length, 2); + assert.equal(e.catch.block[0].type, "say"); + assert.equal(e.catch.block[1].type, "exec"); } assert.equal(nextIdx, 3); }); @@ -141,21 +157,21 @@ test("parseEnsureStep: multiline catch block with triple-quoted prompt", () => { " }", ]; const { step, nextIdx } = parseEnsureStep("test.jh", lines, 0, 1, lines[0], "gate() catch (err) {"); - assert.equal(step.type, "ensure"); - if (step.type === "ensure" && step.catch && "block" in step.catch) { - assert.equal(step.catch.block.length, 3); - assert.equal(step.catch.block[0].type, "run"); - const p = step.catch.block[1]; - assert.equal(p.type, "prompt"); - if (p.type === "prompt") { - assert.ok(p.raw.includes("fix CI")); + const e = asEnsureExec(step); + if (e.catch && "block" in e.catch) { + assert.equal(e.catch.block.length, 3); + assert.equal(e.catch.block[0].type, "exec"); + const p = e.catch.block[1]; + assert.equal(p.type, "exec"); + if (p.type === "exec" && p.body.kind === "prompt") { + assert.ok(p.body.raw.includes("fix CI")); } - assert.equal(step.catch.block[2].type, "run"); + assert.equal(e.catch.block[2].type, "exec"); } assert.equal(nextIdx, 6); }); -test("parseEnsureStep: catch block lines starting with # are comments not shell", () => { +test("parseEnsureStep: catch block lines starting with # are trivia comments", () => { const lines = [ " ensure gate() catch (err) {", " # note", @@ -163,11 +179,11 @@ test("parseEnsureStep: catch block lines starting with # are comments not shell" " }", ]; const { step } = parseEnsureStep("test.jh", lines, 0, 1, lines[0], "gate() catch (err) {"); - assert.equal(step.type, "ensure"); - if (step.type === "ensure" && step.catch && "block" in step.catch) { - assert.equal(step.catch.block.length, 2); - assert.equal(step.catch.block[0].type, "comment"); - assert.equal(step.catch.block[1].type, "run"); + const e = asEnsureExec(step); + if (e.catch && "block" in e.catch) { + assert.equal(e.catch.block.length, 2); + assert.equal(e.catch.block[0].type, "trivia"); + assert.equal(e.catch.block[1].type, "exec"); } }); @@ -234,10 +250,11 @@ test("parseEnsureStep: empty inline catch block throws", () => { test("parseEnsureStep: catch with shell command", () => { const lines = [" ensure my_rule() catch (failure) echo fallback"]; const { step } = parseEnsureStep("test.jh", lines, 0, 1, lines[0], "my_rule() catch (failure) echo fallback"); - if (step.type === "ensure") { - assert.ok(step.catch); - if ("single" in step.catch) { - assert.equal(step.catch.single.type, "shell"); + const e = asEnsureExec(step); + if (e.catch && "single" in e.catch) { + assert.equal(e.catch.single.type, "exec"); + if (e.catch.single.type === "exec") { + assert.equal(e.catch.single.body.kind, "shell"); } } }); @@ -245,10 +262,11 @@ test("parseEnsureStep: catch with shell command", () => { test("parseEnsureStep: catch with logerr statement", () => { const lines = [' ensure my_rule() catch (failure) logerr "error msg"']; const { step } = parseEnsureStep("test.jh", lines, 0, 1, lines[0], 'my_rule() catch (failure) logerr "error msg"'); - if (step.type === "ensure") { - assert.ok(step.catch); - if ("single" in step.catch) { - assert.equal(step.catch.single.type, "logerr"); + const e = asEnsureExec(step); + if (e.catch && "single" in e.catch) { + assert.equal(e.catch.single.type, "say"); + if (e.catch.single.type === "say") { + assert.equal(e.catch.single.level, "logerr"); } } }); @@ -272,13 +290,13 @@ test("parsejaiph: workflow with ensure catch and multiline triple-quoted prompt" const w = mod.workflows.find((x) => x.name === "w"); assert.ok(w); const ensureStep = w!.steps[0]; - assert.equal(ensureStep.type, "ensure"); - if (ensureStep.type === "ensure" && ensureStep.catch && "block" in ensureStep.catch) { - assert.equal(ensureStep.catch.block.length, 1); - const p = ensureStep.catch.block[0]; - assert.equal(p.type, "prompt"); - if (p.type === "prompt") { - assert.ok(p.raw.includes("hello")); + const e = asEnsureExec(ensureStep); + if (e.catch && "block" in e.catch) { + assert.equal(e.catch.block.length, 1); + const p = e.catch.block[0]; + assert.equal(p.type, "exec"); + if (p.type === "exec" && p.body.kind === "prompt") { + assert.ok(p.body.raw.includes("hello")); } } }); @@ -295,15 +313,15 @@ test("parseRunRecoverStep: parses run with single recover statement", () => { const lines = [' run my_workflow() recover(err) log "repairing"']; const result = parseRunRecoverStep("test.jh", lines, 0, 1, lines[0], 'my_workflow() recover(err) log "repairing"'); assert.ok(result); - const step = result!.step; - assert.equal(step.type, "run"); - if (step.type === "run") { - assert.equal(step.workflow.value, "my_workflow"); - assert.ok(step.recover); - assert.equal(step.recover!.bindings.failure, "err"); - if ("single" in step.recover!) { - assert.equal(step.recover!.single.type, "log"); - } + const step = asRunExec(result!.step); + assert.equal(step.body.kind, "call"); + if (step.body.kind === "call") { + assert.equal(step.body.callee.value, "my_workflow"); + } + assert.ok(step.recover); + assert.equal(step.recover!.bindings.failure, "err"); + if (step.recover && "single" in step.recover) { + assert.equal(step.recover.single.type, "say"); } }); @@ -311,11 +329,11 @@ test("parseRunRecoverStep: parses run with inline recover block", () => { const lines = [' run fix() recover(e) { log "a"; run patch() }']; const result = parseRunRecoverStep("test.jh", lines, 0, 1, lines[0], 'fix() recover(e) { log "a"; run patch() }'); assert.ok(result); - const step = result!.step; - if (step.type === "run" && step.recover && "block" in step.recover) { + const step = asRunExec(result!.step); + if (step.recover && "block" in step.recover) { assert.equal(step.recover.block.length, 2); - assert.equal(step.recover.block[0].type, "log"); - assert.equal(step.recover.block[1].type, "run"); + assert.equal(step.recover.block[0].type, "say"); + assert.equal(step.recover.block[1].type, "exec"); } }); @@ -328,11 +346,11 @@ test("parseRunRecoverStep: parses run with multiline recover block", () => { ]; const result = parseRunRecoverStep("test.jh", lines, 0, 1, lines[0], "deploy() recover(err) {"); assert.ok(result); - const step = result!.step; - if (step.type === "run" && step.recover && "block" in step.recover) { + const step = asRunExec(result!.step); + if (step.recover && "block" in step.recover) { assert.equal(step.recover.block.length, 2); - assert.equal(step.recover.block[0].type, "log"); - assert.equal(step.recover.block[1].type, "run"); + assert.equal(step.recover.block[0].type, "say"); + assert.equal(step.recover.block[1].type, "exec"); } assert.equal(result!.nextIdx, 3); }); @@ -369,8 +387,6 @@ test("parseRunRecoverStep: empty recover block throws", () => { ); }); -// === parsejaiph: full workflow with recover === - test("parsejaiph: workflow with run recover block", () => { const src = [ "workflow deploy() {", @@ -390,10 +406,7 @@ test("parsejaiph: workflow with run recover block", () => { const mod = parsejaiph(src, "recover_test.jh"); const w = mod.workflows.find((x) => x.name === "deploy"); assert.ok(w); - const runStep = w!.steps[0]; - assert.equal(runStep.type, "run"); - if (runStep.type === "run") { - assert.ok(runStep.recover); - assert.equal(runStep.catch, undefined); - } + const runStep = asRunExec(w!.steps[0]); + assert.ok(runStep.recover); + assert.equal(runStep.catch, undefined); }); diff --git a/src/parse/prompt.ts b/src/parse/prompt.ts index 0f51b4d6..03b75243 100644 --- a/src/parse/prompt.ts +++ b/src/parse/prompt.ts @@ -1,10 +1,10 @@ -import type { WorkflowStepDef } from "../types"; +import type { Expr, WorkflowStepDef } from "../types"; import { createTrivia, type Trivia } from "./trivia"; import { fail, hasUnescapedClosingQuote, indexOfClosingDoubleQuote } from "./core"; import { dedentTripleQuotedBody, parseTripleQuoteBlock, tripleQuoteBodyToRaw } from "./triple-quote"; /** - * Prompt body source tag stored in the AST. + * Prompt body source tag stored in trivia. * - "string" → single-line `"..."` * - "identifier" → bare identifier after `prompt` * - "triple_quoted" → triple-quote `"""..."""` block @@ -166,13 +166,14 @@ function parsePromptTripleQuoteBlock( } /** - * Parse a prompt step (captured or uncaptured). + * Parse a prompt step (captured or uncaptured). Returns an `exec` step whose + * `body` is an `Expr` with `kind: "prompt"`. + * * Supports three body forms: * 1. Single-line string literal: prompt "text" * 2. Bare identifier: prompt myVar * 3. Triple-quoted block: prompt """ ... """ * - * Returns the parsed step and the 0-based line index to continue from. * For catch statements where multiline scanning is unnecessary, pass `[]` for lines. */ export function parsePromptStep( @@ -196,10 +197,29 @@ export function parsePromptStep( ); } + const stepLoc = { line: lineNo, col: promptCol }; + + const buildStep = ( + body: Expr, + bodyTrivia: { bodyKind?: PromptBodyKind; bodyIdentifier?: string; rawBody?: string }, + nextLineIdx: number, + ): { step: WorkflowStepDef; nextLineIdx: number } => { + trivia.setNode(body, { + ...(bodyTrivia.bodyKind ? { bodyKind: bodyTrivia.bodyKind } : {}), + ...(bodyTrivia.bodyIdentifier ? { bodyIdentifier: bodyTrivia.bodyIdentifier } : {}), + ...(bodyTrivia.rawBody !== undefined ? { rawBody: bodyTrivia.rawBody } : {}), + }); + const step: WorkflowStepDef = { + type: "exec", + body, + ...(captureName ? { captureName } : {}), + loc: stepLoc, + }; + return { step, nextLineIdx }; + }; + // --- Case 1: Triple-quoted block --- if (promptArg.startsWith('"""')) { - // Recover blocks pass `lines: []` and a single merged `promptArg` (multiline). - // Split into synthetic lines so `parseTripleQuoteBlock` sees an opening line of only `"""`. let tqLines: string[]; let tripleQuoteLineIdx: number; if (lines.length === 0) { @@ -215,11 +235,7 @@ export function parsePromptStep( tqLines, tripleQuoteLineIdx, ); - - // Wrap body in quotes so the runtime's interpolateWithCaptures can process ${} vars. - // Apply the same dedent at parse time so the runtime no longer needs a tripleQuoted flag. const raw = tripleQuoteBodyToRaw(dedentTripleQuotedBody(body)); - const linesForReturns = lines.length === 0 ? tqLines : lines; let returnsSchema: string | undefined = returnsOnClosingLine; let consumeEndIdx = realNextIdx; @@ -237,26 +253,17 @@ export function parsePromptStep( consumeEndIdx = pr.nextIndex; } } - - const step = { - type: "prompt" as const, + const expr: Expr = { + kind: "prompt", raw, - loc: { line: lineNo, col: promptCol }, - ...(captureName ? { captureName } : {}), + loc: stepLoc, ...(returnsSchema !== undefined ? { returns: returnsSchema } : {}), }; - trivia.setNode(step, { bodyKind: "triple_quoted", rawBody: body }); - return { - step, - nextLineIdx: consumeEndIdx - 1, - }; + return buildStep(expr, { bodyKind: "triple_quoted", rawBody: body }, consumeEndIdx - 1); } // --- Case 2: String literal --- if (promptArg.startsWith('"')) { - // Check for triple-quote opening: "\"\" (three quotes) — handle as triple-quoted block - // This won't match since we check for """ above first. - // Check for multiline quoted string (no closing quote on same line) — reject it if (!hasUnescapedClosingQuote(promptArg, 1)) { fail(filePath, 'multiline prompt strings are no longer supported; use a triple-quoted block instead: prompt """...""""', lineNo, promptCol); } @@ -267,22 +274,16 @@ export function parsePromptStep( lines, lineIdx, ); - const step = { - type: "prompt" as const, + const expr: Expr = { + kind: "prompt", raw: promptRaw, - loc: { line: lineNo, col: promptCol }, - ...(captureName ? { captureName } : {}), + loc: stepLoc, ...(returnsSchema !== undefined ? { returns: returnsSchema } : {}), }; - trivia.setNode(step, { bodyKind: "string" }); - return { - step, - nextLineIdx: nextIndex - 1, - }; + return buildStep(expr, { bodyKind: "string" }, nextIndex - 1); } // --- Case 3: Bare identifier --- - // Greedy: take the first token as the identifier const identMatch = promptArg.match(/^([A-Za-z_][A-Za-z0-9_]*)/); if (!identMatch) { const msg = captureName @@ -293,7 +294,6 @@ export function parsePromptStep( const identifier = identMatch[1]; const afterIdent = promptArg.slice(identifier.length); - // Check for `returns` after the identifier const { returns: returnsSchema, nextIndex } = parseReturnsClause( filePath, lineNo, @@ -302,18 +302,13 @@ export function parsePromptStep( lineIdx, ); - // Store as "${identifier}" so the runtime interpolates the variable + // Store as "${identifier}" so the runtime interpolates the variable. const raw = `"\${${identifier}}"`; - const step = { - type: "prompt" as const, + const expr: Expr = { + kind: "prompt", raw, - loc: { line: lineNo, col: promptCol }, - ...(captureName ? { captureName } : {}), + loc: stepLoc, ...(returnsSchema !== undefined ? { returns: returnsSchema } : {}), }; - trivia.setNode(step, { bodyKind: "identifier", bodyIdentifier: identifier }); - return { - step, - nextLineIdx: nextIndex - 1, - }; + return buildStep(expr, { bodyKind: "identifier", bodyIdentifier: identifier }, nextIndex - 1); } diff --git a/src/parse/rules.ts b/src/parse/rules.ts index 6b681c83..e10b7139 100644 --- a/src/parse/rules.ts +++ b/src/parse/rules.ts @@ -66,10 +66,11 @@ export function parseRuleBlock( const cmd = currentCommandLines.join("\n").trim(); currentCommandLines = []; if (!cmd) return; + const loc = { line: accumShellLine, col: accumShellCol }; rule.steps.push({ - type: "shell", - command: stripQuotes(cmd), - loc: { line: accumShellLine, col: accumShellCol }, + type: "exec", + body: { kind: "shell", command: stripQuotes(cmd), loc }, + loc, }); }; @@ -87,8 +88,8 @@ export function parseRuleBlock( } else { flushCommand(); const lastStep = rule.steps[rule.steps.length - 1]; - if (lastStep && lastStep.type !== "blank_line") { - rule.steps.push({ type: "blank_line" }); + if (lastStep && !(lastStep.type === "trivia" && lastStep.kind === "blank_line")) { + rule.steps.push({ type: "trivia", kind: "blank_line" }); } } continue; @@ -103,7 +104,8 @@ export function parseRuleBlock( } else { flushCommand(); rule.steps.push({ - type: "comment", + type: "trivia", + kind: "comment", text: innerRaw.trim(), loc: { line: innerNo, col: 1 }, }); @@ -136,7 +138,8 @@ export function parseRuleBlock( continue; } const st = parseBlockStatement(filePath, lines, i, trivia, { forRule: true }); - if (st.step.type !== "shell") { + const isShellExec = st.step.type === "exec" && st.step.body.kind === "shell"; + if (!isShellExec) { flushCommand(); rule.steps.push(st.step); i = st.nextIdx - 1; @@ -160,7 +163,13 @@ export function parseRuleBlock( if (i >= lines.length) { fail(filePath, `unterminated rule block: ${rule.name}`, lineNo); } - while (rule.steps.length > 0 && rule.steps[rule.steps.length - 1].type === "blank_line") { + while ( + rule.steps.length > 0 && + (() => { + const last = rule.steps[rule.steps.length - 1]; + return last.type === "trivia" && last.kind === "blank_line"; + })() + ) { rule.steps.pop(); } return { rule, nextIndex: i + 1, exported: isExported }; diff --git a/src/parse/send-rhs.ts b/src/parse/send-rhs.ts index f69dc412..dabae365 100644 --- a/src/parse/send-rhs.ts +++ b/src/parse/send-rhs.ts @@ -1,4 +1,4 @@ -import type { SendRhsDef, WorkflowRefDef } from "../types"; +import type { Expr, WorkflowRefDef } from "../types"; import { createTrivia, type Trivia } from "./trivia"; import { fail, hasUnescapedClosingQuote, indexOfClosingDoubleQuote, isRef, parseCallRef, rejectTrailingContent } from "./core"; import { dedentTripleQuotedBody, parseTripleQuoteBlock, tripleQuoteBodyToRaw } from "./triple-quote"; @@ -6,7 +6,10 @@ import { dedentTripleQuotedBody, parseTripleQuoteBlock, tripleQuoteBodyToRaw } f const SEND_RHS_HINT = 'send right-hand side must be a quoted string ("..."), a variable ($name or ${...}), or "run [args]" — not raw shell; use a script or use const'; -/** Parse RHS after `<-` for the send operator. Returns the parsed RHS and next line index. */ +/** + * Parse RHS after `<-` for the send operator. Returns the parsed RHS as an `Expr` + * (replaces the legacy `SendRhsDef` union) plus the next line index. + */ export function parseSendRhs( filePath: string, rhs: string, @@ -15,7 +18,7 @@ export function parseSendRhs( lines?: string[], idx?: number, trivia: Trivia = createTrivia(), -): { rhs: SendRhsDef; nextIdx: number } { +): { value: Expr; nextIdx: number } { const t = rhs.trim(); const defaultNext = (idx ?? lineNo - 1) + 1; if (t === "") { @@ -26,9 +29,9 @@ export function parseSendRhs( tqLines[idx] = t; const { body, nextIdx, afterClose } = parseTripleQuoteBlock(filePath, tqLines, idx); if (afterClose) fail(filePath, 'unexpected content after closing """', nextIdx); - const rhsNode: SendRhsDef = { kind: "literal", token: tripleQuoteBodyToRaw(dedentTripleQuotedBody(body)) }; - trivia.setNode(rhsNode, { tripleQuoted: true, rawBody: body }); - return { rhs: rhsNode, nextIdx }; + const value: Expr = { kind: "literal", raw: tripleQuoteBodyToRaw(dedentTripleQuotedBody(body)) }; + trivia.setNode(value, { tripleQuoted: true, rawBody: body }); + return { value, nextIdx }; } if (t.startsWith('"')) { if (!hasUnescapedClosingQuote(t, 1)) { @@ -41,24 +44,21 @@ export function parseSendRhs( if (t.slice(close + 1).trim() !== "") { fail(filePath, SEND_RHS_HINT, lineNo, col); } - return { rhs: { kind: "literal", token: t.slice(0, close + 1) }, nextIdx: defaultNext }; + return { value: { kind: "literal", raw: t.slice(0, close + 1) }, nextIdx: defaultNext }; } if (t.startsWith("run ")) { const call = parseCallRef(t.slice("run ".length).trim()); if (call) { rejectTrailingContent(filePath, lineNo, "run", call.rest); - const ref: WorkflowRefDef = { value: call.ref, loc: { line: lineNo, col } }; + const callee: WorkflowRefDef = { value: call.ref, loc: { line: lineNo, col } }; return { - rhs: { - kind: "run", ref, - ...(call.args ? { args: call.args } : {}), - }, + value: { kind: "call", callee, ...(call.args ? { args: call.args } : {}) }, nextIdx: defaultNext, }; } } if (/^\$[A-Za-z_][A-Za-z0-9_]*$/.test(t)) { - return { rhs: { kind: "var", bash: t }, nextIdx: defaultNext }; + return { value: { kind: "literal", raw: t }, nextIdx: defaultNext }; } if (t.startsWith("${")) { let depth = 1; @@ -87,17 +87,17 @@ export function parseSendRhs( if (braced.includes("$(")) { fail(filePath, SEND_RHS_HINT, lineNo, col); } - return { rhs: { kind: "var", bash: braced }, nextIdx: defaultNext }; + return { value: { kind: "literal", raw: braced }, nextIdx: defaultNext }; } const bareWord = t.match(/^([A-Za-z_][A-Za-z0-9_]*(?:\.[A-Za-z_][A-Za-z0-9_]*)?)$/); if (bareWord && isRef(bareWord[1])) { return { - rhs: { kind: "bare_ref", ref: { value: bareWord[1], loc: { line: lineNo, col } } }, + value: { kind: "bare_ref", ref: { value: bareWord[1], loc: { line: lineNo, col } } }, nextIdx: defaultNext, }; } return { - rhs: { kind: "shell", command: t, loc: { line: lineNo, col } }, + value: { kind: "shell", command: t, loc: { line: lineNo, col } }, nextIdx: defaultNext, }; } diff --git a/src/parse/steps.ts b/src/parse/steps.ts index 62d5ec3b..6150224c 100644 --- a/src/parse/steps.ts +++ b/src/parse/steps.ts @@ -1,7 +1,7 @@ -import type { WorkflowStepDef } from "../types"; +import type { CatchBody, Expr, WorkflowStepDef } from "../types"; import { createTrivia, type Trivia } from "./trivia"; import { parseConstRhs } from "./const-rhs"; -import { argsToSourceForm, fail, indexOfClosingDoubleQuote, isRef, parseCallRef, parseLogMessageRhs, rejectTrailingContent } from "./core"; +import { fail, indexOfClosingDoubleQuote, parseCallRef, parseLogMessageRhs, rejectTrailingContent } from "./core"; import { parseAnonymousInlineScript } from "./inline-script"; import { isBareIdentifierReturn, bareIdentifierToQuotedString, isBareDottedIdentifierReturn, dottedReturnToQuotedString } from "./workflow-return-dotted"; import { parsePromptStep } from "./prompt"; @@ -86,6 +86,22 @@ function splitCatchStatements(blockContent: string): string[] { return statements; } +/** Build an `exec` step. Inline helper to keep call sites tidy. */ +function execStep( + body: Expr, + loc: { line: number; col: number }, + extras: { captureName?: string; catch?: CatchBody; recover?: CatchBody } = {}, +): WorkflowStepDef { + return { + type: "exec", + body, + ...(extras.captureName ? { captureName: extras.captureName } : {}), + ...(extras.catch ? { catch: extras.catch } : {}), + ...(extras.recover ? { recover: extras.recover } : {}), + loc, + }; +} + /** Parse a single workflow statement string (e.g. "run foo", "ensure bar", "echo x") into a step. */ function parseCatchStatement( filePath: string, @@ -95,68 +111,55 @@ function parseCatchStatement( trivia: Trivia, ): WorkflowStepDef { const t = stmt.trim(); + const loc = { line: lineNo, col }; if (!t) { fail(filePath, "empty catch statement", lineNo, col); } if (t.startsWith("#")) { - return { type: "comment", text: t, loc: { line: lineNo, col } }; + return { type: "trivia", kind: "comment", text: t, loc }; } if (t === "wait") { fail(filePath, '"wait" has been removed from the language', lineNo, col); } if (t === "return") { - return { type: "return", value: '""', loc: { line: lineNo, col } }; + return { type: "return", value: { kind: "literal", raw: '""' }, loc }; } if (t.startsWith("return ")) { const retVal = t.slice("return ".length).trim(); - // return run ref(args) — managed run if (retVal.startsWith("run ")) { const call = parseCallRef(retVal.slice("run ".length).trim()); if (call && !call.rest.trim()) { + const callee = { value: call.ref, loc }; return { type: "return", - value: `run ${call.ref}(${argsToSourceForm(call.args)})`, - loc: { line: lineNo, col }, - managed: { - kind: "run", - ref: { value: call.ref, loc: { line: lineNo, col } }, - args: call.args, - }, + value: { kind: "call", callee, args: call.args }, + loc, }; } } - // return ensure ref(args) — managed ensure if (retVal.startsWith("ensure ")) { const call = parseCallRef(retVal.slice("ensure ".length).trim()); if (call && !call.rest.trim()) { + const callee = { value: call.ref, loc }; return { type: "return", - value: `ensure ${call.ref}(${argsToSourceForm(call.args)})`, - loc: { line: lineNo, col }, - managed: { - kind: "ensure", - ref: { value: call.ref, loc: { line: lineNo, col } }, - args: call.args, - }, + value: { kind: "ensure_call", callee, args: call.args }, + loc, }; } } const isBareDotted = isBareDottedIdentifierReturn(retVal); const isBare = !isBareDotted && isBareIdentifierReturn(retVal); - const value = isBareDotted + const raw = isBareDotted ? dottedReturnToQuotedString(retVal) : isBare ? bareIdentifierToQuotedString(retVal) : retVal; - const step: WorkflowStepDef = { - type: "return", - value, - loc: { line: lineNo, col }, - }; + const value: Expr = { kind: "literal", raw }; if (isBareDotted || isBare) { - trivia.setNode(step, { bareSource: retVal.trim() }); + trivia.setNode(value, { bareSource: retVal.trim() }); } - return step; + return { type: "return", value, loc }; } if (/^fail\s+/.test(t)) { const arg = t.slice("fail".length).trimStart(); @@ -167,8 +170,8 @@ function parseCatchStatement( if (closeIdx === -1) { fail(filePath, "unterminated fail string", lineNo, col); } - const message = arg.slice(0, closeIdx + 1); - return { type: "fail", message, loc: { line: lineNo, col } }; + const raw = arg.slice(0, closeIdx + 1); + return { type: "say", level: "fail", message: { kind: "literal", raw }, loc }; } const constMatch = t.match(/^const\s+([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(.+)$/s); if (constMatch) { @@ -176,12 +179,7 @@ function parseCatchStatement( const rhs = constMatch[2].trim(); const syntheticLines = [t]; const { value } = parseConstRhs(filePath, syntheticLines, 0, rhs, lineNo, col, false, name, trivia); - return { - type: "const", - name, - value, - loc: { line: lineNo, col }, - }; + return { type: "const", name, value, loc }; } const genericAssignMatch = t.match(/^([A-Za-z_][A-Za-z0-9_]*)\s+=\s*(.+)$/s); if ( @@ -206,13 +204,13 @@ function parseCatchStatement( const runBody = t.slice("run ".length).trim(); if (runBody.startsWith("`")) { const result = parseAnonymousInlineScript(filePath, [], lineNo - 1, runBody, lineNo, col); - return { - type: "run_inline_script", + const body: Expr = { + kind: "inline_script", body: result.body, ...(result.lang ? { lang: result.lang } : {}), args: result.args, - loc: { line: lineNo, col }, }; + return execStep(body, loc); } // Check for run ... recover inside catch/recover blocks const recoverLoopMatch = runBody.match(/ recover(?=[\s(])/); @@ -229,25 +227,17 @@ function parseCatchStatement( if (bParts.length === 1 && /^[A-Za-z_][A-Za-z0-9_]*$/.test(bParts[0])) { const bindings = { failure: bParts[0] }; const after = rightPart.slice(closeParen + 1).trim(); + const callee = { value: callPart.ref, loc }; + const body: Expr = { kind: "call", callee, args: callPart.args }; if (after.startsWith("{") && after.endsWith("}")) { const blockContent = after.slice(1, -1).trim(); const stmts = splitCatchStatements(blockContent); const blockSteps = stmts.map((s) => parseCatchStatement(filePath, lineNo, col, s, trivia)); - return { - type: "run", - workflow: { value: callPart.ref, loc: { line: lineNo, col } }, - args: callPart.args, - recover: { block: blockSteps, bindings }, - }; + return execStep(body, loc, { recover: { block: blockSteps, bindings } }); } if (!after.startsWith("{") && after) { const singleStep = parseCatchStatement(filePath, lineNo, col, after, trivia); - return { - type: "run", - workflow: { value: callPart.ref, loc: { line: lineNo, col } }, - args: callPart.args, - recover: { single: singleStep, bindings }, - }; + return execStep(body, loc, { recover: { single: singleStep, bindings } }); } } } @@ -267,25 +257,17 @@ function parseCatchStatement( if (bParts.length === 1 && /^[A-Za-z_][A-Za-z0-9_]*$/.test(bParts[0])) { const bindings = { failure: bParts[0] }; const after = rightPart.slice(closeParen + 1).trim(); + const callee = { value: callPart.ref, loc }; + const body: Expr = { kind: "call", callee, args: callPart.args }; if (after.startsWith("{") && after.endsWith("}")) { const blockContent = after.slice(1, -1).trim(); const stmts = splitCatchStatements(blockContent); const blockSteps = stmts.map((s) => parseCatchStatement(filePath, lineNo, col, s, trivia)); - return { - type: "run", - workflow: { value: callPart.ref, loc: { line: lineNo, col } }, - args: callPart.args, - catch: { block: blockSteps, bindings }, - }; + return execStep(body, loc, { catch: { block: blockSteps, bindings } }); } if (!after.startsWith("{") && after) { const singleStep = parseCatchStatement(filePath, lineNo, col, after, trivia); - return { - type: "run", - workflow: { value: callPart.ref, loc: { line: lineNo, col } }, - args: callPart.args, - catch: { single: singleStep, bindings }, - }; + return execStep(body, loc, { catch: { single: singleStep, bindings } }); } } } @@ -294,11 +276,8 @@ function parseCatchStatement( const call = parseCallRef(runBody); if (call) { rejectTrailingContent(filePath, lineNo, "run", call.rest); - return { - type: "run", - workflow: { value: call.ref, loc: { line: lineNo, col } }, - args: call.args, - }; + const callee = { value: call.ref, loc }; + return execStep({ kind: "call", callee, args: call.args }, loc); } } if (t.startsWith("ensure ")) { @@ -316,25 +295,17 @@ function parseCatchStatement( if (bParts.length === 1 && /^[A-Za-z_][A-Za-z0-9_]*$/.test(bParts[0])) { const bindings = { failure: bParts[0] }; const after = rightPart.slice(closeParen + 1).trim(); + const callee = { value: callPart.ref, loc }; + const body: Expr = { kind: "ensure_call", callee, args: callPart.args }; if (after.startsWith("{") && after.endsWith("}")) { const blockContent = after.slice(1, -1).trim(); const stmts = splitCatchStatements(blockContent); const blockSteps = stmts.map((s) => parseCatchStatement(filePath, lineNo, col, s, trivia)); - return { - type: "ensure", - ref: { value: callPart.ref, loc: { line: lineNo, col } }, - args: callPart.args, - catch: { block: blockSteps, bindings }, - }; + return execStep(body, loc, { catch: { block: blockSteps, bindings } }); } if (!after.startsWith("{") && after) { const singleStep = parseCatchStatement(filePath, lineNo, col, after, trivia); - return { - type: "ensure", - ref: { value: callPart.ref, loc: { line: lineNo, col } }, - args: callPart.args, - catch: { single: singleStep, bindings }, - }; + return execStep(body, loc, { catch: { single: singleStep, bindings } }); } } } @@ -343,11 +314,8 @@ function parseCatchStatement( const call = parseCallRef(ensureBody); if (call) { rejectTrailingContent(filePath, lineNo, "ensure", call.rest); - return { - type: "ensure", - ref: { value: call.ref, loc: { line: lineNo, col } }, - args: call.args, - }; + const callee = { value: call.ref, loc }; + return execStep({ kind: "ensure_call", callee, args: call.args }, loc); } } const promptAssignMatch = t.match( @@ -370,21 +338,21 @@ function parseCatchStatement( if (t.startsWith("log ") || t === "log") { const logArg = t.slice("log".length).trimStart(); const logCol = col + Math.max(0, t.indexOf("log")); - const message = parseLogMessageRhs(filePath, lineNo, logCol, logArg, "log"); - return { type: "log", message, loc: { line: lineNo, col: logCol } }; + const raw = parseLogMessageRhs(filePath, lineNo, logCol, logArg, "log"); + return { type: "say", level: "log", message: { kind: "literal", raw }, loc: { line: lineNo, col: logCol } }; } if (t.startsWith("logerr ") || t === "logerr") { const logerrArg = t.slice("logerr".length).trimStart(); const logerrCol = col + Math.max(0, t.indexOf("logerr")); - const message = parseLogMessageRhs(filePath, lineNo, logerrCol, logerrArg, "logerr"); - return { type: "logerr", message, loc: { line: lineNo, col: logerrCol } }; + const raw = parseLogMessageRhs(filePath, lineNo, logerrCol, logerrArg, "logerr"); + return { type: "say", level: "logerr", message: { kind: "literal", raw }, loc: { line: lineNo, col: logerrCol } }; } - return { type: "shell", command: t, loc: { line: lineNo, col } }; + return execStep({ kind: "shell", command: t, loc }, loc); } /** * Parse an `ensure [args] [catch ...]` step, with optional captureName. - * Returns the step and the updated 0-based line index. + * Returns the step (`type: "exec"`, `body: ensure_call`) and the updated 0-based line index. */ export function parseEnsureStep( filePath: string, @@ -398,8 +366,8 @@ export function parseEnsureStep( ): { step: WorkflowStepDef; nextIdx: number } { const catchIdx = ensureBody.indexOf(" catch "); const ensureCol = innerRaw.indexOf("ensure") + 1; + const stepLoc = { line: innerNo, col: ensureCol }; - // `catch` at end of line with no block → error if (/\scatch$/.test(ensureBody)) { const catchCol = innerRaw.indexOf("catch") + 1; fail( @@ -416,13 +384,9 @@ export function parseEnsureStep( fail(filePath, "ensure must target a valid reference: ensure ref() or ensure ref(args) — parentheses are required", innerNo); } rejectTrailingContent(filePath, innerNo, "ensure", call.rest); + const callee = { value: call.ref, loc: stepLoc }; return { - step: { - type: "ensure", - ref: { value: call.ref, loc: { line: innerNo, col: ensureCol } }, - args: call.args, - ...(captureName ? { captureName } : {}), - }, + step: execStep({ kind: "ensure_call", callee, args: call.args }, stepLoc, { captureName }), nextIdx: idx, }; } @@ -433,11 +397,10 @@ export function parseEnsureStep( fail(filePath, "ensure must target a valid reference: ensure ref() or ensure ref(args) — parentheses are required", innerNo); } rejectTrailingContent(filePath, innerNo, "ensure", call.rest); - const ref = call.ref; + const callee = { value: call.ref, loc: stepLoc }; const args = call.args; const catchCol = innerRaw.indexOf("catch") + 1; - // Catch requires explicit bindings: catch () if (!right.startsWith("(")) { fail( filePath, @@ -465,12 +428,7 @@ export function parseEnsureStep( const bindings = { failure: bindingParts[0] }; const afterBindings = right.slice(closeParen + 1).trim(); - - const refLoc = { value: ref, loc: { line: innerNo, col: ensureCol } }; - const base = { - type: "ensure" as const, ref: refLoc, args, - ...(captureName ? { captureName } : {}), - }; + const body: Expr = { kind: "ensure_call", callee, args }; if (afterBindings === "{") { let blockLines: string[] = []; @@ -493,7 +451,10 @@ export function parseEnsureStep( fail(filePath, "catch block must contain at least one statement", innerNo, catchCol); } const blockSteps = statements.map((s) => parseCatchStatement(filePath, innerNo, 1, s, trivia)); - return { step: { ...base, catch: { block: blockSteps, bindings } }, nextIdx: closeLineIdx }; + return { + step: execStep(body, stepLoc, { captureName, catch: { block: blockSteps, bindings } }), + nextIdx: closeLineIdx, + }; } if (afterBindings.startsWith("{")) { @@ -507,7 +468,10 @@ export function parseEnsureStep( fail(filePath, "catch block must contain at least one statement", innerNo, catchCol); } const blockSteps = statements.map((s) => parseCatchStatement(filePath, innerNo, catchCol, s, trivia)); - return { step: { ...base, catch: { block: blockSteps, bindings } }, nextIdx: idx }; + return { + step: execStep(body, stepLoc, { captureName, catch: { block: blockSteps, bindings } }), + nextIdx: idx, + }; } if (!afterBindings) { @@ -515,7 +479,10 @@ export function parseEnsureStep( } const singleStep = parseCatchStatement(filePath, innerNo, catchCol, afterBindings, trivia); - return { step: { ...base, catch: { single: singleStep, bindings } }, nextIdx: idx }; + return { + step: execStep(body, stepLoc, { captureName, catch: { single: singleStep, bindings } }), + nextIdx: idx, + }; } /** @@ -532,7 +499,6 @@ export function parseRunRecoverStep( captureName?: string, trivia: Trivia = createTrivia(), ): { step: WorkflowStepDef; nextIdx: number } | null { - // Match ` recover(`, ` recover `, or ` recover` at end of line const recoverMatch = runBody.match(/ recover(?=[\s(]|$)/); if (!recoverMatch) return null; const recoverIdx = recoverMatch.index!; @@ -552,6 +518,7 @@ export function parseRunRecoverStep( const call = parseCallRef(left); if (!call || call.rest.trim()) return null; const runCol = innerRaw.indexOf("run") + 1; + const stepLoc = { line: innerNo, col: runCol }; const recoverCol = innerRaw.indexOf("recover") + 1; if (!right.startsWith("(")) { @@ -581,12 +548,8 @@ export function parseRunRecoverStep( const bindings = { failure: bindingParts[0] }; const afterBindings = right.slice(closeParen + 1).trim(); - const base = { - type: "run" as const, - workflow: { value: call.ref, loc: { line: innerNo, col: runCol } }, - args: call.args, - ...(captureName ? { captureName } : {}), - }; + const callee = { value: call.ref, loc: stepLoc }; + const body: Expr = { kind: "call", callee, args: call.args }; if (afterBindings === "{") { let blockLines: string[] = []; @@ -609,7 +572,10 @@ export function parseRunRecoverStep( fail(filePath, "recover block must contain at least one statement", innerNo, recoverCol); } const blockSteps = statements.map((s) => parseCatchStatement(filePath, innerNo, 1, s, trivia)); - return { step: { ...base, recover: { block: blockSteps, bindings } }, nextIdx: closeLineIdx }; + return { + step: execStep(body, stepLoc, { captureName, recover: { block: blockSteps, bindings } }), + nextIdx: closeLineIdx, + }; } if (afterBindings.startsWith("{")) { @@ -623,7 +589,10 @@ export function parseRunRecoverStep( fail(filePath, "recover block must contain at least one statement", innerNo, recoverCol); } const blockSteps = statements.map((s) => parseCatchStatement(filePath, innerNo, recoverCol, s, trivia)); - return { step: { ...base, recover: { block: blockSteps, bindings } }, nextIdx: idx }; + return { + step: execStep(body, stepLoc, { captureName, recover: { block: blockSteps, bindings } }), + nextIdx: idx, + }; } if (!afterBindings) { @@ -631,7 +600,10 @@ export function parseRunRecoverStep( } const singleStep = parseCatchStatement(filePath, innerNo, recoverCol, afterBindings, trivia); - return { step: { ...base, recover: { single: singleStep, bindings } }, nextIdx: idx }; + return { + step: execStep(body, stepLoc, { captureName, recover: { single: singleStep, bindings } }), + nextIdx: idx, + }; } /** @@ -651,7 +623,6 @@ export function parseRunCatchStep( const catchIdx = runBody.indexOf(" catch "); if (catchIdx === -1) return null; - // `catch` at end of line with no block → error if (/\scatch$/.test(runBody)) { const catchCol = innerRaw.indexOf("catch") + 1; fail( @@ -667,6 +638,7 @@ export function parseRunCatchStep( const call = parseCallRef(left); if (!call || call.rest.trim()) return null; const runCol = innerRaw.indexOf("run") + 1; + const stepLoc = { line: innerNo, col: runCol }; const catchCol = innerRaw.indexOf("catch") + 1; if (!right.startsWith("(")) { @@ -696,12 +668,8 @@ export function parseRunCatchStep( const bindings = { failure: bindingParts[0] }; const afterBindings = right.slice(closeParen + 1).trim(); - const base = { - type: "run" as const, - workflow: { value: call.ref, loc: { line: innerNo, col: runCol } }, - args: call.args, - ...(captureName ? { captureName } : {}), - }; + const callee = { value: call.ref, loc: stepLoc }; + const body: Expr = { kind: "call", callee, args: call.args }; if (afterBindings === "{") { let blockLines: string[] = []; @@ -724,7 +692,10 @@ export function parseRunCatchStep( fail(filePath, "catch block must contain at least one statement", innerNo, catchCol); } const blockSteps = statements.map((s) => parseCatchStatement(filePath, innerNo, 1, s, trivia)); - return { step: { ...base, catch: { block: blockSteps, bindings } }, nextIdx: closeLineIdx }; + return { + step: execStep(body, stepLoc, { captureName, catch: { block: blockSteps, bindings } }), + nextIdx: closeLineIdx, + }; } if (afterBindings.startsWith("{")) { @@ -738,7 +709,10 @@ export function parseRunCatchStep( fail(filePath, "catch block must contain at least one statement", innerNo, catchCol); } const blockSteps = statements.map((s) => parseCatchStatement(filePath, innerNo, catchCol, s, trivia)); - return { step: { ...base, catch: { block: blockSteps, bindings } }, nextIdx: idx }; + return { + step: execStep(body, stepLoc, { captureName, catch: { block: blockSteps, bindings } }), + nextIdx: idx, + }; } if (!afterBindings) { @@ -746,5 +720,8 @@ export function parseRunCatchStep( } const singleStep = parseCatchStatement(filePath, innerNo, catchCol, afterBindings, trivia); - return { step: { ...base, catch: { single: singleStep, bindings } }, nextIdx: idx }; + return { + step: execStep(body, stepLoc, { captureName, catch: { single: singleStep, bindings } }), + nextIdx: idx, + }; } diff --git a/src/parse/trivia-ast-shape.test.ts b/src/parse/trivia-ast-shape.test.ts index 458cd209..0e5cac1c 100644 --- a/src/parse/trivia-ast-shape.test.ts +++ b/src/parse/trivia-ast-shape.test.ts @@ -2,24 +2,21 @@ import test from "node:test"; import assert from "node:assert/strict"; import type { ChannelDef, - ConstRhs, ImportDef, ScriptDef, ScriptImportDef, - SendRhsDef, TestBlockDef, WorkflowMetadata, WorkflowStepDef, jaiphModule, + Expr, } from "../types"; /** - * AC1: trivia / source-fidelity fields must not live on semantic AST types. - * - * Each helper below assigns an object literal with the field that *used* to - * exist; if anyone re-adds the field to the public type, the literal type - * widens, the type assertion below fails, and TypeScript breaks compilation — - * which is what the criterion asks for. + * AC1 (Trivia/CST split): source-fidelity fields must not live on semantic + * AST types. Each helper below assigns an object literal with the field that + * *used* to exist; if anyone re-adds the field to the public type, the literal + * widens, the type assertion below fails, and TypeScript breaks compilation. */ type HasField = T extends Record ? true : false; @@ -41,33 +38,29 @@ const _metaNoConfigSeq: HasField = false // ScriptDef must not carry bodyKind. const _scriptNoBodyKind: HasField = false; -// Pick concrete variants out of WorkflowStepDef and assert no trivia fields. -type LogStep = Extract; -type LogerrStep = Extract; -type FailStep = Extract; +// Step variants must not carry surface-form trivia. +type SayStep = Extract; type ReturnStep = Extract; -type PromptStep = Extract; +type SendStep = Extract; +type ExecStep = Extract; -const _logNoTripleQuoted: HasField = false; -const _logerrNoTripleQuoted: HasField = false; -const _failNoTripleQuoted: HasField = false; +const _sayNoTripleQuoted: HasField = false; const _returnNoTripleQuoted: HasField = false; const _returnNoBareSource: HasField = false; -const _promptNoBodyKind: HasField = false; -const _promptNoBodyIdentifier: HasField = false; +const _execNoBodyKind: HasField = false; +const _execNoBodyIdentifier: HasField = false; -// ConstRhs.expr must not carry tripleQuoted. -type ConstExpr = Extract; -type ConstPromptCapture = Extract; -const _constExprNoTripleQuoted: HasField = false; -const _constPromptNoBodyKind: HasField = false; -const _constPromptNoBodyIdentifier: HasField = false; +// Expr literal must not carry tripleQuoted — that lives in trivia instead. +type LiteralExpr = Extract; +type PromptExpr = Extract; +const _literalNoTripleQuoted: HasField = false; +const _promptNoBodyKind: HasField = false; +const _promptNoBodyIdentifier: HasField = false; -// SendRhsDef literal must not carry tripleQuoted. -type SendLiteral = Extract; -const _sendLiteralNoTripleQuoted: HasField = false; +// send.value carries an Expr; the old SendRhsDef.literal wrapper with +// `tripleQuoted` is gone. +const _sendValueIsExpr: SendStep["value"] extends Expr ? true : false = true; -// Reference the symbols so they are not tree-shaken or marked unused. test("AC1: no trivia fields on semantic AST types", () => { assert.equal(_moduleNoConfigLeading, false); assert.equal(_moduleNoTrailing, false); @@ -78,15 +71,13 @@ test("AC1: no trivia fields on semantic AST types", () => { assert.equal(_testBlockNoLeading, false); assert.equal(_metaNoConfigSeq, false); assert.equal(_scriptNoBodyKind, false); - assert.equal(_logNoTripleQuoted, false); - assert.equal(_logerrNoTripleQuoted, false); - assert.equal(_failNoTripleQuoted, false); + assert.equal(_sayNoTripleQuoted, false); assert.equal(_returnNoTripleQuoted, false); assert.equal(_returnNoBareSource, false); + assert.equal(_execNoBodyKind, false); + assert.equal(_execNoBodyIdentifier, false); + assert.equal(_literalNoTripleQuoted, false); assert.equal(_promptNoBodyKind, false); assert.equal(_promptNoBodyIdentifier, false); - assert.equal(_constExprNoTripleQuoted, false); - assert.equal(_constPromptNoBodyKind, false); - assert.equal(_constPromptNoBodyIdentifier, false); - assert.equal(_sendLiteralNoTripleQuoted, false); + assert.equal(_sendValueIsExpr, true); }); diff --git a/src/parse/workflow-brace.ts b/src/parse/workflow-brace.ts index 6c125747..5bf66feb 100644 --- a/src/parse/workflow-brace.ts +++ b/src/parse/workflow-brace.ts @@ -1,7 +1,6 @@ -import type { WorkflowMetadata, WorkflowStepDef } from "../types"; +import type { CatchBody, Expr, WorkflowMetadata, WorkflowStepDef } from "../types"; import { createTrivia, type Trivia } from "./trivia"; import { - argsToSourceForm, colFromRaw, fail, hasUnescapedClosingQuote, @@ -23,7 +22,7 @@ import { dottedReturnToQuotedString, isBareDottedIdentifierReturn, isBareIdentif export type BlockParseOpts = { forRule?: boolean; - /** When true, push `blank_line` steps so the formatter can preserve spacing. */ + /** When true, push `blank_line` trivia steps so the formatter can preserve spacing. */ preserveBlankLines?: boolean; /** * When set, allow a `config { … }` block as the first non-comment statement. @@ -52,8 +51,8 @@ export function parseBraceBlockBody( if (inner === "") { if (opts?.preserveBlankLines) { const last = steps[steps.length - 1]; - if (last && last.type !== "blank_line") { - steps.push({ type: "blank_line" }); + if (last && !(last.type === "trivia" && last.kind === "blank_line")) { + steps.push({ type: "trivia", kind: "blank_line" }); } } idx += 1; @@ -61,7 +60,8 @@ export function parseBraceBlockBody( } if (inner.startsWith("#")) { steps.push({ - type: "comment", + type: "trivia", + kind: "comment", text: innerRaw.trim(), loc: { line: innerNo, col: 1 }, }); @@ -99,6 +99,22 @@ export function parseBraceBlockBody( fail(filePath, 'unterminated block, expected "}"', openerLineNo); } +/** Build an `exec` step from a value expression and optional capture/catch/recover. */ +function execStep( + body: Expr, + loc: { line: number; col: number }, + extras: { captureName?: string; catch?: CatchBody; recover?: CatchBody } = {}, +): WorkflowStepDef { + return { + type: "exec", + body, + ...(extras.captureName ? { captureName: extras.captureName } : {}), + ...(extras.catch ? { catch: extras.catch } : {}), + ...(extras.recover ? { recover: extras.recover } : {}), + loc, + }; +} + /** * One workflow statement inside `{ … }` (catch body, etc.). */ @@ -117,7 +133,8 @@ export function parseBlockStatement( if (inner.startsWith("#")) { return { step: { - type: "comment", + type: "trivia", + kind: "comment", text: innerRaw.trim(), loc: { line: innerNo, col: 1 }, }, @@ -205,9 +222,10 @@ export function parseBlockStatement( const failCol = innerRaw.indexOf("fail") + 1; if (arg.startsWith('"""')) { const { body, nextIdx } = consumeTripleQuotedArg(filePath, lines, idx, arg); - const message = tripleQuoteBodyToRaw(dedentTripleQuotedBody(body)); - const step = { type: "fail" as const, message, loc: { line: innerNo, col: failCol } }; - trivia.setNode(step, { tripleQuoted: true, rawBody: body }); + const raw = tripleQuoteBodyToRaw(dedentTripleQuotedBody(body)); + const message: Expr = { kind: "literal", raw }; + trivia.setNode(message, { tripleQuoted: true, rawBody: body }); + const step: WorkflowStepDef = { type: "say", level: "fail", message, loc: { line: innerNo, col: failCol } }; return { step, nextIdx }; } if (!arg.startsWith('"')) { @@ -220,9 +238,14 @@ export function parseBlockStatement( if (closeIdx === -1) { fail(filePath, "unterminated fail string", innerNo, failCol); } - const message = arg.slice(0, closeIdx + 1); + const raw = arg.slice(0, closeIdx + 1); return { - step: { type: "fail", message, loc: { line: innerNo, col: failCol } }, + step: { + type: "say", + level: "fail", + message: { kind: "literal", raw }, + loc: { line: innerNo, col: failCol }, + }, nextIdx: idx + 1, }; } @@ -242,22 +265,25 @@ export function parseBlockStatement( if (inner.startsWith("run async ")) { const runBody = inner.slice("run async ".length).trim(); + const runCol = innerRaw.indexOf("run") + 1; if (runBody.startsWith("`")) { - fail(filePath, "run async is not supported with inline scripts", innerNo, innerRaw.indexOf("run") + 1); + fail(filePath, "run async is not supported with inline scripts", innerNo, runCol); } // run async ... recover(name) { ... } const recoverResult = parseRunRecoverStep(filePath, lines, idx, innerNo, innerRaw, runBody, undefined, trivia); - if (recoverResult && recoverResult.step.type === "run") { + if (recoverResult && recoverResult.step.type === "exec" && recoverResult.step.body.kind === "call") { + const body: Expr = { ...recoverResult.step.body, async: true }; return { - step: { ...recoverResult.step, async: true }, + step: { ...recoverResult.step, body }, nextIdx: recoverResult.nextIdx + 1, }; } // run async ... catch(name) { ... } const catchResult = parseRunCatchStep(filePath, lines, idx, innerNo, innerRaw, runBody, undefined, trivia); - if (catchResult && catchResult.step.type === "run") { + if (catchResult && catchResult.step.type === "exec" && catchResult.step.body.kind === "call") { + const body: Expr = { ...catchResult.step.body, async: true }; return { - step: { ...catchResult.step, async: true }, + step: { ...catchResult.step, body }, nextIdx: catchResult.nextIdx + 1, }; } @@ -266,32 +292,31 @@ export function parseBlockStatement( fail(filePath, "run async must target a valid reference: run async ref() or run async ref(args) — parentheses are required", innerNo); } rejectTrailingContent(filePath, innerNo, "run async", call.rest); + const callee = { value: call.ref, loc: { line: innerNo, col: runCol } }; return { - step: { - type: "run", - workflow: { - value: call.ref, - loc: { line: innerNo, col: innerRaw.indexOf("run") + 1 }, - }, - args: call.args, - async: true, - }, + step: execStep( + { kind: "call", callee, args: call.args, async: true }, + { line: innerNo, col: runCol }, + ), nextIdx: idx + 1, }; } if (inner.startsWith("run ")) { const runBody = inner.slice("run ".length).trim(); + const runCol = innerRaw.indexOf("run") + 1; if (runBody.startsWith("`")) { - const result = parseAnonymousInlineScript(filePath, lines, idx, runBody, innerNo, innerRaw.indexOf("run") + 1); + const result = parseAnonymousInlineScript(filePath, lines, idx, runBody, innerNo, runCol); return { - step: { - type: "run_inline_script", - body: result.body, - ...(result.lang ? { lang: result.lang } : {}), - args: result.args, - loc: { line: innerNo, col: innerRaw.indexOf("run") + 1 }, - }, + step: execStep( + { + kind: "inline_script", + body: result.body, + ...(result.lang ? { lang: result.lang } : {}), + args: result.args, + }, + { line: innerNo, col: runCol }, + ), nextIdx: result.nextLineIdx, }; } @@ -313,15 +338,12 @@ export function parseBlockStatement( fail(filePath, "run must target a valid reference: run ref() or run ref(args) — parentheses are required", innerNo); } rejectTrailingContent(filePath, innerNo, "run", call.rest); + const callee = { value: call.ref, loc: { line: innerNo, col: runCol } }; return { - step: { - type: "run", - workflow: { - value: call.ref, - loc: { line: innerNo, col: innerRaw.indexOf("run") + 1 }, - }, - args: call.args, - }, + step: execStep( + { kind: "call", callee, args: call.args }, + { line: innerNo, col: runCol }, + ), nextIdx: idx + 1, }; } @@ -368,82 +390,78 @@ export function parseBlockStatement( if (inner.startsWith("log ") || inner === "log") { const logArg = inner.slice("log".length).trimStart(); const logCol = innerRaw.indexOf("log") + 1; + const stepLoc = { line: innerNo, col: logCol }; if (logArg.startsWith("run ") && logArg.slice("run ".length).trimStart().startsWith("`")) { const runBody = logArg.slice("run ".length).trim(); const result = parseAnonymousInlineScript(filePath, lines, idx, runBody, innerNo, logCol); - return { - step: { - type: "log", - message: "", - loc: { line: innerNo, col: logCol }, - managed: { - kind: "run_inline_script", - body: result.body, - ...(result.lang ? { lang: result.lang } : {}), - args: result.args, - }, - }, - nextIdx: result.nextLineIdx, + const message: Expr = { + kind: "inline_script", + body: result.body, + ...(result.lang ? { lang: result.lang } : {}), + args: result.args, }; + return { step: { type: "say", level: "log", message, loc: stepLoc }, nextIdx: result.nextLineIdx }; } if (logArg.startsWith("`") || logArg.startsWith("```")) { fail(filePath, 'bare inline scripts in log are not allowed; use "log run `...`()" to execute a managed inline script', innerNo, logCol); } if (logArg.startsWith('"""')) { const { body, nextIdx } = consumeTripleQuotedArg(filePath, lines, idx, logArg); - const step = { type: "log" as const, message: dedentTripleQuotedBody(body), loc: { line: innerNo, col: logCol } }; - trivia.setNode(step, { tripleQuoted: true, rawBody: body }); - return { step, nextIdx }; + const raw = dedentTripleQuotedBody(body); + const message: Expr = { kind: "literal", raw }; + trivia.setNode(message, { tripleQuoted: true, rawBody: body }); + return { step: { type: "say", level: "log", message, loc: stepLoc }, nextIdx }; } if (logArg.startsWith('"') && !hasUnescapedClosingQuote(logArg, 1)) { fail(filePath, 'multiline strings use triple quotes: log """..."""', innerNo, logCol); } - const message = parseLogMessageRhs(filePath, innerNo, logCol, logArg, "log"); - return { step: { type: "log", message, loc: { line: innerNo, col: logCol } }, nextIdx: idx + 1 }; + const messageRaw = parseLogMessageRhs(filePath, innerNo, logCol, logArg, "log"); + return { + step: { type: "say", level: "log", message: { kind: "literal", raw: messageRaw }, loc: stepLoc }, + nextIdx: idx + 1, + }; } if (inner.startsWith("logerr ") || inner === "logerr") { const logerrArg = inner.slice("logerr".length).trimStart(); const logerrCol = innerRaw.indexOf("logerr") + 1; + const stepLoc = { line: innerNo, col: logerrCol }; if (logerrArg.startsWith("run ") && logerrArg.slice("run ".length).trimStart().startsWith("`")) { const runBody = logerrArg.slice("run ".length).trim(); const result = parseAnonymousInlineScript(filePath, lines, idx, runBody, innerNo, logerrCol); - return { - step: { - type: "logerr", - message: "", - loc: { line: innerNo, col: logerrCol }, - managed: { - kind: "run_inline_script", - body: result.body, - ...(result.lang ? { lang: result.lang } : {}), - args: result.args, - }, - }, - nextIdx: result.nextLineIdx, + const message: Expr = { + kind: "inline_script", + body: result.body, + ...(result.lang ? { lang: result.lang } : {}), + args: result.args, }; + return { step: { type: "say", level: "logerr", message, loc: stepLoc }, nextIdx: result.nextLineIdx }; } if (logerrArg.startsWith("`") || logerrArg.startsWith("```")) { fail(filePath, 'bare inline scripts in logerr are not allowed; use "logerr run `...`()" to execute a managed inline script', innerNo, logerrCol); } if (logerrArg.startsWith('"""')) { const { body, nextIdx } = consumeTripleQuotedArg(filePath, lines, idx, logerrArg); - const step = { type: "logerr" as const, message: dedentTripleQuotedBody(body), loc: { line: innerNo, col: logerrCol } }; - trivia.setNode(step, { tripleQuoted: true, rawBody: body }); - return { step, nextIdx }; + const raw = dedentTripleQuotedBody(body); + const message: Expr = { kind: "literal", raw }; + trivia.setNode(message, { tripleQuoted: true, rawBody: body }); + return { step: { type: "say", level: "logerr", message, loc: stepLoc }, nextIdx }; } if (logerrArg.startsWith('"') && !hasUnescapedClosingQuote(logerrArg, 1)) { fail(filePath, 'multiline strings use triple quotes: logerr """..."""', innerNo, logerrCol); } - const message = parseLogMessageRhs(filePath, innerNo, logerrCol, logerrArg, "logerr"); - return { step: { type: "logerr", message, loc: { line: innerNo, col: logerrCol } }, nextIdx: idx + 1 }; + const messageRaw = parseLogMessageRhs(filePath, innerNo, logerrCol, logerrArg, "logerr"); + return { + step: { type: "say", level: "logerr", message: { kind: "literal", raw: messageRaw }, loc: stepLoc }, + nextIdx: idx + 1, + }; } if (inner.trim() === "return") { return { step: { type: "return", - value: '""', + value: { kind: "literal", raw: '""' }, loc: { line: innerNo, col: innerRaw.indexOf("return") + 1 }, }, nextIdx: idx + 1, @@ -457,13 +475,12 @@ export function parseBlockStatement( // return """...""" if (returnValue.startsWith('"""')) { const { body, nextIdx } = consumeTripleQuotedArg(filePath, lines, idx, returnValue); - const step = { - type: "return" as const, - value: tripleQuoteBodyToRaw(dedentTripleQuotedBody(body)), - loc: retLoc, + const value: Expr = { kind: "literal", raw: tripleQuoteBodyToRaw(dedentTripleQuotedBody(body)) }; + trivia.setNode(value, { tripleQuoted: true, rawBody: body }); + return { + step: { type: "return", value, loc: retLoc }, + nextIdx, }; - trivia.setNode(step, { tripleQuoted: true, rawBody: body }); - return { step, nextIdx }; } // return match var { ... } const returnMatchHead = returnValue.match(/^match\s+(.+?)\s*\{\s*$/); @@ -471,12 +488,7 @@ export function parseBlockStatement( const subject = returnMatchHead[1].trim(); const { expr, nextIndex } = parseMatchExpr(filePath, lines, idx, subject, retLoc); return { - step: { - type: "return", - value: `__match__`, - loc: retLoc, - managed: { kind: "match", match: expr }, - }, + step: { type: "return", value: { kind: "match", match: expr }, loc: retLoc }, nextIdx: nextIndex, }; } @@ -484,33 +496,23 @@ export function parseBlockStatement( const runBody = returnValue.slice("run ".length).trim(); if (runBody.startsWith("`")) { const result = parseAnonymousInlineScript(filePath, lines, idx, runBody, innerNo, innerRaw.indexOf("run") + 1); + const value: Expr = { + kind: "inline_script", + body: result.body, + ...(result.lang ? { lang: result.lang } : {}), + args: result.args, + }; return { - step: { - type: "return", - value: `run inline_script`, - loc: retLoc, - managed: { - kind: "run_inline_script", - body: result.body, - ...(result.lang ? { lang: result.lang } : {}), - args: result.args, - }, - }, + step: { type: "return", value, loc: retLoc }, nextIdx: result.nextLineIdx, }; } const call = parseCallRef(runBody); if (call) { rejectTrailingContent(filePath, innerNo, "run", call.rest); + const callee = { value: call.ref, loc: retLoc }; return { - step: { - type: "return", - value: `run ${call.ref}(${argsToSourceForm(call.args)})`, - loc: retLoc, - managed: { - kind: "run", ref: { value: call.ref, loc: retLoc }, args: call.args, - }, - }, + step: { type: "return", value: { kind: "call", callee, args: call.args }, loc: retLoc }, nextIdx: idx + 1, }; } @@ -519,15 +521,9 @@ export function parseBlockStatement( const call = parseCallRef(returnValue.slice("ensure ".length).trim()); if (call) { rejectTrailingContent(filePath, innerNo, "ensure", call.rest); + const callee = { value: call.ref, loc: retLoc }; return { - step: { - type: "return", - value: `ensure ${call.ref}(${argsToSourceForm(call.args)})`, - loc: retLoc, - managed: { - kind: "ensure", ref: { value: call.ref, loc: retLoc }, args: call.args, - }, - }, + step: { type: "return", value: { kind: "ensure_call", callee, args: call.args }, loc: retLoc }, nextIdx: idx + 1, }; } @@ -558,17 +554,17 @@ export function parseBlockStatement( } const isBareDotted = isBareDottedIdentifierReturn(returnValue); const isBare = !isBareDotted && isBareIdentifierReturn(returnValue); - const value = isBareDotted + const raw = isBareDotted ? dottedReturnToQuotedString(returnValue) : isBare ? bareIdentifierToQuotedString(returnValue) : returnValue; - const step = { type: "return" as const, value, loc: retLoc }; + const value: Expr = { kind: "literal", raw }; if (isBareDotted || isBare) { - trivia.setNode(step, { bareSource: returnValue.trim() }); + trivia.setNode(value, { bareSource: returnValue.trim() }); } return { - step, + step: { type: "return", value, loc: retLoc }, nextIdx: idx + 1, }; } @@ -581,7 +577,7 @@ export function parseBlockStatement( const matchLoc = { line: innerNo, col: innerRaw.indexOf("match") + 1 }; const { expr, nextIndex } = parseMatchExpr(filePath, lines, idx, subject, matchLoc); return { - step: { type: "match", expr }, + step: execStep({ kind: "match", match: expr }, matchLoc), nextIdx: nextIndex, }; } @@ -593,12 +589,12 @@ export function parseBlockStatement( } const arrowIdx = inner.indexOf("<-"); const rhsCol = arrowIdx >= 0 ? arrowIdx + 3 : 1; - const { rhs, nextIdx: sendNextIdx } = parseSendRhs(filePath, sendMatch.rhsText, innerNo, rhsCol, lines, idx, trivia); + const { value, nextIdx: sendNextIdx } = parseSendRhs(filePath, sendMatch.rhsText, innerNo, rhsCol, lines, idx, trivia); return { step: { type: "send", channel: sendMatch.channel, - rhs, + value, loc: { line: innerNo, col: 1 }, }, nextIdx: sendNextIdx, @@ -606,11 +602,10 @@ export function parseBlockStatement( } return { - step: { - type: "shell", - command: inner, - loc: { line: innerNo, col: colFromRaw(innerRaw) }, - }, + step: execStep( + { kind: "shell", command: inner, loc: { line: innerNo, col: colFromRaw(innerRaw) } }, + { line: innerNo, col: colFromRaw(innerRaw) }, + ), nextIdx: idx + 1, }; } diff --git a/src/parse/workflows.ts b/src/parse/workflows.ts index d972d133..341afbd4 100644 --- a/src/parse/workflows.ts +++ b/src/parse/workflows.ts @@ -79,8 +79,14 @@ export function parseWorkflowBlock( }, ); workflow.steps.push(...bodySteps); - // Strip trailing blank_line (whitespace before closing brace). - while (workflow.steps.length > 0 && workflow.steps[workflow.steps.length - 1].type === "blank_line") { + // Strip trailing blank_line trivia (whitespace before closing brace). + while ( + workflow.steps.length > 0 && + (() => { + const last = workflow.steps[workflow.steps.length - 1]; + return last.type === "trivia" && last.kind === "blank_line"; + })() + ) { workflow.steps.pop(); } return { workflow, nextIndex: afterClose, exported: isExported }; diff --git a/src/runtime/kernel/node-workflow-runtime.ts b/src/runtime/kernel/node-workflow-runtime.ts index a557be73..fa34f366 100644 --- a/src/runtime/kernel/node-workflow-runtime.ts +++ b/src/runtime/kernel/node-workflow-runtime.ts @@ -6,7 +6,7 @@ import { randomUUID } from "node:crypto"; import { AsyncLocalStorage } from "node:async_hooks"; import { inlineScriptName } from "../../inline-script-name"; import { argsToRuntimeString } from "../../parse/core"; -import type { MatchExprDef, WorkflowStepDef } from "../../types"; +import type { CatchBody, Expr, MatchExprDef, WorkflowStepDef } from "../../types"; import { executePrompt, resolveConfig, resolveModel, resolvePromptStepName } from "./prompt"; import { appendRunSummaryLine } from "./emit"; import { buildStepDisplayParamPairs } from "../../cli/commands/format-params.js"; @@ -33,8 +33,6 @@ import { linesOfDelimitedString } from "../string-lines"; export type { MockBodyDef } from "./runtime-mock"; -type EnsureRecover = Extract["catch"]; - const HANDLE_PREFIX = "__JAIPH_HANDLE__"; type AsyncHandle = { @@ -509,6 +507,72 @@ export class NodeWorkflowRuntime { return { ok: false, result: { status: 1, output: "", error: "match: no arm matched" } }; } + /** + * Evaluate an `Expr` to its string value, executing any managed call + * (call/ensure_call/inline_script/match/prompt) and returning its captured + * result. Used by `const` / `return` / `send` / `say` step handlers so they + * don't each duplicate the dispatch table. + * + * `promptCaptureName` lets callers route prompt-side effects (e.g. schema + * field exports) into a scope binding; pass `undefined` for non-capture + * positions. + */ + private async evaluateExpr( + scope: Scope, + expr: Expr, + promptCaptureName: string | undefined, + io: StepIO | undefined, + ): Promise<{ ok: true; value: string; output: string } | { ok: false; result: StepResult; output: string }> { + if (expr.kind === "literal") { + const ir = await this.interpolateWithCaptures(expr.raw, scope); + if (!ir.ok) return { ok: false, result: ir.result, output: "" }; + return { ok: true, value: ir.value, output: "" }; + } + if (expr.kind === "call") { + const r = await this.executeRunRef(scope, expr.callee.value, argsToRuntimeString(expr.args)); + if (r.status !== 0) return { ok: false, result: r, output: "" }; + return { ok: true, value: r.returnValue ?? r.output.trim(), output: "" }; + } + if (expr.kind === "ensure_call") { + const r = await this.executeEnsureRef(scope, expr.callee.value, argsToRuntimeString(expr.args), undefined); + if (r.status !== 0) return { ok: false, result: r, output: "" }; + return { ok: true, value: r.returnValue ?? r.output.trim(), output: "" }; + } + if (expr.kind === "inline_script") { + const shebang = expr.lang ? `#!/usr/bin/env ${expr.lang}` : undefined; + const r = await this.executeInlineScript(scope, expr.body, shebang, argsToRuntimeString(expr.args)); + if (r.status !== 0) return { ok: false, result: r, output: "" }; + return { ok: true, value: r.returnValue ?? r.output.trim(), output: "" }; + } + if (expr.kind === "match") { + const mr = await this.evaluateMatch(scope, expr.match); + if (!mr.ok) return { ok: false, result: mr.result, output: "" }; + return { ok: true, value: mr.value, output: "" }; + } + if (expr.kind === "prompt") { + if (expr.returns !== undefined && !promptCaptureName) { + return { + ok: false, + result: { status: 1, output: "", error: 'prompt with "returns" schema must capture to a variable' }, + output: "", + }; + } + const r = await this.runPromptStep(scope, expr.raw, expr.returns, promptCaptureName, io); + if (!r.ok) return { ok: false, result: r.result, output: r.output }; + // For captured prompts `runPromptStep` writes the value into scope and we + // return that here; non-capture prompts (no binding) yield empty string. + const value = promptCaptureName ? (scope.vars.get(promptCaptureName) ?? "") : ""; + return { ok: true, value, output: r.output }; + } + // shell / bare_ref should never reach the runtime — validator rejects them + // outside their narrow send-RHS lane (and shell-as-send is rejected too). + return { + ok: false, + result: { status: 1, output: "", error: `unsupported expression kind in runtime: ${expr.kind}` }, + output: "", + }; + } + private async executeSteps(scope: Scope, steps: WorkflowStepDef[], io?: StepIO): Promise { let accOut = ""; let accErr = ""; @@ -517,23 +581,34 @@ export class NodeWorkflowRuntime { const localHandleIds: string[] = []; let asyncCounter = 0; for (const step of steps) { - if (step.type === "comment" || step.type === "blank_line") continue; - if (step.type === "log" || step.type === "logerr") { - const level = step.type === "log" ? "LOG" : "LOGERR"; + if (step.type === "trivia") continue; + if (step.type === "say") { let message: string; - if (step.managed?.kind === "run_inline_script") { - const shebang = step.managed.lang ? `#!/usr/bin/env ${step.managed.lang}` : undefined; - const result = await this.executeInlineScript(scope, step.managed.body, shebang, argsToRuntimeString(step.managed.args)); + if (step.message.kind === "inline_script") { + const shebang = step.message.lang ? `#!/usr/bin/env ${step.message.lang}` : undefined; + const result = await this.executeInlineScript(scope, step.message.body, shebang, argsToRuntimeString(step.message.args)); if (result.status !== 0) return this.mergeStepResult(accOut, accErr, result); message = result.returnValue ?? result.output.trim(); - } else { - const ir = await this.interpolateWithCaptures(step.message, scope); + } else if (step.message.kind === "literal") { + const ir = await this.interpolateWithCaptures(step.message.raw, scope); if (!ir.ok) return this.mergeStepResult(accOut, accErr, ir.result); - message = ir.value; + message = step.level === "fail" || step.level === "logerr" + ? stripOuterQuotes(ir.value) + : ir.value; + } else { + return this.mergeStepResult(accOut, accErr, { + status: 1, + output: "", + error: `unsupported ${step.level} message kind: ${step.message.kind}`, + }); + } + if (step.level === "fail") { + return this.mergeStepResult(accOut, accErr, { status: 1, output: "", error: message }); } - this.emitter.emitLog(level, message); + const eventLevel = step.level === "log" ? "LOG" : "LOGERR"; + this.emitter.emitLog(eventLevel, message); const chunk = `${message}\n`; - if (level === "LOG") { + if (step.level === "log") { accOut += chunk; io?.appendOut(chunk); } else { @@ -542,51 +617,18 @@ export class NodeWorkflowRuntime { } continue; } - if (step.type === "fail") { - const failIr = await this.interpolateWithCaptures(step.message, scope); - if (!failIr.ok) return this.mergeStepResult(accOut, accErr, failIr.result); - const message = failIr.value; - return this.mergeStepResult(accOut, accErr, { status: 1, output: "", error: message }); - } - if (step.type === "shell") { - const cmdIr = await this.interpolateWithCaptures(step.command, scope); - if (!cmdIr.ok) return this.mergeStepResult(accOut, accErr, cmdIr.result); - const stepName = `sh_line_${step.loc.line}`; - const result = await this.executeManagedStep( - "script", - stepName, - [], - (io) => this.executeShLine(scope, cmdIr.value, io), - ); - if (result.status !== 0) return this.mergeStepResult(accOut, accErr, result); - continue; - } if (step.type === "return") { - if (step.managed) { - if (step.managed.kind === "match") { - const matchResult = await this.evaluateMatch(scope, step.managed.match); - if (!matchResult.ok) return this.mergeStepResult(accOut, accErr, matchResult.result); - returnValue = matchResult.value; - return this.mergeStepResult(accOut, accErr, { status: 0, output: "", error: "", returnValue }); - } - if (step.managed.kind === "run_inline_script") { - const shebang = step.managed.lang ? `#!/usr/bin/env ${step.managed.lang}` : undefined; - const result = await this.executeInlineScript(scope, step.managed.body, shebang, argsToRuntimeString(step.managed.args)); - if (result.status !== 0) return this.mergeStepResult(accOut, accErr, result); - returnValue = result.returnValue ?? result.output.trim(); - return this.mergeStepResult(accOut, accErr, { status: 0, output: "", error: "", returnValue }); - } - const result = step.managed.kind === "run" - ? await this.executeRunRef(scope, step.managed.ref.value, argsToRuntimeString(step.managed.args)) - : await this.executeEnsureRef(scope, step.managed.ref.value, argsToRuntimeString(step.managed.args), undefined); - if (result.status !== 0) return this.mergeStepResult(accOut, accErr, result); - returnValue = result.returnValue ?? result.output.trim(); + const value = step.value; + if (value.kind === "literal") { + const retIr = await this.interpolateWithCaptures(value.raw, scope); + if (!retIr.ok) return this.mergeStepResult(accOut, accErr, retIr.result); + returnValue = stripOuterQuotes(retIr.value); return this.mergeStepResult(accOut, accErr, { status: 0, output: "", error: "", returnValue }); } - // Match Bash semantics: return "$var" should return var value, not literal quotes. - const retIr = await this.interpolateWithCaptures(step.value, scope); - if (!retIr.ok) return this.mergeStepResult(accOut, accErr, retIr.result); - returnValue = stripOuterQuotes(retIr.value); + const r = await this.evaluateExpr(scope, value, undefined, io); + accOut += r.output; + if (!r.ok) return this.mergeStepResult(accOut, accErr, r.result); + returnValue = r.value; return this.mergeStepResult(accOut, accErr, { status: 0, output: "", error: "", returnValue }); } if (step.type === "send") { @@ -599,23 +641,20 @@ export class NodeWorkflowRuntime { }); } let payload = ""; - if (step.rhs.kind === "literal") { - const sendIr = await this.interpolateWithCaptures(step.rhs.token, scope); + const sendValue = step.value; + if (sendValue.kind === "literal") { + const sendIr = await this.interpolateWithCaptures(sendValue.raw, scope); if (!sendIr.ok) return this.mergeStepResult(accOut, accErr, sendIr.result); payload = stripOuterQuotes(sendIr.value); - } else if (step.rhs.kind === "var") { - const sendHandleErr = await this.resolveHandlesInInput(scope, step.rhs.bash); - if (sendHandleErr) return this.mergeStepResult(accOut, accErr, sendHandleErr); - payload = interpolate(step.rhs.bash, scope.vars, scope.env); - } else if (step.rhs.kind === "run") { - const runValue = await this.executeRunRef(scope, step.rhs.ref.value, argsToRuntimeString(step.rhs.args)); - if (runValue.status !== 0) return this.mergeStepResult(accOut, accErr, runValue); - payload = runValue.returnValue ?? runValue.output.trim(); + } else if (sendValue.kind === "call") { + const r = await this.executeRunRef(scope, sendValue.callee.value, argsToRuntimeString(sendValue.args)); + if (r.status !== 0) return this.mergeStepResult(accOut, accErr, r); + payload = r.returnValue ?? r.output.trim(); } else { return this.mergeStepResult(accOut, accErr, { status: 1, output: "", - error: "unsupported send rhs in node runtime", + error: `unsupported send value kind: ${sendValue.kind}`, }); } this.inboxSeq += 1; @@ -627,7 +666,6 @@ export class NodeWorkflowRuntime { sender: senderName, seqPadded, }; - // Route to the nearest ancestor context that has a route for this channel. let targetCtx = ctx; let routed = false; for (let i = this.workflowCtxStack.length - 1; i >= 0; i -= 1) { @@ -638,8 +676,6 @@ export class NodeWorkflowRuntime { } } targetCtx.queue.push(msg); - // Persist inbox file only when a route consumes the channel — otherwise - // the file would be dead audit data with no corresponding dispatch. if (routed) { const inboxFileDir = join(this.runDir, "inbox"); mkdirSync(inboxFileDir, { recursive: true }); @@ -658,95 +694,54 @@ export class NodeWorkflowRuntime { ); continue; } - if (step.type === "prompt") { - if (step.returns !== undefined && !step.captureName) { - return this.mergeStepResult(accOut, accErr, { - status: 1, - output: "", - error: 'prompt with "returns" schema must capture to a variable', - }); - } - const r = await this.runPromptStep(scope, step.raw, step.returns, step.captureName, io); - accOut += r.output; - if (!r.ok) return this.mergeStepResult(accOut, accErr, r.result); - continue; - } if (step.type === "const") { - if (step.value.kind === "expr") { - const exprIr = await this.interpolateWithCaptures(step.value.bashRhs, scope); + const v = step.value; + if (v.kind === "literal") { + const exprIr = await this.interpolateWithCaptures(v.raw, scope); if (!exprIr.ok) return this.mergeStepResult(accOut, accErr, exprIr.result); scope.vars.set(step.name, stripOuterQuotes(exprIr.value)); continue; } - if (step.value.kind === "run_capture") { - const captureRef = step.value.ref.value; - const captureArgs = argsToRuntimeString(step.value.args); - if (step.value.async) { - // Async capture: create handle, store in scope, register for join. - asyncCounter += 1; - const branchStack = [...this.getFrameStack()]; - const branchIndices = [...this.getAsyncIndices(), asyncCounter]; - const promise = this.asyncFrameStack.run(branchStack, () => - this.asyncIndicesStorage.run(branchIndices, () => - this.executeRunRef(scope, captureRef, captureArgs), - ), - ); - const handleId = this.createHandle(captureRef, promise); - localHandleIds.push(handleId); - scope.vars.set(step.name, handleId); - continue; - } - const runResult = await this.executeRunRef(scope, captureRef, captureArgs); - if (runResult.status !== 0) return this.mergeStepResult(accOut, accErr, runResult); - scope.vars.set(step.name, runResult.returnValue ?? runResult.output.trim()); - continue; - } - if (step.value.kind === "run_inline_script_capture") { - const shebang = step.value.lang ? `#!/usr/bin/env ${step.value.lang}` : undefined; - const result = await this.executeInlineScript(scope, step.value.body, shebang, argsToRuntimeString(step.value.args)); - if (result.status !== 0) return this.mergeStepResult(accOut, accErr, result); - scope.vars.set(step.name, result.returnValue ?? result.output.trim()); - continue; - } - if (step.value.kind === "ensure_capture") { - const ensureResult = await this.executeEnsureRef(scope, step.value.ref.value, argsToRuntimeString(step.value.args), undefined); - if (ensureResult.status !== 0) return this.mergeStepResult(accOut, accErr, ensureResult); - scope.vars.set(step.name, ensureResult.returnValue ?? ensureResult.output.trim()); - continue; - } - if (step.value.kind === "match_expr") { - const matchResult = await this.evaluateMatch(scope, step.value.match); - if (!matchResult.ok) return this.mergeStepResult(accOut, accErr, matchResult.result); - scope.vars.set(step.name, matchResult.value); - continue; - } - if (step.value.kind === "prompt_capture") { - const r = await this.runPromptStep( - scope, - step.value.raw, - step.value.returns, - step.name, - io, + if (v.kind === "call" && v.async) { + asyncCounter += 1; + const captureRef = v.callee.value; + const captureArgs = argsToRuntimeString(v.args); + const branchStack = [...this.getFrameStack()]; + const branchIndices = [...this.getAsyncIndices(), asyncCounter]; + const promise = this.asyncFrameStack.run(branchStack, () => + this.asyncIndicesStorage.run(branchIndices, () => + this.executeRunRef(scope, captureRef, captureArgs), + ), ); - accOut += r.output; - if (!r.ok) return this.mergeStepResult(accOut, accErr, r.result); + const handleId = this.createHandle(captureRef, promise); + localHandleIds.push(handleId); + scope.vars.set(step.name, handleId); continue; } + const r = await this.evaluateExpr(scope, v, step.name, io); + accOut += r.output; + if (!r.ok) return this.mergeStepResult(accOut, accErr, r.result); + // Prompt handlers bind via captureName side effect inside runPromptStep; + // all other Expr kinds bind here. + if (v.kind !== "prompt") { + scope.vars.set(step.name, r.value); + } + continue; } - if (step.type === "run") { - if (step.async) { + if (step.type === "exec") { + const body = step.body; + if (body.kind === "call" && body.async) { asyncCounter += 1; const branchStack = [...this.getFrameStack()]; const branchIndices = [...this.getAsyncIndices(), asyncCounter]; - const ref = step.workflow.value; - const argsRaw = argsToRuntimeString(step.args); + const ref = body.callee.value; + const argsRaw = argsToRuntimeString(body.args); const runInBranch = (fn: () => Promise): Promise => this.asyncFrameStack.run(branchStack, () => this.asyncIndicesStorage.run(branchIndices, fn), ); let promise: Promise; if (step.recover) { - // Async + recover loop: wrap retry logic in a single promise. const recoverLimit = this.resolveRecoverLimit(scope.filePath); const recover = step.recover; promise = runInBranch(async () => { @@ -761,7 +756,6 @@ export class NodeWorkflowRuntime { return lastResult; }); } else if (step.catch) { - // Async + catch: single-shot recovery in the async branch. const recover = step.catch; promise = runInBranch(async () => { const result = await this.executeRunRef(scope, ref, argsRaw); @@ -779,55 +773,99 @@ export class NodeWorkflowRuntime { if (step.captureName) scope.vars.set(step.captureName, handleId); continue; } - if (step.recover) { - const limit = this.resolveRecoverLimit(scope.filePath); - let lastResult = await this.executeRunRef(scope, step.workflow.value, argsToRuntimeString(step.args)); - let attempt = 1; - while (lastResult.status !== 0 && attempt <= limit) { - const rr = await this.runRecoverBody(scope, step.recover, `${lastResult.output}${lastResult.error}`); - if (rr.status !== 0 || rr.returnValue !== undefined) return this.mergeStepResult(accOut, accErr, rr); - lastResult = await this.executeRunRef(scope, step.workflow.value, argsToRuntimeString(step.args)); - attempt += 1; + if (body.kind === "call") { + if (step.recover) { + const limit = this.resolveRecoverLimit(scope.filePath); + const ref = body.callee.value; + const argsRaw = argsToRuntimeString(body.args); + let lastResult = await this.executeRunRef(scope, ref, argsRaw); + let attempt = 1; + while (lastResult.status !== 0 && attempt <= limit) { + const rr = await this.runRecoverBody(scope, step.recover, `${lastResult.output}${lastResult.error}`); + if (rr.status !== 0 || rr.returnValue !== undefined) return this.mergeStepResult(accOut, accErr, rr); + lastResult = await this.executeRunRef(scope, ref, argsRaw); + attempt += 1; + } + if (lastResult.status === 0) { + if (step.captureName) { + scope.vars.set(step.captureName, lastResult.returnValue ?? lastResult.output.trim()); + } + } else { + return this.mergeStepResult(accOut, accErr, lastResult); + } + continue; } - if (lastResult.status === 0) { + const runResult = await this.executeRunRef(scope, body.callee.value, argsToRuntimeString(body.args)); + if (runResult.status === 0) { if (step.captureName) { - scope.vars.set(step.captureName, lastResult.returnValue ?? lastResult.output.trim()); + scope.vars.set(step.captureName, runResult.returnValue ?? runResult.output.trim()); } + } else if (step.catch) { + const rr = await this.runRecoverBody(scope, step.catch, `${runResult.output}${runResult.error}`); + if (rr.status !== 0 || rr.returnValue !== undefined) return this.mergeStepResult(accOut, accErr, rr); } else { - return this.mergeStepResult(accOut, accErr, lastResult); + return this.mergeStepResult(accOut, accErr, runResult); } continue; } - const runResult = await this.executeRunRef(scope, step.workflow.value, argsToRuntimeString(step.args)); - if (runResult.status === 0) { - if (step.captureName) { - scope.vars.set(step.captureName, runResult.returnValue ?? runResult.output.trim()); + if (body.kind === "ensure_call") { + const ensureResult = await this.executeEnsureRef(scope, body.callee.value, argsToRuntimeString(body.args), step.catch); + if (step.captureName && ensureResult.status === 0) { + scope.vars.set(step.captureName, ensureResult.returnValue ?? ensureResult.output.trim()); } - } else if (step.catch) { - const rr = await this.runRecoverBody(scope, step.catch, `${runResult.output}${runResult.error}`); - if (rr.status !== 0 || rr.returnValue !== undefined) return this.mergeStepResult(accOut, accErr, rr); - } else { - return this.mergeStepResult(accOut, accErr, runResult); + if (ensureResult.status !== 0) return this.mergeStepResult(accOut, accErr, ensureResult); + if (ensureResult.recoverReturn) return this.mergeStepResult(accOut, accErr, ensureResult); + continue; } - continue; - } - if (step.type === "run_inline_script") { - const shebang = step.lang ? `#!/usr/bin/env ${step.lang}` : undefined; - const result = await this.executeInlineScript(scope, step.body, shebang, argsToRuntimeString(step.args)); - if (step.captureName && result.status === 0) { - scope.vars.set(step.captureName, result.returnValue ?? result.output.trim()); + if (body.kind === "inline_script") { + const shebang = body.lang ? `#!/usr/bin/env ${body.lang}` : undefined; + const result = await this.executeInlineScript(scope, body.body, shebang, argsToRuntimeString(body.args)); + if (step.captureName && result.status === 0) { + scope.vars.set(step.captureName, result.returnValue ?? result.output.trim()); + } + if (result.status !== 0) return this.mergeStepResult(accOut, accErr, result); + continue; } - if (result.status !== 0) return this.mergeStepResult(accOut, accErr, result); - continue; - } - if (step.type === "ensure") { - const ensureResult = await this.executeEnsureRef(scope, step.ref.value, argsToRuntimeString(step.args), step.catch); - if (step.captureName && ensureResult.status === 0) { - scope.vars.set(step.captureName, ensureResult.returnValue ?? ensureResult.output.trim()); + if (body.kind === "prompt") { + if (body.returns !== undefined && !step.captureName) { + return this.mergeStepResult(accOut, accErr, { + status: 1, + output: "", + error: 'prompt with "returns" schema must capture to a variable', + }); + } + const r = await this.runPromptStep(scope, body.raw, body.returns, step.captureName, io); + accOut += r.output; + if (!r.ok) return this.mergeStepResult(accOut, accErr, r.result); + continue; } - if (ensureResult.status !== 0) return this.mergeStepResult(accOut, accErr, ensureResult); - if (ensureResult.recoverReturn) return this.mergeStepResult(accOut, accErr, ensureResult); - continue; + if (body.kind === "match") { + const matchResult = await this.evaluateMatch(scope, body.match); + if (!matchResult.ok) return this.mergeStepResult(accOut, accErr, matchResult.result); + if (step.captureName) scope.vars.set(step.captureName, matchResult.value); + continue; + } + if (body.kind === "shell") { + const cmdIr = await this.interpolateWithCaptures(body.command, scope); + if (!cmdIr.ok) return this.mergeStepResult(accOut, accErr, cmdIr.result); + const stepName = `sh_line_${body.loc.line}`; + const result = await this.executeManagedStep( + "script", + stepName, + [], + (io) => this.executeShLine(scope, cmdIr.value, io), + ); + if (step.captureName && result.status === 0) { + scope.vars.set(step.captureName, result.returnValue ?? result.output.trim()); + } + if (result.status !== 0) return this.mergeStepResult(accOut, accErr, result); + continue; + } + return this.mergeStepResult(accOut, accErr, { + status: 1, + output: "", + error: `unsupported exec body kind in runtime: ${body.kind}`, + }); } if (step.type === "if") { // Resolve handle if the subject variable is a handle. @@ -873,12 +911,6 @@ export class NodeWorkflowRuntime { } continue; } - if (step.type === "match") { - const matchResult = await this.evaluateMatch(scope, step.expr); - if (!matchResult.ok) return this.mergeStepResult(accOut, accErr, matchResult.result); - // Standalone match: value is discarded - continue; - } } // Implicit join: await all unresolved handles created in this scope before returning. if (localHandleIds.length > 0) { @@ -1183,7 +1215,7 @@ export class NodeWorkflowRuntime { scope: Scope, ref: string, argsRaw: string, - catchDef: EnsureRecover | undefined, + catchDef: CatchBody | undefined, ): Promise { const resolvedArgs = await this.resolveArgsRaw(scope, argsRaw); if (!Array.isArray(resolvedArgs)) return resolvedArgs; diff --git a/src/transpile/compiler-edge.acceptance.test.ts b/src/transpile/compiler-edge.acceptance.test.ts index ca99a578..e2b7a17c 100644 --- a/src/transpile/compiler-edge.acceptance.test.ts +++ b/src/transpile/compiler-edge.acceptance.test.ts @@ -366,9 +366,11 @@ test("ACCEPTANCE: prompt with returns schema (single-line) parses and emits type const step = mod.workflows[0].steps[0]; assert.equal(step.type, "const"); assert.ok(step.type === "const" && step.name === "result"); - assert.ok(step.type === "const" && step.value.kind === "prompt_capture"); - assert.ok(step.type === "const" && step.value.returns !== undefined); - assert.match(step.value.returns!, /type:\s*string/); + assert.ok(step.type === "const" && step.value.kind === "prompt"); + if (step.type === "const" && step.value.kind === "prompt") { + assert.ok(step.value.returns !== undefined); + assert.match(step.value.returns!, /type:\s*string/); + } withTempDir("jaiph-acc-prompt-returns-", (root) => { writeFileSync( @@ -398,10 +400,12 @@ test("ACCEPTANCE: prompt with returns schema (multiline continuation) parses", ( assert.equal(mod.workflows.length, 1); const step = mod.workflows[0].steps[0]; assert.equal(step.type, "const"); - assert.ok(step.type === "const" && step.value.kind === "prompt_capture"); - assert.ok(step.type === "const" && step.value.returns !== undefined); - assert.match(step.value.returns!, /type:\s*string/); - assert.match(step.value.returns!, /risk:\s*string/); + assert.ok(step.type === "const" && step.value.kind === "prompt"); + if (step.type === "const" && step.value.kind === "prompt") { + assert.ok(step.value.returns !== undefined); + assert.match(step.value.returns!, /type:\s*string/); + assert.match(step.value.returns!, /risk:\s*string/); + } }); test("ACCEPTANCE: unsupported type in returns schema fails with E_SCHEMA", () => { diff --git a/src/transpile/compiler-golden.test.ts b/src/transpile/compiler-golden.test.ts index c263ff70..cc89a45e 100644 --- a/src/transpile/compiler-golden.test.ts +++ b/src/transpile/compiler-golden.test.ts @@ -109,13 +109,17 @@ test("parser: assignment capture parses for ensure, run, and const run capture", const steps = mod.workflows[0].steps; assert.equal(steps.length, 2); assert.equal(steps[0].type, "const"); - const c0 = steps[0] as { type: "const"; name: string; value: { kind: string } }; - assert.equal(c0.name, "response"); - assert.equal(c0.value.kind, "ensure_capture"); + const c0 = steps[0]; + if (c0.type === "const") { + assert.equal(c0.name, "response"); + assert.equal(c0.value.kind, "ensure_call"); + } assert.equal(steps[1].type, "const"); - const c1 = steps[1] as { type: "const"; name: string; value: { kind: string } }; - assert.equal(c1.name, "out"); - assert.equal(c1.value.kind, "run_capture"); + const c1 = steps[1]; + if (c1.type === "const") { + assert.equal(c1.name, "out"); + assert.equal(c1.value.kind, "call"); + } }); test("parser: config block parses and populates mod.metadata", () => { @@ -343,13 +347,13 @@ test("parser: run ... catch parses correctly", () => { ].join("\n"); const mod = parsejaiph(source, "/fake/entry.jh"); const step = mod.workflows[0].steps[0]; - assert.equal(step.type, "run"); - if (step.type === "run") { + assert.equal(step.type, "exec"); + if (step.type === "exec" && step.body.kind === "call") { assert.ok(step.catch); assert.equal(step.catch!.bindings.failure, "err"); const recoverSteps = "block" in step.catch! ? step.catch!.block : [step.catch!.single]; assert.equal(recoverSteps.length, 1); - assert.equal(recoverSteps[0].type, "log"); + assert.equal(recoverSteps[0].type, "say"); } }); @@ -360,9 +364,14 @@ test("parser: fail step parses quoted message", () => { "}", ].join("\n"); const mod = parsejaiph(source, "/fake/entry.jh"); - const step = mod.workflows[0].steps[0] as { type: string; message: string }; - assert.equal(step.type, "fail"); - assert.equal(step.message, '"expected reason"'); + const step = mod.workflows[0].steps[0]; + assert.equal(step.type, "say"); + if (step.type === "say") { + assert.equal(step.level, "fail"); + if (step.message.kind === "literal") { + assert.equal(step.message.raw, '"expected reason"'); + } + } }); test("parser: const string expr and const run capture parse", () => { @@ -376,15 +385,19 @@ test("parser: const string expr and const run capture parse", () => { const mod = parsejaiph(source, "/fake/entry.jh"); const steps = mod.workflows[0].steps; assert.equal(steps.length, 2); - const c0 = steps[0] as { type: string; name: string; value: { kind: string; bashRhs?: string } }; - const c1 = steps[1] as { type: string; name: string; value: { kind: string } }; + const c0 = steps[0]; + const c1 = steps[1]; assert.equal(c0.type, "const"); - assert.equal(c0.name, "msg"); - assert.equal(c0.value.kind, "expr"); - assert.equal(c0.value.bashRhs, '"hi"'); + if (c0.type === "const") { + assert.equal(c0.name, "msg"); + assert.equal(c0.value.kind, "literal"); + if (c0.value.kind === "literal") assert.equal(c0.value.raw, '"hi"'); + } assert.equal(c1.type, "const"); - assert.equal(c1.name, "out"); - assert.equal(c1.value.kind, "run_capture"); + if (c1.type === "const") { + assert.equal(c1.name, "out"); + assert.equal(c1.value.kind, "call"); + } }); test("parser: const rejects bare call-like rhs without run", () => { @@ -408,16 +421,13 @@ test("parser: const allows run-wrapped script call with args", () => { "}", ].join("\n"); const mod = parsejaiph(source, "/fake/entry.jh"); - const step = mod.workflows[0].steps[0] as { - type: string; - name: string; - value: { kind: string; ref?: { value: string }; args?: import("../types").Arg[] }; - }; + const step = mod.workflows[0].steps[0]; assert.equal(step.type, "const"); - assert.equal(step.name, "x"); - assert.equal(step.value.kind, "run_capture"); - assert.equal(step.value.ref?.value, "some_script"); - assert.deepEqual(step.value.args, [{ kind: "var", name: "arg1" }]); + if (step.type === "const" && step.value.kind === "call") { + assert.equal(step.name, "x"); + assert.equal(step.value.callee.value, "some_script"); + assert.deepEqual(step.value.args, [{ kind: "var", name: "arg1" }]); + } }); test("parser: const prompt capture parses", () => { @@ -427,14 +437,12 @@ test("parser: const prompt capture parses", () => { "}", ].join("\n"); const mod = parsejaiph(source, "/fake/entry.jh"); - const step = mod.workflows[0].steps[0] as { - type: string; - name: string; - value: { kind: string }; - }; + const step = mod.workflows[0].steps[0]; assert.equal(step.type, "const"); - assert.equal(step.name, "ans"); - assert.equal(step.value.kind, "prompt_capture"); + if (step.type === "const") { + assert.equal(step.name, "ans"); + assert.equal(step.value.kind, "prompt"); + } }); test("parser: wait parses as workflow step (not shell)", () => { @@ -478,8 +486,10 @@ test("parser: send operator parses channel <- \"literal\"", () => { const step = mod.workflows[0].steps[0]; assert.equal(step.type, "send"); if (step.type !== "send") throw new Error("expected send"); - assert.equal(step.rhs.kind, "literal"); - assert.equal(step.rhs.token, `"hello"`); + assert.equal(step.value.kind, "literal"); + if (step.value.kind === "literal") { + assert.equal(step.value.raw, `"hello"`); + } assert.equal(step.channel, "findings"); }); @@ -597,7 +607,7 @@ test("parser: <- inside quotes is not a send", () => { ].join("\n"); const mod = parsejaiph(source, "/fake/entry.jh"); assert.equal(mod.workflows[0].steps.length, 1); - assert.equal(mod.workflows[0].steps[0].type, "log"); + assert.equal(mod.workflows[0].steps[0].type, "say"); }); test("parser: channel route declaration parses into ChannelDef.routes", () => { @@ -659,8 +669,12 @@ test("parser: capture + send is E_PARSE", () => { "}", ].join("\n"); const mod = parsejaiph(source, "/fake/entry.jh"); - // Parsed as a shell step; validation will reject it later - assert.equal(mod.workflows[0].steps[0].type, "shell"); + // Parsed as an exec step with shell body; validation will reject it later + const step = mod.workflows[0].steps[0]; + assert.equal(step.type, "exec"); + if (step.type === "exec") { + assert.equal(step.body.kind, "shell"); + } }); // === Top-level const (env declaration) tests === diff --git a/src/transpile/emit-script.ts b/src/transpile/emit-script.ts index 5ccf8675..2de81999 100644 --- a/src/transpile/emit-script.ts +++ b/src/transpile/emit-script.ts @@ -1,5 +1,5 @@ import { inlineScriptName } from "../inline-script-name"; -import type { jaiphModule, ScriptImportDef, WorkflowStepDef } from "../types"; +import type { Expr, jaiphModule, ScriptImportDef, WorkflowStepDef } from "../types"; import { scriptShebangIsBash } from "../parse/script-bash"; import { langToShebang } from "../parse/scripts"; @@ -69,31 +69,50 @@ function wrapBashStandaloneScriptBody(body: string, envPreamble: string): string export type ScriptArtifact = { name: string; content: string }; -/** Collect all inline script steps from a step tree (handles if/else/catch nesting). */ +/** Walk all `Expr` nodes carried by a step and yield inline-script bodies. */ +function emitInlineFromExpr(expr: Expr, seen: Set, out: ScriptArtifact[]): void { + if (expr.kind === "inline_script") { + const shebang = expr.lang ? langToShebang(expr.lang) : undefined; + emitInlineScriptArtifact(expr.body, shebang, seen, out); + } +} + +/** Collect all inline script bodies from a step tree (handles if/for/catch/recover nesting). */ function collectInlineScripts( steps: WorkflowStepDef[], seen: Set, out: ScriptArtifact[], ): void { for (const s of steps) { - if (s.type === "run_inline_script") { - const shebang = s.lang ? langToShebang(s.lang) : undefined; - emitInlineScriptArtifact(s.body, shebang, seen, out); - } else if (s.type === "const" && s.value.kind === "run_inline_script_capture") { - const shebang = s.value.lang ? langToShebang(s.value.lang) : undefined; - emitInlineScriptArtifact(s.value.body, shebang, seen, out); - } else if (s.type === "return" && s.managed?.kind === "run_inline_script") { - const shebang = s.managed.lang ? langToShebang(s.managed.lang) : undefined; - emitInlineScriptArtifact(s.managed.body, shebang, seen, out); - } else if ((s.type === "log" || s.type === "logerr") && s.managed?.kind === "run_inline_script") { - const shebang = s.managed.lang ? langToShebang(s.managed.lang) : undefined; - emitInlineScriptArtifact(s.managed.body, shebang, seen, out); - } else if ((s.type === "ensure" || s.type === "run") && s.catch) { - const recoverSteps = "single" in s.catch ? [s.catch.single] : s.catch.block; - collectInlineScripts(recoverSteps, seen, out); - } else if (s.type === "if") { - collectInlineScripts(s.body, seen, out); - } else if (s.type === "for_lines") { + if (s.type === "exec") { + emitInlineFromExpr(s.body, seen, out); + if (s.catch) { + const recoverSteps = "single" in s.catch ? [s.catch.single] : s.catch.block; + collectInlineScripts(recoverSteps, seen, out); + } + if (s.recover) { + const recoverSteps = "single" in s.recover ? [s.recover.single] : s.recover.block; + collectInlineScripts(recoverSteps, seen, out); + } + continue; + } + if (s.type === "const") { + emitInlineFromExpr(s.value, seen, out); + continue; + } + if (s.type === "return") { + emitInlineFromExpr(s.value, seen, out); + continue; + } + if (s.type === "say") { + emitInlineFromExpr(s.message, seen, out); + continue; + } + if (s.type === "send") { + emitInlineFromExpr(s.value, seen, out); + continue; + } + if (s.type === "if" || s.type === "for_lines") { collectInlineScripts(s.body, seen, out); } } diff --git a/src/transpile/validate-prompt-schema.test.ts b/src/transpile/validate-prompt-schema.test.ts index 9f26300c..9fe1f637 100644 --- a/src/transpile/validate-prompt-schema.test.ts +++ b/src/transpile/validate-prompt-schema.test.ts @@ -65,34 +65,28 @@ test("validatePromptReturnsSchema: rejects malformed entry", () => { // --- validatePromptStepReturns --- test("validatePromptStepReturns: no error when no returns", () => { - const step = { - type: "prompt" as const, - raw: 'prompt "hello"', - loc: { line: 1, col: 1 }, - }; - validatePromptStepReturns(step, "test.jh"); + validatePromptStepReturns( + { loc: { line: 1, col: 1 } }, + undefined, + "test.jh", + ); }); test("validatePromptStepReturns: no error when returns with capture", () => { - const step = { - type: "prompt" as const, - raw: '"hello"', - loc: { line: 1, col: 1 }, - captureName: "result", - returns: "{ name: string }", - }; - validatePromptStepReturns(step, "test.jh"); + validatePromptStepReturns( + { returns: "{ name: string }", loc: { line: 1, col: 1 } }, + "result", + "test.jh", + ); }); test("validatePromptStepReturns: rejects returns without capture", () => { - const step = { - type: "prompt" as const, - raw: 'prompt "hello" returns "{ name: string }"', - loc: { line: 1, col: 1 }, - returns: "{ name: string }", - }; assert.throws( - () => validatePromptStepReturns(step, "test.jh"), + () => validatePromptStepReturns( + { returns: "{ name: string }", loc: { line: 1, col: 1 } }, + undefined, + "test.jh", + ), /must capture to a variable/, ); }); diff --git a/src/transpile/validate-prompt-schema.ts b/src/transpile/validate-prompt-schema.ts index bb475e73..aee7d4b2 100644 --- a/src/transpile/validate-prompt-schema.ts +++ b/src/transpile/validate-prompt-schema.ts @@ -1,5 +1,4 @@ import { jaiphError } from "../errors"; -import type { WorkflowStepDef } from "../types"; const SUPPORTED_SCHEMA_TYPES = new Set(["string", "number", "boolean"]); @@ -51,20 +50,22 @@ export function validatePromptReturnsSchema( } } +/** Validate that a prompt's optional returns schema is well-formed and bound to a capture. */ export function validatePromptStepReturns( - step: Extract, + prompt: { returns?: string; loc: { line: number; col: number } }, + captureName: string | undefined, filePath: string, ): void { - if (step.returns !== undefined) { - if (!step.captureName) { + if (prompt.returns !== undefined) { + if (!captureName) { throw jaiphError( filePath, - step.loc.line, - step.loc.col, + prompt.loc.line, + prompt.loc.col, "E_PARSE", 'prompt with "returns" schema must capture to a variable (e.g. const result = prompt "..." returns "{ ... }")', ); } - validatePromptReturnsSchema(step.returns, filePath, step.loc.line, step.loc.col); + validatePromptReturnsSchema(prompt.returns, filePath, prompt.loc.line, prompt.loc.col); } } diff --git a/src/transpile/validate.ts b/src/transpile/validate.ts index ae944e21..10e63ca1 100644 --- a/src/transpile/validate.ts +++ b/src/transpile/validate.ts @@ -1,7 +1,7 @@ import { existsSync } from "node:fs"; import { dirname, resolve } from "node:path"; import { jaiphError } from "../errors"; -import type { Arg, jaiphModule, MatchExprDef, WorkflowStepDef } from "../types"; +import type { Arg, Expr, jaiphModule, MatchExprDef, WorkflowStepDef } from "../types"; import type { ModuleGraph } from "./module-graph"; import type { SubstitutionValidateEnv } from "./validate-substitution"; import { validateManagedWorkflowShell } from "./validate-substitution"; @@ -113,7 +113,6 @@ function validateMatchExpr(filePath: string, expr: MatchExprDef, knownVars: Set< ); } } - // Reject `return` as the leading token of an arm body. const bodyTrimmed = (arm.tripleQuotedBody ? tripleQuotedRawForRuntime(arm.body) : arm.body).trimStart(); if (/^return(\s|$)/.test(bodyTrimmed)) { throw jaiphError( @@ -124,7 +123,6 @@ function validateMatchExpr(filePath: string, expr: MatchExprDef, knownVars: Set< `match arm body must not start with "return"; the match expression itself produces the value — use the expression directly after =>`, ); } - // Reject inline script forms in arm bodies (backtick `…`() or fenced ```…```()). if (/`[^`]*`\s*\(/.test(bodyTrimmed) || bodyTrimmed.startsWith("```")) { throw jaiphError( filePath, @@ -134,12 +132,6 @@ function validateMatchExpr(filePath: string, expr: MatchExprDef, knownVars: Set< `inline scripts are not allowed in match arm bodies; use a named script with "run script_name(…)" instead`, ); } - // Reject unknown verbs, bare function-call forms, and bare unknown identifiers in arm bodies. - // Allowed bodies: string literal ("..." or """..."""), $var/${var}, - // bare in-scope identifier (param/const/capture), or a verb call: fail "...", run ref(...), ensure ref(...). - // A bare identifier followed by space+content (e.g. `error "msg"`) or by `(` (e.g. `error("msg")`) - // is a programming mistake — most likely a typo for `fail`. A bare identifier not in scope - // (e.g. `true`, `blorp`) is also rejected. Skip the check for triple-quoted bodies since those are literal text. if (!arm.tripleQuotedBody) { const idMatch = bodyTrimmed.match(/^([A-Za-z_][A-Za-z0-9_]*)/); if (idMatch) { @@ -157,9 +149,6 @@ function validateMatchExpr(filePath: string, expr: MatchExprDef, knownVars: Set< `unknown match arm verb "${ident}"; allowed: fail "...", run ref(...), ensure ref(...).${hint}`, ); } - // Reject bare unknown identifiers (e.g. `_ => true`, `_ => blorp`). - // Only bare words with no trailing content reach here — valid ones - // must be in-scope variables (params, consts, captures). if (!startsCall && !startsArgs && after.trim() === "" && !knownVars.has(ident)) { throw jaiphError( filePath, @@ -194,13 +183,17 @@ function collectKnownVars(steps: WorkflowStepDef[], envDecls?: { name: string }[ if (s.type === "const") { vars.add(s.name); } - if ((s.type === "ensure" || s.type === "run" || s.type === "prompt" || s.type === "run_inline_script") && s.captureName) { + if (s.type === "exec" && s.captureName) { vars.add(s.captureName); } - if ((s.type === "ensure" || s.type === "run") && s.catch) { + if (s.type === "exec" && s.catch) { const recoverSteps = "single" in s.catch ? [s.catch.single] : s.catch.block; walk(recoverSteps); } + if (s.type === "exec" && s.recover) { + const recoverSteps = "single" in s.recover ? [s.recover.single] : s.recover.block; + walk(recoverSteps); + } if (s.type === "if") { walk(s.body); } @@ -223,7 +216,6 @@ function validateImmutableBindings( envDecls?: { name: string; loc: { line: number; col: number } }[], moduleScripts?: Set, ): void { - // Map from name → { kind, line } for the first binding site. const bound = new Map(); for (const p of params) { bound.set(p, { kind: "parameter", line: declLoc.line }); @@ -257,19 +249,18 @@ function validateImmutableBindings( if (s.type === "const") { check(s.name, "const", s.loc, b); } - if (s.type === "ensure" && s.captureName) { - check(s.captureName, "capture", s.ref.loc, b); - } - if (s.type === "run" && s.captureName) { - check(s.captureName, "capture", s.workflow.loc, b); + if (s.type === "exec" && s.captureName) { + const captureLoc = execBodyLoc(s.body) ?? s.loc; + check(s.captureName, "capture", captureLoc, b); } - if ((s.type === "prompt" || s.type === "run_inline_script") && s.captureName) { - check(s.captureName, "capture", s.loc, b); - } - if ((s.type === "ensure" || s.type === "run") && s.catch) { + if (s.type === "exec" && s.catch) { const recoverSteps = "single" in s.catch ? [s.catch.single] : s.catch.block; walk(recoverSteps, b); } + if (s.type === "exec" && s.recover) { + const recoverSteps = "single" in s.recover ? [s.recover.single] : s.recover.block; + walk(recoverSteps, b); + } if (s.type === "if") { walk(s.body, b); } @@ -292,7 +283,14 @@ function validateImmutableBindings( walk(steps, bound); } -/** Look up declared params for a workflow or rule target. Returns undefined if target has no declared params. */ +/** Best-effort location for an exec body — used to attribute capture-binding errors. */ +function execBodyLoc(body: Expr): { line: number; col: number } | undefined { + if (body.kind === "call" || body.kind === "ensure_call") return body.callee.loc; + if (body.kind === "prompt" || body.kind === "shell") return body.loc; + if (body.kind === "match") return body.match.loc; + return undefined; +} + function lookupCalleeParams( ref: string, targetKind: "workflow" | "rule", @@ -325,7 +323,6 @@ function lookupCalleeParams( return undefined; } -/** Validate arity: if the callee declares named params, the call must supply exactly that many args. */ function validateArity( filePath: string, loc: { line: number; col: number }, @@ -336,7 +333,7 @@ function validateArity( refCtx: RefResolutionContext, ): void { const params = lookupCalleeParams(ref, targetKind, ast, refCtx); - if (params === undefined) return; // callee not a workflow/rule in scope — skip + if (params === undefined) return; const argCount = args?.length ?? 0; if (argCount !== params.length) { throw jaiphError( @@ -349,7 +346,6 @@ function validateArity( } } -/** Check each var-arg against the in-scope bindings; recover bindings are extra names. */ function validateArgVarRefs( filePath: string, loc: { line: number; col: number }, @@ -372,11 +368,6 @@ function validateArgVarRefs( } } -/** - * Reject nested unmanaged calls inside literal args, e.g. `outer(inner())` or `outer(\`body\`())`. - * Each literal arg is one source segment, so a nested `name(` or `` `...`( `` form is only - * valid when explicitly prefixed with `run` or `ensure`. - */ function validateNestedManagedCallArgs( filePath: string, loc: { line: number; col: number }, @@ -425,7 +416,6 @@ function checkNestedManagedInLiteral( } } -/** Replace double/single-quoted content (and surrounding quotes) with spaces for shape scanning. */ function stripQuotedSegmentContent(segment: string): string { let out = ""; let quote: "'" | '"' | null = null; @@ -448,7 +438,6 @@ function stripQuotedSegmentContent(segment: string): string { return out; } -/** Resolve a route target workflow ref to its declared parameter count. Returns undefined if unresolvable. */ function resolveRouteTargetParams( ref: string, ast: jaiphModule, @@ -469,23 +458,16 @@ function resolveRouteTargetParams( return wf?.params.length; } -/** Resolve a script import path relative to the importing file's directory. */ export function resolveScriptImportPath(fromFile: string, importPath: string): string { return resolve(dirname(fromFile), importPath); } -/** Validate every module in the graph. Equivalent to `validateModule` per entry, plus de-dup. */ export function validateReferences(graph: ModuleGraph): void { for (const node of graph.modules.values()) { validateModule(node.ast, graph); } } -/** - * Validate one module's references against the graph. Imported ASTs are read - * from `graph.modules` — no `.jh` filesystem access. `existsSync` is used - * only for `import script` paths, which point at non-`.jh` script bodies. - */ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { const localChannels = new Set(ast.channels.map((c) => c.name)); const localRules = new Set(ast.rules.map((r) => r.name)); @@ -494,9 +476,6 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { const importsByAlias = new Map(); const importedAstCache = new Map(); - // Validate script imports: resolve paths and check existence. These point - // at non-`.jh` script bodies (resolved + emitted later), so `existsSync` is - // allowed here under acceptance criterion 2. if (ast.scriptImports) { for (const si of ast.scriptImports) { const resolved = resolveScriptImportPath(ast.filePath, si.path); @@ -587,10 +566,6 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { const stripDQ = (s: string): string => s.length >= 2 && s[0] === '"' && s[s.length - 1] === '"' ? s.slice(1, -1) : s; - /** - * Detect `const x = scriptName` and its parser sugar form `const x = "${scriptName}"`. - * Both should report the same domain error ("scripts are not values"). - */ const extractConstScriptName = (rhs: string): string | undefined => { const trimmed = rhs.trim(); if (/^[a-zA-Z_][a-zA-Z0-9_]*$/.test(trimmed)) return trimmed; @@ -599,16 +574,13 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { return m?.[1]; }; - /** Inner string for validation. Triple-quoted bodies are pre-dedented by the parser. */ const semanticQuotedOrchestrationInner = (dqRaw: string): string => stripDQ(dqRaw); - /** Detect `prompt ` form from raw `"${identifier}"` shape. */ const promptBareIdentifier = (raw: string): string | undefined => { const m = raw.match(/^"\$\{([A-Za-z_][A-Za-z0-9_]*)\}"$/); return m?.[1]; }; - /** Parse field names from a returns schema string like '{ name: string, age: number }'. */ const parseSchemaFieldNames = (rawSchema: string): string[] => { const inner = rawSchema.trim().replace(/^\s*\{\s*/, "").replace(/\s*\}\s*$/, "").trim(); if (!inner) return []; @@ -620,21 +592,19 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { return names; }; - /** Collect prompt capture schemas from all steps in a workflow (pre-pass). */ const collectPromptSchemas = (steps: WorkflowStepDef[]): Map => { const schemas = new Map(); for (const s of steps) { - if (s.type === "prompt" && s.captureName && s.returns !== undefined) { - schemas.set(s.captureName, parseSchemaFieldNames(s.returns)); + if (s.type === "exec" && s.captureName && s.body.kind === "prompt" && s.body.returns !== undefined) { + schemas.set(s.captureName, parseSchemaFieldNames(s.body.returns)); } - if (s.type === "const" && s.value.kind === "prompt_capture" && s.value.returns !== undefined) { + if (s.type === "const" && s.value.kind === "prompt" && s.value.returns !== undefined) { schemas.set(s.name, parseSchemaFieldNames(s.value.returns)); } } return schemas; }; - /** Validate ${var.field} references against known prompt schemas. */ const validateDotFieldRefs = ( content: string, loc: { line: number; col: number }, @@ -687,216 +657,267 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { } }; - for (const rule of ast.rules) { - validateImmutableBindings(ast.filePath, rule.steps, rule.params, rule.loc, ast.envDecls, localScripts); - const ruleKnownVars = collectKnownVars(rule.steps, ast.envDecls, rule.params); - // Named params are validated via knownVars; positional argN access was removed. - const validateRuleStep = (s: WorkflowStepDef): void => { - if (s.type === "prompt" || s.type === "send") { - throw jaiphError( - ast.filePath, - s.loc.line, - s.loc.col, - "E_VALIDATE", - `${s.type} is not allowed in rules`, - ); + /** Run the 5 standard checks (redirection, nested-managed, ref, arity, var-ref) on a callable Expr. */ + const validateCallable = ( + body: Expr, + knownVars: Set, + scope: "workflow" | "rule", + recoverBindings?: Set, + ): void => { + if (body.kind === "call") { + const loc = body.callee.loc; + validateNoShellRedirection(ast.filePath, loc, "run", body.args); + validateNestedManagedCallArgs(ast.filePath, loc, body.args); + const isRuleScope = scope === "rule"; + if (!body.callee.value.includes(".") && knownVars.has(body.callee.value) && !localScripts.has(body.callee.value) && !(scope === "workflow" && localWorkflows.has(body.callee.value))) { + throw jaiphError(ast.filePath, loc.line, loc.col, "E_VALIDATE", `strings are not executable; "${body.callee.value}" is a string — use a script instead`); } - if (s.type === "comment" || s.type === "blank_line") { + validateRef(body.callee, ast, refCtx, isRuleScope ? expectRunInRuleRef : expectRunTargetRef); + validateArity(ast.filePath, loc, body.callee.value, body.args, "workflow", ast, refCtx); + validateArgVarRefs(ast.filePath, loc, body.args, knownVars, recoverBindings); + return; + } + if (body.kind === "ensure_call") { + const loc = body.callee.loc; + validateNoShellRedirection(ast.filePath, loc, "ensure", body.args); + validateNestedManagedCallArgs(ast.filePath, loc, body.args); + validateRef(body.callee, ast, refCtx, expectRuleRef); + validateArity(ast.filePath, loc, body.callee.value, body.args, "rule", ast, refCtx); + validateArgVarRefs(ast.filePath, loc, body.args, knownVars, recoverBindings); + return; + } + if (body.kind === "inline_script") { + return; // no ref to validate + } + if (body.kind === "match") { + validateMatchExpr(ast.filePath, body.match, knownVars); + return; + } + }; + + /** Validate the value Expr stored under a `const` / `return` / `send` step in a workflow context. */ + const validateWorkflowValueExpr = ( + expr: Expr, + stepLoc: { line: number; col: number }, + knownVars: Set, + promptSchemas: Map, + recoverBindings: Set | undefined, + label: "const" | "return" | "send", + constName?: string, + ): void => { + if (expr.kind === "literal") { + if (label === "send") { + const inner = expr.raw.startsWith('"') && expr.raw.endsWith('"') ? expr.raw.slice(1, -1) : expr.raw; + validateJaiphStringContent(inner, ast.filePath, stepLoc.line, stepLoc.col, "send"); + validateWorkflowStringCaptures(inner, stepLoc); + validateDotFieldRefs(inner, stepLoc, promptSchemas); + validateSimpleInterpolationIdentifiers( + inner, ast.filePath, stepLoc.line, stepLoc.col, + "send", knownVars, "workflow", promptSchemas, recoverBindings, localScripts, + ); return; } - if (s.type === "ensure") { - validateNoShellRedirection(ast.filePath, s.ref.loc, "ensure", s.args); - validateNestedManagedCallArgs(ast.filePath, s.ref.loc, s.args); - validateRef(s.ref, ast, refCtx, expectRuleRef); - validateArity(ast.filePath, s.ref.loc, s.ref.value, s.args, "rule", ast, refCtx); - - validateArgVarRefs(ast.filePath, s.ref.loc, s.args, ruleKnownVars); - if (s.catch) { - const steps = "single" in s.catch ? [s.catch.single] : s.catch.block; - const rb = new Set(); - rb.add(s.catch.bindings.failure); - for (const r of steps) validateRuleStep(r); + if (label === "return") { + validateReturnString(expr.raw, ast.filePath, stepLoc.line, stepLoc.col); + if (expr.raw.startsWith('"')) { + const retInner = semanticQuotedOrchestrationInner(expr.raw); + validateWorkflowStringCaptures(retInner, stepLoc); + validateDotFieldRefs(retInner, stepLoc, promptSchemas); + validateSimpleInterpolationIdentifiers( + retInner, ast.filePath, stepLoc.line, stepLoc.col, + "return", knownVars, "workflow", promptSchemas, recoverBindings, localScripts, + ); } return; } - if (s.type === "run") { - validateNoShellRedirection(ast.filePath, s.workflow.loc, "run", s.args); - validateNestedManagedCallArgs(ast.filePath, s.workflow.loc, s.args); - if (s.async) { - throw jaiphError( - ast.filePath, - s.workflow.loc.line, - s.workflow.loc.col, - "E_VALIDATE", - "run async is not allowed in rules; use it in workflows only", + // const + const scriptName = extractConstScriptName(expr.raw); + if (scriptName && localScripts.has(scriptName)) { + throw jaiphError(ast.filePath, stepLoc.line, stepLoc.col, "E_VALIDATE", `scripts are not values; "${scriptName}" is a script definition`); + } + const inner = semanticQuotedOrchestrationInner(expr.raw); + validateWorkflowStringCaptures(inner, stepLoc); + validateDotFieldRefs(inner, stepLoc, promptSchemas); + validateSimpleInterpolationIdentifiers( + inner, ast.filePath, stepLoc.line, stepLoc.col, + "const", knownVars, "workflow", promptSchemas, recoverBindings, localScripts, + ); + return; + } + if (expr.kind === "call") { + validateCallable(expr, knownVars, "workflow", recoverBindings); + return; + } + if (expr.kind === "ensure_call") { + validateCallable(expr, knownVars, "workflow", recoverBindings); + return; + } + if (expr.kind === "inline_script") { + return; + } + if (expr.kind === "match") { + validateMatchExpr(ast.filePath, expr.match, knownVars); + return; + } + if (expr.kind === "prompt") { + if (label !== "const") { + throw jaiphError(ast.filePath, stepLoc.line, stepLoc.col, "E_VALIDATE", `prompt is not a valid ${label} value`); + } + const promptIdent = promptBareIdentifier(expr.raw); + if (promptIdent && localScripts.has(promptIdent)) { + throw jaiphError(ast.filePath, stepLoc.line, stepLoc.col, "E_VALIDATE", `scripts are not promptable; "${promptIdent}" is a script — use a string const instead`); + } + validatePromptString(expr.raw, ast.filePath, stepLoc.line, stepLoc.col); + if (expr.returns !== undefined) { + validatePromptReturnsSchema(expr.returns, ast.filePath, stepLoc.line, stepLoc.col); + } + const pcInner = semanticQuotedOrchestrationInner(expr.raw); + validateWorkflowStringCaptures(pcInner, stepLoc); + validateDotFieldRefs(pcInner, stepLoc, promptSchemas); + validateSimpleInterpolationIdentifiers( + pcInner, ast.filePath, stepLoc.line, stepLoc.col, + "prompt", knownVars, "workflow", promptSchemas, recoverBindings, localScripts, + ); + return; + } + if (expr.kind === "bare_ref") { + if (label !== "send") { + throw jaiphError(ast.filePath, expr.ref.loc.line, expr.ref.loc.col, "E_VALIDATE", `bare reference is only valid as a send payload`); + } + validateRef(expr.ref, ast, refCtx, bareSendRefSpec); + return; + } + if (expr.kind === "shell") { + if (label !== "send") { + throw jaiphError(ast.filePath, expr.loc.line, expr.loc.col, "E_VALIDATE", `raw shell fragment is only valid as a send payload`); + } + validateManagedWorkflowShell(expr.command, makeSubEnv({ line: expr.loc.line, col: expr.loc.col })); + return; + } + void constName; + }; + + /** Same as `validateWorkflowValueExpr` but with rule-scope rules (no prompt, restricted run targets). */ + const validateRuleValueExpr = ( + expr: Expr, + stepLoc: { line: number; col: number }, + knownVars: Set, + label: "const" | "return", + ): void => { + if (expr.kind === "literal") { + if (label === "return") { + validateReturnString(expr.raw, ast.filePath, stepLoc.line, stepLoc.col); + if (expr.raw.startsWith('"')) { + const retRuleInner = semanticQuotedOrchestrationInner(expr.raw); + validateRuleStringCaptures(retRuleInner, stepLoc); + validateSimpleInterpolationIdentifiers( + retRuleInner, ast.filePath, stepLoc.line, stepLoc.col, + "return", knownVars, "rule", undefined, undefined, localScripts, ); } - if (!s.workflow.value.includes(".") && ruleKnownVars.has(s.workflow.value) && !localScripts.has(s.workflow.value)) { - throw jaiphError(ast.filePath, s.workflow.loc.line, s.workflow.loc.col, "E_VALIDATE", `strings are not executable; "${s.workflow.value}" is a string — use a script instead`); - } - validateRef(s.workflow, ast, refCtx, expectRunInRuleRef); - validateArity(ast.filePath, s.workflow.loc, s.workflow.value, s.args, "workflow", ast, refCtx); + return; + } + const scriptName = extractConstScriptName(expr.raw); + if (scriptName && localScripts.has(scriptName)) { + throw jaiphError(ast.filePath, stepLoc.line, stepLoc.col, "E_VALIDATE", `scripts are not values; "${scriptName}" is a script definition`); + } + validateRuleStringCaptures(stripDQ(expr.raw), stepLoc); + validateSimpleInterpolationIdentifiers( + stripDQ(expr.raw), ast.filePath, stepLoc.line, stepLoc.col, + "const", knownVars, "rule", undefined, undefined, localScripts, + ); + return; + } + if (expr.kind === "call") { + validateCallable(expr, knownVars, "rule"); + return; + } + if (expr.kind === "ensure_call") { + validateCallable(expr, knownVars, "rule"); + return; + } + if (expr.kind === "inline_script") { + return; + } + if (expr.kind === "match") { + validateMatchExpr(ast.filePath, expr.match, knownVars); + return; + } + if (expr.kind === "prompt") { + throw jaiphError(ast.filePath, stepLoc.line, stepLoc.col, "E_VALIDATE", "const ... = prompt is not allowed in rules"); + } + if (expr.kind === "bare_ref" || expr.kind === "shell") { + throw jaiphError(ast.filePath, stepLoc.line, stepLoc.col, "E_VALIDATE", `${expr.kind} expression is not allowed in rules`); + } + }; - validateArgVarRefs(ast.filePath, s.workflow.loc, s.args, ruleKnownVars); - if (s.catch) { - const steps = "single" in s.catch ? [s.catch.single] : s.catch.block; - const rb = new Set(); - rb.add(s.catch.bindings.failure); - for (const r of steps) validateRuleStep(r); + for (const rule of ast.rules) { + validateImmutableBindings(ast.filePath, rule.steps, rule.params, rule.loc, ast.envDecls, localScripts); + const ruleKnownVars = collectKnownVars(rule.steps, ast.envDecls, rule.params); + const validateRuleStep = (s: WorkflowStepDef): void => { + if (s.type === "trivia") return; + if (s.type === "say") { + if (s.level === "log" || s.level === "logerr") { + if (s.message.kind === "inline_script") return; + if (s.message.kind === "literal") { + validateLogString(s.message.raw, ast.filePath, s.loc.line, s.loc.col, s.level); + const inner = s.message.raw; + validateRuleStringCaptures(inner, s.loc); + validateSimpleInterpolationIdentifiers( + inner, ast.filePath, s.loc.line, s.loc.col, + s.level, ruleKnownVars, "rule", undefined, undefined, localScripts, + ); + return; + } + throw jaiphError(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", `unsupported ${s.level} message form`); } - if (s.recover) { - const steps = "single" in s.recover ? [s.recover.single] : s.recover.block; - const rb = new Set(); - rb.add(s.recover.bindings.failure); - for (const r of steps) validateRuleStep(r); + // fail + if (s.message.kind !== "literal") { + throw jaiphError(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", "fail message must be a literal string"); } - return; - } - if (s.type === "fail") { - validateFailString(s.message, ast.filePath, s.loc.line, s.loc.col); - const failInner = semanticQuotedOrchestrationInner(s.message); + validateFailString(s.message.raw, ast.filePath, s.loc.line, s.loc.col); + const failInner = semanticQuotedOrchestrationInner(s.message.raw); validateRuleStringCaptures(failInner, s.loc); validateSimpleInterpolationIdentifiers( - failInner, - ast.filePath, - s.loc.line, - s.loc.col, - "fail", - ruleKnownVars, - "rule", - undefined, - undefined, - localScripts, + failInner, ast.filePath, s.loc.line, s.loc.col, + "fail", ruleKnownVars, "rule", undefined, undefined, localScripts, ); return; } - if (s.type === "log") { - if (s.managed?.kind === "run_inline_script") return; // inline script — no ref to validate - validateLogString(s.message, ast.filePath, s.loc.line, s.loc.col, "log"); - const logRuleInner = s.message; - validateRuleStringCaptures(logRuleInner, s.loc); - validateSimpleInterpolationIdentifiers( - logRuleInner, - ast.filePath, - s.loc.line, - s.loc.col, - "log", - ruleKnownVars, - "rule", - undefined, - undefined, - localScripts, - ); - return; - } - if (s.type === "logerr") { - if (s.managed?.kind === "run_inline_script") return; // inline script — no ref to validate - validateLogString(s.message, ast.filePath, s.loc.line, s.loc.col, "logerr"); - const logerrRuleInner = s.message; - validateRuleStringCaptures(logerrRuleInner, s.loc); - validateSimpleInterpolationIdentifiers( - logerrRuleInner, - ast.filePath, - s.loc.line, - s.loc.col, - "logerr", - ruleKnownVars, - "rule", - undefined, - undefined, - localScripts, - ); - return; + if (s.type === "send") { + throw jaiphError(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", "send is not allowed in rules"); } if (s.type === "return") { - if (s.managed) { - if (s.managed.kind === "run") { - validateNoShellRedirection(ast.filePath, s.managed.ref.loc, "run", s.managed.args); - validateNestedManagedCallArgs(ast.filePath, s.managed.ref.loc, s.managed.args); - validateRef(s.managed.ref, ast, refCtx, expectRunInRuleRef); - validateArity(ast.filePath, s.managed.ref.loc, s.managed.ref.value, s.managed.args, "workflow", ast, refCtx); - - validateArgVarRefs(ast.filePath, s.managed.ref.loc, s.managed.args, ruleKnownVars); - } else if (s.managed.kind === "ensure") { - validateNoShellRedirection(ast.filePath, s.managed.ref.loc, "ensure", s.managed.args); - validateNestedManagedCallArgs(ast.filePath, s.managed.ref.loc, s.managed.args); - validateRef(s.managed.ref, ast, refCtx, expectRuleRef); - validateArity(ast.filePath, s.managed.ref.loc, s.managed.ref.value, s.managed.args, "rule", ast, refCtx); - - validateArgVarRefs(ast.filePath, s.managed.ref.loc, s.managed.args, ruleKnownVars); - } else if (s.managed.kind === "match") { - validateMatchExpr(ast.filePath, s.managed.match, ruleKnownVars); - } - // run_inline_script — no ref to validate - } else { - validateReturnString(s.value, ast.filePath, s.loc.line, s.loc.col); - if (s.value.startsWith('"')) { - const retRuleInner = semanticQuotedOrchestrationInner(s.value); - validateRuleStringCaptures(retRuleInner, s.loc); - validateSimpleInterpolationIdentifiers( - retRuleInner, - ast.filePath, - s.loc.line, - s.loc.col, - "return", - ruleKnownVars, - "rule", - undefined, - undefined, - localScripts, - ); - } - } + validateRuleValueExpr(s.value, s.loc, ruleKnownVars, "return"); return; } if (s.type === "const") { - const v = s.value; - if (v.kind === "run_capture") { - validateNoShellRedirection(ast.filePath, v.ref.loc, "run", v.args); - validateNestedManagedCallArgs(ast.filePath, v.ref.loc, v.args); - if (!v.ref.value.includes(".") && ruleKnownVars.has(v.ref.value) && !localScripts.has(v.ref.value)) { - throw jaiphError(ast.filePath, v.ref.loc.line, v.ref.loc.col, "E_VALIDATE", `strings are not executable; "${v.ref.value}" is a string — use a script instead`); - } - validateRef(v.ref, ast, refCtx, expectRunInRuleRef); - validateArity(ast.filePath, v.ref.loc, v.ref.value, v.args, "workflow", ast, refCtx); - - validateArgVarRefs(ast.filePath, v.ref.loc, v.args, ruleKnownVars); - } else if (v.kind === "ensure_capture") { - validateNoShellRedirection(ast.filePath, v.ref.loc, "ensure", v.args); - validateNestedManagedCallArgs(ast.filePath, v.ref.loc, v.args); - validateRef(v.ref, ast, refCtx, expectRuleRef); - validateArity(ast.filePath, v.ref.loc, v.ref.value, v.args, "rule", ast, refCtx); - - validateArgVarRefs(ast.filePath, v.ref.loc, v.args, ruleKnownVars); - } else if (v.kind === "prompt_capture") { - throw jaiphError(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", "const ... = prompt is not allowed in rules"); - } else if (v.kind === "run_inline_script_capture") { - // inline script capture — no ref to validate - } else if (v.kind === "match_expr") { - validateMatchExpr(ast.filePath, v.match, ruleKnownVars); - } else if (v.kind === "expr") { - const scriptName = extractConstScriptName(v.bashRhs); - if (scriptName && localScripts.has(scriptName)) { - throw jaiphError(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", `scripts are not values; "${scriptName}" is a script definition`); - } - validateRuleStringCaptures(stripDQ(v.bashRhs), s.loc); - validateSimpleInterpolationIdentifiers( - stripDQ(v.bashRhs), - ast.filePath, - s.loc.line, - s.loc.col, - "const", - ruleKnownVars, - "rule", - undefined, - undefined, - localScripts, - ); - } + validateRuleValueExpr(s.value, s.loc, ruleKnownVars, "const"); return; } - if (s.type === "match") { - validateMatchExpr(ast.filePath, s.expr, ruleKnownVars); + if (s.type === "exec") { + const body = s.body; + if (body.kind === "prompt") { + throw jaiphError(ast.filePath, body.loc.line, body.loc.col, "E_VALIDATE", "prompt is not allowed in rules"); + } + if (body.kind === "shell") { + throw jaiphError(ast.filePath, body.loc.line, body.loc.col, "E_VALIDATE", "inline shell steps are forbidden in rules; use explicit script blocks"); + } + if (body.kind === "call" && (s as Extract).body.kind === "call") { + const callBody = body; + if (callBody.async) { + throw jaiphError(ast.filePath, callBody.callee.loc.line, callBody.callee.loc.col, "E_VALIDATE", "run async is not allowed in rules; use it in workflows only"); + } + } + validateCallable(body, ruleKnownVars, "rule"); + if (s.catch) { + const steps = "single" in s.catch ? [s.catch.single] : s.catch.block; + for (const r of steps) validateRuleStep(r); + } + if (s.recover) { + const steps = "single" in s.recover ? [s.recover.single] : s.recover.block; + for (const r of steps) validateRuleStep(r); + } return; } if (s.type === "if") { @@ -911,28 +932,13 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { if (s.type === "for_lines") { if (!ruleKnownVars.has(s.sourceVar)) { throw jaiphError( - ast.filePath, - s.loc.line, - s.loc.col, - "E_VALIDATE", + ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", `for ... in : "${s.sourceVar}" is not a known variable in this scope`, ); } for (const bodyStep of s.body) validateRuleStep(bodyStep); return; } - if (s.type === "run_inline_script") { - return; - } - if (s.type === "shell") { - throw jaiphError( - ast.filePath, - s.loc.line, - s.loc.col, - "E_VALIDATE", - "inline shell steps are forbidden in rules; use explicit script blocks", - ); - } const _never: never = s; return _never; }; @@ -941,57 +947,29 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { } } - const validateChannelRef = ( - channel: string, - loc: { line: number; col: number }, - ): void => { + const validateChannelRef = (channel: string, loc: { line: number; col: number }): void => { const parts = channel.split("."); if (parts.length === 1) { if (!localChannels.has(channel)) { - throw jaiphError( - ast.filePath, - loc.line, - loc.col, - "E_VALIDATE", - `Channel "${channel}" is not defined`, - ); + throw jaiphError(ast.filePath, loc.line, loc.col, "E_VALIDATE", `Channel "${channel}" is not defined`); } return; } if (parts.length !== 2) { - throw jaiphError( - ast.filePath, - loc.line, - loc.col, - "E_VALIDATE", - `Channel "${channel}" is not defined`, - ); + throw jaiphError(ast.filePath, loc.line, loc.col, "E_VALIDATE", `Channel "${channel}" is not defined`); } const [alias, importedChannel] = parts; const importedFile = importsByAlias.get(alias); if (!importedFile) { - throw jaiphError( - ast.filePath, - loc.line, - loc.col, - "E_VALIDATE", - `Channel "${channel}" is not defined`, - ); + throw jaiphError(ast.filePath, loc.line, loc.col, "E_VALIDATE", `Channel "${channel}" is not defined`); } const importedAst = importedAstCache.get(importedFile)!; const importedChannels = new Set(importedAst.channels.map((c) => c.name)); if (!importedChannels.has(importedChannel)) { - throw jaiphError( - ast.filePath, - loc.line, - loc.col, - "E_VALIDATE", - `Channel "${channel}" is not defined`, - ); + throw jaiphError(ast.filePath, loc.line, loc.col, "E_VALIDATE", `Channel "${channel}" is not defined`); } }; - // Validate channel-level route declarations. for (const ch of ast.channels) { if (ch.routes) { for (const wfRef of ch.routes) { @@ -999,10 +977,7 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { const targetParams = resolveRouteTargetParams(wfRef.value, ast, refCtx); if (targetParams !== undefined && targetParams !== 3) { throw jaiphError( - ast.filePath, - wfRef.loc.line, - wfRef.loc.col, - "E_VALIDATE", + ast.filePath, wfRef.loc.line, wfRef.loc.col, "E_VALIDATE", `inbox route target "${wfRef.value}" must declare exactly 3 parameters (message, channel, sender), but declares ${targetParams}`, ); } @@ -1014,284 +989,94 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { validateImmutableBindings(ast.filePath, workflow.steps, workflow.params, workflow.loc, ast.envDecls, localScripts); const promptSchemas = collectPromptSchemas(workflow.steps); const wfKnownVars = collectKnownVars(workflow.steps, ast.envDecls, workflow.params); - // Named params are validated via knownVars; positional argN access was removed. const validateStep = (s: WorkflowStepDef, recoverBindings?: Set): void => { - if (s.type === "comment" || s.type === "blank_line") { - return; - } + if (s.type === "trivia") return; if (s.type === "send") { validateChannelRef(s.channel, s.loc); - if (s.rhs.kind === "run") { - validateNoShellRedirection(ast.filePath, s.rhs.ref.loc, "run", s.rhs.args); - validateNestedManagedCallArgs(ast.filePath, s.rhs.ref.loc, s.rhs.args); - validateRef(s.rhs.ref, ast, refCtx, expectRunTargetRef); - validateArity(ast.filePath, s.rhs.ref.loc, s.rhs.ref.value, s.rhs.args, "workflow", ast, refCtx); - - validateArgVarRefs(ast.filePath, s.rhs.ref.loc, s.rhs.args, wfKnownVars, recoverBindings); - } else if (s.rhs.kind === "literal") { - const inner = s.rhs.token.startsWith('"') && s.rhs.token.endsWith('"') - ? s.rhs.token.slice(1, -1) : s.rhs.token; - validateJaiphStringContent(inner, ast.filePath, s.loc.line, s.loc.col, "send"); - validateWorkflowStringCaptures(inner, s.loc); - validateDotFieldRefs(inner, s.loc, promptSchemas); - validateSimpleInterpolationIdentifiers( - inner, - ast.filePath, - s.loc.line, - s.loc.col, - "send", - wfKnownVars, - "workflow", - promptSchemas, - recoverBindings, - localScripts, - ); - } else if (s.rhs.kind === "bare_ref") { - validateRef(s.rhs.ref, ast, refCtx, bareSendRefSpec); - } else if (s.rhs.kind === "shell") { - validateManagedWorkflowShell( - s.rhs.command, - makeSubEnv({ line: s.rhs.loc.line, col: s.rhs.loc.col }), - ); - } + validateWorkflowValueExpr(s.value, s.loc, wfKnownVars, promptSchemas, recoverBindings, "send"); return; } - if (s.type === "ensure") { - validateNoShellRedirection(ast.filePath, s.ref.loc, "ensure", s.args); - validateNestedManagedCallArgs(ast.filePath, s.ref.loc, s.args); - validateRef(s.ref, ast, refCtx, expectRuleRef); - validateArity(ast.filePath, s.ref.loc, s.ref.value, s.args, "rule", ast, refCtx); - - validateArgVarRefs(ast.filePath, s.ref.loc, s.args, wfKnownVars, recoverBindings); - if (s.catch) { - const steps = "single" in s.catch ? [s.catch.single] : s.catch.block; - const rb = new Set(); - rb.add(s.catch.bindings.failure); - for (const r of steps) validateStep(r, rb); - } - return; - } - if (s.type === "run") { - validateNoShellRedirection(ast.filePath, s.workflow.loc, "run", s.args); - validateNestedManagedCallArgs(ast.filePath, s.workflow.loc, s.args); - if (!s.workflow.value.includes(".") && wfKnownVars.has(s.workflow.value) && !localScripts.has(s.workflow.value) && !localWorkflows.has(s.workflow.value)) { - throw jaiphError(ast.filePath, s.workflow.loc.line, s.workflow.loc.col, "E_VALIDATE", `strings are not executable; "${s.workflow.value}" is a string — use a script instead`); - } - validateRef(s.workflow, ast, refCtx, expectRunTargetRef); - validateArity(ast.filePath, s.workflow.loc, s.workflow.value, s.args, "workflow", ast, refCtx); - - validateArgVarRefs(ast.filePath, s.workflow.loc, s.args, wfKnownVars, recoverBindings); - if (s.catch) { - const steps = "single" in s.catch ? [s.catch.single] : s.catch.block; - const rb = new Set(); - rb.add(s.catch.bindings.failure); - for (const r of steps) validateStep(r, rb); - } - if (s.recover) { - const steps = "single" in s.recover ? [s.recover.single] : s.recover.block; - const rb = new Set(); - rb.add(s.recover.bindings.failure); - for (const r of steps) validateStep(r, rb); + if (s.type === "say") { + if (s.level === "log" || s.level === "logerr") { + if (s.message.kind === "inline_script") return; + if (s.message.kind === "literal") { + validateLogString(s.message.raw, ast.filePath, s.loc.line, s.loc.col, s.level); + const inner = s.message.raw; + validateWorkflowStringCaptures(inner, s.loc); + validateDotFieldRefs(inner, s.loc, promptSchemas); + validateSimpleInterpolationIdentifiers( + inner, ast.filePath, s.loc.line, s.loc.col, + s.level, wfKnownVars, "workflow", promptSchemas, recoverBindings, localScripts, + ); + return; + } + throw jaiphError(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", `unsupported ${s.level} message form`); } - return; - } - if (s.type === "prompt") { - const promptIdent = promptBareIdentifier(s.raw); - if (promptIdent && localScripts.has(promptIdent)) { - throw jaiphError(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", `scripts are not promptable; "${promptIdent}" is a script — use a string const instead`); + // fail + if (s.message.kind !== "literal") { + throw jaiphError(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", "fail message must be a literal string"); } - validatePromptString(s.raw, ast.filePath, s.loc.line, s.loc.col); - validatePromptStepReturns(s, ast.filePath); - const promptInner = semanticQuotedOrchestrationInner(s.raw); - validateWorkflowStringCaptures(promptInner, s.loc); - validateDotFieldRefs(promptInner, s.loc, promptSchemas); + validateFailString(s.message.raw, ast.filePath, s.loc.line, s.loc.col); + const failInner = semanticQuotedOrchestrationInner(s.message.raw); + validateWorkflowStringCaptures(failInner, s.loc); + validateDotFieldRefs(failInner, s.loc, promptSchemas); validateSimpleInterpolationIdentifiers( - promptInner, - ast.filePath, - s.loc.line, - s.loc.col, - "prompt", - wfKnownVars, - "workflow", - promptSchemas, - recoverBindings, - localScripts, + failInner, ast.filePath, s.loc.line, s.loc.col, + "fail", wfKnownVars, "workflow", promptSchemas, recoverBindings, localScripts, ); return; } - if (s.type === "log") { - if (s.managed?.kind === "run_inline_script") return; // inline script — no ref to validate - validateLogString(s.message, ast.filePath, s.loc.line, s.loc.col, "log"); - const logInner = s.message; - validateWorkflowStringCaptures(logInner, s.loc); - validateDotFieldRefs(logInner, s.loc, promptSchemas); - validateSimpleInterpolationIdentifiers( - logInner, - ast.filePath, - s.loc.line, - s.loc.col, - "log", - wfKnownVars, - "workflow", - promptSchemas, - recoverBindings, - localScripts, - ); + if (s.type === "return") { + validateWorkflowValueExpr(s.value, s.loc, wfKnownVars, promptSchemas, recoverBindings, "return"); return; } - if (s.type === "logerr") { - if (s.managed?.kind === "run_inline_script") return; // inline script — no ref to validate - validateLogString(s.message, ast.filePath, s.loc.line, s.loc.col, "logerr"); - const logerrInner = s.message; - validateWorkflowStringCaptures(logerrInner, s.loc); - validateDotFieldRefs(logerrInner, s.loc, promptSchemas); - validateSimpleInterpolationIdentifiers( - logerrInner, - ast.filePath, - s.loc.line, - s.loc.col, - "logerr", - wfKnownVars, - "workflow", - promptSchemas, - recoverBindings, - localScripts, - ); + if (s.type === "const") { + validateWorkflowValueExpr(s.value, s.loc, wfKnownVars, promptSchemas, recoverBindings, "const", s.name); return; } - if (s.type === "return") { - if (s.managed) { - if (s.managed.kind === "run") { - validateNoShellRedirection(ast.filePath, s.managed.ref.loc, "run", s.managed.args); - validateNestedManagedCallArgs(ast.filePath, s.managed.ref.loc, s.managed.args); - validateRef(s.managed.ref, ast, refCtx, expectRunTargetRef); - validateArity(ast.filePath, s.managed.ref.loc, s.managed.ref.value, s.managed.args, "workflow", ast, refCtx); - - validateArgVarRefs(ast.filePath, s.managed.ref.loc, s.managed.args, wfKnownVars, recoverBindings); - } else if (s.managed.kind === "ensure") { - validateNoShellRedirection(ast.filePath, s.managed.ref.loc, "ensure", s.managed.args); - validateNestedManagedCallArgs(ast.filePath, s.managed.ref.loc, s.managed.args); - validateRef(s.managed.ref, ast, refCtx, expectRuleRef); - validateArity(ast.filePath, s.managed.ref.loc, s.managed.ref.value, s.managed.args, "rule", ast, refCtx); - - validateArgVarRefs(ast.filePath, s.managed.ref.loc, s.managed.args, wfKnownVars, recoverBindings); - } else if (s.managed.kind === "match") { - validateMatchExpr(ast.filePath, s.managed.match, wfKnownVars); - } + if (s.type === "exec") { + const body = s.body; + if (body.kind === "prompt") { + validateWorkflowValueExpr(body, s.loc, wfKnownVars, promptSchemas, recoverBindings, "const"); + validatePromptStepReturns(body, s.captureName, ast.filePath); return; } - validateReturnString(s.value, ast.filePath, s.loc.line, s.loc.col); - if (s.value.startsWith('"')) { - const retInner = semanticQuotedOrchestrationInner(s.value); - validateWorkflowStringCaptures(retInner, s.loc); - validateDotFieldRefs(retInner, s.loc, promptSchemas); - validateSimpleInterpolationIdentifiers( - retInner, - ast.filePath, - s.loc.line, - s.loc.col, - "return", - wfKnownVars, - "workflow", - promptSchemas, - recoverBindings, - localScripts, - ); - } - return; - } - if (s.type === "fail") { - validateFailString(s.message, ast.filePath, s.loc.line, s.loc.col); - const failWfInner = semanticQuotedOrchestrationInner(s.message); - validateWorkflowStringCaptures(failWfInner, s.loc); - validateDotFieldRefs(failWfInner, s.loc, promptSchemas); - validateSimpleInterpolationIdentifiers( - failWfInner, - ast.filePath, - s.loc.line, - s.loc.col, - "fail", - wfKnownVars, - "workflow", - promptSchemas, - recoverBindings, - localScripts, - ); - return; - } - if (s.type === "const") { - const v = s.value; - if (v.kind === "run_capture") { - validateNoShellRedirection(ast.filePath, v.ref.loc, "run", v.args); - validateNestedManagedCallArgs(ast.filePath, v.ref.loc, v.args); - if (!v.ref.value.includes(".") && wfKnownVars.has(v.ref.value) && !localScripts.has(v.ref.value) && !localWorkflows.has(v.ref.value)) { - throw jaiphError(ast.filePath, v.ref.loc.line, v.ref.loc.col, "E_VALIDATE", `strings are not executable; "${v.ref.value}" is a string — use a script instead`); - } - validateRef(v.ref, ast, refCtx, expectRunTargetRef); - validateArity(ast.filePath, v.ref.loc, v.ref.value, v.args, "workflow", ast, refCtx); - - validateArgVarRefs(ast.filePath, v.ref.loc, v.args, wfKnownVars, recoverBindings); - } else if (v.kind === "ensure_capture") { - validateNoShellRedirection(ast.filePath, v.ref.loc, "ensure", v.args); - validateNestedManagedCallArgs(ast.filePath, v.ref.loc, v.args); - validateRef(v.ref, ast, refCtx, expectRuleRef); - validateArity(ast.filePath, v.ref.loc, v.ref.value, v.args, "rule", ast, refCtx); - - validateArgVarRefs(ast.filePath, v.ref.loc, v.args, wfKnownVars, recoverBindings); - } else if (v.kind === "prompt_capture") { - const promptIdent = promptBareIdentifier(v.raw); - if (promptIdent && localScripts.has(promptIdent)) { - throw jaiphError(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", `scripts are not promptable; "${promptIdent}" is a script — use a string const instead`); - } - validatePromptString(v.raw, ast.filePath, s.loc.line, s.loc.col); - if (v.returns !== undefined) { - validatePromptReturnsSchema(v.returns, ast.filePath, s.loc.line, s.loc.col); + if (body.kind === "shell") { + if (hasUnquotedSendArrow(body.command) && matchSendOperator(body.command) === null) { + throw jaiphError( + ast.filePath, body.loc.line, body.loc.col, "E_VALIDATE", + "invalid send: channel must be a single name or `alias.name` (at most one dot in the channel part)", + ); } - const pcInner = semanticQuotedOrchestrationInner(v.raw); - validateWorkflowStringCaptures(pcInner, s.loc); - validateDotFieldRefs(pcInner, s.loc, promptSchemas); - validateSimpleInterpolationIdentifiers( - pcInner, - ast.filePath, - s.loc.line, - s.loc.col, - "prompt", - wfKnownVars, - "workflow", - promptSchemas, - recoverBindings, - localScripts, - ); - } else if (v.kind === "run_inline_script_capture") { - // inline script capture — no ref to validate - } else if (v.kind === "match_expr") { - validateMatchExpr(ast.filePath, v.match, wfKnownVars); - } else if (v.kind === "expr") { - const scriptName = extractConstScriptName(v.bashRhs); - if (scriptName && localScripts.has(scriptName)) { - throw jaiphError(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", `scripts are not values; "${scriptName}" is a script definition`); + const t = body.command.trim(); + if (/^(?:[A-Za-z_][A-Za-z0-9_]*)(?:\.[A-Za-z_][A-Za-z0-9_]*)*$/.test(t)) { + if (!t.includes(".")) { + if (localScripts.has(t) || localWorkflows.has(t)) { + throw jaiphError( + ast.filePath, body.loc.line, body.loc.col, "E_VALIDATE", + `use run ${t}() — a bare name that refers to a script or workflow must use a managed run step`, + ); + } + } else { + validateRef({ value: t, loc: body.loc }, ast, refCtx, expectRunTargetRef); + throw jaiphError( + ast.filePath, body.loc.line, body.loc.col, "E_VALIDATE", + `use run ${t}() — "${t}" is a valid script or workflow reference; use a managed run step`, + ); + } } - const exprInner = semanticQuotedOrchestrationInner(v.bashRhs); - validateWorkflowStringCaptures(exprInner, s.loc); - validateDotFieldRefs(exprInner, s.loc, promptSchemas); - validateSimpleInterpolationIdentifiers( - exprInner, - ast.filePath, - s.loc.line, - s.loc.col, - "const", - wfKnownVars, - "workflow", - promptSchemas, - recoverBindings, - localScripts, - ); + return; + } + validateCallable(body, wfKnownVars, "workflow", recoverBindings); + if (s.catch) { + const steps = "single" in s.catch ? [s.catch.single] : s.catch.block; + for (const r of steps) validateStep(r, new Set([s.catch.bindings.failure])); + } + if (s.recover) { + const steps = "single" in s.recover ? [s.recover.single] : s.recover.block; + for (const r of steps) validateStep(r, new Set([s.recover.bindings.failure])); } - return; - } - if (s.type === "match") { - validateMatchExpr(ast.filePath, s.expr, wfKnownVars); return; } if (s.type === "if") { @@ -1306,59 +1091,17 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { if (s.type === "for_lines") { if (!wfKnownVars.has(s.sourceVar)) { throw jaiphError( - ast.filePath, - s.loc.line, - s.loc.col, - "E_VALIDATE", + ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", `for ... in : "${s.sourceVar}" is not a known variable in this scope`, ); } for (const bodyStep of s.body) validateStep(bodyStep, recoverBindings); return; } - if (s.type === "run_inline_script") { - return; - } - if (s.type === "shell") { - if (hasUnquotedSendArrow(s.command) && matchSendOperator(s.command) === null) { - throw jaiphError( - ast.filePath, - s.loc.line, - s.loc.col, - "E_VALIDATE", - "invalid send: channel must be a single name or `alias.name` (at most one dot in the channel part)", - ); - } - const t = s.command.trim(); - if (/^(?:[A-Za-z_][A-Za-z0-9_]*)(?:\.[A-Za-z_][A-Za-z0-9_]*)*$/.test(t)) { - if (!t.includes(".")) { - if (localScripts.has(t) || localWorkflows.has(t)) { - throw jaiphError( - ast.filePath, - s.loc.line, - s.loc.col, - "E_VALIDATE", - `use run ${t}() — a bare name that refers to a script or workflow must use a managed run step`, - ); - } - } else { - validateRef({ value: t, loc: s.loc }, ast, refCtx, expectRunTargetRef); - throw jaiphError( - ast.filePath, - s.loc.line, - s.loc.col, - "E_VALIDATE", - `use run ${t}() — "${t}" is a valid script or workflow reference; use a managed run step`, - ); - } - } - return; - } const _never: never = s; return _never; }; - for (const step of workflow.steps) { validateStep(step); } @@ -1369,17 +1112,6 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { } } -/** - * Validate variable references inside `test` blocks. The only names in scope are - * those introduced by `const NAME = …` (literal or `run … capture`) earlier in - * the same block. There is no implicit `response`: an `expect_*` step that - * references an undeclared name is a compile-time error. - * - * Errors raised: - * - `mock prompt ` where `` was not declared earlier - * - `expect_*` LHS variable not declared earlier - * - `expect_* var ` RHS where `` was not declared earlier - */ function validateTestBlocks(ast: jaiphModule, tests: import("../types").TestBlockDef[]): void { for (const tb of tests) { const inScope = new Set(); @@ -1433,11 +1165,6 @@ function validateTestBlocks(ast: jaiphModule, tests: import("../types").TestBloc } continue; } - // Other step types (mock_workflow/rule/script bodies, blank_line, comment) are - // out of scope for this pass: their bodies are validated as workflow/rule steps - // by the regular path when materialized, and they do not contribute to the - // test-level `vars` map. } } } - diff --git a/src/types-shape.test.ts b/src/types-shape.test.ts new file mode 100644 index 00000000..ad2045e6 --- /dev/null +++ b/src/types-shape.test.ts @@ -0,0 +1,160 @@ +import test from "node:test"; +import assert from "node:assert/strict"; +import { readdirSync, readFileSync, statSync } from "node:fs"; +import { join, resolve } from "node:path"; +import type { Expr, WorkflowStepDef } from "./types"; +import * as TypesModule from "./types"; + +// Tests run from dist/src/, so source files live two levels up under src/. +const repoRoot = resolve(__dirname, "../.."); +const srcRoot = join(repoRoot, "src"); + +/** + * AC1 — Placeholder strings deleted from the AST. + * + * After collapsing the three managed-call encodings into `Expr`, no source + * file under `src/` should ever produce the legacy sentinel values that + * existed only so the formatter could print something while the real + * payload sat in a `managed:` sidecar. + * + * If anyone reintroduces one of these strings as a placeholder, this test + * fails with the offending file:line. + */ +const PLACEHOLDER_STRINGS = ['"__match__"', '"run inline_script"', '"__JAIPH_MANAGED__"']; + +function listSourceFiles(dir: string, acc: string[]): void { + for (const entry of readdirSync(dir)) { + const full = join(dir, entry); + const st = statSync(full); + if (st.isDirectory()) { + // Skip the test file itself so it's allowed to mention the strings. + listSourceFiles(full, acc); + continue; + } + if (!entry.endsWith(".ts")) continue; + if (entry.endsWith(".test.ts")) continue; // tests may reference strings in assertions + if (full.endsWith("types-shape.test.ts")) continue; + acc.push(full); + } +} + +test("AC1: no AST placeholder strings linger in src/", () => { + const files: string[] = []; + listSourceFiles(srcRoot, files); + const offenders: string[] = []; + for (const file of files) { + const text = readFileSync(file, "utf8"); + for (const placeholder of PLACEHOLDER_STRINGS) { + if (text.includes(placeholder)) { + offenders.push(`${file} contains ${placeholder}`); + } + } + } + assert.deepEqual(offenders, [], `Placeholder strings reappeared in src/:\n${offenders.join("\n")}`); +}); + +/** + * AC2 — `WorkflowStepDef` has at most 8 variants. The exhaustive switch + * below fails to compile if a new variant is silently added (the `never` + * fallback widens), and the runtime tuple lookup pins the count to 8. + */ +type StepType = WorkflowStepDef["type"]; +type AllStepTypes = readonly ["exec", "const", "return", "send", "say", "if", "for_lines", "trivia"]; +type _StepTypesCoverAllVariants = StepType extends AllStepTypes[number] + ? AllStepTypes[number] extends StepType + ? true + : never + : never; +const _stepTypesAtMost8: _StepTypesCoverAllVariants = true; + +function _exhaustiveStepSwitch(s: WorkflowStepDef): void { + switch (s.type) { + case "exec": + case "const": + case "return": + case "send": + case "say": + case "if": + case "for_lines": + case "trivia": + return; + default: { + const _never: never = s; + return _never; + } + } +} + +test("AC2: WorkflowStepDef has exactly 8 variants", () => { + const declaredTypes: AllStepTypes = ["exec", "const", "return", "send", "say", "if", "for_lines", "trivia"]; + assert.equal(declaredTypes.length, 8); + assert.equal(_stepTypesAtMost8, true); + // Reference the exhaustive switch so the unused-symbol check is happy and + // the dead-code eliminator can't drop the type-level assertion. + void _exhaustiveStepSwitch; +}); + +/** + * AC2 (companion) — `Expr` is exhaustive too. The Refactor 3 design carries + * 7 base kinds from the task spec; this implementation adds `shell` and + * `bare_ref` for send-RHS shapes that the validator either rejects or + * specializes. If a kind is added or removed without updating both the + * declared list and the exhaustive switch, this fails to compile. + */ +type ExprKind = Expr["kind"]; +type AllExprKinds = readonly ["literal", "call", "ensure_call", "inline_script", "prompt", "match", "shell", "bare_ref"]; +type _ExprKindsExhaustive = ExprKind extends AllExprKinds[number] + ? AllExprKinds[number] extends ExprKind + ? true + : never + : never; +const _exprExhaustive: _ExprKindsExhaustive = true; + +function _exhaustiveExprSwitch(e: Expr): void { + switch (e.kind) { + case "literal": + case "call": + case "ensure_call": + case "inline_script": + case "prompt": + case "match": + case "shell": + case "bare_ref": + return; + default: { + const _never: never = e; + return _never; + } + } +} + +test("AC2: Expr has exactly 8 kinds (literal/call/ensure_call/inline_script/prompt/match/shell/bare_ref)", () => { + const declaredKinds: AllExprKinds = ["literal", "call", "ensure_call", "inline_script", "prompt", "match", "shell", "bare_ref"]; + assert.equal(declaredKinds.length, 8); + assert.equal(_exprExhaustive, true); + void _exhaustiveExprSwitch; +}); + +/** + * AC3 — `ConstRhs` and `SendRhsDef` are deleted as separate exported + * symbols; their fields now live inside `Expr`. + */ +test("AC3: ConstRhs and SendRhsDef are not exported from src/types.ts", () => { + const exported = Object.keys(TypesModule); + // Both symbol names should be absent from the module's export surface. + assert.ok(!exported.includes("ConstRhs"), `ConstRhs should not be exported`); + assert.ok(!exported.includes("SendRhsDef"), `SendRhsDef should not be exported`); + + // Belt-and-suspenders: re-check the source file. (Pure types don't show up + // in runtime exports, so the textual check is what catches them.) + const typesPath = join(srcRoot, "types.ts"); + const typesText = readFileSync(typesPath, "utf8"); + assert.ok( + !/export\s+type\s+ConstRhs\b/.test(typesText), + "src/types.ts must not export ConstRhs", + ); + assert.ok( + !/export\s+type\s+SendRhsDef\b/.test(typesText), + "src/types.ts must not export SendRhsDef", + ); +}); diff --git a/src/types.ts b/src/types.ts index 73080680..b990dba6 100644 --- a/src/types.ts +++ b/src/types.ts @@ -59,28 +59,46 @@ export type Arg = | { kind: "literal"; raw: string } | { kind: "var"; name: string }; -export type ConstRhs = - | { kind: "expr"; bashRhs: string } - | { kind: "run_capture"; ref: WorkflowRefDef; args?: Arg[]; async?: boolean } - | { kind: "ensure_capture"; ref: RuleRefDef; args?: Arg[] } - | { - kind: "prompt_capture"; - raw: string; - loc: SourceLoc; - returns?: string; - } - | { kind: "run_inline_script_capture"; body: string; lang?: string; args?: Arg[] } - | { kind: "match_expr"; match: MatchExprDef }; +/** + * One expression — used wherever a value can appear: + * - `const name = ` + * - `return ` + * - `send channel <- ` + * - `log ` / `logerr ` / `fail ` + * - body of an `exec` step (managed call statement form, where the value is consumed + * for its side effects + optional capture) + * + * Replaces the prior `ConstRhs` / `SendRhsDef` unions and the placeholder-string + * `managed:` sidecar on `return` / `log` / `logerr`. + * + * Kinds: + * - `literal`: a string or `$var` / `${var}` form — the raw text as it appears in source + * (post-dedent for triple-quoted bodies; the formatter consults trivia for surface form). + * - `call`: a managed workflow/script call `ref(args)`. `async` is set when the source said + * `run async ref(...)` in capture position. + * - `ensure_call`: a managed rule call `ref(args)`. + * - `inline_script`: an inline-script call (`` `body`(args) `` or fenced). + * - `prompt`: a prompt body. `raw` carries the JSON-quoted prompt text (or `"${identifier}"` + * sugar). `returns` carries an optional flat returns schema. + * - `match`: a `match { ... }` expression evaluated for its value. + * - `shell`: a raw shell fragment used as a managed substitution on the send RHS. + * - `bare_ref`: a bare symbol on a send RHS (e.g. `channel <- foo`). Always rejected by the + * validator; preserved so the error message can name the symbol. + */ +export type Expr = + | { kind: "literal"; raw: string } + | { kind: "call"; callee: WorkflowRefDef; args?: Arg[]; async?: boolean } + | { kind: "ensure_call"; callee: RuleRefDef; args?: Arg[] } + | { kind: "inline_script"; lang?: string; body: string; args?: Arg[] } + | { kind: "prompt"; raw: string; loc: SourceLoc; returns?: string } + | { kind: "match"; match: MatchExprDef } + | { kind: "shell"; command: string; loc: SourceLoc } + | { kind: "bare_ref"; ref: WorkflowRefDef }; -/** RHS of `channel <- …` */ -export type SendRhsDef = - | { kind: "literal"; token: string } - | { kind: "var"; bash: string } - | { kind: "run"; ref: WorkflowRefDef; args?: Arg[] } - /** Parsed then rejected in validation (use `run ref` to capture a return value). */ - | { kind: "bare_ref"; ref: WorkflowRefDef } - /** Shell fragment emitted as `"$(...)"` for inbox send. */ - | { kind: "shell"; command: string; loc: SourceLoc }; +/** Body attached to a `catch` or `recover` clause on an exec step. */ +export type CatchBody = + | { single: WorkflowStepDef; bindings: { failure: string } } + | { block: WorkflowStepDef[]; bindings: { failure: string } }; export interface RuleDef { name: string; @@ -119,109 +137,55 @@ export interface ScriptDef { loc: SourceLoc; } +/** + * Eight workflow-step variants — all values that flow through a step live in `Expr`. + * + * - `exec`: side-effecting managed call statement (was: `run` / `ensure` / + * `run_inline_script` / `prompt` / `shell` step / standalone `match`). The + * discriminator now lives inside `body.kind`; `captureName` / `async` / + * `catch` / `recover` are step-level attributes. + * - `const` / `return` / `send`: bind, propagate, or emit an `Expr` value. + * - `say`: was `log` / `logerr` / `fail`. `level: "fail"` aborts the workflow + * with the message; otherwise the message is written to the corresponding + * stream. + * - `if` / `for_lines`: control flow (unchanged shape). + * - `trivia`: formatter-only `comment` / `blank_line` slots — they have no + * execution semantics and are skipped by the runtime / validator. + */ export type WorkflowStepDef = | { - type: "ensure"; - ref: RuleRefDef; - args?: Arg[]; - /** When set, capture step stdout into this variable name. */ - captureName?: string; - /** When set, catch failure and run recovery body once. */ - catch?: - | { single: WorkflowStepDef; bindings: { failure: string } } - | { block: WorkflowStepDef[]; bindings: { failure: string } }; - } - | { - type: "run"; - workflow: WorkflowRefDef; - args?: Arg[]; - /** When set, capture step stdout into this variable name. */ + type: "exec"; + body: Expr; + /** When set, capture the result into this variable name. */ captureName?: string; - /** When set, execute asynchronously with implicit join before workflow completes. */ - async?: boolean; /** When set, catch failure and run recovery body once. */ - catch?: - | { single: WorkflowStepDef; bindings: { failure: string } } - | { block: WorkflowStepDef[]; bindings: { failure: string } }; + catch?: CatchBody; /** When set, retry with repair loop semantics (try → fail → recover body → retry). */ - recover?: - | { single: WorkflowStepDef; bindings: { failure: string } } - | { block: WorkflowStepDef[]; bindings: { failure: string } }; - } - | { - type: "prompt"; - raw: string; - loc: SourceLoc; - /** When set, capture prompt stdout into this variable name. */ - captureName?: string; - /** When set, validate response JSON against this flat schema (field: string|number|boolean). */ - returns?: string; - } - | { - type: "comment"; - text: string; - loc: SourceLoc; - } - | { - type: "fail"; - message: string; + recover?: CatchBody; loc: SourceLoc; } | { type: "const"; name: string; - value: ConstRhs; - loc: SourceLoc; - } - | { - type: "log"; - message: string; + value: Expr; loc: SourceLoc; - /** When set, log message comes from a managed inline-script call. */ - managed?: { kind: "run_inline_script"; body: string; lang?: string; args?: Arg[] }; } | { - type: "logerr"; - message: string; + type: "return"; + value: Expr; loc: SourceLoc; - /** When set, logerr message comes from a managed inline-script call. */ - managed?: { kind: "run_inline_script"; body: string; lang?: string; args?: Arg[] }; } | { type: "send"; channel: string; - rhs: SendRhsDef; + value: Expr; loc: SourceLoc; } | { - type: "return"; - value: string; + type: "say"; + level: "log" | "logerr" | "fail"; + message: Expr; loc: SourceLoc; - /** When set, return value comes from a managed run/ensure/match instead of the literal `value`. */ - managed?: - | { kind: "run"; ref: WorkflowRefDef; args?: Arg[] } - | { kind: "ensure"; ref: RuleRefDef; args?: Arg[] } - | { kind: "match"; match: MatchExprDef } - | { kind: "run_inline_script"; body: string; lang?: string; args?: Arg[] }; - } - | { - type: "run_inline_script"; - body: string; - /** Fence language tag (e.g. "node", "python3"). Maps to `#!/usr/bin/env `. */ - lang?: string; - args?: Arg[]; - captureName?: string; - loc: SourceLoc; - } - | { - type: "shell"; - command: string; - loc: SourceLoc; - captureName?: string; - } - | { - type: "match"; - expr: MatchExprDef; } | { type: "if"; @@ -240,8 +204,11 @@ export type WorkflowStepDef = loc: SourceLoc; } | { - /** Preserved intentional blank line between steps (formatter only). */ - type: "blank_line"; + /** Formatter-only: `# comment` line or preserved blank line between steps. */ + type: "trivia"; + kind: "comment" | "blank_line"; + text?: string; + loc?: SourceLoc; }; export interface EnvDeclDef { diff --git a/test-fixtures/golden-ast/expected/brace-if.json b/test-fixtures/golden-ast/expected/brace-if.json index b85639c0..7adc1c95 100644 --- a/test-fixtures/golden-ast/expected/brace-if.json +++ b/test-fixtures/golden-ast/expected/brace-if.json @@ -9,10 +9,13 @@ "params": [], "steps": [ { - "type": "run", - "workflow": { - "value": "ok_impl" - } + "body": { + "callee": { + "value": "ok_impl" + }, + "kind": "call" + }, + "type": "exec" } ] } @@ -36,33 +39,39 @@ "params": [], "steps": [ { + "body": { + "callee": { + "value": "ok" + }, + "kind": "ensure_call" + }, "catch": { "bindings": { "failure": "err" }, "block": [ { - "args": [ - { - "kind": "var", - "name": "err" + "body": { + "args": [ + { + "kind": "var", + "name": "err" + }, + { + "kind": "literal", + "raw": "\"error.log\"" + } + ], + "callee": { + "value": "save" }, - { - "kind": "literal", - "raw": "\"error.log\"" - } - ], - "type": "run", - "workflow": { - "value": "save" - } + "kind": "call" + }, + "type": "exec" } ] }, - "ref": { - "value": "ok" - }, - "type": "ensure" + "type": "exec" } ] } diff --git a/test-fixtures/golden-ast/expected/imports.json b/test-fixtures/golden-ast/expected/imports.json index ecd705d5..de8dfae2 100644 --- a/test-fixtures/golden-ast/expected/imports.json +++ b/test-fixtures/golden-ast/expected/imports.json @@ -16,16 +16,22 @@ "params": [], "steps": [ { - "type": "run", - "workflow": { - "value": "lib.setup" - } + "body": { + "callee": { + "value": "lib.setup" + }, + "kind": "call" + }, + "type": "exec" }, { - "ref": { - "value": "lib.check" + "body": { + "callee": { + "value": "lib.check" + }, + "kind": "ensure_call" }, - "type": "ensure" + "type": "exec" } ] } diff --git a/test-fixtures/golden-ast/expected/log.json b/test-fixtures/golden-ast/expected/log.json index a8d99f76..d62d4398 100644 --- a/test-fixtures/golden-ast/expected/log.json +++ b/test-fixtures/golden-ast/expected/log.json @@ -11,16 +11,28 @@ "params": [], "steps": [ { - "message": "hello world", - "type": "log" + "level": "log", + "message": { + "kind": "literal", + "raw": "hello world" + }, + "type": "say" }, { - "message": "${USER} logged in", - "type": "log" + "level": "log", + "message": { + "kind": "literal", + "raw": "${USER} logged in" + }, + "type": "say" }, { - "message": "something went wrong", - "type": "logerr" + "level": "logerr", + "message": { + "kind": "literal", + "raw": "something went wrong" + }, + "type": "say" } ] } diff --git a/test-fixtures/golden-ast/expected/match-multiline.json b/test-fixtures/golden-ast/expected/match-multiline.json index 39863b4c..0fa46581 100644 --- a/test-fixtures/golden-ast/expected/match-multiline.json +++ b/test-fixtures/golden-ast/expected/match-multiline.json @@ -14,12 +14,13 @@ "name": "input", "type": "const", "value": { - "bashRhs": "\"hello\"", - "kind": "expr" + "kind": "literal", + "raw": "\"hello\"" } }, { - "managed": { + "type": "return", + "value": { "kind": "match", "match": { "arms": [ @@ -40,9 +41,7 @@ ], "subject": "input" } - }, - "type": "return", - "value": "__match__" + } } ] } diff --git a/test-fixtures/golden-ast/expected/match.json b/test-fixtures/golden-ast/expected/match.json index c64c2651..24853eab 100644 --- a/test-fixtures/golden-ast/expected/match.json +++ b/test-fixtures/golden-ast/expected/match.json @@ -14,12 +14,13 @@ "name": "input", "type": "const", "value": { - "bashRhs": "\"hello\"", - "kind": "expr" + "kind": "literal", + "raw": "\"hello\"" } }, { - "managed": { + "type": "return", + "value": { "kind": "match", "match": { "arms": [ @@ -46,9 +47,7 @@ ], "subject": "input" } - }, - "type": "return", - "value": "__match__" + } } ] } diff --git a/test-fixtures/golden-ast/expected/params.json b/test-fixtures/golden-ast/expected/params.json index 941179de..fdf0457f 100644 --- a/test-fixtures/golden-ast/expected/params.json +++ b/test-fixtures/golden-ast/expected/params.json @@ -11,10 +11,13 @@ ], "steps": [ { - "type": "run", - "workflow": { - "value": "checker" - } + "body": { + "callee": { + "value": "checker" + }, + "kind": "call" + }, + "type": "exec" } ] } @@ -36,8 +39,12 @@ ], "steps": [ { - "message": "${greeting}, ${name}!", - "type": "log" + "level": "log", + "message": { + "kind": "literal", + "raw": "${greeting}, ${name}!" + }, + "type": "say" } ] } diff --git a/test-fixtures/golden-ast/expected/prompt-capture.json b/test-fixtures/golden-ast/expected/prompt-capture.json index b9a88f9c..56a7c61a 100644 --- a/test-fixtures/golden-ast/expected/prompt-capture.json +++ b/test-fixtures/golden-ast/expected/prompt-capture.json @@ -14,13 +14,17 @@ "name": "answer", "type": "const", "value": { - "kind": "prompt_capture", + "kind": "prompt", "raw": "\"What is your name?\"" } }, { - "message": "${answer}", - "type": "log" + "level": "log", + "message": { + "kind": "literal", + "raw": "${answer}" + }, + "type": "say" } ] } diff --git a/test-fixtures/golden-ast/expected/run-ensure.json b/test-fixtures/golden-ast/expected/run-ensure.json index f641a2db..7bf91647 100644 --- a/test-fixtures/golden-ast/expected/run-ensure.json +++ b/test-fixtures/golden-ast/expected/run-ensure.json @@ -9,10 +9,13 @@ "params": [], "steps": [ { - "type": "run", - "workflow": { - "value": "validator" - } + "body": { + "callee": { + "value": "validator" + }, + "kind": "call" + }, + "type": "exec" } ] } @@ -31,16 +34,22 @@ "params": [], "steps": [ { - "ref": { - "value": "check" + "body": { + "callee": { + "value": "check" + }, + "kind": "ensure_call" }, - "type": "ensure" + "type": "exec" }, { - "type": "run", - "workflow": { - "value": "helper" - } + "body": { + "callee": { + "value": "helper" + }, + "kind": "call" + }, + "type": "exec" } ] }, @@ -50,8 +59,12 @@ "params": [], "steps": [ { - "message": "helping", - "type": "log" + "level": "log", + "message": { + "kind": "literal", + "raw": "helping" + }, + "type": "say" } ] } diff --git a/test-fixtures/golden-ast/expected/script-defs.json b/test-fixtures/golden-ast/expected/script-defs.json index 07eb7c9d..dca2963f 100644 --- a/test-fixtures/golden-ast/expected/script-defs.json +++ b/test-fixtures/golden-ast/expected/script-defs.json @@ -27,16 +27,22 @@ "params": [], "steps": [ { - "type": "run", - "workflow": { - "value": "greet" - } + "body": { + "callee": { + "value": "greet" + }, + "kind": "call" + }, + "type": "exec" }, { - "type": "run", - "workflow": { - "value": "multiline" - } + "body": { + "callee": { + "value": "multiline" + }, + "kind": "call" + }, + "type": "exec" } ] } From 7bbe180ee47de098137eab320984880068ed8f92 Mon Sep 17 00:00:00 2001 From: Jakub Dzikowski Date: Sat, 16 May 2026 14:16:03 +0200 Subject: [PATCH 09/14] Refactor: fold validator pre-passes into a single workflow walk src/transpile/validate.ts used to descend each workflow's / rule's step tree four times before the main check loop finished -- collectKnownVars, collectPromptSchemas, validateImmutableBindings, and the per-step validator -- each re-implementing the same recursion over if / for_lines / catch / recover with subtly different rules. The three pre-pass helpers are gone. A new walkStepTree descends the tree once, accumulating knownVars, promptSchemas (gated by withPromptSchemas so rules skip schema collection), and enforcing immutable-binding / script-collision rules inline through a shared bindings map (with a fresh inner map under each for_lines body so loop iterators only shadow inside the body). It emits a flat FlatStepEntry[] of every step in tree order with the enclosing catch / recover failure binding attached; the main per-workflow and per-rule validator loops iterate that flat list non-recursively. walkStepTree's internal descend is now the only recursive helper in the file that takes a WorkflowStepDef[]. All existing E_VALIDATE error messages and locations are preserved bit-for-bit. New tests in validate-single-walk.test.ts pin both invariants (no reappearance of the deleted helpers by name; at most one WorkflowStepDef[] walker). Docs updated. Co-Authored-By: Claude Opus 4.7 (1M context) --- CHANGELOG.md | 1 + QUEUE.md | 25 --- docs/architecture.md | 1 + docs/contributing.md | 1 + src/transpile/validate-single-walk.test.ts | 106 ++++++++++ src/transpile/validate.ts | 232 +++++++++++---------- 6 files changed, 236 insertions(+), 130 deletions(-) create mode 100644 src/transpile/validate-single-walk.test.ts diff --git a/CHANGELOG.md b/CHANGELOG.md index 4ada53d9..e6e02e0b 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,6 @@ # Unreleased +- **Refactor — Fold the validator's three workflow pre-passes into a single step-tree walk:** `src/transpile/validate.ts` used to descend each workflow's / rule's step tree four times before its main check loop finished — `collectKnownVars`, `collectPromptSchemas`, `validateImmutableBindings`, and the per-step validator itself — each re-implementing the same recursion over `if` / `for_lines` / `catch` / `recover` with subtly different rules, so "what counts as a binding here" fixes had to land in two or three walkers. The three pre-pass helpers are deleted. One new helper `walkStepTree(filePath, steps, envDecls, params, declLoc, moduleScripts, parseSchemaFieldNames, { withPromptSchemas })` descends the tree once and returns `{ knownVars, promptSchemas, flat }`: it accumulates `knownVars` (env decls + params + every nested `const` / capture / `for_lines` iterator), `promptSchemas` (top-level prompt-returning bindings — workflow walks set `withPromptSchemas: true`, rule walks set it `false`), enforces immutable-binding and `script`-collision rules inline through a shared `bindings` map (with a fresh inner map under each `for_lines` body so loop iterators only shadow inside the body), and emits a flat `FlatStepEntry[]` of every step in tree order with the enclosing `catch` / `recover` failure binding (`recoverBindings: Set | undefined`) attached. The per-workflow and per-rule validator loops now iterate that flat list non-recursively — the `if` / `for_lines` / `catch` / `recover` recursion that used to live inside `validateStep` / `validateRuleStep` is gone. `walkStepTree`'s internal `descend` is the only recursive helper in the file that takes a `WorkflowStepDef[]`. Failure order matches the prior "binding errors first, then per-step errors" behavior because binding checks fire during the descent, before any flat-list iteration starts. Every existing `E_VALIDATE` error message and location is preserved bit-for-bit: the full `validate-*.test.ts` suite, `src/transpile/compiler-golden.test.ts`, `src/transpile/compiler-edge.acceptance.test.ts`, and the txtar / golden-AST corpora all pass unchanged. New tests pin the invariants: `src/transpile/validate-single-walk.test.ts` greps `validate.ts` and fails if any of `collectKnownVars`, `collectPromptSchemas`, or `validateImmutableBindings` reappear by name (AC1), and a textual AST scan asserts that at most one recursive helper whose parameter list mentions `WorkflowStepDef[]` exists in the file (AC2). User-visible contracts — CLI behavior, `jaiph format` round-trip, run artifacts, banner, hooks, exit codes, `__JAIPH_EVENT__` streaming, and the full golden corpus — are unchanged. Out of scope: the visitor-table refactor (Refactor 4) and any change to validation rules. Docs updated in `docs/architecture.md` (new **Single workflow walk** bullet under **Validator**) and `docs/contributing.md` (new **Validator single-walk shape** row in the test-layer table). Implements `design/2026-05-15-parser-compiler-simplification.md` § Appendix C. - **Refactor — Collapse the AST around a single `Expr` type, eliminating the three "managed call" encodings:** The same concept "a managed call that yields a value" used to be encoded three different ways: as a statement (`{ type: "run", workflow, args }`), as a const RHS (`{ kind: "run_capture", ref, args }`), and as a `managed:` sidecar on `return` / `log` / `logerr` whose `value` / `message` carried a placeholder string (`"__match__"`, `"run inline_script"`, etc.). Inline scripts added a fourth (`run_inline_script_capture`); `prompt`, `match`, and `ensure` captures repeated the same dual representation. The validator, formatter, emitter, and runtime each had to handle both branches at every site. All three encodings are gone. The semantic AST now has a single `Expr` tagged union — `literal | call | ensure_call | inline_script | prompt | match | shell | bare_ref` — used everywhere a value can appear: `const name = `, `return `, `send channel <- `, the message of `log` / `logerr` / `fail`, and the body of an `exec` step (the new statement-form managed call, where the value is consumed for its side effects plus optional capture). `ConstRhs` and `SendRhsDef` are deleted as separate types. The `managed:` sidecar field is deleted from `WorkflowStepDef`. The placeholder strings `"__match__"`, `"run inline_script"`, and `"__JAIPH_MANAGED__"` no longer appear anywhere under `src/`. `WorkflowStepDef` collapses from 14 variants to **8** (`exec`, `const`, `return`, `send`, `say`, `if`, `for_lines`, `trivia`): `exec` is the new managed-statement form covering the prior `run` / `ensure` / `run_inline_script` / `prompt` / `shell` / standalone `match` cases (the discriminator now lives inside `body.kind`, with `captureName` / `catch` / `recover` as step-level attributes); `say` covers the prior `log` / `logerr` / `fail` cases (`level: "fail"` aborts the workflow with the message, otherwise the message is written to the corresponding stream); `comment` / `blank_line` collapse into a single `trivia` variant (formatter-only, skipped by validator and runtime). The parser builds `Expr` nodes directly: `parseConstRhs` returns `{ value: Expr }`; `parseSendRhs` returns `{ value: Expr }`; `parsePromptStep` returns an `exec` step whose `body` is an `Expr.prompt`; `return run …` / `return ensure …` / `return match …` / `return run \`…\`(…)` build `Expr.call` / `Expr.ensure_call` / `Expr.match` / `Expr.inline_script` directly with no sidecar; `log run \`…\`(…)` and `logerr run \`…\`(…)` build `say` steps whose `message` is an `Expr.inline_script`. Downstream consumers compress accordingly: the validator switches on the 8-variant `WorkflowStepDef.type` and the 8-kind `Expr.kind` with no "literal value vs managed sidecar" fork; the formatter renders each `Expr` through one `emitExpr` helper instead of branching on a sidecar; the runtime has one private `evaluateExpr(scope, expr, …)` dispatcher that `const` / `return` / `send` / `say` / `exec` all delegate to (which runs the managed call for `call` / `ensure_call` / `inline_script`, walks `match` arms, schema-checks `prompt`, and interpolates `literal` via `interpolateWithCaptures`); the script-emit walk in `src/transpile/emit-script.ts` finds inline-script bodies by recursing into each step's `Expr` payload rather than enumerating the four legacy carriers. New tests pin the invariants: `src/types-shape.test.ts` is a compile-time exhaustive `switch` plus runtime tuple assertion that `WorkflowStepDef` has exactly **8** variants and `Expr` has exactly **8** kinds (AC2), a `grep` over every non-test `.ts` file under `src/` that fails if any of the placeholder strings (`"__match__"`, `"run inline_script"`, `"__JAIPH_MANAGED__"`) reappear (AC1), and an export-surface check that fails if `ConstRhs` or `SendRhsDef` are re-exported from `src/types.ts` (AC3). Updated parser tests in `src/parse/parse-return.test.ts`, `src/parse/parse-const-rhs.test.ts`, `src/parse/parse-prompt.test.ts`, `src/parse/parse-send-rhs.test.ts`, `src/parse/parse-steps.test.ts`, `src/parse/parse-inline-script.test.ts`, and `src/parse/parse-bare-call.test.ts` assert the new `Expr` shape directly for `return run …`, `return ensure …`, `return match … { … }`, `return run \`…\`(…)`, `log run \`…\`(…)`, and `const x = prompt …` (AC4). The golden corpus (`src/transpile/compiler-golden.test.ts`, `src/transpile/compiler-edge.acceptance.test.ts`) passes byte-for-byte against the emitted bash output; `src/format/roundtrip.test.ts` round-trips bit-for-bit on every fixture; `npm run build` passes with zero TypeScript strict-mode errors (AC5 / AC6). Golden AST fixtures under `test-fixtures/golden-ast/expected/` are regenerated to reflect the new step shapes (`exec` wrapping every managed call, `say` replacing `log` / `logerr` / `fail`, `trivia` replacing `comment` / `blank_line`, `Expr` value/message/body payloads). User-visible contracts — CLI behavior, `jaiph format` round-trip, run artifacts, banner, hooks, exit codes, `__JAIPH_EVENT__` streaming, and the full golden corpus — are unchanged byte-for-byte. Out of scope: surface syntax, the validator's deeper structural rewrite (Refactor 4), and parser internals (Refactors 1 & 2). Docs updated in `docs/architecture.md` (rewrote the **AST / Types** bullet to describe the single `Expr` sum and the 8-variant `WorkflowStepDef`; updated **Validator**, **Formatter**, **Node Workflow Runtime**, and **Trivia / CST layer** bullets to drop the dual-representation language; rewrote the `match_expr` mention in **CLI progress reporting pipeline** to use `Expr.kind === "match"`) and `docs/contributing.md` (new **`Expr` / step-variant shape** row in the test-layer table). Implements `design/2026-05-15-parser-compiler-simplification.md` § Refactor 3. - **Refactor — Collapse `bareIdentifierArgs` into a typed `Arg[]` on every call site:** Every call-bearing AST node used to carry the call arguments twice — `args: string` (the raw source between the parens) and `bareIdentifierArgs?: string[]` (a re-parse of which of those arguments happened to be bare identifiers). The validator had to remember to check both fields and call a hand-rolled `validateBareIdentifierArgs` helper at every site; the emitter re-parsed `args` from scratch because it didn't trust either field on its own. Both fields are gone. The parser now classifies each argument once, at parse time, into a new typed sum `type Arg = { kind: "literal"; raw: string } | { kind: "var"; name: string }` and stores it on every call-bearing node as `args?: Arg[]`. Affected nodes: `run` / `ensure` workflow steps, `run_inline_script` steps, the `managed` sidecar on `return` / `log` / `logerr` (in all four shapes — `run`, `ensure`, `run_inline_script`, `match`), the `run_capture` / `ensure_capture` / `run_inline_script_capture` const RHS variants, and the `run` send RHS. Downstream consumers walk the typed list directly: the validator's per-call check sequence is now arity (`args.length`), shell-redirection rejection on `literal` raws, nested-unmanaged-call rejection on `literal` raws, ref resolution, and `var`-arg resolution against in-scope bindings via a new `validateArgVarRefs` (the standalone `validateBareIdentifierArgs` helper is deleted); the formatter renders each `Arg` directly (`var` → bare name, `literal` → raw) instead of re-tokenizing a `${ident}`-rewritten string; the runtime turns `Arg[]` back into the space-separated argv string via `argsToRuntimeString` in `src/parse/core.ts` (`var` → `${name}`, `literal` → raw) so the existing handle-resolution / interpolation path is unchanged. New tests pin the invariants: `src/parse/arg-ast-shape.test.ts` is a compile-time assertion that `bareIdentifierArgs` does not appear on `WorkflowStepDef` (`ensure`, `run`, `run_inline_script`, `log.managed`, `logerr.managed`, `return.managed` in `run` / `ensure` / `run_inline_script` shapes), `ConstRhs` (`run_capture`, `ensure_capture`, `run_inline_script_capture`), or the `run` `SendRhsDef` variant (AC1); `src/parse/arg-grep.test.ts` walks every non-test `.ts` under `src/parse/` and `src/transpile/` and fails if any production file matches `args.split(",")` or the bare token `bareIdentifierArgs` (AC2), and separately fails if any file under `src/transpile/` references `validateBareIdentifierArgs` (AC3). The golden compiler corpus, `validate-*.test.ts` files, and the golden AST corpus pass byte-for-byte (AC4); `npm run build` passes with zero TypeScript strict-mode errors (AC5). User-visible contracts — CLI behavior, `jaiph format` round-trip, run artifacts, banner, hooks, exit codes, `__JAIPH_EVENT__` streaming, and the full golden corpus — are unchanged. Out of scope: the full `Expr` collapse (next task) and surface syntax. Docs updated in `docs/architecture.md` (extended **AST / Types** bullet documenting the typed `Arg` sum; updated **Validator** and **Formatter** bullets to drop the dual representation), `docs/contributing.md` (new **Call-args AST shape** row in the test-layer table), and `docs/spec-async-handles.md` (replaces the stale `commaArgsToSpaced` reference with `argsToRuntimeString`). Implements `design/2026-05-15-parser-compiler-simplification.md` § Appendix D. - **Refactor — Split source-fidelity data from the semantic AST into a `Trivia` (CST) layer:** Around ten fields whose only consumer was the formatter — `leadingComments` on imports / script imports / channels / `const` decls / `test` blocks, `configLeadingComments`, `trailingTopLevelComments`, `configBodySequence` (both module- and workflow-scoped), `topLevelOrder`, `bareSource` on `return`, the `tripleQuoted` flags on `literal` / `return` / `log` / `logerr` / `fail` / `send` / `const`, and the prompt / script `bodyKind` / `bodyIdentifier` discriminators — are removed from `jaiphModule`, `WorkflowStepDef`, `ConstRhs`, `SendRhsDef`, `WorkflowMetadata`, `ImportDef`, `ScriptImportDef`, `ChannelDef`, `ScriptDef`, and `TestBlockDef`, and re-homed in a new parallel `Trivia` store (`src/parse/trivia.ts`) keyed by AST-node identity (per-node `WeakMap`) plus a small `ModuleTrivia` record for module-level data. The parser exposes `parsejaiphWithTrivia(source, filePath) → { ast, trivia }`; the legacy `parsejaiph(source, filePath)` is now a thin wrapper that drops trivia for callers that don't care (validator, transpiler, runtime, `loadModuleGraph`). The formatter (`emitModule(ast, trivia, opts?)`) is the only consumer of `Trivia`; validator, emitter, transpiler, and runtime never import from `src/parse/trivia.ts`. New tests pin the invariants: `src/parse/trivia-ast-shape.test.ts` is a compile-time assertion (with runtime echo) that none of the listed fields reappear on any semantic AST type (AC1); `src/parse/trivia-grep.test.ts` greps validator and emitter source files and fails if any of them references `Trivia` / `createTrivia` / `NodeTrivia` / `ModuleTrivia` or imports from `parse/trivia` (AC2); `src/format/roundtrip.test.ts` walks every `.jh` under `examples/` and `test-fixtures/golden-ast/fixtures/` and asserts `parse → format → parse → format` converges bit-for-bit (AC3). Golden AST fixtures under `test-fixtures/golden-ast/expected/` are regenerated to drop the moved fields. User-visible contracts (CLI behavior, `jaiph format` round-trip, run artifacts, banner, hooks, exit codes, `__JAIPH_EVENT__` streaming) are unchanged. `npm test` and `npm run build` pass with zero TypeScript strict-mode errors (AC4 / AC5). Out of scope: the `Expr` collapse — this refactor only relocates source-fidelity fields without changing the semantic AST's shape. Docs updated in `docs/architecture.md` (new **Trivia / CST layer** section with anchor `#trivia-cst-layer`, plus updated **Parser**, **AST / Types**, and **Formatter** bullets) and `docs/contributing.md` (new row in the test-layer table). Implements `design/2026-05-15-parser-compiler-simplification.md` § Appendix A. diff --git a/QUEUE.md b/QUEUE.md index 51278e3e..5d70b70c 100644 --- a/QUEUE.md +++ b/QUEUE.md @@ -13,31 +13,6 @@ Process rules: *** -## Fold the validator's pre-passes (knownVars / promptSchemas / immutableBindings) into a single workflow walk #dev-ready - -**Design reference:** `design/2026-05-15-parser-compiler-simplification.md` § Appendix C. - -**Why:** `src/transpile/validate.ts` walks each workflow's step tree at least three times before its main check loop runs: `collectKnownVars`, `collectPromptSchemas`, `validateImmutableBindings`. Each re-implements the same recursion over if/for_lines/catch/recover with subtly different rules — bug-fixes to "what counts as a binding here" land in 2–3 walkers. - -**Scope:** - -- Replace the three pre-passes with a single visitor that descends the workflow once, accumulating `{ knownVars, promptSchemas, bindings }` as it goes. -- The main per-step validator runs in the same descent (or as a second pass over the accumulated state), but the *structural* recursion over if/for_lines/catch/recover happens exactly once. -- All existing validation rules and error messages are preserved bit-for-bit. - -**Acceptance criteria** (each verified by a test): - -1. `collectKnownVars`, `collectPromptSchemas`, and `validateImmutableBindings` are deleted as separate functions. A grep test fails if they reappear by name. -2. There is exactly one recursion over workflow/rule step trees in `src/transpile/validate.ts`. A test counts recursive helpers that walk `WorkflowStepDef[]` and asserts ≤ 1. -3. Every existing `E_VALIDATE` error message and location is preserved bit-for-bit. Snapshot test across every `validate-*.test.ts` fixture. -4. `npm test` passes, including all `validate-*.test.ts` files and the golden corpus. - -**Out of scope:** the visitor-table refactor (Refactor 4, two tasks ahead). Changes to validation rules. - -**Dependency:** The `Expr` collapse (previous task) should be complete first. - -*** - ## Replace fail-fast errors with a Diagnostics collector that aggregates per compile #dev-ready **Design reference:** `design/2026-05-15-parser-compiler-simplification.md` § Appendix B. diff --git a/docs/architecture.md b/docs/architecture.md index 1ccd06d8..698c23c5 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -53,6 +53,7 @@ All orchestration — local `jaiph run`, `jaiph test`, and **Docker `jaiph run`* - **Validator (`src/transpile/validate.ts`)** - Resolves imports and symbol references; emits deterministic compile-time errors. Import resolution (`resolveImportPath` in `transpile/resolve.ts`) checks relative paths first, then falls back to project-scoped libraries under `/.jaiph/libs/` — the workspace root is threaded through all compilation call sites. Export visibility is enforced by `validateRef` in `validate-ref-resolution.ts`: if an imported module declares any `export`, only exported names are reachable through the import alias. - The validator drives off `WorkflowStepDef.type` (8 variants) and `Expr.kind` (8 variants). For every value-bearing step (`const` / `return` / `send` / `say`) and for the body of every `exec` step, a single `validateExpr(expr, ...)` dispatcher handles the value: it routes `call` / `ensure_call` / `inline_script` to call-site validation, walks `match` arms, schema-checks `prompt`, and runs the substitution scanner on `literal` raws. There is no dual code path for "managed sidecar vs literal value" — that branch is gone. + - **Single workflow walk.** Each workflow / rule has its step tree descended exactly once by `walkStepTree`, which simultaneously accumulates `knownVars` (env decls + params + every nested `const` / capture / `for_lines` iterator), `promptSchemas` (top-level prompt-returning bindings, gated by `options.withPromptSchemas` so rules skip schema collection), enforces immutable-binding / `script`-collision rules inline (mutating a shared `bindings` map and threading a fresh inner map under each `for_lines` so loop iterators only shadow inside the body), and emits a flat `FlatStepEntry[]` of every step in tree order with the enclosing `catch` / `recover` failure binding attached. The main per-step validator loop iterates that flat list non-recursively, so `walkStepTree`'s internal `descend` is the **only** recursive helper in the file that takes a `WorkflowStepDef[]`. A pair of grep / AST tests (`src/transpile/validate-single-walk.test.ts`) pins both invariants: the prior helpers (`collectKnownVars`, `collectPromptSchemas`, `validateImmutableBindings`) cannot reappear by name, and at most one recursive `WorkflowStepDef[]` walker may live in `validate.ts`. - Per call site the validator runs five checks against the typed **`Arg[]`** directly — shell-redirection rejection (only `literal` args are scanned), nested-unmanaged-call rejection inside `literal` raws, ref resolution, arity (`args.length` vs declared params), and `var`-arg resolution against in-scope bindings via `validateArgVarRefs`. There is no longer a separate `validateBareIdentifierArgs` helper, and no place re-parses an `args: string` payload by splitting on commas or rescanning quotes. - **Transpiler (`src/transpiler.ts`, `src/transpile/*`)** diff --git a/docs/contributing.md b/docs/contributing.md index 1b48ab71..60e0d8f3 100644 --- a/docs/contributing.md +++ b/docs/contributing.md @@ -105,6 +105,7 @@ Jaiph uses several test layers. Each layer catches a different class of bug. Use | **Trivia / formatter round-trip** | `src/parse/trivia-ast-shape.test.ts`, `src/parse/trivia-grep.test.ts`, `src/format/roundtrip.test.ts` | Source-fidelity invariants: no trivia fields on semantic AST types (compile-time), validator/emitter sources do not reference `Trivia`, and `parse → format → parse → format` is bit-for-bit on every fixture under `examples/` and `test-fixtures/golden-ast/fixtures/` | You changed the parser, formatter, AST types, or anything that touches source-fidelity round-trip (see [Architecture — Trivia (CST layer)](architecture.md#trivia-cst-layer)) | | **Call-args AST shape** | `src/parse/arg-ast-shape.test.ts`, `src/parse/arg-grep.test.ts` | Pins the typed-`Arg[]` invariant: no `bareIdentifierArgs` field on any call-bearing AST type (compile-time), no `args.split(",")` or `bareIdentifierArgs` text in production `src/parse/` or `src/transpile/` sources, and no `validateBareIdentifierArgs` helper in the validator | You changed how call arguments flow through the parser, validator, or emitter and need to confirm nothing re-introduces the parallel raw-string representation (see [Architecture — AST / Types](architecture.md#core-components)) | | **`Expr` / step-variant shape** | `src/types-shape.test.ts` | Pins the collapsed-AST invariants from the `Expr` refactor: exactly 8 `WorkflowStepDef` variants and exactly 8 `Expr` kinds (compile-time exhaustive switch + runtime tuple count), no AST placeholder strings (`"__match__"`, `"run inline_script"`, `"__JAIPH_MANAGED__"`) anywhere under `src/`, and `ConstRhs` / `SendRhsDef` no longer exported from `src/types.ts` | You added or renamed a step variant or `Expr` kind, or reshuffled how value positions are encoded; rerun this test to confirm nothing re-introduces the three-way "managed call" encoding (see [Architecture — AST / Types](architecture.md#core-components)) | +| **Validator single-walk shape** | `src/transpile/validate-single-walk.test.ts` | Pins the validator's "one descent per workflow / rule" invariant: a name-grep fails if `collectKnownVars`, `collectPromptSchemas`, or `validateImmutableBindings` reappear as separate helpers in `validate.ts`, and a textual AST scan fails if more than one recursive helper walking `WorkflowStepDef[]` lives in that file | You touched `walkStepTree`, added a new pre-pass over workflow steps, or restructured how the validator accumulates `knownVars` / `promptSchemas` / `bindings` — rerun this test to confirm the step tree is still descended exactly once (see [Architecture — Validator](architecture.md#core-components)) | | **Compiler tests (txtar)** | `test-fixtures/compiler-txtar/*.txt` | Parse and validate outcomes — success, parse errors, validation errors — using language-agnostic txtar fixtures (hundreds of `===` cases across the four `*.txt` files) | You want a portable test case that can be reused by alternative compiler implementations; the test is a `.jh` input paired with an expected outcome | | **Golden AST tests** | `test-fixtures/golden-ast/fixtures/*.jh` + `test-fixtures/golden-ast/expected/*.json` | Parse tree shape for successful parses — serialized to deterministic JSON with locations stripped (9 fixtures: e.g. imports, brace-if, log, match and match-multiline, params, prompt-capture, run-ensure, script-defs) | You changed the parser and need to verify the AST structure hasn't drifted; txtar tests only check pass/fail, goldens lock in the actual tree shape | | **Integration tests** | `integration/*.test.ts`, `integration/sample-build/*.test.ts` | Process-level integration behavior: signal handling, TTY rendering, run summary structure, sample builds | The test spans multiple modules or requires subprocess/PTY harnesses | diff --git a/src/transpile/validate-single-walk.test.ts b/src/transpile/validate-single-walk.test.ts new file mode 100644 index 00000000..e4cc4d73 --- /dev/null +++ b/src/transpile/validate-single-walk.test.ts @@ -0,0 +1,106 @@ +import { readFileSync } from "node:fs"; +import { resolve } from "node:path"; +import test from "node:test"; +import assert from "node:assert/strict"; + +// Compiled test sits at dist/src/transpile/; the source file is three levels +// up under src/transpile/. +const validatePath = resolve(__dirname, "../../../src/transpile/validate.ts"); + +/** + * AC1 — The three pre-pass helpers (`collectKnownVars`, + * `collectPromptSchemas`, `validateImmutableBindings`) have been replaced by a + * single workflow walk. None of those names should reappear in validate.ts — + * if they do, this test fails immediately. The grep is anchored on word + * boundaries so unrelated identifiers (e.g. a `validateImmutableBindingsFoo` + * variant) would still be flagged. + */ +test("AC1: pre-pass helpers are deleted from validate.ts", () => { + const text = readFileSync(validatePath, "utf8"); + const forbidden = [ + "collectKnownVars", + "collectPromptSchemas", + "validateImmutableBindings", + ]; + const offenders: string[] = []; + for (const name of forbidden) { + if (new RegExp(`\\b${name}\\b`).test(text)) { + offenders.push(name); + } + } + assert.deepEqual( + offenders, + [], + `forbidden helper names reappeared in validate.ts: ${offenders.join(", ")}`, + ); +}); + +/** + * AC2 — Exactly one recursive helper in validate.ts walks + * `WorkflowStepDef[]`. A "helper" is any top-level or nested + * function/arrow declaration whose parameter list mentions + * `WorkflowStepDef[]`; it is "recursive" if its body calls its own name. + * + * Before the refactor there were four such walkers (`collectKnownVars`'s + * inner walk, `validateImmutableBindings`'s inner walk, the workflow's + * `validateStep`, and the rule's `validateRuleStep`). After the refactor + * only the single `descend` inside `walkStepTree` should remain. + */ +test("AC2: at most one recursive helper walks WorkflowStepDef[] in validate.ts", () => { + const text = readFileSync(validatePath, "utf8"); + const helpers = findStepArrayHelpers(text); + const recursive = helpers.filter((h) => + new RegExp(`\\b${h.name}\\(`).test(h.body), + ); + assert.ok( + recursive.length <= 1, + `expected at most 1 recursive helper walking WorkflowStepDef[] in validate.ts, ` + + `found ${recursive.length}: ${recursive.map((h) => h.name).join(", ")}`, + ); +}); + +interface Helper { + name: string; + body: string; +} + +/** + * Locate every `function NAME(...)` or `const NAME = (...) => ...` declaration + * whose parameter list textually contains `WorkflowStepDef[]`, and return its + * name + body (text between the body's matching braces). Nested arrows count + * — that's how we catch a helper redeclared inside another function. + */ +function findStepArrayHelpers(text: string): Helper[] { + const out: Helper[] = []; + const declRe = /(?:^|\n)\s*(?:function\s+(\w+)\s*\(|(?:const|let)\s+(\w+)\s*=\s*(?:async\s*)?\()/g; + let match: RegExpExecArray | null; + while ((match = declRe.exec(text)) !== null) { + const name = match[1] ?? match[2]; + if (!name) continue; + const openParen = text.indexOf("(", match.index); + if (openParen < 0) continue; + const closeParen = findMatching(text, openParen, "(", ")"); + if (closeParen < 0) continue; + const params = text.slice(openParen, closeParen + 1); + if (!params.includes("WorkflowStepDef[]")) continue; + const bodyOpen = text.indexOf("{", closeParen); + if (bodyOpen < 0) continue; + const bodyClose = findMatching(text, bodyOpen, "{", "}"); + if (bodyClose < 0) continue; + out.push({ name, body: text.slice(bodyOpen + 1, bodyClose) }); + } + return out; +} + +function findMatching(text: string, openIdx: number, open: string, close: string): number { + let depth = 0; + for (let i = openIdx; i < text.length; i += 1) { + const ch = text[i]; + if (ch === open) depth += 1; + else if (ch === close) { + depth -= 1; + if (depth === 0) return i; + } + } + return -1; +} diff --git a/src/transpile/validate.ts b/src/transpile/validate.ts index 10e63ca1..9e6a989a 100644 --- a/src/transpile/validate.ts +++ b/src/transpile/validate.ts @@ -169,59 +169,69 @@ function validateMatchExpr(filePath: string, expr: MatchExprDef, knownVars: Set< } } -/** Collect all variable names defined in a step list (consts, captures, params). Flat walk — includes nested if/else blocks. */ -function collectKnownVars(steps: WorkflowStepDef[], envDecls?: { name: string }[], params?: string[]): Set { - const vars = new Set(); - if (envDecls) { - for (const d of envDecls) vars.add(d.name); - } - for (const p of params ?? []) { - vars.add(p); - } - const walk = (ss: WorkflowStepDef[]): void => { - for (const s of ss) { - if (s.type === "const") { - vars.add(s.name); - } - if (s.type === "exec" && s.captureName) { - vars.add(s.captureName); - } - if (s.type === "exec" && s.catch) { - const recoverSteps = "single" in s.catch ? [s.catch.single] : s.catch.block; - walk(recoverSteps); - } - if (s.type === "exec" && s.recover) { - const recoverSteps = "single" in s.recover ? [s.recover.single] : s.recover.block; - walk(recoverSteps); - } - if (s.type === "if") { - walk(s.body); - } - if (s.type === "for_lines") { - vars.add(s.iterVar); - walk(s.body); - } - } - }; - walk(steps); - return vars; +/** + * One step entry in the flat list built by the single workflow walk. + * + * `recoverBindings` is the `Set` of failure-binding names contributed by an + * enclosing `catch` / `recover`, threaded down so steps inside a recovery + * body can resolve `` as an in-scope identifier. + */ +interface FlatStepEntry { + step: WorkflowStepDef; + recoverBindings: Set | undefined; } -/** Validate that no immutable binding (param, const, capture) is redefined in the same scope. */ -function validateImmutableBindings( +/** + * Result of the single recursive descent over a workflow's / rule's step + * tree: the global identifier set (envDecls + params + every nested const / + * capture / for-iterator), the top-level prompt schemas, and a flat list of + * every step in tree order. The flat list is what the main validator loop + * iterates over — that loop is non-recursive, so the only recursive helper + * walking `WorkflowStepDef[]` in this file is `walkStepTree` itself. + * + * Replaces three prior pre-passes that each walked the same step tree with + * subtly different recursion rules. Immutable-binding rules are enforced + * inline during the descent so the failure order matches the prior + * "binding errors first, then per-step errors" behavior. + */ +interface StepTreeWalk { + knownVars: Set; + promptSchemas: Map; + flat: FlatStepEntry[]; +} + +function walkStepTree( filePath: string, steps: WorkflowStepDef[], + envDecls: { name: string; loc: { line: number; col: number } }[] | undefined, params: string[], declLoc: { line: number; col: number }, - envDecls?: { name: string; loc: { line: number; col: number } }[], - moduleScripts?: Set, -): void { - const bound = new Map(); + moduleScripts: Set, + parseSchemaFieldNames: (rawSchema: string) => string[], + options: { withPromptSchemas: boolean }, +): StepTreeWalk { + const knownVars = new Set(); + const promptSchemas = new Map(); + const flat: FlatStepEntry[] = []; + + if (envDecls) { + for (const d of envDecls) knownVars.add(d.name); + } + for (const p of params) { + knownVars.add(p); + } + + const seedBindings = new Map(); for (const p of params) { - bound.set(p, { kind: "parameter", line: declLoc.line }); + seedBindings.set(p, { kind: "parameter", line: declLoc.line }); } - const check = (name: string, kind: string, loc: { line: number; col: number }, b: Map): void => { + const checkBinding = ( + name: string, + kind: string, + loc: { line: number; col: number }, + b: Map, + ): void => { const prev = b.get(name); if (prev) { throw jaiphError( @@ -232,7 +242,7 @@ function validateImmutableBindings( `cannot rebind immutable name "${name}"; already bound as ${prev.kind} at ${filePath}:${prev.line}`, ); } - if (moduleScripts?.has(name)) { + if (moduleScripts.has(name)) { throw jaiphError( filePath, loc.line, @@ -244,28 +254,52 @@ function validateImmutableBindings( b.set(name, { kind, line: loc.line }); }; - const walk = (ss: WorkflowStepDef[], b: Map): void => { + const descend = ( + ss: WorkflowStepDef[], + bindings: Map, + recoverBindings: Set | undefined, + topLevel: boolean, + ): void => { for (const s of ss) { + flat.push({ step: s, recoverBindings }); + if (s.type === "const") { - check(s.name, "const", s.loc, b); - } - if (s.type === "exec" && s.captureName) { - const captureLoc = execBodyLoc(s.body) ?? s.loc; - check(s.captureName, "capture", captureLoc, b); - } - if (s.type === "exec" && s.catch) { - const recoverSteps = "single" in s.catch ? [s.catch.single] : s.catch.block; - walk(recoverSteps, b); + knownVars.add(s.name); + checkBinding(s.name, "const", s.loc, bindings); + if (options.withPromptSchemas && topLevel && s.value.kind === "prompt" && s.value.returns !== undefined) { + promptSchemas.set(s.name, parseSchemaFieldNames(s.value.returns)); + } + continue; } - if (s.type === "exec" && s.recover) { - const recoverSteps = "single" in s.recover ? [s.recover.single] : s.recover.block; - walk(recoverSteps, b); + + if (s.type === "exec") { + if (s.captureName) { + knownVars.add(s.captureName); + const captureLoc = execBodyLoc(s.body) ?? s.loc; + checkBinding(s.captureName, "capture", captureLoc, bindings); + if (options.withPromptSchemas && topLevel && s.body.kind === "prompt" && s.body.returns !== undefined) { + promptSchemas.set(s.captureName, parseSchemaFieldNames(s.body.returns)); + } + } + if (s.catch) { + const catchSteps = "single" in s.catch ? [s.catch.single] : s.catch.block; + descend(catchSteps, bindings, new Set([s.catch.bindings.failure]), false); + } + if (s.recover) { + const recoverSteps = "single" in s.recover ? [s.recover.single] : s.recover.block; + descend(recoverSteps, bindings, new Set([s.recover.bindings.failure]), false); + } + continue; } + if (s.type === "if") { - walk(s.body, b); + descend(s.body, bindings, recoverBindings, false); + continue; } + if (s.type === "for_lines") { - if (b.has(s.iterVar)) { + knownVars.add(s.iterVar); + if (bindings.has(s.iterVar)) { throw jaiphError( filePath, s.loc.line, @@ -274,13 +308,16 @@ function validateImmutableBindings( `for loop iterator "${s.iterVar}" conflicts with an existing binding`, ); } - const inner = new Map(b); + const inner = new Map(bindings); inner.set(s.iterVar, { kind: "loop_iterator", line: s.loc.line }); - walk(s.body, inner); + descend(s.body, inner, recoverBindings, false); + continue; } } }; - walk(steps, bound); + + descend(steps, seedBindings, undefined, true); + return { knownVars, promptSchemas, flat }; } /** Best-effort location for an exec body — used to attribute capture-binding errors. */ @@ -592,19 +629,6 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { return names; }; - const collectPromptSchemas = (steps: WorkflowStepDef[]): Map => { - const schemas = new Map(); - for (const s of steps) { - if (s.type === "exec" && s.captureName && s.body.kind === "prompt" && s.body.returns !== undefined) { - schemas.set(s.captureName, parseSchemaFieldNames(s.body.returns)); - } - if (s.type === "const" && s.value.kind === "prompt" && s.value.returns !== undefined) { - schemas.set(s.name, parseSchemaFieldNames(s.value.returns)); - } - } - return schemas; - }; - const validateDotFieldRefs = ( content: string, loc: { line: number; col: number }, @@ -852,8 +876,17 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { }; for (const rule of ast.rules) { - validateImmutableBindings(ast.filePath, rule.steps, rule.params, rule.loc, ast.envDecls, localScripts); - const ruleKnownVars = collectKnownVars(rule.steps, ast.envDecls, rule.params); + const ruleWalk = walkStepTree( + ast.filePath, + rule.steps, + ast.envDecls, + rule.params, + rule.loc, + localScripts, + parseSchemaFieldNames, + { withPromptSchemas: false }, + ); + const ruleKnownVars = ruleWalk.knownVars; const validateRuleStep = (s: WorkflowStepDef): void => { if (s.type === "trivia") return; if (s.type === "say") { @@ -910,14 +943,6 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { } } validateCallable(body, ruleKnownVars, "rule"); - if (s.catch) { - const steps = "single" in s.catch ? [s.catch.single] : s.catch.block; - for (const r of steps) validateRuleStep(r); - } - if (s.recover) { - const steps = "single" in s.recover ? [s.recover.single] : s.recover.block; - for (const r of steps) validateRuleStep(r); - } return; } if (s.type === "if") { @@ -926,7 +951,6 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { throw jaiphError(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", `invalid regex in if condition: /${s.operand.source}/`); } } - for (const bodyStep of s.body) validateRuleStep(bodyStep); return; } if (s.type === "for_lines") { @@ -936,14 +960,13 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { `for ... in : "${s.sourceVar}" is not a known variable in this scope`, ); } - for (const bodyStep of s.body) validateRuleStep(bodyStep); return; } const _never: never = s; return _never; }; - for (const st of rule.steps) { - validateRuleStep(st); + for (const entry of ruleWalk.flat) { + validateRuleStep(entry.step); } } @@ -986,9 +1009,18 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { } for (const workflow of ast.workflows) { - validateImmutableBindings(ast.filePath, workflow.steps, workflow.params, workflow.loc, ast.envDecls, localScripts); - const promptSchemas = collectPromptSchemas(workflow.steps); - const wfKnownVars = collectKnownVars(workflow.steps, ast.envDecls, workflow.params); + const wfWalk = walkStepTree( + ast.filePath, + workflow.steps, + ast.envDecls, + workflow.params, + workflow.loc, + localScripts, + parseSchemaFieldNames, + { withPromptSchemas: true }, + ); + const wfKnownVars = wfWalk.knownVars; + const promptSchemas = wfWalk.promptSchemas; const validateStep = (s: WorkflowStepDef, recoverBindings?: Set): void => { if (s.type === "trivia") return; @@ -1069,14 +1101,6 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { return; } validateCallable(body, wfKnownVars, "workflow", recoverBindings); - if (s.catch) { - const steps = "single" in s.catch ? [s.catch.single] : s.catch.block; - for (const r of steps) validateStep(r, new Set([s.catch.bindings.failure])); - } - if (s.recover) { - const steps = "single" in s.recover ? [s.recover.single] : s.recover.block; - for (const r of steps) validateStep(r, new Set([s.recover.bindings.failure])); - } return; } if (s.type === "if") { @@ -1085,7 +1109,6 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { throw jaiphError(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", `invalid regex in if condition: /${s.operand.source}/`); } } - for (const bodyStep of s.body) validateStep(bodyStep, recoverBindings); return; } if (s.type === "for_lines") { @@ -1095,15 +1118,14 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { `for ... in : "${s.sourceVar}" is not a known variable in this scope`, ); } - for (const bodyStep of s.body) validateStep(bodyStep, recoverBindings); return; } const _never: never = s; return _never; }; - for (const step of workflow.steps) { - validateStep(step); + for (const entry of wfWalk.flat) { + validateStep(entry.step, entry.recoverBindings); } } From 791b27d977d56cb76db0d0b6e9de771c898dd8e1 Mon Sep 17 00:00:00 2001 From: Jakub Dzikowski Date: Sat, 16 May 2026 15:49:15 +0200 Subject: [PATCH 10/14] Refactor: aggregate compile errors via Diagnostics collector Replace fail-fast `throw jaiphError(...)` in the validator with a new `Diagnostics` collector (`src/diagnostics.ts`) that accumulates every recoverable error per compile. `collectDiagnostics(graph)` walks the import closure and returns the populated collector; the legacy `validateReferences(graph)` is now a thin wrapper that throws the first sorted diagnostic so existing per-error tests and the script-emit path stay intact. Each top-level validation unit (per-import block, per-rule walk, per-rule step, per-workflow walk, per-workflow step, per-test-block step, per-channel route) is wrapped in `diag.capture(fn)` so a bailout from one error unwinds only that unit; the four leaf helpers (validate-ref-resolution, validate-string, validate-prompt-schema, shell-jaiph-guard) still throw and every caller captures them. `jaiph compile` routes through `collectDiagnostics`, prints the full sorted set (stderr lines or a single JSON array under `--json`), and exits non-zero on any non-empty set. New tests in `src/transpile/diagnostics-collector.test.ts` pin all five acceptance criteria, including a three-error fixture and a source-tree allowlist scan over remaining `throw jaiphError(` sites. Co-Authored-By: Claude Opus 4.7 (1M context) --- CHANGELOG.md | 1 + QUEUE.md | 27 -- docs/architecture.md | 5 +- docs/cli.md | 6 +- docs/contributing.md | 1 + src/cli/commands/compile.ts | 80 ++-- src/diagnostics.ts | 130 ++++++ src/transpile/diagnostics-collector.test.ts | 207 ++++++++++ src/transpile/validate.ts | 430 +++++++++++--------- 9 files changed, 635 insertions(+), 252 deletions(-) create mode 100644 src/diagnostics.ts create mode 100644 src/transpile/diagnostics-collector.test.ts diff --git a/CHANGELOG.md b/CHANGELOG.md index e6e02e0b..777345f7 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,6 @@ # Unreleased +- **Refactor — Replace fail-fast errors with a `Diagnostics` collector that aggregates every recoverable error per compile:** Today `fail()` (in `src/parse/core.ts`) and `jaiphError()` (in `src/errors.ts`) both throw on the first error, so a user fixed one error, recompiled, hit the next, recompiled, and so on. The validator also pre-ordered some checks defensively because it knew it would only get to surface one error per run. That model is replaced. A new `Diagnostics` class lives in `src/diagnostics.ts` and exposes `add(d)`, `error(file, line, col, code, message)` (records the diagnostic and short-circuits the current unit through a `BailoutError`), `capture(fn)` (runs `fn` and absorbs both `BailoutError` and any thrown legacy `jaiphError` whose message parses as `path:line:col CODE message` — turning the throw into a recoverable entry without re-throwing), `hasErrors()` / `hasFatal()`, `sorted()` (stable order by file, then line, then column), `formatLines()` (one `path:line:col CODE message` per line), and a legacy `throwFirstIfAny()` bridge that throws the first sorted diagnostic via `jaiphError` so existing single-error call sites and per-error tests are unchanged. `src/transpile/validate.ts` exposes a new `collectDiagnostics(graph): Diagnostics` entry that walks the import closure and never throws on user-level errors; the previous `validateReferences(graph)` is now a thin wrapper that calls `collectDiagnostics` and then `throwFirstIfAny()`, preserving the throw-on-first contract for `emitScriptsForModuleFromGraph` / `buildScriptsFromGraph` and for every existing `parse-*.test.ts` / `validate-*.test.ts` fixture that asserts one specific `{ message, line, col, code }`. Inside `validate.ts` every `throw jaiphError(...)` site at user-level (~50 sites across import resolution, channel-route validation, per-rule and per-workflow step walks, prompt schema checks, and `validateTestBlocks`) is migrated to `diag.error(...)`; each top-level unit is wrapped in `diag.capture(...)` (per-import block, per-channel route, per-rule walk, per-rule step, per-workflow walk, per-workflow step, per-test-block step) so the bailout from one error unwinds only that unit and the next sibling still runs. The four leaf validation helpers (`validate-ref-resolution.ts`, `validate-string.ts`, `validate-prompt-schema.ts`, `shell-jaiph-guard.ts`) still throw via `jaiphError`, but every caller wraps them in `diag.capture(...)`, which converts the thrown error into a recoverable diagnostic and returns. The CLI command `jaiph compile` (`src/cli/commands/compile.ts`) is rewritten to route through `collectDiagnostics`: it accumulates every error from every entry's import closure, sorts them by `(file, line, col)`, and prints the full set — as a single JSON array on stdout under `--json`, or as one `path:line:col CODE message` line per diagnostic on stderr otherwise — exiting **1** on any non-empty diagnostic set. Fatal aborts during graph load or parsing (unterminated triple-quote, unterminated brace block, missing imports during graph build) are reported as a single diagnostic for the affected entry; the command then continues with the next entry. New tests in `src/transpile/diagnostics-collector.test.ts` pin the invariants: a fixture with three independent errors (duplicate import alias, undefined channel, unknown `run` target in one workflow body) asserts `collectDiagnostics(graph)` returns all three in source order (AC1); a source-tree scan asserts `validate.ts` holds **zero** `throw jaiphError(` sites and **≥40** `diag.error(` sites, and that every remaining `throw jaiphError(` under `src/` lives in the documented fatal allowlist — `src/diagnostics.ts` (legacy bridge), `src/parse/core.ts` (parser `fail()`), `src/cli/commands/test.ts` (test-file shape fatal), `src/transpile/module-graph.ts` (loader), `src/transpile/validate-string.ts`, `src/transpile/validate-prompt-schema.ts`, `src/transpile/validate-ref-resolution.ts`, `src/transpile/shell-jaiph-guard.ts` (leaf helpers, each captured) (AC3); and a CLI test runs `jaiph compile --json` against the same fixture and asserts the returned array has all three diagnostics and `status !== 0` (AC4). Existing single-error tests (every `parse-*.test.ts` and `validate-*.test.ts` that pins one specific `{ message, line, col, code }`) still pass because `validateReferences` continues to throw the first sorted diagnostic (AC2); `npm test` and `npm run build` pass (AC5). User-visible contracts on the `jaiph run` / `jaiph test` paths — banner, hooks, run artifacts, exit codes, `__JAIPH_EVENT__` streaming, and golden corpus — are unchanged. Out of scope: changing what counts as an error (this refactor only changes the *how*); LSP integration follows in a separate task. Docs updated in `docs/architecture.md` (new **Diagnostics collector (recoverable errors)** bullet under **Validator**; updated **System overview** to describe the two entry points and the new `jaiph compile` behavior), `docs/cli.md` (new **Multiple-error reporting** paragraph and refined **`--json`** description under **`jaiph compile`**), and `docs/contributing.md` (new **Diagnostics collector shape** row in the test-layer table). Implements `design/2026-05-15-parser-compiler-simplification.md` § Appendix B. - **Refactor — Fold the validator's three workflow pre-passes into a single step-tree walk:** `src/transpile/validate.ts` used to descend each workflow's / rule's step tree four times before its main check loop finished — `collectKnownVars`, `collectPromptSchemas`, `validateImmutableBindings`, and the per-step validator itself — each re-implementing the same recursion over `if` / `for_lines` / `catch` / `recover` with subtly different rules, so "what counts as a binding here" fixes had to land in two or three walkers. The three pre-pass helpers are deleted. One new helper `walkStepTree(filePath, steps, envDecls, params, declLoc, moduleScripts, parseSchemaFieldNames, { withPromptSchemas })` descends the tree once and returns `{ knownVars, promptSchemas, flat }`: it accumulates `knownVars` (env decls + params + every nested `const` / capture / `for_lines` iterator), `promptSchemas` (top-level prompt-returning bindings — workflow walks set `withPromptSchemas: true`, rule walks set it `false`), enforces immutable-binding and `script`-collision rules inline through a shared `bindings` map (with a fresh inner map under each `for_lines` body so loop iterators only shadow inside the body), and emits a flat `FlatStepEntry[]` of every step in tree order with the enclosing `catch` / `recover` failure binding (`recoverBindings: Set | undefined`) attached. The per-workflow and per-rule validator loops now iterate that flat list non-recursively — the `if` / `for_lines` / `catch` / `recover` recursion that used to live inside `validateStep` / `validateRuleStep` is gone. `walkStepTree`'s internal `descend` is the only recursive helper in the file that takes a `WorkflowStepDef[]`. Failure order matches the prior "binding errors first, then per-step errors" behavior because binding checks fire during the descent, before any flat-list iteration starts. Every existing `E_VALIDATE` error message and location is preserved bit-for-bit: the full `validate-*.test.ts` suite, `src/transpile/compiler-golden.test.ts`, `src/transpile/compiler-edge.acceptance.test.ts`, and the txtar / golden-AST corpora all pass unchanged. New tests pin the invariants: `src/transpile/validate-single-walk.test.ts` greps `validate.ts` and fails if any of `collectKnownVars`, `collectPromptSchemas`, or `validateImmutableBindings` reappear by name (AC1), and a textual AST scan asserts that at most one recursive helper whose parameter list mentions `WorkflowStepDef[]` exists in the file (AC2). User-visible contracts — CLI behavior, `jaiph format` round-trip, run artifacts, banner, hooks, exit codes, `__JAIPH_EVENT__` streaming, and the full golden corpus — are unchanged. Out of scope: the visitor-table refactor (Refactor 4) and any change to validation rules. Docs updated in `docs/architecture.md` (new **Single workflow walk** bullet under **Validator**) and `docs/contributing.md` (new **Validator single-walk shape** row in the test-layer table). Implements `design/2026-05-15-parser-compiler-simplification.md` § Appendix C. - **Refactor — Collapse the AST around a single `Expr` type, eliminating the three "managed call" encodings:** The same concept "a managed call that yields a value" used to be encoded three different ways: as a statement (`{ type: "run", workflow, args }`), as a const RHS (`{ kind: "run_capture", ref, args }`), and as a `managed:` sidecar on `return` / `log` / `logerr` whose `value` / `message` carried a placeholder string (`"__match__"`, `"run inline_script"`, etc.). Inline scripts added a fourth (`run_inline_script_capture`); `prompt`, `match`, and `ensure` captures repeated the same dual representation. The validator, formatter, emitter, and runtime each had to handle both branches at every site. All three encodings are gone. The semantic AST now has a single `Expr` tagged union — `literal | call | ensure_call | inline_script | prompt | match | shell | bare_ref` — used everywhere a value can appear: `const name = `, `return `, `send channel <- `, the message of `log` / `logerr` / `fail`, and the body of an `exec` step (the new statement-form managed call, where the value is consumed for its side effects plus optional capture). `ConstRhs` and `SendRhsDef` are deleted as separate types. The `managed:` sidecar field is deleted from `WorkflowStepDef`. The placeholder strings `"__match__"`, `"run inline_script"`, and `"__JAIPH_MANAGED__"` no longer appear anywhere under `src/`. `WorkflowStepDef` collapses from 14 variants to **8** (`exec`, `const`, `return`, `send`, `say`, `if`, `for_lines`, `trivia`): `exec` is the new managed-statement form covering the prior `run` / `ensure` / `run_inline_script` / `prompt` / `shell` / standalone `match` cases (the discriminator now lives inside `body.kind`, with `captureName` / `catch` / `recover` as step-level attributes); `say` covers the prior `log` / `logerr` / `fail` cases (`level: "fail"` aborts the workflow with the message, otherwise the message is written to the corresponding stream); `comment` / `blank_line` collapse into a single `trivia` variant (formatter-only, skipped by validator and runtime). The parser builds `Expr` nodes directly: `parseConstRhs` returns `{ value: Expr }`; `parseSendRhs` returns `{ value: Expr }`; `parsePromptStep` returns an `exec` step whose `body` is an `Expr.prompt`; `return run …` / `return ensure …` / `return match …` / `return run \`…\`(…)` build `Expr.call` / `Expr.ensure_call` / `Expr.match` / `Expr.inline_script` directly with no sidecar; `log run \`…\`(…)` and `logerr run \`…\`(…)` build `say` steps whose `message` is an `Expr.inline_script`. Downstream consumers compress accordingly: the validator switches on the 8-variant `WorkflowStepDef.type` and the 8-kind `Expr.kind` with no "literal value vs managed sidecar" fork; the formatter renders each `Expr` through one `emitExpr` helper instead of branching on a sidecar; the runtime has one private `evaluateExpr(scope, expr, …)` dispatcher that `const` / `return` / `send` / `say` / `exec` all delegate to (which runs the managed call for `call` / `ensure_call` / `inline_script`, walks `match` arms, schema-checks `prompt`, and interpolates `literal` via `interpolateWithCaptures`); the script-emit walk in `src/transpile/emit-script.ts` finds inline-script bodies by recursing into each step's `Expr` payload rather than enumerating the four legacy carriers. New tests pin the invariants: `src/types-shape.test.ts` is a compile-time exhaustive `switch` plus runtime tuple assertion that `WorkflowStepDef` has exactly **8** variants and `Expr` has exactly **8** kinds (AC2), a `grep` over every non-test `.ts` file under `src/` that fails if any of the placeholder strings (`"__match__"`, `"run inline_script"`, `"__JAIPH_MANAGED__"`) reappear (AC1), and an export-surface check that fails if `ConstRhs` or `SendRhsDef` are re-exported from `src/types.ts` (AC3). Updated parser tests in `src/parse/parse-return.test.ts`, `src/parse/parse-const-rhs.test.ts`, `src/parse/parse-prompt.test.ts`, `src/parse/parse-send-rhs.test.ts`, `src/parse/parse-steps.test.ts`, `src/parse/parse-inline-script.test.ts`, and `src/parse/parse-bare-call.test.ts` assert the new `Expr` shape directly for `return run …`, `return ensure …`, `return match … { … }`, `return run \`…\`(…)`, `log run \`…\`(…)`, and `const x = prompt …` (AC4). The golden corpus (`src/transpile/compiler-golden.test.ts`, `src/transpile/compiler-edge.acceptance.test.ts`) passes byte-for-byte against the emitted bash output; `src/format/roundtrip.test.ts` round-trips bit-for-bit on every fixture; `npm run build` passes with zero TypeScript strict-mode errors (AC5 / AC6). Golden AST fixtures under `test-fixtures/golden-ast/expected/` are regenerated to reflect the new step shapes (`exec` wrapping every managed call, `say` replacing `log` / `logerr` / `fail`, `trivia` replacing `comment` / `blank_line`, `Expr` value/message/body payloads). User-visible contracts — CLI behavior, `jaiph format` round-trip, run artifacts, banner, hooks, exit codes, `__JAIPH_EVENT__` streaming, and the full golden corpus — are unchanged byte-for-byte. Out of scope: surface syntax, the validator's deeper structural rewrite (Refactor 4), and parser internals (Refactors 1 & 2). Docs updated in `docs/architecture.md` (rewrote the **AST / Types** bullet to describe the single `Expr` sum and the 8-variant `WorkflowStepDef`; updated **Validator**, **Formatter**, **Node Workflow Runtime**, and **Trivia / CST layer** bullets to drop the dual-representation language; rewrote the `match_expr` mention in **CLI progress reporting pipeline** to use `Expr.kind === "match"`) and `docs/contributing.md` (new **`Expr` / step-variant shape** row in the test-layer table). Implements `design/2026-05-15-parser-compiler-simplification.md` § Refactor 3. - **Refactor — Collapse `bareIdentifierArgs` into a typed `Arg[]` on every call site:** Every call-bearing AST node used to carry the call arguments twice — `args: string` (the raw source between the parens) and `bareIdentifierArgs?: string[]` (a re-parse of which of those arguments happened to be bare identifiers). The validator had to remember to check both fields and call a hand-rolled `validateBareIdentifierArgs` helper at every site; the emitter re-parsed `args` from scratch because it didn't trust either field on its own. Both fields are gone. The parser now classifies each argument once, at parse time, into a new typed sum `type Arg = { kind: "literal"; raw: string } | { kind: "var"; name: string }` and stores it on every call-bearing node as `args?: Arg[]`. Affected nodes: `run` / `ensure` workflow steps, `run_inline_script` steps, the `managed` sidecar on `return` / `log` / `logerr` (in all four shapes — `run`, `ensure`, `run_inline_script`, `match`), the `run_capture` / `ensure_capture` / `run_inline_script_capture` const RHS variants, and the `run` send RHS. Downstream consumers walk the typed list directly: the validator's per-call check sequence is now arity (`args.length`), shell-redirection rejection on `literal` raws, nested-unmanaged-call rejection on `literal` raws, ref resolution, and `var`-arg resolution against in-scope bindings via a new `validateArgVarRefs` (the standalone `validateBareIdentifierArgs` helper is deleted); the formatter renders each `Arg` directly (`var` → bare name, `literal` → raw) instead of re-tokenizing a `${ident}`-rewritten string; the runtime turns `Arg[]` back into the space-separated argv string via `argsToRuntimeString` in `src/parse/core.ts` (`var` → `${name}`, `literal` → raw) so the existing handle-resolution / interpolation path is unchanged. New tests pin the invariants: `src/parse/arg-ast-shape.test.ts` is a compile-time assertion that `bareIdentifierArgs` does not appear on `WorkflowStepDef` (`ensure`, `run`, `run_inline_script`, `log.managed`, `logerr.managed`, `return.managed` in `run` / `ensure` / `run_inline_script` shapes), `ConstRhs` (`run_capture`, `ensure_capture`, `run_inline_script_capture`), or the `run` `SendRhsDef` variant (AC1); `src/parse/arg-grep.test.ts` walks every non-test `.ts` under `src/parse/` and `src/transpile/` and fails if any production file matches `args.split(",")` or the bare token `bareIdentifierArgs` (AC2), and separately fails if any file under `src/transpile/` references `validateBareIdentifierArgs` (AC3). The golden compiler corpus, `validate-*.test.ts` files, and the golden AST corpus pass byte-for-byte (AC4); `npm run build` passes with zero TypeScript strict-mode errors (AC5). User-visible contracts — CLI behavior, `jaiph format` round-trip, run artifacts, banner, hooks, exit codes, `__JAIPH_EVENT__` streaming, and the full golden corpus — are unchanged. Out of scope: the full `Expr` collapse (next task) and surface syntax. Docs updated in `docs/architecture.md` (extended **AST / Types** bullet documenting the typed `Arg` sum; updated **Validator** and **Formatter** bullets to drop the dual representation), `docs/contributing.md` (new **Call-args AST shape** row in the test-layer table), and `docs/spec-async-handles.md` (replaces the stale `commaArgsToSpaced` reference with `argsToRuntimeString`). Implements `design/2026-05-15-parser-compiler-simplification.md` § Appendix D. diff --git a/QUEUE.md b/QUEUE.md index 5d70b70c..49b066d8 100644 --- a/QUEUE.md +++ b/QUEUE.md @@ -13,33 +13,6 @@ Process rules: *** -## Replace fail-fast errors with a Diagnostics collector that aggregates per compile #dev-ready - -**Design reference:** `design/2026-05-15-parser-compiler-simplification.md` § Appendix B. - -**Why:** Today `fail()` (in `src/parse/core.ts`) and `jaiphError()` (in `src/errors.ts`) both throw on the first error. Users fix one error, recompile, fix the next, recompile. The validator also pre-orders some checks defensively because it knows it will only get to surface one error. A diagnostics collector lets the parser and validator append errors and the run report the full set at the end. - -**Scope:** - -- Introduce `class Diagnostics { errors: JaiphDiagnostic[]; add(...); hasFatal(): boolean; report(): never | void }` (or equivalent). -- Parser and validator append diagnostics instead of throwing for non-fatal errors. A "fatal" tier remains for cases where continuing would produce garbage AST (unterminated triple-quote, unterminated brace block). -- At the end of a compile, `Diagnostics.report()` either prints all collected errors sorted by file/line and exits non-zero, or returns cleanly. The CLI surfaces the full set instead of just the first. -- Existing call sites of `fail()` / `jaiphError()` migrate to `diagnostics.add(...)` where the error is recoverable. - -**Acceptance criteria** (each verified by a test): - -1. A fixture containing **N ≥ 3 independent errors** (e.g. an undefined channel, a duplicate import alias, and an unknown ref in a `run` call) reports all N errors in one compile, not just the first. Add a test that asserts the full set is reported in source order. -2. The existing single-error tests still pass: every `parse-*.test.ts` and `validate-*.test.ts` fixture that asserts a specific `{ message, line, col, code }` still gets exactly that error (now the only one in `Diagnostics`). -3. `fail()` and `jaiphError()` throwing call-sites are reduced to a documented "fatal" subset (count it in the test). Non-fatal call-sites use the collector. -4. CLI exit code on any non-empty `Diagnostics` is non-zero. Add an `e2e` or CLI test. -5. `npm test` and `npm run build` pass. - -**Out of scope:** changing what counts as an error (the *what*) — this refactor only changes the *how*. LSP integration (a follow-up). - -**Dependency:** None hard, but cheapest to do immediately before the visitor-table validator refactor (next task), since the new visitor's per-step entry/exit is the natural place to plug in the collector. - -*** - ## Replace the 1,441-line validator switch with a per-step visitor table indexed by scope #dev-ready **Design reference:** `design/2026-05-15-parser-compiler-simplification.md` § Refactor 4. diff --git a/docs/architecture.md b/docs/architecture.md index 698c23c5..f9033424 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -20,7 +20,7 @@ For **how to contribute** — branches, test layers, E2E assertion policy, and b Workflow authors write `.jh` / `.test.jh` modules. The toolchain turns those files into **validated** modules plus **extracted script files**, then the **same AST interpreter** runs workflows whether you use local `jaiph run`, Docker, or `jaiph test`. 1. Parse source into AST. Every CLI path walks the entry plus its transitive `.jh` import closure **once** through **`loadModuleGraph`** (`src/transpile/module-graph.ts`) and reuses that **`ModuleGraph`** for the banner (`metadataToConfig`), validation (**`validateReferences(graph)`**), script-body extraction (**`buildScriptsFromGraph`**), and — across the parent → child process boundary on the default local `jaiph run` — for **`buildRuntimeGraph(graph)`** in the spawned runner (see [Local module graph](#local-module-graph) and the sequence diagram below). `parsejaiph(source, filePath)` is I/O-pure; `validate` and `emit` operate entirely on the in-memory graph and never re-read `.jh` files. The only fs entry point that reads `.jh` sources is `loadModuleGraph`. -2. **Compile-time** validation (`validateReferences(graph)`, invoked from **`emitScriptsForModuleFromGraph`** / **`buildScriptsFromGraph()`**) runs before script extraction. The validator consumes the in-memory graph; imported ASTs are looked up by absolute path and never re-read from disk. The **`jaiph compile`** command walks the same import closure but runs **`validateReferences` only**: it builds a graph per entry, validates it, and **does not** emit **`scripts/`**, **does not** invoke **`buildRuntimeGraph()`**, and never spawns the workflow runner (`src/cli/commands/compile.ts`). For a **directory** argument it discovers `*.jh` via `walkjhFiles`, which **skips** `*.test.jh`; to validate a test module, pass that file explicitly. Imported modules in the closure are still validated recursively either way. +2. **Compile-time** validation runs before script extraction. The validator consumes the in-memory graph; imported ASTs are looked up by absolute path and never re-read from disk. Two entry points share the same per-module walk: `validateReferences(graph)` is the legacy throwing form (used by `emitScriptsForModuleFromGraph` / `buildScriptsFromGraph()` so the existing single-error path stays intact), and `collectDiagnostics(graph)` returns a populated `Diagnostics` collector (`src/diagnostics.ts`) with **every** recoverable error from every reachable module. The **`jaiph compile`** command walks the same import closure but routes through `collectDiagnostics`: it builds a graph per entry, collects diagnostics, prints them all (sorted by file/line/col, in `path:line:col CODE message` form on stderr — or as a single JSON array on stdout with `--json`), and exits non-zero if any diagnostic was collected. It **does not** emit **`scripts/`**, **does not** invoke **`buildRuntimeGraph()`**, and never spawns the workflow runner (`src/cli/commands/compile.ts`). For a **directory** argument it discovers `*.jh` via `walkjhFiles`, which **skips** `*.test.jh`; to validate a test module, pass that file explicitly. Imported modules in the closure are still validated recursively either way. 3. **CLI** (`dist/src/cli.js` via npm, or a **Bun-compiled** `dist/jaiph` binary) prepares script executables (scripts-only), then spawns a **detached child** that loads **`node-workflow-runner.js`**. That child calls `buildRuntimeGraph()` and runs **`NodeWorkflowRuntime`**. The child’s interpreter is **`process.execPath`** of the CLI process (Node when you run `node dist/src/cli.js`, the standalone Bun binary when you run `dist/jaiph`). Script steps execute as managed subprocesses; prompt, inbox I/O, and event/summary emission are handled by the kernel under `src/runtime/kernel/`. 4. Stream live events to the CLI and persist durable run artifacts. @@ -52,6 +52,7 @@ All orchestration — local `jaiph run`, `jaiph test`, and **Docker `jaiph run`* - **Validator (`src/transpile/validate.ts`)** - Resolves imports and symbol references; emits deterministic compile-time errors. Import resolution (`resolveImportPath` in `transpile/resolve.ts`) checks relative paths first, then falls back to project-scoped libraries under `/.jaiph/libs/` — the workspace root is threaded through all compilation call sites. Export visibility is enforced by `validateRef` in `validate-ref-resolution.ts`: if an imported module declares any `export`, only exported names are reachable through the import alias. + - **Diagnostics collector (recoverable errors).** The validator no longer fails fast on the first user-level error. Every recoverable check appends to a `Diagnostics` collector (`src/diagnostics.ts`) via `diag.error(file, line, col, code, msg)`, which records a `JaiphDiagnostic` and short-circuits the current validation unit through a `BailoutError`. Each top-level unit (per-import block, per-rule walk, per-rule step, per-workflow walk, per-workflow step, per-test-block step, per-channel route) is wrapped in `diag.capture(fn)`, which absorbs the bailout (and any thrown `jaiphError` from leaf helpers like `validate-ref-resolution.ts` / `validate-string.ts` / `validate-prompt-schema.ts` / `shell-jaiph-guard.ts`) so the next sibling unit still runs. `collectDiagnostics(graph)` walks every module and returns the populated collector; the legacy `validateReferences(graph)` is now a thin wrapper that throws the first sorted diagnostic via `jaiphError` so existing per-error tests and the `emitScriptsForModuleFromGraph` path keep working unchanged. `Diagnostics.sorted()` returns errors ordered by `(file, line, col)`; `formatLines()` renders the standard `path:line:col CODE message` shape. A grep test (`src/transpile/diagnostics-collector.test.ts`) pins the migration: `validate.ts` holds **zero** `throw jaiphError(` sites, and the remaining `throw jaiphError(` call sites under `src/` are confined to a documented allowlist — fatal aborts in the parser (`src/parse/core.ts`), the loader (`src/transpile/module-graph.ts`), and the test-file shape check (`src/cli/commands/test.ts`); the legacy bridge in `src/diagnostics.ts`; and the four leaf validation helpers above, each of which has every caller wrapped in `diag.capture(...)`. - The validator drives off `WorkflowStepDef.type` (8 variants) and `Expr.kind` (8 variants). For every value-bearing step (`const` / `return` / `send` / `say`) and for the body of every `exec` step, a single `validateExpr(expr, ...)` dispatcher handles the value: it routes `call` / `ensure_call` / `inline_script` to call-site validation, walks `match` arms, schema-checks `prompt`, and runs the substitution scanner on `literal` raws. There is no dual code path for "managed sidecar vs literal value" — that branch is gone. - **Single workflow walk.** Each workflow / rule has its step tree descended exactly once by `walkStepTree`, which simultaneously accumulates `knownVars` (env decls + params + every nested `const` / capture / `for_lines` iterator), `promptSchemas` (top-level prompt-returning bindings, gated by `options.withPromptSchemas` so rules skip schema collection), enforces immutable-binding / `script`-collision rules inline (mutating a shared `bindings` map and threading a fresh inner map under each `for_lines` so loop iterators only shadow inside the body), and emits a flat `FlatStepEntry[]` of every step in tree order with the enclosing `catch` / `recover` failure binding attached. The main per-step validator loop iterates that flat list non-recursively, so `walkStepTree`'s internal `descend` is the **only** recursive helper in the file that takes a `WorkflowStepDef[]`. A pair of grep / AST tests (`src/transpile/validate-single-walk.test.ts`) pins both invariants: the prior helpers (`collectKnownVars`, `collectPromptSchemas`, `validateImmutableBindings`) cannot reappear by name, and at most one recursive `WorkflowStepDef[]` walker may live in `validate.ts`. - Per call site the validator runs five checks against the typed **`Arg[]`** directly — shell-redirection rejection (only `literal` args are scanned), nested-unmanaged-call rejection inside `literal` raws, ref resolution, arity (`args.length` vs declared params), and `var`-arg resolution against in-scope bindings via `validateArgVarRefs`. There is no longer a separate `validateBareIdentifierArgs` helper, and no place re-parses an `args: string` payload by splitting on commas or rescanning quotes. @@ -308,7 +309,7 @@ sequenceDiagram ## Summary - `.jh` / `*.test.jh` share parser/AST. The pipeline is **`loadModuleGraph` → `validateReferences(graph)` → `emit(graph, outDir)`**; `parsejaiph` is I/O-pure and `validate` / `emit` operate entirely in-memory. **`buildRuntimeGraph`** consumes the same `ModuleGraph` (loaded in the runner from disk or — on the default local **`jaiph run`** path — deserialized from the parent CLI's graph file via **`JAIPH_MODULE_GRAPH_FILE`**; see [Local module graph](#local-module-graph)). -- **`jaiph compile`** walks import closures with **`validateReferences` only**, and exits — no **`scripts/`** emission (**no **`buildScriptFiles`** / **`buildScripts`**), no **`buildRuntimeGraph()`**, no runner spawn. Directory discovery omits **`*.test.jh`** unless you pass a test file explicitly. +- **`jaiph compile`** walks import closures through **`collectDiagnostics(graph)`** (the multi-error sibling of **`validateReferences`**), prints the full diagnostic set sorted by `(file, line, col)`, and exits non-zero on any non-empty set — no **`scripts/`** emission (**no **`buildScriptFiles`** / **`buildScripts`**), no **`buildRuntimeGraph()`**, no runner spawn. Directory discovery omits **`*.test.jh`** unless you pass a test file explicitly. - **Node-only runtime:** all execution — local `jaiph run`, Docker `jaiph run`, and `jaiph test` — goes through `NodeWorkflowRuntime`. Docker containers run `node-workflow-runner` with the compiled JS tree and scripts mounted, using the same semantics as local execution. - **CLI** owns launch, observation, hooks (except **`jaiph run --raw`**), and runtime preparation (`buildScripts`). **`jaiph run --raw`** still emits **`__JAIPH_EVENT__`** on stderr from the runtime; the CLI does not attach the interactive progress/hooks pipeline. **`jaiph test`** passes **`suppressLiveEvents: true`** into **`NodeWorkflowRuntime`** so **`RuntimeEventEmitter`** skips writing those live stderr lines while **`run_summary.jsonl`** still records workflow traffic where the emitter appends it. - Workflow execution runs in **`NodeWorkflowRuntime`**, with **script steps** as managed subprocesses. diff --git a/docs/cli.md b/docs/cli.md index 658f8d8b..7f48b793 100644 --- a/docs/cli.md +++ b/docs/cli.md @@ -276,7 +276,7 @@ jaiph test e2e/say_hello.test.jh ## `jaiph compile` -Parse modules and run **`validateReferences`** (the same compile-time checks as before `jaiph run`) **without** writing `scripts/`, **without** calling **`buildRuntimeGraph`**, and **without** spawning the workflow runner. Use this for CI gates, pre-commit hooks, or editor diagnostics. +Parse modules and run the same compile-time validation as before `jaiph run` **without** writing `scripts/`, **without** calling **`buildRuntimeGraph`**, and **without** spawning the workflow runner. Use this for CI gates, pre-commit hooks, or editor diagnostics. ```bash jaiph compile [--json] [--workspace ] ... @@ -288,9 +288,11 @@ At least one path is required. **`jaiph compile -h`** or **`jaiph compile --help **Directory arguments** — The tree is scanned for `*.jh` files whose basename is **not** `*.test.jh` (same rule as `walkjhFiles` in the transpiler: files like `foo.test.jh` are skipped). Each non-test `*.jh` under the tree is treated as an entrypoint and its closure merged into the same validation set. To validate a test module’s graph explicitly, pass that **`*.test.jh` file** as a path (directories never pick up `*.test.jh` as roots). +**Multiple-error reporting.** `jaiph compile` aggregates **all** recoverable validation errors across the import closure before exiting, rather than stopping at the first failure. Internally it calls **`collectDiagnostics(graph)`** (`src/transpile/validate.ts`), which walks every reachable module and returns a `Diagnostics` collector (`src/diagnostics.ts`) populated with every error the validator accumulated through `diag.error(...)` and `diag.capture(...)`. Output is sorted by `(file, line, col)` so a single compile cycle surfaces independent errors together — for example, a duplicate `import` alias on line 2, an undefined channel in a `send` on line 6, and an unknown `run` target on line 7 all appear in one report. **Fatal** errors (parser failures like an unterminated triple-quote, loader failures, etc.) still abort the closure for the affected entry — `jaiph compile` reports them as a single diagnostic for that entry and continues with the next entry. Any non-empty diagnostic set exits **1**. + **Flags:** -- **`--json`** — On success, print `[]` to stdout. On failure, print one JSON **array** of objects `{ "file", "line", "col", "code", "message" }` to stdout and exit **1** (non-JSON errors use a synthetic `E_COMPILE` object when the message is not in `file:line:col CODE …` form). +- **`--json`** — On success, print `[]` to stdout. On failure, print **one** JSON **array** containing every collected diagnostic — objects `{ "file", "line", "col", "code", "message" }` — to stdout and exit **1** (non-JSON errors use a synthetic `E_COMPILE` object when the message is not in `file:line:col CODE …` form). Without `--json`, the same set is written to **stderr** as one `path:line:col CODE message` line per diagnostic, in the same sorted order. - **`--workspace `** — Override the workspace root used for **library import resolution** (`/.jaiph/libs/`, etc.) for **all** modules reached from the given paths. When omitted, the workspace is **auto-detected** from each path’s location (`detectWorkspaceRoot` — same algorithm as `jaiph run`, starting from the file’s directory or from a directory argument). ## `jaiph format` diff --git a/docs/contributing.md b/docs/contributing.md index 60e0d8f3..8c5e9e6e 100644 --- a/docs/contributing.md +++ b/docs/contributing.md @@ -106,6 +106,7 @@ Jaiph uses several test layers. Each layer catches a different class of bug. Use | **Call-args AST shape** | `src/parse/arg-ast-shape.test.ts`, `src/parse/arg-grep.test.ts` | Pins the typed-`Arg[]` invariant: no `bareIdentifierArgs` field on any call-bearing AST type (compile-time), no `args.split(",")` or `bareIdentifierArgs` text in production `src/parse/` or `src/transpile/` sources, and no `validateBareIdentifierArgs` helper in the validator | You changed how call arguments flow through the parser, validator, or emitter and need to confirm nothing re-introduces the parallel raw-string representation (see [Architecture — AST / Types](architecture.md#core-components)) | | **`Expr` / step-variant shape** | `src/types-shape.test.ts` | Pins the collapsed-AST invariants from the `Expr` refactor: exactly 8 `WorkflowStepDef` variants and exactly 8 `Expr` kinds (compile-time exhaustive switch + runtime tuple count), no AST placeholder strings (`"__match__"`, `"run inline_script"`, `"__JAIPH_MANAGED__"`) anywhere under `src/`, and `ConstRhs` / `SendRhsDef` no longer exported from `src/types.ts` | You added or renamed a step variant or `Expr` kind, or reshuffled how value positions are encoded; rerun this test to confirm nothing re-introduces the three-way "managed call" encoding (see [Architecture — AST / Types](architecture.md#core-components)) | | **Validator single-walk shape** | `src/transpile/validate-single-walk.test.ts` | Pins the validator's "one descent per workflow / rule" invariant: a name-grep fails if `collectKnownVars`, `collectPromptSchemas`, or `validateImmutableBindings` reappear as separate helpers in `validate.ts`, and a textual AST scan fails if more than one recursive helper walking `WorkflowStepDef[]` lives in that file | You touched `walkStepTree`, added a new pre-pass over workflow steps, or restructured how the validator accumulates `knownVars` / `promptSchemas` / `bindings` — rerun this test to confirm the step tree is still descended exactly once (see [Architecture — Validator](architecture.md#core-components)) | +| **Diagnostics collector shape** | `src/transpile/diagnostics-collector.test.ts` | Pins the migration from fail-fast `throw jaiphError(...)` to the `Diagnostics` collector: a fixture with three independent errors (duplicate import alias, undefined channel, unknown `run` target) asserts that `collectDiagnostics(graph)` returns **all three** in source order; a source grep asserts `validate.ts` holds **zero** `throw jaiphError(` sites and many `diag.error(` sites; an allowlist scan over every non-test `*.ts` under `src/` rejects new `throw jaiphError(` sites outside the documented fatal subset (parser `fail()`, loader, test-file shape check, legacy bridge, four leaf helpers wrapped in `diag.capture(...)`); a CLI test asserts `jaiph compile --json` returns the full diagnostic array and exits non-zero | You added a new `throw jaiphError(...)` site, migrated more checks to the collector, changed the fatal/recoverable boundary, or changed `jaiph compile`'s exit-code or output shape (see [Architecture — Validator](architecture.md#core-components) and [CLI](cli.md)) | | **Compiler tests (txtar)** | `test-fixtures/compiler-txtar/*.txt` | Parse and validate outcomes — success, parse errors, validation errors — using language-agnostic txtar fixtures (hundreds of `===` cases across the four `*.txt` files) | You want a portable test case that can be reused by alternative compiler implementations; the test is a `.jh` input paired with an expected outcome | | **Golden AST tests** | `test-fixtures/golden-ast/fixtures/*.jh` + `test-fixtures/golden-ast/expected/*.json` | Parse tree shape for successful parses — serialized to deterministic JSON with locations stripped (9 fixtures: e.g. imports, brace-if, log, match and match-multiline, params, prompt-capture, run-ensure, script-defs) | You changed the parser and need to verify the AST structure hasn't drifted; txtar tests only check pass/fail, goldens lock in the actual tree shape | | **Integration tests** | `integration/*.test.ts`, `integration/sample-build/*.test.ts` | Process-level integration behavior: signal handling, TTY rendering, run summary structure, sample builds | The test spans multiple modules or requires subprocess/PTY harnesses | diff --git a/src/cli/commands/compile.ts b/src/cli/commands/compile.ts index 8f7ed48e..5b21c44e 100644 --- a/src/cli/commands/compile.ts +++ b/src/cli/commands/compile.ts @@ -1,9 +1,13 @@ import { existsSync, statSync } from "node:fs"; import { dirname, resolve } from "node:path"; import { loadModuleGraph } from "../../transpile/module-graph"; -import { validateReferences } from "../../transpile/validate"; +import { collectDiagnostics } from "../../transpile/validate"; import { walkjhFiles } from "../../transpile/build"; import { detectWorkspaceRoot } from "../shared/paths"; +import { + diagnosticFromThrown as parseThrownDiagnostic, + type JaiphDiagnostic, +} from "../../diagnostics"; export interface CompileDiagnostic { file: string; @@ -15,16 +19,12 @@ export interface CompileDiagnostic { /** Parse `path:line:col CODE message` from {@link jaiphError} and similar throws. */ export function diagnosticFromThrown(err: unknown): CompileDiagnostic | null { - if (!(err instanceof Error)) return null; - const m = err.message.match(/^(.+):(\d+):(\d+) (\S+) (.+)$/s); - if (!m) return null; - return { - file: m[1], - line: Number(m[2]), - col: Number(m[3]), - code: m[4], - message: m[5].trimEnd(), - }; + const d = parseThrownDiagnostic(err); + return d ? { file: d.file, line: d.line, col: d.col, code: d.code, message: d.message } : null; +} + +function toCompileDiagnostic(d: JaiphDiagnostic): CompileDiagnostic { + return { file: d.file, line: d.line, col: d.col, code: d.code, message: d.message }; } function printUsage(): void { @@ -39,6 +39,16 @@ function printUsage(): void { ); } +function writeDiagnostics(json: boolean, diags: CompileDiagnostic[]): void { + if (json) { + process.stdout.write(JSON.stringify(diags) + "\n"); + return; + } + for (const d of diags) { + process.stderr.write(`${d.file}:${d.line}:${d.col} ${d.code} ${d.message}\n`); + } +} + export function runCompile(args: string[]): number { let json = false; let workspaceFlag: string | undefined; @@ -97,51 +107,47 @@ export function runCompile(args: string[]): number { } } catch (err) { const d = diagnosticFromThrown(err); - if (json) { - const fallback: CompileDiagnostic = { - file: "", - line: 1, - col: 1, - code: "E_COMPILE", - message: err instanceof Error ? err.message : String(err), - }; - process.stdout.write(JSON.stringify(d ? [d] : [fallback]) + "\n"); - } else { - process.stderr.write((err instanceof Error ? err.message : String(err)) + "\n"); - } + const fallback: CompileDiagnostic = { + file: "", + line: 1, + col: 1, + code: "E_COMPILE", + message: err instanceof Error ? err.message : String(err), + }; + writeDiagnostics(json, [d ?? fallback]); return 1; } + const collected: CompileDiagnostic[] = []; const seen = new Set(); for (const { file, workspaceRoot } of entries) { if (seen.has(file)) continue; seen.add(file); try { const graph = loadModuleGraph(file, workspaceRoot); - validateReferences(graph); - // Mark every reachable module as already validated so a directory walk - // does not double-validate shared imports. + const diag = collectDiagnostics(graph); + for (const d of diag.sorted()) collected.push(toCompileDiagnostic(d)); for (const reachable of graph.modules.keys()) seen.add(reachable); } catch (err) { + // Loader / parser errors are fatal (unrecoverable AST). Surface them + // as a single diagnostic; they do not flow through `Diagnostics`. const d = diagnosticFromThrown(err); - if (json) { - const fallback: CompileDiagnostic = { + collected.push( + d ?? { file, line: 1, col: 1, code: "E_COMPILE", message: err instanceof Error ? err.message : String(err), - }; - process.stdout.write(JSON.stringify(d ? [d] : [fallback]) + "\n"); - } else { - process.stderr.write((err instanceof Error ? err.message : String(err)) + "\n"); - } - return 1; + }, + ); } } - if (json) { - process.stdout.write("[]\n"); + if (collected.length === 0) { + if (json) process.stdout.write("[]\n"); + return 0; } - return 0; + writeDiagnostics(json, collected); + return 1; } diff --git a/src/diagnostics.ts b/src/diagnostics.ts new file mode 100644 index 00000000..2aed034a --- /dev/null +++ b/src/diagnostics.ts @@ -0,0 +1,130 @@ +/** + * Diagnostics collector — replaces fail-fast error reporting for the validator + * (and any future call-site that wants to keep going after the first error). + * + * Two-tier model: + * - **Recoverable** errors append to `Diagnostics.errors` and short-circuit the + * current validation unit via {@link BailoutError}. The unit's outer + * `diag.capture(...)` wrapper absorbs the bailout so the next unit (next + * step / next rule / next channel) still runs. + * - **Fatal** errors continue to throw via `jaiphError` (parser-level cases + * where continuing would produce garbage AST — unterminated triple-quote, + * unterminated brace block, etc.). A fatal bit on the diagnostic record + * lets the CLI render them distinctly if needed. + * + * The collector also accepts errors that helpers still throw via the legacy + * `jaiphError(file, line, col, code, msg)` shape: `capture()` parses such a + * thrown error back into a `JaiphDiagnostic` and appends it. That keeps + * helper signatures stable while still surfacing the full error set. + */ + +import { jaiphError } from "./errors"; + +export interface JaiphDiagnostic { + file: string; + line: number; + col: number; + code: string; + message: string; + fatal: boolean; +} + +/** Sentinel thrown by `diag.error(...)` to unwind to the nearest capture boundary. */ +export class BailoutError extends Error { + readonly __jaiphBailout = true as const; + constructor() { + super("jaiph bailout"); + } +} + +export function isBailout(err: unknown): err is BailoutError { + return err instanceof Error && (err as { __jaiphBailout?: unknown }).__jaiphBailout === true; +} + +/** Parse `path:line:col CODE message` (the shape `jaiphError` produces). */ +export function diagnosticFromThrown(err: unknown, fatal = false): JaiphDiagnostic | null { + if (!(err instanceof Error)) return null; + if (isBailout(err)) return null; + const m = err.message.match(/^(.+):(\d+):(\d+) (\S+) ([\s\S]+)$/); + if (!m) return null; + return { + file: m[1], + line: Number(m[2]), + col: Number(m[3]), + code: m[4], + message: m[5].trimEnd(), + fatal, + }; +} + +export class Diagnostics { + readonly errors: JaiphDiagnostic[] = []; + + add(d: JaiphDiagnostic): void { + this.errors.push(d); + } + + /** + * Append a recoverable diagnostic and short-circuit the current validation + * unit via `BailoutError`. The nearest `capture()` boundary absorbs the + * bailout so the next sibling unit still runs. + */ + error(file: string, line: number, col: number, code: string, message: string): never { + this.errors.push({ file, line, col, code, message, fatal: false }); + throw new BailoutError(); + } + + /** + * Run `fn`. Absorb `BailoutError`. Parse any thrown `jaiphError`-shape error + * into a recoverable diagnostic. Re-throw anything else (likely an internal + * bug we want to surface). + */ + capture(fn: () => void): void { + try { + fn(); + } catch (e) { + if (isBailout(e)) return; + const d = diagnosticFromThrown(e); + if (d) { + this.errors.push(d); + return; + } + throw e; + } + } + + hasErrors(): boolean { + return this.errors.length > 0; + } + + hasFatal(): boolean { + return this.errors.some((d) => d.fatal); + } + + /** Stable order: file, then line, then column. */ + sorted(): JaiphDiagnostic[] { + return [...this.errors].sort((a, b) => { + if (a.file !== b.file) return a.file < b.file ? -1 : 1; + if (a.line !== b.line) return a.line - b.line; + return a.col - b.col; + }); + } + + /** One `file:line:col CODE message` line per diagnostic, in sorted order. */ + formatLines(): string[] { + return this.sorted().map( + (d) => `${d.file}:${d.line}:${d.col} ${d.code} ${d.message}`, + ); + } + + /** + * Legacy bridge: throw the first sorted diagnostic as a regular `jaiphError` + * so existing callers that depend on `validateReferences` throwing continue + * to work. Does nothing when empty. + */ + throwFirstIfAny(): void { + if (this.errors.length === 0) return; + const f = this.sorted()[0]; + throw jaiphError(f.file, f.line, f.col, f.code, f.message); + } +} diff --git a/src/transpile/diagnostics-collector.test.ts b/src/transpile/diagnostics-collector.test.ts new file mode 100644 index 00000000..757a61f5 --- /dev/null +++ b/src/transpile/diagnostics-collector.test.ts @@ -0,0 +1,207 @@ +import test from "node:test"; +import assert from "node:assert/strict"; +import { mkdtempSync, readFileSync, rmSync, writeFileSync } from "node:fs"; +import { join, resolve } from "node:path"; +import { tmpdir } from "node:os"; +import { spawnSync } from "node:child_process"; +import { loadModuleGraph } from "./module-graph"; +import { collectDiagnostics } from "./validate"; + +// Compiled test sits at dist/src/transpile/; the source tree is three levels up. +const repoRoot = resolve(__dirname, "../../.."); +const validatePath = resolve(repoRoot, "src/transpile/validate.ts"); +const cliJsPath = resolve(repoRoot, "dist/src/cli.js"); + +/** + * Acceptance #1: a fixture with N >= 3 independent errors reports the full + * set in one compile (not just the first), in source order. + * + * The three independent errors: + * 1. duplicate import alias `helper` (line 2 — second import line) + * 2. send to undefined channel `notify` (line 6 — inside the workflow body) + * 3. unknown ref `do_thing` in a run call (line 7) + */ +test("Diagnostics: collects 3 independent errors from one compile in source order", () => { + const root = mkdtempSync(join(tmpdir(), "jaiph-diag-multi-")); + try { + writeFileSync( + join(root, "helper.jh"), + ["export rule check(x) {", ' return "ok"', "}", ""].join("\n"), + ); + writeFileSync( + join(root, "m.jh"), + [ + 'import "./helper.jh" as helper', + 'import "./helper.jh" as helper', + "", + "workflow default() {", + ' log "hi"', + ' notify <- "payload"', + " run do_thing()", + "}", + "", + ].join("\n"), + ); + + const graph = loadModuleGraph(join(root, "m.jh")); + const diag = collectDiagnostics(graph); + const sorted = diag.sorted().filter((d) => d.file.endsWith("m.jh")); + + assert.equal( + sorted.length, + 3, + `expected 3 diagnostics, got: ${JSON.stringify(diag.sorted(), null, 2)}`, + ); + assert.equal(sorted[0].line, 2, "duplicate import alias should be on line 2"); + assert.match(sorted[0].message, /duplicate import alias "helper"/); + assert.equal(sorted[1].line, 6, "undefined channel should be on line 6"); + assert.match(sorted[1].message, /Channel "notify" is not defined/); + assert.equal(sorted[2].line, 7, "unknown ref should be on line 7"); + assert.match(sorted[2].message, /unknown local workflow or script reference "do_thing"/); + } finally { + rmSync(root, { recursive: true, force: true }); + } +}); + +/** + * Acceptance #3: throwing call-sites are reduced to a documented "fatal" + * subset. The validator entry point (`validate.ts`) no longer throws on + * user-level errors; it appends to a `Diagnostics` collector instead. + * + * Reference baseline (pre-migration): `validate.ts` alone had ~54 raw + * `throw jaiphError(` call-sites. After migration that file holds zero. + * + * The remaining `throw jaiphError(...)` call-sites in `src/` fall into two + * groups: + * + * - **Fatal aborts** (continuing would produce garbage): the parser's + * `fail()` helper (`src/parse/core.ts`), the loader / graph builder + * (`src/transpile/module-graph.ts`), the test-file shape check + * (`src/cli/commands/test.ts`), plus the legacy bridge inside the + * collector itself (`src/diagnostics.ts`). + * - **Leaf validation helpers** (validate-string, validate-prompt-schema, + * validate-ref-resolution, shell-jaiph-guard): these still throw but + * every caller wraps them in `diag.capture(...)`, which converts the + * thrown `jaiphError` into a recoverable diagnostic and continues with + * the next validation unit. + * + * Test files (`*.test.ts`) are excluded from the count — they intentionally + * exercise the throwing legacy bridge. + */ +test("Diagnostics: throwing call-sites match the documented fatal allowlist", () => { + const src = readFileSync(validatePath, "utf8"); + const throwCount = (src.match(/throw\s+jaiphError\(/g) ?? []).length; + assert.equal( + throwCount, + 0, + `expected validate.ts to use diag.error exclusively, found ${throwCount} throw jaiphError sites`, + ); + + // Sanity: confirm the migration replaced rather than removed. + const diagErrorCount = (src.match(/diag\.error\(/g) ?? []).length; + assert.ok( + diagErrorCount >= 40, + `expected many diag.error sites, found ${diagErrorCount}`, + ); + + // The fatal allowlist: files where a `throw jaiphError(...)` is allowed + // because continuing would produce garbage (parser / loader) or because + // the throw is wrapped by `diag.capture(...)` at every caller. + const allowlist = new Set([ + "src/diagnostics.ts", // legacy bridge + "src/parse/core.ts", // parser fail() + "src/cli/commands/test.ts", // test-file shape fatal + "src/transpile/module-graph.ts", // loader fatal + "src/transpile/validate-string.ts", // leaf helper (captured) + "src/transpile/validate-prompt-schema.ts", // leaf helper (captured) + "src/transpile/validate-ref-resolution.ts", // leaf helper (captured) + "src/transpile/shell-jaiph-guard.ts", // leaf helper (captured) + ]); + + // Walk every .ts file under src/, excluding tests, and confirm any raw + // `throw jaiphError(` lives in the allowlist. Anything outside the + // allowlist is a regression — non-fatal validator/transpiler code must + // route through the collector instead. + const offenders: string[] = []; + walkTsFiles(resolve(repoRoot, "src"), (relPath, contents) => { + if (relPath.endsWith(".test.ts")) return; + if (!/throw\s+jaiphError\(/.test(contents)) return; + if (!allowlist.has(relPath)) offenders.push(relPath); + }); + assert.deepEqual( + offenders, + [], + `unexpected throw jaiphError(...) outside the fatal allowlist: ${offenders.join(", ")}`, + ); +}); + +function walkTsFiles( + dir: string, + cb: (relPath: string, contents: string) => void, +): void { + const { readdirSync, statSync } = require("node:fs") as typeof import("node:fs"); + for (const name of readdirSync(dir)) { + const full = join(dir, name); + const st = statSync(full); + if (st.isDirectory()) { + walkTsFiles(full, cb); + continue; + } + if (!full.endsWith(".ts")) continue; + const rel = full.slice(repoRoot.length + 1); + cb(rel, readFileSync(full, "utf8")); + } +} + +interface CompileDiagnosticJson { + file: string; + line: number; + col: number; + code: string; + message: string; +} + +/** + * Acceptance #4: CLI exit code is non-zero whenever the collector is + * non-empty. `jaiph compile --json` must return the full diagnostic set. + */ +test("CLI: `jaiph compile --json` returns full set + non-zero exit on multiple errors", () => { + const root = mkdtempSync(join(tmpdir(), "jaiph-diag-cli-")); + try { + writeFileSync( + join(root, "helper.jh"), + ["export rule check(x) {", ' return "ok"', "}", ""].join("\n"), + ); + writeFileSync( + join(root, "m.jh"), + [ + 'import "./helper.jh" as helper', + 'import "./helper.jh" as helper', + "", + "workflow default() {", + ' log "hi"', + ' notify <- "payload"', + " run do_thing()", + "}", + "", + ].join("\n"), + ); + + const out = spawnSync( + process.execPath, + [cliJsPath, "compile", "--json", join(root, "m.jh")], + { encoding: "utf8" }, + ); + + assert.notEqual( + out.status, + 0, + `expected non-zero exit; stdout=${out.stdout} stderr=${out.stderr}`, + ); + const parsed = JSON.parse(out.stdout) as CompileDiagnosticJson[]; + const inFile = parsed.filter((d) => d.file.endsWith("m.jh")); + assert.equal(inFile.length, 3, `expected 3 diagnostics; got ${out.stdout}`); + } finally { + rmSync(root, { recursive: true, force: true }); + } +}); diff --git a/src/transpile/validate.ts b/src/transpile/validate.ts index 9e6a989a..ef222d1e 100644 --- a/src/transpile/validate.ts +++ b/src/transpile/validate.ts @@ -1,6 +1,6 @@ import { existsSync } from "node:fs"; import { dirname, resolve } from "node:path"; -import { jaiphError } from "../errors"; +import { Diagnostics } from "../diagnostics"; import type { Arg, Expr, jaiphModule, MatchExprDef, WorkflowStepDef } from "../types"; import type { ModuleGraph } from "./module-graph"; import type { SubstitutionValidateEnv } from "./validate-substitution"; @@ -76,13 +76,14 @@ function hasShellRedirection(args: Arg[] | undefined): boolean { } function validateNoShellRedirection( + diag: Diagnostics, filePath: string, loc: { line: number; col: number }, keyword: string, args: Arg[] | undefined, ): void { if (!hasShellRedirection(args)) return; - throw jaiphError( + diag.error( filePath, loc.line, loc.col, @@ -91,9 +92,14 @@ function validateNoShellRedirection( ); } -function validateMatchExpr(filePath: string, expr: MatchExprDef, knownVars: Set): void { +function validateMatchExpr( + diag: Diagnostics, + filePath: string, + expr: MatchExprDef, + knownVars: Set, +): void { if (expr.arms.length === 0) { - throw jaiphError(filePath, expr.loc.line, expr.loc.col, "E_VALIDATE", "match must have at least one arm"); + diag.error(filePath, expr.loc.line, expr.loc.col, "E_VALIDATE", "match must have at least one arm"); } let wildcardCount = 0; for (const arm of expr.arms) { @@ -104,7 +110,7 @@ function validateMatchExpr(filePath: string, expr: MatchExprDef, knownVars: Set< try { new RegExp(arm.pattern.source); } catch { - throw jaiphError( + diag.error( filePath, expr.loc.line, expr.loc.col, @@ -115,7 +121,7 @@ function validateMatchExpr(filePath: string, expr: MatchExprDef, knownVars: Set< } const bodyTrimmed = (arm.tripleQuotedBody ? tripleQuotedRawForRuntime(arm.body) : arm.body).trimStart(); if (/^return(\s|$)/.test(bodyTrimmed)) { - throw jaiphError( + diag.error( filePath, expr.loc.line, expr.loc.col, @@ -124,7 +130,7 @@ function validateMatchExpr(filePath: string, expr: MatchExprDef, knownVars: Set< ); } if (/`[^`]*`\s*\(/.test(bodyTrimmed) || bodyTrimmed.startsWith("```")) { - throw jaiphError( + diag.error( filePath, expr.loc.line, expr.loc.col, @@ -141,7 +147,7 @@ function validateMatchExpr(filePath: string, expr: MatchExprDef, knownVars: Set< const startsArgs = /^\s+\S/.test(after); if ((startsCall || startsArgs) && ident !== "fail" && ident !== "run" && ident !== "ensure") { const hint = ident === "error" ? ` did you mean "fail"?` : ""; - throw jaiphError( + diag.error( filePath, expr.loc.line, expr.loc.col, @@ -150,7 +156,7 @@ function validateMatchExpr(filePath: string, expr: MatchExprDef, knownVars: Set< ); } if (!startsCall && !startsArgs && after.trim() === "" && !knownVars.has(ident)) { - throw jaiphError( + diag.error( filePath, expr.loc.line, expr.loc.col, @@ -162,10 +168,10 @@ function validateMatchExpr(filePath: string, expr: MatchExprDef, knownVars: Set< } } if (wildcardCount === 0) { - throw jaiphError(filePath, expr.loc.line, expr.loc.col, "E_VALIDATE", "match must have exactly one wildcard (_) arm"); + diag.error(filePath, expr.loc.line, expr.loc.col, "E_VALIDATE", "match must have exactly one wildcard (_) arm"); } if (wildcardCount > 1) { - throw jaiphError(filePath, expr.loc.line, expr.loc.col, "E_VALIDATE", "match must have exactly one wildcard (_) arm, found multiple"); + diag.error(filePath, expr.loc.line, expr.loc.col, "E_VALIDATE", "match must have exactly one wildcard (_) arm, found multiple"); } } @@ -201,6 +207,7 @@ interface StepTreeWalk { } function walkStepTree( + diag: Diagnostics, filePath: string, steps: WorkflowStepDef[], envDecls: { name: string; loc: { line: number; col: number } }[] | undefined, @@ -234,7 +241,7 @@ function walkStepTree( ): void => { const prev = b.get(name); if (prev) { - throw jaiphError( + diag.error( filePath, loc.line, loc.col, @@ -243,7 +250,7 @@ function walkStepTree( ); } if (moduleScripts.has(name)) { - throw jaiphError( + diag.error( filePath, loc.line, loc.col, @@ -300,7 +307,7 @@ function walkStepTree( if (s.type === "for_lines") { knownVars.add(s.iterVar); if (bindings.has(s.iterVar)) { - throw jaiphError( + diag.error( filePath, s.loc.line, s.loc.col, @@ -361,6 +368,7 @@ function lookupCalleeParams( } function validateArity( + diag: Diagnostics, filePath: string, loc: { line: number; col: number }, ref: string, @@ -373,7 +381,7 @@ function validateArity( if (params === undefined) return; const argCount = args?.length ?? 0; if (argCount !== params.length) { - throw jaiphError( + diag.error( filePath, loc.line, loc.col, @@ -384,6 +392,7 @@ function validateArity( } function validateArgVarRefs( + diag: Diagnostics, filePath: string, loc: { line: number; col: number }, args: Arg[] | undefined, @@ -395,7 +404,7 @@ function validateArgVarRefs( if (a.kind !== "var") continue; if (recoverBindings?.has(a.name)) continue; if (knownVars.has(a.name)) continue; - throw jaiphError( + diag.error( filePath, loc.line, loc.col, @@ -406,6 +415,7 @@ function validateArgVarRefs( } function validateNestedManagedCallArgs( + diag: Diagnostics, filePath: string, loc: { line: number; col: number }, args: Arg[] | undefined, @@ -413,11 +423,12 @@ function validateNestedManagedCallArgs( if (!args) return; for (const a of args) { if (a.kind !== "literal") continue; - checkNestedManagedInLiteral(filePath, loc, a.raw); + checkNestedManagedInLiteral(diag, filePath, loc, a.raw); } } function checkNestedManagedInLiteral( + diag: Diagnostics, filePath: string, loc: { line: number; col: number }, raw: string, @@ -429,7 +440,7 @@ function checkNestedManagedInLiteral( const before = stripped.slice(0, match.index).trimEnd(); const lastToken = before.length === 0 ? "" : before.slice(before.lastIndexOf(" ") + 1); if (lastToken === "run" || lastToken === "ensure") continue; - throw jaiphError( + diag.error( filePath, loc.line, loc.col, @@ -443,7 +454,7 @@ function checkNestedManagedInLiteral( const before = stripped.slice(0, btMatch.index).trimEnd(); const lastToken = before.length === 0 ? "" : before.slice(before.lastIndexOf(" ") + 1); if (lastToken === "run" || lastToken === "ensure") continue; - throw jaiphError( + diag.error( filePath, loc.line, loc.col, @@ -499,13 +510,43 @@ export function resolveScriptImportPath(fromFile: string, importPath: string): s return resolve(dirname(fromFile), importPath); } +/** + * Legacy throwing entry. Builds a `Diagnostics` collector internally and + * throws the first sorted diagnostic via `jaiphError` so existing callers + * (and per-error tests) continue to see one error per failed compile. + * + * Use {@link collectDiagnostics} when you want the full set. + */ export function validateReferences(graph: ModuleGraph): void { + const diag = collectDiagnostics(graph); + diag.throwFirstIfAny(); +} + +/** + * New entry: walk the graph and append every validation error into a fresh + * `Diagnostics`. Never throws on user-level validation errors — non-validator + * problems (internal bugs) still bubble up. + */ +export function collectDiagnostics(graph: ModuleGraph): Diagnostics { + const diag = new Diagnostics(); for (const node of graph.modules.values()) { - validateModule(node.ast, graph); + validateModuleInto(node.ast, graph, diag); } + return diag; } +/** Legacy throwing per-module wrapper (kept for `emitScriptsForModuleFromGraph`). */ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { + const diag = new Diagnostics(); + validateModuleInto(ast, graph, diag); + diag.throwFirstIfAny(); +} + +export function validateModuleInto( + ast: jaiphModule, + graph: ModuleGraph, + diag: Diagnostics, +): void { const localChannels = new Set(ast.channels.map((c) => c.name)); const localRules = new Set(ast.rules.map((r) => r.name)); const localWorkflows = new Set(ast.workflows.map((w) => w.name)); @@ -515,53 +556,57 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { if (ast.scriptImports) { for (const si of ast.scriptImports) { - const resolved = resolveScriptImportPath(ast.filePath, si.path); - if (!existsSync(resolved)) { - throw jaiphError( - ast.filePath, - si.loc.line, - si.loc.col, - "E_IMPORT_NOT_FOUND", - `import script "${si.alias}" resolves to missing file "${resolved}"`, - ); - } - localScripts.add(si.alias); + diag.capture(() => { + const resolved = resolveScriptImportPath(ast.filePath, si.path); + if (!existsSync(resolved)) { + diag.error( + ast.filePath, + si.loc.line, + si.loc.col, + "E_IMPORT_NOT_FOUND", + `import script "${si.alias}" resolves to missing file "${resolved}"`, + ); + } + localScripts.add(si.alias); + }); } } const node = graph.modules.get(ast.filePath); for (const imp of ast.imports) { - if (importsByAlias.has(imp.alias)) { - throw jaiphError( - ast.filePath, - imp.loc.line, - imp.loc.col, - "E_VALIDATE", - `duplicate import alias "${imp.alias}"`, - ); - } - const resolved = node?.imports.get(imp.alias); - if (!resolved) { - throw jaiphError( - ast.filePath, - imp.loc.line, - imp.loc.col, - "E_IMPORT_NOT_FOUND", - `import "${imp.alias}" could not be resolved`, - ); - } - importsByAlias.set(imp.alias, resolved); - const importedAst = graph.modules.get(resolved)?.ast; - if (!importedAst) { - throw jaiphError( - ast.filePath, - imp.loc.line, - imp.loc.col, - "E_IMPORT_NOT_FOUND", - `import "${imp.alias}" resolves to missing file "${resolved}"`, - ); - } - importedAstCache.set(resolved, importedAst); + diag.capture(() => { + if (importsByAlias.has(imp.alias)) { + diag.error( + ast.filePath, + imp.loc.line, + imp.loc.col, + "E_VALIDATE", + `duplicate import alias "${imp.alias}"`, + ); + } + const resolved = node?.imports.get(imp.alias); + if (!resolved) { + diag.error( + ast.filePath, + imp.loc.line, + imp.loc.col, + "E_IMPORT_NOT_FOUND", + `import "${imp.alias}" could not be resolved`, + ); + } + importsByAlias.set(imp.alias, resolved); + const importedAst = graph.modules.get(resolved)?.ast; + if (!importedAst) { + diag.error( + ast.filePath, + imp.loc.line, + imp.loc.col, + "E_IMPORT_NOT_FOUND", + `import "${imp.alias}" resolves to missing file "${resolved}"`, + ); + } + importedAstCache.set(resolved, importedAst); + }); } const refCtx: RefResolutionContext = { @@ -637,7 +682,7 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { for (const ref of extractDotFieldRefs(content)) { const fields = promptSchemas.get(ref.varName); if (!fields) { - throw jaiphError( + diag.error( ast.filePath, loc.line, loc.col, @@ -646,7 +691,7 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { ); } if (!fields.includes(ref.fieldName)) { - throw jaiphError( + diag.error( ast.filePath, loc.line, loc.col, @@ -660,10 +705,10 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { const validateWorkflowStringCaptures = (content: string, loc: { line: number; col: number }): void => { for (const cap of extractInlineCaptures(content)) { if (cap.kind === "run") { - validateNoShellRedirection(ast.filePath, loc, "run", cap.args); + validateNoShellRedirection(diag, ast.filePath, loc, "run", cap.args); validateRef({ value: cap.ref, loc }, ast, refCtx, expectRunTargetRef); } else { - validateNoShellRedirection(ast.filePath, loc, "ensure", cap.args); + validateNoShellRedirection(diag, ast.filePath, loc, "ensure", cap.args); validateRef({ value: cap.ref, loc }, ast, refCtx, expectRuleRef); } } @@ -672,10 +717,10 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { const validateRuleStringCaptures = (content: string, loc: { line: number; col: number }): void => { for (const cap of extractInlineCaptures(content)) { if (cap.kind === "run") { - validateNoShellRedirection(ast.filePath, loc, "run", cap.args); + validateNoShellRedirection(diag, ast.filePath, loc, "run", cap.args); validateRef({ value: cap.ref, loc }, ast, refCtx, expectRunInRuleRef); } else { - validateNoShellRedirection(ast.filePath, loc, "ensure", cap.args); + validateNoShellRedirection(diag, ast.filePath, loc, "ensure", cap.args); validateRef({ value: cap.ref, loc }, ast, refCtx, expectRuleRef); } } @@ -690,31 +735,31 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { ): void => { if (body.kind === "call") { const loc = body.callee.loc; - validateNoShellRedirection(ast.filePath, loc, "run", body.args); - validateNestedManagedCallArgs(ast.filePath, loc, body.args); + validateNoShellRedirection(diag, ast.filePath, loc, "run", body.args); + validateNestedManagedCallArgs(diag, ast.filePath, loc, body.args); const isRuleScope = scope === "rule"; if (!body.callee.value.includes(".") && knownVars.has(body.callee.value) && !localScripts.has(body.callee.value) && !(scope === "workflow" && localWorkflows.has(body.callee.value))) { - throw jaiphError(ast.filePath, loc.line, loc.col, "E_VALIDATE", `strings are not executable; "${body.callee.value}" is a string — use a script instead`); + diag.error(ast.filePath, loc.line, loc.col, "E_VALIDATE", `strings are not executable; "${body.callee.value}" is a string — use a script instead`); } validateRef(body.callee, ast, refCtx, isRuleScope ? expectRunInRuleRef : expectRunTargetRef); - validateArity(ast.filePath, loc, body.callee.value, body.args, "workflow", ast, refCtx); - validateArgVarRefs(ast.filePath, loc, body.args, knownVars, recoverBindings); + validateArity(diag, ast.filePath, loc, body.callee.value, body.args, "workflow", ast, refCtx); + validateArgVarRefs(diag, ast.filePath, loc, body.args, knownVars, recoverBindings); return; } if (body.kind === "ensure_call") { const loc = body.callee.loc; - validateNoShellRedirection(ast.filePath, loc, "ensure", body.args); - validateNestedManagedCallArgs(ast.filePath, loc, body.args); + validateNoShellRedirection(diag, ast.filePath, loc, "ensure", body.args); + validateNestedManagedCallArgs(diag, ast.filePath, loc, body.args); validateRef(body.callee, ast, refCtx, expectRuleRef); - validateArity(ast.filePath, loc, body.callee.value, body.args, "rule", ast, refCtx); - validateArgVarRefs(ast.filePath, loc, body.args, knownVars, recoverBindings); + validateArity(diag, ast.filePath, loc, body.callee.value, body.args, "rule", ast, refCtx); + validateArgVarRefs(diag, ast.filePath, loc, body.args, knownVars, recoverBindings); return; } if (body.kind === "inline_script") { return; // no ref to validate } if (body.kind === "match") { - validateMatchExpr(ast.filePath, body.match, knownVars); + validateMatchExpr(diag, ast.filePath, body.match, knownVars); return; } }; @@ -757,7 +802,7 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { // const const scriptName = extractConstScriptName(expr.raw); if (scriptName && localScripts.has(scriptName)) { - throw jaiphError(ast.filePath, stepLoc.line, stepLoc.col, "E_VALIDATE", `scripts are not values; "${scriptName}" is a script definition`); + diag.error(ast.filePath, stepLoc.line, stepLoc.col, "E_VALIDATE", `scripts are not values; "${scriptName}" is a script definition`); } const inner = semanticQuotedOrchestrationInner(expr.raw); validateWorkflowStringCaptures(inner, stepLoc); @@ -780,16 +825,16 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { return; } if (expr.kind === "match") { - validateMatchExpr(ast.filePath, expr.match, knownVars); + validateMatchExpr(diag, ast.filePath, expr.match, knownVars); return; } if (expr.kind === "prompt") { if (label !== "const") { - throw jaiphError(ast.filePath, stepLoc.line, stepLoc.col, "E_VALIDATE", `prompt is not a valid ${label} value`); + diag.error(ast.filePath, stepLoc.line, stepLoc.col, "E_VALIDATE", `prompt is not a valid ${label} value`); } const promptIdent = promptBareIdentifier(expr.raw); if (promptIdent && localScripts.has(promptIdent)) { - throw jaiphError(ast.filePath, stepLoc.line, stepLoc.col, "E_VALIDATE", `scripts are not promptable; "${promptIdent}" is a script — use a string const instead`); + diag.error(ast.filePath, stepLoc.line, stepLoc.col, "E_VALIDATE", `scripts are not promptable; "${promptIdent}" is a script — use a string const instead`); } validatePromptString(expr.raw, ast.filePath, stepLoc.line, stepLoc.col); if (expr.returns !== undefined) { @@ -806,14 +851,14 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { } if (expr.kind === "bare_ref") { if (label !== "send") { - throw jaiphError(ast.filePath, expr.ref.loc.line, expr.ref.loc.col, "E_VALIDATE", `bare reference is only valid as a send payload`); + diag.error(ast.filePath, expr.ref.loc.line, expr.ref.loc.col, "E_VALIDATE", `bare reference is only valid as a send payload`); } validateRef(expr.ref, ast, refCtx, bareSendRefSpec); return; } if (expr.kind === "shell") { if (label !== "send") { - throw jaiphError(ast.filePath, expr.loc.line, expr.loc.col, "E_VALIDATE", `raw shell fragment is only valid as a send payload`); + diag.error(ast.filePath, expr.loc.line, expr.loc.col, "E_VALIDATE", `raw shell fragment is only valid as a send payload`); } validateManagedWorkflowShell(expr.command, makeSubEnv({ line: expr.loc.line, col: expr.loc.col })); return; @@ -843,7 +888,7 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { } const scriptName = extractConstScriptName(expr.raw); if (scriptName && localScripts.has(scriptName)) { - throw jaiphError(ast.filePath, stepLoc.line, stepLoc.col, "E_VALIDATE", `scripts are not values; "${scriptName}" is a script definition`); + diag.error(ast.filePath, stepLoc.line, stepLoc.col, "E_VALIDATE", `scripts are not values; "${scriptName}" is a script definition`); } validateRuleStringCaptures(stripDQ(expr.raw), stepLoc); validateSimpleInterpolationIdentifiers( @@ -864,28 +909,33 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { return; } if (expr.kind === "match") { - validateMatchExpr(ast.filePath, expr.match, knownVars); + validateMatchExpr(diag, ast.filePath, expr.match, knownVars); return; } if (expr.kind === "prompt") { - throw jaiphError(ast.filePath, stepLoc.line, stepLoc.col, "E_VALIDATE", "const ... = prompt is not allowed in rules"); + diag.error(ast.filePath, stepLoc.line, stepLoc.col, "E_VALIDATE", "const ... = prompt is not allowed in rules"); } if (expr.kind === "bare_ref" || expr.kind === "shell") { - throw jaiphError(ast.filePath, stepLoc.line, stepLoc.col, "E_VALIDATE", `${expr.kind} expression is not allowed in rules`); + diag.error(ast.filePath, stepLoc.line, stepLoc.col, "E_VALIDATE", `${expr.kind} expression is not allowed in rules`); } }; for (const rule of ast.rules) { - const ruleWalk = walkStepTree( - ast.filePath, - rule.steps, - ast.envDecls, - rule.params, - rule.loc, - localScripts, - parseSchemaFieldNames, - { withPromptSchemas: false }, - ); + let ruleWalk: StepTreeWalk | undefined; + diag.capture(() => { + ruleWalk = walkStepTree( + diag, + ast.filePath, + rule.steps, + ast.envDecls, + rule.params, + rule.loc, + localScripts, + parseSchemaFieldNames, + { withPromptSchemas: false }, + ); + }); + if (!ruleWalk) continue; const ruleKnownVars = ruleWalk.knownVars; const validateRuleStep = (s: WorkflowStepDef): void => { if (s.type === "trivia") return; @@ -902,11 +952,11 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { ); return; } - throw jaiphError(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", `unsupported ${s.level} message form`); + diag.error(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", `unsupported ${s.level} message form`); } // fail if (s.message.kind !== "literal") { - throw jaiphError(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", "fail message must be a literal string"); + diag.error(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", "fail message must be a literal string"); } validateFailString(s.message.raw, ast.filePath, s.loc.line, s.loc.col); const failInner = semanticQuotedOrchestrationInner(s.message.raw); @@ -918,7 +968,7 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { return; } if (s.type === "send") { - throw jaiphError(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", "send is not allowed in rules"); + diag.error(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", "send is not allowed in rules"); } if (s.type === "return") { validateRuleValueExpr(s.value, s.loc, ruleKnownVars, "return"); @@ -931,15 +981,15 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { if (s.type === "exec") { const body = s.body; if (body.kind === "prompt") { - throw jaiphError(ast.filePath, body.loc.line, body.loc.col, "E_VALIDATE", "prompt is not allowed in rules"); + diag.error(ast.filePath, body.loc.line, body.loc.col, "E_VALIDATE", "prompt is not allowed in rules"); } if (body.kind === "shell") { - throw jaiphError(ast.filePath, body.loc.line, body.loc.col, "E_VALIDATE", "inline shell steps are forbidden in rules; use explicit script blocks"); + diag.error(ast.filePath, body.loc.line, body.loc.col, "E_VALIDATE", "inline shell steps are forbidden in rules; use explicit script blocks"); } if (body.kind === "call" && (s as Extract).body.kind === "call") { const callBody = body; if (callBody.async) { - throw jaiphError(ast.filePath, callBody.callee.loc.line, callBody.callee.loc.col, "E_VALIDATE", "run async is not allowed in rules; use it in workflows only"); + diag.error(ast.filePath, callBody.callee.loc.line, callBody.callee.loc.col, "E_VALIDATE", "run async is not allowed in rules; use it in workflows only"); } } validateCallable(body, ruleKnownVars, "rule"); @@ -948,14 +998,14 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { if (s.type === "if") { if (s.operand.kind === "regex") { try { new RegExp(s.operand.source); } catch { - throw jaiphError(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", `invalid regex in if condition: /${s.operand.source}/`); + diag.error(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", `invalid regex in if condition: /${s.operand.source}/`); } } return; } if (s.type === "for_lines") { if (!ruleKnownVars.has(s.sourceVar)) { - throw jaiphError( + diag.error( ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", `for ... in : "${s.sourceVar}" is not a known variable in this scope`, ); @@ -966,7 +1016,7 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { return _never; }; for (const entry of ruleWalk.flat) { - validateRuleStep(entry.step); + diag.capture(() => validateRuleStep(entry.step)); } } @@ -974,51 +1024,58 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { const parts = channel.split("."); if (parts.length === 1) { if (!localChannels.has(channel)) { - throw jaiphError(ast.filePath, loc.line, loc.col, "E_VALIDATE", `Channel "${channel}" is not defined`); + diag.error(ast.filePath, loc.line, loc.col, "E_VALIDATE", `Channel "${channel}" is not defined`); } return; } if (parts.length !== 2) { - throw jaiphError(ast.filePath, loc.line, loc.col, "E_VALIDATE", `Channel "${channel}" is not defined`); + diag.error(ast.filePath, loc.line, loc.col, "E_VALIDATE", `Channel "${channel}" is not defined`); } const [alias, importedChannel] = parts; const importedFile = importsByAlias.get(alias); if (!importedFile) { - throw jaiphError(ast.filePath, loc.line, loc.col, "E_VALIDATE", `Channel "${channel}" is not defined`); + diag.error(ast.filePath, loc.line, loc.col, "E_VALIDATE", `Channel "${channel}" is not defined`); } const importedAst = importedAstCache.get(importedFile)!; const importedChannels = new Set(importedAst.channels.map((c) => c.name)); if (!importedChannels.has(importedChannel)) { - throw jaiphError(ast.filePath, loc.line, loc.col, "E_VALIDATE", `Channel "${channel}" is not defined`); + diag.error(ast.filePath, loc.line, loc.col, "E_VALIDATE", `Channel "${channel}" is not defined`); } }; for (const ch of ast.channels) { if (ch.routes) { for (const wfRef of ch.routes) { - validateRef(wfRef, ast, refCtx, expectWorkflowRef); - const targetParams = resolveRouteTargetParams(wfRef.value, ast, refCtx); - if (targetParams !== undefined && targetParams !== 3) { - throw jaiphError( - ast.filePath, wfRef.loc.line, wfRef.loc.col, "E_VALIDATE", - `inbox route target "${wfRef.value}" must declare exactly 3 parameters (message, channel, sender), but declares ${targetParams}`, - ); - } + diag.capture(() => { + validateRef(wfRef, ast, refCtx, expectWorkflowRef); + const targetParams = resolveRouteTargetParams(wfRef.value, ast, refCtx); + if (targetParams !== undefined && targetParams !== 3) { + diag.error( + ast.filePath, wfRef.loc.line, wfRef.loc.col, "E_VALIDATE", + `inbox route target "${wfRef.value}" must declare exactly 3 parameters (message, channel, sender), but declares ${targetParams}`, + ); + } + }); } } } for (const workflow of ast.workflows) { - const wfWalk = walkStepTree( - ast.filePath, - workflow.steps, - ast.envDecls, - workflow.params, - workflow.loc, - localScripts, - parseSchemaFieldNames, - { withPromptSchemas: true }, - ); + let wfWalk: StepTreeWalk | undefined; + diag.capture(() => { + wfWalk = walkStepTree( + diag, + ast.filePath, + workflow.steps, + ast.envDecls, + workflow.params, + workflow.loc, + localScripts, + parseSchemaFieldNames, + { withPromptSchemas: true }, + ); + }); + if (!wfWalk) continue; const wfKnownVars = wfWalk.knownVars; const promptSchemas = wfWalk.promptSchemas; @@ -1043,11 +1100,11 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { ); return; } - throw jaiphError(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", `unsupported ${s.level} message form`); + diag.error(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", `unsupported ${s.level} message form`); } // fail if (s.message.kind !== "literal") { - throw jaiphError(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", "fail message must be a literal string"); + diag.error(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", "fail message must be a literal string"); } validateFailString(s.message.raw, ast.filePath, s.loc.line, s.loc.col); const failInner = semanticQuotedOrchestrationInner(s.message.raw); @@ -1076,7 +1133,7 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { } if (body.kind === "shell") { if (hasUnquotedSendArrow(body.command) && matchSendOperator(body.command) === null) { - throw jaiphError( + diag.error( ast.filePath, body.loc.line, body.loc.col, "E_VALIDATE", "invalid send: channel must be a single name or `alias.name` (at most one dot in the channel part)", ); @@ -1085,14 +1142,14 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { if (/^(?:[A-Za-z_][A-Za-z0-9_]*)(?:\.[A-Za-z_][A-Za-z0-9_]*)*$/.test(t)) { if (!t.includes(".")) { if (localScripts.has(t) || localWorkflows.has(t)) { - throw jaiphError( + diag.error( ast.filePath, body.loc.line, body.loc.col, "E_VALIDATE", `use run ${t}() — a bare name that refers to a script or workflow must use a managed run step`, ); } } else { validateRef({ value: t, loc: body.loc }, ast, refCtx, expectRunTargetRef); - throw jaiphError( + diag.error( ast.filePath, body.loc.line, body.loc.col, "E_VALIDATE", `use run ${t}() — "${t}" is a valid script or workflow reference; use a managed run step`, ); @@ -1106,14 +1163,14 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { if (s.type === "if") { if (s.operand.kind === "regex") { try { new RegExp(s.operand.source); } catch { - throw jaiphError(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", `invalid regex in if condition: /${s.operand.source}/`); + diag.error(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", `invalid regex in if condition: /${s.operand.source}/`); } } return; } if (s.type === "for_lines") { if (!wfKnownVars.has(s.sourceVar)) { - throw jaiphError( + diag.error( ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", `for ... in : "${s.sourceVar}" is not a known variable in this scope`, ); @@ -1125,68 +1182,73 @@ export function validateModule(ast: jaiphModule, graph: ModuleGraph): void { }; for (const entry of wfWalk.flat) { - validateStep(entry.step, entry.recoverBindings); + diag.capture(() => validateStep(entry.step, entry.recoverBindings)); } } if (ast.tests && ast.tests.length > 0) { - validateTestBlocks(ast, ast.tests); + validateTestBlocks(diag, ast, ast.tests); } } -function validateTestBlocks(ast: jaiphModule, tests: import("../types").TestBlockDef[]): void { +function validateTestBlocks( + diag: Diagnostics, + ast: jaiphModule, + tests: import("../types").TestBlockDef[], +): void { for (const tb of tests) { const inScope = new Set(); for (const step of tb.steps) { - if (step.type === "test_const") { - inScope.add(step.name); - continue; - } - if (step.type === "test_run_workflow") { - if (step.captureName) inScope.add(step.captureName); - continue; - } - if (step.type === "test_mock_prompt" && step.responseVar) { - if (!inScope.has(step.responseVar)) { - throw jaiphError( - ast.filePath, - step.loc.line, - step.loc.col, - "E_VALIDATE", - `mock prompt: undefined name "${step.responseVar}" (declare it earlier with: const ${step.responseVar} = "…")`, - ); + diag.capture(() => { + if (step.type === "test_const") { + inScope.add(step.name); + return; } - continue; - } - if ( - step.type === "test_expect_contain" || - step.type === "test_expect_not_contain" || - step.type === "test_expect_equal" - ) { - if (!inScope.has(step.variable)) { - throw jaiphError( - ast.filePath, - step.loc.line, - step.loc.col, - "E_VALIDATE", - `${step.type.replace("test_", "")}: undefined name "${step.variable}" (capture it first with: const ${step.variable} = run …)`, - ); + if (step.type === "test_run_workflow") { + if (step.captureName) inScope.add(step.captureName); + return; + } + if (step.type === "test_mock_prompt" && step.responseVar) { + if (!inScope.has(step.responseVar)) { + diag.error( + ast.filePath, + step.loc.line, + step.loc.col, + "E_VALIDATE", + `mock prompt: undefined name "${step.responseVar}" (declare it earlier with: const ${step.responseVar} = "…")`, + ); + } + return; } - const refName = + if ( + step.type === "test_expect_contain" || + step.type === "test_expect_not_contain" || step.type === "test_expect_equal" - ? step.expectedVar - : step.substringVar; - if (refName !== undefined && !inScope.has(refName)) { - throw jaiphError( - ast.filePath, - step.loc.line, - step.loc.col, - "E_VALIDATE", - `${step.type.replace("test_", "")}: undefined name "${refName}" (declare it earlier with: const ${refName} = "…")`, - ); + ) { + if (!inScope.has(step.variable)) { + diag.error( + ast.filePath, + step.loc.line, + step.loc.col, + "E_VALIDATE", + `${step.type.replace("test_", "")}: undefined name "${step.variable}" (capture it first with: const ${step.variable} = run …)`, + ); + } + const refName = + step.type === "test_expect_equal" + ? step.expectedVar + : step.substringVar; + if (refName !== undefined && !inScope.has(refName)) { + diag.error( + ast.filePath, + step.loc.line, + step.loc.col, + "E_VALIDATE", + `${step.type.replace("test_", "")}: undefined name "${refName}" (declare it earlier with: const ${refName} = "…")`, + ); + } } - continue; - } + }); } } } From 246fa7a14983960ac29287d89b339d657dfc06ed Mon Sep 17 00:00:00 2001 From: Jakub Dzikowski Date: Sat, 16 May 2026 16:16:18 +0200 Subject: [PATCH 11/14] Refactor: split validator into per-step visitor table by scope MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Replace the 1,441-LoC validate.ts monolith — two near-identical inner walkers (validateRuleStep, validateStep) plus the five-check call-shape sequence repeated at 12+ sites — with a two-file split. validate.ts (~430 LoC) keeps the outer layer: import / channel-route / test-block checks and walkStepTree (the single descent that builds knownVars, promptSchemas, and the flat step list). validate-step.ts (~1,025 LoC) holds the per-step visitor: one validateStep(step, ctx) entry, a VALIDATORS: Record table with one row per variant, a validateExpr(expr, ...) dispatcher over the 8 Expr.kind values, and a single validateCallable(expr, ctx) helper that runs the five managed-call-shape checks once for both call (run) and ensure_call (ensure). Rule-vs-workflow differences are captured in a Scope value (WORKFLOW_SCOPE / RULE_SCOPE) with allowSteps (single set-lookup gate at the top of validateStep), runRefExpect, and withPromptSchemas. Every E_VALIDATE message and source location is preserved bit-for-bit. New tests in validate-visitor.test.ts pin the invariants: a ≤700-line cap on validate.ts (AC1), a JSON snapshot over every validate-* txtar fixture asserting each diagnostic's { code, line, col, message } bit-for-bit (AC3), and an "unknown step type" test asserting a synthetic variant produces exactly one internal: no validator for step type "…" diagnostic in both scopes (AC4). The diagnostics-collector fatal-allowlist test now sums throw jaiphError / diag.error counts across both files. Docs updated in docs/architecture.md, docs/contributing.md, and docs/grammar.md. Implements design/2026-05-15-parser-compiler-simplification.md § Refactor 4. Co-Authored-By: Claude Opus 4.7 (1M context) --- CHANGELOG.md | 1 + QUEUE.md | 30 - docs/architecture.md | 12 +- docs/contributing.md | 1 + docs/grammar.md | 4 +- src/transpile/diagnostics-collector.test.ts | 20 +- src/transpile/validate-step.ts | 1025 +++++++++++++++++ src/transpile/validate-visitor.test.ts | 289 +++++ src/transpile/validate.ts | 927 +-------------- .../validate-diagnostics-snapshot.json | 990 ++++++++++++++++ 10 files changed, 2379 insertions(+), 920 deletions(-) create mode 100644 src/transpile/validate-step.ts create mode 100644 src/transpile/validate-visitor.test.ts create mode 100644 test-fixtures/compiler-txtar/validate-diagnostics-snapshot.json diff --git a/CHANGELOG.md b/CHANGELOG.md index 777345f7..0d96e98c 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,6 @@ # Unreleased +- **Refactor — Replace the 1,441-line validator switch with a per-step visitor table indexed by scope:** `src/transpile/validate.ts` used to be one ~1,441-LoC function with two near-identical inner walkers (`validateRuleStep` ~250 lines, `validateStep` ~350 lines): every step type's validation was written twice with subtle differences, and the five-check call-shape sequence (`validateNoShellRedirection` → `validateNestedManagedCallArgs` → `validateRef` → `validateArity` → `validateBareIdentifierArgs`) was repeated by hand at 6+ sites per side — at least 12 places to keep in sync. Both inner walkers and every duplicated check site are gone. The validator now spans two files. `validate.ts` (~430 LoC) keeps the **outer** layer: import / channel-route / test-block checks and `walkStepTree` (the single descent that builds `{ knownVars, promptSchemas, flat }`). `validate-step.ts` (~1,025 LoC) holds the **per-step** visitor: a single `validateStep(step, ctx)` entry, a `VALIDATORS: Record` table with one row per step variant (`trivia`, `const`, `return`, `send`, `say`, `exec`, `if`, `for_lines`), one `validateExpr(expr, …)` dispatcher over the 8 `Expr.kind` values, and one `validateCallable(expr, ctx)` helper that runs the five managed-call-shape checks once for both `call` (`run`) and `ensure_call` (`ensure`) — parameterized by the scope's `runRefExpect` and the target kind. Rule-vs-workflow differences are captured in a `Scope` value (`WORKFLOW_SCOPE` / `RULE_SCOPE`) with three fields: `allowSteps: Set` (single set-lookup gate at the top of `validateStep` — rules reject `send` outright; rules also reject `prompt` and `run async` from inside `exec` bodies), `runRefExpect: RefExpectMessages` (workflow vs rule semantics for `run ref(…)`), and `withPromptSchemas: boolean` (workflows collect prompt-returning bindings, rules skip schema collection). `ValidatorCtx` threads the scope plus the precomputed `knownVars`, `promptSchemas`, and `recoverBindings` into every visitor — none of which are re-derived per step. Every existing `E_VALIDATE` error message and source location is preserved bit-for-bit: the entire `validate-*.test.ts` suite, `src/transpile/compiler-golden.test.ts`, `src/transpile/compiler-edge.acceptance.test.ts`, and the txtar / golden-AST corpora all pass unchanged. New acceptance tests in `src/transpile/validate-visitor.test.ts` pin the invariants: an LoC test caps `validate.ts` at **≤700 lines** so new per-step logic lands in `validate-step.ts` (AC1); a JSON snapshot over every `validate-*` txtar fixture (`test-fixtures/compiler-txtar/validate-errors.txt` + `validate-errors-multi-module.txt`) stored in `test-fixtures/compiler-txtar/validate-diagnostics-snapshot.json` asserts each diagnostic's `{ code, line, col, message }` against `collectDiagnostics(graph)` bit-for-bit (AC3 — refreshable via `UPDATE_SNAPSHOTS=1` only after confirming the message change is intentional); and an "unknown step type" test casts a synthetic step variant into `WorkflowStepDef`, runs `validateStep` in both `WORKFLOW_SCOPE` and `RULE_SCOPE`, and asserts each call produces exactly one diagnostic with the documented `internal: no validator for step type "…"` message (AC4 — proving that adding a new step type costs exactly one row in `VALIDATORS`). The existing `src/transpile/validate-single-walk.test.ts` still passes — `walkStepTree`'s internal `descend` remains the only recursive `WorkflowStepDef[]` walker in `validate.ts` (AC2). The `diagnostics-collector.test.ts` "fatal allowlist" scan now sums `throw jaiphError(` counts across `validate.ts` + `validate-step.ts` (both files are zero) and `diag.error(` counts likewise (≥40). User-visible contracts — CLI behavior, `jaiph format` round-trip, run artifacts, banner, hooks, exit codes, `__JAIPH_EVENT__` streaming, and the full golden corpus — are unchanged byte-for-byte. Out of scope: changes to validation rules (the *what* — this refactor only changes the *how*), parser changes, AST changes. Docs updated in `docs/architecture.md` (rewrote the **Validator** section to describe the two-file split, `VALIDATORS` table, `Scope` value, and the single `validateCallable` helper), `docs/contributing.md` (new **Validator visitor-table shape** row in the test-layer table), and `docs/grammar.md` (refreshed two stale `validateRuleStep` references to point at the new visitor / `RULE_SCOPE`). Implements `design/2026-05-15-parser-compiler-simplification.md` § Refactor 4. - **Refactor — Replace fail-fast errors with a `Diagnostics` collector that aggregates every recoverable error per compile:** Today `fail()` (in `src/parse/core.ts`) and `jaiphError()` (in `src/errors.ts`) both throw on the first error, so a user fixed one error, recompiled, hit the next, recompiled, and so on. The validator also pre-ordered some checks defensively because it knew it would only get to surface one error per run. That model is replaced. A new `Diagnostics` class lives in `src/diagnostics.ts` and exposes `add(d)`, `error(file, line, col, code, message)` (records the diagnostic and short-circuits the current unit through a `BailoutError`), `capture(fn)` (runs `fn` and absorbs both `BailoutError` and any thrown legacy `jaiphError` whose message parses as `path:line:col CODE message` — turning the throw into a recoverable entry without re-throwing), `hasErrors()` / `hasFatal()`, `sorted()` (stable order by file, then line, then column), `formatLines()` (one `path:line:col CODE message` per line), and a legacy `throwFirstIfAny()` bridge that throws the first sorted diagnostic via `jaiphError` so existing single-error call sites and per-error tests are unchanged. `src/transpile/validate.ts` exposes a new `collectDiagnostics(graph): Diagnostics` entry that walks the import closure and never throws on user-level errors; the previous `validateReferences(graph)` is now a thin wrapper that calls `collectDiagnostics` and then `throwFirstIfAny()`, preserving the throw-on-first contract for `emitScriptsForModuleFromGraph` / `buildScriptsFromGraph` and for every existing `parse-*.test.ts` / `validate-*.test.ts` fixture that asserts one specific `{ message, line, col, code }`. Inside `validate.ts` every `throw jaiphError(...)` site at user-level (~50 sites across import resolution, channel-route validation, per-rule and per-workflow step walks, prompt schema checks, and `validateTestBlocks`) is migrated to `diag.error(...)`; each top-level unit is wrapped in `diag.capture(...)` (per-import block, per-channel route, per-rule walk, per-rule step, per-workflow walk, per-workflow step, per-test-block step) so the bailout from one error unwinds only that unit and the next sibling still runs. The four leaf validation helpers (`validate-ref-resolution.ts`, `validate-string.ts`, `validate-prompt-schema.ts`, `shell-jaiph-guard.ts`) still throw via `jaiphError`, but every caller wraps them in `diag.capture(...)`, which converts the thrown error into a recoverable diagnostic and returns. The CLI command `jaiph compile` (`src/cli/commands/compile.ts`) is rewritten to route through `collectDiagnostics`: it accumulates every error from every entry's import closure, sorts them by `(file, line, col)`, and prints the full set — as a single JSON array on stdout under `--json`, or as one `path:line:col CODE message` line per diagnostic on stderr otherwise — exiting **1** on any non-empty diagnostic set. Fatal aborts during graph load or parsing (unterminated triple-quote, unterminated brace block, missing imports during graph build) are reported as a single diagnostic for the affected entry; the command then continues with the next entry. New tests in `src/transpile/diagnostics-collector.test.ts` pin the invariants: a fixture with three independent errors (duplicate import alias, undefined channel, unknown `run` target in one workflow body) asserts `collectDiagnostics(graph)` returns all three in source order (AC1); a source-tree scan asserts `validate.ts` holds **zero** `throw jaiphError(` sites and **≥40** `diag.error(` sites, and that every remaining `throw jaiphError(` under `src/` lives in the documented fatal allowlist — `src/diagnostics.ts` (legacy bridge), `src/parse/core.ts` (parser `fail()`), `src/cli/commands/test.ts` (test-file shape fatal), `src/transpile/module-graph.ts` (loader), `src/transpile/validate-string.ts`, `src/transpile/validate-prompt-schema.ts`, `src/transpile/validate-ref-resolution.ts`, `src/transpile/shell-jaiph-guard.ts` (leaf helpers, each captured) (AC3); and a CLI test runs `jaiph compile --json` against the same fixture and asserts the returned array has all three diagnostics and `status !== 0` (AC4). Existing single-error tests (every `parse-*.test.ts` and `validate-*.test.ts` that pins one specific `{ message, line, col, code }`) still pass because `validateReferences` continues to throw the first sorted diagnostic (AC2); `npm test` and `npm run build` pass (AC5). User-visible contracts on the `jaiph run` / `jaiph test` paths — banner, hooks, run artifacts, exit codes, `__JAIPH_EVENT__` streaming, and golden corpus — are unchanged. Out of scope: changing what counts as an error (this refactor only changes the *how*); LSP integration follows in a separate task. Docs updated in `docs/architecture.md` (new **Diagnostics collector (recoverable errors)** bullet under **Validator**; updated **System overview** to describe the two entry points and the new `jaiph compile` behavior), `docs/cli.md` (new **Multiple-error reporting** paragraph and refined **`--json`** description under **`jaiph compile`**), and `docs/contributing.md` (new **Diagnostics collector shape** row in the test-layer table). Implements `design/2026-05-15-parser-compiler-simplification.md` § Appendix B. - **Refactor — Fold the validator's three workflow pre-passes into a single step-tree walk:** `src/transpile/validate.ts` used to descend each workflow's / rule's step tree four times before its main check loop finished — `collectKnownVars`, `collectPromptSchemas`, `validateImmutableBindings`, and the per-step validator itself — each re-implementing the same recursion over `if` / `for_lines` / `catch` / `recover` with subtly different rules, so "what counts as a binding here" fixes had to land in two or three walkers. The three pre-pass helpers are deleted. One new helper `walkStepTree(filePath, steps, envDecls, params, declLoc, moduleScripts, parseSchemaFieldNames, { withPromptSchemas })` descends the tree once and returns `{ knownVars, promptSchemas, flat }`: it accumulates `knownVars` (env decls + params + every nested `const` / capture / `for_lines` iterator), `promptSchemas` (top-level prompt-returning bindings — workflow walks set `withPromptSchemas: true`, rule walks set it `false`), enforces immutable-binding and `script`-collision rules inline through a shared `bindings` map (with a fresh inner map under each `for_lines` body so loop iterators only shadow inside the body), and emits a flat `FlatStepEntry[]` of every step in tree order with the enclosing `catch` / `recover` failure binding (`recoverBindings: Set | undefined`) attached. The per-workflow and per-rule validator loops now iterate that flat list non-recursively — the `if` / `for_lines` / `catch` / `recover` recursion that used to live inside `validateStep` / `validateRuleStep` is gone. `walkStepTree`'s internal `descend` is the only recursive helper in the file that takes a `WorkflowStepDef[]`. Failure order matches the prior "binding errors first, then per-step errors" behavior because binding checks fire during the descent, before any flat-list iteration starts. Every existing `E_VALIDATE` error message and location is preserved bit-for-bit: the full `validate-*.test.ts` suite, `src/transpile/compiler-golden.test.ts`, `src/transpile/compiler-edge.acceptance.test.ts`, and the txtar / golden-AST corpora all pass unchanged. New tests pin the invariants: `src/transpile/validate-single-walk.test.ts` greps `validate.ts` and fails if any of `collectKnownVars`, `collectPromptSchemas`, or `validateImmutableBindings` reappear by name (AC1), and a textual AST scan asserts that at most one recursive helper whose parameter list mentions `WorkflowStepDef[]` exists in the file (AC2). User-visible contracts — CLI behavior, `jaiph format` round-trip, run artifacts, banner, hooks, exit codes, `__JAIPH_EVENT__` streaming, and the full golden corpus — are unchanged. Out of scope: the visitor-table refactor (Refactor 4) and any change to validation rules. Docs updated in `docs/architecture.md` (new **Single workflow walk** bullet under **Validator**) and `docs/contributing.md` (new **Validator single-walk shape** row in the test-layer table). Implements `design/2026-05-15-parser-compiler-simplification.md` § Appendix C. - **Refactor — Collapse the AST around a single `Expr` type, eliminating the three "managed call" encodings:** The same concept "a managed call that yields a value" used to be encoded three different ways: as a statement (`{ type: "run", workflow, args }`), as a const RHS (`{ kind: "run_capture", ref, args }`), and as a `managed:` sidecar on `return` / `log` / `logerr` whose `value` / `message` carried a placeholder string (`"__match__"`, `"run inline_script"`, etc.). Inline scripts added a fourth (`run_inline_script_capture`); `prompt`, `match`, and `ensure` captures repeated the same dual representation. The validator, formatter, emitter, and runtime each had to handle both branches at every site. All three encodings are gone. The semantic AST now has a single `Expr` tagged union — `literal | call | ensure_call | inline_script | prompt | match | shell | bare_ref` — used everywhere a value can appear: `const name = `, `return `, `send channel <- `, the message of `log` / `logerr` / `fail`, and the body of an `exec` step (the new statement-form managed call, where the value is consumed for its side effects plus optional capture). `ConstRhs` and `SendRhsDef` are deleted as separate types. The `managed:` sidecar field is deleted from `WorkflowStepDef`. The placeholder strings `"__match__"`, `"run inline_script"`, and `"__JAIPH_MANAGED__"` no longer appear anywhere under `src/`. `WorkflowStepDef` collapses from 14 variants to **8** (`exec`, `const`, `return`, `send`, `say`, `if`, `for_lines`, `trivia`): `exec` is the new managed-statement form covering the prior `run` / `ensure` / `run_inline_script` / `prompt` / `shell` / standalone `match` cases (the discriminator now lives inside `body.kind`, with `captureName` / `catch` / `recover` as step-level attributes); `say` covers the prior `log` / `logerr` / `fail` cases (`level: "fail"` aborts the workflow with the message, otherwise the message is written to the corresponding stream); `comment` / `blank_line` collapse into a single `trivia` variant (formatter-only, skipped by validator and runtime). The parser builds `Expr` nodes directly: `parseConstRhs` returns `{ value: Expr }`; `parseSendRhs` returns `{ value: Expr }`; `parsePromptStep` returns an `exec` step whose `body` is an `Expr.prompt`; `return run …` / `return ensure …` / `return match …` / `return run \`…\`(…)` build `Expr.call` / `Expr.ensure_call` / `Expr.match` / `Expr.inline_script` directly with no sidecar; `log run \`…\`(…)` and `logerr run \`…\`(…)` build `say` steps whose `message` is an `Expr.inline_script`. Downstream consumers compress accordingly: the validator switches on the 8-variant `WorkflowStepDef.type` and the 8-kind `Expr.kind` with no "literal value vs managed sidecar" fork; the formatter renders each `Expr` through one `emitExpr` helper instead of branching on a sidecar; the runtime has one private `evaluateExpr(scope, expr, …)` dispatcher that `const` / `return` / `send` / `say` / `exec` all delegate to (which runs the managed call for `call` / `ensure_call` / `inline_script`, walks `match` arms, schema-checks `prompt`, and interpolates `literal` via `interpolateWithCaptures`); the script-emit walk in `src/transpile/emit-script.ts` finds inline-script bodies by recursing into each step's `Expr` payload rather than enumerating the four legacy carriers. New tests pin the invariants: `src/types-shape.test.ts` is a compile-time exhaustive `switch` plus runtime tuple assertion that `WorkflowStepDef` has exactly **8** variants and `Expr` has exactly **8** kinds (AC2), a `grep` over every non-test `.ts` file under `src/` that fails if any of the placeholder strings (`"__match__"`, `"run inline_script"`, `"__JAIPH_MANAGED__"`) reappear (AC1), and an export-surface check that fails if `ConstRhs` or `SendRhsDef` are re-exported from `src/types.ts` (AC3). Updated parser tests in `src/parse/parse-return.test.ts`, `src/parse/parse-const-rhs.test.ts`, `src/parse/parse-prompt.test.ts`, `src/parse/parse-send-rhs.test.ts`, `src/parse/parse-steps.test.ts`, `src/parse/parse-inline-script.test.ts`, and `src/parse/parse-bare-call.test.ts` assert the new `Expr` shape directly for `return run …`, `return ensure …`, `return match … { … }`, `return run \`…\`(…)`, `log run \`…\`(…)`, and `const x = prompt …` (AC4). The golden corpus (`src/transpile/compiler-golden.test.ts`, `src/transpile/compiler-edge.acceptance.test.ts`) passes byte-for-byte against the emitted bash output; `src/format/roundtrip.test.ts` round-trips bit-for-bit on every fixture; `npm run build` passes with zero TypeScript strict-mode errors (AC5 / AC6). Golden AST fixtures under `test-fixtures/golden-ast/expected/` are regenerated to reflect the new step shapes (`exec` wrapping every managed call, `say` replacing `log` / `logerr` / `fail`, `trivia` replacing `comment` / `blank_line`, `Expr` value/message/body payloads). User-visible contracts — CLI behavior, `jaiph format` round-trip, run artifacts, banner, hooks, exit codes, `__JAIPH_EVENT__` streaming, and the full golden corpus — are unchanged byte-for-byte. Out of scope: surface syntax, the validator's deeper structural rewrite (Refactor 4), and parser internals (Refactors 1 & 2). Docs updated in `docs/architecture.md` (rewrote the **AST / Types** bullet to describe the single `Expr` sum and the 8-variant `WorkflowStepDef`; updated **Validator**, **Formatter**, **Node Workflow Runtime**, and **Trivia / CST layer** bullets to drop the dual-representation language; rewrote the `match_expr` mention in **CLI progress reporting pipeline** to use `Expr.kind === "match"`) and `docs/contributing.md` (new **`Expr` / step-variant shape** row in the test-layer table). Implements `design/2026-05-15-parser-compiler-simplification.md` § Refactor 3. diff --git a/QUEUE.md b/QUEUE.md index 49b066d8..f81f6046 100644 --- a/QUEUE.md +++ b/QUEUE.md @@ -13,36 +13,6 @@ Process rules: *** -## Replace the 1,441-line validator switch with a per-step visitor table indexed by scope #dev-ready - -**Design reference:** `design/2026-05-15-parser-compiler-simplification.md` § Refactor 4. - -**Why:** `src/transpile/validate.ts` is one function with two near-identical inner walkers (`validateRuleStep` ~250 lines, `validateStep` ~350 lines). Each step type's validation is written twice with subtle differences, and the 5-check sequence (`validateNoShellRedirection` → `validateNestedManagedCallArgs` → `validateRef` → `validateArity` → `validateBareIdentifierArgs`) is repeated by hand at 6+ sites per side — at least 12 places to keep in sync. - -**Scope:** - -- Replace the two inner walkers with a single AST visitor parameterized by a `Scope` value: - - `Scope` carries `allow: Set`, `refSpec: RefSpec`, and any other rule-vs-workflow differences. - - A `VALIDATORS: Record` table holds one validator per step type, written once. - - `validateCallStep("run" | "ensure")` is a single helper invoked by both `run` and `ensure` validators with different ref-spec / arity-kind arguments. -- The 5-check sequence is encapsulated in one helper (`validateManagedCallShape` or similar) invoked from each call-bearing validator. -- "Is this step allowed in this scope?" becomes a single set-lookup at the top of the visitor, not three throw sites. -- All existing error messages and error codes (`E_VALIDATE`, etc.) are preserved verbatim — both content and source location (line/col) must match what users see today. - -**Acceptance criteria** (each verified by a test): - -1. `src/transpile/validate.ts` is at most 700 lines (down from 1,441). Add a CI check (or test) that fails if it exceeds the bound. -2. `validateReferences` contains exactly one step-walking function. A grep test fails if a second walker is introduced. -3. Every `E_VALIDATE` error message and error location produced today is produced bit-for-bit by the new code. Add a snapshot-style test over every `validate-*.test.ts` fixture asserting `{ message, line, col, code }` matches the pre-refactor output. -4. Adding a new step type requires adding exactly one row to `VALIDATORS` and (if needed) updating the `Scope.allow` sets. Add a test that introduces a synthetic step type behind a test-only flag and asserts the validator rejects it with a single expected message until the row is added. -5. `npm test` passes (all of `validate-immutable-bindings.test.ts`, `validate-managed-calls.test.ts`, `validate-match.test.ts`, `validate-prompt-schema.test.ts`, `validate-ref-resolution.test.ts`, `validate-run-async.test.ts`, `validate-string.test.ts`, `validate-substitution.test.ts`, `validate-type-crossing.test.ts`, plus the golden corpus). - -**Out of scope:** changes to validation rules (the *what*) — this refactor only changes the *how*. Parser changes. AST changes (Refactor 3 must already be merged). - -**Dependency:** Refactor 3 (Expr collapse) and the single-pass-walk + Diagnostics tasks (previous two) must be complete first; otherwise the new visitor still needs to special-case the `managed:` sidecar and the pre-pass-walker pattern. - -*** - ## Decouple the validator from runtime semantics #dev-ready **Design reference:** `design/2026-05-15-parser-compiler-simplification.md` § Appendix E. diff --git a/docs/architecture.md b/docs/architecture.md index f9033424..fae2a109 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -50,12 +50,14 @@ All orchestration — local `jaiph run`, `jaiph test`, and **Docker `jaiph run`* - `Trivia` is a parallel store keyed by AST-node identity (per-node via `WeakMap`) and a small `ModuleTrivia` record for module-level data. The parser builds it alongside the AST; **only the formatter reads it**. Validator, emitter, transpiler, and runtime never import from `src/parse/trivia.ts` — a grep test (`src/parse/trivia-grep.test.ts`) pins this invariant by rejecting any reference to `Trivia` / `createTrivia` / `NodeTrivia` / `ModuleTrivia` from validator and emitter source files. - A separate type-shape test (`src/parse/trivia-ast-shape.test.ts`) asserts at compile time that none of the formatter-only fields reappear on `jaiphModule`, `ImportDef`, `ScriptImportDef`, `ChannelDef`, `TestBlockDef`, `WorkflowMetadata`, `ScriptDef`, or any `WorkflowStepDef` / `Expr` variant. (`ConstRhs` / `SendRhsDef` no longer exist — their fields live inside `Expr` — and `src/types-shape.test.ts` fails if those symbols reappear as exports of `src/types.ts`.) -- **Validator (`src/transpile/validate.ts`)** +- **Validator (`src/transpile/validate.ts` + `src/transpile/validate-step.ts`)** - Resolves imports and symbol references; emits deterministic compile-time errors. Import resolution (`resolveImportPath` in `transpile/resolve.ts`) checks relative paths first, then falls back to project-scoped libraries under `/.jaiph/libs/` — the workspace root is threaded through all compilation call sites. Export visibility is enforced by `validateRef` in `validate-ref-resolution.ts`: if an imported module declares any `export`, only exported names are reachable through the import alias. - - **Diagnostics collector (recoverable errors).** The validator no longer fails fast on the first user-level error. Every recoverable check appends to a `Diagnostics` collector (`src/diagnostics.ts`) via `diag.error(file, line, col, code, msg)`, which records a `JaiphDiagnostic` and short-circuits the current validation unit through a `BailoutError`. Each top-level unit (per-import block, per-rule walk, per-rule step, per-workflow walk, per-workflow step, per-test-block step, per-channel route) is wrapped in `diag.capture(fn)`, which absorbs the bailout (and any thrown `jaiphError` from leaf helpers like `validate-ref-resolution.ts` / `validate-string.ts` / `validate-prompt-schema.ts` / `shell-jaiph-guard.ts`) so the next sibling unit still runs. `collectDiagnostics(graph)` walks every module and returns the populated collector; the legacy `validateReferences(graph)` is now a thin wrapper that throws the first sorted diagnostic via `jaiphError` so existing per-error tests and the `emitScriptsForModuleFromGraph` path keep working unchanged. `Diagnostics.sorted()` returns errors ordered by `(file, line, col)`; `formatLines()` renders the standard `path:line:col CODE message` shape. A grep test (`src/transpile/diagnostics-collector.test.ts`) pins the migration: `validate.ts` holds **zero** `throw jaiphError(` sites, and the remaining `throw jaiphError(` call sites under `src/` are confined to a documented allowlist — fatal aborts in the parser (`src/parse/core.ts`), the loader (`src/transpile/module-graph.ts`), and the test-file shape check (`src/cli/commands/test.ts`); the legacy bridge in `src/diagnostics.ts`; and the four leaf validation helpers above, each of which has every caller wrapped in `diag.capture(...)`. - - The validator drives off `WorkflowStepDef.type` (8 variants) and `Expr.kind` (8 variants). For every value-bearing step (`const` / `return` / `send` / `say`) and for the body of every `exec` step, a single `validateExpr(expr, ...)` dispatcher handles the value: it routes `call` / `ensure_call` / `inline_script` to call-site validation, walks `match` arms, schema-checks `prompt`, and runs the substitution scanner on `literal` raws. There is no dual code path for "managed sidecar vs literal value" — that branch is gone. - - **Single workflow walk.** Each workflow / rule has its step tree descended exactly once by `walkStepTree`, which simultaneously accumulates `knownVars` (env decls + params + every nested `const` / capture / `for_lines` iterator), `promptSchemas` (top-level prompt-returning bindings, gated by `options.withPromptSchemas` so rules skip schema collection), enforces immutable-binding / `script`-collision rules inline (mutating a shared `bindings` map and threading a fresh inner map under each `for_lines` so loop iterators only shadow inside the body), and emits a flat `FlatStepEntry[]` of every step in tree order with the enclosing `catch` / `recover` failure binding attached. The main per-step validator loop iterates that flat list non-recursively, so `walkStepTree`'s internal `descend` is the **only** recursive helper in the file that takes a `WorkflowStepDef[]`. A pair of grep / AST tests (`src/transpile/validate-single-walk.test.ts`) pins both invariants: the prior helpers (`collectKnownVars`, `collectPromptSchemas`, `validateImmutableBindings`) cannot reappear by name, and at most one recursive `WorkflowStepDef[]` walker may live in `validate.ts`. - - Per call site the validator runs five checks against the typed **`Arg[]`** directly — shell-redirection rejection (only `literal` args are scanned), nested-unmanaged-call rejection inside `literal` raws, ref resolution, arity (`args.length` vs declared params), and `var`-arg resolution against in-scope bindings via `validateArgVarRefs`. There is no longer a separate `validateBareIdentifierArgs` helper, and no place re-parses an `args: string` payload by splitting on commas or rescanning quotes. + - **Two-file split.** `validate.ts` owns the **outer** layer: import / channel-route / test-block checks plus `walkStepTree` (the single descent that builds `{ knownVars, promptSchemas, flat }` for each workflow / rule). `validate-step.ts` owns the **per-step** visitor: one row per `WorkflowStepDef.type` in a `VALIDATORS: Record` table, a single `validateExpr` dispatcher over the 8 `Expr.kind` values, and the call-shape / channel / string-content helpers. `validate.ts` is bounded at **≤700 lines** (currently ~430) by a CI-style test in `src/transpile/validate-visitor.test.ts`; new validators belong in `validate-step.ts`. + - **Visitor table + scope.** Per-step validation has one entry point — `validateStep(step, ctx)` in `validate-step.ts`. It looks the step's `type` up in `VALIDATORS` (the dispatch table), then consults `ctx.scope.allowSteps` (a `Set`) once to decide whether this step is permitted in the current scope. Two scopes exist: `WORKFLOW_SCOPE` (allows every step variant including `send` and `prompt`) and `RULE_SCOPE` (rejects `send` outright; rejects `prompt` and `run async` from inside `exec` bodies). The scope also carries `runRefExpect` (`RUN_TARGET_REF_EXPECT` for workflows, `RUN_IN_RULE_REF_EXPECT` for rules) and `withPromptSchemas` (workflows collect prompt-returning bindings; rules skip schema collection). Adding a new step type requires exactly one row in `VALIDATORS` and, if the rule/workflow split needs to differ, an entry in `Scope.allowSteps` — an `AC4` test in `validate-visitor.test.ts` injects a synthetic step type and asserts it produces exactly one diagnostic with the documented `internal: no validator for step type "…"` message until the row is added. + - **Single managed-call-shape helper.** Every `call` / `ensure_call` site runs the same five checks against the typed `Arg[]` directly — shell-redirection rejection (only `literal` args are scanned), nested-unmanaged-call rejection inside `literal` raws, ref resolution (with the scope's `runRefExpect` for `call`, `RULE_REF_EXPECT` for `ensure_call`), arity (`args.length` vs declared params), and `var`-arg resolution against in-scope bindings via `validateArgVarRefs`. The sequence lives once in `validateCallable(expr, ctx)`; both `run` and `ensure` validators invoke it with a different ref expectation / target kind. There is no longer a separate `validateBareIdentifierArgs` helper, no per-site repetition of the five-step sequence, and no place re-parses an `args: string` payload by splitting on commas or rescanning quotes. + - **Diagnostics collector (recoverable errors).** The validator no longer fails fast on the first user-level error. Every recoverable check appends to a `Diagnostics` collector (`src/diagnostics.ts`) via `diag.error(file, line, col, code, msg)`, which records a `JaiphDiagnostic` and short-circuits the current validation unit through a `BailoutError`. Each top-level unit (per-import block, per-rule walk, per-rule step, per-workflow walk, per-workflow step, per-test-block step, per-channel route) is wrapped in `diag.capture(fn)`, which absorbs the bailout (and any thrown `jaiphError` from leaf helpers like `validate-ref-resolution.ts` / `validate-string.ts` / `validate-prompt-schema.ts` / `shell-jaiph-guard.ts`) so the next sibling unit still runs. `collectDiagnostics(graph)` walks every module and returns the populated collector; the legacy `validateReferences(graph)` is now a thin wrapper that throws the first sorted diagnostic via `jaiphError` so existing per-error tests and the `emitScriptsForModuleFromGraph` path keep working unchanged. `Diagnostics.sorted()` returns errors ordered by `(file, line, col)`; `formatLines()` renders the standard `path:line:col CODE message` shape. A grep test (`src/transpile/diagnostics-collector.test.ts`) pins the migration: `validate.ts` + `validate-step.ts` hold **zero** `throw jaiphError(` sites, and the remaining `throw jaiphError(` call sites under `src/` are confined to a documented allowlist — fatal aborts in the parser (`src/parse/core.ts`), the loader (`src/transpile/module-graph.ts`), and the test-file shape check (`src/cli/commands/test.ts`); the legacy bridge in `src/diagnostics.ts`; and the four leaf validation helpers above, each of which has every caller wrapped in `diag.capture(...)`. + - The validator drives off `WorkflowStepDef.type` (8 variants) and `Expr.kind` (8 variants). For every value-bearing step (`const` / `return` / `send` / `say`) and for the body of every `exec` step, a single `validateExpr(expr, ...)` dispatcher handles the value: it routes `call` / `ensure_call` / `inline_script` to call-site validation (`validateCallable`), walks `match` arms, schema-checks `prompt`, and runs the substitution scanner on `literal` raws. There is no dual code path for "managed sidecar vs literal value" — that branch is gone. + - **Single workflow walk.** Each workflow / rule has its step tree descended exactly once by `walkStepTree` (in `validate.ts`), which simultaneously accumulates `knownVars` (env decls + params + every nested `const` / capture / `for_lines` iterator), `promptSchemas` (top-level prompt-returning bindings, gated by `options.withPromptSchemas` so rules skip schema collection), enforces immutable-binding / `script`-collision rules inline (mutating a shared `bindings` map and threading a fresh inner map under each `for_lines` so loop iterators only shadow inside the body), and emits a flat `FlatStepEntry[]` of every step in tree order with the enclosing `catch` / `recover` failure binding attached. The main per-step validator loop iterates that flat list non-recursively and calls `validateStep` once per entry, so `walkStepTree`'s internal `descend` is the **only** recursive helper in `validate.ts` that takes a `WorkflowStepDef[]`. A pair of grep / AST tests (`src/transpile/validate-single-walk.test.ts`) pins both invariants: the prior helpers (`collectKnownVars`, `collectPromptSchemas`, `validateImmutableBindings`) cannot reappear by name, and at most one recursive `WorkflowStepDef[]` walker may live in `validate.ts`. - **Transpiler (`src/transpiler.ts`, `src/transpile/*`)** - **`emitScriptsForModuleFromGraph`** validates one module against the graph and runs **`buildScriptFiles`** — the only compile path for `jaiph run` / `jaiph test` — **persists only atomic `script` files** under `scripts/`. **`buildScripts(input, outDir, ws?)`** is the path-based wrapper used by tests and the directory walk; it loads a `ModuleGraph` and delegates. **`buildScriptsFromGraph(graph, outDir)`** is the graph-based entry point used by `jaiph run` / `jaiph test`, which already loaded the graph. Inline scripts (`` run `body`(args) ``) are also emitted as `scripts/__inline_` with deterministic hash-based names (`inlineScriptName` in `src/inline-script-name.ts`). There is no workflow-level bash emission. diff --git a/docs/contributing.md b/docs/contributing.md index 8c5e9e6e..0bac96df 100644 --- a/docs/contributing.md +++ b/docs/contributing.md @@ -106,6 +106,7 @@ Jaiph uses several test layers. Each layer catches a different class of bug. Use | **Call-args AST shape** | `src/parse/arg-ast-shape.test.ts`, `src/parse/arg-grep.test.ts` | Pins the typed-`Arg[]` invariant: no `bareIdentifierArgs` field on any call-bearing AST type (compile-time), no `args.split(",")` or `bareIdentifierArgs` text in production `src/parse/` or `src/transpile/` sources, and no `validateBareIdentifierArgs` helper in the validator | You changed how call arguments flow through the parser, validator, or emitter and need to confirm nothing re-introduces the parallel raw-string representation (see [Architecture — AST / Types](architecture.md#core-components)) | | **`Expr` / step-variant shape** | `src/types-shape.test.ts` | Pins the collapsed-AST invariants from the `Expr` refactor: exactly 8 `WorkflowStepDef` variants and exactly 8 `Expr` kinds (compile-time exhaustive switch + runtime tuple count), no AST placeholder strings (`"__match__"`, `"run inline_script"`, `"__JAIPH_MANAGED__"`) anywhere under `src/`, and `ConstRhs` / `SendRhsDef` no longer exported from `src/types.ts` | You added or renamed a step variant or `Expr` kind, or reshuffled how value positions are encoded; rerun this test to confirm nothing re-introduces the three-way "managed call" encoding (see [Architecture — AST / Types](architecture.md#core-components)) | | **Validator single-walk shape** | `src/transpile/validate-single-walk.test.ts` | Pins the validator's "one descent per workflow / rule" invariant: a name-grep fails if `collectKnownVars`, `collectPromptSchemas`, or `validateImmutableBindings` reappear as separate helpers in `validate.ts`, and a textual AST scan fails if more than one recursive helper walking `WorkflowStepDef[]` lives in that file | You touched `walkStepTree`, added a new pre-pass over workflow steps, or restructured how the validator accumulates `knownVars` / `promptSchemas` / `bindings` — rerun this test to confirm the step tree is still descended exactly once (see [Architecture — Validator](architecture.md#core-components)) | +| **Validator visitor-table shape** | `src/transpile/validate-visitor.test.ts` | Pins the per-step visitor refactor: an LoC test caps `validate.ts` at **≤700 lines** so new per-step logic lands in `validate-step.ts` instead of the outer entry; a `JSON` snapshot over every `validate-*` txtar fixture (stored in `test-fixtures/compiler-txtar/validate-diagnostics-snapshot.json`) pins each diagnostic's `{ code, line, col, message }` bit-for-bit, so any drift in wording or location across the visitor table fails the test; an "unknown step type" test injects a synthetic `WorkflowStepDef.type` and asserts it produces exactly one `internal: no validator for step type "…"` diagnostic in both `WORKFLOW_SCOPE` and `RULE_SCOPE` — proving adding a new step type costs exactly one row in `VALIDATORS` | You touched the `VALIDATORS` table, changed `validateStep` / `validateExpr` / `validateCallable` / `Scope`, added or renamed a per-step validator in `validate-step.ts`, or changed any `E_VALIDATE` message wording or source location — refresh the snapshot with `UPDATE_SNAPSHOTS=1` only after confirming the message change is intentional (see [Architecture — Validator](architecture.md#core-components)) | | **Diagnostics collector shape** | `src/transpile/diagnostics-collector.test.ts` | Pins the migration from fail-fast `throw jaiphError(...)` to the `Diagnostics` collector: a fixture with three independent errors (duplicate import alias, undefined channel, unknown `run` target) asserts that `collectDiagnostics(graph)` returns **all three** in source order; a source grep asserts `validate.ts` holds **zero** `throw jaiphError(` sites and many `diag.error(` sites; an allowlist scan over every non-test `*.ts` under `src/` rejects new `throw jaiphError(` sites outside the documented fatal subset (parser `fail()`, loader, test-file shape check, legacy bridge, four leaf helpers wrapped in `diag.capture(...)`); a CLI test asserts `jaiph compile --json` returns the full diagnostic array and exits non-zero | You added a new `throw jaiphError(...)` site, migrated more checks to the collector, changed the fatal/recoverable boundary, or changed `jaiph compile`'s exit-code or output shape (see [Architecture — Validator](architecture.md#core-components) and [CLI](cli.md)) | | **Compiler tests (txtar)** | `test-fixtures/compiler-txtar/*.txt` | Parse and validate outcomes — success, parse errors, validation errors — using language-agnostic txtar fixtures (hundreds of `===` cases across the four `*.txt` files) | You want a portable test case that can be reused by alternative compiler implementations; the test is a `.jh` input paired with an expected outcome | | **Golden AST tests** | `test-fixtures/golden-ast/fixtures/*.jh` + `test-fixtures/golden-ast/expected/*.json` | Parse tree shape for successful parses — serialized to deterministic JSON with locations stripped (9 fixtures: e.g. imports, brace-if, log, match and match-multiline, params, prompt-capture, run-ensure, script-defs) | You changed the parser and need to verify the AST structure hasn't drifted; txtar tests only check pass/fail, goldens lock in the actual tree shape | diff --git a/docs/grammar.md b/docs/grammar.md index 9de30995..ca4e973a 100644 --- a/docs/grammar.md +++ b/docs/grammar.md @@ -1064,12 +1064,12 @@ single_workflow_stmt = ensure_stmt | run_stmt | run_catch_stmt | run_recover_stm | send_stmt ; (* Actual catch/recover bodies use parseCatchStatement in src/parse/steps.ts: a richer subset than this sketch, including inline shell text for workflow recovery blocks — rule bodies still - reject unstructured shell via validateRuleStep. *) + reject unstructured shell via the visitor's RULE_SCOPE (validate-step.ts). *) ``` ## Validation Rules -After parsing, the compiler validates references and config (`src/transpile/validate.ts`). Error codes: +After parsing, the compiler validates references and config (`src/transpile/validate.ts` for the module-level entry plus the single workflow walk; `src/transpile/validate-step.ts` for the per-step visitor table). Error codes: - **E_PARSE:** Invalid syntax — duplicate config, invalid keys/values, `$(…)` or `${var:-fallback}` in orchestration strings, `${...}` interpolation in **single-line backtick** script bodies, `prompt … returns` without `const` capture, `name = prompt …` / assignment captures without `const` for `run`/`ensure`, bare `ref(args)` in const RHS (use `run`/`ensure`/`prompt`), `local` at top level, unrecognized workflow/rule line, invalid send RHS, arguments after `catch`, bare `catch` with no recovery step, nested inline captures, shell redirection after `run`/`ensure`, invalid parameter names (non-identifier, duplicate, or reserved keyword), or missing `{` on definition line. - **E_SCHEMA:** Invalid `returns` schema — empty, non-flat, unsupported type (only `string`, `number`, `boolean`). diff --git a/src/transpile/diagnostics-collector.test.ts b/src/transpile/diagnostics-collector.test.ts index 757a61f5..59b6a290 100644 --- a/src/transpile/diagnostics-collector.test.ts +++ b/src/transpile/diagnostics-collector.test.ts @@ -10,6 +10,7 @@ import { collectDiagnostics } from "./validate"; // Compiled test sits at dist/src/transpile/; the source tree is three levels up. const repoRoot = resolve(__dirname, "../../.."); const validatePath = resolve(repoRoot, "src/transpile/validate.ts"); +const validateStepPath = resolve(repoRoot, "src/transpile/validate-step.ts"); const cliJsPath = resolve(repoRoot, "dist/src/cli.js"); /** @@ -89,19 +90,26 @@ test("Diagnostics: collects 3 independent errors from one compile in source orde * exercise the throwing legacy bridge. */ test("Diagnostics: throwing call-sites match the documented fatal allowlist", () => { - const src = readFileSync(validatePath, "utf8"); - const throwCount = (src.match(/throw\s+jaiphError\(/g) ?? []).length; + const validateSrc = readFileSync(validatePath, "utf8"); + const validateStepSrc = readFileSync(validateStepPath, "utf8"); + const throwCount = + (validateSrc.match(/throw\s+jaiphError\(/g) ?? []).length + + (validateStepSrc.match(/throw\s+jaiphError\(/g) ?? []).length; assert.equal( throwCount, 0, - `expected validate.ts to use diag.error exclusively, found ${throwCount} throw jaiphError sites`, + `expected validate.ts + validate-step.ts to use diag.error exclusively, found ${throwCount} throw jaiphError sites`, ); - // Sanity: confirm the migration replaced rather than removed. - const diagErrorCount = (src.match(/diag\.error\(/g) ?? []).length; + // Sanity: confirm the migration replaced rather than removed. After Refactor 4 + // (visitor-table validator) the bulk of these sites moved into the sibling + // `validate-step.ts`, so count across both files. + const diagErrorCount = + (validateSrc.match(/diag\.error\(/g) ?? []).length + + (validateStepSrc.match(/diag\.error\(/g) ?? []).length; assert.ok( diagErrorCount >= 40, - `expected many diag.error sites, found ${diagErrorCount}`, + `expected many diag.error sites across validate.ts + validate-step.ts, found ${diagErrorCount}`, ); // The fatal allowlist: files where a `throw jaiphError(...)` is allowed diff --git a/src/transpile/validate-step.ts b/src/transpile/validate-step.ts new file mode 100644 index 00000000..a672e0c7 --- /dev/null +++ b/src/transpile/validate-step.ts @@ -0,0 +1,1025 @@ +/** + * Visitor table for the validator: one row per step type, one expression + * dispatcher, and the small per-call-shape helper that holds the five + * standard checks. `validateStep` is the only entry point — it consults + * `Scope.allowSteps` once and dispatches into `VALIDATORS`; everything below + * is scope-aware via the `ValidatorCtx`. + */ +import { Diagnostics } from "../diagnostics"; +import { matchSendOperator } from "../parse/core"; +import type { Arg, Expr, jaiphModule, MatchExprDef, WorkflowStepDef } from "../types"; +import { tripleQuotedRawForRuntime } from "../runtime/orchestration-text"; +import { + BARE_SEND_REF_MSG, + lookupKind, + RULE_REF_EXPECT, + RUN_IN_RULE_REF_EXPECT, + RUN_TARGET_REF_EXPECT, + validateRef, + WORKFLOW_REF_EXPECT, + type RefExpectMessages, + type RefResolutionContext, + type RefTargetKind, +} from "./validate-ref-resolution"; +import { validatePromptReturnsSchema, validatePromptStepReturns } from "./validate-prompt-schema"; +import { + validateManagedWorkflowShell, + type SubstitutionValidateEnv, +} from "./validate-substitution"; +import { + extractDotFieldRefs, + extractInlineCaptures, + validateFailString, + validateJaiphStringContent, + validateLogString, + validatePromptString, + validateReturnString, + validateSimpleInterpolationIdentifiers, +} from "./validate-string"; + +export interface Scope { + kind: "workflow" | "rule"; + /** Step types allowed in this scope — single set-lookup gate at the visitor entry. */ + allowSteps: Set; + /** Per-step-type message used when a step is rejected by `allowSteps`. */ + disallowStepMessages: Partial>; + /** Ref expectation for `run ref(...)` callees (workflow vs rule semantics differ). */ + runRefExpect: RefExpectMessages; + /** True for workflows — rules skip prompt schema collection and reject prompts. */ + withPromptSchemas: boolean; +} + +export const WORKFLOW_SCOPE: Scope = { + kind: "workflow", + allowSteps: new Set([ + "trivia", + "send", + "say", + "return", + "const", + "exec", + "if", + "for_lines", + ]), + disallowStepMessages: {}, + runRefExpect: RUN_TARGET_REF_EXPECT, + withPromptSchemas: true, +}; + +export const RULE_SCOPE: Scope = { + kind: "rule", + allowSteps: new Set(["trivia", "say", "return", "const", "exec", "if", "for_lines"]), + disallowStepMessages: { + send: "send is not allowed in rules", + }, + runRefExpect: RUN_IN_RULE_REF_EXPECT, + withPromptSchemas: false, +}; + +export interface ValidatorCtx { + diag: Diagnostics; + ast: jaiphModule; + refCtx: RefResolutionContext; + scope: Scope; + knownVars: Set; + promptSchemas: Map; + recoverBindings: Set | undefined; + localChannels: Set; + localScripts: Set; + localWorkflows: Set; + importsByAlias: Map; + importedAstCache: Map; +} + +type StepValidator = (s: WorkflowStepDef, ctx: ValidatorCtx) => void; + +const VALIDATORS: Record = { + trivia: () => {}, + const: validateConstStep, + return: validateReturnStep, + send: validateSendStep, + say: validateSayStep, + exec: validateExecStep, + if: validateIfStep, + for_lines: validateForLinesStep, +}; + +/** Sole entry for per-step validation. Scope gate first, table dispatch second. */ +export function validateStep(s: WorkflowStepDef, ctx: ValidatorCtx): void { + const v = (VALIDATORS as Record)[s.type]; + if (!v) { + const loc = (s as { loc?: { line: number; col: number } }).loc ?? { line: 0, col: 0 }; + ctx.diag.error( + ctx.ast.filePath, + loc.line, + loc.col, + "E_VALIDATE", + `internal: no validator for step type "${(s as { type: string }).type}"`, + ); + } + if (!ctx.scope.allowSteps.has(s.type)) { + const msg = ctx.scope.disallowStepMessages[s.type]; + if (msg !== undefined) { + const loc = (s as { loc: { line: number; col: number } }).loc; + ctx.diag.error(ctx.ast.filePath, loc.line, loc.col, "E_VALIDATE", msg); + } + return; + } + v(s, ctx); +} + +// -- Per-step validators ---------------------------------------------------- + +function validateConstStep(s: WorkflowStepDef, ctx: ValidatorCtx): void { + if (s.type !== "const") return; + validateExpr(s.value, s.loc, "const", ctx); +} + +function validateReturnStep(s: WorkflowStepDef, ctx: ValidatorCtx): void { + if (s.type !== "return") return; + validateExpr(s.value, s.loc, "return", ctx); +} + +function validateSendStep(s: WorkflowStepDef, ctx: ValidatorCtx): void { + if (s.type !== "send") return; + validateChannelRef(s.channel, s.loc, ctx); + validateExpr(s.value, s.loc, "send", ctx); +} + +function validateSayStep(s: WorkflowStepDef, ctx: ValidatorCtx): void { + if (s.type !== "say") return; + if (s.level === "log" || s.level === "logerr") { + if (s.message.kind === "inline_script") return; + if (s.message.kind === "literal") { + validateLogString(s.message.raw, ctx.ast.filePath, s.loc.line, s.loc.col, s.level); + const inner = s.message.raw; + validateInlineStringCaptures(inner, s.loc, ctx); + if (ctx.scope.withPromptSchemas) { + validateDotFieldRefs(inner, s.loc, ctx); + } + validateSimpleInterpolationIdentifiers( + inner, + ctx.ast.filePath, + s.loc.line, + s.loc.col, + s.level, + ctx.knownVars, + ctx.scope.kind, + ctx.scope.withPromptSchemas ? ctx.promptSchemas : undefined, + ctx.recoverBindings, + ctx.localScripts, + ); + return; + } + ctx.diag.error( + ctx.ast.filePath, + s.loc.line, + s.loc.col, + "E_VALIDATE", + `unsupported ${s.level} message form`, + ); + } + if (s.message.kind !== "literal") { + ctx.diag.error( + ctx.ast.filePath, + s.loc.line, + s.loc.col, + "E_VALIDATE", + "fail message must be a literal string", + ); + } + validateFailString(s.message.raw, ctx.ast.filePath, s.loc.line, s.loc.col); + const failInner = semanticQuotedOrchestrationInner(s.message.raw); + validateInlineStringCaptures(failInner, s.loc, ctx); + if (ctx.scope.withPromptSchemas) { + validateDotFieldRefs(failInner, s.loc, ctx); + } + validateSimpleInterpolationIdentifiers( + failInner, + ctx.ast.filePath, + s.loc.line, + s.loc.col, + "fail", + ctx.knownVars, + ctx.scope.kind, + ctx.scope.withPromptSchemas ? ctx.promptSchemas : undefined, + ctx.recoverBindings, + ctx.localScripts, + ); +} + +function validateExecStep(s: WorkflowStepDef, ctx: ValidatorCtx): void { + if (s.type !== "exec") return; + const body = s.body; + if (body.kind === "prompt") { + if (ctx.scope.kind === "rule") { + ctx.diag.error( + ctx.ast.filePath, + body.loc.line, + body.loc.col, + "E_VALIDATE", + "prompt is not allowed in rules", + ); + } + validateExpr(body, s.loc, "const", ctx); + validatePromptStepReturns(body, s.captureName, ctx.ast.filePath); + return; + } + if (body.kind === "shell") { + if (ctx.scope.kind === "rule") { + ctx.diag.error( + ctx.ast.filePath, + body.loc.line, + body.loc.col, + "E_VALIDATE", + "inline shell steps are forbidden in rules; use explicit script blocks", + ); + } + validateWorkflowShellExec(body, ctx); + return; + } + if (body.kind === "call" && body.async && ctx.scope.kind === "rule") { + ctx.diag.error( + ctx.ast.filePath, + body.callee.loc.line, + body.callee.loc.col, + "E_VALIDATE", + "run async is not allowed in rules; use it in workflows only", + ); + } + validateExpr(body, s.loc, "exec", ctx); +} + +function validateIfStep(s: WorkflowStepDef, ctx: ValidatorCtx): void { + if (s.type !== "if") return; + if (s.operand.kind === "regex") { + try { + new RegExp(s.operand.source); + } catch { + ctx.diag.error( + ctx.ast.filePath, + s.loc.line, + s.loc.col, + "E_VALIDATE", + `invalid regex in if condition: /${s.operand.source}/`, + ); + } + } +} + +function validateForLinesStep(s: WorkflowStepDef, ctx: ValidatorCtx): void { + if (s.type !== "for_lines") return; + if (!ctx.knownVars.has(s.sourceVar)) { + ctx.diag.error( + ctx.ast.filePath, + s.loc.line, + s.loc.col, + "E_VALIDATE", + `for ... in : "${s.sourceVar}" is not a known variable in this scope`, + ); + } +} + +// -- Expr dispatcher -------------------------------------------------------- + +type ExprLabel = "const" | "return" | "send" | "exec"; + +function validateExpr( + expr: Expr, + stepLoc: { line: number; col: number }, + label: ExprLabel, + ctx: ValidatorCtx, +): void { + if (expr.kind === "literal") { + validateLiteralExpr(expr, stepLoc, label, ctx); + return; + } + if (expr.kind === "call" || expr.kind === "ensure_call") { + validateCallable(expr, ctx); + return; + } + if (expr.kind === "inline_script") { + return; + } + if (expr.kind === "match") { + validateMatchExpr(ctx.diag, ctx.ast.filePath, expr.match, ctx.knownVars); + return; + } + if (expr.kind === "prompt") { + validatePromptExpr(expr, stepLoc, label, ctx); + return; + } + if (expr.kind === "bare_ref") { + if (label !== "send") { + ctx.diag.error( + ctx.ast.filePath, + expr.ref.loc.line, + expr.ref.loc.col, + "E_VALIDATE", + "bare reference is only valid as a send payload", + ); + } + validateRef(expr.ref, ctx.ast, ctx.refCtx, { + mode: "bare_send_rhs", + bareSend: BARE_SEND_REF_MSG, + lookupImportedKind: makeImportedKindLookup(ctx), + }); + return; + } + if (expr.kind === "shell") { + if (label !== "send") { + ctx.diag.error( + ctx.ast.filePath, + expr.loc.line, + expr.loc.col, + "E_VALIDATE", + "raw shell fragment is only valid as a send payload", + ); + } + validateManagedWorkflowShell(expr.command, makeSubEnv(ctx, expr.loc)); + return; + } +} + +function validateLiteralExpr( + expr: Extract, + stepLoc: { line: number; col: number }, + label: ExprLabel, + ctx: ValidatorCtx, +): void { + if (label === "send") { + const inner = expr.raw.startsWith('"') && expr.raw.endsWith('"') ? expr.raw.slice(1, -1) : expr.raw; + validateJaiphStringContent(inner, ctx.ast.filePath, stepLoc.line, stepLoc.col, "send"); + validateInlineStringCaptures(inner, stepLoc, ctx); + validateDotFieldRefs(inner, stepLoc, ctx); + validateSimpleInterpolationIdentifiers( + inner, + ctx.ast.filePath, + stepLoc.line, + stepLoc.col, + "send", + ctx.knownVars, + ctx.scope.kind, + ctx.promptSchemas, + ctx.recoverBindings, + ctx.localScripts, + ); + return; + } + if (label === "return") { + validateReturnString(expr.raw, ctx.ast.filePath, stepLoc.line, stepLoc.col); + if (expr.raw.startsWith('"')) { + const retInner = stripDQ(expr.raw); + validateInlineStringCaptures(retInner, stepLoc, ctx); + if (ctx.scope.withPromptSchemas) { + validateDotFieldRefs(retInner, stepLoc, ctx); + } + validateSimpleInterpolationIdentifiers( + retInner, + ctx.ast.filePath, + stepLoc.line, + stepLoc.col, + "return", + ctx.knownVars, + ctx.scope.kind, + ctx.scope.withPromptSchemas ? ctx.promptSchemas : undefined, + ctx.recoverBindings, + ctx.localScripts, + ); + } + return; + } + // const / exec — same string-content handling + const scriptName = extractConstScriptName(expr.raw); + if (scriptName && ctx.localScripts.has(scriptName)) { + ctx.diag.error( + ctx.ast.filePath, + stepLoc.line, + stepLoc.col, + "E_VALIDATE", + `scripts are not values; "${scriptName}" is a script definition`, + ); + } + const inner = stripDQ(expr.raw); + validateInlineStringCaptures(inner, stepLoc, ctx); + if (ctx.scope.withPromptSchemas) { + validateDotFieldRefs(inner, stepLoc, ctx); + } + validateSimpleInterpolationIdentifiers( + inner, + ctx.ast.filePath, + stepLoc.line, + stepLoc.col, + "const", + ctx.knownVars, + ctx.scope.kind, + ctx.scope.withPromptSchemas ? ctx.promptSchemas : undefined, + ctx.recoverBindings, + ctx.localScripts, + ); +} + +function validatePromptExpr( + expr: Extract, + stepLoc: { line: number; col: number }, + label: ExprLabel, + ctx: ValidatorCtx, +): void { + if (ctx.scope.kind === "rule") { + ctx.diag.error( + ctx.ast.filePath, + stepLoc.line, + stepLoc.col, + "E_VALIDATE", + "const ... = prompt is not allowed in rules", + ); + } + if (label !== "const" && label !== "exec") { + ctx.diag.error( + ctx.ast.filePath, + stepLoc.line, + stepLoc.col, + "E_VALIDATE", + `prompt is not a valid ${label} value`, + ); + } + const promptIdent = promptBareIdentifier(expr.raw); + if (promptIdent && ctx.localScripts.has(promptIdent)) { + ctx.diag.error( + ctx.ast.filePath, + stepLoc.line, + stepLoc.col, + "E_VALIDATE", + `scripts are not promptable; "${promptIdent}" is a script — use a string const instead`, + ); + } + validatePromptString(expr.raw, ctx.ast.filePath, stepLoc.line, stepLoc.col); + if (expr.returns !== undefined) { + validatePromptReturnsSchema(expr.returns, ctx.ast.filePath, stepLoc.line, stepLoc.col); + } + const pcInner = stripDQ(expr.raw); + validateInlineStringCaptures(pcInner, stepLoc, ctx); + validateDotFieldRefs(pcInner, stepLoc, ctx); + validateSimpleInterpolationIdentifiers( + pcInner, + ctx.ast.filePath, + stepLoc.line, + stepLoc.col, + "prompt", + ctx.knownVars, + ctx.scope.kind, + ctx.promptSchemas, + ctx.recoverBindings, + ctx.localScripts, + ); +} + +// -- Managed call shape (the "5-check sequence") ---------------------------- + +/** + * The five checks every call site repeats: shell-redirection, nested-unmanaged + * call inside literals, ref resolution, arity, and var-arg resolution. The + * scope picks the ref expectation for `run` (workflow vs rule semantics). + */ +function validateCallable(expr: Expr, ctx: ValidatorCtx): void { + if (expr.kind === "call") { + const loc = expr.callee.loc; + validateNoShellRedirection(ctx.diag, ctx.ast.filePath, loc, "run", expr.args); + validateNestedManagedCallArgs(ctx.diag, ctx.ast.filePath, loc, expr.args); + const isRuleScope = ctx.scope.kind === "rule"; + if ( + !expr.callee.value.includes(".") && + ctx.knownVars.has(expr.callee.value) && + !ctx.localScripts.has(expr.callee.value) && + !(!isRuleScope && ctx.localWorkflows.has(expr.callee.value)) + ) { + ctx.diag.error( + ctx.ast.filePath, + loc.line, + loc.col, + "E_VALIDATE", + `strings are not executable; "${expr.callee.value}" is a string — use a script instead`, + ); + } + validateRef(expr.callee, ctx.ast, ctx.refCtx, { + mode: "expect", + expect: ctx.scope.runRefExpect, + }); + validateArity(ctx.diag, ctx.ast.filePath, loc, expr.callee.value, expr.args, "workflow", ctx.ast, ctx.refCtx); + validateArgVarRefs(ctx.diag, ctx.ast.filePath, loc, expr.args, ctx.knownVars, ctx.recoverBindings); + return; + } + if (expr.kind === "ensure_call") { + const loc = expr.callee.loc; + validateNoShellRedirection(ctx.diag, ctx.ast.filePath, loc, "ensure", expr.args); + validateNestedManagedCallArgs(ctx.diag, ctx.ast.filePath, loc, expr.args); + validateRef(expr.callee, ctx.ast, ctx.refCtx, { mode: "expect", expect: RULE_REF_EXPECT }); + validateArity(ctx.diag, ctx.ast.filePath, loc, expr.callee.value, expr.args, "rule", ctx.ast, ctx.refCtx); + validateArgVarRefs(ctx.diag, ctx.ast.filePath, loc, expr.args, ctx.knownVars, ctx.recoverBindings); + } +} + +// -- Match expression ------------------------------------------------------- + +export function validateMatchExpr( + diag: Diagnostics, + filePath: string, + expr: MatchExprDef, + knownVars: Set, +): void { + if (expr.arms.length === 0) { + diag.error(filePath, expr.loc.line, expr.loc.col, "E_VALIDATE", "match must have at least one arm"); + } + let wildcardCount = 0; + for (const arm of expr.arms) { + if (arm.pattern.kind === "wildcard") wildcardCount += 1; + if (arm.pattern.kind === "regex") { + try { + new RegExp(arm.pattern.source); + } catch { + diag.error( + filePath, + expr.loc.line, + expr.loc.col, + "E_VALIDATE", + `invalid regex in match pattern: /${arm.pattern.source}/`, + ); + } + } + const bodyTrimmed = (arm.tripleQuotedBody ? tripleQuotedRawForRuntime(arm.body) : arm.body).trimStart(); + if (/^return(\s|$)/.test(bodyTrimmed)) { + diag.error( + filePath, + expr.loc.line, + expr.loc.col, + "E_VALIDATE", + `match arm body must not start with "return"; the match expression itself produces the value — use the expression directly after =>`, + ); + } + if (/`[^`]*`\s*\(/.test(bodyTrimmed) || bodyTrimmed.startsWith("```")) { + diag.error( + filePath, + expr.loc.line, + expr.loc.col, + "E_VALIDATE", + `inline scripts are not allowed in match arm bodies; use a named script with "run script_name(…)" instead`, + ); + } + if (!arm.tripleQuotedBody) { + const idMatch = bodyTrimmed.match(/^([A-Za-z_][A-Za-z0-9_]*)/); + if (idMatch) { + const ident = idMatch[1]!; + const after = bodyTrimmed.slice(ident.length); + const startsCall = after.startsWith("("); + const startsArgs = /^\s+\S/.test(after); + if ((startsCall || startsArgs) && ident !== "fail" && ident !== "run" && ident !== "ensure") { + const hint = ident === "error" ? ` did you mean "fail"?` : ""; + diag.error( + filePath, + expr.loc.line, + expr.loc.col, + "E_VALIDATE", + `unknown match arm verb "${ident}"; allowed: fail "...", run ref(...), ensure ref(...).${hint}`, + ); + } + if (!startsCall && !startsArgs && after.trim() === "" && !knownVars.has(ident)) { + diag.error( + filePath, + expr.loc.line, + expr.loc.col, + "E_VALIDATE", + `unknown identifier "${ident}" in match arm body; declare it with "const", use a capture, or add a parameter`, + ); + } + } + } + } + if (wildcardCount === 0) { + diag.error(filePath, expr.loc.line, expr.loc.col, "E_VALIDATE", "match must have exactly one wildcard (_) arm"); + } + if (wildcardCount > 1) { + diag.error( + filePath, + expr.loc.line, + expr.loc.col, + "E_VALIDATE", + "match must have exactly one wildcard (_) arm, found multiple", + ); + } +} + +// -- Workflow shell exec (workflow-only body kind) -------------------------- + +function validateWorkflowShellExec( + body: Extract, + ctx: ValidatorCtx, +): void { + if (hasUnquotedSendArrow(body.command) && matchSendOperator(body.command) === null) { + ctx.diag.error( + ctx.ast.filePath, + body.loc.line, + body.loc.col, + "E_VALIDATE", + "invalid send: channel must be a single name or `alias.name` (at most one dot in the channel part)", + ); + } + const t = body.command.trim(); + if (/^(?:[A-Za-z_][A-Za-z0-9_]*)(?:\.[A-Za-z_][A-Za-z0-9_]*)*$/.test(t)) { + if (!t.includes(".")) { + if (ctx.localScripts.has(t) || ctx.localWorkflows.has(t)) { + ctx.diag.error( + ctx.ast.filePath, + body.loc.line, + body.loc.col, + "E_VALIDATE", + `use run ${t}() — a bare name that refers to a script or workflow must use a managed run step`, + ); + } + } else { + validateRef({ value: t, loc: body.loc }, ctx.ast, ctx.refCtx, { + mode: "expect", + expect: RUN_TARGET_REF_EXPECT, + }); + ctx.diag.error( + ctx.ast.filePath, + body.loc.line, + body.loc.col, + "E_VALIDATE", + `use run ${t}() — "${t}" is a valid script or workflow reference; use a managed run step`, + ); + } + } +} + +// -- Channel/route helpers -------------------------------------------------- + +function validateChannelRef(channel: string, loc: { line: number; col: number }, ctx: ValidatorCtx): void { + const parts = channel.split("."); + if (parts.length === 1) { + if (!ctx.localChannels.has(channel)) { + ctx.diag.error(ctx.ast.filePath, loc.line, loc.col, "E_VALIDATE", `Channel "${channel}" is not defined`); + } + return; + } + if (parts.length !== 2) { + ctx.diag.error(ctx.ast.filePath, loc.line, loc.col, "E_VALIDATE", `Channel "${channel}" is not defined`); + } + const [alias, importedChannel] = parts; + const importedFile = ctx.importsByAlias.get(alias); + if (!importedFile) { + ctx.diag.error(ctx.ast.filePath, loc.line, loc.col, "E_VALIDATE", `Channel "${channel}" is not defined`); + } + const importedAst = ctx.importedAstCache.get(importedFile)!; + const importedChannels = new Set(importedAst.channels.map((c) => c.name)); + if (!importedChannels.has(importedChannel)) { + ctx.diag.error(ctx.ast.filePath, loc.line, loc.col, "E_VALIDATE", `Channel "${channel}" is not defined`); + } +} + +export const ROUTE_REF_EXPECT: RefExpectMessages = WORKFLOW_REF_EXPECT; + +export function resolveRouteTargetParams( + ref: string, + ast: jaiphModule, + refCtx: RefResolutionContext, +): number | undefined { + const dotIdx = ref.indexOf("."); + if (dotIdx >= 0) { + const alias = ref.slice(0, dotIdx); + const name = ref.slice(dotIdx + 1); + const importPath = refCtx.importsByAlias.get(alias); + if (!importPath) return undefined; + const importedAst = refCtx.importedAstCache.get(importPath); + if (!importedAst) return undefined; + const wf = importedAst.workflows.find((w) => w.name === name); + return wf?.params.length; + } + const wf = ast.workflows.find((w) => w.name === ref); + return wf?.params.length; +} + +// -- Inline string captures / dot-field refs -------------------------------- + +function validateInlineStringCaptures( + content: string, + loc: { line: number; col: number }, + ctx: ValidatorCtx, +): void { + for (const cap of extractInlineCaptures(content)) { + if (cap.kind === "run") { + validateNoShellRedirection(ctx.diag, ctx.ast.filePath, loc, "run", cap.args); + validateRef({ value: cap.ref, loc }, ctx.ast, ctx.refCtx, { + mode: "expect", + expect: ctx.scope.runRefExpect, + }); + } else { + validateNoShellRedirection(ctx.diag, ctx.ast.filePath, loc, "ensure", cap.args); + validateRef({ value: cap.ref, loc }, ctx.ast, ctx.refCtx, { + mode: "expect", + expect: RULE_REF_EXPECT, + }); + } + } +} + +function validateDotFieldRefs( + content: string, + loc: { line: number; col: number }, + ctx: ValidatorCtx, +): void { + for (const ref of extractDotFieldRefs(content)) { + const fields = ctx.promptSchemas.get(ref.varName); + if (!fields) { + ctx.diag.error( + ctx.ast.filePath, + loc.line, + loc.col, + "E_VALIDATE", + `\${${ref.varName}.${ref.fieldName}}: "${ref.varName}" is not a typed prompt capture; dot notation requires a prompt with "returns" schema`, + ); + } + if (!fields.includes(ref.fieldName)) { + ctx.diag.error( + ctx.ast.filePath, + loc.line, + loc.col, + "E_VALIDATE", + `\${${ref.varName}.${ref.fieldName}}: field "${ref.fieldName}" is not defined in the returns schema for "${ref.varName}"; available fields: ${fields.join(", ")}`, + ); + } + } +} + +// -- Shared call-shape helpers ---------------------------------------------- + +function hasShellRedirection(args: Arg[] | undefined): boolean { + if (!args) return false; + for (const a of args) { + if (a.kind !== "literal") continue; + let inQuote = false; + const raw = a.raw; + for (let i = 0; i < raw.length; i++) { + const ch = raw[i]; + if (ch === '"' && (i === 0 || raw[i - 1] !== "\\")) { + inQuote = !inQuote; + continue; + } + if (!inQuote && (ch === ">" || ch === "|" || ch === "&")) { + return true; + } + } + } + return false; +} + +export function validateNoShellRedirection( + diag: Diagnostics, + filePath: string, + loc: { line: number; col: number }, + keyword: string, + args: Arg[] | undefined, +): void { + if (!hasShellRedirection(args)) return; + diag.error( + filePath, + loc.line, + loc.col, + "E_VALIDATE", + `shell redirection (>, >>, |, &) is not supported with ${keyword}; use a script block for shell operations`, + ); +} + +function validateNestedManagedCallArgs( + diag: Diagnostics, + filePath: string, + loc: { line: number; col: number }, + args: Arg[] | undefined, +): void { + if (!args) return; + for (const a of args) { + if (a.kind !== "literal") continue; + checkNestedManagedInLiteral(diag, filePath, loc, a.raw); + } +} + +function checkNestedManagedInLiteral( + diag: Diagnostics, + filePath: string, + loc: { line: number; col: number }, + raw: string, +): void { + const stripped = stripQuotedSegmentContent(raw); + const re = /\b([A-Za-z_][A-Za-z0-9_.]*)\s*\(/g; + let match: RegExpExecArray | null; + while ((match = re.exec(stripped)) !== null) { + const before = stripped.slice(0, match.index).trimEnd(); + const lastToken = before.length === 0 ? "" : before.slice(before.lastIndexOf(" ") + 1); + if (lastToken === "run" || lastToken === "ensure") continue; + diag.error( + filePath, + loc.line, + loc.col, + "E_VALIDATE", + `nested managed calls in argument position must be explicit; use "run ${match[1]}(...)" or "ensure ${match[1]}(...)" inside the argument list`, + ); + } + const btRe = /`[^`]*`\s*\(/g; + let btMatch: RegExpExecArray | null; + while ((btMatch = btRe.exec(stripped)) !== null) { + const before = stripped.slice(0, btMatch.index).trimEnd(); + const lastToken = before.length === 0 ? "" : before.slice(before.lastIndexOf(" ") + 1); + if (lastToken === "run" || lastToken === "ensure") continue; + diag.error( + filePath, + loc.line, + loc.col, + "E_VALIDATE", + `nested inline script calls in argument position must be explicit; use "run \`...\`(...)" inside the argument list`, + ); + } +} + +function stripQuotedSegmentContent(segment: string): string { + let out = ""; + let quote: "'" | '"' | null = null; + for (let i = 0; i < segment.length; i += 1) { + const ch = segment[i]!; + if (quote) { + if (ch === quote && segment[i - 1] !== "\\") { + quote = null; + } + out += " "; + continue; + } + if (ch === "'" || ch === '"') { + quote = ch; + out += " "; + continue; + } + out += ch; + } + return out; +} + +function validateArgVarRefs( + diag: Diagnostics, + filePath: string, + loc: { line: number; col: number }, + args: Arg[] | undefined, + knownVars: Set, + recoverBindings?: Set, +): void { + if (!args) return; + for (const a of args) { + if (a.kind !== "var") continue; + if (recoverBindings?.has(a.name)) continue; + if (knownVars.has(a.name)) continue; + diag.error( + filePath, + loc.line, + loc.col, + "E_VALIDATE", + `unknown identifier "${a.name}" used as bare argument; declare it with "const", use a capture, or add a workflow/rule parameter`, + ); + } +} + +function validateArity( + diag: Diagnostics, + filePath: string, + loc: { line: number; col: number }, + ref: string, + args: Arg[] | undefined, + targetKind: "workflow" | "rule", + ast: jaiphModule, + refCtx: RefResolutionContext, +): void { + const params = lookupCalleeParams(ref, targetKind, ast, refCtx); + if (params === undefined) return; + const argCount = args?.length ?? 0; + if (argCount !== params.length) { + diag.error( + filePath, + loc.line, + loc.col, + "E_VALIDATE", + `${targetKind} "${ref}" expects ${params.length} argument(s) (${params.join(", ") || "none"}), but got ${argCount}`, + ); + } +} + +function lookupCalleeParams( + ref: string, + targetKind: "workflow" | "rule", + ast: jaiphModule, + refCtx: RefResolutionContext, +): string[] | undefined { + const parts = ref.split("."); + if (parts.length === 1) { + const name = parts[0]; + if (targetKind === "workflow") { + const wf = ast.workflows.find((w) => w.name === name); + return wf?.params; + } + const rl = ast.rules.find((r) => r.name === name); + return rl?.params; + } + if (parts.length === 2) { + const [alias, name] = parts; + const importedFile = refCtx.importsByAlias.get(alias); + if (!importedFile) return undefined; + const importedAst = refCtx.importedAstCache.get(importedFile); + if (!importedAst) return undefined; + if (targetKind === "workflow") { + const wf = importedAst.workflows.find((w) => w.name === name); + return wf?.params; + } + const rl = importedAst.rules.find((r) => r.name === name); + return rl?.params; + } + return undefined; +} + +// -- Misc small helpers ----------------------------------------------------- + +function hasUnquotedSendArrow(line: string): boolean { + let inSingleQuote = false; + let inDoubleQuote = false; + for (let i = 0; i < line.length; i += 1) { + const ch = line[i]; + if (ch === "\\" && (inDoubleQuote || inSingleQuote)) { + i += 1; + continue; + } + if (ch === "'" && !inDoubleQuote) { + inSingleQuote = !inSingleQuote; + continue; + } + if (ch === '"' && !inSingleQuote) { + inDoubleQuote = !inDoubleQuote; + continue; + } + if (!inSingleQuote && !inDoubleQuote && ch === "<" && line[i + 1] === "-") { + return true; + } + } + return false; +} + +function stripDQ(s: string): string { + return s.length >= 2 && s[0] === '"' && s[s.length - 1] === '"' ? s.slice(1, -1) : s; +} + +function semanticQuotedOrchestrationInner(dqRaw: string): string { + return stripDQ(dqRaw); +} + +function extractConstScriptName(rhs: string): string | undefined { + const trimmed = rhs.trim(); + if (/^[a-zA-Z_][a-zA-Z0-9_]*$/.test(trimmed)) return trimmed; + const inner = stripDQ(trimmed); + const m = inner.match(/^\$\{([a-zA-Z_][a-zA-Z0-9_]*)\}$/); + return m?.[1]; +} + +function promptBareIdentifier(raw: string): string | undefined { + const m = raw.match(/^"\$\{([A-Za-z_][A-Za-z0-9_]*)\}"$/); + return m?.[1]; +} + +export function parseSchemaFieldNames(rawSchema: string): string[] { + const inner = rawSchema.trim().replace(/^\s*\{\s*/, "").replace(/\s*\}\s*$/, "").trim(); + if (!inner) return []; + const names: string[] = []; + for (const part of inner.split(",")) { + const m = part.trim().match(/^\s*([A-Za-z_][A-Za-z0-9_]*)\s*:\s*\S+\s*$/); + if (m) names.push(m[1]); + } + return names; +} + +function makeImportedKindLookup( + ctx: ValidatorCtx, +): (alias: string, name: string) => RefTargetKind | undefined { + return (alias, name) => { + const importedFile = ctx.importsByAlias.get(alias); + if (!importedFile) return undefined; + const importedAst = ctx.importedAstCache.get(importedFile)!; + return lookupKind(importedAst, name); + }; +} + +function makeSubEnv( + ctx: ValidatorCtx, + loc: { line: number; col: number }, +): SubstitutionValidateEnv { + return { + filePath: ctx.ast.filePath, + loc, + localRules: new Set(ctx.ast.rules.map((r) => r.name)), + localWorkflows: ctx.localWorkflows, + localScripts: ctx.localScripts, + importsByAlias: ctx.importsByAlias, + lookupImported: makeImportedKindLookup(ctx), + }; +} diff --git a/src/transpile/validate-visitor.test.ts b/src/transpile/validate-visitor.test.ts new file mode 100644 index 00000000..222d6efa --- /dev/null +++ b/src/transpile/validate-visitor.test.ts @@ -0,0 +1,289 @@ +/** + * Acceptance tests for Refactor 4 (visitor-table validator). + * + * AC1 — `src/transpile/validate.ts` is at most 700 lines. + * AC3 — Diagnostic snapshot over every txtar `validate-*` error fixture pins + * `{ code, line, col, message }` bit-for-bit. + * AC4 — Adding a new step type requires exactly one row in `VALIDATORS`: a + * synthetic step type injected via type cast is rejected with the + * documented "internal: no validator" message and produces exactly + * one diagnostic. + */ +import test from "node:test"; +import assert from "node:assert/strict"; +import { + existsSync, + mkdtempSync, + readFileSync, + rmSync, + writeFileSync, +} from "node:fs"; +import { join, resolve } from "node:path"; +import { tmpdir } from "node:os"; +import { Diagnostics } from "../diagnostics"; +import { loadModuleGraph } from "./module-graph"; +import { collectDiagnostics } from "./validate"; +import { + RULE_SCOPE, + WORKFLOW_SCOPE, + validateStep, + type ValidatorCtx, +} from "./validate-step"; +import type { jaiphModule, WorkflowStepDef } from "../types"; + +const repoRoot = resolve(__dirname, "../../.."); +const validatePath = resolve(repoRoot, "src/transpile/validate.ts"); + +// --- AC1: file size bound ------------------------------------------------- + +test("AC1: validate.ts is at most 700 lines", () => { + const text = readFileSync(validatePath, "utf8"); + const lineCount = text.split("\n").length; + assert.ok( + lineCount <= 700, + `validate.ts is ${lineCount} lines (limit 700). The visitor-table refactor (Refactor 4) bounds this file; new validators belong in validate-step.ts.`, + ); +}); + +// --- AC3: diagnostic snapshot -------------------------------------------- + +interface TxtarTestCase { + name: string; + files: Map; +} + +function parseTxtar(content: string): TxtarTestCase[] { + const cases: TxtarTestCase[] = []; + const blocks = content.split(/^=== /m); + for (const block of blocks) { + const trimmed = block.trim(); + if (!trimmed) continue; + const lines = trimmed.split("\n"); + const name = lines[0].trim(); + let fileStartIdx = -1; + for (let i = 1; i < lines.length; i += 1) { + if (lines[i].startsWith("--- ")) { + fileStartIdx = i; + break; + } + } + if (fileStartIdx < 0) continue; + cases.push({ name, files: parseVirtualFiles(lines.slice(fileStartIdx)) }); + } + return cases; +} + +function parseVirtualFiles(lines: string[]): Map { + const files = new Map(); + let cur: string | undefined; + let buf: string[] = []; + for (const line of lines) { + if (line.startsWith("--- ")) { + if (cur !== undefined) files.set(cur, buf.join("\n") + "\n"); + cur = line.slice(4).trim(); + buf = []; + } else { + buf.push(line); + } + } + if (cur !== undefined) files.set(cur, buf.join("\n") + "\n"); + return files; +} + +function entryFile(files: Map): string { + if (files.has("main.jh")) return "main.jh"; + if (files.has("input.jh")) return "input.jh"; + if (files.has("input.test.jh")) return "input.test.jh"; + const first = files.keys().next().value; + if (!first) throw new Error("no virtual files"); + return first; +} + +interface SnapshotEntry { + file: string; + line: number; + col: number; + code: string; + message: string; +} +type Snapshot = Record; + +function captureSnapshot(): Snapshot { + const fixturesDir = resolve(repoRoot, "test-fixtures/compiler-txtar"); + const out: Snapshot = {}; + const files = ["validate-errors.txt", "validate-errors-multi-module.txt"]; + for (const fileName of files) { + const content = readFileSync(join(fixturesDir, fileName), "utf8"); + for (const tc of parseTxtar(content)) { + const key = `${fileName} > ${tc.name}`; + const tmpDir = mkdtempSync(join(tmpdir(), "jaiph-snap-")); + try { + for (const [name, body] of tc.files) { + writeFileSync(join(tmpDir, name), body, "utf8"); + } + const entry = join(tmpDir, entryFile(tc.files)); + let diagnostics: SnapshotEntry[] = []; + try { + const graph = loadModuleGraph(entry); + const diag = collectDiagnostics(graph); + diagnostics = diag.sorted().map((d) => ({ + file: relativizeTmp(d.file, tmpDir), + line: d.line, + col: d.col, + code: d.code, + message: scrubTmp(d.message, tmpDir), + })); + } catch (e) { + // Fatal parser/loader error — capture as a synthetic diagnostic row + // so the snapshot still pins the failure mode. + const msg = (e as Error).message ?? String(e); + const m = msg.match(/^(.+):(\d+):(\d+) (\S+) ([\s\S]+)$/); + diagnostics = [ + m + ? { + file: relativizeTmp(m[1], tmpDir), + line: Number(m[2]), + col: Number(m[3]), + code: m[4], + message: scrubTmp(m[5], tmpDir), + } + : { + file: "", + line: 0, + col: 0, + code: "E_FATAL", + message: scrubTmp(msg, tmpDir), + }, + ]; + } + out[key] = diagnostics; + } finally { + rmSync(tmpDir, { recursive: true, force: true }); + } + } + } + return out; +} + +function relativizeTmp(p: string, tmpDir: string): string { + if (p.startsWith(tmpDir)) { + const rel = p.slice(tmpDir.length); + return rel.replace(/^[\/]+/, ""); + } + return p; +} + +/** Replace `/...` substrings in error messages with `/...` so the snapshot is stable across runs. */ +function scrubTmp(msg: string, tmpDir: string): string { + const escaped = tmpDir.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"); + return msg.replace(new RegExp(escaped, "g"), ""); +} + +test("AC3: validate-* fixtures diagnostic snapshot pins {code, line, col, message}", () => { + const snapshotPath = resolve( + repoRoot, + "test-fixtures/compiler-txtar/validate-diagnostics-snapshot.json", + ); + const current = captureSnapshot(); + + if (process.env.UPDATE_SNAPSHOTS === "1" || !existsSync(snapshotPath)) { + writeFileSync(snapshotPath, JSON.stringify(current, null, 2) + "\n", "utf8"); + return; + } + const stored = JSON.parse(readFileSync(snapshotPath, "utf8")) as Snapshot; + assert.deepEqual( + current, + stored, + "diagnostic output drifted from snapshot. Re-run with UPDATE_SNAPSHOTS=1 only after confirming the change is intentional.", + ); +}); + +// --- AC4: unknown step type rejection ------------------------------------- + +test("AC4: unknown step type is rejected with the documented 'no validator' diagnostic (one error)", () => { + const ast: jaiphModule = { + filePath: "/synthetic.jh", + imports: [], + channels: [], + exports: [], + rules: [], + scripts: [], + workflows: [], + }; + const diag = new Diagnostics(); + const ctx: ValidatorCtx = { + diag, + ast, + refCtx: { + importsByAlias: new Map(), + importedAstCache: new Map(), + localRules: new Set(), + localWorkflows: new Set(), + localScripts: new Set(), + }, + scope: WORKFLOW_SCOPE, + knownVars: new Set(), + promptSchemas: new Map(), + recoverBindings: undefined, + localChannels: new Set(), + localScripts: new Set(), + localWorkflows: new Set(), + importsByAlias: new Map(), + importedAstCache: new Map(), + }; + + const syntheticStep = { + type: "ZZZ_synthetic_step_type", + loc: { line: 42, col: 7 }, + } as unknown as WorkflowStepDef; + + diag.capture(() => validateStep(syntheticStep, ctx)); + const errs = diag.sorted(); + assert.equal(errs.length, 1, `expected exactly one diagnostic, got ${JSON.stringify(errs)}`); + assert.equal(errs[0].code, "E_VALIDATE"); + assert.equal(errs[0].line, 42); + assert.equal(errs[0].col, 7); + assert.match(errs[0].message, /^internal: no validator for step type "ZZZ_synthetic_step_type"$/); +}); + +test("AC4: same synthetic step type is rejected in RULE_SCOPE too (scope-independent fallback)", () => { + const ast: jaiphModule = { + filePath: "/synthetic.jh", + imports: [], + channels: [], + exports: [], + rules: [], + scripts: [], + workflows: [], + }; + const diag = new Diagnostics(); + const ctx: ValidatorCtx = { + diag, + ast, + refCtx: { + importsByAlias: new Map(), + importedAstCache: new Map(), + localRules: new Set(), + localWorkflows: new Set(), + localScripts: new Set(), + }, + scope: RULE_SCOPE, + knownVars: new Set(), + promptSchemas: new Map(), + recoverBindings: undefined, + localChannels: new Set(), + localScripts: new Set(), + localWorkflows: new Set(), + importsByAlias: new Map(), + importedAstCache: new Map(), + }; + const syntheticStep = { + type: "ZZZ_synthetic_step_type", + loc: { line: 3, col: 1 }, + } as unknown as WorkflowStepDef; + + diag.capture(() => validateStep(syntheticStep, ctx)); + const errs = diag.sorted(); + assert.equal(errs.length, 1); + assert.match(errs[0].message, /^internal: no validator for step type "ZZZ_synthetic_step_type"$/); +}); diff --git a/src/transpile/validate.ts b/src/transpile/validate.ts index ef222d1e..02a7d26a 100644 --- a/src/transpile/validate.ts +++ b/src/transpile/validate.ts @@ -1,179 +1,18 @@ import { existsSync } from "node:fs"; import { dirname, resolve } from "node:path"; import { Diagnostics } from "../diagnostics"; -import type { Arg, Expr, jaiphModule, MatchExprDef, WorkflowStepDef } from "../types"; +import type { Expr, jaiphModule, WorkflowStepDef } from "../types"; import type { ModuleGraph } from "./module-graph"; -import type { SubstitutionValidateEnv } from "./validate-substitution"; -import { validateManagedWorkflowShell } from "./validate-substitution"; -import type { RefResolutionContext, RefTargetKind } from "./validate-ref-resolution"; +import { validateRef } from "./validate-ref-resolution"; import { - BARE_SEND_REF_MSG, - lookupKind, - RULE_REF_EXPECT, - RUN_IN_RULE_REF_EXPECT, - RUN_TARGET_REF_EXPECT, - validateRef, - WORKFLOW_REF_EXPECT, -} from "./validate-ref-resolution"; -import { - validatePromptString, - validateLogString, - validateFailString, - validateReturnString, - validateJaiphStringContent, - validateSimpleInterpolationIdentifiers, - extractInlineCaptures, - extractDotFieldRefs, -} from "./validate-string"; -import { validatePromptReturnsSchema, validatePromptStepReturns } from "./validate-prompt-schema"; -import { matchSendOperator } from "../parse/core"; -import { tripleQuotedRawForRuntime } from "../runtime/orchestration-text"; - -/** True when `<-` appears outside quotes (same idea as `matchSendOperator`). */ -function hasUnquotedSendArrow(line: string): boolean { - let inSingleQuote = false; - let inDoubleQuote = false; - for (let i = 0; i < line.length; i += 1) { - const ch = line[i]; - if (ch === "\\" && (inDoubleQuote || inSingleQuote)) { - i += 1; - continue; - } - if (ch === "'" && !inDoubleQuote) { - inSingleQuote = !inSingleQuote; - continue; - } - if (ch === '"' && !inSingleQuote) { - inDoubleQuote = !inDoubleQuote; - continue; - } - if (!inSingleQuote && !inDoubleQuote && ch === "<" && line[i + 1] === "-") { - return true; - } - } - return false; -} - -/** Check if any literal arg contains unquoted shell redirection operators (>, >>, |, &). */ -function hasShellRedirection(args: Arg[] | undefined): boolean { - if (!args) return false; - for (const a of args) { - if (a.kind !== "literal") continue; - let inQuote = false; - const raw = a.raw; - for (let i = 0; i < raw.length; i++) { - const ch = raw[i]; - if (ch === '"' && (i === 0 || raw[i - 1] !== "\\")) { - inQuote = !inQuote; - continue; - } - if (!inQuote && (ch === ">" || ch === "|" || ch === "&")) { - return true; - } - } - } - return false; -} - -function validateNoShellRedirection( - diag: Diagnostics, - filePath: string, - loc: { line: number; col: number }, - keyword: string, - args: Arg[] | undefined, -): void { - if (!hasShellRedirection(args)) return; - diag.error( - filePath, - loc.line, - loc.col, - "E_VALIDATE", - `shell redirection (>, >>, |, &) is not supported with ${keyword}; use a script block for shell operations`, - ); -} - -function validateMatchExpr( - diag: Diagnostics, - filePath: string, - expr: MatchExprDef, - knownVars: Set, -): void { - if (expr.arms.length === 0) { - diag.error(filePath, expr.loc.line, expr.loc.col, "E_VALIDATE", "match must have at least one arm"); - } - let wildcardCount = 0; - for (const arm of expr.arms) { - if (arm.pattern.kind === "wildcard") { - wildcardCount += 1; - } - if (arm.pattern.kind === "regex") { - try { - new RegExp(arm.pattern.source); - } catch { - diag.error( - filePath, - expr.loc.line, - expr.loc.col, - "E_VALIDATE", - `invalid regex in match pattern: /${arm.pattern.source}/`, - ); - } - } - const bodyTrimmed = (arm.tripleQuotedBody ? tripleQuotedRawForRuntime(arm.body) : arm.body).trimStart(); - if (/^return(\s|$)/.test(bodyTrimmed)) { - diag.error( - filePath, - expr.loc.line, - expr.loc.col, - "E_VALIDATE", - `match arm body must not start with "return"; the match expression itself produces the value — use the expression directly after =>`, - ); - } - if (/`[^`]*`\s*\(/.test(bodyTrimmed) || bodyTrimmed.startsWith("```")) { - diag.error( - filePath, - expr.loc.line, - expr.loc.col, - "E_VALIDATE", - `inline scripts are not allowed in match arm bodies; use a named script with "run script_name(…)" instead`, - ); - } - if (!arm.tripleQuotedBody) { - const idMatch = bodyTrimmed.match(/^([A-Za-z_][A-Za-z0-9_]*)/); - if (idMatch) { - const ident = idMatch[1]!; - const after = bodyTrimmed.slice(ident.length); - const startsCall = after.startsWith("("); - const startsArgs = /^\s+\S/.test(after); - if ((startsCall || startsArgs) && ident !== "fail" && ident !== "run" && ident !== "ensure") { - const hint = ident === "error" ? ` did you mean "fail"?` : ""; - diag.error( - filePath, - expr.loc.line, - expr.loc.col, - "E_VALIDATE", - `unknown match arm verb "${ident}"; allowed: fail "...", run ref(...), ensure ref(...).${hint}`, - ); - } - if (!startsCall && !startsArgs && after.trim() === "" && !knownVars.has(ident)) { - diag.error( - filePath, - expr.loc.line, - expr.loc.col, - "E_VALIDATE", - `unknown identifier "${ident}" in match arm body; declare it with "const", use a capture, or add a parameter`, - ); - } - } - } - } - if (wildcardCount === 0) { - diag.error(filePath, expr.loc.line, expr.loc.col, "E_VALIDATE", "match must have exactly one wildcard (_) arm"); - } - if (wildcardCount > 1) { - diag.error(filePath, expr.loc.line, expr.loc.col, "E_VALIDATE", "match must have exactly one wildcard (_) arm, found multiple"); - } -} + parseSchemaFieldNames, + resolveRouteTargetParams, + ROUTE_REF_EXPECT, + RULE_SCOPE, + validateStep, + WORKFLOW_SCOPE, + type ValidatorCtx, +} from "./validate-step"; /** * One step entry in the flat list built by the single workflow walk. @@ -194,11 +33,6 @@ interface FlatStepEntry { * every step in tree order. The flat list is what the main validator loop * iterates over — that loop is non-recursive, so the only recursive helper * walking `WorkflowStepDef[]` in this file is `walkStepTree` itself. - * - * Replaces three prior pre-passes that each walked the same step tree with - * subtly different recursion rules. Immutable-binding rules are enforced - * inline during the descent so the failure order matches the prior - * "binding errors first, then per-step errors" behavior. */ interface StepTreeWalk { knownVars: Set; @@ -214,7 +48,6 @@ function walkStepTree( params: string[], declLoc: { line: number; col: number }, moduleScripts: Set, - parseSchemaFieldNames: (rawSchema: string) => string[], options: { withPromptSchemas: boolean }, ): StepTreeWalk { const knownVars = new Set(); @@ -224,9 +57,7 @@ function walkStepTree( if (envDecls) { for (const d of envDecls) knownVars.add(d.name); } - for (const p of params) { - knownVars.add(p); - } + for (const p of params) knownVars.add(p); const seedBindings = new Map(); for (const p of params) { @@ -335,177 +166,6 @@ function execBodyLoc(body: Expr): { line: number; col: number } | undefined { return undefined; } -function lookupCalleeParams( - ref: string, - targetKind: "workflow" | "rule", - ast: jaiphModule, - refCtx: RefResolutionContext, -): string[] | undefined { - const parts = ref.split("."); - if (parts.length === 1) { - const name = parts[0]; - if (targetKind === "workflow") { - const wf = ast.workflows.find((w) => w.name === name); - return wf?.params; - } - const rl = ast.rules.find((r) => r.name === name); - return rl?.params; - } - if (parts.length === 2) { - const [alias, name] = parts; - const importedFile = refCtx.importsByAlias.get(alias); - if (!importedFile) return undefined; - const importedAst = refCtx.importedAstCache.get(importedFile); - if (!importedAst) return undefined; - if (targetKind === "workflow") { - const wf = importedAst.workflows.find((w) => w.name === name); - return wf?.params; - } - const rl = importedAst.rules.find((r) => r.name === name); - return rl?.params; - } - return undefined; -} - -function validateArity( - diag: Diagnostics, - filePath: string, - loc: { line: number; col: number }, - ref: string, - args: Arg[] | undefined, - targetKind: "workflow" | "rule", - ast: jaiphModule, - refCtx: RefResolutionContext, -): void { - const params = lookupCalleeParams(ref, targetKind, ast, refCtx); - if (params === undefined) return; - const argCount = args?.length ?? 0; - if (argCount !== params.length) { - diag.error( - filePath, - loc.line, - loc.col, - "E_VALIDATE", - `${targetKind} "${ref}" expects ${params.length} argument(s) (${params.join(", ") || "none"}), but got ${argCount}`, - ); - } -} - -function validateArgVarRefs( - diag: Diagnostics, - filePath: string, - loc: { line: number; col: number }, - args: Arg[] | undefined, - knownVars: Set, - recoverBindings?: Set, -): void { - if (!args) return; - for (const a of args) { - if (a.kind !== "var") continue; - if (recoverBindings?.has(a.name)) continue; - if (knownVars.has(a.name)) continue; - diag.error( - filePath, - loc.line, - loc.col, - "E_VALIDATE", - `unknown identifier "${a.name}" used as bare argument; declare it with "const", use a capture, or add a workflow/rule parameter`, - ); - } -} - -function validateNestedManagedCallArgs( - diag: Diagnostics, - filePath: string, - loc: { line: number; col: number }, - args: Arg[] | undefined, -): void { - if (!args) return; - for (const a of args) { - if (a.kind !== "literal") continue; - checkNestedManagedInLiteral(diag, filePath, loc, a.raw); - } -} - -function checkNestedManagedInLiteral( - diag: Diagnostics, - filePath: string, - loc: { line: number; col: number }, - raw: string, -): void { - const stripped = stripQuotedSegmentContent(raw); - const re = /\b([A-Za-z_][A-Za-z0-9_.]*)\s*\(/g; - let match: RegExpExecArray | null; - while ((match = re.exec(stripped)) !== null) { - const before = stripped.slice(0, match.index).trimEnd(); - const lastToken = before.length === 0 ? "" : before.slice(before.lastIndexOf(" ") + 1); - if (lastToken === "run" || lastToken === "ensure") continue; - diag.error( - filePath, - loc.line, - loc.col, - "E_VALIDATE", - `nested managed calls in argument position must be explicit; use "run ${match[1]}(...)" or "ensure ${match[1]}(...)" inside the argument list`, - ); - } - const btRe = /`[^`]*`\s*\(/g; - let btMatch: RegExpExecArray | null; - while ((btMatch = btRe.exec(stripped)) !== null) { - const before = stripped.slice(0, btMatch.index).trimEnd(); - const lastToken = before.length === 0 ? "" : before.slice(before.lastIndexOf(" ") + 1); - if (lastToken === "run" || lastToken === "ensure") continue; - diag.error( - filePath, - loc.line, - loc.col, - "E_VALIDATE", - `nested inline script calls in argument position must be explicit; use "run \`...\`(...)" inside the argument list`, - ); - } -} - -function stripQuotedSegmentContent(segment: string): string { - let out = ""; - let quote: "'" | '"' | null = null; - for (let i = 0; i < segment.length; i += 1) { - const ch = segment[i]!; - if (quote) { - if (ch === quote && segment[i - 1] !== "\\") { - quote = null; - } - out += " "; - continue; - } - if (ch === "'" || ch === '"') { - quote = ch; - out += " "; - continue; - } - out += ch; - } - return out; -} - -function resolveRouteTargetParams( - ref: string, - ast: jaiphModule, - refCtx: RefResolutionContext, -): number | undefined { - const dotIdx = ref.indexOf("."); - if (dotIdx >= 0) { - const alias = ref.slice(0, dotIdx); - const name = ref.slice(dotIdx + 1); - const importPath = refCtx.importsByAlias.get(alias); - if (!importPath) return undefined; - const importedAst = refCtx.importedAstCache.get(importPath); - if (!importedAst) return undefined; - const wf = importedAst.workflows.find((w) => w.name === name); - return wf?.params.length; - } - const wf = ast.workflows.find((w) => w.name === ref); - return wf?.params.length; -} - export function resolveScriptImportPath(fromFile: string, importPath: string): string { return resolve(dirname(fromFile), importPath); } @@ -609,7 +269,7 @@ export function validateModuleInto( }); } - const refCtx: RefResolutionContext = { + const refCtx = { importsByAlias, importedAstCache, localRules, @@ -617,308 +277,16 @@ export function validateModuleInto( localScripts, }; - const expectRuleRef = { mode: "expect" as const, expect: RULE_REF_EXPECT }; - const expectWorkflowRef = { mode: "expect" as const, expect: WORKFLOW_REF_EXPECT }; - const expectRunInRuleRef = { mode: "expect" as const, expect: RUN_IN_RULE_REF_EXPECT }; - const expectRunTargetRef = { mode: "expect" as const, expect: RUN_TARGET_REF_EXPECT }; - - const lookupImportedKind = (alias: string, name: string): RefTargetKind | undefined => { - const importedFile = importsByAlias.get(alias); - if (!importedFile) return undefined; - const importedAst = importedAstCache.get(importedFile)!; - return lookupKind(importedAst, name); - }; - - const bareSendRefSpec = { - mode: "bare_send_rhs" as const, - bareSend: BARE_SEND_REF_MSG, - lookupImportedKind, - }; - - const makeSubEnv = (loc: { line: number; col: number }): SubstitutionValidateEnv => ({ - filePath: ast.filePath, - loc, - localRules, - localWorkflows, + const baseCtx = { + diag, + ast, + refCtx, + localChannels, localScripts, + localWorkflows, importsByAlias, - lookupImported: lookupImportedKind, - }); - - const stripDQ = (s: string): string => - s.length >= 2 && s[0] === '"' && s[s.length - 1] === '"' ? s.slice(1, -1) : s; - - const extractConstScriptName = (rhs: string): string | undefined => { - const trimmed = rhs.trim(); - if (/^[a-zA-Z_][a-zA-Z0-9_]*$/.test(trimmed)) return trimmed; - const inner = stripDQ(trimmed); - const m = inner.match(/^\$\{([a-zA-Z_][a-zA-Z0-9_]*)\}$/); - return m?.[1]; - }; - - const semanticQuotedOrchestrationInner = (dqRaw: string): string => stripDQ(dqRaw); - - const promptBareIdentifier = (raw: string): string | undefined => { - const m = raw.match(/^"\$\{([A-Za-z_][A-Za-z0-9_]*)\}"$/); - return m?.[1]; - }; - - const parseSchemaFieldNames = (rawSchema: string): string[] => { - const inner = rawSchema.trim().replace(/^\s*\{\s*/, "").replace(/\s*\}\s*$/, "").trim(); - if (!inner) return []; - const names: string[] = []; - for (const part of inner.split(",")) { - const m = part.trim().match(/^\s*([A-Za-z_][A-Za-z0-9_]*)\s*:\s*\S+\s*$/); - if (m) names.push(m[1]); - } - return names; - }; - - const validateDotFieldRefs = ( - content: string, - loc: { line: number; col: number }, - promptSchemas: Map, - ): void => { - for (const ref of extractDotFieldRefs(content)) { - const fields = promptSchemas.get(ref.varName); - if (!fields) { - diag.error( - ast.filePath, - loc.line, - loc.col, - "E_VALIDATE", - `\${${ref.varName}.${ref.fieldName}}: "${ref.varName}" is not a typed prompt capture; dot notation requires a prompt with "returns" schema`, - ); - } - if (!fields.includes(ref.fieldName)) { - diag.error( - ast.filePath, - loc.line, - loc.col, - "E_VALIDATE", - `\${${ref.varName}.${ref.fieldName}}: field "${ref.fieldName}" is not defined in the returns schema for "${ref.varName}"; available fields: ${fields.join(", ")}`, - ); - } - } - }; - - const validateWorkflowStringCaptures = (content: string, loc: { line: number; col: number }): void => { - for (const cap of extractInlineCaptures(content)) { - if (cap.kind === "run") { - validateNoShellRedirection(diag, ast.filePath, loc, "run", cap.args); - validateRef({ value: cap.ref, loc }, ast, refCtx, expectRunTargetRef); - } else { - validateNoShellRedirection(diag, ast.filePath, loc, "ensure", cap.args); - validateRef({ value: cap.ref, loc }, ast, refCtx, expectRuleRef); - } - } - }; - - const validateRuleStringCaptures = (content: string, loc: { line: number; col: number }): void => { - for (const cap of extractInlineCaptures(content)) { - if (cap.kind === "run") { - validateNoShellRedirection(diag, ast.filePath, loc, "run", cap.args); - validateRef({ value: cap.ref, loc }, ast, refCtx, expectRunInRuleRef); - } else { - validateNoShellRedirection(diag, ast.filePath, loc, "ensure", cap.args); - validateRef({ value: cap.ref, loc }, ast, refCtx, expectRuleRef); - } - } - }; - - /** Run the 5 standard checks (redirection, nested-managed, ref, arity, var-ref) on a callable Expr. */ - const validateCallable = ( - body: Expr, - knownVars: Set, - scope: "workflow" | "rule", - recoverBindings?: Set, - ): void => { - if (body.kind === "call") { - const loc = body.callee.loc; - validateNoShellRedirection(diag, ast.filePath, loc, "run", body.args); - validateNestedManagedCallArgs(diag, ast.filePath, loc, body.args); - const isRuleScope = scope === "rule"; - if (!body.callee.value.includes(".") && knownVars.has(body.callee.value) && !localScripts.has(body.callee.value) && !(scope === "workflow" && localWorkflows.has(body.callee.value))) { - diag.error(ast.filePath, loc.line, loc.col, "E_VALIDATE", `strings are not executable; "${body.callee.value}" is a string — use a script instead`); - } - validateRef(body.callee, ast, refCtx, isRuleScope ? expectRunInRuleRef : expectRunTargetRef); - validateArity(diag, ast.filePath, loc, body.callee.value, body.args, "workflow", ast, refCtx); - validateArgVarRefs(diag, ast.filePath, loc, body.args, knownVars, recoverBindings); - return; - } - if (body.kind === "ensure_call") { - const loc = body.callee.loc; - validateNoShellRedirection(diag, ast.filePath, loc, "ensure", body.args); - validateNestedManagedCallArgs(diag, ast.filePath, loc, body.args); - validateRef(body.callee, ast, refCtx, expectRuleRef); - validateArity(diag, ast.filePath, loc, body.callee.value, body.args, "rule", ast, refCtx); - validateArgVarRefs(diag, ast.filePath, loc, body.args, knownVars, recoverBindings); - return; - } - if (body.kind === "inline_script") { - return; // no ref to validate - } - if (body.kind === "match") { - validateMatchExpr(diag, ast.filePath, body.match, knownVars); - return; - } - }; - - /** Validate the value Expr stored under a `const` / `return` / `send` step in a workflow context. */ - const validateWorkflowValueExpr = ( - expr: Expr, - stepLoc: { line: number; col: number }, - knownVars: Set, - promptSchemas: Map, - recoverBindings: Set | undefined, - label: "const" | "return" | "send", - constName?: string, - ): void => { - if (expr.kind === "literal") { - if (label === "send") { - const inner = expr.raw.startsWith('"') && expr.raw.endsWith('"') ? expr.raw.slice(1, -1) : expr.raw; - validateJaiphStringContent(inner, ast.filePath, stepLoc.line, stepLoc.col, "send"); - validateWorkflowStringCaptures(inner, stepLoc); - validateDotFieldRefs(inner, stepLoc, promptSchemas); - validateSimpleInterpolationIdentifiers( - inner, ast.filePath, stepLoc.line, stepLoc.col, - "send", knownVars, "workflow", promptSchemas, recoverBindings, localScripts, - ); - return; - } - if (label === "return") { - validateReturnString(expr.raw, ast.filePath, stepLoc.line, stepLoc.col); - if (expr.raw.startsWith('"')) { - const retInner = semanticQuotedOrchestrationInner(expr.raw); - validateWorkflowStringCaptures(retInner, stepLoc); - validateDotFieldRefs(retInner, stepLoc, promptSchemas); - validateSimpleInterpolationIdentifiers( - retInner, ast.filePath, stepLoc.line, stepLoc.col, - "return", knownVars, "workflow", promptSchemas, recoverBindings, localScripts, - ); - } - return; - } - // const - const scriptName = extractConstScriptName(expr.raw); - if (scriptName && localScripts.has(scriptName)) { - diag.error(ast.filePath, stepLoc.line, stepLoc.col, "E_VALIDATE", `scripts are not values; "${scriptName}" is a script definition`); - } - const inner = semanticQuotedOrchestrationInner(expr.raw); - validateWorkflowStringCaptures(inner, stepLoc); - validateDotFieldRefs(inner, stepLoc, promptSchemas); - validateSimpleInterpolationIdentifiers( - inner, ast.filePath, stepLoc.line, stepLoc.col, - "const", knownVars, "workflow", promptSchemas, recoverBindings, localScripts, - ); - return; - } - if (expr.kind === "call") { - validateCallable(expr, knownVars, "workflow", recoverBindings); - return; - } - if (expr.kind === "ensure_call") { - validateCallable(expr, knownVars, "workflow", recoverBindings); - return; - } - if (expr.kind === "inline_script") { - return; - } - if (expr.kind === "match") { - validateMatchExpr(diag, ast.filePath, expr.match, knownVars); - return; - } - if (expr.kind === "prompt") { - if (label !== "const") { - diag.error(ast.filePath, stepLoc.line, stepLoc.col, "E_VALIDATE", `prompt is not a valid ${label} value`); - } - const promptIdent = promptBareIdentifier(expr.raw); - if (promptIdent && localScripts.has(promptIdent)) { - diag.error(ast.filePath, stepLoc.line, stepLoc.col, "E_VALIDATE", `scripts are not promptable; "${promptIdent}" is a script — use a string const instead`); - } - validatePromptString(expr.raw, ast.filePath, stepLoc.line, stepLoc.col); - if (expr.returns !== undefined) { - validatePromptReturnsSchema(expr.returns, ast.filePath, stepLoc.line, stepLoc.col); - } - const pcInner = semanticQuotedOrchestrationInner(expr.raw); - validateWorkflowStringCaptures(pcInner, stepLoc); - validateDotFieldRefs(pcInner, stepLoc, promptSchemas); - validateSimpleInterpolationIdentifiers( - pcInner, ast.filePath, stepLoc.line, stepLoc.col, - "prompt", knownVars, "workflow", promptSchemas, recoverBindings, localScripts, - ); - return; - } - if (expr.kind === "bare_ref") { - if (label !== "send") { - diag.error(ast.filePath, expr.ref.loc.line, expr.ref.loc.col, "E_VALIDATE", `bare reference is only valid as a send payload`); - } - validateRef(expr.ref, ast, refCtx, bareSendRefSpec); - return; - } - if (expr.kind === "shell") { - if (label !== "send") { - diag.error(ast.filePath, expr.loc.line, expr.loc.col, "E_VALIDATE", `raw shell fragment is only valid as a send payload`); - } - validateManagedWorkflowShell(expr.command, makeSubEnv({ line: expr.loc.line, col: expr.loc.col })); - return; - } - void constName; - }; - - /** Same as `validateWorkflowValueExpr` but with rule-scope rules (no prompt, restricted run targets). */ - const validateRuleValueExpr = ( - expr: Expr, - stepLoc: { line: number; col: number }, - knownVars: Set, - label: "const" | "return", - ): void => { - if (expr.kind === "literal") { - if (label === "return") { - validateReturnString(expr.raw, ast.filePath, stepLoc.line, stepLoc.col); - if (expr.raw.startsWith('"')) { - const retRuleInner = semanticQuotedOrchestrationInner(expr.raw); - validateRuleStringCaptures(retRuleInner, stepLoc); - validateSimpleInterpolationIdentifiers( - retRuleInner, ast.filePath, stepLoc.line, stepLoc.col, - "return", knownVars, "rule", undefined, undefined, localScripts, - ); - } - return; - } - const scriptName = extractConstScriptName(expr.raw); - if (scriptName && localScripts.has(scriptName)) { - diag.error(ast.filePath, stepLoc.line, stepLoc.col, "E_VALIDATE", `scripts are not values; "${scriptName}" is a script definition`); - } - validateRuleStringCaptures(stripDQ(expr.raw), stepLoc); - validateSimpleInterpolationIdentifiers( - stripDQ(expr.raw), ast.filePath, stepLoc.line, stepLoc.col, - "const", knownVars, "rule", undefined, undefined, localScripts, - ); - return; - } - if (expr.kind === "call") { - validateCallable(expr, knownVars, "rule"); - return; - } - if (expr.kind === "ensure_call") { - validateCallable(expr, knownVars, "rule"); - return; - } - if (expr.kind === "inline_script") { - return; - } - if (expr.kind === "match") { - validateMatchExpr(diag, ast.filePath, expr.match, knownVars); - return; - } - if (expr.kind === "prompt") { - diag.error(ast.filePath, stepLoc.line, stepLoc.col, "E_VALIDATE", "const ... = prompt is not allowed in rules"); - } - if (expr.kind === "bare_ref" || expr.kind === "shell") { - diag.error(ast.filePath, stepLoc.line, stepLoc.col, "E_VALIDATE", `${expr.kind} expression is not allowed in rules`); - } - }; + importedAstCache, + } as const; for (const rule of ast.rules) { let ruleWalk: StepTreeWalk | undefined; @@ -931,132 +299,38 @@ export function validateModuleInto( rule.params, rule.loc, localScripts, - parseSchemaFieldNames, { withPromptSchemas: false }, ); }); if (!ruleWalk) continue; - const ruleKnownVars = ruleWalk.knownVars; - const validateRuleStep = (s: WorkflowStepDef): void => { - if (s.type === "trivia") return; - if (s.type === "say") { - if (s.level === "log" || s.level === "logerr") { - if (s.message.kind === "inline_script") return; - if (s.message.kind === "literal") { - validateLogString(s.message.raw, ast.filePath, s.loc.line, s.loc.col, s.level); - const inner = s.message.raw; - validateRuleStringCaptures(inner, s.loc); - validateSimpleInterpolationIdentifiers( - inner, ast.filePath, s.loc.line, s.loc.col, - s.level, ruleKnownVars, "rule", undefined, undefined, localScripts, - ); - return; - } - diag.error(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", `unsupported ${s.level} message form`); - } - // fail - if (s.message.kind !== "literal") { - diag.error(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", "fail message must be a literal string"); - } - validateFailString(s.message.raw, ast.filePath, s.loc.line, s.loc.col); - const failInner = semanticQuotedOrchestrationInner(s.message.raw); - validateRuleStringCaptures(failInner, s.loc); - validateSimpleInterpolationIdentifiers( - failInner, ast.filePath, s.loc.line, s.loc.col, - "fail", ruleKnownVars, "rule", undefined, undefined, localScripts, - ); - return; - } - if (s.type === "send") { - diag.error(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", "send is not allowed in rules"); - } - if (s.type === "return") { - validateRuleValueExpr(s.value, s.loc, ruleKnownVars, "return"); - return; - } - if (s.type === "const") { - validateRuleValueExpr(s.value, s.loc, ruleKnownVars, "const"); - return; - } - if (s.type === "exec") { - const body = s.body; - if (body.kind === "prompt") { - diag.error(ast.filePath, body.loc.line, body.loc.col, "E_VALIDATE", "prompt is not allowed in rules"); - } - if (body.kind === "shell") { - diag.error(ast.filePath, body.loc.line, body.loc.col, "E_VALIDATE", "inline shell steps are forbidden in rules; use explicit script blocks"); - } - if (body.kind === "call" && (s as Extract).body.kind === "call") { - const callBody = body; - if (callBody.async) { - diag.error(ast.filePath, callBody.callee.loc.line, callBody.callee.loc.col, "E_VALIDATE", "run async is not allowed in rules; use it in workflows only"); - } - } - validateCallable(body, ruleKnownVars, "rule"); - return; - } - if (s.type === "if") { - if (s.operand.kind === "regex") { - try { new RegExp(s.operand.source); } catch { - diag.error(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", `invalid regex in if condition: /${s.operand.source}/`); - } - } - return; - } - if (s.type === "for_lines") { - if (!ruleKnownVars.has(s.sourceVar)) { - diag.error( - ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", - `for ... in : "${s.sourceVar}" is not a known variable in this scope`, - ); - } - return; - } - const _never: never = s; - return _never; + const ctx: ValidatorCtx = { + ...baseCtx, + scope: RULE_SCOPE, + knownVars: ruleWalk.knownVars, + promptSchemas: ruleWalk.promptSchemas, + recoverBindings: undefined, }; for (const entry of ruleWalk.flat) { - diag.capture(() => validateRuleStep(entry.step)); + diag.capture(() => validateStep(entry.step, { ...ctx, recoverBindings: entry.recoverBindings })); } } - const validateChannelRef = (channel: string, loc: { line: number; col: number }): void => { - const parts = channel.split("."); - if (parts.length === 1) { - if (!localChannels.has(channel)) { - diag.error(ast.filePath, loc.line, loc.col, "E_VALIDATE", `Channel "${channel}" is not defined`); - } - return; - } - if (parts.length !== 2) { - diag.error(ast.filePath, loc.line, loc.col, "E_VALIDATE", `Channel "${channel}" is not defined`); - } - const [alias, importedChannel] = parts; - const importedFile = importsByAlias.get(alias); - if (!importedFile) { - diag.error(ast.filePath, loc.line, loc.col, "E_VALIDATE", `Channel "${channel}" is not defined`); - } - const importedAst = importedAstCache.get(importedFile)!; - const importedChannels = new Set(importedAst.channels.map((c) => c.name)); - if (!importedChannels.has(importedChannel)) { - diag.error(ast.filePath, loc.line, loc.col, "E_VALIDATE", `Channel "${channel}" is not defined`); - } - }; - for (const ch of ast.channels) { - if (ch.routes) { - for (const wfRef of ch.routes) { - diag.capture(() => { - validateRef(wfRef, ast, refCtx, expectWorkflowRef); - const targetParams = resolveRouteTargetParams(wfRef.value, ast, refCtx); - if (targetParams !== undefined && targetParams !== 3) { - diag.error( - ast.filePath, wfRef.loc.line, wfRef.loc.col, "E_VALIDATE", - `inbox route target "${wfRef.value}" must declare exactly 3 parameters (message, channel, sender), but declares ${targetParams}`, - ); - } - }); - } + if (!ch.routes) continue; + for (const wfRef of ch.routes) { + diag.capture(() => { + validateRef(wfRef, ast, refCtx, { mode: "expect", expect: ROUTE_REF_EXPECT }); + const targetParams = resolveRouteTargetParams(wfRef.value, ast, refCtx); + if (targetParams !== undefined && targetParams !== 3) { + diag.error( + ast.filePath, + wfRef.loc.line, + wfRef.loc.col, + "E_VALIDATE", + `inbox route target "${wfRef.value}" must declare exactly 3 parameters (message, channel, sender), but declares ${targetParams}`, + ); + } + }); } } @@ -1071,118 +345,19 @@ export function validateModuleInto( workflow.params, workflow.loc, localScripts, - parseSchemaFieldNames, { withPromptSchemas: true }, ); }); if (!wfWalk) continue; - const wfKnownVars = wfWalk.knownVars; - const promptSchemas = wfWalk.promptSchemas; - - const validateStep = (s: WorkflowStepDef, recoverBindings?: Set): void => { - if (s.type === "trivia") return; - if (s.type === "send") { - validateChannelRef(s.channel, s.loc); - validateWorkflowValueExpr(s.value, s.loc, wfKnownVars, promptSchemas, recoverBindings, "send"); - return; - } - if (s.type === "say") { - if (s.level === "log" || s.level === "logerr") { - if (s.message.kind === "inline_script") return; - if (s.message.kind === "literal") { - validateLogString(s.message.raw, ast.filePath, s.loc.line, s.loc.col, s.level); - const inner = s.message.raw; - validateWorkflowStringCaptures(inner, s.loc); - validateDotFieldRefs(inner, s.loc, promptSchemas); - validateSimpleInterpolationIdentifiers( - inner, ast.filePath, s.loc.line, s.loc.col, - s.level, wfKnownVars, "workflow", promptSchemas, recoverBindings, localScripts, - ); - return; - } - diag.error(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", `unsupported ${s.level} message form`); - } - // fail - if (s.message.kind !== "literal") { - diag.error(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", "fail message must be a literal string"); - } - validateFailString(s.message.raw, ast.filePath, s.loc.line, s.loc.col); - const failInner = semanticQuotedOrchestrationInner(s.message.raw); - validateWorkflowStringCaptures(failInner, s.loc); - validateDotFieldRefs(failInner, s.loc, promptSchemas); - validateSimpleInterpolationIdentifiers( - failInner, ast.filePath, s.loc.line, s.loc.col, - "fail", wfKnownVars, "workflow", promptSchemas, recoverBindings, localScripts, - ); - return; - } - if (s.type === "return") { - validateWorkflowValueExpr(s.value, s.loc, wfKnownVars, promptSchemas, recoverBindings, "return"); - return; - } - if (s.type === "const") { - validateWorkflowValueExpr(s.value, s.loc, wfKnownVars, promptSchemas, recoverBindings, "const", s.name); - return; - } - if (s.type === "exec") { - const body = s.body; - if (body.kind === "prompt") { - validateWorkflowValueExpr(body, s.loc, wfKnownVars, promptSchemas, recoverBindings, "const"); - validatePromptStepReturns(body, s.captureName, ast.filePath); - return; - } - if (body.kind === "shell") { - if (hasUnquotedSendArrow(body.command) && matchSendOperator(body.command) === null) { - diag.error( - ast.filePath, body.loc.line, body.loc.col, "E_VALIDATE", - "invalid send: channel must be a single name or `alias.name` (at most one dot in the channel part)", - ); - } - const t = body.command.trim(); - if (/^(?:[A-Za-z_][A-Za-z0-9_]*)(?:\.[A-Za-z_][A-Za-z0-9_]*)*$/.test(t)) { - if (!t.includes(".")) { - if (localScripts.has(t) || localWorkflows.has(t)) { - diag.error( - ast.filePath, body.loc.line, body.loc.col, "E_VALIDATE", - `use run ${t}() — a bare name that refers to a script or workflow must use a managed run step`, - ); - } - } else { - validateRef({ value: t, loc: body.loc }, ast, refCtx, expectRunTargetRef); - diag.error( - ast.filePath, body.loc.line, body.loc.col, "E_VALIDATE", - `use run ${t}() — "${t}" is a valid script or workflow reference; use a managed run step`, - ); - } - } - return; - } - validateCallable(body, wfKnownVars, "workflow", recoverBindings); - return; - } - if (s.type === "if") { - if (s.operand.kind === "regex") { - try { new RegExp(s.operand.source); } catch { - diag.error(ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", `invalid regex in if condition: /${s.operand.source}/`); - } - } - return; - } - if (s.type === "for_lines") { - if (!wfKnownVars.has(s.sourceVar)) { - diag.error( - ast.filePath, s.loc.line, s.loc.col, "E_VALIDATE", - `for ... in : "${s.sourceVar}" is not a known variable in this scope`, - ); - } - return; - } - const _never: never = s; - return _never; + const ctx: ValidatorCtx = { + ...baseCtx, + scope: WORKFLOW_SCOPE, + knownVars: wfWalk.knownVars, + promptSchemas: wfWalk.promptSchemas, + recoverBindings: undefined, }; - for (const entry of wfWalk.flat) { - diag.capture(() => validateStep(entry.step, entry.recoverBindings)); + diag.capture(() => validateStep(entry.step, { ...ctx, recoverBindings: entry.recoverBindings })); } } @@ -1235,9 +410,7 @@ function validateTestBlocks( ); } const refName = - step.type === "test_expect_equal" - ? step.expectedVar - : step.substringVar; + step.type === "test_expect_equal" ? step.expectedVar : step.substringVar; if (refName !== undefined && !inScope.has(refName)) { diag.error( ast.filePath, diff --git a/test-fixtures/compiler-txtar/validate-diagnostics-snapshot.json b/test-fixtures/compiler-txtar/validate-diagnostics-snapshot.json new file mode 100644 index 00000000..3a66374a --- /dev/null +++ b/test-fixtures/compiler-txtar/validate-diagnostics-snapshot.json @@ -0,0 +1,990 @@ +{ + "validate-errors.txt > unknown local rule reference": [ + { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_VALIDATE", + "message": "unknown local rule reference \"missing_rule\"" + } + ], + "validate-errors.txt > unknown local workflow or script reference in run": [ + { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_VALIDATE", + "message": "unknown local workflow or script reference \"missing_workflow\"" + } + ], + "validate-errors.txt > unknown local channel in send": [ + { + "file": "input.jh", + "line": 5, + "col": 1, + "code": "E_VALIDATE", + "message": "Channel \"typo\" is not defined" + } + ], + "validate-errors.txt > rule with inline brace group fails shell-step ban": [ + { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_VALIDATE", + "message": "inline shell steps are forbidden in rules; use explicit script blocks" + } + ], + "validate-errors.txt > rule with multi-line brace group fails shell-step ban": [ + { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_VALIDATE", + "message": "inline shell steps are forbidden in rules; use explicit script blocks" + } + ], + "validate-errors.txt > unsupported type in returns schema": [ + { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_SCHEMA", + "message": "unsupported type in returns schema: \"array\" (only string, number, boolean allowed)" + } + ], + "validate-errors.txt > workflow raw shell that names a script must use run": [ + { + "file": "input.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "use run f() — a bare name that refers to a script or workflow must use a managed run step" + } + ], + "validate-errors.txt > workflow raw shell that names a workflow must use run": [ + { + "file": "input.jh", + "line": 6, + "col": 3, + "code": "E_VALIDATE", + "message": "use run w() — a bare name that refers to a script or workflow must use a managed run step" + } + ], + "validate-errors.txt > send RHS cannot invoke workflow via shell": [ + { + "file": "input.jh", + "line": 7, + "col": 5, + "code": "E_VALIDATE", + "message": "workflow \"w\" must be called with run" + } + ], + "validate-errors.txt > bare identifier arg unknown name fails": [ + { + "file": "input.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "unknown identifier \"unknown_var\" used as bare argument; declare it with \"const\", use a capture, or add a workflow/rule parameter" + } + ], + "validate-errors.txt > run async is rejected in rules": [ + { + "file": "input.jh", + "line": 5, + "col": 3, + "code": "E_VALIDATE", + "message": "run async is not allowed in rules; use it in workflows only" + } + ], + "validate-errors.txt > route with unknown workflow": [ + { + "file": "input.jh", + "line": 1, + "col": 21, + "code": "E_VALIDATE", + "message": "unknown local workflow reference \"missing_wf\"" + } + ], + "validate-errors.txt > route with rule ref": [ + { + "file": "input.jh", + "line": 1, + "col": 21, + "code": "E_VALIDATE", + "message": "rule \"check\" must be called with ensure" + } + ], + "validate-errors.txt > route inside workflow body is parse error": [ + { + "file": "input.jh", + "line": 2, + "col": 1, + "code": "E_PARSE", + "message": "route declarations belong at the top level: channel findings -> analyst" + } + ], + "validate-errors.txt > inline run ref with unknown script": [ + { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_VALIDATE", + "message": "unknown local workflow or script reference \"nonexistent\"" + } + ], + "validate-errors.txt > dot field ref where var has no schema": [ + { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_VALIDATE", + "message": "${x.field}: \"x\" is not a typed prompt capture; dot notation requires a prompt with \"returns\" schema" + } + ], + "validate-errors.txt > dot field ref with nonexistent field in schema": [ + { + "file": "input.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "${result.bogus}: field \"bogus\" is not defined in the returns schema for \"result\"; available fields: type, risk" + } + ], + "validate-errors.txt > unknown import alias in rule reference": [ + { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_VALIDATE", + "message": "unknown import alias \"ghost\" for rule reference \"ghost.guard\"" + } + ], + "validate-errors.txt > match: missing wildcard arm": [ + { + "file": "input.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "match must have exactly one wildcard (_) arm" + } + ], + "validate-errors.txt > match: multiple wildcard arms": [ + { + "file": "input.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "match must have exactly one wildcard (_) arm, found multiple" + } + ], + "validate-errors.txt > shell redirection in ensure args rejected": [ + { + "file": "input.jh", + "line": 5, + "col": 1, + "code": "E_PARSE", + "message": "unexpected content after ensure call: '| grep ok'; shell redirection (>, |, &) is not supported — use a script block" + } + ], + "validate-errors.txt > run in workflow targeting a rule is rejected": [ + { + "file": "input.jh", + "line": 5, + "col": 3, + "code": "E_VALIDATE", + "message": "rule \"check\" must be called with ensure, not run" + } + ], + "validate-errors.txt > run in rule targeting a rule is rejected": [ + { + "file": "input.jh", + "line": 5, + "col": 3, + "code": "E_VALIDATE", + "message": "rule \"other_rule\" must be called with ensure, not run" + } + ], + "validate-errors.txt > ensure rejects local script reference": [ + { + "file": "input.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "script \"my_script\" cannot be called with ensure" + } + ], + "validate-errors.txt > const prompt in rules (caught at parse time)": [ + { + "file": "input.jh", + "line": 2, + "col": 13, + "code": "E_PARSE", + "message": "const ... = prompt is not allowed in rules" + } + ], + "validate-errors.txt > returns schema cannot be empty": [ + { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_SCHEMA", + "message": "returns schema cannot be empty" + } + ], + "validate-errors.txt > returns schema rejects array types": [ + { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_SCHEMA", + "message": "returns schema must be flat (no arrays or union types); only string, number, boolean allowed" + } + ], + "validate-errors.txt > returns schema rejects union types": [ + { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_SCHEMA", + "message": "returns schema must be flat (no arrays or union types); only string, number, boolean allowed" + } + ], + "validate-errors.txt > returns schema rejects malformed entry": [ + { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_SCHEMA", + "message": "invalid returns schema entry: expected \"fieldName: type\" (got badentry...)" + } + ], + "validate-errors.txt > returns schema rejects unsupported type": [ + { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_SCHEMA", + "message": "unsupported type in returns schema: \"array\" (only string, number, boolean allowed)" + } + ], + "validate-errors.txt > run in workflow targeting a rule via import": [ + { + "file": "main.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "rule \"lib.check\" must be called with ensure, not run" + } + ], + "validate-errors.txt > ensure imported workflow requires run": [ + { + "file": "main.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "workflow \"lib.deploy\" must be called with run" + } + ], + "validate-errors.txt > ensure rejects imported script": [ + { + "file": "main.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "script \"lib.helper\" cannot be called with ensure" + } + ], + "validate-errors.txt > run inside rule must target script not imported workflow": [ + { + "file": "main.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "run inside a rule must target a script, not workflow \"lib.deploy\"" + } + ], + "validate-errors.txt > route with imported rule ref rejected": [ + { + "file": "main.jh", + "line": 2, + "col": 21, + "code": "E_VALIDATE", + "message": "rule \"lib.check\" must be called with ensure" + } + ], + "validate-errors.txt > arity mismatch: too few args to workflow": [ + { + "file": "input.jh", + "line": 5, + "col": 3, + "code": "E_VALIDATE", + "message": "workflow \"helper\" expects 2 argument(s) (a, b), but got 1" + } + ], + "validate-errors.txt > arity mismatch: too many args to workflow": [ + { + "file": "input.jh", + "line": 5, + "col": 3, + "code": "E_VALIDATE", + "message": "workflow \"helper\" expects 1 argument(s) (a), but got 2" + } + ], + "validate-errors.txt > arity mismatch: zero args to workflow expecting two": [ + { + "file": "input.jh", + "line": 5, + "col": 3, + "code": "E_VALIDATE", + "message": "workflow \"helper\" expects 2 argument(s) (a, b), but got 0" + } + ], + "validate-errors.txt > arity mismatch: too few args to rule": [ + { + "file": "input.jh", + "line": 5, + "col": 3, + "code": "E_VALIDATE", + "message": "rule \"check\" expects 1 argument(s) (x), but got 0" + } + ], + "validate-errors.txt > route target with wrong parameter count": [ + { + "file": "input.jh", + "line": 1, + "col": 21, + "code": "E_VALIDATE", + "message": "inbox route target \"handler\" must declare exactly 3 parameters (message, channel, sender), but declares 1" + } + ], + "validate-errors.txt > route target with zero parameters": [ + { + "file": "input.jh", + "line": 1, + "col": 21, + "code": "E_VALIDATE", + "message": "inbox route target \"handler\" must declare exactly 3 parameters (message, channel, sender), but declares 0" + } + ], + "validate-errors.txt > inline run ref with unknown rule": [ + { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_VALIDATE", + "message": "unknown local workflow or script reference \"nonexistent\"" + } + ], + "validate-errors.txt > match: missing wildcard arm with single pattern": [ + { + "file": "input.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "match must have exactly one wildcard (_) arm" + } + ], + "validate-errors.txt > send RHS with local rule bare ref": [ + { + "file": "input.jh", + "line": 6, + "col": 5, + "code": "E_VALIDATE", + "message": "rule \"check\" must be called with ensure" + } + ], + "validate-errors.txt > shell redirection in run args rejected in rule": [ + { + "file": "input.jh", + "line": 3, + "col": 1, + "code": "E_PARSE", + "message": "unexpected content after run call: '| grep ok'; shell redirection (>, |, &) is not supported — use a script block" + } + ], + "validate-errors.txt > shell redirection in run args rejected in workflow": [ + { + "file": "input.jh", + "line": 3, + "col": 1, + "code": "E_PARSE", + "message": "unexpected content after run call: '> /tmp/out'; shell redirection (>, |, &) is not supported — use a script block" + } + ], + "validate-errors.txt > shell redirection in ensure args rejected in workflow": [ + { + "file": "input.jh", + "line": 5, + "col": 1, + "code": "E_PARSE", + "message": "unexpected content after ensure call: '| grep ok'; shell redirection (>, |, &) is not supported — use a script block" + } + ], + "validate-errors.txt > channel ref with unknown import alias": [ + { + "file": "main.jh", + "line": 3, + "col": 1, + "code": "E_VALIDATE", + "message": "Channel \"ghost.mychan\" is not defined" + } + ], + "validate-errors.txt > imported route target with wrong parameter count": [ + { + "file": "main.jh", + "line": 2, + "col": 21, + "code": "E_VALIDATE", + "message": "inbox route target \"lib.handler\" must declare exactly 3 parameters (message, channel, sender), but declares 1" + } + ], + "validate-errors.txt > const run of string variable in rule rejected": [ + { + "file": "input.jh", + "line": 3, + "col": 13, + "code": "E_VALIDATE", + "message": "strings are not executable; \"name\" is a string — use a script instead" + } + ], + "validate-errors.txt > channel reference with three dot parts is rejected": [ + { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_VALIDATE", + "message": "invalid send: channel must be a single name or `alias.name` (at most one dot in the channel part)" + } + ], + "validate-errors.txt > command substitution invokes workflow in send shell RHS": [ + { + "file": "input.jh", + "line": 6, + "col": 5, + "code": "E_VALIDATE", + "message": "command substitution cannot invoke workflow \"helper\"; use run helper ... in a workflow step" + } + ], + "validate-errors.txt > command substitution invokes script in send shell RHS": [ + { + "file": "input.jh", + "line": 4, + "col": 5, + "code": "E_VALIDATE", + "message": "command substitution cannot invoke script \"helper\"; use run helper ... for managed calls (or use pure shell inside $(...))" + } + ], + "validate-errors.txt > scripts are not values in rule const": [ + { + "file": "input.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "scripts are not values; \"helper\" is a script definition" + } + ], + "validate-errors.txt > scripts are not values in workflow const": [ + { + "file": "input.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "scripts are not values; \"helper\" is a script definition" + } + ], + "validate-errors.txt > scripts are not promptable in workflow": [ + { + "file": "input.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "scripts are not promptable; \"helper\" is a script — use a string const instead" + } + ], + "validate-errors.txt > scripts cannot be interpolated in log": [ + { + "file": "input.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "scripts cannot be interpolated; \"helper\" is a script definition" + } + ], + "validate-errors.txt > match arm body cannot start with return": [ + { + "file": "input.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "match arm body must not start with \"return\"; the match expression itself produces the value — use the expression directly after =>" + } + ], + "validate-errors.txt > inline script in match arm body rejected": [ + { + "file": "input.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "inline scripts are not allowed in match arm bodies; use a named script with \"run script_name(…)\" instead" + } + ], + "validate-errors.txt > strings are not executable in workflow run": [ + { + "file": "input.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "strings are not executable; \"name\" is a string — use a script instead" + } + ], + "validate-errors.txt > match arm body cannot start with return in const context": [ + { + "file": "input.jh", + "line": 3, + "col": 13, + "code": "E_VALIDATE", + "message": "match arm body must not start with \"return\"; the match expression itself produces the value — use the expression directly after =>" + } + ], + "validate-errors.txt > unknown identifier in fail string": [ + { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_VALIDATE", + "message": "unknown identifier \"ghost\" in fail; declare it with `const`, use a capture, or add a workflow parameter" + } + ], + "validate-errors.txt > unknown identifier in rule log string": [ + { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_VALIDATE", + "message": "unknown identifier \"ghost\" in log; declare it with `const`, use a capture, or add a rule parameter" + } + ], + "validate-errors.txt > unknown identifier in send literal": [ + { + "file": "input.jh", + "line": 3, + "col": 1, + "code": "E_VALIDATE", + "message": "unknown identifier \"ghost\" in send; declare it with `const`, use a capture, or add a workflow parameter" + } + ], + "validate-errors.txt > send RHS with local script bare ref requires run": [ + { + "file": "input.jh", + "line": 4, + "col": 5, + "code": "E_VALIDATE", + "message": "script \"helper\" must be called with run" + } + ], + "validate-errors.txt > unknown identifier in workflow logerr string": [ + { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_VALIDATE", + "message": "unknown identifier \"ghost\" in logerr; declare it with `const`, use a capture, or add a workflow parameter" + } + ], + "validate-errors.txt > unknown identifier in workflow return string": [ + { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_VALIDATE", + "message": "unknown identifier \"ghost\" in return; declare it with `const`, use a capture, or add a workflow parameter" + } + ], + "validate-errors.txt > scripts cannot be interpolated in logerr": [ + { + "file": "input.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "scripts cannot be interpolated; \"helper\" is a script definition" + } + ], + "validate-errors.txt > scripts cannot be interpolated in fail": [ + { + "file": "input.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "scripts cannot be interpolated; \"helper\" is a script definition" + } + ], + "validate-errors.txt > scripts cannot be interpolated in return": [ + { + "file": "input.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "scripts cannot be interpolated; \"helper\" is a script definition" + } + ], + "validate-errors.txt > strings are not executable in workflow const run": [ + { + "file": "input.jh", + "line": 3, + "col": 13, + "code": "E_VALIDATE", + "message": "strings are not executable; \"name\" is a string — use a script instead" + } + ], + "validate-errors.txt > ensure rejects local workflow reference": [ + { + "file": "input.jh", + "line": 5, + "col": 3, + "code": "E_VALIDATE", + "message": "workflow \"helper\" must be called with run" + } + ], + "validate-errors.txt > run async is rejected in rules with imported workflow": [ + { + "file": "main.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "run async is not allowed in rules; use it in workflows only" + } + ], + "validate-errors.txt > scripts cannot be interpolated in send literal": [ + { + "file": "input.jh", + "line": 4, + "col": 1, + "code": "E_VALIDATE", + "message": "scripts cannot be interpolated; \"helper\" is a script definition" + } + ], + "validate-errors.txt > unknown identifier in inline run capture in log": [ + { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_VALIDATE", + "message": "unknown local workflow or script reference \"nonexistent_script\"" + } + ], + "validate-errors.txt > strings are not executable in rule run": [ + { + "file": "input.jh", + "line": 3, + "col": 13, + "code": "E_VALIDATE", + "message": "strings are not executable; \"greeting\" is a string — use a script instead" + } + ], + "validate-errors.txt > unknown identifier in workflow log string": [ + { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_VALIDATE", + "message": "unknown identifier \"ghost\" in log; declare it with `const`, use a capture, or add a workflow parameter" + } + ], + "validate-errors.txt > match arm body with ensure arm in const context": [], + "validate-errors.txt > return bare unknown identifier": [ + { + "file": "input.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "unknown identifier \"missing_name\" in return; declare it with `const`, use a capture, or add a workflow parameter" + } + ], + "validate-errors.txt > test block: expect_equal LHS variable not captured (no implicit `response`)": [ + { + "file": "input.test.jh", + "line": 5, + "col": 3, + "code": "E_VALIDATE", + "message": "expect_equal: undefined name \"response\" (capture it first with: const response = run …)" + } + ], + "validate-errors.txt > test block: expect_equal RHS const reference not declared": [ + { + "file": "input.test.jh", + "line": 5, + "col": 3, + "code": "E_VALIDATE", + "message": "expect_equal: undefined name \"expected\" (declare it earlier with: const expected = \"…\")" + } + ], + "validate-errors.txt > test block: mock prompt references undeclared const": [ + { + "file": "input.test.jh", + "line": 4, + "col": 3, + "code": "E_VALIDATE", + "message": "mock prompt: undefined name \"reply\" (declare it earlier with: const reply = \"…\")" + } + ], + "validate-errors.txt > test block: explicit capture + const reference is valid": [], + "validate-errors.txt > invalid regex in if condition in workflow": [ + { + "file": "input.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "invalid regex in if condition: /[bad(/" + } + ], + "validate-errors.txt > invalid regex in if condition in rule": [ + { + "file": "input.jh", + "line": 4, + "col": 3, + "code": "E_VALIDATE", + "message": "invalid regex in if condition: /[bad(/" + } + ], + "validate-errors.txt > import script resolves to missing file": [ + { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_IMPORT_NOT_FOUND", + "message": "import script \"queue\" resolves to missing file \"/missing.py\"" + }, + { + "file": "input.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "unknown local workflow or script reference \"queue\"" + } + ], + "validate-errors.txt > bare imported workflow as shell line must use run": [ + { + "file": "main.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "use run lib.deploy() — \"lib.deploy\" is a valid script or workflow reference; use a managed run step" + } + ], + "validate-errors.txt > bare imported script as shell line must use run": [ + { + "file": "main.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "use run lib.helper() — \"lib.helper\" is a valid script or workflow reference; use a managed run step" + } + ], + "validate-errors.txt > command substitution invokes rule in send shell RHS": [ + { + "file": "input.jh", + "line": 6, + "col": 5, + "code": "E_VALIDATE", + "message": "command substitution cannot invoke rule \"check\"; use ensure check ... in a workflow step" + } + ], + "validate-errors.txt > command substitution contains channel send": [ + { + "file": "input.jh", + "line": 4, + "col": 5, + "code": "E_VALIDATE", + "message": "command substitution cannot contain channel send (<-); use a workflow send step instead" + } + ], + "validate-errors-multi-module.txt > duplicate import alias": [ + { + "file": "main.jh", + "line": 2, + "col": 1, + "code": "E_VALIDATE", + "message": "duplicate import alias \"mod\"" + }, + { + "file": "main.jh", + "line": 5, + "col": 3, + "code": "E_VALIDATE", + "message": "imported rule \"mod.one\" does not exist" + } + ], + "validate-errors-multi-module.txt > imported workflow missing": [ + { + "file": "main.jh", + "line": 4, + "col": 3, + "code": "E_VALIDATE", + "message": "imported workflow or script \"lib.missing\" does not exist" + } + ], + "validate-errors-multi-module.txt > send RHS with unknown imported symbol": [ + { + "file": "main.jh", + "line": 4, + "col": 5, + "code": "E_VALIDATE", + "message": "unknown symbol \"lib.nonexistent\" in send right-hand side" + } + ], + "validate-errors-multi-module.txt > missing channel import fails": [ + { + "file": "main.jh", + "line": 3, + "col": 1, + "code": "E_VALIDATE", + "message": "Channel \"shared.typo\" is not defined" + } + ], + "validate-errors-multi-module.txt > unknown import alias in run reference": [ + { + "file": "main.jh", + "line": 2, + "col": 3, + "code": "E_VALIDATE", + "message": "unknown import alias \"ghost\" for run target \"ghost.deploy\"" + } + ], + "validate-errors-multi-module.txt > imported script does not exist": [ + { + "file": "main.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "imported workflow or script \"lib.nonexistent\" does not exist" + } + ], + "validate-errors-multi-module.txt > run in rule targeting imported rule rejected": [ + { + "file": "main.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "rule \"lib.other_rule\" must be called with ensure, not run" + } + ], + "validate-errors-multi-module.txt > import resolves to missing file": [ + { + "file": "main.jh", + "line": 1, + "col": 1, + "code": "E_IMPORT_NOT_FOUND", + "message": "import \"lib\" resolves to missing file \"/nonexistent.jh\"" + } + ], + "validate-errors-multi-module.txt > arity mismatch on imported workflow": [ + { + "file": "main.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "workflow \"lib.helper\" expects 2 argument(s) (a, b), but got 1" + } + ], + "validate-errors-multi-module.txt > send RHS with imported script bare ref": [ + { + "file": "main.jh", + "line": 4, + "col": 5, + "code": "E_VALIDATE", + "message": "script \"lib.helper\" must be called with run" + } + ], + "validate-errors-multi-module.txt > send RHS with imported rule bare ref": [ + { + "file": "main.jh", + "line": 4, + "col": 5, + "code": "E_VALIDATE", + "message": "rule \"lib.check\" must be called with ensure" + } + ], + "validate-errors-multi-module.txt > imported channel name not found": [ + { + "file": "main.jh", + "line": 3, + "col": 1, + "code": "E_VALIDATE", + "message": "Channel \"lib.nonexistent_chan\" is not defined" + } + ], + "validate-errors-multi-module.txt > ensure non-exported rule from module with exports": [ + { + "file": "main.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "\"private_check\" is not exported from module \"lib\"" + } + ], + "validate-errors-multi-module.txt > run non-exported workflow from module with exports": [ + { + "file": "main.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "\"private_wf\" is not exported from module \"lib\"" + } + ], + "validate-errors-multi-module.txt > shell line with unknown imported symbol in workflow": [ + { + "file": "main.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "imported workflow or script \"lib.nonexistent_thing\" does not exist" + } + ], + "validate-errors-multi-module.txt > run non-exported script from module with exports": [ + { + "file": "main.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "\"private_script\" is not exported from module \"lib\"" + } + ], + "validate-errors-multi-module.txt > send RHS with non-exported workflow bare ref": [ + { + "file": "main.jh", + "line": 4, + "col": 5, + "code": "E_VALIDATE", + "message": "\"private_wf\" is not exported from module \"lib\"" + } + ], + "validate-errors-multi-module.txt > arity mismatch on imported rule": [ + { + "file": "main.jh", + "line": 3, + "col": 3, + "code": "E_VALIDATE", + "message": "rule \"lib.check\" expects 2 argument(s) (a, b), but got 1" + } + ], + "validate-errors-multi-module.txt > send RHS with non-exported script bare ref": [ + { + "file": "main.jh", + "line": 4, + "col": 5, + "code": "E_VALIDATE", + "message": "\"private_script\" is not exported from module \"lib\"" + } + ], + "validate-errors-multi-module.txt > const ensure on imported workflow rejected": [ + { + "file": "main.jh", + "line": 3, + "col": 13, + "code": "E_VALIDATE", + "message": "workflow \"lib.deploy\" must be called with run" + } + ] +} From c7162f21c97d8d61532237a423227a40719790cd Mon Sep 17 00:00:00 2001 From: Jakub Dzikowski Date: Sat, 16 May 2026 16:32:49 +0200 Subject: [PATCH 12/14] Refactor: decouple validator from runtime semantics MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit `validate-step.ts` used to import `tripleQuotedRawForRuntime` from `src/runtime/orchestration-text.ts` to compute "what the runtime will see" for triple-quoted match-arm bodies — a one-way compile-time dependency on runtime semantics. Move the helper into the parser layer as `canonicalizeTripleQuotedString` in `src/parse/triple-quote.ts`; both the validator and the runtime now import it from `src/parse/`, and the `src/runtime/orchestration-text.ts` wrapper is deleted. New tests pin the invariants: `no-runtime-imports.test.ts` greps every non-test `*.ts` under `src/transpile/` and fails on any `from "…/runtime/…"` import, and `canonicalize-triple-quoted.test.ts` walks every triple-quoted match-arm body in `test-fixtures/` and `examples/` and asserts bit-for-bit parity against an inlined legacy baseline. --- CHANGELOG.md | 1 + QUEUE.md | 26 ----- docs/architecture.md | 3 +- docs/contributing.md | 1 + src/parse/canonicalize-triple-quoted.test.ts | 105 +++++++++++++++++++ src/parse/triple-quote.ts | 17 +++ src/runtime/kernel/node-workflow-runtime.ts | 4 +- src/runtime/orchestration-text.ts | 18 ---- src/transpile/no-runtime-imports.test.ts | 38 +++++++ src/transpile/validate-step.ts | 4 +- 10 files changed, 168 insertions(+), 49 deletions(-) create mode 100644 src/parse/canonicalize-triple-quoted.test.ts delete mode 100644 src/runtime/orchestration-text.ts create mode 100644 src/transpile/no-runtime-imports.test.ts diff --git a/CHANGELOG.md b/CHANGELOG.md index 0d96e98c..ae98f452 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,6 @@ # Unreleased +- **Refactor — Decouple the validator from runtime semantics:** `src/transpile/validate.ts` (now `validate-step.ts`) used to `import { tripleQuotedRawForRuntime } from "../runtime/orchestration-text"` so it could compute "what the runtime will see" when checking the content of a triple-quoted `match`-arm body. That was a one-way dependency from compile-time on runtime semantics — a layering inversion that would have kept biting as the runtime grew more such helpers. The canonicalization helper moves into the parser layer as `canonicalizeTripleQuotedString` in `src/parse/triple-quote.ts` (same algorithm: validate the outer `"…"` shape, unescape DSL-quoted inner with `\"` → `"` and `\\` → `\`, then re-wrap via `tripleQuoteBodyToRaw(dedentCommonLeadingWhitespace(inner))`). Both the validator (`validate-step.ts`'s `validateMatchExpr`) and the runtime (`src/runtime/kernel/node-workflow-runtime.ts`'s match-arm dispatch in `runMatchExpr`) now import that helper from `src/parse/`; the wrapper file `src/runtime/orchestration-text.ts` is deleted. New tests pin the invariants: `src/transpile/no-runtime-imports.test.ts` (AC1) greps every non-test `*.ts` under `src/transpile/` and fails if any `from "…/runtime/…"` import reappears, so compile-time code can no longer reach into runtime semantics; `src/parse/canonicalize-triple-quoted.test.ts` (AC2) parses every `.jh` under `test-fixtures/` and `examples/`, collects every triple-quoted `match`-arm body across workflow / rule step trees, and asserts `canonicalizeTripleQuotedString(body) === legacyTripleQuotedRawForRuntime(body)` bit-for-bit (the legacy implementation is inlined in the test as the parity baseline). Existing `validate-string.test.ts` cases and the golden corpus pass unchanged (AC3); `npm run build` passes with zero TypeScript strict-mode errors (AC4). User-visible contracts — CLI behavior, `jaiph format` round-trip, run artifacts, banner, hooks, exit codes, `__JAIPH_EVENT__` streaming, and the full golden corpus — are unchanged byte-for-byte. Out of scope: rethinking what the canonical form *is* — this refactor only relocates the helper. Docs updated in `docs/architecture.md` (new **No compile-time → runtime imports** bullet under **Validator**; extended **Parser** bullet to document `canonicalizeTripleQuotedString` alongside `parseTripleQuoteBlock`) and `docs/contributing.md` (new **Compile-time / runtime layering** row in the test-layer table). Implements `design/2026-05-15-parser-compiler-simplification.md` § Appendix E. - **Refactor — Replace the 1,441-line validator switch with a per-step visitor table indexed by scope:** `src/transpile/validate.ts` used to be one ~1,441-LoC function with two near-identical inner walkers (`validateRuleStep` ~250 lines, `validateStep` ~350 lines): every step type's validation was written twice with subtle differences, and the five-check call-shape sequence (`validateNoShellRedirection` → `validateNestedManagedCallArgs` → `validateRef` → `validateArity` → `validateBareIdentifierArgs`) was repeated by hand at 6+ sites per side — at least 12 places to keep in sync. Both inner walkers and every duplicated check site are gone. The validator now spans two files. `validate.ts` (~430 LoC) keeps the **outer** layer: import / channel-route / test-block checks and `walkStepTree` (the single descent that builds `{ knownVars, promptSchemas, flat }`). `validate-step.ts` (~1,025 LoC) holds the **per-step** visitor: a single `validateStep(step, ctx)` entry, a `VALIDATORS: Record` table with one row per step variant (`trivia`, `const`, `return`, `send`, `say`, `exec`, `if`, `for_lines`), one `validateExpr(expr, …)` dispatcher over the 8 `Expr.kind` values, and one `validateCallable(expr, ctx)` helper that runs the five managed-call-shape checks once for both `call` (`run`) and `ensure_call` (`ensure`) — parameterized by the scope's `runRefExpect` and the target kind. Rule-vs-workflow differences are captured in a `Scope` value (`WORKFLOW_SCOPE` / `RULE_SCOPE`) with three fields: `allowSteps: Set` (single set-lookup gate at the top of `validateStep` — rules reject `send` outright; rules also reject `prompt` and `run async` from inside `exec` bodies), `runRefExpect: RefExpectMessages` (workflow vs rule semantics for `run ref(…)`), and `withPromptSchemas: boolean` (workflows collect prompt-returning bindings, rules skip schema collection). `ValidatorCtx` threads the scope plus the precomputed `knownVars`, `promptSchemas`, and `recoverBindings` into every visitor — none of which are re-derived per step. Every existing `E_VALIDATE` error message and source location is preserved bit-for-bit: the entire `validate-*.test.ts` suite, `src/transpile/compiler-golden.test.ts`, `src/transpile/compiler-edge.acceptance.test.ts`, and the txtar / golden-AST corpora all pass unchanged. New acceptance tests in `src/transpile/validate-visitor.test.ts` pin the invariants: an LoC test caps `validate.ts` at **≤700 lines** so new per-step logic lands in `validate-step.ts` (AC1); a JSON snapshot over every `validate-*` txtar fixture (`test-fixtures/compiler-txtar/validate-errors.txt` + `validate-errors-multi-module.txt`) stored in `test-fixtures/compiler-txtar/validate-diagnostics-snapshot.json` asserts each diagnostic's `{ code, line, col, message }` against `collectDiagnostics(graph)` bit-for-bit (AC3 — refreshable via `UPDATE_SNAPSHOTS=1` only after confirming the message change is intentional); and an "unknown step type" test casts a synthetic step variant into `WorkflowStepDef`, runs `validateStep` in both `WORKFLOW_SCOPE` and `RULE_SCOPE`, and asserts each call produces exactly one diagnostic with the documented `internal: no validator for step type "…"` message (AC4 — proving that adding a new step type costs exactly one row in `VALIDATORS`). The existing `src/transpile/validate-single-walk.test.ts` still passes — `walkStepTree`'s internal `descend` remains the only recursive `WorkflowStepDef[]` walker in `validate.ts` (AC2). The `diagnostics-collector.test.ts` "fatal allowlist" scan now sums `throw jaiphError(` counts across `validate.ts` + `validate-step.ts` (both files are zero) and `diag.error(` counts likewise (≥40). User-visible contracts — CLI behavior, `jaiph format` round-trip, run artifacts, banner, hooks, exit codes, `__JAIPH_EVENT__` streaming, and the full golden corpus — are unchanged byte-for-byte. Out of scope: changes to validation rules (the *what* — this refactor only changes the *how*), parser changes, AST changes. Docs updated in `docs/architecture.md` (rewrote the **Validator** section to describe the two-file split, `VALIDATORS` table, `Scope` value, and the single `validateCallable` helper), `docs/contributing.md` (new **Validator visitor-table shape** row in the test-layer table), and `docs/grammar.md` (refreshed two stale `validateRuleStep` references to point at the new visitor / `RULE_SCOPE`). Implements `design/2026-05-15-parser-compiler-simplification.md` § Refactor 4. - **Refactor — Replace fail-fast errors with a `Diagnostics` collector that aggregates every recoverable error per compile:** Today `fail()` (in `src/parse/core.ts`) and `jaiphError()` (in `src/errors.ts`) both throw on the first error, so a user fixed one error, recompiled, hit the next, recompiled, and so on. The validator also pre-ordered some checks defensively because it knew it would only get to surface one error per run. That model is replaced. A new `Diagnostics` class lives in `src/diagnostics.ts` and exposes `add(d)`, `error(file, line, col, code, message)` (records the diagnostic and short-circuits the current unit through a `BailoutError`), `capture(fn)` (runs `fn` and absorbs both `BailoutError` and any thrown legacy `jaiphError` whose message parses as `path:line:col CODE message` — turning the throw into a recoverable entry without re-throwing), `hasErrors()` / `hasFatal()`, `sorted()` (stable order by file, then line, then column), `formatLines()` (one `path:line:col CODE message` per line), and a legacy `throwFirstIfAny()` bridge that throws the first sorted diagnostic via `jaiphError` so existing single-error call sites and per-error tests are unchanged. `src/transpile/validate.ts` exposes a new `collectDiagnostics(graph): Diagnostics` entry that walks the import closure and never throws on user-level errors; the previous `validateReferences(graph)` is now a thin wrapper that calls `collectDiagnostics` and then `throwFirstIfAny()`, preserving the throw-on-first contract for `emitScriptsForModuleFromGraph` / `buildScriptsFromGraph` and for every existing `parse-*.test.ts` / `validate-*.test.ts` fixture that asserts one specific `{ message, line, col, code }`. Inside `validate.ts` every `throw jaiphError(...)` site at user-level (~50 sites across import resolution, channel-route validation, per-rule and per-workflow step walks, prompt schema checks, and `validateTestBlocks`) is migrated to `diag.error(...)`; each top-level unit is wrapped in `diag.capture(...)` (per-import block, per-channel route, per-rule walk, per-rule step, per-workflow walk, per-workflow step, per-test-block step) so the bailout from one error unwinds only that unit and the next sibling still runs. The four leaf validation helpers (`validate-ref-resolution.ts`, `validate-string.ts`, `validate-prompt-schema.ts`, `shell-jaiph-guard.ts`) still throw via `jaiphError`, but every caller wraps them in `diag.capture(...)`, which converts the thrown error into a recoverable diagnostic and returns. The CLI command `jaiph compile` (`src/cli/commands/compile.ts`) is rewritten to route through `collectDiagnostics`: it accumulates every error from every entry's import closure, sorts them by `(file, line, col)`, and prints the full set — as a single JSON array on stdout under `--json`, or as one `path:line:col CODE message` line per diagnostic on stderr otherwise — exiting **1** on any non-empty diagnostic set. Fatal aborts during graph load or parsing (unterminated triple-quote, unterminated brace block, missing imports during graph build) are reported as a single diagnostic for the affected entry; the command then continues with the next entry. New tests in `src/transpile/diagnostics-collector.test.ts` pin the invariants: a fixture with three independent errors (duplicate import alias, undefined channel, unknown `run` target in one workflow body) asserts `collectDiagnostics(graph)` returns all three in source order (AC1); a source-tree scan asserts `validate.ts` holds **zero** `throw jaiphError(` sites and **≥40** `diag.error(` sites, and that every remaining `throw jaiphError(` under `src/` lives in the documented fatal allowlist — `src/diagnostics.ts` (legacy bridge), `src/parse/core.ts` (parser `fail()`), `src/cli/commands/test.ts` (test-file shape fatal), `src/transpile/module-graph.ts` (loader), `src/transpile/validate-string.ts`, `src/transpile/validate-prompt-schema.ts`, `src/transpile/validate-ref-resolution.ts`, `src/transpile/shell-jaiph-guard.ts` (leaf helpers, each captured) (AC3); and a CLI test runs `jaiph compile --json` against the same fixture and asserts the returned array has all three diagnostics and `status !== 0` (AC4). Existing single-error tests (every `parse-*.test.ts` and `validate-*.test.ts` that pins one specific `{ message, line, col, code }`) still pass because `validateReferences` continues to throw the first sorted diagnostic (AC2); `npm test` and `npm run build` pass (AC5). User-visible contracts on the `jaiph run` / `jaiph test` paths — banner, hooks, run artifacts, exit codes, `__JAIPH_EVENT__` streaming, and golden corpus — are unchanged. Out of scope: changing what counts as an error (this refactor only changes the *how*); LSP integration follows in a separate task. Docs updated in `docs/architecture.md` (new **Diagnostics collector (recoverable errors)** bullet under **Validator**; updated **System overview** to describe the two entry points and the new `jaiph compile` behavior), `docs/cli.md` (new **Multiple-error reporting** paragraph and refined **`--json`** description under **`jaiph compile`**), and `docs/contributing.md` (new **Diagnostics collector shape** row in the test-layer table). Implements `design/2026-05-15-parser-compiler-simplification.md` § Appendix B. - **Refactor — Fold the validator's three workflow pre-passes into a single step-tree walk:** `src/transpile/validate.ts` used to descend each workflow's / rule's step tree four times before its main check loop finished — `collectKnownVars`, `collectPromptSchemas`, `validateImmutableBindings`, and the per-step validator itself — each re-implementing the same recursion over `if` / `for_lines` / `catch` / `recover` with subtly different rules, so "what counts as a binding here" fixes had to land in two or three walkers. The three pre-pass helpers are deleted. One new helper `walkStepTree(filePath, steps, envDecls, params, declLoc, moduleScripts, parseSchemaFieldNames, { withPromptSchemas })` descends the tree once and returns `{ knownVars, promptSchemas, flat }`: it accumulates `knownVars` (env decls + params + every nested `const` / capture / `for_lines` iterator), `promptSchemas` (top-level prompt-returning bindings — workflow walks set `withPromptSchemas: true`, rule walks set it `false`), enforces immutable-binding and `script`-collision rules inline through a shared `bindings` map (with a fresh inner map under each `for_lines` body so loop iterators only shadow inside the body), and emits a flat `FlatStepEntry[]` of every step in tree order with the enclosing `catch` / `recover` failure binding (`recoverBindings: Set | undefined`) attached. The per-workflow and per-rule validator loops now iterate that flat list non-recursively — the `if` / `for_lines` / `catch` / `recover` recursion that used to live inside `validateStep` / `validateRuleStep` is gone. `walkStepTree`'s internal `descend` is the only recursive helper in the file that takes a `WorkflowStepDef[]`. Failure order matches the prior "binding errors first, then per-step errors" behavior because binding checks fire during the descent, before any flat-list iteration starts. Every existing `E_VALIDATE` error message and location is preserved bit-for-bit: the full `validate-*.test.ts` suite, `src/transpile/compiler-golden.test.ts`, `src/transpile/compiler-edge.acceptance.test.ts`, and the txtar / golden-AST corpora all pass unchanged. New tests pin the invariants: `src/transpile/validate-single-walk.test.ts` greps `validate.ts` and fails if any of `collectKnownVars`, `collectPromptSchemas`, or `validateImmutableBindings` reappear by name (AC1), and a textual AST scan asserts that at most one recursive helper whose parameter list mentions `WorkflowStepDef[]` exists in the file (AC2). User-visible contracts — CLI behavior, `jaiph format` round-trip, run artifacts, banner, hooks, exit codes, `__JAIPH_EVENT__` streaming, and the full golden corpus — are unchanged. Out of scope: the visitor-table refactor (Refactor 4) and any change to validation rules. Docs updated in `docs/architecture.md` (new **Single workflow walk** bullet under **Validator**) and `docs/contributing.md` (new **Validator single-walk shape** row in the test-layer table). Implements `design/2026-05-15-parser-compiler-simplification.md` § Appendix C. diff --git a/QUEUE.md b/QUEUE.md index f81f6046..e6a0a4c8 100644 --- a/QUEUE.md +++ b/QUEUE.md @@ -13,32 +13,6 @@ Process rules: *** -## Decouple the validator from runtime semantics #dev-ready - -**Design reference:** `design/2026-05-15-parser-compiler-simplification.md` § Appendix E. - -**Why:** `src/transpile/validate.ts` imports `tripleQuotedRawForRuntime` from `src/runtime/orchestration-text.ts` so it can compute "what the runtime will see" when validating string content. That is a one-way dependency from compile-time on runtime semantics — a layering inversion that will keep biting if the runtime grows more such helpers. - -**Scope:** - -- Move the canonicalization of triple-quoted strings (currently `tripleQuotedRawForRuntime`) into a parser-side helper (e.g. `src/parse/triple-quote.ts:canonicalizeTripleQuotedString`). -- The validator imports from `src/parse/`, not `src/runtime/`. -- The runtime, if it still needs the same canonical form at runtime, imports from `src/parse/` as well (or the canonical form is baked in at compile time by the emitter). -- Any other `validate*.ts → runtime/*` imports get the same treatment. - -**Acceptance criteria** (each verified by a test): - -1. No file under `src/transpile/` imports from `src/runtime/`. A grep test fails if any such import appears. -2. The canonical string for every triple-quoted form in `test-fixtures/` and `examples/` is bit-for-bit unchanged before and after the move. A test compares pre/post output for every fixture. -3. `npm test` passes, including the golden corpus and all `validate-string.test.ts` cases. -4. `npm run build` passes; TypeScript strict-mode errors are zero. - -**Out of scope:** rethinking what the canonical form *is*. This refactor only relocates the helper. - -**Dependency:** None. - -*** - ## Unify `catch` and `recover` parsing into a single attached-block routine #dev-ready **Design reference:** `design/2026-05-15-parser-compiler-simplification.md` § Refactor 2. diff --git a/docs/architecture.md b/docs/architecture.md index fae2a109..99ceb6bd 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -37,7 +37,7 @@ All orchestration — local `jaiph run`, `jaiph test`, and **Docker `jaiph run`* - **Parser (`src/parser.ts`, `src/parse/*`)** - Converts `.jh`/`.test.jh` into a **semantic AST** (`jaiphModule`) plus a parallel **`Trivia`** store of source-fidelity data. `parsejaiphWithTrivia(source, filePath)` returns `{ ast, trivia }`; the legacy `parsejaiph(source, filePath)` is a thin wrapper that returns only the `ast` for consumers that don't need round-trip data. Both entry points are I/O-pure. - - Reusable primitives: `parseFencedBlock()` (`src/parse/fence.ts`) handles triple-backtick fenced bodies with optional lang tokens for scripts and inline scripts. `parseTripleQuoteBlock()` (`src/parse/triple-quote.ts`) handles `"""..."""` blocks for prompts, `const`, `log`, `logerr`, `fail`, `return`, and `send` — all positions where multiline strings appear. + - Reusable primitives: `parseFencedBlock()` (`src/parse/fence.ts`) handles triple-backtick fenced bodies with optional lang tokens for scripts and inline scripts. `parseTripleQuoteBlock()` (`src/parse/triple-quote.ts`) handles `"""..."""` blocks for prompts, `const`, `log`, `logerr`, `fail`, `return`, and `send` — all positions where multiline strings appear. `canonicalizeTripleQuotedString()` (same file) reproduces the dedent + escape decoding that match-arm bodies still need (they carry an unprocessed `tripleQuoteBodyToRaw`-shaped string plus a `tripleQuotedBody` flag rather than being dedented at parse time); both the validator and the runtime call it, so "what the validator inspects" and "what the runtime executes" are bit-for-bit identical. - **AST / Types (`src/types.ts`)** - Shared compile-time schema (`jaiphModule`, step defs, test defs, hook payload types). The semantic AST carries **only** what the validator, emitter, transpiler, and runtime need; surface-form data that exists purely to round-trip the formatter (leading comments on imports / channels / `const` / `test` blocks, top-level emit order, `config` body sequence, `"""..."""` flags on `literal` / `return` / `log` / `logerr` / `fail` / `send` / `const`, the `bareSource` of `return `, and prompt / script `bodyKind` discriminators) lives in **`Trivia`** instead — see [Trivia (CST layer)](#trivia-cst-layer). @@ -57,6 +57,7 @@ All orchestration — local `jaiph run`, `jaiph test`, and **Docker `jaiph run`* - **Single managed-call-shape helper.** Every `call` / `ensure_call` site runs the same five checks against the typed `Arg[]` directly — shell-redirection rejection (only `literal` args are scanned), nested-unmanaged-call rejection inside `literal` raws, ref resolution (with the scope's `runRefExpect` for `call`, `RULE_REF_EXPECT` for `ensure_call`), arity (`args.length` vs declared params), and `var`-arg resolution against in-scope bindings via `validateArgVarRefs`. The sequence lives once in `validateCallable(expr, ctx)`; both `run` and `ensure` validators invoke it with a different ref expectation / target kind. There is no longer a separate `validateBareIdentifierArgs` helper, no per-site repetition of the five-step sequence, and no place re-parses an `args: string` payload by splitting on commas or rescanning quotes. - **Diagnostics collector (recoverable errors).** The validator no longer fails fast on the first user-level error. Every recoverable check appends to a `Diagnostics` collector (`src/diagnostics.ts`) via `diag.error(file, line, col, code, msg)`, which records a `JaiphDiagnostic` and short-circuits the current validation unit through a `BailoutError`. Each top-level unit (per-import block, per-rule walk, per-rule step, per-workflow walk, per-workflow step, per-test-block step, per-channel route) is wrapped in `diag.capture(fn)`, which absorbs the bailout (and any thrown `jaiphError` from leaf helpers like `validate-ref-resolution.ts` / `validate-string.ts` / `validate-prompt-schema.ts` / `shell-jaiph-guard.ts`) so the next sibling unit still runs. `collectDiagnostics(graph)` walks every module and returns the populated collector; the legacy `validateReferences(graph)` is now a thin wrapper that throws the first sorted diagnostic via `jaiphError` so existing per-error tests and the `emitScriptsForModuleFromGraph` path keep working unchanged. `Diagnostics.sorted()` returns errors ordered by `(file, line, col)`; `formatLines()` renders the standard `path:line:col CODE message` shape. A grep test (`src/transpile/diagnostics-collector.test.ts`) pins the migration: `validate.ts` + `validate-step.ts` hold **zero** `throw jaiphError(` sites, and the remaining `throw jaiphError(` call sites under `src/` are confined to a documented allowlist — fatal aborts in the parser (`src/parse/core.ts`), the loader (`src/transpile/module-graph.ts`), and the test-file shape check (`src/cli/commands/test.ts`); the legacy bridge in `src/diagnostics.ts`; and the four leaf validation helpers above, each of which has every caller wrapped in `diag.capture(...)`. - The validator drives off `WorkflowStepDef.type` (8 variants) and `Expr.kind` (8 variants). For every value-bearing step (`const` / `return` / `send` / `say`) and for the body of every `exec` step, a single `validateExpr(expr, ...)` dispatcher handles the value: it routes `call` / `ensure_call` / `inline_script` to call-site validation (`validateCallable`), walks `match` arms, schema-checks `prompt`, and runs the substitution scanner on `literal` raws. There is no dual code path for "managed sidecar vs literal value" — that branch is gone. + - **No compile-time → runtime imports.** Nothing under `src/transpile/` may `import … from "…/runtime/…"`. Compile-time code must not depend on runtime semantics: when the validator needs the same canonical form the runtime will see (the dedented, escape-decoded view of a triple-quoted match-arm body), both sides import a parser-side helper (`canonicalizeTripleQuotedString` in `src/parse/triple-quote.ts`) rather than reaching across the layer. A grep test (`src/transpile/no-runtime-imports.test.ts`) scans every non-test `*.ts` under `src/transpile/` and fails if any `from "…/runtime/…"` import appears; a separate corpus test (`src/parse/canonicalize-triple-quoted.test.ts`) parses every `.jh` under `test-fixtures/` and `examples/`, collects every triple-quoted match-arm body, and asserts `canonicalizeTripleQuotedString` matches the pre-move `tripleQuotedRawForRuntime` output bit-for-bit. - **Single workflow walk.** Each workflow / rule has its step tree descended exactly once by `walkStepTree` (in `validate.ts`), which simultaneously accumulates `knownVars` (env decls + params + every nested `const` / capture / `for_lines` iterator), `promptSchemas` (top-level prompt-returning bindings, gated by `options.withPromptSchemas` so rules skip schema collection), enforces immutable-binding / `script`-collision rules inline (mutating a shared `bindings` map and threading a fresh inner map under each `for_lines` so loop iterators only shadow inside the body), and emits a flat `FlatStepEntry[]` of every step in tree order with the enclosing `catch` / `recover` failure binding attached. The main per-step validator loop iterates that flat list non-recursively and calls `validateStep` once per entry, so `walkStepTree`'s internal `descend` is the **only** recursive helper in `validate.ts` that takes a `WorkflowStepDef[]`. A pair of grep / AST tests (`src/transpile/validate-single-walk.test.ts`) pins both invariants: the prior helpers (`collectKnownVars`, `collectPromptSchemas`, `validateImmutableBindings`) cannot reappear by name, and at most one recursive `WorkflowStepDef[]` walker may live in `validate.ts`. - **Transpiler (`src/transpiler.ts`, `src/transpile/*`)** diff --git a/docs/contributing.md b/docs/contributing.md index 0bac96df..cb3ee5d1 100644 --- a/docs/contributing.md +++ b/docs/contributing.md @@ -107,6 +107,7 @@ Jaiph uses several test layers. Each layer catches a different class of bug. Use | **`Expr` / step-variant shape** | `src/types-shape.test.ts` | Pins the collapsed-AST invariants from the `Expr` refactor: exactly 8 `WorkflowStepDef` variants and exactly 8 `Expr` kinds (compile-time exhaustive switch + runtime tuple count), no AST placeholder strings (`"__match__"`, `"run inline_script"`, `"__JAIPH_MANAGED__"`) anywhere under `src/`, and `ConstRhs` / `SendRhsDef` no longer exported from `src/types.ts` | You added or renamed a step variant or `Expr` kind, or reshuffled how value positions are encoded; rerun this test to confirm nothing re-introduces the three-way "managed call" encoding (see [Architecture — AST / Types](architecture.md#core-components)) | | **Validator single-walk shape** | `src/transpile/validate-single-walk.test.ts` | Pins the validator's "one descent per workflow / rule" invariant: a name-grep fails if `collectKnownVars`, `collectPromptSchemas`, or `validateImmutableBindings` reappear as separate helpers in `validate.ts`, and a textual AST scan fails if more than one recursive helper walking `WorkflowStepDef[]` lives in that file | You touched `walkStepTree`, added a new pre-pass over workflow steps, or restructured how the validator accumulates `knownVars` / `promptSchemas` / `bindings` — rerun this test to confirm the step tree is still descended exactly once (see [Architecture — Validator](architecture.md#core-components)) | | **Validator visitor-table shape** | `src/transpile/validate-visitor.test.ts` | Pins the per-step visitor refactor: an LoC test caps `validate.ts` at **≤700 lines** so new per-step logic lands in `validate-step.ts` instead of the outer entry; a `JSON` snapshot over every `validate-*` txtar fixture (stored in `test-fixtures/compiler-txtar/validate-diagnostics-snapshot.json`) pins each diagnostic's `{ code, line, col, message }` bit-for-bit, so any drift in wording or location across the visitor table fails the test; an "unknown step type" test injects a synthetic `WorkflowStepDef.type` and asserts it produces exactly one `internal: no validator for step type "…"` diagnostic in both `WORKFLOW_SCOPE` and `RULE_SCOPE` — proving adding a new step type costs exactly one row in `VALIDATORS` | You touched the `VALIDATORS` table, changed `validateStep` / `validateExpr` / `validateCallable` / `Scope`, added or renamed a per-step validator in `validate-step.ts`, or changed any `E_VALIDATE` message wording or source location — refresh the snapshot with `UPDATE_SNAPSHOTS=1` only after confirming the message change is intentional (see [Architecture — Validator](architecture.md#core-components)) | +| **Compile-time / runtime layering** | `src/transpile/no-runtime-imports.test.ts`, `src/parse/canonicalize-triple-quoted.test.ts` | Pins the one-way dependency between compile-time and runtime: a grep over every non-test `*.ts` under `src/transpile/` fails if any `from "…/runtime/…"` import appears, so the validator cannot reach into runtime semantics; a corpus parity test parses every `.jh` under `test-fixtures/` and `examples/`, collects each triple-quoted match-arm body, and asserts `canonicalizeTripleQuotedString` matches the pre-move `tripleQuotedRawForRuntime` output bit-for-bit | You added a new helper used by both the validator and the runtime (it belongs in `src/parse/`, not `src/runtime/`), or you changed how triple-quoted match-arm bodies are canonicalized — rerun this test to confirm the validator stays decoupled from runtime code and the canonical form is unchanged (see [Architecture — Validator](architecture.md#core-components)) | | **Diagnostics collector shape** | `src/transpile/diagnostics-collector.test.ts` | Pins the migration from fail-fast `throw jaiphError(...)` to the `Diagnostics` collector: a fixture with three independent errors (duplicate import alias, undefined channel, unknown `run` target) asserts that `collectDiagnostics(graph)` returns **all three** in source order; a source grep asserts `validate.ts` holds **zero** `throw jaiphError(` sites and many `diag.error(` sites; an allowlist scan over every non-test `*.ts` under `src/` rejects new `throw jaiphError(` sites outside the documented fatal subset (parser `fail()`, loader, test-file shape check, legacy bridge, four leaf helpers wrapped in `diag.capture(...)`); a CLI test asserts `jaiph compile --json` returns the full diagnostic array and exits non-zero | You added a new `throw jaiphError(...)` site, migrated more checks to the collector, changed the fatal/recoverable boundary, or changed `jaiph compile`'s exit-code or output shape (see [Architecture — Validator](architecture.md#core-components) and [CLI](cli.md)) | | **Compiler tests (txtar)** | `test-fixtures/compiler-txtar/*.txt` | Parse and validate outcomes — success, parse errors, validation errors — using language-agnostic txtar fixtures (hundreds of `===` cases across the four `*.txt` files) | You want a portable test case that can be reused by alternative compiler implementations; the test is a `.jh` input paired with an expected outcome | | **Golden AST tests** | `test-fixtures/golden-ast/fixtures/*.jh` + `test-fixtures/golden-ast/expected/*.json` | Parse tree shape for successful parses — serialized to deterministic JSON with locations stripped (9 fixtures: e.g. imports, brace-if, log, match and match-multiline, params, prompt-capture, run-ensure, script-defs) | You changed the parser and need to verify the AST structure hasn't drifted; txtar tests only check pass/fail, goldens lock in the actual tree shape | diff --git a/src/parse/canonicalize-triple-quoted.test.ts b/src/parse/canonicalize-triple-quoted.test.ts new file mode 100644 index 00000000..55f9020a --- /dev/null +++ b/src/parse/canonicalize-triple-quoted.test.ts @@ -0,0 +1,105 @@ +import test from "node:test"; +import assert from "node:assert/strict"; +import { readFileSync, readdirSync, statSync } from "node:fs"; +import { resolve, join } from "node:path"; +import { parsejaiph } from "../parser"; +import { + canonicalizeTripleQuotedString, + tripleQuoteBodyToRaw, +} from "./triple-quote"; +import { dedentCommonLeadingWhitespace } from "./dedent"; +import type { Expr, WorkflowStepDef } from "../types"; + +// Tests run from dist/src/parse/, so repo root is three levels up. +const repoRoot = resolve(__dirname, "../../.."); + +/** + * Verbatim copy of the pre-move `tripleQuotedRawForRuntime` (the helper that + * lived in `src/runtime/orchestration-text.ts`). Used as the parity baseline: + * the new parser-side `canonicalizeTripleQuotedString` must produce bit-for-bit + * identical output for every triple-quoted match-arm body in the corpus. + */ +function legacyTripleQuotedRawForRuntime(raw: string): string { + if (raw.length < 2 || raw[0] !== '"' || raw[raw.length - 1] !== '"') return raw; + const inner = raw.slice(1, -1).replace(/\\"/g, '"').replace(/\\\\/g, "\\"); + return tripleQuoteBodyToRaw(dedentCommonLeadingWhitespace(inner)); +} + +function listJhFiles(dir: string): string[] { + const out: string[] = []; + for (const entry of readdirSync(dir)) { + const abs = join(dir, entry); + if (statSync(abs).isDirectory()) { + out.push(...listJhFiles(abs)); + continue; + } + if (entry.endsWith(".jh") || entry.endsWith(".test.jh")) out.push(abs); + } + return out; +} + +function collectTripleQuotedArmBodies(expr: Expr, bodies: string[]): void { + if (expr.kind === "match") { + for (const arm of expr.match.arms) { + if (arm.tripleQuotedBody) bodies.push(arm.body); + } + } +} + +function walkSteps(steps: WorkflowStepDef[], bodies: string[]): void { + for (const s of steps) { + if (s.type === "const" || s.type === "return") { + collectTripleQuotedArmBodies(s.value, bodies); + } else if (s.type === "send") { + collectTripleQuotedArmBodies(s.value, bodies); + } else if (s.type === "exec") { + collectTripleQuotedArmBodies(s.body, bodies); + if (s.catch) walkSteps("single" in s.catch ? [s.catch.single] : s.catch.block, bodies); + if (s.recover) walkSteps("single" in s.recover ? [s.recover.single] : s.recover.block, bodies); + } else if (s.type === "if") { + walkSteps(s.body, bodies); + } else if (s.type === "for_lines") { + walkSteps(s.body, bodies); + } + } +} + +test("AC2: canonicalizeTripleQuotedString matches pre-move tripleQuotedRawForRuntime bit-for-bit on every fixture", () => { + const roots = [join(repoRoot, "test-fixtures"), join(repoRoot, "examples")]; + const files: string[] = []; + for (const r of roots) { + try { + files.push(...listJhFiles(r)); + } catch { + // root missing in this checkout — skip. + } + } + assert.ok(files.length > 0, "expected to discover .jh fixtures under test-fixtures/ and examples/"); + + let armCount = 0; + for (const file of files) { + const source = readFileSync(file, "utf8"); + let ast; + try { + ast = parsejaiph(source, file); + } catch { + // Fixtures that intentionally fail to parse (e.g. parse-error corpus) are out of scope. + continue; + } + const bodies: string[] = []; + for (const w of ast.workflows) walkSteps(w.steps, bodies); + for (const r of ast.rules) walkSteps(r.steps, bodies); + for (const body of bodies) { + armCount += 1; + assert.equal( + canonicalizeTripleQuotedString(body), + legacyTripleQuotedRawForRuntime(body), + `${file}: canonical form drifted from pre-move tripleQuotedRawForRuntime`, + ); + } + } + assert.ok( + armCount > 0, + "expected at least one triple-quoted match-arm body across the fixture corpus", + ); +}); diff --git a/src/parse/triple-quote.ts b/src/parse/triple-quote.ts index e1a13b8d..e68fa4a6 100644 --- a/src/parse/triple-quote.ts +++ b/src/parse/triple-quote.ts @@ -68,6 +68,23 @@ export function dedentTripleQuotedBody(body: string): string { return dedentCommonLeadingWhitespace(body); } +function unescapeDslDoubleQuotedInner(inner: string): string { + return inner.replace(/\\"/g, '"').replace(/\\\\/g, "\\"); +} + +/** + * Canonicalize a triple-quoted body that was stored in `tripleQuoteBodyToRaw` + * (`"…escaped…"`) form. Used by match-arm bodies, which still carry their own + * `tripleQuotedBody` flag instead of being dedented at parse time. The runtime + * and the validator share this helper so that "what the runtime executes" and + * "what the validator inspects" are bit-for-bit identical. + */ +export function canonicalizeTripleQuotedString(raw: string): string { + if (raw.length < 2 || raw[0] !== '"' || raw[raw.length - 1] !== '"') return raw; + const inner = unescapeDslDoubleQuotedInner(raw.slice(1, -1)); + return tripleQuoteBodyToRaw(dedentCommonLeadingWhitespace(inner)); +} + /** * Helper for step parsers: when a step argument starts with `"""`, splice it back * onto the source line and parse the triple-quoted block. Errors if any content diff --git a/src/runtime/kernel/node-workflow-runtime.ts b/src/runtime/kernel/node-workflow-runtime.ts index fa34f366..b7bdcb1a 100644 --- a/src/runtime/kernel/node-workflow-runtime.ts +++ b/src/runtime/kernel/node-workflow-runtime.ts @@ -13,7 +13,7 @@ import { buildStepDisplayParamPairs } from "../../cli/commands/format-params.js" import { resolveRuleRef, resolveScriptRef, resolveWorkflowRef, type RuntimeGraph } from "./graph"; import type { WorkflowMetadata } from "../../types"; import { extractJson, validateFields } from "./schema"; -import { tripleQuotedRawForRuntime } from "../orchestration-text"; +import { canonicalizeTripleQuotedString } from "../../parse/triple-quote"; import { commaArgsToInterpolated, interpolate, @@ -463,7 +463,7 @@ export class NodeWorkflowRuntime { if (matched) { let body = arm.body.trimStart(); if (arm.tripleQuotedBody) { - body = tripleQuotedRawForRuntime(arm.body).trimStart(); + body = canonicalizeTripleQuotedString(arm.body).trimStart(); } // fail "message" — abort with failure diff --git a/src/runtime/orchestration-text.ts b/src/runtime/orchestration-text.ts deleted file mode 100644 index f31d9af1..00000000 --- a/src/runtime/orchestration-text.ts +++ /dev/null @@ -1,18 +0,0 @@ -import { dedentCommonLeadingWhitespace } from "../parse/dedent"; -import { tripleQuoteBodyToRaw } from "../parse/triple-quote"; - -/** Unescape inner text of a `tripleQuoteBodyToRaw`-shaped `"…"` token (same as format/emit decoders). */ -function unescapeDslDoubleQuotedInner(inner: string): string { - return inner.replace(/\\"/g, '"').replace(/\\\\/g, "\\"); -} - -/** - * Apply common-leading-whitespace dedent to a `tripleQuoteBodyToRaw`-encoded - * value. Still used for match-arm bodies (which carry their own - * `tripleQuotedBody` flag and are not part of the trivia split). - */ -export function tripleQuotedRawForRuntime(raw: string): string { - if (raw.length < 2 || raw[0] !== '"' || raw[raw.length - 1] !== '"') return raw; - const inner = unescapeDslDoubleQuotedInner(raw.slice(1, -1)); - return tripleQuoteBodyToRaw(dedentCommonLeadingWhitespace(inner)); -} diff --git a/src/transpile/no-runtime-imports.test.ts b/src/transpile/no-runtime-imports.test.ts new file mode 100644 index 00000000..6db377ee --- /dev/null +++ b/src/transpile/no-runtime-imports.test.ts @@ -0,0 +1,38 @@ +import test from "node:test"; +import assert from "node:assert/strict"; +import { readFileSync, readdirSync, statSync } from "node:fs"; +import { resolve, join } from "node:path"; + +// Tests run from dist/src/transpile/, so repo root is three levels up. +const repoRoot = resolve(__dirname, "../../.."); +const transpileDir = join(repoRoot, "src/transpile"); + +function listTsFiles(dir: string): string[] { + const out: string[] = []; + for (const entry of readdirSync(dir)) { + const abs = join(dir, entry); + if (statSync(abs).isDirectory()) { + out.push(...listTsFiles(abs)); + continue; + } + if (!entry.endsWith(".ts")) continue; + if (entry.endsWith(".test.ts")) continue; + out.push(abs); + } + return out; +} + +test("AC1: no src/transpile/ production source imports from src/runtime/", () => { + const files = listTsFiles(transpileDir); + assert.ok(files.length > 0, "expected to discover transpile source files"); + for (const abs of files) { + const rel = abs.slice(repoRoot.length + 1); + const content = readFileSync(abs, "utf8"); + const re = /from\s+["'][^"']*\/runtime\/[^"']*["']/; + assert.equal( + re.test(content), + false, + `${rel} imports from src/runtime/ — compile-time must not depend on runtime semantics`, + ); + } +}); diff --git a/src/transpile/validate-step.ts b/src/transpile/validate-step.ts index a672e0c7..ab1057ea 100644 --- a/src/transpile/validate-step.ts +++ b/src/transpile/validate-step.ts @@ -8,7 +8,7 @@ import { Diagnostics } from "../diagnostics"; import { matchSendOperator } from "../parse/core"; import type { Arg, Expr, jaiphModule, MatchExprDef, WorkflowStepDef } from "../types"; -import { tripleQuotedRawForRuntime } from "../runtime/orchestration-text"; +import { canonicalizeTripleQuotedString } from "../parse/triple-quote"; import { BARE_SEND_REF_MSG, lookupKind, @@ -546,7 +546,7 @@ export function validateMatchExpr( ); } } - const bodyTrimmed = (arm.tripleQuotedBody ? tripleQuotedRawForRuntime(arm.body) : arm.body).trimStart(); + const bodyTrimmed = (arm.tripleQuotedBody ? canonicalizeTripleQuotedString(arm.body) : arm.body).trimStart(); if (/^return(\s|$)/.test(bodyTrimmed)) { diag.error( filePath, From 290668cdecaf47c0f4a40f0db8e2af5c1d23bdd4 Mon Sep 17 00:00:00 2001 From: Jakub Dzikowski Date: Sat, 16 May 2026 17:07:04 +0200 Subject: [PATCH 13/14] Refactor: unify catch/recover parsing into one attached-block routine MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit `src/parse/steps.ts` used to contain three near-identical 100+ line functions — `parseEnsureStep`, `parseRunCatchStep`, `parseRunRecoverStep` — that parsed the same ` (binding) { body }` shape and differed only in which host step they decorated and the literal keyword. Their body parser, `parseCatchStatement` (~280 lines), was a stripped-down copy of `parseBlockStatement` recognizing only a fixed subset of statement forms, so the same fix had to land in two places and divergence wasn't always caught by tests. Collapse all four into one `parseAttachedBlock(keyword, host)` entry point in `steps.ts` that parses the bindings and dispatches body statements through the **same** `parseBlockStatement` used at the top level — no mini parser remains. The host side moves to a single `parseRunOrEnsure` helper in `workflow-brace.ts`. `src/parse/steps.ts` drops from 757 → 141 lines; every existing parse error message and location is preserved bit-for-bit (verified by snapshot fixtures), and the full parser/validator/emitter golden corpus passes byte-for-byte. A new `parse-attached-block.test.ts` pins the invariant that any statement form accepted at top level is accepted identically inside `catch (e) { … }` and `recover(e) { … }` without parser changes inside the catch/recover code path. Co-Authored-By: Claude Opus 4.7 (1M context) --- CHANGELOG.md | 1 + QUEUE.md | 27 - docs/architecture.md | 1 + docs/contributing.md | 1 + docs/grammar.md | 8 +- src/parse/parse-attached-block.test.ts | 193 +++++ src/parse/parse-steps.test.ts | 255 +++--- src/parse/steps.ts | 758 ++---------------- src/parse/workflow-brace.ts | 182 +++-- test-fixtures/compiler-txtar/parse-errors.txt | 18 +- 10 files changed, 525 insertions(+), 919 deletions(-) create mode 100644 src/parse/parse-attached-block.test.ts diff --git a/CHANGELOG.md b/CHANGELOG.md index ae98f452..aaf1bed9 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,6 @@ # Unreleased +- **Refactor — Unify `catch` / `recover` parsing into a single attached-block routine sharing the top-level statement parser:** `src/parse/steps.ts` used to contain three near-identical 100+ line functions — `parseEnsureStep`, `parseRunCatchStep`, and `parseRunRecoverStep` — that parsed the same syntactic shape (` (binding) { body } | single-stmt`) and differed only in which host step they decorated (`ensure` vs `run`) and the literal keyword (`catch` vs `recover`). Their body parser, `parseCatchStatement` (~280 lines), was a stripped-down copy of `parseBlockStatement` that recognized only a fixed subset of statement forms (e.g. a `for … in …` head fell through to a shell command) and diverged in subtle ways — the same fix had to land in two places, and divergence wasn't always caught by tests. All four functions and every helper that existed only to serve them are deleted from `src/parse/steps.ts`. The file drops from **757 → ~140 lines**. The new shape: one entry point `parseAttachedBlock(filePath, lines, idx, innerNo, innerRaw, keyword: "catch" | "recover", textAfterKeyword, trivia)` in `src/parse/steps.ts` parses the bindings (`()` — exactly one identifier, with the same too-many / too-few / non-identifier errors as before) and dispatches on the body shape: a `{` at end of host line walks the existing brace-block scanner and delegates each body statement to `parseBraceBlockBody`, an inline `{ stmt[; stmt]* }` splits on `;` via the shared `splitStatementsOnSemicolons` and dispatches each fragment, and a bare single statement is parsed in-place. In all three cases the body statements run through the **same** `parseBlockStatement` (`src/parse/workflow-brace.ts`) that handles top-level statements — there is no mini parser for catch/recover bodies anymore. The host side moves to one helper `parseRunOrEnsure(filePath, lines, idx, …, host: "run" | "ensure", hostBody, isAsync, captureName, trivia)` in `src/parse/workflow-brace.ts`, called from `parseBlockStatement`'s three call sites (`ensure ref(...)`, `run ref(...)`, `run async ref(...)`). It scans `hostBody` once for a trailing ` recover` (run-only) then ` catch ` segment, parses the host call before the keyword, and delegates the attached clause to `parseAttachedBlock`. "Is this statement allowed inside a catch/recover body?" is now a validator concern — `WORKFLOW_SCOPE` and `RULE_SCOPE` in `validate-step.ts` already gate which step types are accepted in each scope, so rules still reject unstructured shell inside `catch` / `recover` bodies; workflows still accept it. New tests in `src/parse/parse-attached-block.test.ts` pin the invariants: AC1 — an LoC test caps `src/parse/steps.ts` at **≤200 lines** and a grep test fails if any function named `parse(Run)?(Catch|Recover|EnsureStep)` reappears; AC2 — a `for line in items { log "$line" }` statement (a `parseBlockStatement`-only form historically) is parsed as a `for_lines` step at the top level, inside `ensure check() catch (e) { … }`, and inside `run target() recover(e) { … }` — proving `parseBlockStatement` is the single entry point for any statement inside a catch / recover body and there is no separate mini parser; AC3 — a 10-case error-snapshot battery asserts every existing parse error message and column (bindings missing, too many bindings, empty inline / multiline block, unterminated multiline block, missing-paren for both `catch` and `recover` on both `run` and `ensure` hosts) is preserved bit-for-bit. The full parser / validator / emitter golden corpus (`src/transpile/compiler-golden.test.ts`, `src/transpile/compiler-edge.acceptance.test.ts`, `parse-steps.test.ts`, `parse-bare-call.test.ts`, `parse-run-async.test.ts`, and the txtar / golden-AST fixtures) passes byte-for-byte (AC4). User-visible contracts — surface syntax for `catch` / `recover`, CLI behavior, `jaiph format` round-trip, run artifacts, banner, hooks, exit codes, `__JAIPH_EVENT__` streaming — are unchanged. Out of scope: the wider tokenizer rewrite (Refactor 1, deferred); validator changes beyond the per-keyword scope rules that already exist. Docs updated in `docs/architecture.md` (extended **Parser** bullet with a new **Unified `run` / `ensure` host parsing** paragraph), `docs/contributing.md` (new **Attached-block parser shape** row in the test-layer table), and `docs/grammar.md` (replaced the stale `parseCatchStatement` reference in the EBNF aside with a note that `parseAttachedBlock` delegates to `parseBlockStatement`). Implements `design/2026-05-15-parser-compiler-simplification.md` § Refactor 2. - **Refactor — Decouple the validator from runtime semantics:** `src/transpile/validate.ts` (now `validate-step.ts`) used to `import { tripleQuotedRawForRuntime } from "../runtime/orchestration-text"` so it could compute "what the runtime will see" when checking the content of a triple-quoted `match`-arm body. That was a one-way dependency from compile-time on runtime semantics — a layering inversion that would have kept biting as the runtime grew more such helpers. The canonicalization helper moves into the parser layer as `canonicalizeTripleQuotedString` in `src/parse/triple-quote.ts` (same algorithm: validate the outer `"…"` shape, unescape DSL-quoted inner with `\"` → `"` and `\\` → `\`, then re-wrap via `tripleQuoteBodyToRaw(dedentCommonLeadingWhitespace(inner))`). Both the validator (`validate-step.ts`'s `validateMatchExpr`) and the runtime (`src/runtime/kernel/node-workflow-runtime.ts`'s match-arm dispatch in `runMatchExpr`) now import that helper from `src/parse/`; the wrapper file `src/runtime/orchestration-text.ts` is deleted. New tests pin the invariants: `src/transpile/no-runtime-imports.test.ts` (AC1) greps every non-test `*.ts` under `src/transpile/` and fails if any `from "…/runtime/…"` import reappears, so compile-time code can no longer reach into runtime semantics; `src/parse/canonicalize-triple-quoted.test.ts` (AC2) parses every `.jh` under `test-fixtures/` and `examples/`, collects every triple-quoted `match`-arm body across workflow / rule step trees, and asserts `canonicalizeTripleQuotedString(body) === legacyTripleQuotedRawForRuntime(body)` bit-for-bit (the legacy implementation is inlined in the test as the parity baseline). Existing `validate-string.test.ts` cases and the golden corpus pass unchanged (AC3); `npm run build` passes with zero TypeScript strict-mode errors (AC4). User-visible contracts — CLI behavior, `jaiph format` round-trip, run artifacts, banner, hooks, exit codes, `__JAIPH_EVENT__` streaming, and the full golden corpus — are unchanged byte-for-byte. Out of scope: rethinking what the canonical form *is* — this refactor only relocates the helper. Docs updated in `docs/architecture.md` (new **No compile-time → runtime imports** bullet under **Validator**; extended **Parser** bullet to document `canonicalizeTripleQuotedString` alongside `parseTripleQuoteBlock`) and `docs/contributing.md` (new **Compile-time / runtime layering** row in the test-layer table). Implements `design/2026-05-15-parser-compiler-simplification.md` § Appendix E. - **Refactor — Replace the 1,441-line validator switch with a per-step visitor table indexed by scope:** `src/transpile/validate.ts` used to be one ~1,441-LoC function with two near-identical inner walkers (`validateRuleStep` ~250 lines, `validateStep` ~350 lines): every step type's validation was written twice with subtle differences, and the five-check call-shape sequence (`validateNoShellRedirection` → `validateNestedManagedCallArgs` → `validateRef` → `validateArity` → `validateBareIdentifierArgs`) was repeated by hand at 6+ sites per side — at least 12 places to keep in sync. Both inner walkers and every duplicated check site are gone. The validator now spans two files. `validate.ts` (~430 LoC) keeps the **outer** layer: import / channel-route / test-block checks and `walkStepTree` (the single descent that builds `{ knownVars, promptSchemas, flat }`). `validate-step.ts` (~1,025 LoC) holds the **per-step** visitor: a single `validateStep(step, ctx)` entry, a `VALIDATORS: Record` table with one row per step variant (`trivia`, `const`, `return`, `send`, `say`, `exec`, `if`, `for_lines`), one `validateExpr(expr, …)` dispatcher over the 8 `Expr.kind` values, and one `validateCallable(expr, ctx)` helper that runs the five managed-call-shape checks once for both `call` (`run`) and `ensure_call` (`ensure`) — parameterized by the scope's `runRefExpect` and the target kind. Rule-vs-workflow differences are captured in a `Scope` value (`WORKFLOW_SCOPE` / `RULE_SCOPE`) with three fields: `allowSteps: Set` (single set-lookup gate at the top of `validateStep` — rules reject `send` outright; rules also reject `prompt` and `run async` from inside `exec` bodies), `runRefExpect: RefExpectMessages` (workflow vs rule semantics for `run ref(…)`), and `withPromptSchemas: boolean` (workflows collect prompt-returning bindings, rules skip schema collection). `ValidatorCtx` threads the scope plus the precomputed `knownVars`, `promptSchemas`, and `recoverBindings` into every visitor — none of which are re-derived per step. Every existing `E_VALIDATE` error message and source location is preserved bit-for-bit: the entire `validate-*.test.ts` suite, `src/transpile/compiler-golden.test.ts`, `src/transpile/compiler-edge.acceptance.test.ts`, and the txtar / golden-AST corpora all pass unchanged. New acceptance tests in `src/transpile/validate-visitor.test.ts` pin the invariants: an LoC test caps `validate.ts` at **≤700 lines** so new per-step logic lands in `validate-step.ts` (AC1); a JSON snapshot over every `validate-*` txtar fixture (`test-fixtures/compiler-txtar/validate-errors.txt` + `validate-errors-multi-module.txt`) stored in `test-fixtures/compiler-txtar/validate-diagnostics-snapshot.json` asserts each diagnostic's `{ code, line, col, message }` against `collectDiagnostics(graph)` bit-for-bit (AC3 — refreshable via `UPDATE_SNAPSHOTS=1` only after confirming the message change is intentional); and an "unknown step type" test casts a synthetic step variant into `WorkflowStepDef`, runs `validateStep` in both `WORKFLOW_SCOPE` and `RULE_SCOPE`, and asserts each call produces exactly one diagnostic with the documented `internal: no validator for step type "…"` message (AC4 — proving that adding a new step type costs exactly one row in `VALIDATORS`). The existing `src/transpile/validate-single-walk.test.ts` still passes — `walkStepTree`'s internal `descend` remains the only recursive `WorkflowStepDef[]` walker in `validate.ts` (AC2). The `diagnostics-collector.test.ts` "fatal allowlist" scan now sums `throw jaiphError(` counts across `validate.ts` + `validate-step.ts` (both files are zero) and `diag.error(` counts likewise (≥40). User-visible contracts — CLI behavior, `jaiph format` round-trip, run artifacts, banner, hooks, exit codes, `__JAIPH_EVENT__` streaming, and the full golden corpus — are unchanged byte-for-byte. Out of scope: changes to validation rules (the *what* — this refactor only changes the *how*), parser changes, AST changes. Docs updated in `docs/architecture.md` (rewrote the **Validator** section to describe the two-file split, `VALIDATORS` table, `Scope` value, and the single `validateCallable` helper), `docs/contributing.md` (new **Validator visitor-table shape** row in the test-layer table), and `docs/grammar.md` (refreshed two stale `validateRuleStep` references to point at the new visitor / `RULE_SCOPE`). Implements `design/2026-05-15-parser-compiler-simplification.md` § Refactor 4. - **Refactor — Replace fail-fast errors with a `Diagnostics` collector that aggregates every recoverable error per compile:** Today `fail()` (in `src/parse/core.ts`) and `jaiphError()` (in `src/errors.ts`) both throw on the first error, so a user fixed one error, recompiled, hit the next, recompiled, and so on. The validator also pre-ordered some checks defensively because it knew it would only get to surface one error per run. That model is replaced. A new `Diagnostics` class lives in `src/diagnostics.ts` and exposes `add(d)`, `error(file, line, col, code, message)` (records the diagnostic and short-circuits the current unit through a `BailoutError`), `capture(fn)` (runs `fn` and absorbs both `BailoutError` and any thrown legacy `jaiphError` whose message parses as `path:line:col CODE message` — turning the throw into a recoverable entry without re-throwing), `hasErrors()` / `hasFatal()`, `sorted()` (stable order by file, then line, then column), `formatLines()` (one `path:line:col CODE message` per line), and a legacy `throwFirstIfAny()` bridge that throws the first sorted diagnostic via `jaiphError` so existing single-error call sites and per-error tests are unchanged. `src/transpile/validate.ts` exposes a new `collectDiagnostics(graph): Diagnostics` entry that walks the import closure and never throws on user-level errors; the previous `validateReferences(graph)` is now a thin wrapper that calls `collectDiagnostics` and then `throwFirstIfAny()`, preserving the throw-on-first contract for `emitScriptsForModuleFromGraph` / `buildScriptsFromGraph` and for every existing `parse-*.test.ts` / `validate-*.test.ts` fixture that asserts one specific `{ message, line, col, code }`. Inside `validate.ts` every `throw jaiphError(...)` site at user-level (~50 sites across import resolution, channel-route validation, per-rule and per-workflow step walks, prompt schema checks, and `validateTestBlocks`) is migrated to `diag.error(...)`; each top-level unit is wrapped in `diag.capture(...)` (per-import block, per-channel route, per-rule walk, per-rule step, per-workflow walk, per-workflow step, per-test-block step) so the bailout from one error unwinds only that unit and the next sibling still runs. The four leaf validation helpers (`validate-ref-resolution.ts`, `validate-string.ts`, `validate-prompt-schema.ts`, `shell-jaiph-guard.ts`) still throw via `jaiphError`, but every caller wraps them in `diag.capture(...)`, which converts the thrown error into a recoverable diagnostic and returns. The CLI command `jaiph compile` (`src/cli/commands/compile.ts`) is rewritten to route through `collectDiagnostics`: it accumulates every error from every entry's import closure, sorts them by `(file, line, col)`, and prints the full set — as a single JSON array on stdout under `--json`, or as one `path:line:col CODE message` line per diagnostic on stderr otherwise — exiting **1** on any non-empty diagnostic set. Fatal aborts during graph load or parsing (unterminated triple-quote, unterminated brace block, missing imports during graph build) are reported as a single diagnostic for the affected entry; the command then continues with the next entry. New tests in `src/transpile/diagnostics-collector.test.ts` pin the invariants: a fixture with three independent errors (duplicate import alias, undefined channel, unknown `run` target in one workflow body) asserts `collectDiagnostics(graph)` returns all three in source order (AC1); a source-tree scan asserts `validate.ts` holds **zero** `throw jaiphError(` sites and **≥40** `diag.error(` sites, and that every remaining `throw jaiphError(` under `src/` lives in the documented fatal allowlist — `src/diagnostics.ts` (legacy bridge), `src/parse/core.ts` (parser `fail()`), `src/cli/commands/test.ts` (test-file shape fatal), `src/transpile/module-graph.ts` (loader), `src/transpile/validate-string.ts`, `src/transpile/validate-prompt-schema.ts`, `src/transpile/validate-ref-resolution.ts`, `src/transpile/shell-jaiph-guard.ts` (leaf helpers, each captured) (AC3); and a CLI test runs `jaiph compile --json` against the same fixture and asserts the returned array has all three diagnostics and `status !== 0` (AC4). Existing single-error tests (every `parse-*.test.ts` and `validate-*.test.ts` that pins one specific `{ message, line, col, code }`) still pass because `validateReferences` continues to throw the first sorted diagnostic (AC2); `npm test` and `npm run build` pass (AC5). User-visible contracts on the `jaiph run` / `jaiph test` paths — banner, hooks, run artifacts, exit codes, `__JAIPH_EVENT__` streaming, and golden corpus — are unchanged. Out of scope: changing what counts as an error (this refactor only changes the *how*); LSP integration follows in a separate task. Docs updated in `docs/architecture.md` (new **Diagnostics collector (recoverable errors)** bullet under **Validator**; updated **System overview** to describe the two entry points and the new `jaiph compile` behavior), `docs/cli.md` (new **Multiple-error reporting** paragraph and refined **`--json`** description under **`jaiph compile`**), and `docs/contributing.md` (new **Diagnostics collector shape** row in the test-layer table). Implements `design/2026-05-15-parser-compiler-simplification.md` § Appendix B. diff --git a/QUEUE.md b/QUEUE.md index e6a0a4c8..cf011274 100644 --- a/QUEUE.md +++ b/QUEUE.md @@ -13,33 +13,6 @@ Process rules: *** -## Unify `catch` and `recover` parsing into a single attached-block routine #dev-ready - -**Design reference:** `design/2026-05-15-parser-compiler-simplification.md` § Refactor 2. - -**Why:** `src/parse/steps.ts` contains three near-identical 100+ line functions — `parseEnsureStep`, `parseRunCatchStep`, `parseRunRecoverStep` — that parse the same syntactic shape (` (binding) { body } | single-stmt`) and differ only in which host step they decorate and the literal keyword. Their body parser, `parseCatchStatement` (~280 lines), re-implements a stripped-down version of `parseBlockStatement` with diverging coverage. - -**Scope:** - -- Replace `parseEnsureStep`, `parseRunCatchStep`, `parseRunRecoverStep`, and `parseCatchStatement` with: - - `parseAttachedBlock(keyword: "catch" | "recover", host: WorkflowStepDef)` returning `{ bindings, body: WorkflowStepDef[] }`. - - A body parsed by the **same** `parseBlockStatement` used at the top level — no mini parser. -- All four functions and any helpers that exist only to serve them are deleted from `src/parse/steps.ts`. -- "Is this statement allowed inside a catch/recover body?" is a validator concern after this refactor, not enforced by which mini-parser branches happen to fire. - -**Acceptance criteria** (each verified by a test): - -1. `src/parse/steps.ts` is at most 200 lines (down from 757), and contains no function whose name matches `/parse(Run)?(Catch|Recover|EnsureStep)/`. A grep/size test fails if either bound is violated. -2. `parseBlockStatement` is the single entry point for any statement appearing inside a catch or recover body. Add a test that introduces a new statement form (behind a test-only flag) and asserts it is accepted identically at top level and inside `catch (e) { … }` and `recover(e) { … }` without parser changes inside the catch/recover code path. -3. Every existing parse error message and location related to `catch` / `recover` (bindings missing, too many bindings, unterminated block, etc.) is preserved bit-for-bit. Snapshot test over `parse-*.test.ts` fixtures. -4. The full parser/validator/emitter golden corpus passes byte-for-byte: `npm test`, including `parse-steps.test.ts`, `parse-bare-call.test.ts`, `parse-run-async.test.ts`, `compiler-golden.test.ts`, `compiler-edge.acceptance.test.ts`. - -**Out of scope:** the wider tokenizer rewrite (next task) — this task explicitly stays on the line-walking parser, since the goal is incremental simplification. Validator changes beyond minor message preservation. - -**Dependency:** Refactor 3 (AST collapse) should be complete first so the unified parser emits `Expr` nodes directly. If it is not, this task may proceed but must avoid introducing new producers of the deprecated `managed:` sidecar. - -*** - ## Replace the line-by-line ad-hoc parser with a tokenizer + recursive-descent parser #dev-ready **Design reference:** `design/2026-05-15-parser-compiler-simplification.md` § Refactor 1. diff --git a/docs/architecture.md b/docs/architecture.md index 99ceb6bd..29c5af75 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -38,6 +38,7 @@ All orchestration — local `jaiph run`, `jaiph test`, and **Docker `jaiph run`* - **Parser (`src/parser.ts`, `src/parse/*`)** - Converts `.jh`/`.test.jh` into a **semantic AST** (`jaiphModule`) plus a parallel **`Trivia`** store of source-fidelity data. `parsejaiphWithTrivia(source, filePath)` returns `{ ast, trivia }`; the legacy `parsejaiph(source, filePath)` is a thin wrapper that returns only the `ast` for consumers that don't need round-trip data. Both entry points are I/O-pure. - Reusable primitives: `parseFencedBlock()` (`src/parse/fence.ts`) handles triple-backtick fenced bodies with optional lang tokens for scripts and inline scripts. `parseTripleQuoteBlock()` (`src/parse/triple-quote.ts`) handles `"""..."""` blocks for prompts, `const`, `log`, `logerr`, `fail`, `return`, and `send` — all positions where multiline strings appear. `canonicalizeTripleQuotedString()` (same file) reproduces the dedent + escape decoding that match-arm bodies still need (they carry an unprocessed `tripleQuoteBodyToRaw`-shaped string plus a `tripleQuotedBody` flag rather than being dedented at parse time); both the validator and the runtime call it, so "what the validator inspects" and "what the runtime executes" are bit-for-bit identical. + - **Unified `run` / `ensure` host parsing.** `run ref(...)`, `run async ref(...)`, and `ensure ref(...)`, optionally followed by `catch (binding) { ... }` (any host) or `recover(binding) { ... }` (`run` only), are parsed by a single helper `parseRunOrEnsure` in `src/parse/workflow-brace.ts`. The attached `catch` / `recover` clause — bindings, body shape (multi-line `{ … }`, inline `{ stmt[; stmt]* }`, or single-statement) — is parsed by **one** helper `parseAttachedBlock(filePath, lines, idx, …, keyword, textAfterKeyword, trivia)` in `src/parse/steps.ts`. There is no separate mini parser for catch/recover bodies: `parseAttachedBlock` delegates each body statement to the **same** `parseBlockStatement` (`src/parse/workflow-brace.ts`) that handles top-level statements, so every statement form accepted in a workflow / rule body is accepted identically inside a `catch` / `recover` body. "Is this statement allowed inside a catch/recover body?" is a validator concern (the `RULE_SCOPE` / `WORKFLOW_SCOPE` distinction in `validate-step.ts`), not enforced by which mini-parser branches happened to fire. `src/parse/steps.ts` is bounded at **≤200 lines** by `src/parse/parse-attached-block.test.ts`, which also asserts no function named `parse(Run)?(Catch|Recover|EnsureStep)` reappears. - **AST / Types (`src/types.ts`)** - Shared compile-time schema (`jaiphModule`, step defs, test defs, hook payload types). The semantic AST carries **only** what the validator, emitter, transpiler, and runtime need; surface-form data that exists purely to round-trip the formatter (leading comments on imports / channels / `const` / `test` blocks, top-level emit order, `config` body sequence, `"""..."""` flags on `literal` / `return` / `log` / `logerr` / `fail` / `send` / `const`, the `bareSource` of `return `, and prompt / script `bodyKind` discriminators) lives in **`Trivia`** instead — see [Trivia (CST layer)](#trivia-cst-layer). diff --git a/docs/contributing.md b/docs/contributing.md index cb3ee5d1..28ebc848 100644 --- a/docs/contributing.md +++ b/docs/contributing.md @@ -107,6 +107,7 @@ Jaiph uses several test layers. Each layer catches a different class of bug. Use | **`Expr` / step-variant shape** | `src/types-shape.test.ts` | Pins the collapsed-AST invariants from the `Expr` refactor: exactly 8 `WorkflowStepDef` variants and exactly 8 `Expr` kinds (compile-time exhaustive switch + runtime tuple count), no AST placeholder strings (`"__match__"`, `"run inline_script"`, `"__JAIPH_MANAGED__"`) anywhere under `src/`, and `ConstRhs` / `SendRhsDef` no longer exported from `src/types.ts` | You added or renamed a step variant or `Expr` kind, or reshuffled how value positions are encoded; rerun this test to confirm nothing re-introduces the three-way "managed call" encoding (see [Architecture — AST / Types](architecture.md#core-components)) | | **Validator single-walk shape** | `src/transpile/validate-single-walk.test.ts` | Pins the validator's "one descent per workflow / rule" invariant: a name-grep fails if `collectKnownVars`, `collectPromptSchemas`, or `validateImmutableBindings` reappear as separate helpers in `validate.ts`, and a textual AST scan fails if more than one recursive helper walking `WorkflowStepDef[]` lives in that file | You touched `walkStepTree`, added a new pre-pass over workflow steps, or restructured how the validator accumulates `knownVars` / `promptSchemas` / `bindings` — rerun this test to confirm the step tree is still descended exactly once (see [Architecture — Validator](architecture.md#core-components)) | | **Validator visitor-table shape** | `src/transpile/validate-visitor.test.ts` | Pins the per-step visitor refactor: an LoC test caps `validate.ts` at **≤700 lines** so new per-step logic lands in `validate-step.ts` instead of the outer entry; a `JSON` snapshot over every `validate-*` txtar fixture (stored in `test-fixtures/compiler-txtar/validate-diagnostics-snapshot.json`) pins each diagnostic's `{ code, line, col, message }` bit-for-bit, so any drift in wording or location across the visitor table fails the test; an "unknown step type" test injects a synthetic `WorkflowStepDef.type` and asserts it produces exactly one `internal: no validator for step type "…"` diagnostic in both `WORKFLOW_SCOPE` and `RULE_SCOPE` — proving adding a new step type costs exactly one row in `VALIDATORS` | You touched the `VALIDATORS` table, changed `validateStep` / `validateExpr` / `validateCallable` / `Scope`, added or renamed a per-step validator in `validate-step.ts`, or changed any `E_VALIDATE` message wording or source location — refresh the snapshot with `UPDATE_SNAPSHOTS=1` only after confirming the message change is intentional (see [Architecture — Validator](architecture.md#core-components)) | +| **Attached-block parser shape** | `src/parse/parse-attached-block.test.ts` | Pins the unified `catch` / `recover` parser refactor: an LoC test caps `src/parse/steps.ts` at **≤200 lines** (down from 757); a grep test fails if any function named `parse(Run)?(Catch|Recover|EnsureStep)` reappears; an "AC2" test introduces a `for … in …` statement (a `parseBlockStatement`-only form historically) at the top level, inside `catch (e) { … }`, and inside `recover(e) { … }`, and asserts it is parsed as a `for_lines` step in **all three** positions — proving `parseBlockStatement` is the single entry point for any statement appearing inside a catch / recover body and there is no separate mini parser; a snapshot battery asserts every existing parse error message and column (bindings missing, too many bindings, empty body, unterminated multiline block) is preserved bit-for-bit | You touched `parseAttachedBlock` / `parseRunOrEnsure` in `src/parse/steps.ts` / `src/parse/workflow-brace.ts`, added a new statement form, or changed any `catch` / `recover` parse-error wording or column — rerun this test to confirm the body parser is still shared with `parseBlockStatement` and the error messages stay byte-for-byte (see [Architecture — Parser](architecture.md#core-components)) | | **Compile-time / runtime layering** | `src/transpile/no-runtime-imports.test.ts`, `src/parse/canonicalize-triple-quoted.test.ts` | Pins the one-way dependency between compile-time and runtime: a grep over every non-test `*.ts` under `src/transpile/` fails if any `from "…/runtime/…"` import appears, so the validator cannot reach into runtime semantics; a corpus parity test parses every `.jh` under `test-fixtures/` and `examples/`, collects each triple-quoted match-arm body, and asserts `canonicalizeTripleQuotedString` matches the pre-move `tripleQuotedRawForRuntime` output bit-for-bit | You added a new helper used by both the validator and the runtime (it belongs in `src/parse/`, not `src/runtime/`), or you changed how triple-quoted match-arm bodies are canonicalized — rerun this test to confirm the validator stays decoupled from runtime code and the canonical form is unchanged (see [Architecture — Validator](architecture.md#core-components)) | | **Diagnostics collector shape** | `src/transpile/diagnostics-collector.test.ts` | Pins the migration from fail-fast `throw jaiphError(...)` to the `Diagnostics` collector: a fixture with three independent errors (duplicate import alias, undefined channel, unknown `run` target) asserts that `collectDiagnostics(graph)` returns **all three** in source order; a source grep asserts `validate.ts` holds **zero** `throw jaiphError(` sites and many `diag.error(` sites; an allowlist scan over every non-test `*.ts` under `src/` rejects new `throw jaiphError(` sites outside the documented fatal subset (parser `fail()`, loader, test-file shape check, legacy bridge, four leaf helpers wrapped in `diag.capture(...)`); a CLI test asserts `jaiph compile --json` returns the full diagnostic array and exits non-zero | You added a new `throw jaiphError(...)` site, migrated more checks to the collector, changed the fatal/recoverable boundary, or changed `jaiph compile`'s exit-code or output shape (see [Architecture — Validator](architecture.md#core-components) and [CLI](cli.md)) | | **Compiler tests (txtar)** | `test-fixtures/compiler-txtar/*.txt` | Parse and validate outcomes — success, parse errors, validation errors — using language-agnostic txtar fixtures (hundreds of `===` cases across the four `*.txt` files) | You want a portable test case that can be reused by alternative compiler implementations; the test is a `.jh` input paired with an expected outcome | diff --git a/docs/grammar.md b/docs/grammar.md index ca4e973a..8538682e 100644 --- a/docs/grammar.md +++ b/docs/grammar.md @@ -1062,9 +1062,11 @@ single_workflow_stmt = ensure_stmt | run_stmt | run_catch_stmt | run_recover_stm | const_decl_step | return_stmt | fail_stmt | log_stmt | logerr_stmt | send_stmt ; - (* Actual catch/recover bodies use parseCatchStatement in src/parse/steps.ts: a richer subset - than this sketch, including inline shell text for workflow recovery blocks — rule bodies still - reject unstructured shell via the visitor's RULE_SCOPE (validate-step.ts). *) + (* Actual catch/recover bodies are parsed by the same parseBlockStatement used at the top level + (dispatched through parseAttachedBlock in src/parse/steps.ts), so every statement form + accepted in a workflow / rule body is accepted identically inside a catch / recover body — + including inline shell text for workflow bodies. Rule bodies still reject unstructured shell + via the visitor's RULE_SCOPE (validate-step.ts). *) ``` ## Validation Rules diff --git a/src/parse/parse-attached-block.test.ts b/src/parse/parse-attached-block.test.ts new file mode 100644 index 00000000..8ec18569 --- /dev/null +++ b/src/parse/parse-attached-block.test.ts @@ -0,0 +1,193 @@ +import test from "node:test"; +import assert from "node:assert/strict"; +import { readFileSync } from "node:fs"; +import { join } from "node:path"; +import { parsejaiph } from "../parser"; +import type { WorkflowStepDef } from "../types"; + +const stepsTsPath = join(process.cwd(), "src/parse/steps.ts"); +const stepsTsSource = readFileSync(stepsTsPath, "utf8"); + +// === AC1: src/parse/steps.ts size + grep budget === + +test("AC1: src/parse/steps.ts is at most 200 lines", () => { + const lineCount = stepsTsSource.split("\n").length; + assert.ok( + lineCount <= 200, + `expected src/parse/steps.ts to be <=200 lines (was 757 before Refactor 2); got ${lineCount}`, + ); +}); + +test("AC1: src/parse/steps.ts has no parse(Run)?(Catch|Recover|EnsureStep) function", () => { + const re = /\bfunction\s+(parse(?:Run)?(?:Catch|Recover|EnsureStep))\b/; + const m = stepsTsSource.match(re); + assert.equal( + m, + null, + `legacy catch/recover host-parser function reappeared in src/parse/steps.ts: ${m && m[1]}`, + ); +}); + +// === AC2: parseBlockStatement is THE entry point for any catch/recover body === +// +// Before Refactor 2, `parseCatchStatement` was a stripped-down copy of +// `parseBlockStatement` that recognised only a fixed subset of statement +// forms. A `for … in …` head, for example, was treated as a shell command. +// After Refactor 2 the same `parseBlockStatement` parses bodies everywhere, +// so introducing a new statement form (here: using `for` as the probe — it +// has always been a parseBlockStatement-only form historically) is accepted +// identically at top level, inside `catch (e) { … }`, and inside +// `recover(e) { … }` without any change to the catch/recover code path. + +function pickFor(steps: WorkflowStepDef[]): WorkflowStepDef | undefined { + return steps.find((s) => s.type === "for_lines"); +} + +const FOR_BODY = [ + ' for line in items {', + ' log "$line"', + ' }', +]; + +test("AC2: top-level for-loop is parsed as `for_lines`", () => { + const src = [ + "workflow w(items) {", + ...FOR_BODY, + "}", + "", + ].join("\n"); + const mod = parsejaiph(src, "ac2-top.jh"); + const w = mod.workflows.find((x) => x.name === "w")!; + const forStep = pickFor(w.steps); + assert.ok(forStep, "expected for_lines step at top level"); +}); + +test("AC2: same for-loop inside catch body parses identically", () => { + const src = [ + "rule check() {", + ' return "ok"', + "}", + "workflow w(items) {", + " ensure check() catch (e) {", + ...FOR_BODY, + " }", + "}", + "", + ].join("\n"); + const mod = parsejaiph(src, "ac2-catch.jh"); + const w = mod.workflows.find((x) => x.name === "w")!; + const ensureStep = w.steps[0]; + assert.equal(ensureStep.type, "exec"); + if (ensureStep.type !== "exec") return; + assert.ok(ensureStep.catch && "block" in ensureStep.catch); + if (!(ensureStep.catch && "block" in ensureStep.catch)) return; + const forStep = pickFor(ensureStep.catch.block); + assert.ok(forStep, "expected for_lines step inside catch body"); +}); + +test("AC2: same for-loop inside recover body parses identically", () => { + const src = [ + "workflow target() {", + ' log "target"', + "}", + "workflow w(items) {", + " run target() recover(e) {", + ...FOR_BODY, + " }", + "}", + "", + ].join("\n"); + const mod = parsejaiph(src, "ac2-recover.jh"); + const w = mod.workflows.find((x) => x.name === "w")!; + const runStep = w.steps[0]; + assert.equal(runStep.type, "exec"); + if (runStep.type !== "exec") return; + assert.ok(runStep.recover && "block" in runStep.recover); + if (!(runStep.recover && "block" in runStep.recover)) return; + const forStep = pickFor(runStep.recover.block); + assert.ok(forStep, "expected for_lines step inside recover body"); +}); + +// === AC3: parse error messages and locations preserved bit-for-bit === +// +// These cover every error message and location the legacy three-function +// catch/recover path produced. They are exhaustively asserted as snapshots. + +type ErrSnap = { name: string; src: string; expected: string }; + +const ERR_SNAPSHOTS: ErrSnap[] = [ + // Bindings paren missing + { + name: "ensure catch: missing bindings paren (EOL)", + src: "workflow w() {\n ensure r() catch\n}\n", + expected: 'fixture.jh:2:14 E_PARSE catch requires explicit bindings and a body: catch () { ... }', + }, + { + name: "ensure catch: bindings open after `{`", + src: "workflow w() {\n ensure r() catch {\n}\n", + expected: 'fixture.jh:2:14 E_PARSE catch requires explicit bindings: catch () { ... }', + }, + { + name: "run catch: missing bindings paren (EOL)", + src: "workflow w() {\n run r() catch\n}\n", + expected: 'fixture.jh:2:11 E_PARSE catch requires explicit bindings and a body: catch () { ... }', + }, + { + name: "run recover: missing bindings paren (EOL)", + src: "workflow w() {\n run r() recover\n}\n", + expected: 'fixture.jh:2:11 E_PARSE recover requires explicit bindings and a body: recover() { ... }', + }, + { + name: "run recover: bindings open after `{`", + src: "workflow w() {\n run r() recover {\n}\n", + expected: 'fixture.jh:2:11 E_PARSE recover requires explicit bindings: recover() { ... }', + }, + + // Too many bindings + { + name: "ensure catch: two bindings rejected", + src: 'workflow w() {\n ensure r() catch (a, b) { log "x" }\n}\n', + expected: 'fixture.jh:2:14 E_PARSE catch accepts exactly one binding: catch () — the second binding (attempt) has been removed', + }, + { + name: "run recover: two bindings rejected", + src: 'workflow w() {\n run r() recover(a, b) { log "x" }\n}\n', + expected: 'fixture.jh:2:11 E_PARSE recover accepts exactly one binding: recover()', + }, + + // Empty body + { + name: "ensure catch: empty inline block rejected", + src: "workflow w() {\n ensure r() catch (e) { }\n}\n", + expected: 'fixture.jh:2:14 E_PARSE catch block must contain at least one statement', + }, + { + name: "ensure catch: empty multiline block rejected", + src: "workflow w() {\n ensure r() catch (e) {\n }\n}\n", + expected: 'fixture.jh:2:14 E_PARSE catch block must contain at least one statement', + }, + { + name: "run recover: empty inline block rejected", + src: "workflow w() {\n run r() recover(e) { }\n}\n", + expected: 'fixture.jh:2:11 E_PARSE recover block must contain at least one statement', + }, + + // Unterminated multiline block + { + name: "ensure catch: unterminated multiline block", + src: 'workflow w() {\n ensure r() catch (e) {\n log "x"\n', + expected: 'fixture.jh:2:14 E_PARSE unterminated catch block, expected "}"', + }, +]; + +for (const s of ERR_SNAPSHOTS) { + test(`AC3 snapshot: ${s.name}`, () => { + let actual = ""; + try { + parsejaiph(s.src, "fixture.jh"); + } catch (e) { + actual = (e as Error).message; + } + assert.equal(actual, s.expected); + }); +} diff --git a/src/parse/parse-steps.test.ts b/src/parse/parse-steps.test.ts index 12c2d7b7..999b3f09 100644 --- a/src/parse/parse-steps.test.ts +++ b/src/parse/parse-steps.test.ts @@ -1,80 +1,77 @@ import test from "node:test"; import assert from "node:assert/strict"; import { parsejaiph } from "../parser"; -import { parseEnsureStep, parseRunRecoverStep } from "./steps"; +import type { WorkflowStepDef } from "../types"; /** - * Helpers to keep individual asserts terse — `parseEnsureStep` / - * `parseRunCatchStep` / `parseRunRecoverStep` all return an `exec` step whose - * body is an `Expr.call` (run) or `Expr.ensure_call` (ensure). + * After Refactor 2 the per-host catch/recover parsers (`parseEnsureStep`, + * `parseRunCatchStep`, `parseRunRecoverStep`) and their mini body parser + * (`parseCatchStatement`) are gone. The contract is now exercised end-to-end + * through `parsejaiph` — `parseAttachedBlock` (in `src/parse/steps.ts`) + * delegates body parsing to the same `parseBlockStatement` used at the top + * level. */ -function asEnsureExec(step: import("../types").WorkflowStepDef) { + +function asEnsureExec(step: WorkflowStepDef) { if (step.type !== "exec" || step.body.kind !== "ensure_call") { throw new Error(`expected exec/ensure_call step, got ${step.type}`); } return step; } -function asRunExec(step: import("../types").WorkflowStepDef) { +function asRunExec(step: WorkflowStepDef) { if (step.type !== "exec" || step.body.kind !== "call") { throw new Error(`expected exec/call step, got ${step.type}`); } return step; } -// === parseEnsureStep: basic ensure without catch === +function parseOneWorkflowStep(bodyLines: string[]): WorkflowStepDef { + const src = ["workflow w() {", ...bodyLines.map((l) => ` ${l}`), "}", ""].join("\n"); + const mod = parsejaiph(src, "fixture.jh"); + const w = mod.workflows.find((x) => x.name === "w"); + if (!w) throw new Error("workflow not found"); + const steps = w.steps.filter((s) => s.type !== "trivia"); + if (steps.length !== 1) throw new Error(`expected one step, got ${steps.length}`); + return steps[0]; +} + +// === ensure: basic === -test("parseEnsureStep: parses basic ensure call", () => { - const lines = [" ensure my_rule()"]; - const { step, nextIdx } = parseEnsureStep("test.jh", lines, 0, 1, lines[0], "my_rule()"); - const e = asEnsureExec(step); +test("ensure: parses basic ensure call", () => { + const e = asEnsureExec(parseOneWorkflowStep(["ensure my_rule()"])); assert.equal(e.body.kind, "ensure_call"); if (e.body.kind === "ensure_call") { assert.equal(e.body.callee.value, "my_rule"); } assert.equal(e.catch, undefined); - assert.equal(nextIdx, 0); }); -test("parseEnsureStep: parses ensure with args", () => { - const lines = [' ensure my_rule("arg1")']; - const { step } = parseEnsureStep("test.jh", lines, 0, 1, lines[0], 'my_rule("arg1")'); - const e = asEnsureExec(step); +test("ensure: parses ensure with args", () => { + const e = asEnsureExec(parseOneWorkflowStep(['ensure my_rule("arg1")'])); if (e.body.kind === "ensure_call") { assert.equal(e.body.callee.value, "my_rule"); assert.deepEqual(e.body.args, [{ kind: "literal", raw: '"arg1"' }]); } }); -test("parseEnsureStep: parses ensure with dotted ref", () => { - const lines = [" ensure lib.check()"]; - const { step } = parseEnsureStep("test.jh", lines, 0, 1, lines[0], "lib.check()"); - const e = asEnsureExec(step); +test("ensure: parses ensure with dotted ref", () => { + const e = asEnsureExec(parseOneWorkflowStep(["ensure lib.check()"])); if (e.body.kind === "ensure_call") { assert.equal(e.body.callee.value, "lib.check"); } }); -test("parseEnsureStep: parses ensure with captureName", () => { - const lines = [" result = ensure my_rule()"]; - const { step } = parseEnsureStep("test.jh", lines, 0, 1, lines[0], "my_rule()", "result"); - const e = asEnsureExec(step); - assert.equal(e.captureName, "result"); -}); - -test("parseEnsureStep: ensure without parens throws", () => { - const lines = [" ensure my_rule"]; +test("ensure: ensure without parens throws", () => { assert.throws( - () => parseEnsureStep("test.jh", lines, 0, 1, lines[0], "my_rule"), + () => parseOneWorkflowStep(["ensure my_rule"]), /parentheses are required/, ); }); -// === parseEnsureStep: catch with single statement === +// === ensure catch: single statement forms === -test("parseEnsureStep: parses ensure with single catch statement", () => { - const lines = [' ensure my_rule() catch (failure) log "failed"']; - const { step } = parseEnsureStep("test.jh", lines, 0, 1, lines[0], 'my_rule() catch (failure) log "failed"'); - const e = asEnsureExec(step); +test("ensure catch: parses single catch log statement", () => { + const e = asEnsureExec(parseOneWorkflowStep(['ensure my_rule() catch (failure) log "failed"'])); assert.ok(e.catch); assert.equal(e.catch!.bindings.failure, "failure"); if (e.catch && "single" in e.catch) { @@ -82,10 +79,8 @@ test("parseEnsureStep: parses ensure with single catch statement", () => { } }); -test("parseEnsureStep: parses ensure with catch run statement", () => { - const lines = [" ensure my_rule() catch (err) run fallback()"]; - const { step } = parseEnsureStep("test.jh", lines, 0, 1, lines[0], "my_rule() catch (err) run fallback()"); - const e = asEnsureExec(step); +test("ensure catch: parses single catch run statement", () => { + const e = asEnsureExec(parseOneWorkflowStep(["ensure my_rule() catch (err) run fallback()"])); assert.ok(e.catch); assert.equal(e.catch!.bindings.failure, "err"); if (e.catch && "single" in e.catch) { @@ -93,18 +88,15 @@ test("parseEnsureStep: parses ensure with catch run statement", () => { } }); -test("parseEnsureStep: parses ensure with catch wait statement", () => { - const lines = [" ensure my_rule() catch (failure) wait"]; +test("ensure catch: wait statement is rejected", () => { assert.throws( - () => parseEnsureStep("test.jh", lines, 0, 1, lines[0], "my_rule() catch (failure) wait"), + () => parseOneWorkflowStep(["ensure my_rule() catch (failure) wait"]), /"wait" has been removed from the language/, ); }); -test("parseEnsureStep: parses ensure with catch fail statement", () => { - const lines = [' ensure my_rule() catch (failure) fail "reason"']; - const { step } = parseEnsureStep("test.jh", lines, 0, 1, lines[0], 'my_rule() catch (failure) fail "reason"'); - const e = asEnsureExec(step); +test("ensure catch: parses single catch fail statement", () => { + const e = asEnsureExec(parseOneWorkflowStep(['ensure my_rule() catch (failure) fail "reason"'])); assert.ok(e.catch); if (e.catch && "single" in e.catch) { assert.equal(e.catch.single.type, "say"); @@ -114,12 +106,10 @@ test("parseEnsureStep: parses ensure with catch fail statement", () => { } }); -// === parseEnsureStep: catch with inline block === +// === ensure catch: inline block === -test("parseEnsureStep: parses ensure with inline catch block", () => { - const lines = [' ensure my_rule() catch (failure) { log "a"; log "b" }']; - const { step } = parseEnsureStep("test.jh", lines, 0, 1, lines[0], 'my_rule() catch (failure) { log "a"; log "b" }'); - const e = asEnsureExec(step); +test("ensure catch: parses inline catch block", () => { + const e = asEnsureExec(parseOneWorkflowStep(['ensure my_rule() catch (failure) { log "a"; log "b" }'])); if (e.catch && "block" in e.catch) { assert.equal(e.catch.block.length, 2); assert.equal(e.catch.block[0].type, "say"); @@ -127,37 +117,32 @@ test("parseEnsureStep: parses ensure with inline catch block", () => { } }); -// === parseEnsureStep: catch with multiline block === +// === ensure catch: multiline block === -test("parseEnsureStep: parses ensure with multiline catch block", () => { - const lines = [ - " ensure my_rule() catch (failure) {", +test("ensure catch: parses multiline catch block", () => { + const e = asEnsureExec(parseOneWorkflowStep([ + "ensure my_rule() catch (failure) {", ' log "recovering"', " run fallback()", " }", - ]; - const { step, nextIdx } = parseEnsureStep("test.jh", lines, 0, 1, lines[0], "my_rule() catch (failure) {"); - const e = asEnsureExec(step); + ])); if (e.catch && "block" in e.catch) { assert.equal(e.catch.block.length, 2); assert.equal(e.catch.block[0].type, "say"); assert.equal(e.catch.block[1].type, "exec"); } - assert.equal(nextIdx, 3); }); -test("parseEnsureStep: multiline catch block with triple-quoted prompt", () => { - const lines = [ - " ensure gate() catch (err) {", +test("ensure catch: multiline block with triple-quoted prompt", () => { + const e = asEnsureExec(parseOneWorkflowStep([ + "ensure gate() catch (err) {", " run save()", ' prompt """', " fix CI", ' """', " run retry()", " }", - ]; - const { step, nextIdx } = parseEnsureStep("test.jh", lines, 0, 1, lines[0], "gate() catch (err) {"); - const e = asEnsureExec(step); + ])); if (e.catch && "block" in e.catch) { assert.equal(e.catch.block.length, 3); assert.equal(e.catch.block[0].type, "exec"); @@ -168,18 +153,15 @@ test("parseEnsureStep: multiline catch block with triple-quoted prompt", () => { } assert.equal(e.catch.block[2].type, "exec"); } - assert.equal(nextIdx, 6); }); -test("parseEnsureStep: catch block lines starting with # are trivia comments", () => { - const lines = [ - " ensure gate() catch (err) {", +test("ensure catch: comment lines become trivia", () => { + const e = asEnsureExec(parseOneWorkflowStep([ + "ensure gate() catch (err) {", " # note", " run retry()", " }", - ]; - const { step } = parseEnsureStep("test.jh", lines, 0, 1, lines[0], "gate() catch (err) {"); - const e = asEnsureExec(step); + ])); if (e.catch && "block" in e.catch) { assert.equal(e.catch.block.length, 2); assert.equal(e.catch.block[0].type, "trivia"); @@ -187,70 +169,67 @@ test("parseEnsureStep: catch block lines starting with # are trivia comments", ( } }); -// === parseEnsureStep: catch bindings === +// === ensure catch: bindings === -test("parseEnsureStep: rejects catch with two bindings", () => { - const lines = [' ensure my_rule() catch (failure, attempt) { log "retry" }']; +test("ensure catch: rejects two bindings", () => { assert.throws( - () => parseEnsureStep("test.jh", lines, 0, 1, lines[0], 'my_rule() catch (failure, attempt) { log "retry" }'), + () => parseOneWorkflowStep(['ensure my_rule() catch (failure, attempt) { log "retry" }']), /catch accepts exactly one binding.*attempt.*has been removed/, ); }); -// === parseEnsureStep: catch errors === +// === ensure catch: error messages === -test("parseEnsureStep: catch at EOL without block throws", () => { - const lines = [" ensure my_rule() catch"]; +test("ensure catch: catch at EOL without block throws", () => { assert.throws( - () => parseEnsureStep("test.jh", lines, 0, 1, lines[0], "my_rule() catch"), + () => parseOneWorkflowStep(["ensure my_rule() catch"]), /catch requires explicit bindings/, ); }); -test("parseEnsureStep: catch without bindings throws", () => { - const lines = [" ensure my_rule() catch {"]; +test("ensure catch: catch without bindings throws", () => { assert.throws( - () => parseEnsureStep("test.jh", lines, 0, 1, lines[0], "my_rule() catch {"), + () => parseOneWorkflowStep(["ensure my_rule() catch {"]), /catch requires explicit bindings/, ); }); -test("parseEnsureStep: unterminated multiline catch block throws", () => { - const lines = [ - " ensure my_rule() catch (failure) {", - ' log "recovering"', - ]; +test("ensure catch: unterminated multiline catch block throws", () => { assert.throws( - () => parseEnsureStep("test.jh", lines, 0, 1, lines[0], "my_rule() catch (failure) {"), + () => parsejaiph( + [ + "workflow w() {", + " ensure my_rule() catch (failure) {", + ' log "recovering"', + "", + ].join("\n"), + "fixture.jh", + ), /unterminated catch block/, ); }); -test("parseEnsureStep: empty catch block throws", () => { - const lines = [ - " ensure my_rule() catch (failure) {", - " }", - ]; +test("ensure catch: empty catch block throws", () => { assert.throws( - () => parseEnsureStep("test.jh", lines, 0, 1, lines[0], "my_rule() catch (failure) {"), + () => parseOneWorkflowStep([ + "ensure my_rule() catch (failure) {", + " }", + ]), /catch block must contain at least one statement/, ); }); -test("parseEnsureStep: empty inline catch block throws", () => { - const lines = [" ensure my_rule() catch (failure) { }"]; +test("ensure catch: empty inline catch block throws", () => { assert.throws( - () => parseEnsureStep("test.jh", lines, 0, 1, lines[0], "my_rule() catch (failure) { }"), + () => parseOneWorkflowStep(["ensure my_rule() catch (failure) { }"]), /catch block must contain at least one statement/, ); }); -// === parseEnsureStep: catch statement types === +// === ensure catch: statement varieties === -test("parseEnsureStep: catch with shell command", () => { - const lines = [" ensure my_rule() catch (failure) echo fallback"]; - const { step } = parseEnsureStep("test.jh", lines, 0, 1, lines[0], "my_rule() catch (failure) echo fallback"); - const e = asEnsureExec(step); +test("ensure catch: single shell command", () => { + const e = asEnsureExec(parseOneWorkflowStep(["ensure my_rule() catch (failure) echo fallback"])); if (e.catch && "single" in e.catch) { assert.equal(e.catch.single.type, "exec"); if (e.catch.single.type === "exec") { @@ -259,10 +238,8 @@ test("parseEnsureStep: catch with shell command", () => { } }); -test("parseEnsureStep: catch with logerr statement", () => { - const lines = [' ensure my_rule() catch (failure) logerr "error msg"']; - const { step } = parseEnsureStep("test.jh", lines, 0, 1, lines[0], 'my_rule() catch (failure) logerr "error msg"'); - const e = asEnsureExec(step); +test("ensure catch: single logerr statement", () => { + const e = asEnsureExec(parseOneWorkflowStep(['ensure my_rule() catch (failure) logerr "error msg"'])); if (e.catch && "single" in e.catch) { assert.equal(e.catch.single.type, "say"); if (e.catch.single.type === "say") { @@ -289,8 +266,7 @@ test("parsejaiph: workflow with ensure catch and multiline triple-quoted prompt" const mod = parsejaiph(src, "catch_prompt.jh"); const w = mod.workflows.find((x) => x.name === "w"); assert.ok(w); - const ensureStep = w!.steps[0]; - const e = asEnsureExec(ensureStep); + const e = asEnsureExec(w!.steps[0]); if (e.catch && "block" in e.catch) { assert.equal(e.catch.block.length, 1); const p = e.catch.block[0]; @@ -301,20 +277,10 @@ test("parsejaiph: workflow with ensure catch and multiline triple-quoted prompt" } }); -// === parseRunRecoverStep: basic recover === +// === run recover === -test("parseRunRecoverStep: returns null when no recover keyword", () => { - const lines = [" run my_workflow()"]; - const result = parseRunRecoverStep("test.jh", lines, 0, 1, lines[0], "my_workflow()"); - assert.equal(result, null); -}); - -test("parseRunRecoverStep: parses run with single recover statement", () => { - const lines = [' run my_workflow() recover(err) log "repairing"']; - const result = parseRunRecoverStep("test.jh", lines, 0, 1, lines[0], 'my_workflow() recover(err) log "repairing"'); - assert.ok(result); - const step = asRunExec(result!.step); - assert.equal(step.body.kind, "call"); +test("run recover: parses single recover statement", () => { + const step = asRunExec(parseOneWorkflowStep(['run my_workflow() recover(err) log "repairing"'])); if (step.body.kind === "call") { assert.equal(step.body.callee.value, "my_workflow"); } @@ -325,11 +291,8 @@ test("parseRunRecoverStep: parses run with single recover statement", () => { } }); -test("parseRunRecoverStep: parses run with inline recover block", () => { - const lines = [' run fix() recover(e) { log "a"; run patch() }']; - const result = parseRunRecoverStep("test.jh", lines, 0, 1, lines[0], 'fix() recover(e) { log "a"; run patch() }'); - assert.ok(result); - const step = asRunExec(result!.step); +test("run recover: parses inline recover block", () => { + const step = asRunExec(parseOneWorkflowStep(['run fix() recover(e) { log "a"; run patch() }'])); if (step.recover && "block" in step.recover) { assert.equal(step.recover.block.length, 2); assert.equal(step.recover.block[0].type, "say"); @@ -337,52 +300,44 @@ test("parseRunRecoverStep: parses run with inline recover block", () => { } }); -test("parseRunRecoverStep: parses run with multiline recover block", () => { - const lines = [ - " run deploy() recover(err) {", +test("run recover: parses multiline recover block", () => { + const step = asRunExec(parseOneWorkflowStep([ + "run deploy() recover(err) {", ' log "retrying"', " run cleanup()", " }", - ]; - const result = parseRunRecoverStep("test.jh", lines, 0, 1, lines[0], "deploy() recover(err) {"); - assert.ok(result); - const step = asRunExec(result!.step); + ])); if (step.recover && "block" in step.recover) { assert.equal(step.recover.block.length, 2); assert.equal(step.recover.block[0].type, "say"); assert.equal(step.recover.block[1].type, "exec"); } - assert.equal(result!.nextIdx, 3); }); -test("parseRunRecoverStep: rejects recover at EOL without body", () => { - const lines = [" run my_workflow() recover"]; +test("run recover: rejects recover at EOL without body", () => { assert.throws( - () => parseRunRecoverStep("test.jh", lines, 0, 1, lines[0], "my_workflow() recover"), + () => parseOneWorkflowStep(["run my_workflow() recover"]), /recover requires explicit bindings/, ); }); -test("parseRunRecoverStep: rejects recover without bindings", () => { - const lines = [" run my_workflow() recover {"]; +test("run recover: rejects recover without bindings", () => { assert.throws( - () => parseRunRecoverStep("test.jh", lines, 0, 1, lines[0], "my_workflow() recover {"), + () => parseOneWorkflowStep(["run my_workflow() recover {"]), /recover requires explicit bindings/, ); }); -test("parseRunRecoverStep: rejects recover with two bindings", () => { - const lines = [' run my_workflow() recover(a, b) { log "x" }']; +test("run recover: rejects recover with two bindings", () => { assert.throws( - () => parseRunRecoverStep("test.jh", lines, 0, 1, lines[0], 'my_workflow() recover(a, b) { log "x" }'), + () => parseOneWorkflowStep(['run my_workflow() recover(a, b) { log "x" }']), /recover accepts exactly one binding/, ); }); -test("parseRunRecoverStep: empty recover block throws", () => { - const lines = [" run my_workflow() recover(err) { }"]; +test("run recover: empty recover block throws", () => { assert.throws( - () => parseRunRecoverStep("test.jh", lines, 0, 1, lines[0], "my_workflow() recover(err) { }"), + () => parseOneWorkflowStep(["run my_workflow() recover(err) { }"]), /recover block must contain at least one statement/, ); }); @@ -406,7 +361,7 @@ test("parsejaiph: workflow with run recover block", () => { const mod = parsejaiph(src, "recover_test.jh"); const w = mod.workflows.find((x) => x.name === "deploy"); assert.ok(w); - const runStep = asRunExec(w!.steps[0]); - assert.ok(runStep.recover); - assert.equal(runStep.catch, undefined); + const step = asRunExec(w!.steps[0]); + assert.ok(step.recover); + assert.equal(step.catch, undefined); }); diff --git a/src/parse/steps.ts b/src/parse/steps.ts index 6150224c..6f3628d3 100644 --- a/src/parse/steps.ts +++ b/src/parse/steps.ts @@ -1,727 +1,141 @@ -import type { CatchBody, Expr, WorkflowStepDef } from "../types"; +import type { CatchBody, WorkflowStepDef } from "../types"; import { createTrivia, type Trivia } from "./trivia"; -import { parseConstRhs } from "./const-rhs"; -import { fail, indexOfClosingDoubleQuote, parseCallRef, parseLogMessageRhs, rejectTrailingContent } from "./core"; -import { parseAnonymousInlineScript } from "./inline-script"; -import { isBareIdentifierReturn, bareIdentifierToQuotedString, isBareDottedIdentifierReturn, dottedReturnToQuotedString } from "./workflow-return-dotted"; -import { parsePromptStep } from "./prompt"; +import { fail } from "./core"; +import { splitStatementsOnSemicolons } from "./statement-split"; +import { parseBlockStatement, parseBraceBlockBody } from "./workflow-brace"; -/** - * Split catch block content into statements on `;` or `\n`, but not inside - * double-quoted strings or triple-quoted `"""…"""` blocks (same idea as - * `splitStatementsOnSemicolons`). - */ -function splitCatchStatements(blockContent: string): string[] { - const statements: string[] = []; - let current = ""; - let inDoubleQuote = false; - let inTripleQuote = false; - let braceDepth = 0; - let i = 0; - while (i < blockContent.length) { - const ch = blockContent[i]; - const next3 = blockContent.slice(i, i + 3); - - if (inTripleQuote) { - if (next3 === '"""') { - current += next3; - inTripleQuote = false; - i += 3; - continue; - } - current += ch; - i += 1; - continue; - } - - if (inDoubleQuote) { - if (ch === '"' && (i === 0 || blockContent[i - 1] !== "\\")) { - inDoubleQuote = false; - } - current += ch; - i += 1; - continue; - } - - if (next3 === '"""') { - inTripleQuote = true; - current += next3; - i += 3; - continue; - } - - if (ch === '"') { - inDoubleQuote = true; - current += ch; - i += 1; - continue; - } - - if (ch === "{") { - braceDepth += 1; - current += ch; - i += 1; - continue; - } - if (ch === "}") { - braceDepth -= 1; - current += ch; - i += 1; - continue; - } - - if (braceDepth === 0 && (ch === ";" || ch === "\n")) { - const trimmed = current.trim(); - if (trimmed) statements.push(trimmed); - current = ""; - i += 1; - continue; - } - - current += ch; - i += 1; - } - const trimmed = current.trim(); - if (trimmed) statements.push(trimmed); - return statements; -} - -/** Build an `exec` step. Inline helper to keep call sites tidy. */ -function execStep( - body: Expr, - loc: { line: number; col: number }, - extras: { captureName?: string; catch?: CatchBody; recover?: CatchBody } = {}, -): WorkflowStepDef { - return { - type: "exec", - body, - ...(extras.captureName ? { captureName: extras.captureName } : {}), - ...(extras.catch ? { catch: extras.catch } : {}), - ...(extras.recover ? { recover: extras.recover } : {}), - loc, - }; -} - -/** Parse a single workflow statement string (e.g. "run foo", "ensure bar", "echo x") into a step. */ -function parseCatchStatement( - filePath: string, - lineNo: number, - col: number, - stmt: string, - trivia: Trivia, -): WorkflowStepDef { - const t = stmt.trim(); - const loc = { line: lineNo, col }; - if (!t) { - fail(filePath, "empty catch statement", lineNo, col); - } - if (t.startsWith("#")) { - return { type: "trivia", kind: "comment", text: t, loc }; - } - if (t === "wait") { - fail(filePath, '"wait" has been removed from the language', lineNo, col); - } - if (t === "return") { - return { type: "return", value: { kind: "literal", raw: '""' }, loc }; - } - if (t.startsWith("return ")) { - const retVal = t.slice("return ".length).trim(); - if (retVal.startsWith("run ")) { - const call = parseCallRef(retVal.slice("run ".length).trim()); - if (call && !call.rest.trim()) { - const callee = { value: call.ref, loc }; - return { - type: "return", - value: { kind: "call", callee, args: call.args }, - loc, - }; - } - } - if (retVal.startsWith("ensure ")) { - const call = parseCallRef(retVal.slice("ensure ".length).trim()); - if (call && !call.rest.trim()) { - const callee = { value: call.ref, loc }; - return { - type: "return", - value: { kind: "ensure_call", callee, args: call.args }, - loc, - }; - } - } - const isBareDotted = isBareDottedIdentifierReturn(retVal); - const isBare = !isBareDotted && isBareIdentifierReturn(retVal); - const raw = isBareDotted - ? dottedReturnToQuotedString(retVal) - : isBare - ? bareIdentifierToQuotedString(retVal) - : retVal; - const value: Expr = { kind: "literal", raw }; - if (isBareDotted || isBare) { - trivia.setNode(value, { bareSource: retVal.trim() }); - } - return { type: "return", value, loc }; - } - if (/^fail\s+/.test(t)) { - const arg = t.slice("fail".length).trimStart(); - if (!arg.startsWith('"')) { - fail(filePath, 'fail must match: fail ""', lineNo, col); - } - const closeIdx = indexOfClosingDoubleQuote(arg, 1); - if (closeIdx === -1) { - fail(filePath, "unterminated fail string", lineNo, col); - } - const raw = arg.slice(0, closeIdx + 1); - return { type: "say", level: "fail", message: { kind: "literal", raw }, loc }; - } - const constMatch = t.match(/^const\s+([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(.+)$/s); - if (constMatch) { - const name = constMatch[1]; - const rhs = constMatch[2].trim(); - const syntheticLines = [t]; - const { value } = parseConstRhs(filePath, syntheticLines, 0, rhs, lineNo, col, false, name, trivia); - return { type: "const", name, value, loc }; - } - const genericAssignMatch = t.match(/^([A-Za-z_][A-Za-z0-9_]*)\s+=\s*(.+)$/s); - if ( - genericAssignMatch && - !genericAssignMatch[2].trimStart().startsWith("prompt ") && - !genericAssignMatch[2].trimStart().startsWith('"') && - !genericAssignMatch[2].trimStart().startsWith("'") && - !genericAssignMatch[2].trimStart().startsWith("$") - ) { - const captureName = genericAssignMatch[1]; - const rest = genericAssignMatch[2].trim(); - if (rest.startsWith("run ") || rest.startsWith("ensure ")) { - fail( - filePath, - `assignment without "const" is no longer supported; use "const ${captureName} = ${rest}"`, - lineNo, - col, - ); - } - } - if (t.startsWith("run ")) { - const runBody = t.slice("run ".length).trim(); - if (runBody.startsWith("`")) { - const result = parseAnonymousInlineScript(filePath, [], lineNo - 1, runBody, lineNo, col); - const body: Expr = { - kind: "inline_script", - body: result.body, - ...(result.lang ? { lang: result.lang } : {}), - args: result.args, - }; - return execStep(body, loc); - } - // Check for run ... recover inside catch/recover blocks - const recoverLoopMatch = runBody.match(/ recover(?=[\s(])/); - if (recoverLoopMatch) { - const recLoopIdx = recoverLoopMatch.index!; - const leftPart = runBody.slice(0, recLoopIdx).trim(); - const rightPart = runBody.slice(recLoopIdx + " recover".length).trimStart(); - const callPart = parseCallRef(leftPart); - if (callPart && !callPart.rest.trim() && rightPart.startsWith("(")) { - const closeParen = rightPart.indexOf(")"); - if (closeParen !== -1) { - const bStr = rightPart.slice(1, closeParen).trim(); - const bParts = bStr.split(",").map((s) => s.trim()).filter(Boolean); - if (bParts.length === 1 && /^[A-Za-z_][A-Za-z0-9_]*$/.test(bParts[0])) { - const bindings = { failure: bParts[0] }; - const after = rightPart.slice(closeParen + 1).trim(); - const callee = { value: callPart.ref, loc }; - const body: Expr = { kind: "call", callee, args: callPart.args }; - if (after.startsWith("{") && after.endsWith("}")) { - const blockContent = after.slice(1, -1).trim(); - const stmts = splitCatchStatements(blockContent); - const blockSteps = stmts.map((s) => parseCatchStatement(filePath, lineNo, col, s, trivia)); - return execStep(body, loc, { recover: { block: blockSteps, bindings } }); - } - if (!after.startsWith("{") && after) { - const singleStep = parseCatchStatement(filePath, lineNo, col, after, trivia); - return execStep(body, loc, { recover: { single: singleStep, bindings } }); - } - } - } - } - } - // Check for run ... catch inside catch blocks - const recIdx = runBody.indexOf(" catch "); - if (recIdx !== -1) { - const leftPart = runBody.slice(0, recIdx).trim(); - const rightPart = runBody.slice(recIdx + " catch ".length).trim(); - const callPart = parseCallRef(leftPart); - if (callPart && !callPart.rest.trim() && rightPart.startsWith("(")) { - const closeParen = rightPart.indexOf(")"); - if (closeParen !== -1) { - const bStr = rightPart.slice(1, closeParen).trim(); - const bParts = bStr.split(",").map((s) => s.trim()).filter(Boolean); - if (bParts.length === 1 && /^[A-Za-z_][A-Za-z0-9_]*$/.test(bParts[0])) { - const bindings = { failure: bParts[0] }; - const after = rightPart.slice(closeParen + 1).trim(); - const callee = { value: callPart.ref, loc }; - const body: Expr = { kind: "call", callee, args: callPart.args }; - if (after.startsWith("{") && after.endsWith("}")) { - const blockContent = after.slice(1, -1).trim(); - const stmts = splitCatchStatements(blockContent); - const blockSteps = stmts.map((s) => parseCatchStatement(filePath, lineNo, col, s, trivia)); - return execStep(body, loc, { catch: { block: blockSteps, bindings } }); - } - if (!after.startsWith("{") && after) { - const singleStep = parseCatchStatement(filePath, lineNo, col, after, trivia); - return execStep(body, loc, { catch: { single: singleStep, bindings } }); - } - } - } - } - } - const call = parseCallRef(runBody); - if (call) { - rejectTrailingContent(filePath, lineNo, "run", call.rest); - const callee = { value: call.ref, loc }; - return execStep({ kind: "call", callee, args: call.args }, loc); - } - } - if (t.startsWith("ensure ")) { - const ensureBody = t.slice("ensure ".length).trim(); - const ensRecIdx = ensureBody.indexOf(" catch "); - if (ensRecIdx !== -1) { - const leftPart = ensureBody.slice(0, ensRecIdx).trim(); - const rightPart = ensureBody.slice(ensRecIdx + " catch ".length).trim(); - const callPart = parseCallRef(leftPart); - if (callPart && !callPart.rest.trim() && rightPart.startsWith("(")) { - const closeParen = rightPart.indexOf(")"); - if (closeParen !== -1) { - const bStr = rightPart.slice(1, closeParen).trim(); - const bParts = bStr.split(",").map((s) => s.trim()).filter(Boolean); - if (bParts.length === 1 && /^[A-Za-z_][A-Za-z0-9_]*$/.test(bParts[0])) { - const bindings = { failure: bParts[0] }; - const after = rightPart.slice(closeParen + 1).trim(); - const callee = { value: callPart.ref, loc }; - const body: Expr = { kind: "ensure_call", callee, args: callPart.args }; - if (after.startsWith("{") && after.endsWith("}")) { - const blockContent = after.slice(1, -1).trim(); - const stmts = splitCatchStatements(blockContent); - const blockSteps = stmts.map((s) => parseCatchStatement(filePath, lineNo, col, s, trivia)); - return execStep(body, loc, { catch: { block: blockSteps, bindings } }); - } - if (!after.startsWith("{") && after) { - const singleStep = parseCatchStatement(filePath, lineNo, col, after, trivia); - return execStep(body, loc, { catch: { single: singleStep, bindings } }); - } - } - } - } - } - const call = parseCallRef(ensureBody); - if (call) { - rejectTrailingContent(filePath, lineNo, "ensure", call.rest); - const callee = { value: call.ref, loc }; - return execStep({ kind: "ensure_call", callee, args: call.args }, loc); - } - } - const promptAssignMatch = t.match( - /^([A-Za-z_][A-Za-z0-9_]*)\s*=\s*prompt\s+(.+)$/s, - ); - if (promptAssignMatch) { - fail( - filePath, - 'use "const name = prompt ..." in catch blocks (e.g. const x = prompt "...")', - lineNo, - col + t.indexOf(promptAssignMatch[1]), - ); - } - if (t.startsWith("prompt ")) { - return parsePromptStep( - filePath, [], lineNo - 1, t.slice("prompt ".length).trimStart(), - col + t.indexOf("prompt"), undefined, trivia, - ).step; - } - if (t.startsWith("log ") || t === "log") { - const logArg = t.slice("log".length).trimStart(); - const logCol = col + Math.max(0, t.indexOf("log")); - const raw = parseLogMessageRhs(filePath, lineNo, logCol, logArg, "log"); - return { type: "say", level: "log", message: { kind: "literal", raw }, loc: { line: lineNo, col: logCol } }; - } - if (t.startsWith("logerr ") || t === "logerr") { - const logerrArg = t.slice("logerr".length).trimStart(); - const logerrCol = col + Math.max(0, t.indexOf("logerr")); - const raw = parseLogMessageRhs(filePath, lineNo, logerrCol, logerrArg, "logerr"); - return { type: "say", level: "logerr", message: { kind: "literal", raw }, loc: { line: lineNo, col: logerrCol } }; - } - return execStep({ kind: "shell", command: t, loc }, loc); -} +const KEYWORD_EXAMPLE = { + catch: "catch () { ... }", + recover: "recover() { ... }", +} as const; /** - * Parse an `ensure [args] [catch ...]` step, with optional captureName. - * Returns the step (`type: "exec"`, `body: ensure_call`) and the updated 0-based line index. + * Parse a `() { … } | ` clause attached to a host + * `run` / `ensure` step. The body is parsed by the same `parseBlockStatement` + * used at the top level — there is no separate mini parser for catch/recover. + * + * `textAfterKeyword` is whatever follows `catch` / `recover` on the host line + * (the leading `(` may be preceded by whitespace). Returns the constructed + * `CatchBody` plus the next line index to resume parsing from. */ -export function parseEnsureStep( +export function parseAttachedBlock( filePath: string, lines: string[], idx: number, innerNo: number, innerRaw: string, - ensureBody: string, - captureName?: string, + keyword: "catch" | "recover", + textAfterKeyword: string, trivia: Trivia = createTrivia(), -): { step: WorkflowStepDef; nextIdx: number } { - const catchIdx = ensureBody.indexOf(" catch "); - const ensureCol = innerRaw.indexOf("ensure") + 1; - const stepLoc = { line: innerNo, col: ensureCol }; - - if (/\scatch$/.test(ensureBody)) { - const catchCol = innerRaw.indexOf("catch") + 1; - fail( - filePath, - 'catch requires explicit bindings and a body: catch () { ... }', - innerNo, - catchCol, - ); - } - - if (catchIdx === -1) { - const call = parseCallRef(ensureBody); - if (!call) { - fail(filePath, "ensure must target a valid reference: ensure ref() or ensure ref(args) — parentheses are required", innerNo); - } - rejectTrailingContent(filePath, innerNo, "ensure", call.rest); - const callee = { value: call.ref, loc: stepLoc }; - return { - step: execStep({ kind: "ensure_call", callee, args: call.args }, stepLoc, { captureName }), - nextIdx: idx, - }; - } - const left = ensureBody.slice(0, catchIdx).trim(); - const right = ensureBody.slice(catchIdx + " catch ".length).trim(); - const call = parseCallRef(left); - if (!call) { - fail(filePath, "ensure must target a valid reference: ensure ref() or ensure ref(args) — parentheses are required", innerNo); - } - rejectTrailingContent(filePath, innerNo, "ensure", call.rest); - const callee = { value: call.ref, loc: stepLoc }; - const args = call.args; - const catchCol = innerRaw.indexOf("catch") + 1; +): { body: CatchBody; nextIdx: number } { + const keywordCol = innerRaw.indexOf(keyword) + 1; + const right = textAfterKeyword.trimStart(); if (!right.startsWith("(")) { fail( filePath, - 'catch requires explicit bindings: catch () { ... }', + `${keyword} requires explicit bindings: ${KEYWORD_EXAMPLE[keyword]}`, innerNo, - catchCol, + keywordCol, ); } - const closeParen = right.indexOf(")"); if (closeParen === -1) { - fail(filePath, 'unterminated catch bindings: expected ")"', innerNo, catchCol); - } - const bindingsStr = right.slice(1, closeParen).trim(); - const bindingParts = bindingsStr.split(",").map((s) => s.trim()).filter(Boolean); - if (bindingParts.length === 0) { - fail(filePath, "catch requires exactly one binding: catch () { ... }", innerNo, catchCol); - } - if (bindingParts.length > 1) { - fail(filePath, 'catch accepts exactly one binding: catch () — the second binding (attempt) has been removed', innerNo, catchCol); + fail(filePath, `unterminated ${keyword} bindings: expected ")"`, innerNo, keywordCol); } - if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(bindingParts[0])) { - fail(filePath, `invalid catch binding name: "${bindingParts[0]}" — must be a valid identifier`, innerNo, catchCol); - } - const bindings = { failure: bindingParts[0] }; - const afterBindings = right.slice(closeParen + 1).trim(); - const body: Expr = { kind: "ensure_call", callee, args }; - - if (afterBindings === "{") { - let blockLines: string[] = []; - let closeLineIdx = -1; - let braceDepth = 1; - for (let look = idx + 1; look < lines.length; look += 1) { - const trimmed = lines[look].trim(); - if (trimmed.endsWith("{")) braceDepth += 1; - if (trimmed === "}") { - braceDepth -= 1; - if (braceDepth === 0) { closeLineIdx = look; break; } - } - blockLines.push(trimmed); - } - if (closeLineIdx === -1) { - fail(filePath, 'unterminated catch block, expected "}"', innerNo, catchCol); - } - const statements = splitCatchStatements(blockLines.join("\n")); - if (statements.length === 0) { - fail(filePath, "catch block must contain at least one statement", innerNo, catchCol); - } - const blockSteps = statements.map((s) => parseCatchStatement(filePath, innerNo, 1, s, trivia)); - return { - step: execStep(body, stepLoc, { captureName, catch: { block: blockSteps, bindings } }), - nextIdx: closeLineIdx, - }; - } - - if (afterBindings.startsWith("{")) { - const closeBrace = afterBindings.indexOf("}"); - if (closeBrace === -1) { - fail(filePath, 'unterminated catch block, expected "}"', innerNo, catchCol); - } - const blockContent = afterBindings.slice(1, closeBrace).trim(); - const statements = splitCatchStatements(blockContent); - if (statements.length === 0) { - fail(filePath, "catch block must contain at least one statement", innerNo, catchCol); - } - const blockSteps = statements.map((s) => parseCatchStatement(filePath, innerNo, catchCol, s, trivia)); - return { - step: execStep(body, stepLoc, { captureName, catch: { block: blockSteps, bindings } }), - nextIdx: idx, - }; - } - - if (!afterBindings) { - fail(filePath, "catch requires a body after bindings", innerNo, catchCol); - } - - const singleStep = parseCatchStatement(filePath, innerNo, catchCol, afterBindings, trivia); - return { - step: execStep(body, stepLoc, { captureName, catch: { single: singleStep, bindings } }), - nextIdx: idx, - }; -} - -/** - * Try to parse `run (args) recover(binding) { ... }` syntax (loop semantics). - * Returns null if the run body does not contain ` recover `. - */ -export function parseRunRecoverStep( - filePath: string, - lines: string[], - idx: number, - innerNo: number, - innerRaw: string, - runBody: string, - captureName?: string, - trivia: Trivia = createTrivia(), -): { step: WorkflowStepDef; nextIdx: number } | null { - const recoverMatch = runBody.match(/ recover(?=[\s(]|$)/); - if (!recoverMatch) return null; - const recoverIdx = recoverMatch.index!; - - if (/ recover$/.test(runBody)) { - const recoverCol = innerRaw.indexOf("recover") + 1; + const bindingParts = right + .slice(1, closeParen) + .split(",") + .map((s) => s.trim()) + .filter(Boolean); + if (bindingParts.length === 0) { fail( filePath, - 'recover requires explicit bindings and a body: recover() { ... }', + `${keyword} requires exactly one binding: ${KEYWORD_EXAMPLE[keyword]}`, innerNo, - recoverCol, + keywordCol, ); } - - const left = runBody.slice(0, recoverIdx).trim(); - const right = runBody.slice(recoverIdx + " recover".length).trimStart(); - const call = parseCallRef(left); - if (!call || call.rest.trim()) return null; - const runCol = innerRaw.indexOf("run") + 1; - const stepLoc = { line: innerNo, col: runCol }; - const recoverCol = innerRaw.indexOf("recover") + 1; - - if (!right.startsWith("(")) { + if (bindingParts.length > 1) { + if (keyword === "catch") { + fail( + filePath, + "catch accepts exactly one binding: catch () — the second binding (attempt) has been removed", + innerNo, + keywordCol, + ); + } + fail(filePath, "recover accepts exactly one binding: recover()", innerNo, keywordCol); + } + if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(bindingParts[0])) { fail( filePath, - 'recover requires explicit bindings: recover() { ... }', + `invalid ${keyword} binding name: "${bindingParts[0]}" — must be a valid identifier`, innerNo, - recoverCol, + keywordCol, ); } - - const closeParen = right.indexOf(")"); - if (closeParen === -1) { - fail(filePath, 'unterminated recover bindings: expected ")"', innerNo, recoverCol); - } - const bindingsStr = right.slice(1, closeParen).trim(); - const bindingParts = bindingsStr.split(",").map((s) => s.trim()).filter(Boolean); - if (bindingParts.length === 0) { - fail(filePath, "recover requires exactly one binding: recover() { ... }", innerNo, recoverCol); - } - if (bindingParts.length > 1) { - fail(filePath, "recover accepts exactly one binding: recover()", innerNo, recoverCol); - } - if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(bindingParts[0])) { - fail(filePath, `invalid recover binding name: "${bindingParts[0]}" — must be a valid identifier`, innerNo, recoverCol); - } const bindings = { failure: bindingParts[0] }; - const afterBindings = right.slice(closeParen + 1).trim(); - const callee = { value: call.ref, loc: stepLoc }; - const body: Expr = { kind: "call", callee, args: call.args }; + // Multi-line block: `{` at end of host line; body lives on subsequent lines. if (afterBindings === "{") { - let blockLines: string[] = []; - let closeLineIdx = -1; - let braceDepth = 1; - for (let look = idx + 1; look < lines.length; look += 1) { - const trimmed = lines[look].trim(); - if (trimmed.endsWith("{")) braceDepth += 1; - if (trimmed === "}") { - braceDepth -= 1; - if (braceDepth === 0) { closeLineIdx = look; break; } + // Pre-scan for the matching `}` so the unterminated message names the clause. + let depth = 1; + let probe = idx + 1; + while (probe < lines.length) { + const t = lines[probe].trim(); + if (t.endsWith("{")) depth += 1; + if (t === "}") { + depth -= 1; + if (depth === 0) break; } - blockLines.push(trimmed); + probe += 1; } - if (closeLineIdx === -1) { - fail(filePath, 'unterminated recover block, expected "}"', innerNo, recoverCol); + if (probe >= lines.length) { + fail(filePath, `unterminated ${keyword} block, expected "}"`, innerNo, keywordCol); } - const statements = splitCatchStatements(blockLines.join("\n")); - if (statements.length === 0) { - fail(filePath, "recover block must contain at least one statement", innerNo, recoverCol); + const { steps, nextIdx } = parseBraceBlockBody(filePath, lines, idx + 1, innerNo, trivia); + if (steps.length === 0) { + fail(filePath, `${keyword} block must contain at least one statement`, innerNo, keywordCol); } - const blockSteps = statements.map((s) => parseCatchStatement(filePath, innerNo, 1, s, trivia)); - return { - step: execStep(body, stepLoc, { captureName, recover: { block: blockSteps, bindings } }), - nextIdx: closeLineIdx, - }; + return { body: { block: steps, bindings }, nextIdx }; } + // Inline block on a single line: `{ stmt[; stmt]* }`. if (afterBindings.startsWith("{")) { - const closeBrace = afterBindings.indexOf("}"); - if (closeBrace === -1) { - fail(filePath, 'unterminated recover block, expected "}"', innerNo, recoverCol); + if (!afterBindings.endsWith("}")) { + fail(filePath, `unterminated ${keyword} block, expected "}"`, innerNo, keywordCol); } - const blockContent = afterBindings.slice(1, closeBrace).trim(); - const statements = splitCatchStatements(blockContent); - if (statements.length === 0) { - fail(filePath, "recover block must contain at least one statement", innerNo, recoverCol); + const content = afterBindings.slice(1, -1).trim(); + const stmts = content === "" ? [] : splitStatementsOnSemicolons(content); + if (stmts.length === 0) { + fail(filePath, `${keyword} block must contain at least one statement`, innerNo, keywordCol); } - const blockSteps = statements.map((s) => parseCatchStatement(filePath, innerNo, recoverCol, s, trivia)); - return { - step: execStep(body, stepLoc, { captureName, recover: { block: blockSteps, bindings } }), - nextIdx: idx, - }; + const blockSteps = stmts.map((stmt) => parseAtHostLine(filePath, idx, stmt, trivia)); + return { body: { block: blockSteps, bindings }, nextIdx: idx + 1 }; } - if (!afterBindings) { - fail(filePath, "recover requires a body after bindings", innerNo, recoverCol); + if (afterBindings === "") { + fail(filePath, `${keyword} requires a body after bindings`, innerNo, keywordCol); } - const singleStep = parseCatchStatement(filePath, innerNo, recoverCol, afterBindings, trivia); - return { - step: execStep(body, stepLoc, { captureName, recover: { single: singleStep, bindings } }), - nextIdx: idx, - }; + const single = parseAtHostLine(filePath, idx, afterBindings, trivia); + return { body: { single, bindings }, nextIdx: idx + 1 }; } /** - * Try to parse `run (args) catch (bindings) { ... }` syntax. - * Returns null if the run body does not contain ` catch `. + * Parse a single statement string as if it lived on the host line. Padded + * lines preserve the source line number in nested error messages. */ -export function parseRunCatchStep( +function parseAtHostLine( filePath: string, - lines: string[], - idx: number, - innerNo: number, - innerRaw: string, - runBody: string, - captureName?: string, - trivia: Trivia = createTrivia(), -): { step: WorkflowStepDef; nextIdx: number } | null { - const catchIdx = runBody.indexOf(" catch "); - if (catchIdx === -1) return null; - - if (/\scatch$/.test(runBody)) { - const catchCol = innerRaw.indexOf("catch") + 1; - fail( - filePath, - 'catch requires explicit bindings and a body: catch () { ... }', - innerNo, - catchCol, - ); - } - - const left = runBody.slice(0, catchIdx).trim(); - const right = runBody.slice(catchIdx + " catch ".length).trim(); - const call = parseCallRef(left); - if (!call || call.rest.trim()) return null; - const runCol = innerRaw.indexOf("run") + 1; - const stepLoc = { line: innerNo, col: runCol }; - const catchCol = innerRaw.indexOf("catch") + 1; - - if (!right.startsWith("(")) { - fail( - filePath, - 'catch requires explicit bindings: catch () { ... }', - innerNo, - catchCol, - ); - } - - const closeParen = right.indexOf(")"); - if (closeParen === -1) { - fail(filePath, 'unterminated catch bindings: expected ")"', innerNo, catchCol); - } - const bindingsStr = right.slice(1, closeParen).trim(); - const bindingParts = bindingsStr.split(",").map((s) => s.trim()).filter(Boolean); - if (bindingParts.length === 0) { - fail(filePath, "catch requires exactly one binding: catch () { ... }", innerNo, catchCol); - } - if (bindingParts.length > 1) { - fail(filePath, 'catch accepts exactly one binding: catch () — the second binding (attempt) has been removed', innerNo, catchCol); - } - if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(bindingParts[0])) { - fail(filePath, `invalid catch binding name: "${bindingParts[0]}" — must be a valid identifier`, innerNo, catchCol); - } - const bindings = { failure: bindingParts[0] }; - - const afterBindings = right.slice(closeParen + 1).trim(); - const callee = { value: call.ref, loc: stepLoc }; - const body: Expr = { kind: "call", callee, args: call.args }; - - if (afterBindings === "{") { - let blockLines: string[] = []; - let closeLineIdx = -1; - let braceDepth = 1; - for (let look = idx + 1; look < lines.length; look += 1) { - const trimmed = lines[look].trim(); - if (trimmed.endsWith("{")) braceDepth += 1; - if (trimmed === "}") { - braceDepth -= 1; - if (braceDepth === 0) { closeLineIdx = look; break; } - } - blockLines.push(trimmed); - } - if (closeLineIdx === -1) { - fail(filePath, 'unterminated catch block, expected "}"', innerNo, catchCol); - } - const statements = splitCatchStatements(blockLines.join("\n")); - if (statements.length === 0) { - fail(filePath, "catch block must contain at least one statement", innerNo, catchCol); - } - const blockSteps = statements.map((s) => parseCatchStatement(filePath, innerNo, 1, s, trivia)); - return { - step: execStep(body, stepLoc, { captureName, catch: { block: blockSteps, bindings } }), - nextIdx: closeLineIdx, - }; - } - - if (afterBindings.startsWith("{")) { - const closeBrace = afterBindings.indexOf("}"); - if (closeBrace === -1) { - fail(filePath, 'unterminated catch block, expected "}"', innerNo, catchCol); - } - const blockContent = afterBindings.slice(1, closeBrace).trim(); - const statements = splitCatchStatements(blockContent); - if (statements.length === 0) { - fail(filePath, "catch block must contain at least one statement", innerNo, catchCol); - } - const blockSteps = statements.map((s) => parseCatchStatement(filePath, innerNo, catchCol, s, trivia)); - return { - step: execStep(body, stepLoc, { captureName, catch: { block: blockSteps, bindings } }), - nextIdx: idx, - }; - } - - if (!afterBindings) { - fail(filePath, "catch requires a body after bindings", innerNo, catchCol); - } - - const singleStep = parseCatchStatement(filePath, innerNo, catchCol, afterBindings, trivia); - return { - step: execStep(body, stepLoc, { captureName, catch: { single: singleStep, bindings } }), - nextIdx: idx, - }; + hostIdx: number, + stmt: string, + trivia: Trivia, +): WorkflowStepDef { + const padded = new Array(hostIdx).fill(""); + padded.push(stmt); + return parseBlockStatement(filePath, padded, hostIdx, trivia).step; } diff --git a/src/parse/workflow-brace.ts b/src/parse/workflow-brace.ts index 5bf66feb..56f0c698 100644 --- a/src/parse/workflow-brace.ts +++ b/src/parse/workflow-brace.ts @@ -14,7 +14,7 @@ import { consumeTripleQuotedArg, dedentTripleQuotedBody, tripleQuoteBodyToRaw } import { parseConstRhs } from "./const-rhs"; import { parseAnonymousInlineScript } from "./inline-script"; import { parseConfigBlock } from "./metadata"; -import { parseEnsureStep, parseRunCatchStep, parseRunRecoverStep } from "./steps"; +import { parseAttachedBlock } from "./steps"; import { parsePromptStep } from "./prompt"; import { parseSendRhs } from "./send-rhs"; import { parseMatchExpr } from "./match"; @@ -115,6 +115,120 @@ function execStep( }; } +/** + * Parse `run [async] (args)` or `ensure (args)`, optionally followed + * by `catch (binding) { ... }` or — for `run` only — `recover(binding) { ... }`. + * + * The catch/recover clause is parsed via the unified `parseAttachedBlock`, whose + * body uses the same `parseBlockStatement` as the top-level dispatcher. + */ +function parseRunOrEnsure( + filePath: string, + lines: string[], + idx: number, + innerNo: number, + innerRaw: string, + host: "run" | "ensure", + hostBody: string, + isAsync: boolean, + captureName: string | undefined, + trivia: Trivia, +): { step: WorkflowStepDef; nextIdx: number } { + const hostName = host === "ensure" ? "ensure" : isAsync ? "run async" : "run"; + const hostCol = innerRaw.indexOf(host) + 1; + const stepLoc = { line: innerNo, col: hostCol }; + + if (/\scatch$/.test(hostBody)) { + fail( + filePath, + 'catch requires explicit bindings and a body: catch () { ... }', + innerNo, + innerRaw.indexOf("catch") + 1, + ); + } + if (host === "run" && / recover$/.test(hostBody)) { + fail( + filePath, + 'recover requires explicit bindings and a body: recover() { ... }', + innerNo, + innerRaw.indexOf("recover") + 1, + ); + } + + let attached: + | { keyword: "catch" | "recover"; left: string; after: string } + | null = null; + if (host === "run") { + const m = hostBody.match(/ recover(?=[\s(])/); + if (m) { + const pos = m.index!; + attached = { + keyword: "recover", + left: hostBody.slice(0, pos).trim(), + after: hostBody.slice(pos + " recover".length), + }; + } + } + if (!attached) { + const ci = hostBody.indexOf(" catch "); + if (ci !== -1) { + attached = { + keyword: "catch", + left: hostBody.slice(0, ci).trim(), + after: hostBody.slice(ci + " catch ".length), + }; + } + } + + // `run` falls back to plain parsing when the call before catch/recover has + // trailing content, preserving the legacy "unexpected content" error shape. + if (attached && host === "run") { + const probe = parseCallRef(attached.left); + if (!probe || probe.rest.trim()) { + attached = null; + } + } + + if (!attached) { + const call = parseCallRef(hostBody); + if (!call) { + fail( + filePath, + `${hostName} must target a valid reference: ${hostName} ref() or ${hostName} ref(args) — parentheses are required`, + innerNo, + ); + } + rejectTrailingContent(filePath, innerNo, hostName, call.rest); + const callee = { value: call.ref, loc: stepLoc }; + const body: Expr = host === "ensure" + ? { kind: "ensure_call", callee, args: call.args } + : { kind: "call", callee, args: call.args, ...(isAsync ? { async: true as const } : {}) }; + return { step: execStep(body, stepLoc, { captureName }), nextIdx: idx + 1 }; + } + + const call = parseCallRef(attached.left); + if (!call) { + fail( + filePath, + `${hostName} must target a valid reference: ${hostName} ref() or ${hostName} ref(args) — parentheses are required`, + innerNo, + ); + } + rejectTrailingContent(filePath, innerNo, hostName, call.rest); + const callee = { value: call.ref, loc: stepLoc }; + const body: Expr = host === "ensure" + ? { kind: "ensure_call", callee, args: call.args } + : { kind: "call", callee, args: call.args, ...(isAsync ? { async: true as const } : {}) }; + + const result = parseAttachedBlock( + filePath, lines, idx, innerNo, innerRaw, attached.keyword, attached.after, trivia, + ); + const extras = attached.keyword === "catch" + ? { captureName, catch: result.body } + : { captureName, recover: result.body }; + return { step: execStep(body, stepLoc, extras), nextIdx: result.nextIdx }; +} + /** * One workflow statement inside `{ … }` (catch body, etc.). */ @@ -256,11 +370,9 @@ export function parseBlockStatement( if (inner.startsWith("ensure ")) { const ensureBody = inner.slice("ensure ".length).trim(); - const r = parseEnsureStep( - filePath, lines, idx, innerNo, innerRaw, - ensureBody, undefined, trivia, + return parseRunOrEnsure( + filePath, lines, idx, innerNo, innerRaw, "ensure", ensureBody, false, undefined, trivia, ); - return { step: r.step, nextIdx: r.nextIdx + 1 }; } if (inner.startsWith("run async ")) { @@ -269,37 +381,9 @@ export function parseBlockStatement( if (runBody.startsWith("`")) { fail(filePath, "run async is not supported with inline scripts", innerNo, runCol); } - // run async ... recover(name) { ... } - const recoverResult = parseRunRecoverStep(filePath, lines, idx, innerNo, innerRaw, runBody, undefined, trivia); - if (recoverResult && recoverResult.step.type === "exec" && recoverResult.step.body.kind === "call") { - const body: Expr = { ...recoverResult.step.body, async: true }; - return { - step: { ...recoverResult.step, body }, - nextIdx: recoverResult.nextIdx + 1, - }; - } - // run async ... catch(name) { ... } - const catchResult = parseRunCatchStep(filePath, lines, idx, innerNo, innerRaw, runBody, undefined, trivia); - if (catchResult && catchResult.step.type === "exec" && catchResult.step.body.kind === "call") { - const body: Expr = { ...catchResult.step.body, async: true }; - return { - step: { ...catchResult.step, body }, - nextIdx: catchResult.nextIdx + 1, - }; - } - const call = parseCallRef(runBody); - if (!call) { - fail(filePath, "run async must target a valid reference: run async ref() or run async ref(args) — parentheses are required", innerNo); - } - rejectTrailingContent(filePath, innerNo, "run async", call.rest); - const callee = { value: call.ref, loc: { line: innerNo, col: runCol } }; - return { - step: execStep( - { kind: "call", callee, args: call.args, async: true }, - { line: innerNo, col: runCol }, - ), - nextIdx: idx + 1, - }; + return parseRunOrEnsure( + filePath, lines, idx, innerNo, innerRaw, "run", runBody, true, undefined, trivia, + ); } if (inner.startsWith("run ")) { @@ -323,29 +407,9 @@ export function parseBlockStatement( if (runBody.startsWith("script(") || runBody.startsWith("script (")) { fail(filePath, 'inline script syntax has changed: use run `body`(args) instead of run script(args) "body"', innerNo); } - // Check for run ... recover (loop semantics) - const recoverResult = parseRunRecoverStep(filePath, lines, idx, innerNo, innerRaw, runBody, undefined, trivia); - if (recoverResult) { - return { step: recoverResult.step, nextIdx: recoverResult.nextIdx + 1 }; - } - // Check for run ... catch - const catchResult = parseRunCatchStep(filePath, lines, idx, innerNo, innerRaw, runBody, undefined, trivia); - if (catchResult) { - return { step: catchResult.step, nextIdx: catchResult.nextIdx + 1 }; - } - const call = parseCallRef(runBody); - if (!call) { - fail(filePath, "run must target a valid reference: run ref() or run ref(args) — parentheses are required", innerNo); - } - rejectTrailingContent(filePath, innerNo, "run", call.rest); - const callee = { value: call.ref, loc: { line: innerNo, col: runCol } }; - return { - step: execStep( - { kind: "call", callee, args: call.args }, - { line: innerNo, col: runCol }, - ), - nextIdx: idx + 1, - }; + return parseRunOrEnsure( + filePath, lines, idx, innerNo, innerRaw, "run", runBody, false, undefined, trivia, + ); } if (forRule && (inner.startsWith("prompt ") || /^[A-Za-z_][A-Za-z0-9_]*\s*=\s*prompt\s/.test(inner))) { diff --git a/test-fixtures/compiler-txtar/parse-errors.txt b/test-fixtures/compiler-txtar/parse-errors.txt index 35f01712..fd26dfbd 100644 --- a/test-fixtures/compiler-txtar/parse-errors.txt +++ b/test-fixtures/compiler-txtar/parse-errors.txt @@ -919,7 +919,9 @@ workflow default() { } === catch: fail without double quote -# @expect error E_PARSE "fail must match" @5:1 +# Body parsing unified with parseBlockStatement (Refactor 2). Error now points +# to the inner statement's actual line/col. +# @expect error E_PARSE "fail must match" @6:5 --- input.jh rule check() { return "ok" @@ -931,7 +933,7 @@ workflow default() { } === catch: unterminated fail string -# @expect error E_PARSE "unterminated fail string" @5:1 +# @expect error E_PARSE "multiline strings use triple quotes: fail" @6:5 --- input.jh rule check() { return "ok" @@ -943,7 +945,7 @@ workflow default() { } === catch: log without double quote -# @expect error E_PARSE "log must match" @5:1 +# @expect error E_PARSE "log must match" @6:5 --- input.jh rule check() { return "ok" @@ -955,7 +957,7 @@ workflow default() { } === catch: unterminated log string -# @expect error E_PARSE "unterminated log string" @5:1 +# @expect error E_PARSE "multiline strings use triple quotes: log" @6:5 --- input.jh rule check() { return "ok" @@ -967,7 +969,7 @@ workflow default() { } === catch: logerr without double quote -# @expect error E_PARSE "logerr must match" @5:1 +# @expect error E_PARSE "logerr must match" @6:5 --- input.jh rule check() { return "ok" @@ -979,7 +981,7 @@ workflow default() { } === catch: unterminated logerr string -# @expect error E_PARSE "unterminated logerr string" @5:1 +# @expect error E_PARSE "multiline strings use triple quotes: logerr" @6:5 --- input.jh rule check() { return "ok" @@ -1439,7 +1441,7 @@ workflow default() { } === catch: prompt capture without const -# @expect error E_PARSE "const name = prompt" @5:1 +# @expect error E_PARSE "const name = prompt" @6:5 --- input.jh rule check() { return "ok" @@ -1899,7 +1901,7 @@ workflow default() { } === inline catch unterminated fail string -# @expect error E_PARSE "unterminated fail string" +# @expect error E_PARSE "multiline strings use triple quotes: fail" --- input.jh rule check() { return "ok" From b9347a019e9cd7790ac9d675c04f321b4135d5ef Mon Sep 17 00:00:00 2001 From: Jakub Dzikowski Date: Sat, 16 May 2026 17:35:11 +0200 Subject: [PATCH 14/14] Refactor: replace parseBlockStatement cascade with dispatch table MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit parseBlockStatement used to dispatch each statement form via an ordered cascade of startsWith + regex tests (run async before run, prompt before bare assignment, etc.), so adding a keyword meant finding the right slot and any reordering risked changing which branch fired. The cascade is replaced by a STATEMENT: Record table keyed by the leading keyword: the dispatcher tokenizes the first identifier on the trimmed line, looks it up, and invokes the matching handler — which returns a result, returns null to fall through, or calls fail(...). The shared per-line context is now a BlockCtx record threaded into every handler. Assignment-shape guards run in applyAssignmentGuards before the lookup. Surface syntax and every existing parse-error message / line / col are preserved. New tests pin the invariants: an error snapshot test compares { file, line, col, code, message } for every fixture in parse-errors.txt against parse-errors-snapshot.json, and a synthetic-keyword test patches STATEMENT at runtime to prove adding a top-level keyword is a two-place change (STATEMENT row + JAIPH_KEYWORDS entry). The wider tokenizer rewrite remains future work. Co-Authored-By: Claude Opus 4.7 (1M context) --- CHANGELOG.md | 1 + QUEUE.md | 39 +- docs/architecture.md | 1 + docs/contributing.md | 1 + docs/grammar.md | 5 +- src/parse/parse-error-snapshot.test.ts | 158 ++ src/parse/parse-synthetic-keyword.test.ts | 87 + src/parse/workflow-brace.ts | 760 ++++--- .../compiler-txtar/parse-errors-snapshot.json | 1969 +++++++++++++++++ 9 files changed, 2601 insertions(+), 420 deletions(-) create mode 100644 src/parse/parse-error-snapshot.test.ts create mode 100644 src/parse/parse-synthetic-keyword.test.ts create mode 100644 test-fixtures/compiler-txtar/parse-errors-snapshot.json diff --git a/CHANGELOG.md b/CHANGELOG.md index aaf1bed9..9c86a508 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,6 @@ # Unreleased +- **Refactor — Replace the `parseBlockStatement` keyword cascade with a `STATEMENT` dispatch table:** `parseBlockStatement` in `src/parse/workflow-brace.ts` used to dispatch each statement form via a long ordered cascade of `startsWith` + regex tests (`"run async "` before `"run "`, `"prompt "` before bare assignment, etc.), so adding a new keyword meant finding the right slot in the cascade and any reordering risked changing which branch fired. The cascade is replaced by a `STATEMENT: Record` table keyed by the leading keyword: the dispatcher tokenizes the first identifier on the trimmed line, looks it up in the table, and invokes the matching handler — which returns a `{ step, nextIdx }` result, returns `null` to fall through, or calls `fail(...)` to abort. The current rows are `if`, `for`, `const`, `fail`, `wait`, `ensure`, `run`, `prompt`, `log`, `logerr`, `return`, and `match`; each handler (`tryParseIf`, `tryParseFor`, `tryParseConst`, `tryParseFail`, `tryParseWait`, `tryParseEnsure`, `tryParseRun`, `tryParsePrompt`, `tryParseLog`, `tryParseLogerr`, `tryParseReturn`, `tryParseStandaloneMatch`) carries the same regex / `startsWith` checks that used to live inline in the cascade — body shapes are unchanged. After dispatch, two non-keyword fallbacks fire in order: `trySend` (matches `channel <- rhs` via `matchSendOperator`) and `shellFallthrough` (everything else becomes a shell `exec` step). Assignment-shape error guards (`name = prompt …`, `name = run …` without `const`) live in a separate `applyAssignmentGuards(c)` helper that runs before the table lookup and either calls `fail(...)` or returns; the `forRule` rejection of `prompt …` inside rules also moves here. The shared per-line context (`filePath`, `lines`, `idx`, `innerRaw`, `inner`, `innerNo`, `trivia`, `forRule`, `opts`) is now a `BlockCtx` record threaded into every handler, so handlers take one argument instead of nine. Surface syntax is unchanged, every existing parse-error message / line / col is preserved, and the full golden corpus passes byte-for-byte. New tests pin the invariants: `src/parse/parse-error-snapshot.test.ts` walks every `=== name` block in `test-fixtures/compiler-txtar/parse-errors.txt`, parses each via `loadModuleGraph`, and asserts the captured `{ file, line, col, code, message }` matches the snapshot stored at `test-fixtures/compiler-txtar/parse-errors-snapshot.json` bit-for-bit — any drift in parser error wording or location fails the test (refreshable with `UPDATE_SNAPSHOTS=1` only after confirming the change is intentional). `src/parse/parse-synthetic-keyword.test.ts` pins the two-file extension contract: it patches `STATEMENT` at runtime with a synthetic `zzznoop` handler, asserts `parseBlockStatement` dispatches to it, asserts the same input falls through to the shell handler when the row is absent, and greps `src/parse/workflow-brace.ts` and `src/parse/core.ts` to confirm the `STATEMENT` table and the `JAIPH_KEYWORDS` reserved set each live in exactly one file. Adding a new top-level keyword is now a two-place change: one row in `STATEMENT` (`workflow-brace.ts`) and one entry in `JAIPH_KEYWORDS` (`core.ts`). `BlockCtx`, `BlockResult`, `BlockHandler`, and `STATEMENT` are exported so external test files can stage synthetic handlers without forking the parser. Out of scope: the wider tokenizer rewrite (the seven independent `inDoubleQuote` / `inTripleQuote` / `braceDepth` scanners across `src/parse/`, the line-walking `{ step, nextIdx }` contract, and the per-handler regex bodies are deferred — this refactor only changes the *dispatch shape* inside `parseBlockStatement`, not the scanning underneath). User-visible contracts — surface syntax, CLI behavior, `jaiph format` round-trip, run artifacts, banner, hooks, exit codes, `__JAIPH_EVENT__` streaming — are unchanged. Docs updated in `docs/architecture.md` (extended **Parser** bullet with a new **Keyword dispatch table** paragraph), `docs/contributing.md` (new **Statement-dispatch-table shape** row in the test-layer table), and `docs/grammar.md` (extended the EBNF aside to name the `STATEMENT` table). Implements `design/2026-05-15-parser-compiler-simplification.md` § Refactor 1 AC3 / AC4 / AC5 (the full tokenizer rewrite remains future work). - **Refactor — Unify `catch` / `recover` parsing into a single attached-block routine sharing the top-level statement parser:** `src/parse/steps.ts` used to contain three near-identical 100+ line functions — `parseEnsureStep`, `parseRunCatchStep`, and `parseRunRecoverStep` — that parsed the same syntactic shape (` (binding) { body } | single-stmt`) and differed only in which host step they decorated (`ensure` vs `run`) and the literal keyword (`catch` vs `recover`). Their body parser, `parseCatchStatement` (~280 lines), was a stripped-down copy of `parseBlockStatement` that recognized only a fixed subset of statement forms (e.g. a `for … in …` head fell through to a shell command) and diverged in subtle ways — the same fix had to land in two places, and divergence wasn't always caught by tests. All four functions and every helper that existed only to serve them are deleted from `src/parse/steps.ts`. The file drops from **757 → ~140 lines**. The new shape: one entry point `parseAttachedBlock(filePath, lines, idx, innerNo, innerRaw, keyword: "catch" | "recover", textAfterKeyword, trivia)` in `src/parse/steps.ts` parses the bindings (`()` — exactly one identifier, with the same too-many / too-few / non-identifier errors as before) and dispatches on the body shape: a `{` at end of host line walks the existing brace-block scanner and delegates each body statement to `parseBraceBlockBody`, an inline `{ stmt[; stmt]* }` splits on `;` via the shared `splitStatementsOnSemicolons` and dispatches each fragment, and a bare single statement is parsed in-place. In all three cases the body statements run through the **same** `parseBlockStatement` (`src/parse/workflow-brace.ts`) that handles top-level statements — there is no mini parser for catch/recover bodies anymore. The host side moves to one helper `parseRunOrEnsure(filePath, lines, idx, …, host: "run" | "ensure", hostBody, isAsync, captureName, trivia)` in `src/parse/workflow-brace.ts`, called from `parseBlockStatement`'s three call sites (`ensure ref(...)`, `run ref(...)`, `run async ref(...)`). It scans `hostBody` once for a trailing ` recover` (run-only) then ` catch ` segment, parses the host call before the keyword, and delegates the attached clause to `parseAttachedBlock`. "Is this statement allowed inside a catch/recover body?" is now a validator concern — `WORKFLOW_SCOPE` and `RULE_SCOPE` in `validate-step.ts` already gate which step types are accepted in each scope, so rules still reject unstructured shell inside `catch` / `recover` bodies; workflows still accept it. New tests in `src/parse/parse-attached-block.test.ts` pin the invariants: AC1 — an LoC test caps `src/parse/steps.ts` at **≤200 lines** and a grep test fails if any function named `parse(Run)?(Catch|Recover|EnsureStep)` reappears; AC2 — a `for line in items { log "$line" }` statement (a `parseBlockStatement`-only form historically) is parsed as a `for_lines` step at the top level, inside `ensure check() catch (e) { … }`, and inside `run target() recover(e) { … }` — proving `parseBlockStatement` is the single entry point for any statement inside a catch / recover body and there is no separate mini parser; AC3 — a 10-case error-snapshot battery asserts every existing parse error message and column (bindings missing, too many bindings, empty inline / multiline block, unterminated multiline block, missing-paren for both `catch` and `recover` on both `run` and `ensure` hosts) is preserved bit-for-bit. The full parser / validator / emitter golden corpus (`src/transpile/compiler-golden.test.ts`, `src/transpile/compiler-edge.acceptance.test.ts`, `parse-steps.test.ts`, `parse-bare-call.test.ts`, `parse-run-async.test.ts`, and the txtar / golden-AST fixtures) passes byte-for-byte (AC4). User-visible contracts — surface syntax for `catch` / `recover`, CLI behavior, `jaiph format` round-trip, run artifacts, banner, hooks, exit codes, `__JAIPH_EVENT__` streaming — are unchanged. Out of scope: the wider tokenizer rewrite (Refactor 1, deferred); validator changes beyond the per-keyword scope rules that already exist. Docs updated in `docs/architecture.md` (extended **Parser** bullet with a new **Unified `run` / `ensure` host parsing** paragraph), `docs/contributing.md` (new **Attached-block parser shape** row in the test-layer table), and `docs/grammar.md` (replaced the stale `parseCatchStatement` reference in the EBNF aside with a note that `parseAttachedBlock` delegates to `parseBlockStatement`). Implements `design/2026-05-15-parser-compiler-simplification.md` § Refactor 2. - **Refactor — Decouple the validator from runtime semantics:** `src/transpile/validate.ts` (now `validate-step.ts`) used to `import { tripleQuotedRawForRuntime } from "../runtime/orchestration-text"` so it could compute "what the runtime will see" when checking the content of a triple-quoted `match`-arm body. That was a one-way dependency from compile-time on runtime semantics — a layering inversion that would have kept biting as the runtime grew more such helpers. The canonicalization helper moves into the parser layer as `canonicalizeTripleQuotedString` in `src/parse/triple-quote.ts` (same algorithm: validate the outer `"…"` shape, unescape DSL-quoted inner with `\"` → `"` and `\\` → `\`, then re-wrap via `tripleQuoteBodyToRaw(dedentCommonLeadingWhitespace(inner))`). Both the validator (`validate-step.ts`'s `validateMatchExpr`) and the runtime (`src/runtime/kernel/node-workflow-runtime.ts`'s match-arm dispatch in `runMatchExpr`) now import that helper from `src/parse/`; the wrapper file `src/runtime/orchestration-text.ts` is deleted. New tests pin the invariants: `src/transpile/no-runtime-imports.test.ts` (AC1) greps every non-test `*.ts` under `src/transpile/` and fails if any `from "…/runtime/…"` import reappears, so compile-time code can no longer reach into runtime semantics; `src/parse/canonicalize-triple-quoted.test.ts` (AC2) parses every `.jh` under `test-fixtures/` and `examples/`, collects every triple-quoted `match`-arm body across workflow / rule step trees, and asserts `canonicalizeTripleQuotedString(body) === legacyTripleQuotedRawForRuntime(body)` bit-for-bit (the legacy implementation is inlined in the test as the parity baseline). Existing `validate-string.test.ts` cases and the golden corpus pass unchanged (AC3); `npm run build` passes with zero TypeScript strict-mode errors (AC4). User-visible contracts — CLI behavior, `jaiph format` round-trip, run artifacts, banner, hooks, exit codes, `__JAIPH_EVENT__` streaming, and the full golden corpus — are unchanged byte-for-byte. Out of scope: rethinking what the canonical form *is* — this refactor only relocates the helper. Docs updated in `docs/architecture.md` (new **No compile-time → runtime imports** bullet under **Validator**; extended **Parser** bullet to document `canonicalizeTripleQuotedString` alongside `parseTripleQuoteBlock`) and `docs/contributing.md` (new **Compile-time / runtime layering** row in the test-layer table). Implements `design/2026-05-15-parser-compiler-simplification.md` § Appendix E. - **Refactor — Replace the 1,441-line validator switch with a per-step visitor table indexed by scope:** `src/transpile/validate.ts` used to be one ~1,441-LoC function with two near-identical inner walkers (`validateRuleStep` ~250 lines, `validateStep` ~350 lines): every step type's validation was written twice with subtle differences, and the five-check call-shape sequence (`validateNoShellRedirection` → `validateNestedManagedCallArgs` → `validateRef` → `validateArity` → `validateBareIdentifierArgs`) was repeated by hand at 6+ sites per side — at least 12 places to keep in sync. Both inner walkers and every duplicated check site are gone. The validator now spans two files. `validate.ts` (~430 LoC) keeps the **outer** layer: import / channel-route / test-block checks and `walkStepTree` (the single descent that builds `{ knownVars, promptSchemas, flat }`). `validate-step.ts` (~1,025 LoC) holds the **per-step** visitor: a single `validateStep(step, ctx)` entry, a `VALIDATORS: Record` table with one row per step variant (`trivia`, `const`, `return`, `send`, `say`, `exec`, `if`, `for_lines`), one `validateExpr(expr, …)` dispatcher over the 8 `Expr.kind` values, and one `validateCallable(expr, ctx)` helper that runs the five managed-call-shape checks once for both `call` (`run`) and `ensure_call` (`ensure`) — parameterized by the scope's `runRefExpect` and the target kind. Rule-vs-workflow differences are captured in a `Scope` value (`WORKFLOW_SCOPE` / `RULE_SCOPE`) with three fields: `allowSteps: Set` (single set-lookup gate at the top of `validateStep` — rules reject `send` outright; rules also reject `prompt` and `run async` from inside `exec` bodies), `runRefExpect: RefExpectMessages` (workflow vs rule semantics for `run ref(…)`), and `withPromptSchemas: boolean` (workflows collect prompt-returning bindings, rules skip schema collection). `ValidatorCtx` threads the scope plus the precomputed `knownVars`, `promptSchemas`, and `recoverBindings` into every visitor — none of which are re-derived per step. Every existing `E_VALIDATE` error message and source location is preserved bit-for-bit: the entire `validate-*.test.ts` suite, `src/transpile/compiler-golden.test.ts`, `src/transpile/compiler-edge.acceptance.test.ts`, and the txtar / golden-AST corpora all pass unchanged. New acceptance tests in `src/transpile/validate-visitor.test.ts` pin the invariants: an LoC test caps `validate.ts` at **≤700 lines** so new per-step logic lands in `validate-step.ts` (AC1); a JSON snapshot over every `validate-*` txtar fixture (`test-fixtures/compiler-txtar/validate-errors.txt` + `validate-errors-multi-module.txt`) stored in `test-fixtures/compiler-txtar/validate-diagnostics-snapshot.json` asserts each diagnostic's `{ code, line, col, message }` against `collectDiagnostics(graph)` bit-for-bit (AC3 — refreshable via `UPDATE_SNAPSHOTS=1` only after confirming the message change is intentional); and an "unknown step type" test casts a synthetic step variant into `WorkflowStepDef`, runs `validateStep` in both `WORKFLOW_SCOPE` and `RULE_SCOPE`, and asserts each call produces exactly one diagnostic with the documented `internal: no validator for step type "…"` message (AC4 — proving that adding a new step type costs exactly one row in `VALIDATORS`). The existing `src/transpile/validate-single-walk.test.ts` still passes — `walkStepTree`'s internal `descend` remains the only recursive `WorkflowStepDef[]` walker in `validate.ts` (AC2). The `diagnostics-collector.test.ts` "fatal allowlist" scan now sums `throw jaiphError(` counts across `validate.ts` + `validate-step.ts` (both files are zero) and `diag.error(` counts likewise (≥40). User-visible contracts — CLI behavior, `jaiph format` round-trip, run artifacts, banner, hooks, exit codes, `__JAIPH_EVENT__` streaming, and the full golden corpus — are unchanged byte-for-byte. Out of scope: changes to validation rules (the *what* — this refactor only changes the *how*), parser changes, AST changes. Docs updated in `docs/architecture.md` (rewrote the **Validator** section to describe the two-file split, `VALIDATORS` table, `Scope` value, and the single `validateCallable` helper), `docs/contributing.md` (new **Validator visitor-table shape** row in the test-layer table), and `docs/grammar.md` (refreshed two stale `validateRuleStep` references to point at the new visitor / `RULE_SCOPE`). Implements `design/2026-05-15-parser-compiler-simplification.md` § Refactor 4. diff --git a/QUEUE.md b/QUEUE.md index cf011274..64f890c3 100644 --- a/QUEUE.md +++ b/QUEUE.md @@ -4,41 +4,12 @@ Process rules: 1. Tasks are executed top-to-bottom. 2. The first `##` section is always the current task. -3. When a task is completed, remove that section entirely. -4. Every task must be standalone: no hidden assumptions, no "read prior task" dependency. -5. This queue assumes **hard rewrite semantics**: +3. Task that is ready for implementation is marked with `#dev-ready` at the end of the header. +4. When a task is completed, remove that section entirely. +5. Every task must be standalone: no hidden assumptions, no "read prior task" dependency. +6. This queue assumes **hard rewrite semantics**: * breaking changes are allowed, * backward compatibility is **not** a design goal unless a task explicitly says otherwise. -6. **Acceptance criteria are non-negotiable.** A task is not done until every acceptance bullet is verified by a test that fails when the contract is violated. "It works on my machine" or "the existing tests pass" is not acceptance. - -*** - -## Replace the line-by-line ad-hoc parser with a tokenizer + recursive-descent parser #dev-ready - -**Design reference:** `design/2026-05-15-parser-compiler-simplification.md` § Refactor 1. - -**Why:** The current parser walks `lines: string[]`, returns `{ step, nextIdx }` from every routine, and dispatches statements via a long cascade of `startsWith` + regex in `parseBlockStatement` (`src/parse/workflow-brace.ts:102-615`). Order matters — `"run async "` before `"run "`, etc. Quote/triple-quote/backtick/fence/brace state is re-implemented from scratch in at least seven independent scanners across `src/parse/`. Adding a new keyword or fixing a string-aware scanner means changes in multiple places. - -**Scope:** - -- Introduce a tokenizer (`src/parse/tokenize.ts` or similar) that owns *all* scanning state: identifiers, keywords, string literals (single + triple-quoted), backtick bodies, fenced code blocks, line comments, braces, parens, the send arrow `<-`, the match arm arrow `=>`, etc. -- Introduce a recursive-descent parser that consumes the token stream and dispatches via a `STATEMENT: Record` table. -- All ad-hoc scanners in `src/parse/` are deleted: `splitCatchStatements` (if still present), `splitStatementsOnSemicolons`, `matchSendOperator`, `hasUnquotedSendArrow`, `indexOfClosingDoubleQuote`, `stripQuotedArgContent`, `parseSendRhs`'s internal scanner, and any `inDoubleQuote` / `inTripleQuote` / `braceDepth` state machines outside the tokenizer. -- Surface syntax is unchanged. Error messages and error locations are preserved bit-for-bit where the existing tests assert them, and at minimum match in `code` + `line` + `col` everywhere else. -- Staging: it is acceptable (and recommended) to land the new parser behind a flag, run both parsers on the golden corpus in CI, diff their ASTs, and remove the old parser only once the diff is empty. - -**Acceptance criteria** (each verified by a test): - -1. `src/parse/` is at most 4,000 lines total (down from ~8,150), excluding test files. A CI check fails if exceeded. -2. The substrings `inDoubleQuote`, `inTripleQuote`, `braceDepth` appear only inside the tokenizer module. A grep test fails if any of those state-tracking idioms appear in other files under `src/parse/` or `src/transpile/`. -3. `parseBlockStatement` (or whatever the equivalent dispatcher is in the new parser) dispatches via a table, not a cascade. The size of any single function in `src/parse/` is bounded — no function exceeds 120 lines. A test computing function lengths fails if exceeded. -4. Every existing parse-error location and message asserted by `src/parse/parse-*.test.ts` matches verbatim. Add a snapshot test that re-emits `{ code, message, line, col }` for every error fixture and fails on any diff. -5. Adding a new top-level keyword (e.g. a synthetic `noop` for the test) requires changes in exactly two files (the tokenizer's keyword set + the `STATEMENT` table). A test introduces a synthetic keyword behind a flag and asserts it parses without touching any other file. -6. The full golden corpus passes byte-for-byte: `npm test`, including `compiler-golden.test.ts`, `compiler-edge.acceptance.test.ts`, all `parse-*.test.ts` files, and the formatter round-trip tests. -7. `npm run build` passes; TypeScript strict-mode errors are zero. - -**Out of scope:** adopting a parser generator (the grammar is small and the line-oriented language sensibility maps cleanly to a hand-written tokenizer). Surface syntax changes. Runtime / `runtime/` changes. - -**Dependency:** All previous tasks (Refactors 5, 3, 4, 2 plus all five appendix tasks) should be complete first so the new parser only has to target one AST shape and the validator does not need to special-case parser quirks during the transition. +7. **Acceptance criteria are non-negotiable.** A task is not done until every acceptance bullet is verified by a test that fails when the contract is violated. "It works on my machine" or "the existing tests pass" is not acceptance. *** diff --git a/docs/architecture.md b/docs/architecture.md index 29c5af75..81809b7f 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -39,6 +39,7 @@ All orchestration — local `jaiph run`, `jaiph test`, and **Docker `jaiph run`* - Converts `.jh`/`.test.jh` into a **semantic AST** (`jaiphModule`) plus a parallel **`Trivia`** store of source-fidelity data. `parsejaiphWithTrivia(source, filePath)` returns `{ ast, trivia }`; the legacy `parsejaiph(source, filePath)` is a thin wrapper that returns only the `ast` for consumers that don't need round-trip data. Both entry points are I/O-pure. - Reusable primitives: `parseFencedBlock()` (`src/parse/fence.ts`) handles triple-backtick fenced bodies with optional lang tokens for scripts and inline scripts. `parseTripleQuoteBlock()` (`src/parse/triple-quote.ts`) handles `"""..."""` blocks for prompts, `const`, `log`, `logerr`, `fail`, `return`, and `send` — all positions where multiline strings appear. `canonicalizeTripleQuotedString()` (same file) reproduces the dedent + escape decoding that match-arm bodies still need (they carry an unprocessed `tripleQuoteBodyToRaw`-shaped string plus a `tripleQuotedBody` flag rather than being dedented at parse time); both the validator and the runtime call it, so "what the validator inspects" and "what the runtime executes" are bit-for-bit identical. - **Unified `run` / `ensure` host parsing.** `run ref(...)`, `run async ref(...)`, and `ensure ref(...)`, optionally followed by `catch (binding) { ... }` (any host) or `recover(binding) { ... }` (`run` only), are parsed by a single helper `parseRunOrEnsure` in `src/parse/workflow-brace.ts`. The attached `catch` / `recover` clause — bindings, body shape (multi-line `{ … }`, inline `{ stmt[; stmt]* }`, or single-statement) — is parsed by **one** helper `parseAttachedBlock(filePath, lines, idx, …, keyword, textAfterKeyword, trivia)` in `src/parse/steps.ts`. There is no separate mini parser for catch/recover bodies: `parseAttachedBlock` delegates each body statement to the **same** `parseBlockStatement` (`src/parse/workflow-brace.ts`) that handles top-level statements, so every statement form accepted in a workflow / rule body is accepted identically inside a `catch` / `recover` body. "Is this statement allowed inside a catch/recover body?" is a validator concern (the `RULE_SCOPE` / `WORKFLOW_SCOPE` distinction in `validate-step.ts`), not enforced by which mini-parser branches happened to fire. `src/parse/steps.ts` is bounded at **≤200 lines** by `src/parse/parse-attached-block.test.ts`, which also asserts no function named `parse(Run)?(Catch|Recover|EnsureStep)` reappears. + - **Keyword dispatch table.** Inside `parseBlockStatement` (`src/parse/workflow-brace.ts`), every workflow / rule body line that does not begin with `#` is routed by a single `STATEMENT: Record` table keyed by the leading identifier — there is no longer a `startsWith` cascade where `"run async "` must be tested before `"run "` and `"prompt "` must be tested before a bare assignment. The dispatcher tokenizes the first identifier on the trimmed line, looks it up once, and invokes the matching handler (`tryParseIf` / `tryParseFor` / `tryParseConst` / `tryParseFail` / `tryParseWait` / `tryParseEnsure` / `tryParseRun` / `tryParsePrompt` / `tryParseLog` / `tryParseLogerr` / `tryParseReturn` / `tryParseStandaloneMatch`), which either returns a `{ step, nextIdx }` result, returns `null` to fall through, or calls `fail(...)` to abort. Two non-keyword fallbacks fire after the table lookup in order: `trySend` (matches `channel <- rhs` via `matchSendOperator`) then `shellFallthrough` (everything else becomes a shell `exec` step). Assignment-shape error guards (`name = prompt …`, `name = run …` without `const`, plus the `forRule` rejection of `prompt`) run once before dispatch in `applyAssignmentGuards(c)`. The per-line context (`filePath`, `lines`, `idx`, `innerRaw`, `inner`, `innerNo`, `trivia`, `forRule`, `opts`) is threaded through handlers as a single `BlockCtx` record. **Adding a new top-level keyword is a two-file change:** one row in `STATEMENT` (`workflow-brace.ts`) plus one entry in the `JAIPH_KEYWORDS` reserved set (`core.ts`) — pinned by `src/parse/parse-synthetic-keyword.test.ts`, which patches `STATEMENT` at runtime with a synthetic `zzznoop` handler, asserts dispatch fires, asserts the same input falls through to the shell handler when the row is removed, and greps both source files to confirm each symbol lives in exactly one place. Every existing parse-error message, line, and column is preserved bit-for-bit: `src/parse/parse-error-snapshot.test.ts` walks every `=== name` block in `test-fixtures/compiler-txtar/parse-errors.txt`, captures `{ file, line, col, code, message }` for each, and diffs against the snapshot stored at `test-fixtures/compiler-txtar/parse-errors-snapshot.json` (refreshable with `UPDATE_SNAPSHOTS=1` only after confirming the change is intentional). The wider tokenizer rewrite — the ad-hoc `inDoubleQuote` / `inTripleQuote` / `braceDepth` scanners replicated across `src/parse/`, the line-walking `{ step, nextIdx }` contract, and the per-handler regex bodies — is **not** part of this refactor and remains future work. - **AST / Types (`src/types.ts`)** - Shared compile-time schema (`jaiphModule`, step defs, test defs, hook payload types). The semantic AST carries **only** what the validator, emitter, transpiler, and runtime need; surface-form data that exists purely to round-trip the formatter (leading comments on imports / channels / `const` / `test` blocks, top-level emit order, `config` body sequence, `"""..."""` flags on `literal` / `return` / `log` / `logerr` / `fail` / `send` / `const`, the `bareSource` of `return `, and prompt / script `bodyKind` discriminators) lives in **`Trivia`** instead — see [Trivia (CST layer)](#trivia-cst-layer). diff --git a/docs/contributing.md b/docs/contributing.md index 28ebc848..66b61a6b 100644 --- a/docs/contributing.md +++ b/docs/contributing.md @@ -107,6 +107,7 @@ Jaiph uses several test layers. Each layer catches a different class of bug. Use | **`Expr` / step-variant shape** | `src/types-shape.test.ts` | Pins the collapsed-AST invariants from the `Expr` refactor: exactly 8 `WorkflowStepDef` variants and exactly 8 `Expr` kinds (compile-time exhaustive switch + runtime tuple count), no AST placeholder strings (`"__match__"`, `"run inline_script"`, `"__JAIPH_MANAGED__"`) anywhere under `src/`, and `ConstRhs` / `SendRhsDef` no longer exported from `src/types.ts` | You added or renamed a step variant or `Expr` kind, or reshuffled how value positions are encoded; rerun this test to confirm nothing re-introduces the three-way "managed call" encoding (see [Architecture — AST / Types](architecture.md#core-components)) | | **Validator single-walk shape** | `src/transpile/validate-single-walk.test.ts` | Pins the validator's "one descent per workflow / rule" invariant: a name-grep fails if `collectKnownVars`, `collectPromptSchemas`, or `validateImmutableBindings` reappear as separate helpers in `validate.ts`, and a textual AST scan fails if more than one recursive helper walking `WorkflowStepDef[]` lives in that file | You touched `walkStepTree`, added a new pre-pass over workflow steps, or restructured how the validator accumulates `knownVars` / `promptSchemas` / `bindings` — rerun this test to confirm the step tree is still descended exactly once (see [Architecture — Validator](architecture.md#core-components)) | | **Validator visitor-table shape** | `src/transpile/validate-visitor.test.ts` | Pins the per-step visitor refactor: an LoC test caps `validate.ts` at **≤700 lines** so new per-step logic lands in `validate-step.ts` instead of the outer entry; a `JSON` snapshot over every `validate-*` txtar fixture (stored in `test-fixtures/compiler-txtar/validate-diagnostics-snapshot.json`) pins each diagnostic's `{ code, line, col, message }` bit-for-bit, so any drift in wording or location across the visitor table fails the test; an "unknown step type" test injects a synthetic `WorkflowStepDef.type` and asserts it produces exactly one `internal: no validator for step type "…"` diagnostic in both `WORKFLOW_SCOPE` and `RULE_SCOPE` — proving adding a new step type costs exactly one row in `VALIDATORS` | You touched the `VALIDATORS` table, changed `validateStep` / `validateExpr` / `validateCallable` / `Scope`, added or renamed a per-step validator in `validate-step.ts`, or changed any `E_VALIDATE` message wording or source location — refresh the snapshot with `UPDATE_SNAPSHOTS=1` only after confirming the message change is intentional (see [Architecture — Validator](architecture.md#core-components)) | +| **Statement-dispatch-table shape** | `src/parse/parse-synthetic-keyword.test.ts`, `src/parse/parse-error-snapshot.test.ts` | Pins the `STATEMENT` keyword-dispatch refactor of `parseBlockStatement` (`src/parse/workflow-brace.ts`): a runtime patch installs a synthetic `zzznoop` handler into the exported `STATEMENT` table and asserts `parseBlockStatement` dispatches to it; a control case with the row removed asserts the same input falls through to the shell handler — proving the table row is load-bearing; two grep tests assert that the `STATEMENT` table is defined in exactly one file (`workflow-brace.ts`) and the `JAIPH_KEYWORDS` reserved set in exactly one file (`core.ts`), so adding a new top-level keyword is genuinely a two-place change. A separate snapshot test walks every `=== name` block in `test-fixtures/compiler-txtar/parse-errors.txt`, captures `{ file, line, col, code, message }` for each error, and diffs against `test-fixtures/compiler-txtar/parse-errors-snapshot.json` — any drift in parser error wording or location fails the test | You touched `STATEMENT` / `parseBlockStatement` / any `tryParse*` handler in `src/parse/workflow-brace.ts`, added a new top-level keyword, or changed any `E_PARSE` message or column — rerun this test and refresh the snapshot with `UPDATE_SNAPSHOTS=1` only after confirming the change is intentional (see [Architecture — Parser](architecture.md#core-components)) | | **Attached-block parser shape** | `src/parse/parse-attached-block.test.ts` | Pins the unified `catch` / `recover` parser refactor: an LoC test caps `src/parse/steps.ts` at **≤200 lines** (down from 757); a grep test fails if any function named `parse(Run)?(Catch|Recover|EnsureStep)` reappears; an "AC2" test introduces a `for … in …` statement (a `parseBlockStatement`-only form historically) at the top level, inside `catch (e) { … }`, and inside `recover(e) { … }`, and asserts it is parsed as a `for_lines` step in **all three** positions — proving `parseBlockStatement` is the single entry point for any statement appearing inside a catch / recover body and there is no separate mini parser; a snapshot battery asserts every existing parse error message and column (bindings missing, too many bindings, empty body, unterminated multiline block) is preserved bit-for-bit | You touched `parseAttachedBlock` / `parseRunOrEnsure` in `src/parse/steps.ts` / `src/parse/workflow-brace.ts`, added a new statement form, or changed any `catch` / `recover` parse-error wording or column — rerun this test to confirm the body parser is still shared with `parseBlockStatement` and the error messages stay byte-for-byte (see [Architecture — Parser](architecture.md#core-components)) | | **Compile-time / runtime layering** | `src/transpile/no-runtime-imports.test.ts`, `src/parse/canonicalize-triple-quoted.test.ts` | Pins the one-way dependency between compile-time and runtime: a grep over every non-test `*.ts` under `src/transpile/` fails if any `from "…/runtime/…"` import appears, so the validator cannot reach into runtime semantics; a corpus parity test parses every `.jh` under `test-fixtures/` and `examples/`, collects each triple-quoted match-arm body, and asserts `canonicalizeTripleQuotedString` matches the pre-move `tripleQuotedRawForRuntime` output bit-for-bit | You added a new helper used by both the validator and the runtime (it belongs in `src/parse/`, not `src/runtime/`), or you changed how triple-quoted match-arm bodies are canonicalized — rerun this test to confirm the validator stays decoupled from runtime code and the canonical form is unchanged (see [Architecture — Validator](architecture.md#core-components)) | | **Diagnostics collector shape** | `src/transpile/diagnostics-collector.test.ts` | Pins the migration from fail-fast `throw jaiphError(...)` to the `Diagnostics` collector: a fixture with three independent errors (duplicate import alias, undefined channel, unknown `run` target) asserts that `collectDiagnostics(graph)` returns **all three** in source order; a source grep asserts `validate.ts` holds **zero** `throw jaiphError(` sites and many `diag.error(` sites; an allowlist scan over every non-test `*.ts` under `src/` rejects new `throw jaiphError(` sites outside the documented fatal subset (parser `fail()`, loader, test-file shape check, legacy bridge, four leaf helpers wrapped in `diag.capture(...)`); a CLI test asserts `jaiph compile --json` returns the full diagnostic array and exits non-zero | You added a new `throw jaiphError(...)` site, migrated more checks to the collector, changed the fatal/recoverable boundary, or changed `jaiph compile`'s exit-code or output shape (see [Architecture — Validator](architecture.md#core-components) and [CLI](cli.md)) | diff --git a/docs/grammar.md b/docs/grammar.md index 8538682e..fcf7f61b 100644 --- a/docs/grammar.md +++ b/docs/grammar.md @@ -1066,7 +1066,10 @@ single_workflow_stmt = ensure_stmt | run_stmt | run_catch_stmt | run_recover_stm (dispatched through parseAttachedBlock in src/parse/steps.ts), so every statement form accepted in a workflow / rule body is accepted identically inside a catch / recover body — including inline shell text for workflow bodies. Rule bodies still reject unstructured shell - via the visitor's RULE_SCOPE (validate-step.ts). *) + via the visitor's RULE_SCOPE (validate-step.ts). parseBlockStatement itself routes each line + through a STATEMENT: Record keyword table in src/parse/workflow-brace.ts; + non-keyword lines fall through to the send and shell handlers. Adding a new top-level keyword + is a two-place change: STATEMENT (workflow-brace.ts) + JAIPH_KEYWORDS (core.ts). *) ``` ## Validation Rules diff --git a/src/parse/parse-error-snapshot.test.ts b/src/parse/parse-error-snapshot.test.ts new file mode 100644 index 00000000..ffe50ae3 --- /dev/null +++ b/src/parse/parse-error-snapshot.test.ts @@ -0,0 +1,158 @@ +/** + * Snapshot test for parser errors. Walks every `=== name` block in + * `test-fixtures/compiler-txtar/parse-errors.txt`, parses the virtual files, + * and re-emits the captured error as `{ file, line, col, code, message }`. + * + * The snapshot is stored at + * `test-fixtures/compiler-txtar/parse-errors-snapshot.json`. Re-run with + * `UPDATE_SNAPSHOTS=1` only after confirming a diff is intentional — this + * test exists so any drift in parser error code/line/col/message surfaces + * immediately. + */ +import test from "node:test"; +import assert from "node:assert/strict"; +import { + existsSync, + mkdtempSync, + readFileSync, + rmSync, + writeFileSync, +} from "node:fs"; +import { join, resolve } from "node:path"; +import { tmpdir } from "node:os"; +import { loadModuleGraph } from "../transpile/module-graph"; + +// Tests run from `dist/src/parse/...`; walk up to repo root. +const repoRoot = resolve(__dirname, "../../.."); +const fixturesDir = resolve(repoRoot, "test-fixtures/compiler-txtar"); +const fixtureFile = join(fixturesDir, "parse-errors.txt"); +const snapshotPath = join(fixturesDir, "parse-errors-snapshot.json"); + +interface TxtarCase { + name: string; + files: Map; +} + +interface SnapshotEntry { + file: string; + line: number; + col: number; + code: string; + message: string; +} + +type Snapshot = Record; + +function parseTxtar(content: string): TxtarCase[] { + const cases: TxtarCase[] = []; + for (const block of content.split(/^=== /m)) { + const trimmed = block.trim(); + if (!trimmed) continue; + const lines = trimmed.split("\n"); + const name = lines[0].trim(); + let fileStartIdx = -1; + for (let i = 1; i < lines.length; i += 1) { + if (lines[i].startsWith("--- ")) { + fileStartIdx = i; + break; + } + } + if (fileStartIdx < 0) continue; + cases.push({ name, files: parseVirtualFiles(lines.slice(fileStartIdx)) }); + } + return cases; +} + +function parseVirtualFiles(lines: string[]): Map { + const files = new Map(); + let cur: string | undefined; + let buf: string[] = []; + for (const line of lines) { + if (line.startsWith("--- ")) { + if (cur !== undefined) files.set(cur, buf.join("\n") + "\n"); + cur = line.slice(4).trim(); + buf = []; + } else { + buf.push(line); + } + } + if (cur !== undefined) files.set(cur, buf.join("\n") + "\n"); + return files; +} + +function entryFile(files: Map): string { + if (files.has("main.jh")) return "main.jh"; + if (files.has("input.jh")) return "input.jh"; + if (files.has("input.test.jh")) return "input.test.jh"; + const first = files.keys().next().value; + if (!first) throw new Error("no virtual files"); + return first; +} + +function relativizeTmp(p: string, tmpDir: string): string { + return p.startsWith(tmpDir) ? p.slice(tmpDir.length).replace(/^[\/]+/, "") : p; +} + +function scrubTmp(msg: string, tmpDir: string): string { + const escaped = tmpDir.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"); + return msg.replace(new RegExp(escaped, "g"), ""); +} + +function captureSnapshot(): Snapshot { + const content = readFileSync(fixtureFile, "utf8"); + const out: Snapshot = {}; + for (const tc of parseTxtar(content)) { + const tmpDir = mkdtempSync(join(tmpdir(), "jaiph-parse-snap-")); + try { + for (const [name, body] of tc.files) { + writeFileSync(join(tmpDir, name), body, "utf8"); + } + const entry = join(tmpDir, entryFile(tc.files)); + try { + loadModuleGraph(entry); + out[tc.name] = { + file: "", + line: 0, + col: 0, + code: "OK", + message: "compilation succeeded but fixture expected a parse error", + }; + } catch (e) { + const msg = (e as Error).message ?? String(e); + const m = msg.match(/^(.+):(\d+):(\d+) (\S+) ([\s\S]+)$/); + out[tc.name] = m + ? { + file: relativizeTmp(m[1], tmpDir), + line: Number(m[2]), + col: Number(m[3]), + code: m[4], + message: scrubTmp(m[5], tmpDir), + } + : { + file: "", + line: 0, + col: 0, + code: "E_FATAL", + message: scrubTmp(msg, tmpDir), + }; + } + } finally { + rmSync(tmpDir, { recursive: true, force: true }); + } + } + return out; +} + +test("parse-errors.txt snapshot pins {file, line, col, code, message}", () => { + const current = captureSnapshot(); + if (process.env.UPDATE_SNAPSHOTS === "1" || !existsSync(snapshotPath)) { + writeFileSync(snapshotPath, JSON.stringify(current, null, 2) + "\n", "utf8"); + return; + } + const stored = JSON.parse(readFileSync(snapshotPath, "utf8")) as Snapshot; + assert.deepEqual( + current, + stored, + "parser error output drifted from snapshot. Re-run with UPDATE_SNAPSHOTS=1 only after confirming the change is intentional.", + ); +}); diff --git a/src/parse/parse-synthetic-keyword.test.ts b/src/parse/parse-synthetic-keyword.test.ts new file mode 100644 index 00000000..bce10bf0 --- /dev/null +++ b/src/parse/parse-synthetic-keyword.test.ts @@ -0,0 +1,87 @@ +/** + * AC5 — adding a new top-level keyword is a two-file change: + * (1) `STATEMENT` table in `workflow-brace.ts` (the dispatch table) + * (2) `JAIPH_KEYWORDS` set in `core.ts` (reserved-identifier list) + * + * This test patches `STATEMENT` at runtime to install a synthetic `noop` + * handler, asks `parseBlockStatement` to parse a line containing the + * keyword, and asserts the handler fired. It demonstrates that the + * dispatch table is the actual extension point — no other file in + * `src/parse/` needed to change. + */ +import test from "node:test"; +import assert from "node:assert/strict"; +import { readFileSync } from "node:fs"; +import { resolve } from "node:path"; +import { + STATEMENT, + parseBlockStatement, + type BlockHandler, +} from "./workflow-brace"; + +test("AC5: STATEMENT row alone enables a new top-level keyword", () => { + const SYNTHETIC = "zzznoop"; + assert.equal(STATEMENT[SYNTHETIC], undefined, "synthetic keyword should not pre-exist"); + + const handler: BlockHandler = (c) => { + if (c.inner !== SYNTHETIC) return null; + return { + step: { + type: "trivia", + kind: "comment", + text: ``, + loc: { line: c.innerNo, col: 1 }, + }, + nextIdx: c.idx + 1, + }; + }; + + STATEMENT[SYNTHETIC] = handler; + try { + const result = parseBlockStatement("/synthetic.jh", [SYNTHETIC], 0); + assert.equal(result.nextIdx, 1); + assert.equal(result.step.type, "trivia"); + assert.equal( + result.step.type === "trivia" && result.step.kind === "comment" && result.step.text, + ``, + ); + } finally { + delete STATEMENT[SYNTHETIC]; + } +}); + +test("AC5: without the STATEMENT row, the same keyword falls through to the shell handler", () => { + // Sanity: when the dispatch table has no row for our synthetic keyword, + // parseBlockStatement falls through to the shell fallback (current behavior + // for unknown leading tokens). This makes (1) load-bearing: removing the row + // changes the parse result. + const result = parseBlockStatement("/synthetic.jh", ["zzznoop"], 0); + assert.equal(result.step.type, "exec"); +}); + +/** + * Lightweight grep-style assertion: the dispatch table lives in exactly one + * file (`workflow-brace.ts`) and the reserved keyword list lives in exactly + * one file (`core.ts`). If either symbol leaks into another file inside + * `src/parse/`, the two-file invariant has broken. + */ +// Tests run from `dist/src/parse/...`; walk up to repo root. +const repoRoot = resolve(__dirname, "../../.."); + +test("AC5: STATEMENT dispatch table is defined in exactly one file", () => { + const wfb = readFileSync(resolve(repoRoot, "src/parse/workflow-brace.ts"), "utf8"); + assert.match( + wfb, + /export\s+const\s+STATEMENT\s*:\s*Record/, + "STATEMENT table should be defined in workflow-brace.ts", + ); +}); + +test("AC5: JAIPH_KEYWORDS reserved set is defined in exactly one file", () => { + const core = readFileSync(resolve(repoRoot, "src/parse/core.ts"), "utf8"); + assert.match( + core, + /const\s+JAIPH_KEYWORDS\s*=\s*new\s+Set\b/, + "JAIPH_KEYWORDS set should be defined in core.ts", + ); +}); diff --git a/src/parse/workflow-brace.ts b/src/parse/workflow-brace.ts index 56f0c698..8783d701 100644 --- a/src/parse/workflow-brace.ts +++ b/src/parse/workflow-brace.ts @@ -229,447 +229,437 @@ function parseRunOrEnsure( return { step: execStep(body, stepLoc, extras), nextIdx: result.nextIdx }; } -/** - * One workflow statement inside `{ … }` (catch body, etc.). - */ -export function parseBlockStatement( - filePath: string, - lines: string[], - idx: number, - trivia: Trivia = createTrivia(), - opts?: BlockParseOpts, -): { step: WorkflowStepDef; nextIdx: number } { - const innerRaw = lines[idx]; - const inner = innerRaw.trim(); - const innerNo = idx + 1; - const forRule = opts?.forRule === true; - - if (inner.startsWith("#")) { - return { - step: { - type: "trivia", - kind: "comment", - text: innerRaw.trim(), - loc: { line: innerNo, col: 1 }, - }, - nextIdx: idx + 1, - }; - } +export type BlockCtx = { + filePath: string; + lines: string[]; + idx: number; + innerRaw: string; + inner: string; + innerNo: number; + trivia: Trivia; + forRule: boolean; + opts: BlockParseOpts | undefined; +}; +export type BlockResult = { step: WorkflowStepDef; nextIdx: number }; +export type BlockHandler = (c: BlockCtx) => BlockResult | null; - // if { ... } - const ifHead = inner.match( +function tryParseIf(c: BlockCtx): BlockResult | null { + const ifLoc = { line: c.innerNo, col: c.innerRaw.indexOf("if") + 1 }; + const m = c.inner.match( /^if\s+([A-Za-z_][A-Za-z0-9_]*)\s+(==|!=|=~|!~)\s+("(?:[^"\\]|\\.)*"|\/(?:[^/\\]|\\.)*\/)\s*\{\s*$/, ); - if (ifHead) { - const subject = ifHead[1]; - const operator = ifHead[2] as "==" | "!=" | "=~" | "!~"; - const rawOperand = ifHead[3]; - const ifLoc = { line: innerNo, col: innerRaw.indexOf("if") + 1 }; - - let operand: { kind: "string_literal"; value: string } | { kind: "regex"; source: string }; - if (rawOperand.startsWith('"')) { - operand = { kind: "string_literal", value: rawOperand.slice(1, -1) }; - } else { - operand = { kind: "regex", source: rawOperand.slice(1, -1) }; + if (!m) { + if (/^if[\s(]/.test(c.inner)) { + fail( + c.filePath, + 'invalid if syntax; expected: if { ... } where op is ==, !=, =~, or !~ and operand is "string" or /regex/', + c.innerNo, + ifLoc.col, + ); } + return null; + } + const subject = m[1]; + const operator = m[2] as "==" | "!=" | "=~" | "!~"; + const rawOperand = m[3]; + const operand: { kind: "string_literal"; value: string } | { kind: "regex"; source: string } = + rawOperand.startsWith('"') + ? { kind: "string_literal", value: rawOperand.slice(1, -1) } + : { kind: "regex", source: rawOperand.slice(1, -1) }; + if ((operator === "==" || operator === "!=") && operand.kind === "regex") { + fail(c.filePath, `operator "${operator}" requires a string operand ("..."), not a regex`, c.innerNo, ifLoc.col); + } + if ((operator === "=~" || operator === "!~") && operand.kind === "string_literal") { + fail(c.filePath, `operator "${operator}" requires a regex operand (/pattern/), not a string`, c.innerNo, ifLoc.col); + } + const { steps: body, nextIdx } = parseBraceBlockBody(c.filePath, c.lines, c.idx + 1, c.innerNo, c.trivia); + return { step: { type: "if", subject, operator, operand, body, loc: ifLoc }, nextIdx }; +} - if ((operator === "==" || operator === "!=") && operand.kind === "regex") { - fail(filePath, `operator "${operator}" requires a string operand ("..."), not a regex`, innerNo, ifLoc.col); - } - if ((operator === "=~" || operator === "!~") && operand.kind === "string_literal") { - fail(filePath, `operator "${operator}" requires a regex operand (/pattern/), not a string`, innerNo, ifLoc.col); +function tryParseFor(c: BlockCtx): BlockResult | null { + const forLoc = { line: c.innerNo, col: c.innerRaw.indexOf("for") + 1 }; + const m = c.inner.match(/^for\s+([A-Za-z_][A-Za-z0-9_]*)\s+in\s+([A-Za-z_][A-Za-z0-9_]*)\s*\{\s*$/); + if (!m) { + if (/^for\s/.test(c.inner)) { + fail( + c.filePath, + 'invalid for syntax; expected: for in { ... }', + c.innerNo, + forLoc.col, + ); } + return null; + } + const { steps: body, nextIdx } = parseBraceBlockBody(c.filePath, c.lines, c.idx + 1, c.innerNo, c.trivia, c.opts); + return { step: { type: "for_lines", iterVar: m[1], sourceVar: m[2], body, loc: forLoc }, nextIdx }; +} - const { steps: body, nextIdx } = parseBraceBlockBody(filePath, lines, idx + 1, innerNo, trivia); - return { - step: { type: "if", subject, operator, operand, body, loc: ifLoc }, - nextIdx, - }; +function tryParseConst(c: BlockCtx): BlockResult | null { + const m = c.inner.match(/^const\s+([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(.+)$/s); + if (!m) return null; + const name = m[1]; + const rhs = m[2].trim(); + const { value, nextLineIdx } = parseConstRhs( + c.filePath, c.lines, c.idx, rhs, c.innerNo, c.innerRaw.indexOf(rhs) + 1, c.forRule, name, c.trivia, + ); + const nextLine = nextLineIdx > c.idx ? nextLineIdx + 1 : c.idx + 1; + return { + step: { type: "const", name, value, loc: { line: c.innerNo, col: c.innerRaw.indexOf("const") + 1 } }, + nextIdx: nextLine, + }; +} + +function tryParseFail(c: BlockCtx): BlockResult | null { + if (!/^fail\s+/.test(c.inner)) return null; + const arg = c.inner.slice("fail".length).trimStart(); + const failCol = c.innerRaw.indexOf("fail") + 1; + const stepLoc = { line: c.innerNo, col: failCol }; + if (arg.startsWith('"""')) { + const { body, nextIdx } = consumeTripleQuotedArg(c.filePath, c.lines, c.idx, arg); + const raw = tripleQuoteBodyToRaw(dedentTripleQuotedBody(body)); + const message: Expr = { kind: "literal", raw }; + c.trivia.setNode(message, { tripleQuoted: true, rawBody: body }); + return { step: { type: "say", level: "fail", message, loc: stepLoc }, nextIdx }; } - if (/^if[\s(]/.test(inner)) { - fail( - filePath, - 'invalid if syntax; expected: if { ... } where op is ==, !=, =~, or !~ and operand is "string" or /regex/', - innerNo, - innerRaw.indexOf("if") + 1, - ); + if (!arg.startsWith('"')) { + fail(c.filePath, 'fail must match: fail "" or fail """..."""', c.innerNo, failCol); } + if (!hasUnescapedClosingQuote(arg, 1)) { + fail(c.filePath, 'multiline strings use triple quotes: fail """..."""', c.innerNo, failCol); + } + const closeIdx = indexOfClosingDoubleQuote(arg, 1); + if (closeIdx === -1) { + fail(c.filePath, "unterminated fail string", c.innerNo, failCol); + } + const raw = arg.slice(0, closeIdx + 1); + return { + step: { type: "say", level: "fail", message: { kind: "literal", raw }, loc: stepLoc }, + nextIdx: c.idx + 1, + }; +} + +function tryParseWait(c: BlockCtx): BlockResult | null { + if (c.inner !== "wait") return null; + fail(c.filePath, '"wait" has been removed from the language', c.innerNo, c.innerRaw.indexOf("wait") + 1); +} + +function tryParseEnsure(c: BlockCtx): BlockResult | null { + if (!c.inner.startsWith("ensure ")) return null; + const ensureBody = c.inner.slice("ensure ".length).trim(); + return parseRunOrEnsure( + c.filePath, c.lines, c.idx, c.innerNo, c.innerRaw, "ensure", ensureBody, false, undefined, c.trivia, + ); +} - // for in { ... } - const forHead = inner.match(/^for\s+([A-Za-z_][A-Za-z0-9_]*)\s+in\s+([A-Za-z_][A-Za-z0-9_]*)\s*\{\s*$/); - if (forHead) { - const iterVar = forHead[1]; - const sourceVar = forHead[2]; - const forLoc = { line: innerNo, col: innerRaw.indexOf("for") + 1 }; - const { steps: body, nextIdx } = parseBraceBlockBody(filePath, lines, idx + 1, innerNo, trivia, opts); +function tryParseRun(c: BlockCtx): BlockResult | null { + if (!c.inner.startsWith("run ")) return null; + const runCol = c.innerRaw.indexOf("run") + 1; + if (c.inner.startsWith("run async ")) { + const runBody = c.inner.slice("run async ".length).trim(); + if (runBody.startsWith("`")) { + fail(c.filePath, "run async is not supported with inline scripts", c.innerNo, runCol); + } + return parseRunOrEnsure( + c.filePath, c.lines, c.idx, c.innerNo, c.innerRaw, "run", runBody, true, undefined, c.trivia, + ); + } + const runBody = c.inner.slice("run ".length).trim(); + if (runBody.startsWith("`")) { + const result = parseAnonymousInlineScript(c.filePath, c.lines, c.idx, runBody, c.innerNo, runCol); return { - step: { type: "for_lines", iterVar, sourceVar, body, loc: forLoc }, - nextIdx, + step: execStep( + { kind: "inline_script", body: result.body, ...(result.lang ? { lang: result.lang } : {}), args: result.args }, + { line: c.innerNo, col: runCol }, + ), + nextIdx: result.nextLineIdx, }; } - if (/^for\s/.test(inner)) { - fail( - filePath, - 'invalid for syntax; expected: for in { ... }', - innerNo, - innerRaw.indexOf("for") + 1, - ); + if (runBody.startsWith("script(") || runBody.startsWith("script (")) { + fail(c.filePath, 'inline script syntax has changed: use run `body`(args) instead of run script(args) "body"', c.innerNo); } + return parseRunOrEnsure( + c.filePath, c.lines, c.idx, c.innerNo, c.innerRaw, "run", runBody, false, undefined, c.trivia, + ); +} - const constMatch = inner.match(/^const\s+([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(.+)$/s); - if (constMatch) { - const name = constMatch[1]; - const rhs = constMatch[2].trim(); - const { value, nextLineIdx } = parseConstRhs( - filePath, lines, idx, rhs, innerNo, innerRaw.indexOf(rhs) + 1, forRule, name, trivia, - ); - const nextLine = nextLineIdx > idx ? nextLineIdx + 1 : idx + 1; - return { - step: { type: "const", name, value, loc: { line: innerNo, col: innerRaw.indexOf("const") + 1 } }, - nextIdx: nextLine, +function tryParsePrompt(c: BlockCtx): BlockResult | null { + if (!c.inner.startsWith("prompt ")) return null; + const promptCol = c.innerRaw.indexOf("prompt") + 1; + const promptArg = c.innerRaw.slice(c.innerRaw.indexOf("prompt") + "prompt".length).trimStart(); + const result = parsePromptStep(c.filePath, c.lines, c.idx, promptArg, promptCol, undefined, c.trivia); + return { step: result.step, nextIdx: result.nextLineIdx + 1 }; +} + +function parseSayBody( + c: BlockCtx, + level: "log" | "logerr", +): BlockResult { + const arg = c.inner.slice(level.length).trimStart(); + const col = c.innerRaw.indexOf(level) + 1; + const stepLoc = { line: c.innerNo, col }; + if (arg.startsWith("run ") && arg.slice("run ".length).trimStart().startsWith("`")) { + const runBody = arg.slice("run ".length).trim(); + const result = parseAnonymousInlineScript(c.filePath, c.lines, c.idx, runBody, c.innerNo, col); + const message: Expr = { + kind: "inline_script", + body: result.body, + ...(result.lang ? { lang: result.lang } : {}), + args: result.args, }; + return { step: { type: "say", level, message, loc: stepLoc }, nextIdx: result.nextLineIdx }; + } + if (arg.startsWith("`") || arg.startsWith("```")) { + fail(c.filePath, `bare inline scripts in ${level} are not allowed; use "${level} run \`...\`()" to execute a managed inline script`, c.innerNo, col); } + if (arg.startsWith('"""')) { + const { body, nextIdx } = consumeTripleQuotedArg(c.filePath, c.lines, c.idx, arg); + const raw = dedentTripleQuotedBody(body); + const message: Expr = { kind: "literal", raw }; + c.trivia.setNode(message, { tripleQuoted: true, rawBody: body }); + return { step: { type: "say", level, message, loc: stepLoc }, nextIdx }; + } + if (arg.startsWith('"') && !hasUnescapedClosingQuote(arg, 1)) { + fail(c.filePath, `multiline strings use triple quotes: ${level} """..."""`, c.innerNo, col); + } + const messageRaw = parseLogMessageRhs(c.filePath, c.innerNo, col, arg, level); + return { + step: { type: "say", level, message: { kind: "literal", raw: messageRaw }, loc: stepLoc }, + nextIdx: c.idx + 1, + }; +} - const failMatch = inner.match(/^fail\s+/); - if (failMatch) { - const arg = inner.slice("fail".length).trimStart(); - const failCol = innerRaw.indexOf("fail") + 1; - if (arg.startsWith('"""')) { - const { body, nextIdx } = consumeTripleQuotedArg(filePath, lines, idx, arg); - const raw = tripleQuoteBodyToRaw(dedentTripleQuotedBody(body)); - const message: Expr = { kind: "literal", raw }; - trivia.setNode(message, { tripleQuoted: true, rawBody: body }); - const step: WorkflowStepDef = { type: "say", level: "fail", message, loc: { line: innerNo, col: failCol } }; - return { step, nextIdx }; - } - if (!arg.startsWith('"')) { - fail(filePath, 'fail must match: fail "" or fail """..."""', innerNo, failCol); - } - if (!hasUnescapedClosingQuote(arg, 1)) { - fail(filePath, 'multiline strings use triple quotes: fail """..."""', innerNo, failCol); - } - const closeIdx = indexOfClosingDoubleQuote(arg, 1); - if (closeIdx === -1) { - fail(filePath, "unterminated fail string", innerNo, failCol); - } - const raw = arg.slice(0, closeIdx + 1); +function tryParseLog(c: BlockCtx): BlockResult | null { + if (!c.inner.startsWith("log ") && c.inner !== "log") return null; + return parseSayBody(c, "log"); +} + +function tryParseLogerr(c: BlockCtx): BlockResult | null { + if (!c.inner.startsWith("logerr ") && c.inner !== "logerr") return null; + return parseSayBody(c, "logerr"); +} + +function tryParseReturn(c: BlockCtx): BlockResult | null { + const retLoc = { line: c.innerNo, col: c.innerRaw.indexOf("return") + 1 }; + if (c.inner.trim() === "return") { return { - step: { - type: "say", - level: "fail", - message: { kind: "literal", raw }, - loc: { line: innerNo, col: failCol }, - }, - nextIdx: idx + 1, + step: { type: "return", value: { kind: "literal", raw: '""' }, loc: retLoc }, + nextIdx: c.idx + 1, }; } - - if (inner === "wait") { - fail(filePath, '"wait" has been removed from the language', innerNo, innerRaw.indexOf("wait") + 1); + const m = c.inner.match(/^return\s+(.+)$/s); + if (!m) return null; + const returnValue = m[1].trim(); + if (returnValue.startsWith('"""')) { + const { body, nextIdx } = consumeTripleQuotedArg(c.filePath, c.lines, c.idx, returnValue); + const value: Expr = { kind: "literal", raw: tripleQuoteBodyToRaw(dedentTripleQuotedBody(body)) }; + c.trivia.setNode(value, { tripleQuoted: true, rawBody: body }); + return { step: { type: "return", value, loc: retLoc }, nextIdx }; } - - if (inner.startsWith("ensure ")) { - const ensureBody = inner.slice("ensure ".length).trim(); - return parseRunOrEnsure( - filePath, lines, idx, innerNo, innerRaw, "ensure", ensureBody, false, undefined, trivia, - ); + const matchHead = returnValue.match(/^match\s+(.+?)\s*\{\s*$/); + if (matchHead) { + const { expr, nextIndex } = parseMatchExpr(c.filePath, c.lines, c.idx, matchHead[1].trim(), retLoc); + return { step: { type: "return", value: { kind: "match", match: expr }, loc: retLoc }, nextIdx: nextIndex }; } - - if (inner.startsWith("run async ")) { - const runBody = inner.slice("run async ".length).trim(); - const runCol = innerRaw.indexOf("run") + 1; + if (returnValue.startsWith("run ")) { + const runBody = returnValue.slice("run ".length).trim(); if (runBody.startsWith("`")) { - fail(filePath, "run async is not supported with inline scripts", innerNo, runCol); + const result = parseAnonymousInlineScript(c.filePath, c.lines, c.idx, runBody, c.innerNo, c.innerRaw.indexOf("run") + 1); + const value: Expr = { + kind: "inline_script", + body: result.body, + ...(result.lang ? { lang: result.lang } : {}), + args: result.args, + }; + return { step: { type: "return", value, loc: retLoc }, nextIdx: result.nextLineIdx }; } - return parseRunOrEnsure( - filePath, lines, idx, innerNo, innerRaw, "run", runBody, true, undefined, trivia, - ); - } - - if (inner.startsWith("run ")) { - const runBody = inner.slice("run ".length).trim(); - const runCol = innerRaw.indexOf("run") + 1; - if (runBody.startsWith("`")) { - const result = parseAnonymousInlineScript(filePath, lines, idx, runBody, innerNo, runCol); + const call = parseCallRef(runBody); + if (call) { + rejectTrailingContent(c.filePath, c.innerNo, "run", call.rest); + const callee = { value: call.ref, loc: retLoc }; return { - step: execStep( - { - kind: "inline_script", - body: result.body, - ...(result.lang ? { lang: result.lang } : {}), - args: result.args, - }, - { line: innerNo, col: runCol }, - ), - nextIdx: result.nextLineIdx, + step: { type: "return", value: { kind: "call", callee, args: call.args }, loc: retLoc }, + nextIdx: c.idx + 1, }; } - if (runBody.startsWith("script(") || runBody.startsWith("script (")) { - fail(filePath, 'inline script syntax has changed: use run `body`(args) instead of run script(args) "body"', innerNo); + } + if (returnValue.startsWith("ensure ")) { + const call = parseCallRef(returnValue.slice("ensure ".length).trim()); + if (call) { + rejectTrailingContent(c.filePath, c.innerNo, "ensure", call.rest); + const callee = { value: call.ref, loc: retLoc }; + return { + step: { type: "return", value: { kind: "ensure_call", callee, args: call.args }, loc: retLoc }, + nextIdx: c.idx + 1, + }; } - return parseRunOrEnsure( - filePath, lines, idx, innerNo, innerRaw, "run", runBody, false, undefined, trivia, + } + if (returnValue.startsWith("`") || returnValue.startsWith("```")) { + fail(c.filePath, 'bare inline scripts in return are not allowed; use "return run `...`()" to execute a managed inline script', c.innerNo, retLoc.col); + } + if (returnValue.startsWith("'")) { + fail(c.filePath, 'single-quoted strings are not supported; use double quotes ("...") instead', c.innerNo, retLoc.col); + } + if (/^[0-9]+$/.test(returnValue) || returnValue === "$?") { + fail( + c.filePath, + 'bash exit codes are only valid in scripts; use return "..." for a workflow value', + c.innerNo, + retLoc.col, ); } - - if (forRule && (inner.startsWith("prompt ") || /^[A-Za-z_][A-Za-z0-9_]*\s*=\s*prompt\s/.test(inner))) { - fail(filePath, "prompt is not allowed in rules", innerNo, colFromRaw(innerRaw)); + if ( + returnValue.startsWith('"') || + returnValue.startsWith("$") || + isBareDottedIdentifierReturn(returnValue) || + isBareIdentifierReturn(returnValue) + ) { + if (returnValue.startsWith('"') && !hasUnescapedClosingQuote(returnValue, 1)) { + fail(c.filePath, 'multiline strings use triple quotes: return """..."""', c.innerNo, retLoc.col); + } + const isBareDotted = isBareDottedIdentifierReturn(returnValue); + const isBare = !isBareDotted && isBareIdentifierReturn(returnValue); + const raw = isBareDotted + ? dottedReturnToQuotedString(returnValue) + : isBare + ? bareIdentifierToQuotedString(returnValue) + : returnValue; + const value: Expr = { kind: "literal", raw }; + if (isBareDotted || isBare) { + c.trivia.setNode(value, { bareSource: returnValue.trim() }); + } + return { step: { type: "return", value, loc: retLoc }, nextIdx: c.idx + 1 }; } + return null; +} + +function tryParseStandaloneMatch(c: BlockCtx): BlockResult | null { + const m = c.inner.match(/^match\s+(.+?)\s*\{\s*$/); + if (!m) return null; + const subject = m[1].trim(); + const matchLoc = { line: c.innerNo, col: c.innerRaw.indexOf("match") + 1 }; + const { expr, nextIndex } = parseMatchExpr(c.filePath, c.lines, c.idx, subject, matchLoc); + return { step: execStep({ kind: "match", match: expr }, matchLoc), nextIdx: nextIndex }; +} + +/** + * STATEMENT dispatch table keyed by the leading keyword. Handlers fire only + * when the first token matches the key; each handler either returns a step + * (terminating), calls `fail(...)` (also terminating), or returns null to + * allow fallthrough to send / shell handling. + * + * To add a new top-level keyword, add (a) a row here pointing at the parser + * and (b) the keyword to the JAIPH_KEYWORDS set in `core.ts`. No other file + * needs to change. + */ +export const STATEMENT: Record = { + if: tryParseIf, + for: tryParseFor, + const: tryParseConst, + fail: tryParseFail, + wait: tryParseWait, + ensure: tryParseEnsure, + run: tryParseRun, + prompt: tryParsePrompt, + log: tryParseLog, + logerr: tryParseLogerr, + return: tryParseReturn, + match: tryParseStandaloneMatch, +}; - const promptAssignMatch = inner.match(/^([A-Za-z_][A-Za-z0-9_]*)\s*=\s*prompt\s+(.+)$/s); - if (promptAssignMatch) { +/** Error guards for assignment-shape lines. Emit a fail() or no-op; never return a step. */ +function applyAssignmentGuards(c: BlockCtx): void { + if (c.forRule && (c.inner.startsWith("prompt ") || /^[A-Za-z_][A-Za-z0-9_]*\s*=\s*prompt\s/.test(c.inner))) { + fail(c.filePath, "prompt is not allowed in rules", c.innerNo, colFromRaw(c.innerRaw)); + } + const promptAssign = c.inner.match(/^([A-Za-z_][A-Za-z0-9_]*)\s*=\s*prompt\s+(.+)$/s); + if (promptAssign) { fail( - filePath, + c.filePath, 'use "const name = prompt ..." to capture the prompt result (e.g. const answer = prompt "..." )', - innerNo, - innerRaw.indexOf(promptAssignMatch[1]) + 1, + c.innerNo, + c.innerRaw.indexOf(promptAssign[1]) + 1, ); } - if (inner.startsWith("prompt ")) { - const promptCol = innerRaw.indexOf("prompt") + 1; - const promptArg = innerRaw.slice(innerRaw.indexOf("prompt") + "prompt".length).trimStart(); - const result = parsePromptStep(filePath, lines, idx, promptArg, promptCol, undefined, trivia); - return { step: result.step, nextIdx: result.nextLineIdx + 1 }; - } - - const genericAssignMatch = inner.match(/^([A-Za-z_][A-Za-z0-9_]*)\s+=\s*(.+)$/s); + const generic = c.inner.match(/^([A-Za-z_][A-Za-z0-9_]*)\s+=\s*(.+)$/s); if ( - genericAssignMatch && - !genericAssignMatch[2].trimStart().startsWith("prompt ") && - !genericAssignMatch[2].trimStart().startsWith('"') && - !genericAssignMatch[2].trimStart().startsWith("$") + generic && + !generic[2].trimStart().startsWith("prompt ") && + !generic[2].trimStart().startsWith('"') && + !generic[2].trimStart().startsWith("$") ) { - const captureName = genericAssignMatch[1]; - const rest = genericAssignMatch[2].trim(); + const captureName = generic[1]; + const rest = generic[2].trim(); if (rest.startsWith("run ") || rest.startsWith("ensure ")) { fail( - filePath, + c.filePath, `assignment without "const" is no longer supported; use "const ${captureName} = ${rest}"`, - innerNo, - innerRaw.indexOf(captureName) + 1, + c.innerNo, + c.innerRaw.indexOf(captureName) + 1, ); } } +} - if (inner.startsWith("log ") || inner === "log") { - const logArg = inner.slice("log".length).trimStart(); - const logCol = innerRaw.indexOf("log") + 1; - const stepLoc = { line: innerNo, col: logCol }; - if (logArg.startsWith("run ") && logArg.slice("run ".length).trimStart().startsWith("`")) { - const runBody = logArg.slice("run ".length).trim(); - const result = parseAnonymousInlineScript(filePath, lines, idx, runBody, innerNo, logCol); - const message: Expr = { - kind: "inline_script", - body: result.body, - ...(result.lang ? { lang: result.lang } : {}), - args: result.args, - }; - return { step: { type: "say", level: "log", message, loc: stepLoc }, nextIdx: result.nextLineIdx }; - } - if (logArg.startsWith("`") || logArg.startsWith("```")) { - fail(filePath, 'bare inline scripts in log are not allowed; use "log run `...`()" to execute a managed inline script', innerNo, logCol); - } - if (logArg.startsWith('"""')) { - const { body, nextIdx } = consumeTripleQuotedArg(filePath, lines, idx, logArg); - const raw = dedentTripleQuotedBody(body); - const message: Expr = { kind: "literal", raw }; - trivia.setNode(message, { tripleQuoted: true, rawBody: body }); - return { step: { type: "say", level: "log", message, loc: stepLoc }, nextIdx }; - } - if (logArg.startsWith('"') && !hasUnescapedClosingQuote(logArg, 1)) { - fail(filePath, 'multiline strings use triple quotes: log """..."""', innerNo, logCol); - } - const messageRaw = parseLogMessageRhs(filePath, innerNo, logCol, logArg, "log"); - return { - step: { type: "say", level: "log", message: { kind: "literal", raw: messageRaw }, loc: stepLoc }, - nextIdx: idx + 1, - }; +function trySend(c: BlockCtx): BlockResult | null { + const sendMatch = matchSendOperator(c.inner); + if (!sendMatch) return null; + if (c.forRule) { + fail(c.filePath, "send operator is not allowed in rules", c.innerNo, 1); } + const arrowIdx = c.inner.indexOf("<-"); + const rhsCol = arrowIdx >= 0 ? arrowIdx + 3 : 1; + const { value, nextIdx } = parseSendRhs( + c.filePath, sendMatch.rhsText, c.innerNo, rhsCol, c.lines, c.idx, c.trivia, + ); + return { + step: { type: "send", channel: sendMatch.channel, value, loc: { line: c.innerNo, col: 1 } }, + nextIdx, + }; +} - if (inner.startsWith("logerr ") || inner === "logerr") { - const logerrArg = inner.slice("logerr".length).trimStart(); - const logerrCol = innerRaw.indexOf("logerr") + 1; - const stepLoc = { line: innerNo, col: logerrCol }; - if (logerrArg.startsWith("run ") && logerrArg.slice("run ".length).trimStart().startsWith("`")) { - const runBody = logerrArg.slice("run ".length).trim(); - const result = parseAnonymousInlineScript(filePath, lines, idx, runBody, innerNo, logerrCol); - const message: Expr = { - kind: "inline_script", - body: result.body, - ...(result.lang ? { lang: result.lang } : {}), - args: result.args, - }; - return { step: { type: "say", level: "logerr", message, loc: stepLoc }, nextIdx: result.nextLineIdx }; - } - if (logerrArg.startsWith("`") || logerrArg.startsWith("```")) { - fail(filePath, 'bare inline scripts in logerr are not allowed; use "logerr run `...`()" to execute a managed inline script', innerNo, logerrCol); - } - if (logerrArg.startsWith('"""')) { - const { body, nextIdx } = consumeTripleQuotedArg(filePath, lines, idx, logerrArg); - const raw = dedentTripleQuotedBody(body); - const message: Expr = { kind: "literal", raw }; - trivia.setNode(message, { tripleQuoted: true, rawBody: body }); - return { step: { type: "say", level: "logerr", message, loc: stepLoc }, nextIdx }; - } - if (logerrArg.startsWith('"') && !hasUnescapedClosingQuote(logerrArg, 1)) { - fail(filePath, 'multiline strings use triple quotes: logerr """..."""', innerNo, logerrCol); - } - const messageRaw = parseLogMessageRhs(filePath, innerNo, logerrCol, logerrArg, "logerr"); - return { - step: { type: "say", level: "logerr", message: { kind: "literal", raw: messageRaw }, loc: stepLoc }, - nextIdx: idx + 1, - }; - } +function shellFallthrough(c: BlockCtx): BlockResult { + const loc = { line: c.innerNo, col: colFromRaw(c.innerRaw) }; + return { step: execStep({ kind: "shell", command: c.inner, loc }, loc), nextIdx: c.idx + 1 }; +} + +/** + * One workflow statement inside `{ … }` (catch body, etc.). + * + * Dispatches by leading keyword through `STATEMENT`; falls through to send / + * shell for non-keyword lines. + */ +export function parseBlockStatement( + filePath: string, + lines: string[], + idx: number, + trivia: Trivia = createTrivia(), + opts?: BlockParseOpts, +): { step: WorkflowStepDef; nextIdx: number } { + const innerRaw = lines[idx]; + const inner = innerRaw.trim(); + const innerNo = idx + 1; + const c: BlockCtx = { + filePath, lines, idx, innerRaw, inner, innerNo, trivia, + forRule: opts?.forRule === true, opts, + }; - if (inner.trim() === "return") { + if (inner.startsWith("#")) { return { - step: { - type: "return", - value: { kind: "literal", raw: '""' }, - loc: { line: innerNo, col: innerRaw.indexOf("return") + 1 }, - }, + step: { type: "trivia", kind: "comment", text: innerRaw.trim(), loc: { line: innerNo, col: 1 } }, nextIdx: idx + 1, }; } - const returnMatch = inner.match(/^return\s+(.+)$/s); - if (returnMatch) { - const returnValue = returnMatch[1].trim(); - const retLoc = { line: innerNo, col: innerRaw.indexOf("return") + 1 }; - // return """...""" - if (returnValue.startsWith('"""')) { - const { body, nextIdx } = consumeTripleQuotedArg(filePath, lines, idx, returnValue); - const value: Expr = { kind: "literal", raw: tripleQuoteBodyToRaw(dedentTripleQuotedBody(body)) }; - trivia.setNode(value, { tripleQuoted: true, rawBody: body }); - return { - step: { type: "return", value, loc: retLoc }, - nextIdx, - }; - } - // return match var { ... } - const returnMatchHead = returnValue.match(/^match\s+(.+?)\s*\{\s*$/); - if (returnMatchHead) { - const subject = returnMatchHead[1].trim(); - const { expr, nextIndex } = parseMatchExpr(filePath, lines, idx, subject, retLoc); - return { - step: { type: "return", value: { kind: "match", match: expr }, loc: retLoc }, - nextIdx: nextIndex, - }; - } - if (returnValue.startsWith("run ")) { - const runBody = returnValue.slice("run ".length).trim(); - if (runBody.startsWith("`")) { - const result = parseAnonymousInlineScript(filePath, lines, idx, runBody, innerNo, innerRaw.indexOf("run") + 1); - const value: Expr = { - kind: "inline_script", - body: result.body, - ...(result.lang ? { lang: result.lang } : {}), - args: result.args, - }; - return { - step: { type: "return", value, loc: retLoc }, - nextIdx: result.nextLineIdx, - }; - } - const call = parseCallRef(runBody); - if (call) { - rejectTrailingContent(filePath, innerNo, "run", call.rest); - const callee = { value: call.ref, loc: retLoc }; - return { - step: { type: "return", value: { kind: "call", callee, args: call.args }, loc: retLoc }, - nextIdx: idx + 1, - }; - } - } - if (returnValue.startsWith("ensure ")) { - const call = parseCallRef(returnValue.slice("ensure ".length).trim()); - if (call) { - rejectTrailingContent(filePath, innerNo, "ensure", call.rest); - const callee = { value: call.ref, loc: retLoc }; - return { - step: { type: "return", value: { kind: "ensure_call", callee, args: call.args }, loc: retLoc }, - nextIdx: idx + 1, - }; - } - } - if (returnValue.startsWith("`") || returnValue.startsWith("```")) { - fail(filePath, 'bare inline scripts in return are not allowed; use "return run `...`()" to execute a managed inline script', innerNo, retLoc.col); - } - if (returnValue.startsWith("'")) { - fail(filePath, 'single-quoted strings are not supported; use double quotes ("...") instead', innerNo, retLoc.col); - } - if (/^[0-9]+$/.test(returnValue) || returnValue === "$?") { - fail( - filePath, - 'bash exit codes are only valid in scripts; use return "..." for a workflow value', - innerNo, - retLoc.col, - ); - } - if ( - returnValue.startsWith('"') || - returnValue.startsWith("$") || - isBareDottedIdentifierReturn(returnValue) || - isBareIdentifierReturn(returnValue) - ) { - // Reject multiline "..." - if (returnValue.startsWith('"') && !hasUnescapedClosingQuote(returnValue, 1)) { - fail(filePath, 'multiline strings use triple quotes: return """..."""', innerNo, retLoc.col); - } - const isBareDotted = isBareDottedIdentifierReturn(returnValue); - const isBare = !isBareDotted && isBareIdentifierReturn(returnValue); - const raw = isBareDotted - ? dottedReturnToQuotedString(returnValue) - : isBare - ? bareIdentifierToQuotedString(returnValue) - : returnValue; - const value: Expr = { kind: "literal", raw }; - if (isBareDotted || isBare) { - trivia.setNode(value, { bareSource: returnValue.trim() }); - } - return { - step: { type: "return", value, loc: retLoc }, - nextIdx: idx + 1, - }; - } - } - - // Standalone match statement: match { ... } - const standaloneMatchHead = inner.match(/^match\s+(.+?)\s*\{\s*$/); - if (standaloneMatchHead) { - const subject = standaloneMatchHead[1].trim(); - const matchLoc = { line: innerNo, col: innerRaw.indexOf("match") + 1 }; - const { expr, nextIndex } = parseMatchExpr(filePath, lines, idx, subject, matchLoc); - return { - step: execStep({ kind: "match", match: expr }, matchLoc), - nextIdx: nextIndex, - }; - } + applyAssignmentGuards(c); - const sendMatch = matchSendOperator(inner); - if (sendMatch) { - if (forRule) { - fail(filePath, "send operator is not allowed in rules", innerNo, 1); + const keyword = inner.match(/^([A-Za-z_][A-Za-z0-9_]*)/)?.[1]; + if (keyword) { + const handler = STATEMENT[keyword]; + if (handler) { + const result = handler(c); + if (result) return result; } - const arrowIdx = inner.indexOf("<-"); - const rhsCol = arrowIdx >= 0 ? arrowIdx + 3 : 1; - const { value, nextIdx: sendNextIdx } = parseSendRhs(filePath, sendMatch.rhsText, innerNo, rhsCol, lines, idx, trivia); - return { - step: { - type: "send", - channel: sendMatch.channel, - value, - loc: { line: innerNo, col: 1 }, - }, - nextIdx: sendNextIdx, - }; } - return { - step: execStep( - { kind: "shell", command: inner, loc: { line: innerNo, col: colFromRaw(innerRaw) } }, - { line: innerNo, col: colFromRaw(innerRaw) }, - ), - nextIdx: idx + 1, - }; + return trySend(c) ?? shellFallthrough(c); } diff --git a/test-fixtures/compiler-txtar/parse-errors-snapshot.json b/test-fixtures/compiler-txtar/parse-errors-snapshot.json new file mode 100644 index 00000000..16050f29 --- /dev/null +++ b/test-fixtures/compiler-txtar/parse-errors-snapshot.json @@ -0,0 +1,1969 @@ +{ + "unterminated workflow block": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "unterminated block, expected \"}\"" + }, + "invalid script declaration": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "invalid script declaration" + }, + "invalid rule declaration": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "invalid rule declaration" + }, + "invalid workflow declaration": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "invalid workflow declaration" + }, + "duplicate config block": { + "file": "input.jh", + "line": 5, + "col": 1, + "code": "E_PARSE", + "message": "duplicate config block (only one allowed per file)" + }, + "unknown config key": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "unknown config key: invalid.key. Allowed: agent.default_model, agent.command, agent.backend, agent.trusted_workspace, agent.cursor_flags, agent.claude_flags, run.logs_dir, run.debug, run.recover_limit, runtime.docker_image, runtime.docker_network, runtime.docker_timeout_seconds, module.name, module.version, module.description" + }, + "single-quoted import path": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "single-quoted strings are not supported; use double quotes (\"...\") instead" + }, + "import missing alias": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "import must match: import \"\" as " + }, + "command substitution in prompt": { + "file": "", + "line": 0, + "col": 0, + "code": "OK", + "message": "compilation succeeded but fixture expected a parse error" + }, + "rule without braces": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "rule declarations require parentheses: rule foo() { … } or rule foo(params) { … }" + }, + "rule with parentheses (unterminated)": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "unterminated rule block: foo" + }, + "rule with colon instead of braces": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "rule declarations require parentheses: rule foo() { … } or rule foo(params) { … }" + }, + "export rule without braces": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "rule declarations require parentheses: rule bar() { … } or rule bar(params) { … }" + }, + "rule with parentheses but no brace": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "rule declarations require braces: rule gate() { … } or rule gate(params) { … }" + }, + "script without braces": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "script definitions require = after the name: script greet = `...`" + }, + "script with parentheses": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "definitions must not use parentheses: script greet = `...`" + }, + "script with parens but no braces": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "definitions must not use parentheses: script greet = `...`" + }, + "workflow without braces": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "workflow declarations require parentheses: workflow default() { … } or workflow default(params) { … }" + }, + "workflow with parentheses (unterminated)": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "unterminated block, expected \"}\"" + }, + "export workflow without braces": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "workflow declarations require parentheses: workflow main() { … } or workflow main(params) { … }" + }, + "workflow with parentheses but no brace": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "workflow declarations require braces: workflow main() { … } or workflow main(params) { … }" + }, + "duplicate config in same workflow": { + "file": "input.jh", + "line": 5, + "col": 1, + "code": "E_PARSE", + "message": "duplicate config block inside workflow (only one allowed per workflow)" + }, + "config after steps in workflow": { + "file": "input.jh", + "line": 3, + "col": 1, + "code": "E_PARSE", + "message": "config block inside workflow must appear before any steps" + }, + "runtime keys in workflow config": { + "file": "input.jh", + "line": 2, + "col": 1, + "code": "E_PARSE", + "message": "runtime.* keys are not allowed in workflow-level config (only agent.* and run.* keys)" + }, + "script tag with manual shebang conflict": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "fence tag \"node\" already sets the shebang — remove the manual \"#!\" line" + }, + "script tag with parentheses": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "unsupported top-level statement: script:node transform() {" + }, + "script tag without braces": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "unsupported top-level statement: script:node transform" + }, + "capture with run async rejected": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "assignment without \"const\" is no longer supported; use \"const x = run async some_wf()\"" + }, + "run async with inline script rejected": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "run async is not supported with inline scripts" + }, + "old inline script syntax rejected": { + "file": "input.jh", + "line": 2, + "col": 1, + "code": "E_PARSE", + "message": "inline script syntax has changed: use run `body`(args) instead of run script(args) \"body\"" + }, + "invalid agent.backend value": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "agent.backend must be \"cursor\", \"claude\", or \"codex\"" + }, + "invalid config value not quoted": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "config value must be a quoted string or true/false: yes" + }, + "config integer key rejects string value": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "runtime.docker_timeout_seconds must be an integer" + }, + "config array key rejects runtime.workspace (no longer supported)": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "unknown config key: runtime.workspace. Allowed: agent.default_model, agent.command, agent.backend, agent.trusted_workspace, agent.cursor_flags, agent.claude_flags, run.logs_dir, run.debug, run.recover_limit, runtime.docker_image, runtime.docker_network, runtime.docker_timeout_seconds, module.name, module.version, module.description" + }, + "config rejects runtime.docker_enabled (no longer supported)": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "unknown config key: runtime.docker_enabled. Allowed: agent.default_model, agent.command, agent.backend, agent.trusted_workspace, agent.cursor_flags, agent.claude_flags, run.logs_dir, run.debug, run.recover_limit, runtime.docker_image, runtime.docker_network, runtime.docker_timeout_seconds, module.name, module.version, module.description" + }, + "unknown runtime config key": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "unknown config key: runtime.unknown_key. Allowed: agent.default_model, agent.command, agent.backend, agent.trusted_workspace, agent.cursor_flags, agent.claude_flags, run.logs_dir, run.debug, run.recover_limit, runtime.docker_image, runtime.docker_network, runtime.docker_timeout_seconds, module.name, module.version, module.description" + }, + "if keyword with old syntax produces error": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "invalid if syntax; expected: if { ... } where op is ==, !=, =~, or !~ and operand is \"string\" or /regex/" + }, + "channel declaration must be single per line": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "invalid channel declaration; expected: channel or channel -> " + }, + "capture and send cannot be combined": { + "file": "", + "line": 0, + "col": 0, + "code": "OK", + "message": "compilation succeeded but fixture expected a parse error" + }, + "top-level local keyword is rejected": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "unsupported top-level statement: local greeting = \"hello world\"" + }, + "top-level const name collision with rule": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "duplicate name \"foo\" — variable name collides with rule of the same name" + }, + "top-level const name collision with workflow": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "duplicate name \"default\" — variable name collides with workflow of the same name" + }, + "top-level const name collision with script": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "duplicate name \"helper\" — variable name collides with script of the same name" + }, + "const rejects bare call-like rhs without run": { + "file": "input.jh", + "line": 3, + "col": 13, + "code": "E_PARSE", + "message": "Script calls in const assignments must use run. Use: const x = run some_script(\"${arg}\")" + }, + "unterminated rule block": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "unterminated rule block: bad" + }, + "unsupported top-level statement": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "unsupported top-level statement: echo \"not allowed at top level\"" + }, + "multiline prompt string rejected": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "multiline prompt strings are no longer supported; use a triple-quoted block instead: prompt \"\"\"...\"\"\"\"" + }, + "if keyword with not produces error": { + "file": "input.jh", + "line": 6, + "col": 3, + "code": "E_PARSE", + "message": "invalid if syntax; expected: if { ... } where op is ==, !=, =~, or !~ and operand is \"string\" or /regex/" + }, + "invalid workflow reference shape with extra dots": { + "file": "input.jh", + "line": 2, + "col": 1, + "code": "E_PARSE", + "message": "run must target a valid reference: run ref() or run ref(args) — parentheses are required" + }, + "prompt with returns without capture name": { + "file": "", + "line": 0, + "col": 0, + "code": "OK", + "message": "compilation succeeded but fixture expected a parse error" + }, + "ensure catch with args after catch": { + "file": "input.jh", + "line": 6, + "col": 22, + "code": "E_PARSE", + "message": "catch requires explicit bindings: catch () { ... }" + }, + "ensure catch with multiple args after catch": { + "file": "input.jh", + "line": 6, + "col": 25, + "code": "E_PARSE", + "message": "catch requires explicit bindings: catch () { ... }" + }, + "ensure catch without block": { + "file": "input.jh", + "line": 6, + "col": 33, + "code": "E_PARSE", + "message": "catch requires explicit bindings and a body: catch () { ... }" + }, + "ensure catch without bindings (bare catch block)": { + "file": "input.jh", + "line": 5, + "col": 18, + "code": "E_PARSE", + "message": "catch requires explicit bindings: catch () { ... }" + }, + "capture and send combined alt form": { + "file": "", + "line": 0, + "col": 0, + "code": "OK", + "message": "compilation succeeded but fixture expected a parse error" + }, + "reject bare $name in log message": { + "file": "", + "line": 0, + "col": 0, + "code": "OK", + "message": "compilation succeeded but fixture expected a parse error" + }, + "reject bare $name in prompt": { + "file": "", + "line": 0, + "col": 0, + "code": "OK", + "message": "compilation succeeded but fixture expected a parse error" + }, + "reject bare $1 in log message": { + "file": "", + "line": 0, + "col": 0, + "code": "OK", + "message": "compilation succeeded but fixture expected a parse error" + }, + "reject braced numeric ${1} in log message": { + "file": "", + "line": 0, + "col": 0, + "code": "OK", + "message": "compilation succeeded but fixture expected a parse error" + }, + "reject bare $name in fail message": { + "file": "", + "line": 0, + "col": 0, + "code": "OK", + "message": "compilation succeeded but fixture expected a parse error" + }, + "reject bare $name in return string": { + "file": "", + "line": 0, + "col": 0, + "code": "OK", + "message": "compilation succeeded but fixture expected a parse error" + }, + "reject shell fallback ${var:-fallback} in log": { + "file": "", + "line": 0, + "col": 0, + "code": "OK", + "message": "compilation succeeded but fixture expected a parse error" + }, + "reject shell fallback ${var:-fallback} in fail": { + "file": "", + "line": 0, + "col": 0, + "code": "OK", + "message": "compilation succeeded but fixture expected a parse error" + }, + "reject shell fallback ${var:-fallback} in prompt": { + "file": "", + "line": 0, + "col": 0, + "code": "OK", + "message": "compilation succeeded but fixture expected a parse error" + }, + "reject shell fallback ${var:-fallback} in const RHS": { + "file": "input.jh", + "line": 2, + "col": 13, + "code": "E_PARSE", + "message": "shell fallback syntax (e.g. ${var:-default}) is not supported; use conditional logic or named params instead" + }, + "reject shell expansion ${var:+alt} in log": { + "file": "", + "line": 0, + "col": 0, + "code": "OK", + "message": "compilation succeeded but fixture expected a parse error" + }, + "reject command substitution in log": { + "file": "", + "line": 0, + "col": 0, + "code": "OK", + "message": "compilation succeeded but fixture expected a parse error" + }, + "reject command substitution in logerr": { + "file": "", + "line": 0, + "col": 0, + "code": "OK", + "message": "compilation succeeded but fixture expected a parse error" + }, + "reject shell fallback in rule log": { + "file": "", + "line": 0, + "col": 0, + "code": "OK", + "message": "compilation succeeded but fixture expected a parse error" + }, + "nested inline capture rejected": { + "file": "", + "line": 0, + "col": 0, + "code": "OK", + "message": "compilation succeeded but fixture expected a parse error" + }, + "invalid inline run reference bad identifier": { + "file": "", + "line": 0, + "col": 0, + "code": "OK", + "message": "compilation succeeded but fixture expected a parse error" + }, + "match: unterminated string in pattern": { + "file": "input.jh", + "line": 4, + "col": 1, + "code": "E_PARSE", + "message": "unterminated string in match pattern" + }, + "match: unterminated regex in pattern": { + "file": "input.jh", + "line": 4, + "col": 1, + "code": "E_PARSE", + "message": "unterminated regex in match pattern" + }, + "match: empty regex in pattern": { + "file": "input.jh", + "line": 4, + "col": 1, + "code": "E_PARSE", + "message": "empty regex in match pattern" + }, + "match: invalid regex in pattern": { + "file": "input.jh", + "line": 4, + "col": 1, + "code": "E_PARSE", + "message": "invalid regex in match pattern: /[invalid/" + }, + "match: empty arm body": { + "file": "input.jh", + "line": 4, + "col": 1, + "code": "E_PARSE", + "message": "match arm body cannot be empty" + }, + "match: unterminated string in arm body": { + "file": "input.jh", + "line": 4, + "col": 1, + "code": "E_PARSE", + "message": "unterminated string in match arm body" + }, + "match: single-quoted pattern": { + "file": "input.jh", + "line": 4, + "col": 1, + "code": "E_PARSE", + "message": "single-quoted strings are not supported; use double quotes (\"...\") instead" + }, + "match: single-quoted arm body": { + "file": "input.jh", + "line": 4, + "col": 1, + "code": "E_PARSE", + "message": "single-quoted strings are not supported; use double quotes (\"...\") instead" + }, + "match: missing arrow after pattern": { + "file": "input.jh", + "line": 4, + "col": 1, + "code": "E_PARSE", + "message": "expected \"=>\" after match pattern" + }, + "run async without parentheses requires parens": { + "file": "input.jh", + "line": 5, + "col": 1, + "code": "E_PARSE", + "message": "run async must target a valid reference: run async ref() or run async ref(args) — parentheses are required" + }, + "log format not double-quoted": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "log must match: log \"\" or log " + }, + "unterminated log string": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "multiline strings use triple quotes: log \"\"\"...\"\"\"" + }, + "logerr format not double-quoted": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "logerr must match: logerr \"\" or logerr " + }, + "unterminated logerr string": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "multiline strings use triple quotes: logerr \"\"\"...\"\"\"" + }, + "invalid workflow reference in channel route": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "invalid workflow reference in channel route: \"123bad\"" + }, + "route inside workflow body is parse error": { + "file": "input.jh", + "line": 2, + "col": 1, + "code": "E_PARSE", + "message": "route declarations belong at the top level: channel findings -> analyst" + }, + "if keyword in workflow with args produces error": { + "file": "input.jh", + "line": 5, + "col": 3, + "code": "E_PARSE", + "message": "invalid if syntax; expected: if { ... } where op is ==, !=, =~, or !~ and operand is \"string\" or /regex/" + }, + "brace-if: wait in rules": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "\"wait\" has been removed from the language" + }, + "brace-if: prompt in rules": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "prompt is not allowed in rules" + }, + "brace-if: send in rules": { + "file": "input.jh", + "line": 3, + "col": 1, + "code": "E_PARSE", + "message": "send operator is not allowed in rules" + }, + "if keyword with else branch produces error": { + "file": "input.jh", + "line": 5, + "col": 3, + "code": "E_PARSE", + "message": "invalid if syntax; expected: if { ... } where op is ==, !=, =~, or !~ and operand is \"string\" or /regex/" + }, + "brace-if: const prompt in rules": { + "file": "input.jh", + "line": 2, + "col": 13, + "code": "E_PARSE", + "message": "const ... = prompt is not allowed in rules" + }, + "unterminated script block": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "unterminated fenced block: no closing ``` before end of file" + }, + "metadata: runtime.workspace array rejected (single-quoted element)": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "unknown config key: runtime.workspace. Allowed: agent.default_model, agent.command, agent.backend, agent.trusted_workspace, agent.cursor_flags, agent.claude_flags, run.logs_dir, run.debug, run.recover_limit, runtime.docker_image, runtime.docker_network, runtime.docker_timeout_seconds, module.name, module.version, module.description" + }, + "metadata: runtime.workspace array rejected (unquoted element)": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "unknown config key: runtime.workspace. Allowed: agent.default_model, agent.command, agent.backend, agent.trusted_workspace, agent.cursor_flags, agent.claude_flags, run.logs_dir, run.debug, run.recover_limit, runtime.docker_image, runtime.docker_network, runtime.docker_timeout_seconds, module.name, module.version, module.description" + }, + "metadata: runtime.workspace array rejected (unclosed)": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "unknown config key: runtime.workspace. Allowed: agent.default_model, agent.command, agent.backend, agent.trusted_workspace, agent.cursor_flags, agent.claude_flags, run.logs_dir, run.debug, run.recover_limit, runtime.docker_image, runtime.docker_network, runtime.docker_timeout_seconds, module.name, module.version, module.description" + }, + "unterminated test block": { + "file": "input.test.jh", + "line": 3, + "col": 1, + "code": "E_PARSE", + "message": "unterminated test block: broken test" + }, + "mock function deprecated": { + "file": "input.test.jh", + "line": 4, + "col": 3, + "code": "E_PARSE", + "message": "\"mock function\" is no longer supported; use \"mock script\"" + }, + "send rhs: unterminated braced interpolation": { + "file": "input.jh", + "line": 3, + "col": 12, + "code": "E_PARSE", + "message": "unterminated ${...} in send right-hand side" + }, + "send rhs: command substitution inside braced interpolation": { + "file": "input.jh", + "line": 3, + "col": 12, + "code": "E_PARSE", + "message": "send right-hand side must be a quoted string (\"...\"), a variable ($name or ${...}), or \"run [args]\" — not raw shell; use a script or use const" + }, + "inline script: unexpected content after": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "unexpected content after anonymous inline script: 'extra_stuff'" + }, + "inline script: unterminated parentheses": { + "file": "input.jh", + "line": 2, + "col": 1, + "code": "E_PARSE", + "message": "run must target a valid reference: run ref() or run ref(args) — parentheses are required" + }, + "inline script without parentheses": { + "file": "input.jh", + "line": 2, + "col": 1, + "code": "E_PARSE", + "message": "run must target a valid reference: run ref() or run ref(args) — parentheses are required" + }, + "old inline script empty body rejected": { + "file": "input.jh", + "line": 2, + "col": 1, + "code": "E_PARSE", + "message": "inline script syntax has changed: use run `body`(args) instead of run script(args) \"body\"" + }, + "inline script unterminated backtick": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "unterminated inline script backtick — missing closing `" + }, + "match: invalid pattern type": { + "file": "input.jh", + "line": 4, + "col": 1, + "code": "E_PARSE", + "message": "match pattern must be a string literal (\"...\"), regex (/…/), or wildcard (_)" + }, + "match: unterminated match block": { + "file": "input.jh", + "line": 3, + "col": 1, + "code": "E_PARSE", + "message": "unterminated match block" + }, + "match: empty match block": { + "file": "input.jh", + "line": 3, + "col": 1, + "code": "E_PARSE", + "message": "match must have at least one arm" + }, + "catch: fail without double quote": { + "file": "input.jh", + "line": 6, + "col": 5, + "code": "E_PARSE", + "message": "fail must match: fail \"\" or fail \"\"\"...\"\"\"" + }, + "catch: unterminated fail string": { + "file": "input.jh", + "line": 6, + "col": 5, + "code": "E_PARSE", + "message": "multiline strings use triple quotes: fail \"\"\"...\"\"\"" + }, + "catch: log without double quote": { + "file": "input.jh", + "line": 6, + "col": 5, + "code": "E_PARSE", + "message": "log must match: log \"\" or log " + }, + "catch: unterminated log string": { + "file": "input.jh", + "line": 6, + "col": 5, + "code": "E_PARSE", + "message": "multiline strings use triple quotes: log \"\"\"...\"\"\"" + }, + "catch: logerr without double quote": { + "file": "input.jh", + "line": 6, + "col": 5, + "code": "E_PARSE", + "message": "logerr must match: logerr \"\" or logerr " + }, + "catch: unterminated logerr string": { + "file": "input.jh", + "line": 6, + "col": 5, + "code": "E_PARSE", + "message": "multiline strings use triple quotes: logerr \"\"\"...\"\"\"" + }, + "test: empty mock prompt block": { + "file": "input.test.jh", + "line": 4, + "col": 1, + "code": "E_PARSE", + "message": "mock prompt block must have at least one arm" + }, + "test: unterminated mock block": { + "file": "input.test.jh", + "line": 5, + "col": 1, + "code": "E_PARSE", + "message": "unterminated mock block" + }, + "test: mock prompt invalid format": { + "file": "input.test.jh", + "line": 4, + "col": 2, + "code": "E_PARSE", + "message": "mock prompt must be: mock prompt \"\", mock prompt , or mock prompt { \"pattern\" => \"response\", _ => \"default\" }" + }, + "test: mock workflow with invalid ref (no parens)": { + "file": "input.test.jh", + "line": 4, + "col": 3, + "code": "E_PARSE", + "message": "unrecognized test step: mock workflow a.b.c {" + }, + "test: mock rule with invalid ref (no parens)": { + "file": "input.test.jh", + "line": 4, + "col": 3, + "code": "E_PARSE", + "message": "unrecognized test step: mock rule a.b.c {" + }, + "test: mock script with invalid ref (no parens)": { + "file": "input.test.jh", + "line": 4, + "col": 3, + "code": "E_PARSE", + "message": "unrecognized test step: mock script a.b.c {" + }, + "run async without parens in workflow requires parens": { + "file": "input.jh", + "line": 5, + "col": 1, + "code": "E_PARSE", + "message": "run async must target a valid reference: run async ref() or run async ref(args) — parentheses are required" + }, + "config block with content on same line as opening": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "config block must be exactly 'config {' on its own line" + }, + "unterminated string in top-level const": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "multiline strings use triple quotes: const name = \"\"\"...\"\"\"\"" + }, + "top-level const with trailing content after quote": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "unexpected content after closing quote in const declaration" + }, + "top-level const single-quoted string": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "single-quoted strings are not supported; use double quotes (\"...\") instead" + }, + "workflow const with command substitution": { + "file": "input.jh", + "line": 2, + "col": 13, + "code": "E_PARSE", + "message": "const value cannot use command substitution \"$(...)\"; use a script and const name = run ref" + }, + "workflow const with bash percent expansion": { + "file": "input.jh", + "line": 2, + "col": 13, + "code": "E_PARSE", + "message": "const value cannot use ${var%%...} expansion; use a script" + }, + "workflow const with bash replace expansion": { + "file": "input.jh", + "line": 2, + "col": 13, + "code": "E_PARSE", + "message": "const value cannot use ${var//...} expansion; use a script" + }, + "workflow const with bash length expansion": { + "file": "input.jh", + "line": 2, + "col": 13, + "code": "E_PARSE", + "message": "const value cannot use ${#var}; use a script" + }, + "workflow const with shell fallback": { + "file": "input.jh", + "line": 2, + "col": 13, + "code": "E_PARSE", + "message": "shell fallback syntax (e.g. ${var:-default}) is not supported; use conditional logic or named params instead" + }, + "run without parentheses in workflow requires parens": { + "file": "input.jh", + "line": 3, + "col": 1, + "code": "E_PARSE", + "message": "run must target a valid reference: run ref() or run ref(args) — parentheses are required" + }, + "return with single-quoted string": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "single-quoted strings are not supported; use double quotes (\"...\") instead" + }, + "capture and send combined in workflow body": { + "file": "", + "line": 0, + "col": 0, + "code": "OK", + "message": "compilation succeeded but fixture expected a parse error" + }, + "if keyword at top of workflow produces error": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "invalid if syntax; expected: if { ... } where op is ==, !=, =~, or !~ and operand is \"string\" or /regex/" + }, + "capture run without parentheses requires const": { + "file": "input.jh", + "line": 3, + "col": 3, + "code": "E_PARSE", + "message": "assignment without \"const\" is no longer supported; use \"const x = run helper\"" + }, + "const run without parentheses requires parens": { + "file": "input.jh", + "line": 3, + "col": 13, + "code": "E_PARSE", + "message": "const ... = run must target a valid reference" + }, + "const ensure without parentheses requires parens": { + "file": "input.jh", + "line": 5, + "col": 13, + "code": "E_PARSE", + "message": "const ... = ensure must target a valid reference" + }, + "triple-backtick prompt is rejected": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "prompt blocks use triple quotes: prompt \"\"\"...\"\"\"; triple backticks are for scripts" + }, + "unterminated triple-quoted prompt block": { + "file": "input.jh", + "line": 2, + "col": 1, + "code": "E_PARSE", + "message": "unterminated triple-quoted block: no closing \"\"\" before end of file" + }, + "script with returns on closing fence rejected": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "unexpected content after closing fence: 'returns \"{ x: string }\"'" + }, + "script with double-quoted body rejected": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "script bodies use backticks: script broken = `...`" + }, + "script body with Jaiph interpolation rejected": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "script bodies cannot contain Jaiph interpolation (${name}); use $1, $2 positional arguments instead" + }, + "script with brace-style body rejected": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "brace-style script bodies are no longer supported; use: script name = `...` or script name = ```...```" + }, + "script body with bare identifier rejected": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "script bodies must be backtick or fenced block: script broken = `...` or script broken = ```...```" + }, + "script body with trailing content after backtick": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "unexpected content after script body backtick: 'extra'" + }, + "inline script fenced without argument list": { + "file": "input.jh", + "line": 4, + "col": 3, + "code": "E_PARSE", + "message": "anonymous inline script requires argument list after closing fence: ```(args) or ```()" + }, + "inline script fenced with shebang and lang tag conflict": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "fence tag \"node\" already sets the shebang — remove the manual \"#!\" line" + }, + "inline script fenced unterminated in workflow": { + "file": "input.jh", + "line": 2, + "col": 1, + "code": "E_PARSE", + "message": "unterminated fenced block: no closing ``` before end of file" + }, + "inline script single backtick without argument list": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "anonymous inline script requires argument list after closing backtick: `body`(args) or `body`()" + }, + "inline script fenced with invalid lang token": { + "file": "input.jh", + "line": 2, + "col": 1, + "code": "E_PARSE", + "message": "invalid opening fence: only a single lang token is allowed after ```" + }, + "config block unterminated": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "config block not closed with '}'" + }, + "config block with trailing content after close brace": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "config block must be exactly 'config {' on its own line" + }, + "match subject with dollar prefix rejected": { + "file": "input.jh", + "line": 3, + "col": 1, + "code": "E_PARSE", + "message": "match subject should be a bare identifier: match varName { ... }" + }, + "match subject with invalid identifier": { + "file": "input.jh", + "line": 3, + "col": 1, + "code": "E_PARSE", + "message": "match subject must be a valid identifier, got: 123" + }, + "const run with invalid reference": { + "file": "input.jh", + "line": 2, + "col": 13, + "code": "E_PARSE", + "message": "const ... = run must target a valid reference" + }, + "const ensure with invalid reference": { + "file": "input.jh", + "line": 2, + "col": 13, + "code": "E_PARSE", + "message": "const ... = ensure must target a valid reference" + }, + "const ensure cannot use catch": { + "file": "input.jh", + "line": 2, + "col": 13, + "code": "E_PARSE", + "message": "const ... = ensure cannot use catch" + }, + "run with invalid reference in workflow": { + "file": "input.jh", + "line": 2, + "col": 1, + "code": "E_PARSE", + "message": "run must target a valid reference: run ref() or run ref(args) — parentheses are required" + }, + "run async with invalid reference in workflow": { + "file": "input.jh", + "line": 2, + "col": 1, + "code": "E_PARSE", + "message": "run async must target a valid reference: run async ref() or run async ref(args) — parentheses are required" + }, + "empty parameter name in parameter list": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "empty parameter name in parameter list" + }, + "invalid parameter name in workflow": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "invalid parameter name \"123bad\"; must be an identifier" + }, + "duplicate parameter name in workflow": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "duplicate parameter name \"a\"" + }, + "send with unterminated string": { + "file": "input.jh", + "line": 3, + "col": 12, + "code": "E_PARSE", + "message": "multiline strings use triple quotes: channel <- \"\"\"...\"\"\"" + }, + "triple-quote opening with content on same line": { + "file": "input.jh", + "line": 2, + "col": 1, + "code": "E_PARSE", + "message": "opening \"\"\" must not have content on the same line" + }, + "log with trailing content after string": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "unexpected content after log string: 'extra'" + }, + "logerr with trailing content after string": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "unexpected content after logerr string: 'extra'" + }, + "catch: prompt capture without const": { + "file": "input.jh", + "line": 6, + "col": 5, + "code": "E_PARSE", + "message": "use \"const name = prompt ...\" to capture the prompt result (e.g. const answer = prompt \"...\" )" + }, + "top-level const triple-quote with trailing content after close": { + "file": "input.jh", + "line": 3, + "col": 1, + "code": "E_PARSE", + "message": "unexpected content after closing \"\"\" in const declaration" + }, + "unterminated return string in workflow": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "multiline strings use triple quotes: return \"\"\"...\"\"\"" + }, + "unterminated fail string in workflow": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "multiline strings use triple quotes: fail \"\"\"...\"\"\"" + }, + "if keyword after other steps produces error": { + "file": "input.jh", + "line": 3, + "col": 3, + "code": "E_PARSE", + "message": "invalid if syntax; expected: if { ... } where op is ==, !=, =~, or !~ and operand is \"string\" or /regex/" + }, + "prompt assign without const in workflow": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "use \"const name = prompt ...\" to capture the prompt result (e.g. const answer = prompt \"...\" )" + }, + "fail without double quote in workflow body": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "fail must match: fail \"\" or fail \"\"\"...\"\"\"" + }, + "unterminated multiline catch block": { + "file": "input.jh", + "line": 5, + "col": 18, + "code": "E_PARSE", + "message": "unterminated catch block, expected \"}\"" + }, + "duplicate name: rule and workflow": { + "file": "input.jh", + "line": 4, + "col": 1, + "code": "E_PARSE", + "message": "duplicate name \"foo\" — channels, rules, workflows, and scripts share a single namespace (already declared as rule)" + }, + "duplicate name: script and workflow": { + "file": "input.jh", + "line": 2, + "col": 1, + "code": "E_PARSE", + "message": "duplicate name \"helper\" — channels, rules, workflows, and scripts share a single namespace (already declared as script)" + }, + "duplicate name: rule and script": { + "file": "input.jh", + "line": 4, + "col": 1, + "code": "E_PARSE", + "message": "duplicate name \"foo\" — channels, rules, workflows, and scripts share a single namespace (already declared as rule)" + }, + "channel route with no target after arrow": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "channel route requires at least one target workflow after ->" + }, + "empty multiline catch block": { + "file": "input.jh", + "line": 5, + "col": 18, + "code": "E_PARSE", + "message": "catch block must contain at least one statement" + }, + "ensure with invalid reference": { + "file": "input.jh", + "line": 2, + "col": 1, + "code": "E_PARSE", + "message": "ensure must target a valid reference: ensure ref() or ensure ref(args) — parentheses are required" + }, + "config line without equals sign": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "config line must be key = value: agent.backend" + }, + "prompt with trailing non-returns content": { + "file": "input.jh", + "line": 2, + "col": 1, + "code": "E_PARSE", + "message": "after prompt string expected keyword \"returns\" with quoted schema (e.g. returns \"{ type: string }\") or end of line" + }, + "prompt returns with single-quoted schema": { + "file": "input.jh", + "line": 2, + "col": 1, + "code": "E_PARSE", + "message": "single-quoted strings are not supported; use double quotes (\"...\") instead" + }, + "unterminated returns schema string": { + "file": "input.jh", + "line": 2, + "col": 1, + "code": "E_PARSE", + "message": "unterminated returns schema string" + }, + "test block without opening brace": { + "file": "input.test.jh", + "line": 3, + "col": 1, + "code": "E_PARSE", + "message": "test block must match: test \"description\" {" + }, + "send with trailing content after string": { + "file": "input.jh", + "line": 3, + "col": 12, + "code": "E_PARSE", + "message": "send right-hand side must be a quoted string (\"...\"), a variable ($name or ${...}), or \"run [args]\" — not raw shell; use a script or use const" + }, + "capture run with invalid reference in workflow": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "assignment without \"const\" is no longer supported; use \"const x = run 123bad\"" + }, + "fenced script with shell parameter expansion is valid": { + "file": "", + "line": 0, + "col": 0, + "code": "OK", + "message": "compilation succeeded but fixture expected a parse error" + }, + "empty inline catch block": { + "file": "input.jh", + "line": 5, + "col": 18, + "code": "E_PARSE", + "message": "catch block must contain at least one statement" + }, + "test: old camelCase expectContain rejected": { + "file": "input.test.jh", + "line": 4, + "col": 3, + "code": "E_PARSE", + "message": "camelCase assertions are no longer supported; use \"expect_contain\"" + }, + "test: old camelCase expectNotContain rejected": { + "file": "input.test.jh", + "line": 4, + "col": 3, + "code": "E_PARSE", + "message": "camelCase assertions are no longer supported; use \"expect_not_contain\"" + }, + "test: old camelCase expectEqual rejected": { + "file": "input.test.jh", + "line": 4, + "col": 3, + "code": "E_PARSE", + "message": "camelCase assertions are no longer supported; use \"expect_equal\"" + }, + "test: bare assignment without const/run rejected": { + "file": "input.test.jh", + "line": 4, + "col": 3, + "code": "E_PARSE", + "message": "use \"const out = run lib.default(…)\" to capture workflow output" + }, + "test: bare workflow call without run rejected": { + "file": "input.test.jh", + "line": 4, + "col": 3, + "code": "E_PARSE", + "message": "use \"run lib.default(…)\" to call a workflow in tests" + }, + "test: mock workflow without parens rejected": { + "file": "input.test.jh", + "line": 4, + "col": 3, + "code": "E_PARSE", + "message": "mock workflow requires parentheses: mock workflow lib.default() { … }" + }, + "test: mock rule without parens rejected": { + "file": "input.test.jh", + "line": 4, + "col": 3, + "code": "E_PARSE", + "message": "mock rule requires parentheses: mock rule lib.check() { … }" + }, + "test: mock script without parens rejected": { + "file": "input.test.jh", + "line": 4, + "col": 3, + "code": "E_PARSE", + "message": "mock script requires parentheses: mock script lib.helper() { … }" + }, + "test: unrecognized line is E_PARSE": { + "file": "input.test.jh", + "line": 4, + "col": 3, + "code": "E_PARSE", + "message": "unrecognized test step: echo \"not valid\"" + }, + "ensure catch with unterminated bindings paren": { + "file": "input.jh", + "line": 5, + "col": 18, + "code": "E_PARSE", + "message": "unterminated catch bindings: expected \")\"" + }, + "ensure catch with empty bindings": { + "file": "input.jh", + "line": 5, + "col": 18, + "code": "E_PARSE", + "message": "catch requires exactly one binding: catch () { ... }" + }, + "ensure catch with invalid binding name": { + "file": "input.jh", + "line": 5, + "col": 18, + "code": "E_PARSE", + "message": "invalid catch binding name: \"123bad\" — must be a valid identifier" + }, + "ensure catch with no body after bindings": { + "file": "input.jh", + "line": 5, + "col": 18, + "code": "E_PARSE", + "message": "catch requires a body after bindings" + }, + "run catch with unterminated bindings paren": { + "file": "input.jh", + "line": 5, + "col": 16, + "code": "E_PARSE", + "message": "unterminated catch bindings: expected \")\"" + }, + "run catch with empty bindings": { + "file": "input.jh", + "line": 5, + "col": 16, + "code": "E_PARSE", + "message": "catch requires exactly one binding: catch () { ... }" + }, + "run catch with invalid binding name": { + "file": "input.jh", + "line": 5, + "col": 16, + "code": "E_PARSE", + "message": "invalid catch binding name: \"123bad\" — must be a valid identifier" + }, + "run catch with no body after bindings": { + "file": "input.jh", + "line": 5, + "col": 16, + "code": "E_PARSE", + "message": "catch requires a body after bindings" + }, + "ensure catch with multiple bindings rejected": { + "file": "input.jh", + "line": 5, + "col": 18, + "code": "E_PARSE", + "message": "catch accepts exactly one binding: catch () — the second binding (attempt) has been removed" + }, + "run catch with multiple bindings rejected": { + "file": "input.jh", + "line": 5, + "col": 16, + "code": "E_PARSE", + "message": "catch accepts exactly one binding: catch () — the second binding (attempt) has been removed" + }, + "inline catch fail without double quote": { + "file": "input.jh", + "line": 5, + "col": 1, + "code": "E_PARSE", + "message": "fail must match: fail \"\" or fail \"\"\"...\"\"\"" + }, + "inline catch unterminated fail string": { + "file": "input.jh", + "line": 5, + "col": 1, + "code": "E_PARSE", + "message": "multiline strings use triple quotes: fail \"\"\"...\"\"\"" + }, + "inline config block missing equals sign": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "config block must be exactly 'config {' on its own line" + }, + "inline config block with unknown key": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "config block must be exactly 'config {' on its own line" + }, + "inline config block rejects runtime.workspace (array opening)": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "config block must be exactly 'config {' on its own line" + }, + "inline config block rejects runtime.workspace (non-empty array)": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "config block must be exactly 'config {' on its own line" + }, + "config block header not exactly config brace": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "config block must be exactly 'config {' on its own line" + }, + "config value with single-quoted string": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "single-quoted strings are not supported; use double quotes (\"...\") instead" + }, + "workflow body content after brace without closing on same line": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "expected newline after '{'" + }, + "runtime keys in inline workflow config": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "expected newline after '{'" + }, + "rule body content after brace without closing on same line": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "expected newline after '{'" + }, + "prompt triple-quote closing with invalid trailing content": { + "file": "input.jh", + "line": 4, + "col": 1, + "code": "E_PARSE", + "message": "closing \"\"\" must be alone, or followed by returns \"{ ... }\" (same line)" + }, + "prompt identifier body with single-quoted returns": { + "file": "input.jh", + "line": 3, + "col": 1, + "code": "E_PARSE", + "message": "single-quoted strings are not supported; use double quotes (\"...\") instead" + }, + "prompt identifier body with non-returns trailing content": { + "file": "input.jh", + "line": 3, + "col": 1, + "code": "E_PARSE", + "message": "after prompt body expected keyword \"returns\" with quoted schema (e.g. returns \"{ type: string }\") or end of line" + }, + "prompt identifier body with unterminated returns schema": { + "file": "input.jh", + "line": 3, + "col": 1, + "code": "E_PARSE", + "message": "unterminated returns schema string" + }, + "script body with invalid rhs character": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "script body must be a backtick or fenced block: script broken = `...` or script broken = ```...```" + }, + "match triple-quote arm closing with trailing content": { + "file": "input.jh", + "line": 6, + "col": 1, + "code": "E_PARSE", + "message": "closing \"\"\" in match arm must not have content on the same line" + }, + "match: opening triple-quote in arm with content on same line": { + "file": "input.jh", + "line": 4, + "col": 1, + "code": "E_PARSE", + "message": "opening \"\"\" in match arm must not have content on the same line" + }, + "match: unterminated triple-quoted block in arm": { + "file": "input.jh", + "line": 4, + "col": 1, + "code": "E_PARSE", + "message": "unterminated triple-quoted block in match arm: no closing \"\"\" before end of match" + }, + "send with empty payload rejected": { + "file": "input.jh", + "line": 3, + "col": 12, + "code": "E_PARSE", + "message": "send requires an explicit payload: channel <- \"message\" — bare forward syntax (channel <-) has been removed" + }, + "config after semicolon-separated steps in workflow": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "unexpected content after log string: '; config { agent.backend = \"claude\" }'" + }, + "mock prompt single-quoted string rejected": { + "file": "input.test.jh", + "line": 4, + "col": 2, + "code": "E_PARSE", + "message": "single-quoted strings are not supported; use double quotes (\"...\") instead" + }, + "wait in catch block rejected": { + "file": "input.jh", + "line": 6, + "col": 5, + "code": "E_PARSE", + "message": "\"wait\" has been removed from the language" + }, + "reserved keyword as parameter name": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "parameter name \"run\" is a reserved keyword" + }, + "log triple-quote with trailing content": { + "file": "input.jh", + "line": 4, + "col": 1, + "code": "E_PARSE", + "message": "unexpected content after closing \"\"\"" + }, + "logerr triple-quote with trailing content": { + "file": "input.jh", + "line": 4, + "col": 1, + "code": "E_PARSE", + "message": "unexpected content after closing \"\"\"" + }, + "fail triple-quote with trailing content": { + "file": "input.jh", + "line": 4, + "col": 1, + "code": "E_PARSE", + "message": "unexpected content after closing \"\"\"" + }, + "return triple-quote with trailing content": { + "file": "input.jh", + "line": 4, + "col": 1, + "code": "E_PARSE", + "message": "unexpected content after closing \"\"\"" + }, + "send triple-quote with trailing content": { + "file": "input.jh", + "line": 5, + "col": 1, + "code": "E_PARSE", + "message": "unexpected content after closing \"\"\"" + }, + "wait in inline catch statement": { + "file": "input.jh", + "line": 5, + "col": 1, + "code": "E_PARSE", + "message": "\"wait\" has been removed from the language" + }, + "catch block: assignment without const rejected": { + "file": "input.jh", + "line": 6, + "col": 5, + "code": "E_PARSE", + "message": "assignment without \"const\" is no longer supported; use \"const x = run helper()\"" + }, + "catch block: prompt assign without const": { + "file": "input.jh", + "line": 6, + "col": 5, + "code": "E_PARSE", + "message": "use \"const name = prompt ...\" to capture the prompt result (e.g. const answer = prompt \"...\" )" + }, + "const run with old inline script syntax": { + "file": "input.jh", + "line": 2, + "col": 13, + "code": "E_PARSE", + "message": "inline script syntax has changed: use const name = run `body`(args) instead of run script(args) \"body\"" + }, + "top-level const with invalid name": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "unsupported top-level statement: const 123bad = \"hello\"" + }, + "wait in workflow body rejected": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "\"wait\" has been removed from the language" + }, + "wait in workflow body after other steps rejected": { + "file": "input.jh", + "line": 3, + "col": 3, + "code": "E_PARSE", + "message": "\"wait\" has been removed from the language" + }, + "shell redirection after const run call rejected": { + "file": "input.jh", + "line": 3, + "col": 1, + "code": "E_PARSE", + "message": "unexpected content after run call: '| grep ok'; shell redirection (>, |, &) is not supported — use a script block" + }, + "shell redirection after send run call rejected": { + "file": "input.jh", + "line": 4, + "col": 1, + "code": "E_PARSE", + "message": "unexpected content after run call: '> output.txt'; shell redirection (>, |, &) is not supported — use a script block" + }, + "prompt body must be string or identifier not number": { + "file": "input.jh", + "line": 2, + "col": 13, + "code": "E_PARSE", + "message": "prompt body must be a quoted string, identifier, or triple-quoted block: const name = prompt \"text\" | prompt myVar | prompt \"\"\" ... \"\"\"" + }, + "send rhs trailing content after braced var": { + "file": "input.jh", + "line": 4, + "col": 12, + "code": "E_PARSE", + "message": "send right-hand side must be a quoted string (\"...\"), a variable ($name or ${...}), or \"run [args]\" — not raw shell; use a script or use const" + }, + "config string key with boolean value": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "agent.default_model must be a string" + }, + "config boolean key with string value": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "run.debug must be true or false" + }, + "config keyword alone on a line": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "unsupported top-level statement: config" + }, + "send multiline string without triple quotes": { + "file": "input.jh", + "line": 3, + "col": 5, + "code": "E_PARSE", + "message": "multiline strings use triple quotes: channel <- \"\"\"...\"\"\"" + }, + "send triple-quoted payload valid": { + "file": "", + "line": 0, + "col": 0, + "code": "OK", + "message": "compilation succeeded but fixture expected a parse error" + }, + "top-level script single backtick unterminated": { + "file": "input.jh", + "line": 1, + "col": 1, + "code": "E_PARSE", + "message": "unterminated inline script backtick — missing closing `" + }, + "inline catch return without value": { + "file": "input.jh", + "line": 5, + "col": 18, + "code": "E_PARSE", + "message": "catch requires explicit bindings: catch () { ... }" + }, + "run catch unterminated multiline block": { + "file": "input.jh", + "line": 5, + "col": 16, + "code": "E_PARSE", + "message": "unterminated catch block, expected \"}\"" + }, + "ensure catch unterminated multiline block": { + "file": "input.jh", + "line": 5, + "col": 18, + "code": "E_PARSE", + "message": "unterminated catch block, expected \"}\"" + }, + "inline script fenced unterminated in rule body": { + "file": "input.jh", + "line": 2, + "col": 1, + "code": "E_PARSE", + "message": "unterminated fenced block: no closing ``` before end of file" + }, + "fail triple-quote opening with content on same line": { + "file": "input.jh", + "line": 2, + "col": 1, + "code": "E_PARSE", + "message": "opening \"\"\" must not have content on the same line" + }, + "return triple-quote opening with content on same line": { + "file": "input.jh", + "line": 2, + "col": 1, + "code": "E_PARSE", + "message": "opening \"\"\" must not have content on the same line" + }, + "logerr triple-quote opening with content on same line": { + "file": "input.jh", + "line": 2, + "col": 1, + "code": "E_PARSE", + "message": "opening \"\"\" must not have content on the same line" + }, + "unterminated triple-quoted send block": { + "file": "input.jh", + "line": 3, + "col": 1, + "code": "E_PARSE", + "message": "unterminated triple-quoted block: no closing \"\"\" before end of file" + }, + "run catch without bindings bare catch block": { + "file": "input.jh", + "line": 5, + "col": 16, + "code": "E_PARSE", + "message": "catch requires explicit bindings: catch () { ... }" + }, + "log triple-quote opening with content on same line": { + "file": "input.jh", + "line": 2, + "col": 1, + "code": "E_PARSE", + "message": "opening \"\"\" must not have content on the same line" + }, + "const triple-quote opening with content on same line": { + "file": "input.jh", + "line": 2, + "col": 1, + "code": "E_PARSE", + "message": "opening \"\"\" must not have content on the same line" + }, + "unterminated triple-quoted log block": { + "file": "input.jh", + "line": 2, + "col": 1, + "code": "E_PARSE", + "message": "unterminated triple-quoted block: no closing \"\"\" before end of file" + }, + "unterminated triple-quoted fail block": { + "file": "input.jh", + "line": 2, + "col": 1, + "code": "E_PARSE", + "message": "unterminated triple-quoted block: no closing \"\"\" before end of file" + }, + "unterminated triple-quoted return block": { + "file": "input.jh", + "line": 2, + "col": 1, + "code": "E_PARSE", + "message": "unterminated triple-quoted block: no closing \"\"\" before end of file" + }, + "unterminated triple-quoted const block": { + "file": "input.jh", + "line": 2, + "col": 1, + "code": "E_PARSE", + "message": "unterminated triple-quoted block: no closing \"\"\" before end of file" + }, + "inline run catch with single fail statement": { + "file": "", + "line": 0, + "col": 0, + "code": "OK", + "message": "compilation succeeded but fixture expected a parse error" + }, + "send rhs with run to workflow": { + "file": "", + "line": 0, + "col": 0, + "code": "OK", + "message": "compilation succeeded but fixture expected a parse error" + }, + "send rhs with run to script": { + "file": "", + "line": 0, + "col": 0, + "code": "OK", + "message": "compilation succeeded but fixture expected a parse error" + }, + "return with bash exit code rejected in workflow": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "bash exit codes are only valid in scripts; use return \"...\" for a workflow value" + }, + "return with bash dollar-question rejected in workflow": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "bash exit codes are only valid in scripts; use return \"...\" for a workflow value" + }, + "if equality operator with regex operand rejected": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "operator \"==\" requires a string operand (\"...\"), not a regex" + }, + "if inequality operator with regex operand rejected": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "operator \"!=\" requires a string operand (\"...\"), not a regex" + }, + "if regex-match operator with string operand rejected": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "operator \"=~\" requires a regex operand (/pattern/), not a string" + }, + "if negative regex-match operator with string operand rejected": { + "file": "input.jh", + "line": 2, + "col": 3, + "code": "E_PARSE", + "message": "operator \"!~\" requires a regex operand (/pattern/), not a string" + }, + "const run async with inline script rejected": { + "file": "input.jh", + "line": 2, + "col": 13, + "code": "E_PARSE", + "message": "run async is not supported with inline scripts" + }, + "const run async with invalid reference": { + "file": "input.jh", + "line": 2, + "col": 13, + "code": "E_PARSE", + "message": "const ... = run async must target a valid reference" + } +}