perf: add DD-061 benchmark harness by larimonious · Pull Request #133 · ntntlang/ntnt

larimonious · 2026-06-21T00:43:58Z

Summary

Adds the DD-061 PR 2 benchmark harness at scripts/bench/run-benchmarks.py.
Adds representative perf fixtures under examples/perf/ for plaintext, JSON, route params, compute loop, templates/partials/loops, CLI compute, and optional PostgreSQL routes.
Updates DD-061 to mark PR 1/2 complete and recommend the template AST cache as PR 3.

Notes

Default runs are database-free; DB routes are opt-in with --include-db and DATABASE_URL.
Results are written to target/perf-bench/ as JSON and Markdown.
Greptile local review got several rounds of harness cleanup; final rerun timed out after dispatch, so this is pushed for the PR-side check rather than continuing the tiny robot treadmill.
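
The opt-in behaviour for DB routes described in the notes above could be gated roughly as follows. This is a minimal sketch, not the harness's actual code: the function names `parse_args` and `db_routes_enabled` are hypothetical, and only the `--include-db` flag and the `DATABASE_URL` variable come from the PR description.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse only the flag relevant to this sketch (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="sketch of the opt-in DB flag")
    parser.add_argument(
        "--include-db",
        action="store_true",
        help="also benchmark the PostgreSQL routes",
    )
    return parser.parse_args(argv)


def db_routes_enabled(args, env=None):
    """DB routes run only when the flag is set AND DATABASE_URL is non-empty."""
    env = os.environ if env is None else env
    return bool(args.include_db and env.get("DATABASE_URL"))
```

Requiring both conditions keeps the default run database-free even on machines where `DATABASE_URL` happens to be exported.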

Verification

python3 AST parse for scripts/bench/run-benchmarks.py
./target/dev-release/ntnt validate examples/perf
./target/dev-release/ntnt lint examples/perf --strict (suggestions only: missing contracts in perf fixture functions)
python3 scripts/bench/run-benchmarks.py --quick --duration 1
python3 scripts/bench/run-benchmarks.py --duration 500ms (expected failure: rejects unsupported duration units)
git diff --check
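
The duration check exercised in the verification list (`--duration 1` accepted, `--duration 500ms` rejected) might look something like the sketch below. The set of accepted suffixes is an assumption: the PR only states that bare `1` works and `500ms` is rejected, so the `s`/`m`/`h` units and the function name `parse_duration_seconds` are illustrative.

```python
import re

# Assumed unit table: bare numbers are seconds; s/m/h are hypothetical extras.
_UNITS = {"": 1, "s": 1, "m": 60, "h": 3600}
_PATTERN = re.compile(r"(\d+)([smh]?)")


def parse_duration_seconds(text):
    """Return the duration in whole seconds or raise ValueError."""
    match = _PATTERN.fullmatch(text.strip())
    if match is None:
        # "500ms" lands here: "ms" is not a single supported unit suffix.
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = match.groups()
    seconds = int(value) * _UNITS[unit]
    if seconds <= 0:
        raise ValueError("duration must be positive")
    return seconds
```

Using `fullmatch` rather than `match` is what makes `500ms` fail instead of being silently read as 500 minutes.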

greptile-apps · 2026-06-21T00:49:06Z

Greptile Summary

This PR implements the DD-061 benchmark harness (PR 2 of the performance roadmap): a Python script at scripts/bench/run-benchmarks.py that builds the dev-release binary, starts a representative ntnt HTTP server, runs routes with wrk or a sequential urllib fallback, captures interpreter-only CLI timings, and writes JSON + Markdown results to target/perf-bench/.

scripts/bench/run-benchmarks.py — 415-line harness with wrk/urllib runners, duration validation, process-group cleanup, git context capture, and Markdown report generation. DB routes are opt-in via --include-db + DATABASE_URL.
examples/perf/ — new fixture directory with server.tnt (plaintext, JSON, route-param, compute, template, and optional PostgreSQL routes), compute_cli.tnt for interpreter-only timing, HTML templates, and a README.
design-docs/dd-061-interpreter-performance-roadmap.md — marks PR 1 and PR 2 complete and advances the recommendation to PR 3 (automatic template AST cache).

Confidence Score: 5/5

Safe to merge — adds opt-in benchmark tooling and fixtures with no effect on CI or production paths.

All changes are additive: a new Python harness, ntnt fixture files, HTML templates, and a doc update. Nothing touches production code paths, CI gates, or existing tests. The harness is guarded behind an explicit invocation and writes results only to target/perf-bench/.

No files require special attention; the two minor logic gaps noted are in the benchmark reporting path only.

Important Files Changed

| Filename | Overview |
| --- | --- |
| scripts/bench/run-benchmarks.py | New 415-line benchmark harness. wrk/urllib runner, CLI timing, JSON+Markdown output. Two minor logic edge cases: trailing slash in base URL doubles path separators; urllib loop marks a zero-request run as ok. |
| examples/perf/server.tnt | New HTTP fixture server with plaintext, JSON, route-param, compute, template, and optional DB routes. O(n²) array construction in make_rows is noted in-code as a known limitation; rows are built once at startup so per-request timings are unaffected. |
| examples/perf/compute_cli.tnt | New interpreter-only CLI benchmark fixture. Simple, deterministic compute loop; no issues. |
| examples/perf/views/page.html | New template exercising layout, partial include, and for-loop with empty fallback. Correct template syntax. |
| examples/perf/README.md | New documentation covering quick/full suite invocations, optional DB routes, and benchmarked route inventory. Accurate and consistent with the harness. |
| design-docs/dd-061-interpreter-performance-roadmap.md | Marks PR 1 and PR 2 checklist items as complete and updates the current recommendation to target PR 3 (template AST cache). Documentation-only change. |
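
The two edge cases noted for run-benchmarks.py (a trailing slash in the base URL doubling the path separator, and a zero-request run being marked ok) could be closed along these lines. This is one possible approach, not the fix that was actually committed; `join_url` and `run_status` are hypothetical names.

```python
def join_url(base_url, path):
    """Join base and path with exactly one slash between them."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def run_status(requests, errors):
    """Classify a run; a run that completed no requests is never 'ok'."""
    if requests == 0:
        return "no_requests"
    return "ok" if errors == 0 else "errors"
```

For example, `join_url("http://127.0.0.1:8080/", "/json")` and `join_url("http://127.0.0.1:8080", "json")` both produce `http://127.0.0.1:8080/json`, and `run_status(0, 0)` reports `no_requests` rather than `ok`.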

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant User
    participant Harness as run-benchmarks.py
    participant Cargo
    participant Server as ntnt server.tnt
    participant WRK as wrk / urllib
    participant CLI as ntnt compute_cli.tnt
    participant FS as target/perf-bench/

    User->>Harness: python3 run-benchmarks.py [--quick] [--include-db]
    Harness->>Cargo: cargo build --profile dev-release
    Cargo-->>Harness: ntnt binary
    Harness->>Server: Popen(ntnt run server.tnt) + wait_for_server
    Server-->>Harness: HTTP 2xx on /
    loop for each HTTP benchmark
        Harness->>WRK: run route (wrk -d Ns or urllib loop)
        WRK-->>Harness: RPS / latency result
    end
    Harness->>Server: SIGTERM (killpg)
    Harness->>CLI: ntnt run compute_cli.tnt x N runs
    CLI-->>Harness: elapsed_ms, stdout
    Harness->>FS: write ntnt-perf-stamp.json
    Harness->>FS: write ntnt-perf-stamp.md
    Harness-->>User: paths to JSON + Markdown output

Reviews (2): Last reviewed commit: "fix: address benchmark review diagnostic..."

perf: add DD-061 benchmark harness

79b4453

greptile-apps Bot reviewed Jun 21, 2026

Comment thread scripts/bench/run-benchmarks.py Outdated

Comment thread scripts/bench/run-benchmarks.py Outdated

Comment thread examples/perf/server.tnt

fix: address benchmark review diagnostics

7a50a87

larimonious wants to merge 2 commits into main from perf/dd061-benchmark-harness