Skip to content

Implement: Wire-level artifact upload endpoint (#166)#297

Merged
ealt merged 16 commits into
mainfrom
impl/issue-166-wire-artifact-upload
Jun 7, 2026
Merged

Implement: Wire-level artifact upload endpoint (#166)#297
ealt merged 16 commits into
mainfrom
impl/issue-166-wire-artifact-upload

Conversation

@ealt

@ealt ealt commented Jun 4, 2026

Copy link
Copy Markdown
Owner

Summary

Delivers the additive half of wire-level artifact transfer (#166) — the contract Phase 13d's S3/GCS backend (#174) plugs into:

  • Spec (chapter 7 §16 deposit_artifact / fetch_artifact; chapter 2 §1.5 opaque artifacts_uri; chapter 8 §5.1/§5.5; chapter 9 v1 "Artifact transfer" group) defines an opaque URI deposited/fetched over the wire — workers never need a shared filesystem, can't read/overwrite each other's submissions, and bytes survive instance loss.
  • Storage — an ArtifactBackend blob Protocol (bytes-only, so 13d's S3/GCS drops in without wire changes) with FileArtifactBackend + InMemoryArtifactBackend, plus a Store-side ArtifactMetadata row (created_by for the ACL) across in-memory/sqlite/postgres.
  • WirePOST /v0/experiments/{E}/artifacts (multipart, streamed size cap → 413) and GET .../artifacts?uri=<artifacts_uri> (full opaque URI presented verbatim; per-row ACL: depositor or admin-class → else 403); StoreClient.deposit_artifact / fetch_artifact; task-store-server --max-artifact-bytes + server-private --artifact-blob-dir.
  • Conformance — a v1 "Artifact transfer" scenario group (deposit 201, content integrity, cross-worker 403, admin fetch, unknown 404, fresh-id-per-deposit, safe-delivery headers).

Why: the artifact substrate assumed a shared ${EDEN_EXPERIMENT_DATA_ROOT}/artifacts/ mount between every worker, the task-store-server, and the web-ui — which breaks for any distributed deployment. This moves artifacts to wire deposit/retrieve behind an opaque URI; the physical layout becomes the server's private business.

Impl-stage codex-review converged over 10 rounds (record: docs/plans/review/issue-166/impl/2026-06-04/round-0.md) — all actionable findings fixed (notably: streamed cap enforced before buffering; the §16.2-vs-§1.5 opaque-URI contradiction resolved by presenting the full URI; ArtifactStore split off the shared Store Protocol so StoreClient still conforms; blob-dir/--artifacts-dir overlap rejected to close an ACL bypass; fsync-before-metadata durability).

What this does NOT cover

This PR is purely additive — nothing emits eden://artifacts/… yet; the file:// write/read path is untouched and still in use. Deferrals (each a filed issue):

Fresh-operator walkthrough

N/A for the deposit/fetch wire endpoints — they are not yet adopted by any operator-facing workflow (web-ui / host writers still use file://; adoption is the #290 cutover). The only new operator-facing surface is two task-store-server CLI flags (--max-artifact-bytes, --artifact-blob-dir), exercised by unit tests: --artifact-blob-dir overlapping --artifacts-dir fails startup with a clear message; omitting it logs a non-durable-backend warning. No web-ui or e2e operator workflow changes.

Test plan

  • markdownlint-cli2 (pinned) — clean
  • check-jsonschema metaschema (incl. new artifact-metadata + wire/deposit-artifact-response schemas) — clean
  • spec-xref-check.py — all §-refs resolve
  • check-rename-discipline.py — clean
  • check-complexity.py — 0 blocking
  • uv run ruff check / uv run pyright — clean
  • uv run pytest -q reference/packages reference/services — 2161 passed, 235 skipped
  • uv run pytest -q conformance/ -n auto — 263 passed, 13 skipped
  • check_citations.py — 275 scenarios cite valid MUSTs
  • Compose smoke / e2e — unchanged by this additive PR (no Compose/substrate edits; deposit endpoint unused by the stack until Migrate artifact writers to wire deposit + retire the file:// read path (#166 Wave 3 cutover) #290)

Related issues

Advances #166 (the endpoint + contract + conformance). Deferrals: #290 (cutover), #174 (S3/GCS), #288 (cross-role ACL), #289 (orphan GC), #102 (checkpoint content-addressing). Tooling: #296 (pre-push hook).

🤖 Generated with Claude Code

ealt and others added 15 commits June 4, 2026 10:01
…ifact transfer

Spec amendments for wire-level artifact deposit/fetch:
- 02-data-model.md §1.5: artifacts_uri is opaque; reference deployment
  issues eden://artifacts/<opaque-id> resolved via chapter-7 §16.
- 07-wire-protocol.md: new §16 (Artifact operations — deposit_artifact /
  fetch_artifact, opaque URI, streamed size cap, safe-delivery headers,
  per-row fetch ACL); renumber Implementation latitude §16 → §17; new §9
  error row eden://error/payload-too-large (413); §13.3 row-scoped ACL
  classification; §12 conformance statement.
- 08-storage.md §5.1 reference note re-pointed from hierarchical file://
  to opaque eden:// + private blob backend; new §5.5 metadata row note.
- 09-conformance.md §4/§5: v1 "Artifact transfer" scenario group.
- worker-host-subprocess.md §2.3/§4/§10: staging-then-deposit flow,
  server-internal layout.

New schemas + contract:
- spec/v0/schemas/artifact-metadata.schema.json
- spec/v0/schemas/wire/deposit-artifact-response.schema.json
- eden_contracts.ArtifactMetadata Pydantic model + parity/roundtrip wiring.

Gates: markdownlint, check-jsonschema (metaschema), spec-xref-check,
check-rename-discipline, eden-contracts pytest (240 passed) all green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- eden_storage.artifact_backend: ArtifactBackend Protocol +
  FileArtifactBackend (exclusive-create / atomic-link + O_NOFOLLOW read
  guards; opaque-id grammar validated so no client path / traversal
  surface) + InMemoryArtifactBackend. Bytes-only by design so a future
  S3/GCS backend (13d) drops in without wire changes.
- Store Protocol gains create_artifact / read_artifact (no event — the
  artifact store is distinct from the event log per 08-storage.md §5);
  _ArtifactOpsMixin + _Tx.artifacts + _get_artifact across in-memory /
  sqlite / postgres backends. sqlite + postgres v8 migration: artifact
  metadata table.
- test_artifact_store.py: metadata create/read/duplicate/no-event
  parametrized across all three backends; blob backend roundtrip /
  exclusive-create / NotFound / invalid-id / symlink-leaf guards.

Gates: ruff, pyright (storage src + eden-wire src), check-complexity,
eden-storage pytest (407 passed, 199 postgres-skipped) all green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- eden_wire.routers.artifacts: POST /v0/experiments/{E}/artifacts
  (deposit) + GET .../{A} (fetch). Streamed size-cap enforced on the raw
  body BEFORE buffering (Starlette's max_part_size does NOT cap file
  parts — confirmed), then re-parsed via a replay-receive Request; exact
  artifact cap re-checked post-parse → 413 eden://error/payload-too-large.
  Fetch enforces the §16.2 per-row ACL (depositor / admin / admins-group;
  else 403) and returns exact bytes with attachment + nosniff headers.
- errors.PayloadTooLarge (normative 413), registered in the envelope map
  + server exception handlers.
- models.DepositArtifactResponse + wire-schema parity test.
- RouterDeps carries artifact_backend + max_artifact_bytes; make_app
  resolves a default backend (File rooted at artifacts_dir, else
  in-memory) via _resolve_artifact_backend; includes the artifact router.
- StoreClient.deposit_artifact (multipart) / fetch_artifact (bytes);
  _request gains files= support. No read-back ladder (§16.1).
- task-store-server: --max-artifact-bytes flag → build_app → make_app.
- Tests: deposit 201 / fetch bytes / 404 / 413 / experiment-mismatch /
  missing-part 400; auth ACL matrix (depositor / admin / admins-group /
  different-worker 403 / unauthenticated 401); StoreClient roundtrip.

Gates: ruff, pyright, check-complexity, eden-wire + task-store-server
pytest (276 passed) all green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- conformance/scenarios/test_artifact_transfer.py (CONFORMANCE_GROUP =
  "Artifact transfer"): deposit 201 + opaque uri + size/content_type;
  fetch-by-depositor exact bytes (content integrity); fetch-by-admin;
  fetch-by-different-worker → 403 (cross-worker isolation, §16.2 ACL);
  fetch-unknown → 404; fresh-id-per-deposit (no-overwrite projection).
  Each docstring cites a MUST-bearing §16.1/§16.2/§5.3 section within the
  "Artifact transfer" §5 group entry (check_citations three-legged).
- WireClient.request gains files= support; encodes multipart via a
  standalone httpx.Request so the boundary Content-Type overrides the
  client's default application/json (else the server sees JSON + no part).
- error_vocabulary: eden://error/payload-too-large added to
  IUT_OPTIONAL_TYPES (the deposit cap is operator-configured latitude, not
  portably triggerable; reference 413 coverage lives in eden-wire units).

Gates: check_citations (275 scenarios OK), full conformance suite
(263 passed, 13 skipped) -n auto, ruff, pyright all green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- CHANGELOG [Unreleased] entry for wire artifact transfer (Waves 0-2+4),
  enumerating every deferral with its issue link (#290 cutover, #174
  S3/GCS, #288 cross-role ACL, #289 orphan-GC, #102 checkpoint
  content-addressing) + the over-cap-not-portably-conformance-testable
  note.
- roadmap one-liner (partial #166: contract+endpoints+conformance;
  cutover → #290).
- glossary: artifacts_uri row + new eden://artifacts/<id> scheme entry
  (role-disjoint from eden://error/) + ArtifactBackend + artifact-store
  rows.

Deferral issues filed: #288 (cross-role ACL), #289 (orphan-GC),
#290 (hard cutover). Gate: markdownlint clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- [P1/P2] Decouple the §16 blob backend from --artifacts-dir: make_app no
  longer derives FileArtifactBackend from artifacts_dir (read-only mount +
  shared with the legacy /_reference route → write-fail + ACL leak).
  Default is now InMemoryArtifactBackend (test posture); the
  task-store-server gains a server-PRIVATE writable --artifact-blob-dir
  flag for the durable FileArtifactBackend, and logs a warning when unset
  (deposits non-durable).
- [P2-spec] Revert worker-host-subprocess §2.3/§4/§10 to the shipped
  file:// layout, each with a "Wire-transfer migration (#166)" forward-note
  to the deferred #290 cutover (the informative binding must match shipped
  host behavior; the normative 07 §16 / 02 §1.5 / 08 §5 wire surface stays).
- [P3] Add the `artifact` table to _READONLY_GRANT_TABLES so the readonly
  Postgres role keeps SELECT after the v8 bump.
- Commit the impl-stage codex-review record (round-0 + convergence).

Gates: ruff, pyright, check-complexity, markdownlint, spec-xref-check,
eden-wire artifact + task-store-server pytest (31 passed) all green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- [P1] Move create_artifact/read_artifact OFF the shared Store Protocol
  (StoreClient can't implement them — no wire surface) onto a separate
  ArtifactStore Protocol; the wire handler casts deps.store (always a
  concrete backend server-side). Fixes 12 full-repo pyright errors at the
  StoreClient→Store assignment sites that per-package pyright missed.
- [P2] Add eden://error/payload-too-large to error.schema.json enum.
- [P2] Declare python-multipart in eden-wire deps (Starlette form() needs
  it; was only transitive via web-ui).
- [P2] Map malformed-multipart parser errors to problem+json BadRequest
  (+ regression test).
- [P2] FileArtifactBackend.store uses a unique tempfile.mkstemp temp so
  crash residue can't alias + truncate a committed inode.

Gates: full uv run pyright (0 errors), ruff, check-complexity,
metaschema, eden-storage + eden-wire + task-store-server pytest all green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- [P2] _read_body_capped checks len(body)+len(chunk) > limit BEFORE
  extending, so a single over-limit chunk is never buffered.
- [P2] fetch sets Content-Type via the raw header (not Response
  media_type) so Starlette doesn't append "; charset=utf-8" — the
  recorded content_type is returned verbatim (§16.2). + regression test.

Gates: ruff, eden-wire artifact pytest (16 passed) green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- [P2] Deposit handler requires exactly one multipart 'file' part
  (form.multi_items() == [("file", UploadFile)]); rejects multiple parts
  / stray fields / wrong key with 400 bad-request instead of silently
  picking one (§16.1). + regression test.

Gates: ruff, pyright, eden-wire artifact + artifact-transfer conformance
pytest (23 passed) green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- [P1] Fix CI-blocking pyright error: test_no_event_emitted_for_artifact
  kept `store` typed as Store for events() reads, casting only the
  create_artifact call to ArtifactStore (the full-repo pyright gate
  includes tests; per-file src runs missed it).
- [P1, declined-with-rationale] CLI durable-backend: strengthened the
  no-blob-dir warning (spells out the 404-after-restart failure mode);
  declined erroring/disabling because no writer deposits yet (cutover
  deferred to #290) and forcing the flag would regress existing
  Compose/manual startup. Extracted _resolve_artifact_blob_dir helper to
  keep main() under the complexity threshold.

Gates: full pyright (0), ruff, check-complexity, markdownlint,
task-store-server pytest (17) green. codex-review record updated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- [P2] StoreClient.deposit_artifact encodes the multipart body via a
  standalone httpx.Request + sends raw content with the boundary
  Content-Type set explicitly, so a caller-injected client with a default
  application/json Content-Type still produces a valid deposit (_request
  files= → content=). + regression test.
- [P2] StoreClient.fetch_artifact accepts the full opaque artifacts_uri
  (eden://artifacts/<id>) or the bare id (final path segment), so callers
  never parse the opaque URI by hand. + regression test.

Gates: ruff, pyright (0), full eden-wire pytest (263) green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- [P2] Conformance Artifact-transfer URI assertion is now scheme-agnostic
  (non-empty RFC-3986 URI, not the reference eden:// scheme) — artifacts_uri
  is opaque + impl-defined (02 §1.5, 08 §5.1); resolvability proven by the
  fetch round-trip.
- [P2] FileArtifactBackend.store fsyncs the blob fd before link + fsyncs the
  root dir after, so a crash can't leave a committed artifacts_uri whose
  bytes are unflushed (§5.2 durability; matches SqliteStore synchronous=FULL).
- [P2, declined] Precise per-part streaming cap: the raw-body streamed cap
  (cap + _MULTIPART_SLACK) already bounds peak memory + rejects grossly-over
  uploads mid-stream; per-part counting needs replacing Starlette's parser —
  disproportionate. Documented.

Gates: ruff, pyright (0), check-complexity, eden-storage artifact +
artifact-transfer conformance pytest green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- [P2] build_app rejects --artifact-blob-dir overlapping --artifacts-dir
  (equal / nested / parent) via _reject_blob_dir_overlap → SystemExit at
  startup, closing an ACL-bypass where private §16 blobs would be served
  by the unauthenticated /_reference route. + two tests.
- The other two round-7 findings re-raise the round-6 (per-part streaming
  cap) and round-4 (durable-store-deposit posture) declines; rationale
  unchanged and documented in the codex-review record.

Gates: ruff, pyright (0), check-complexity, task-store-server pytest
(19) green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- [P1] fetch_artifact now takes the FULL opaque artifacts_uri verbatim as
  the ?uri= query param (GET /v0/experiments/{E}/artifacts?uri=<uri>),
  resolving the §16.2-vs-§1.5 contradiction (the client no longer extracts
  an id from the opaque URI; the issuing server maps it back). Works for
  any RFC-3986 scheme (S3 URLs, URNs), not just eden://artifacts/<id>.
  Spec §16/§16.2 + handler (_opaque_id_from_uri; missing uri → 400,
  unrecognized → 404) + StoreClient.fetch_artifact + conformance + tests.
- [P2] Conformance fetch scenario asserts the §16.2 safe-delivery headers
  (recorded Content-Type + Content-Disposition: attachment + nosniff).
- Standing declines unchanged (per-part streaming cap; durable-store
  deposit posture) — documented in the review record.

Gates: full eden-wire pytest (265) + conformance (-n auto, 263) +
check_citations + pyright (0) + ruff + spec-xref + markdownlint green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- [P2] Conformance URI assertion is case-insensitive on the scheme
  ([a-zA-Z]...) so RFC-3986 uppercase schemes (URN:, S3://) aren't rejected.
- [P2] chapter-9 §5 Artifact-transfer row no longer claims mandatory
  over-cap 413 coverage (the cap is operator-configured latitude; 413 is
  IUT-optional / reference-unit-tested); row now also names the
  safe-delivery-header assertion.

The two standing declines were NOT re-raised this round. Gates:
spec-xref, markdownlint, ruff, artifact-transfer conformance (6) green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@ealt ealt enabled auto-merge (squash) June 4, 2026 20:56
…rtifact-upload

# Conflicts:
#	CHANGELOG.md
#	docs/roadmap.md
#	reference/packages/eden-contracts/src/eden_contracts/__init__.py
#	reference/packages/eden-storage/src/eden_storage/_postgres_schema.py
#	reference/packages/eden-storage/src/eden_storage/_schema.py
#	reference/packages/eden-wire/src/eden_wire/models.py
@ealt ealt merged commit 44bad4b into main Jun 7, 2026
21 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant