refactor(workspace): move all 16 crates under crates/#23
Conversation
Flatten-to-nested workspace cleanup: every first-party member crate moves from <root>/<crate>/ to <root>/crates/<crate>/. Vendored submodule forks (deps/), patches/, tools/ and all non-crate dirs stay at the repo root — the whole point is to separate our crates from third-party/support trees. Because ALL crates move together, intra-workspace `path = "../sibling"` deps stay valid (siblings remain siblings). Only references that escape the moved set change: - root Cargo.toml: members -> crates/<name> - crates/gateway, crates/media-service: ../deps -> ../../deps (5 paths) - runtime constants (repo-root-relative): backend/.local -> crates/backend/.local (llm-access support.rs, email-notifier lib.rs) - backend main.rs dev defaults (backend-crate-relative): ../data -> ../../data; ../frontend/dist kept (frontend moved into crates/ too, so it stays a sibling) — asymmetry documented inline - build/CI/scripts: Makefile, deploy.yml, 9 scripts/*.sh frontend dir & FRONTEND_DIST_DIR defaults & duckdb build-config manifest paths - AGENTS.md structure block: crates/<name>, count 14 -> 16 ci_affected.py is data-driven from members and needs no edit. No build.rs anywhere; all include_str!/CARGO_MANIFEST_DIR refs are in-crate, so zero compile-time path breakage. Cargo.lock unchanged (path deps don't affect it). Verified: cargo metadata resolves all 16 packages; clippy -D warnings --all-targets on the 5 touched crates (transitively builds all 15 non-frontend crates incl. ../../deps escapes); frontend wasm check; backend/email/llm-access test suites pass; deps/lance & deps/lancedb submodules untouched. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
There was a problem hiding this comment.
Code Review
This pull request restructures the repository by moving all workspace crates into a crates/ subdirectory and updates the workspace configuration, build scripts, and documentation accordingly. The review comments highlight potential issues with hardcoded relative paths (for database URIs, frontend distribution directory, and email configuration files) when running the application from different working directories (such as the workspace root versus individual crate directories), and suggest robust path resolution logic to handle these cases.
| // Dev defaults assume `cargo run` from this crate dir (crates/backend). The | ||
| // `data/` dir stays at the repo root, so escape two levels; real runs set | ||
| // these env vars to absolute paths. | ||
| let db_uri = env::var("LANCEDB_URI").unwrap_or_else(|_| "../../data/lancedb".to_string()); | ||
| let comments_db_uri = env::var("COMMENTS_LANCEDB_URI") | ||
| .unwrap_or_else(|_| "../../data/lancedb-comments".to_string()); | ||
| let music_db_uri = | ||
| env::var("MUSIC_LANCEDB_URI").unwrap_or_else(|_| "../data/lancedb-music".to_string()); | ||
| env::var("MUSIC_LANCEDB_URI").unwrap_or_else(|_| "../../data/lancedb-music".to_string()); |
There was a problem hiding this comment.
The default database paths assume the command is run from the crates/backend directory. If a developer runs the backend from the workspace root (e.g., cargo run -p staticflow-backend), these paths will resolve to ../../data/... which is outside the repository root. We can make this robust by checking if the data directory exists in the current working directory.
// Dev defaults assume `cargo run` from this crate dir (crates/backend) or the workspace root.
// The `data/` dir stays at the repo root.
let db_uri = env::var("LANCEDB_URI").unwrap_or_else(|_| {
if std::path::Path::new("data").exists() {
"data/lancedb".to_string()
} else {
"../../data/lancedb".to_string()
}
});
let comments_db_uri = env::var("COMMENTS_LANCEDB_URI").unwrap_or_else(|_| {
if std::path::Path::new("data").exists() {
"data/lancedb-comments".to_string()
} else {
"../../data/lancedb-comments".to_string()
}
});
let music_db_uri = env::var("MUSIC_LANCEDB_URI").unwrap_or_else(|_| {
if std::path::Path::new("data").exists() {
"data/lancedb-music".to_string()
} else {
"../../data/lancedb-music".to_string()
}
});There was a problem hiding this comment.
Accurate observation, but declining. These are dev-only fallback defaults that have always assumed CWD = the backend crate dir — the documented make dev-backend flow does cd crates/backend && cargo run, where ../../data/lancedb correctly resolves to repo-root/data/lancedb. Running from the workspace root was already broken before this PR: the old ../data/lancedb default resolved one level outside the repo from root, exactly as ../../data/lancedb does now — neither works from root, so this is not a regression. Every real run flow sets the env vars (start_backend_selfhosted.sh exports LANCEDB_URI/COMMENTS_LANCEDB_URI/MUSIC_LANCEDB_URI; start_backend_from_tmp.sh likewise), so the defaults are never reached in production. The suggested Path::new("data").exists() CWD-sniff is a runtime heuristic this repo steers away from, and it can misfire (any CWD that happens to contain a data/ dir selects the wrong branch). The CWD-independent mechanism — env vars — already exists and is documented.
There was a problem hiding this comment.
Follow-up / course-correction: the maintainer has standardized on the repo root as the canonical run CWD, so in a836633 I flipped these defaults to plain root-relative (data/lancedb etc.) — not the crate-dir-relative form my previous reply defended. Running from the workspace root now resolves correctly, which addresses your original concern directly and without any runtime CWD-sniff. start_backend_selfhosted.sh already cds to root before exec, and make dev-backend now runs cargo run -p static-flow-backend from root too.
| // `../frontend/dist` stays correct: frontend moved into crates/ alongside | ||
| // this crate, so it remains a sibling (crates/frontend/dist). | ||
| let frontend_dist_dir = | ||
| env::var("FRONTEND_DIST_DIR").unwrap_or_else(|_| "../frontend/dist".to_string()); |
There was a problem hiding this comment.
Similarly to the database paths, the default frontend_dist_dir path will resolve incorrectly if the backend is run from the workspace root. We can make this robust by checking if crates/frontend/dist exists in the current working directory.
// Frontend moved into crates/ alongside this crate, so it remains a sibling (crates/frontend/dist).
// Support running from either the workspace root or the crate directory.
let frontend_dist_dir = env::var("FRONTEND_DIST_DIR").unwrap_or_else(|_| {
if std::path::Path::new("crates/frontend/dist").exists() {
"crates/frontend/dist".to_string()
} else {
"../frontend/dist".to_string()
}
});There was a problem hiding this comment.
Declining, same reasoning as the db-path thread. ../frontend/dist is correct for the documented dev flow (cd crates/backend): frontend moved into crates/ so it remains a sibling (crates/frontend/dist). Production sets FRONTEND_DIST_DIR (exported by the start scripts), so this default is never hit there, and running from the workspace root was equally broken before the move (../frontend/dist from root → outside the repo). Id rather not add a runtime Path::exists() CWD-sniff to a dev-only default — the env-var override is the existing CWD-independent path.
There was a problem hiding this comment.
Follow-up: same root-CWD standardization — FRONTEND_DIST_DIR default is now crates/frontend/dist (root-relative) as of a836633, so running from the workspace root resolves correctly. Supersedes my earlier crate-dir-relative reply.
| } | ||
|
|
||
| PathBuf::from("backend/.local/email_accounts.json") | ||
| PathBuf::from("crates/backend/.local/email_accounts.json") |
There was a problem hiding this comment.
If the application is run from the crates/backend directory, returning crates/backend/.local/email_accounts.json as a fallback will resolve to crates/backend/crates/backend/... which is incorrect. We can make this robust by checking if crates/backend exists in the current working directory to determine the correct relative path.
| PathBuf::from("crates/backend/.local/email_accounts.json") | |
| if std::path::Path::new("crates/backend").exists() { | |
| PathBuf::from("crates/backend/.local/email_accounts.json") | |
| } else { | |
| PathBuf::from(".local/email_accounts.json") | |
| } |
There was a problem hiding this comment.
Declining — and note line 297 is not a functional resolution path. resolve_email_accounts_file_path returns default_path ("crates/backend/.local/...") when CWD = repo root, and the .local/... fallback (lines 292-294) when CWD = crates/backend. Line 297 is reached only when the file exists at neither location — i.e. it is genuinely missing — at which point its value is just what the caller uses to build the not-found error message, and both candidate paths would fail anyway. So the crates/backend/crates/backend/... doubling only ever appears in an error string in the file-absent case, never in a successful lookup. It intentionally mirrors DEFAULT_EMAIL_ACCOUNTS_FILE (the canonical path for the error). The suggested CWD-sniff adds a runtime heuristic to an unreachable-in-practice branch.
There was a problem hiding this comment.
Follow-up: with root as the canonical CWD, the line-297 fallback value crates/backend/.local/email_accounts.json is now exactly correct. It is still only reached when the file is absent at both the default and the .local/ fallback, so no path doubling occurs under root CWD.
There was a problem hiding this comment.
💡 Codex Review
static_flow/crates/backend/deploy.sh
Lines 72 to 73 in dc29b17
When this deployment script is run from the repo root, as implied by the surrounding target/release/... and data/lancedb paths, package preparation now fails because the commit deleted the top-level backend/ directory. In that scenario cp "backend/.env.production" ... cannot find the file, so the backend deployment archive is never created; these paths need to point at crates/backend/... or be resolved from the script location.
static_flow/crates/backend/src/music_wish_worker.rs
Lines 542 to 543 in dc29b17
For music-wish runs after a change under crates/cli or crates/shared, this freshness check still asks git about the deleted top-level cli and shared paths, so it will not notice new commits or dirty edits in the moved crates. In that case the worker can keep using an older target/release/sf-cli even though the checkout changed, which is especially risky for the storage-sensitive writes this guard was meant to protect; the paths should be updated to crates/cli and crates/shared.
static_flow/crates/backend/.env.example
Lines 3 to 5 in dc29b17
When developers or local tooling load this example from crates/backend (the file still says the paths are relative to the backend directory), ../data/... now resolves under crates/data instead of the repo-root data directory. That points the backend at the wrong LanceDB roots or creates empty local databases; these sample values need the same ../../data/... adjustment that main.rs received.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Codex found 3 real path breakages the move introduced, plus the move exposed an inconsistency in the runtime-default CWD assumption. Unify on the repo root as the canonical run CWD (start_backend_selfhosted.sh already cds there before exec; make dev-backend now runs from root too), so every default is plain repo-root-relative — no `../`, no heuristic. - crates/backend/src/music_wish_worker.rs: the sf-cli freshness guard asked git about the deleted top-level `cli`/`shared` paths, so changes in the moved crates went unnoticed and a stale sf-cli could be reused for storage-sensitive writes. Paths → `crates/cli`, `crates/shared`. - crates/backend/deploy.sh: package prep copied `backend/.env.production` and `backend/staticflow-backend.service` (run from repo root) — now gone. Paths → `crates/backend/...`. - crates/backend/.env.example: sample LanceDB paths were backend-crate-relative (`../data/...`); under root CWD that resolved to the wrong roots. Now `data/...`, comment updated. - crates/backend/src/main.rs: db/frontend dev defaults were crate-relative (`../../data`, `../frontend/dist` from my first pass). Now repo-root-relative (`data/...`, `crates/frontend/dist`), matching the canonical CWD. Real runs still override via env vars. - Makefile dev-backend: run `cargo run -p static-flow-backend` from root instead of `cd crates/backend` (also fixes a latent `cd && [ -f .env ] || cp` precedence bug that cp'd into the wrong dir on cd failure). Verified: clippy -D warnings on static-flow-backend, 80 backend tests pass, rustfmt on the 2 changed .rs files, deps submodules untouched. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Codex review — all 3 findings fixed (a836633)All three were real. Fixed and unified under "repo root is the canonical run CWD":
Also flipped |
There was a problem hiding this comment.
💡 Codex Review
After this move, the existing Node bridge test still resolves frontend/static/local_media_player_bridge.js, which no longer exists under the repo root. Running node --test crates/frontend/tests/local_media_player_bridge.test.mjs now fails immediately with ENOENT before exercising any assertions; the path needs to be updated to the new crates/frontend/static/... location or derived relative to this test file.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Codex CR (round 2) on PR #23: the Node bridge test resolved `frontend/static/local_media_player_bridge.js` via `path.resolve` against the process CWD, which broke after the crates/ move — running `node --test crates/frontend/tests/local_media_player_bridge.test.mjs` failed with ENOENT before any assertion ran. Resolve the bridge asset relative to the test file itself via `fileURLToPath(new URL('../static/...', import.meta.url))`, so it passes from the repo root, the crate dir, or anywhere. Dropped the now-unused `node:path` import. Verified: `node --test` passes 3/3 from the repo root. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Sweep the 28 living top-level docs/*.md so structural crate-root path
references (backend/src, frontend/dist, llm-access/src, Cargo.toml,
Trunk.toml, ...) point at the new crates/<name>/ location.
Whitelist-gated transform: only rewrites <crate>/<segment> when <segment>
is a real crate-root entry, never deployment paths (/mnt/llm-access/...),
URL routes (gateway/v1), module .rs refs, or prose ("shared/exclusive
锁"). The 37 dated plan snapshots under docs/superpowers/plans/ are left
intact as point-in-time historical records.
497 rewrites across 28 files; no double-prefix, no deployment/URL
collision (verified).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Codex round-2 — finding fixed (a5f2ef0) + docs path sweep (22d2939)
Docs — also swept the 28 living All four file classes that the move touched (Rust source, shell/JS scripts, Cargo manifests, docs) are now consistent under the repo-root run convention. |
`test_llm_access_duckdb_build_config.sh` greps `docs/llm-access-cdc-storage-design.zh.md` for duckdb config strings, but that doc was removed before this PR (added in abbc778, deleted in a9b8733) and is absent on master — master's own copy of this test was already failing on these two lines. The reference is unrelated to the crates/ move (it's a doc grep, not a crate path) and nothing invokes this script in CI. The code-side config those lines cross-checked is still asserted by the retained `.cargo/config.toml` + `crates/llm-access*/Cargo.toml` greps, so drop the two dead doc-grep lines. The test now passes clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Summary
The repo root mixed 16 first-party workspace crates in among the non-crate
trees (
deps/,patches/,tools/,scripts/,docs/,content/, …),making the top level hard to scan. This moves every member crate from
<root>/<crate>/into<root>/crates/<crate>/. Vendored submodule forks(
deps/),patches/,tools/, and all other non-crate dirs stay at theroot — the whole point is to separate our crates from third-party/support
trees.
Behavior-preserving structural move only. No logic changes.
Why it's low-risk
Because all crates move together, intra-workspace
path = "../sibling"deps stay valid (siblings remain siblings). Only references that escape the
moved set change. There is no
build.rsanywhere, and everyinclude_str!/CARGO_MANIFEST_DIRref is in-crate, so there is zerocompile-time path breakage.
Cargo.lockis unchanged (path deps don'taffect it).
What changed
Cargo.toml:members→crates/<name>crates/gateway+crates/media-service../deps→
../../deps(5 paths: pingora ×4, ffmpeg-sidecar ×1)backend/.local→crates/backend/.localinllm-access/src/support.rsandemail-notifier/src/lib.rs../data→../../data(
data/stays at root);../frontend/distkept (frontend moved intocrates/too, so it remains a sibling) — this asymmetry is documentedinline in
main.rsMakefile,.github/workflows/deploy.yml, and 9scripts/*.sh(frontend dir,FRONTEND_DIST_DIRdefaults, duckdbbuild-config manifest paths).
scripts/ci_affected.pyis data-driven frommembersand needs no edit.AGENTS.mdstructure block: paths →crates/<name>, count 14 → 16Verification
cargo metadataresolves all 16 workspace packages (proves everypath=— 23 siblings + 5
../../depsescapes — resolves from the new layout)cargo clippy -D warnings --all-targetson the 5 touched crates(transitively builds all 15 non-frontend crates, incl. the
../../depsescapes)
cargo check --target wasm32-unknown-unknownfor the frontendcargo testfor backend / email / llm-access — all passdeps/lance&deps/lancedbsubmodules untouchedNotes for reviewers
the root
Cargo.toml, the two../../depsCargo.tomls, 3 source files,Makefile,deploy.yml, and the scripts.later land on this restructured
master.docs/deep-dive articles still mention bare crate paths inprose (non-executable); left as-is to keep this PR scoped to the move.
🤖 Generated with Claude Code