Skip to content

refactor(workspace): move all 16 crates under crates/#23

Merged
acking-you merged 5 commits into
masterfrom
refactor/move-crates-into-crates-dir
Jun 1, 2026
Merged

refactor(workspace): move all 16 crates under crates/#23
acking-you merged 5 commits into
masterfrom
refactor/move-crates-into-crates-dir

Conversation

@acking-you
Copy link
Copy Markdown
Owner

Summary

The repo root mixed 16 first-party workspace crates in among the non-crate
trees (deps/, patches/, tools/, scripts/, docs/, content/, …),
making the top level hard to scan. This moves every member crate from
<root>/<crate>/ into <root>/crates/<crate>/. Vendored submodule forks
(deps/), patches/, tools/, and all other non-crate dirs stay at the
root — the whole point is to separate our crates from third-party/support
trees.

Behavior-preserving structural move only. No logic changes.

Why it's low-risk

Because all crates move together, intra-workspace path = "../sibling"
deps stay valid (siblings remain siblings). Only references that escape the
moved set
change. There is no build.rs anywhere, and every
include_str! / CARGO_MANIFEST_DIR ref is in-crate, so there is zero
compile-time path breakage. Cargo.lock is unchanged (path deps don't
affect it).

What changed

  • Root Cargo.toml: memberscrates/<name>
  • Escaped path deps: crates/gateway + crates/media-service ../deps
    ../../deps (5 paths: pingora ×4, ffmpeg-sidecar ×1)
  • Runtime constants (repo-root-relative): backend/.local
    crates/backend/.local in llm-access/src/support.rs and
    email-notifier/src/lib.rs
  • Backend dev defaults (backend-crate-relative): ../data../../data
    (data/ stays at root); ../frontend/dist kept (frontend moved into
    crates/ too, so it remains a sibling) — this asymmetry is documented
    inline in main.rs
  • Build/CI/scripts: Makefile, .github/workflows/deploy.yml, and 9
    scripts/*.sh (frontend dir, FRONTEND_DIST_DIR defaults, duckdb
    build-config manifest paths). scripts/ci_affected.py is data-driven from
    members and needs no edit.
  • AGENTS.md structure block: paths → crates/<name>, count 14 → 16

Verification

  • cargo metadata resolves all 16 workspace packages (proves every path=
    — 23 siblings + 5 ../../deps escapes — resolves from the new layout)
  • cargo clippy -D warnings --all-targets on the 5 touched crates
    (transitively builds all 15 non-frontend crates, incl. the ../../deps
    escapes)
  • cargo check --target wasm32-unknown-unknown for the frontend
  • cargo test for backend / email / llm-access — all pass
  • deps/lance & deps/lancedb submodules untouched

Notes for reviewers

  • The diff is dominated by ~443 pure file renames; the reviewable edits are
    the root Cargo.toml, the two ../../deps Cargo.tomls, 3 source files,
    Makefile, deploy.yml, and the scripts.
  • Other open branches / worktrees will hit rename-merge friction when they
    later land on this restructured master.
  • A handful of docs/ deep-dive articles still mention bare crate paths in
    prose (non-executable); left as-is to keep this PR scoped to the move.

🤖 Generated with Claude Code

Flatten-to-nested workspace cleanup: every first-party member crate moves
from <root>/<crate>/ to <root>/crates/<crate>/. Vendored submodule forks
(deps/), patches/, tools/ and all non-crate dirs stay at the repo root —
the whole point is to separate our crates from third-party/support trees.

Because ALL crates move together, intra-workspace `path = "../sibling"`
deps stay valid (siblings remain siblings). Only references that escape
the moved set change:

- root Cargo.toml: members -> crates/<name>
- crates/gateway, crates/media-service: ../deps -> ../../deps (5 paths)
- runtime constants (repo-root-relative): backend/.local ->
  crates/backend/.local (llm-access support.rs, email-notifier lib.rs)
- backend main.rs dev defaults (backend-crate-relative): ../data ->
  ../../data; ../frontend/dist kept (frontend moved into crates/ too, so
  it stays a sibling) — asymmetry documented inline
- build/CI/scripts: Makefile, deploy.yml, 9 scripts/*.sh frontend dir &
  FRONTEND_DIST_DIR defaults & duckdb build-config manifest paths
- AGENTS.md structure block: crates/<name>, count 14 -> 16

ci_affected.py is data-driven from members and needs no edit. No build.rs
anywhere; all include_str!/CARGO_MANIFEST_DIR refs are in-crate, so zero
compile-time path breakage. Cargo.lock unchanged (path deps don't affect it).

Verified: cargo metadata resolves all 16 packages; clippy -D warnings
--all-targets on the 5 touched crates (transitively builds all 15
non-frontend crates incl. ../../deps escapes); frontend wasm check;
backend/email/llm-access test suites pass; deps/lance & deps/lancedb
submodules untouched.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request restructures the repository by moving all workspace crates into a crates/ subdirectory and updates the workspace configuration, build scripts, and documentation accordingly. The review comments highlight potential issues with hardcoded relative paths (for database URIs, frontend distribution directory, and email configuration files) when running the application from different working directories (such as the workspace root versus individual crate directories), and suggest robust path resolution logic to handle these cases.

Comment thread crates/backend/src/main.rs Outdated
Comment on lines +57 to +64
// Dev defaults assume `cargo run` from this crate dir (crates/backend). The
// `data/` dir stays at the repo root, so escape two levels; real runs set
// these env vars to absolute paths.
let db_uri = env::var("LANCEDB_URI").unwrap_or_else(|_| "../../data/lancedb".to_string());
let comments_db_uri = env::var("COMMENTS_LANCEDB_URI")
.unwrap_or_else(|_| "../../data/lancedb-comments".to_string());
let music_db_uri =
env::var("MUSIC_LANCEDB_URI").unwrap_or_else(|_| "../data/lancedb-music".to_string());
env::var("MUSIC_LANCEDB_URI").unwrap_or_else(|_| "../../data/lancedb-music".to_string());
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The default database paths assume the command is run from the crates/backend directory. If a developer runs the backend from the workspace root (e.g., cargo run -p staticflow-backend), these paths will resolve to ../../data/... which is outside the repository root. We can make this robust by checking if the data directory exists in the current working directory.

    // Dev defaults assume `cargo run` from this crate dir (crates/backend) or the workspace root.
    // The `data/` dir stays at the repo root.
    let db_uri = env::var("LANCEDB_URI").unwrap_or_else(|_| {
        if std::path::Path::new("data").exists() {
            "data/lancedb".to_string()
        } else {
            "../../data/lancedb".to_string()
        }
    });
    let comments_db_uri = env::var("COMMENTS_LANCEDB_URI").unwrap_or_else(|_| {
        if std::path::Path::new("data").exists() {
            "data/lancedb-comments".to_string()
        } else {
            "../../data/lancedb-comments".to_string()
        }
    });
    let music_db_uri = env::var("MUSIC_LANCEDB_URI").unwrap_or_else(|_| {
        if std::path::Path::new("data").exists() {
            "data/lancedb-music".to_string()
        } else {
            "../../data/lancedb-music".to_string()
        }
    });

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accurate observation, but declining. These are dev-only fallback defaults that have always assumed CWD = the backend crate dir — the documented make dev-backend flow does cd crates/backend && cargo run, where ../../data/lancedb correctly resolves to repo-root/data/lancedb. Running from the workspace root was already broken before this PR: the old ../data/lancedb default resolved one level outside the repo from root, exactly as ../../data/lancedb does now — neither works from root, so this is not a regression. Every real run flow sets the env vars (start_backend_selfhosted.sh exports LANCEDB_URI/COMMENTS_LANCEDB_URI/MUSIC_LANCEDB_URI; start_backend_from_tmp.sh likewise), so the defaults are never reached in production. The suggested Path::new("data").exists() CWD-sniff is a runtime heuristic this repo steers away from, and it can misfire (any CWD that happens to contain a data/ dir selects the wrong branch). The CWD-independent mechanism — env vars — already exists and is documented.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up / course-correction: the maintainer has standardized on the repo root as the canonical run CWD, so in a836633 I flipped these defaults to plain root-relative (data/lancedb etc.) — not the crate-dir-relative form my previous reply defended. Running from the workspace root now resolves correctly, which addresses your original concern directly and without any runtime CWD-sniff. start_backend_selfhosted.sh already cds to root before exec, and make dev-backend now runs cargo run -p static-flow-backend from root too.

Comment thread crates/backend/src/main.rs Outdated
Comment on lines 71 to 74
// `../frontend/dist` stays correct: frontend moved into crates/ alongside
// this crate, so it remains a sibling (crates/frontend/dist).
let frontend_dist_dir =
env::var("FRONTEND_DIST_DIR").unwrap_or_else(|_| "../frontend/dist".to_string());
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Similarly to the database paths, the default frontend_dist_dir path will resolve incorrectly if the backend is run from the workspace root. We can make this robust by checking if crates/frontend/dist exists in the current working directory.

    // Frontend moved into crates/ alongside this crate, so it remains a sibling (crates/frontend/dist).
    // Support running from either the workspace root or the crate directory.
    let frontend_dist_dir = env::var("FRONTEND_DIST_DIR").unwrap_or_else(|_| {
        if std::path::Path::new("crates/frontend/dist").exists() {
            "crates/frontend/dist".to_string()
        } else {
            "../frontend/dist".to_string()
        }
    });

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Declining, same reasoning as the db-path thread. ../frontend/dist is correct for the documented dev flow (cd crates/backend): frontend moved into crates/ so it remains a sibling (crates/frontend/dist). Production sets FRONTEND_DIST_DIR (exported by the start scripts), so this default is never hit there, and running from the workspace root was equally broken before the move (../frontend/dist from root → outside the repo). Id rather not add a runtime Path::exists() CWD-sniff to a dev-only default — the env-var override is the existing CWD-independent path.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up: same root-CWD standardization — FRONTEND_DIST_DIR default is now crates/frontend/dist (root-relative) as of a836633, so running from the workspace root resolves correctly. Supersedes my earlier crate-dir-relative reply.

}

PathBuf::from("backend/.local/email_accounts.json")
PathBuf::from("crates/backend/.local/email_accounts.json")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If the application is run from the crates/backend directory, returning crates/backend/.local/email_accounts.json as a fallback will resolve to crates/backend/crates/backend/... which is incorrect. We can make this robust by checking if crates/backend exists in the current working directory to determine the correct relative path.

Suggested change
PathBuf::from("crates/backend/.local/email_accounts.json")
if std::path::Path::new("crates/backend").exists() {
PathBuf::from("crates/backend/.local/email_accounts.json")
} else {
PathBuf::from(".local/email_accounts.json")
}

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Declining — and note line 297 is not a functional resolution path. resolve_email_accounts_file_path returns default_path ("crates/backend/.local/...") when CWD = repo root, and the .local/... fallback (lines 292-294) when CWD = crates/backend. Line 297 is reached only when the file exists at neither location — i.e. it is genuinely missing — at which point its value is just what the caller uses to build the not-found error message, and both candidate paths would fail anyway. So the crates/backend/crates/backend/... doubling only ever appears in an error string in the file-absent case, never in a successful lookup. It intentionally mirrors DEFAULT_EMAIL_ACCOUNTS_FILE (the canonical path for the error). The suggested CWD-sniff adds a runtime heuristic to an unreachable-in-practice branch.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up: with root as the canonical CWD, the line-297 fallback value crates/backend/.local/email_accounts.json is now exactly correct. It is still only reached when the file is absent at both the default and the .local/ fallback, so no path doubling occurs under root CWD.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

cp "backend/.env.production" "$TEMP_DIR/staticflow/.env"
cp "backend/staticflow-backend.service" "$TEMP_DIR/staticflow/"

P2 Badge Update deploy package paths after moving backend

When this deployment script is run from the repo root, as implied by the surrounding target/release/... and data/lancedb paths, package preparation now fails because the commit deleted the top-level backend/ directory. In that scenario cp "backend/.env.production" ... cannot find the file, so the backend deployment archive is never created; these paths need to point at crates/backend/... or be resolved from the script location.


let repo_relative_paths =
["cli", "shared", "Cargo.toml", "Cargo.lock", "deps/lance", "deps/lancedb"];

P2 Badge Track moved CLI sources for worker rebuilds

For music-wish runs after a change under crates/cli or crates/shared, this freshness check still asks git about the deleted top-level cli and shared paths, so it will not notice new commits or dirty edits in the moved crates. In that case the worker can keep using an older target/release/sf-cli even though the checkout changed, which is especially risky for the storage-sensitive writes this guard was meant to protect; the paths should be updated to crates/cli and crates/shared.


# LanceDB path (relative to backend/ or absolute path)
LANCEDB_URI=../data/lancedb
COMMENTS_LANCEDB_URI=../data/lancedb-comments

P2 Badge Fix example DB paths for the new backend depth

When developers or local tooling load this example from crates/backend (the file still says the paths are relative to the backend directory), ../data/... now resolves under crates/data instead of the repo-root data directory. That points the backend at the wrong LanceDB roots or creates empty local databases; these sample values need the same ../../data/... adjustment that main.rs received.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Codex found 3 real path breakages the move introduced, plus the move
exposed an inconsistency in the runtime-default CWD assumption. Unify on
the repo root as the canonical run CWD (start_backend_selfhosted.sh
already cds there before exec; make dev-backend now runs from root too),
so every default is plain repo-root-relative — no `../`, no heuristic.

- crates/backend/src/music_wish_worker.rs: the sf-cli freshness guard
  asked git about the deleted top-level `cli`/`shared` paths, so changes
  in the moved crates went unnoticed and a stale sf-cli could be reused
  for storage-sensitive writes. Paths → `crates/cli`, `crates/shared`.
- crates/backend/deploy.sh: package prep copied `backend/.env.production`
  and `backend/staticflow-backend.service` (run from repo root) — now
  gone. Paths → `crates/backend/...`.
- crates/backend/.env.example: sample LanceDB paths were
  backend-crate-relative (`../data/...`); under root CWD that resolved to
  the wrong roots. Now `data/...`, comment updated.
- crates/backend/src/main.rs: db/frontend dev defaults were
  crate-relative (`../../data`, `../frontend/dist` from my first pass).
  Now repo-root-relative (`data/...`, `crates/frontend/dist`), matching
  the canonical CWD. Real runs still override via env vars.
- Makefile dev-backend: run `cargo run -p static-flow-backend` from root
  instead of `cd crates/backend` (also fixes a latent `cd && [ -f .env ]
  || cp` precedence bug that cp'd into the wrong dir on cd failure).

Verified: clippy -D warnings on static-flow-backend, 80 backend tests
pass, rustfmt on the 2 changed .rs files, deps submodules untouched.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@acking-you
Copy link
Copy Markdown
Owner Author

Codex review — all 3 findings fixed (a836633)

All three were real. Fixed and unified under "repo root is the canonical run CWD":

  • music_wish_worker.rs sf-cli freshness guard — git pathspecs ["cli","shared",…]["crates/cli","crates/shared",…]. This is the storage-sensitive one you flagged: the guard now correctly detects commits/dirty edits in the moved crates, so a stale target/release/sf-cli is no longer reused for writes.
  • crates/backend/deploy.shbackend/.env.production + backend/staticflow-backend.servicecrates/backend/… (script runs from repo root).
  • crates/backend/.env.example — sample LanceDB paths ../data/…data/… (root-relative) + comment corrected.

Also flipped main.rs dev defaults to root-relative (data/…, crates/frontend/dist) and make dev-backend now runs from root, so every default matches the canonical CWD. Verified: clippy -D warnings on static-flow-backend, 80 backend tests pass, rustfmt on changed files, deps submodules untouched.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

const bridgePath = path.resolve('frontend/static/local_media_player_bridge.js');

P2 Badge Point the bridge test at the moved frontend asset

After this move, the existing Node bridge test still resolves frontend/static/local_media_player_bridge.js, which no longer exists under the repo root. Running node --test crates/frontend/tests/local_media_player_bridge.test.mjs now fails immediately with ENOENT before exercising any assertions; the path needs to be updated to the new crates/frontend/static/... location or derived relative to this test file.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

acking-you and others added 2 commits June 1, 2026 21:35
Codex CR (round 2) on PR #23: the Node bridge test resolved
`frontend/static/local_media_player_bridge.js` via `path.resolve` against
the process CWD, which broke after the crates/ move — running
`node --test crates/frontend/tests/local_media_player_bridge.test.mjs`
failed with ENOENT before any assertion ran.

Resolve the bridge asset relative to the test file itself via
`fileURLToPath(new URL('../static/...', import.meta.url))`, so it passes
from the repo root, the crate dir, or anywhere. Dropped the now-unused
`node:path` import.

Verified: `node --test` passes 3/3 from the repo root.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Sweep the 28 living top-level docs/*.md so structural crate-root path
references (backend/src, frontend/dist, llm-access/src, Cargo.toml,
Trunk.toml, ...) point at the new crates/<name>/ location.

Whitelist-gated transform: only rewrites <crate>/<segment> when <segment>
is a real crate-root entry, never deployment paths (/mnt/llm-access/...),
URL routes (gateway/v1), module .rs refs, or prose ("shared/exclusive
锁"). The 37 dated plan snapshots under docs/superpowers/plans/ are left
intact as point-in-time historical records.

497 rewrites across 28 files; no double-prefix, no deployment/URL
collision (verified).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@acking-you
Copy link
Copy Markdown
Owner Author

Codex round-2 — finding fixed (a5f2ef0) + docs path sweep (22d2939)

local_media_player_bridge.test.mjs — real breakage, fixed. The test resolved frontend/static/local_media_player_bridge.js via path.resolve against the process CWD, so after the move node --test crates/frontend/tests/local_media_player_bridge.test.mjs failed with ENOENT before any assertion. Now resolved relative to the test file itself via fileURLToPath(new URL("../static/...", import.meta.url)) — CWD-independent. Verified: passes 3/3 from the repo root.

Docs — also swept the 28 living docs/*.md so structural crate-root path refs now point at crates/<name>/ (whitelist-gated: only real crate-root entries like src/dist/Cargo.toml, never deployment paths like /mnt/llm-access/..., URL routes, or prose). The 37 dated plan snapshots under docs/superpowers/plans/ are intentionally left as point-in-time historical records.

All four file classes that the move touched (Rust source, shell/JS scripts, Cargo manifests, docs) are now consistent under the repo-root run convention.

`test_llm_access_duckdb_build_config.sh` greps
`docs/llm-access-cdc-storage-design.zh.md` for duckdb config strings, but
that doc was removed before this PR (added in abbc778, deleted in
a9b8733) and is absent on master — master's own copy of this test was
already failing on these two lines. The reference is unrelated to the
crates/ move (it's a doc grep, not a crate path) and nothing invokes this
script in CI.

The code-side config those lines cross-checked is still asserted by the
retained `.cargo/config.toml` + `crates/llm-access*/Cargo.toml` greps, so
drop the two dead doc-grep lines. The test now passes clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@acking-you acking-you merged commit 79afa23 into master Jun 1, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant