chore: prune crawler/rag standalone-service references and fix fallout#1931
Conversation
RAG/crawler and search no longer run as standalone HTTP services — that logic is in-process in the Convex backend, which talks to the knowledge DB directly. Remove the now-dangling references across config, docs, and code, and fix the two latent bugs the leftover wiring caused: - status-probe probed dead rag:8001 / crawler:8002 endpoints and so reported a permanent 'degraded' status; probe only Convex liveness now. - the single-node CLI stack had no 'knowledge-db' host, so the in-process RAG/crawler code couldn't resolve its corpus on 'tale start'/'tale deploy'; add a 'knowledge-db' network alias on the db service. Also drops the removed 'rag'/'crawler' commitlint scopes, the rag entry from the controller restart allowlist, and the obsolete RAG_URL / CRAWLER_URL / SEARCH_SERVICE_URL env defaults.
📝 WalkthroughWalkthroughRAG and crawler are removed as standalone external services and treated as in-process functionality within the Convex backend. The Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Warning Billing warning: we have not been able to collect payment for this subscription for more than 72 hours. Please update the payment method or pay any pending invoices in Billing to avoid service interruption. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@services/controller/src/server.ts`:
- Line 32: The ALLOWED Set defined at line 32 has its contents hardcoded in
multiple places: in a 400 error response message (around lines 144-146) and in a
startup log message (around line 202). To prevent drift when ALLOWED is updated,
convert the Set to an Array and derive the display text from it. Create a
derived constant that converts the ALLOWED Set to a comma-separated string or
Array format, then use this derived value in both the error response and the
startup log message instead of hardcoding the allowlist values. This ensures all
three locations stay synchronized whenever ALLOWED changes.
In `@services/platform/status-probe.test.ts`:
- Line 43: The test file uses unsafe type casts with the pattern `as unknown as
typeof fetch` repeatedly throughout multiple test calls to probeServices. Create
a single properly typed mock fixture that implements the fetch interface
correctly, then replace all instances of the `doFetch as unknown as typeof
fetch` cast pattern with a reference to this typed mock fixture. This eliminates
the unsafe type assertions and adheres to TypeScript best practices by providing
proper typing for the fetch mock without relying on type casts.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: d500453a-5603-4b01-b68b-669e5544a746
📒 Files selected for processing (25)
.commitlintrc.json.env.test.github/CONTRIBUTING.mdREADME.de.mdREADME.fr.mdREADME.mdcompose.web.ymlcompose.ymldocs/de/self-hosted/operate/release-notes/format.mddocs/en/self-hosted/operate/release-notes/format.mddocs/fr/self-hosted/operate/release-notes/format.mdpackages/shared/src/config/base.tsservices/controller/README.mdservices/controller/src/server.tsservices/platform/README.mdservices/platform/convex/node_only/sandbox/helpers/spawner_client.tsservices/platform/env.shservices/platform/lib/shared/schemas/providers.tsservices/platform/server.test.tsservices/platform/status-probe.test.tsservices/platform/status-probe.tsservices/proxy/README.mdtools/cli/src/lib/compose/services/create-db-service.test.tstools/cli/src/lib/compose/services/create-db-service.tstools/opengrep/config.yml
💤 Files with no reviewable changes (1)
- .commitlintrc.json
| * effect on apply-and-restart (the spawner reads it at boot). RAG/crawler used | ||
| * to be here too, but that logic is now in-process in `convex` — a knowledge | ||
| * config change restarts `convex`, not a standalone service. */ | ||
| const ALLOWED = new Set(['convex', 'sandbox']); |
There was a problem hiding this comment.
📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win
Derive allowlist display text from ALLOWED to prevent drift.
The set {convex, sandbox} is duplicated in both the 400 error and startup log. If ALLOWED changes again, behavior and messages can diverge.
♻️ Proposed refactor
const ALLOWED = new Set(['convex', 'sandbox']);
+const ALLOWED_TEXT = `{${Array.from(ALLOWED).join(', ')}}`;
@@
- error: `services must be a non-empty subset of {convex, sandbox}${
+ error: `services must be a non-empty subset of ${ALLOWED_TEXT}${
invalid.length ? `; rejected: ${invalid.join(', ')}` : ''
}`,
@@
- `[controller] listening on :${PORT} — allowlist {convex, sandbox}, project=${PROJECT ?? '(any)'}`,
+ `[controller] listening on :${PORT} — allowlist ${ALLOWED_TEXT}, project=${PROJECT ?? '(any)'}`,Also applies to: 144-146, 202-202
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@services/controller/src/server.ts` at line 32, The ALLOWED Set defined at
line 32 has its contents hardcoded in multiple places: in a 400 error response
message (around lines 144-146) and in a startup log message (around line 202).
To prevent drift when ALLOWED is updated, convert the Set to an Array and derive
the display text from it. Create a derived constant that converts the ALLOWED
Set to a comma-separated string or Array format, then use this derived value in
both the error response and the startup log message instead of hardcoding the
allowlist values. This ensures all three locations stay synchronized whenever
ALLOWED changes.
| test('returns operational with all components up when every probe returns 2xx', async () => { | ||
| test('returns operational with the application up when the probe returns 2xx', async () => { | ||
| const doFetch = vi.fn(() => Promise.resolve(okResponse())); | ||
| const result = await probeServices(doFetch as unknown as typeof fetch); |
There was a problem hiding this comment.
📐 Maintainability & Code Quality | 🟠 Major
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Expect 0 matches after refactor.
rg -n "as unknown as typeof fetch" services/platform/status-probe.test.tsRepository: tale-project/tale
Length of output: 1039
Refactor all test calls to use a properly typed fetch mock instead of as unknown as typeof fetch casts.
This pattern violates the TypeScript guideline (never as, unknown, any) and appears throughout the test file on at least 13 lines: 43, 53, 60, 72, 82, 86, 95, 99, 112, 120, 135, 136, 137. Create a single typed mock fixture to eliminate all cast chains at once.
♻️ Refactor pattern
+function fetchMock(response: Response) {
+ return vi.fn(
+ async (_input: RequestInfo | URL, _init?: RequestInit): Promise<Response> =>
+ response,
+ );
+}
@@
- const doFetch = vi.fn(() => Promise.resolve(okResponse()));
- const result = await probeServices(doFetch as unknown as typeof fetch);
+ const doFetch = fetchMock(okResponse());
+ const result = await probeServices(doFetch);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@services/platform/status-probe.test.ts` at line 43, The test file uses unsafe
type casts with the pattern `as unknown as typeof fetch` repeatedly throughout
multiple test calls to probeServices. Create a single properly typed mock
fixture that implements the fetch interface correctly, then replace all
instances of the `doFetch as unknown as typeof fetch` cast pattern with a
reference to this typed mock fixture. This eliminates the unsafe type assertions
and adheres to TypeScript best practices by providing proper typing for the
fetch mock without relying on type casts.
Sources: Coding guidelines, Learnings
… rename Per-agent `nativeWebTools` opt-in (external-agent only): lifts the managed-mode denial of Claude Code's native WebSearch/WebFetch, threaded config→adapter and exposed as a toggle on the agent editor's Instructions tab (en/de/fr). The integration-skill web-access guidance now matches the agent's actual toolset (native vs. governed), and a GitHub appendix routes a git-auth failure into the existing connect-integration guide flow. Default Claude Code agent cleanup: model is now openrouter:anthropic/claude-opus-4.8 (was a placeholder ccgateway ref that resolved against no provider), no integrations bound by default, and refreshed conversation starters. Bundles the in-progress bifrost→llm-gateway rename (the llm-gateway service + Dockerfile, llm_gateway_admin module, llmGatewayKeyId session field, compose, docs) — intermixed with the above in shared files, so committed together.
|
You are seeing this message because GitHub Code Scanning has recently been set up for this repository, or this pull request contains the workflow file for the Code Scanning tool. What Enabling Code Scanning Means:
For more information about GitHub Code Scanning, check out the documentation. |
Two fallout fixes from adding the llm-gateway service in the bifrost→llm-gateway rename: - bun.lock never registered the new @tale/llm-gateway workspace member (matched by the services/* glob), so `bun install --frozen-lockfile` aborted in the shared Setup-toolchain step of every CI job. Regenerate the lockfile so it carries the workspace entry. - the new services/llm-gateway/Dockerfile had no USER instruction, so Trivy raised AVD-DS-0002 (HIGH) — "image user should not be root". Re-assert the upstream image's existing non-root appuser (uid 1000, owns /app and the /app/data SQLite store); the runtime is unchanged.
The release pipeline's service matrix was still the old 9-service list and never built tale-controller or tale-llm-gateway, even though both are referenced by compose and the CLI deploy path (llm-gateway is an always-roll service hit on every deploy). #1931 wired llm-gateway into build.yml but missed release.yml. Add both services to the build matrix, the container-test pull loop, the manifest matrix, and the create-release manifest-verify loop.
Summary
Completes the removal of the standalone
crawler/rag(andsearch) HTTP services. That logic now runs in-process inside the Convex backend, which talks to the knowledge DB directly — so the leftover wiring across config, docs, and code is now dead. This prunes those references and fixes the two latent bugs the stale wiring caused.What changed
rag:8001/crawler:8002/healthendpoints, so the public status feed reported a permanentdegraded. Now probes only Convex liveness; the widerdegraded/per-subsystem vocabulary is kept for a future per-subsystem probe. Tests updated to the single-component shape.tale start/tale deploystack had noknowledge-dbhost, so the in-process RAG/crawler code (getKnowledgeDatabaseUrl, default hostknowledge-db) couldn't resolve its corpus. Added aknowledge-dbnetwork alias on the db service so the runtime default resolves identically across the split and single-node topologies (+ test).RAG_URL/CRAWLER_URL/SEARCH_SERVICE_URLenv defaults (env.sh), the removedrag/crawlercommitlint scopes, and theragentry from the controller restart allowlist (now{convex, sandbox}).Verification
status-probeandcreate-db-serviceunit tests updated alongside the code changes.Summary by CodeRabbit
Documentation
Refactor