Skip to content

feat(courtlistener): add /api/proxy/courtlistener proxy for ChittyCounsel#234

Open
chitcommit wants to merge 2 commits into
mainfrom
feat/courtlistener-proxy
Open

feat(courtlistener): add /api/proxy/courtlistener proxy for ChittyCounsel#234
chitcommit wants to merge 2 commits into
mainfrom
feat/courtlistener-proxy

Conversation

@chitcommit

@chitcommit chitcommit commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

Summary

Adds a CourtListener REST v4 proxy at /api/proxy/courtlistener/* for ChittyCounsel (PR https://github.com/chittycorp/CHITTYCOUNSEL/pull/4). Mirrors the established google.js proxy pattern with CourtListener-specific behaviors per spec.

Endpoints

  • GET /api/proxy/courtlistener/search — unified search, query passthrough
  • GET /api/proxy/courtlistener/:resource[/:id] — allowlisted resource fetch
  • POST /api/proxy/courtlistener/citation-lookup — citation lookup, body forwarded verbatim

Allowlist: search, opinions, clusters, dockets, docket-entries, recap-documents, audio, citation-lookup, people, courts. Unknown resources → 404 before any upstream call.

Spec compliance vs google.js (deltas)

  • Authorization: Token <key> (CL convention, not Bearer)
  • Fail-closed 503 POLICY_BLOCKED_CHITTYCONNECT_UNAVAILABLE when token absent
  • 15-min Cache API wrap on GETs; X-Bypass-Cache: 1 escape
  • POST /citation-lookup body forwarding
  • Resource allowlist enforced server-side

Credential plane (synthetic env model)

Token is delivered via Cloudflare Secrets Store binding COURTLISTENER_API_TOKEN declared in wrangler.jsonc at:

  • top-level (inherited by dev)
  • env.staging.secrets_store_secrets
  • env.production.secrets_store_secrets

All three reference store e914522471964c3c8cf1e601770edcc3.

Status of the secret VALUE: NOT yet present in the store — verified via npx wrangler secrets-store secret list e914522471964c3c8cf1e601770edcc3 --remote --per-page 100 (no entry matching court). This is a known provisioning gap that chico owns downstream. Until the value is written, the 503 fail-closed path is exercised — deploy of this PR is safe and non-disruptive (no behavior change to existing routes; new routes simply return 503 to consumers).

No operator credential paste path. No op:// URI dependency.

Tests

21 new unit tests in tests/api/courtlistener-routes.test.js, mirroring the established google-routes.test.js pattern (vi.mock on credential-helper — repo-local convention for route unit tests; real service-integration smoke is the post-deploy curl evidence in the test plan below).

Test Files  12 passed (12)
Tests      137 passed (137)

Files

  • src/api/routes/courtlistener.js (new) — proxy implementation
  • src/api/router.js — mounts at /api/proxy/courtlistener
  • wrangler.jsonc — 3 secrets_store_secrets binding entries (top + staging + production)
  • public/openapi.json — 4 path entries under CourtListener tag
  • tests/api/courtlistener-routes.test.js (new) — 21 tests

Test plan

  • npx vitest run tests/api/courtlistener-routes.test.js — 21/21 pass
  • npx vitest run tests/api/ — 137/137 pass (no regressions)
  • npx eslint src/api/routes/courtlistener.js tests/api/courtlistener-routes.test.js — clean
  • wrangler.jsonc parses (JSONC strip-comments + JSON.parse)
  • public/openapi.json parses
  • Operator approval before npm run deploy:staging / npm run deploy:production (safe-deploy)
  • chico provisions COURTLISTENER_API_TOKEN value into Cloudflare Secrets Store e914522471964c3c8cf1e601770edcc3
  • Post-deploy real curl evidence (per the spec):
    curl -sS -H "Authorization: Bearer $CHITTY_CONNECT_TEST_TOKEN" \
      "https://connect.chitty.cc/api/proxy/courtlistener/search?type=o&q=miranda&page_size=3" \
      | jq '{ok, count: (.data.count // 0), first: .data.results[0].caseName}'

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features

  • Added CourtListener proxy API endpoints for search queries, citation lookup, and allowlisted resource access with integrated caching support.

…nsel

Adds a CourtListener REST v4 proxy mirroring the established google.js
route pattern, with CourtListener-specific behaviors per spec:

- Token auth uses `Authorization: Token <key>` (CL convention, not Bearer)
- Token resolved from `env.COURTLISTENER_API_TOKEN` (Cloudflare Secrets
  Store binding declared in wrangler.jsonc for top-level + staging +
  production envs against store e914522471964c3c8cf1e601770edcc3),
  falling back to the credential broker
- Fail-closed 503 `POLICY_BLOCKED_CHITTYCONNECT_UNAVAILABLE` when token
  is not provisioned (no operator credential paste path)
- Resource allowlist enforced before any upstream call:
  search, opinions, clusters, dockets, docket-entries, recap-documents,
  audio, citation-lookup, people, courts
- Routes: GET /search, GET /:resource[/:id], POST /citation-lookup
- 15-min Cache API wrap on GETs; bypass via `X-Bypass-Cache: 1`
- Upstream error bodies truncated to 200 chars; token never echoed

Consumer: ChittyCounsel (chittycorp/CHITTYCOUNSEL PR #4).

Synthetic env credential plane status: binding declared in all three
env blocks (top-level + staging + production). The secret VALUE
`COURTLISTENER_API_TOKEN` is NOT yet present in Cloudflare Secrets
Store e914522471964c3c8cf1e601770edcc3 — verified via
`wrangler secrets-store secret list`. Until chico/credential
provisioner writes the value, the 503 fail-closed path is exercised.
Deploy is safe and non-disruptive (no behavior change to existing
routes); operator approval still required before
`npm run deploy:production`.

Test coverage: 21 unit tests covering token resolution (3 sources +
fail-closed), search, allowlist enforcement (incl. 404 on unknown
resource without upstream call), citation-lookup POST forwarding,
Cache API HIT/MISS + bypass, upstream error truncation, and Token
(not Bearer) auth scheme. Mirrors established google-routes.test.js
pattern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 4, 2026 05:50
@cloudflare-workers-and-pages

cloudflare-workers-and-pages Bot commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

Deploying with  Cloudflare Workers  Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status Name Latest Commit Updated (UTC)
❌ Deployment failed
View logs
chittyconnect 812b076 Jun 04 2026, 10:15 AM

@coderabbitai

coderabbitai Bot commented Jun 4, 2026

Copy link
Copy Markdown

Review Change Stack

Warning

Review limit reached

@chitcommit, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 44 minutes and 49 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 045f4949-e212-49cc-94d8-b193db516c8a

📥 Commits

Reviewing files that changed from the base of the PR and between b39b08a and 812b076.

📒 Files selected for processing (3)
  • public/openapi.json
  • src/api/routes/courtlistener.js
  • tests/api/courtlistener-routes.test.js
📝 Walkthrough

Walkthrough

Adds a CourtListener REST API proxy with token-based authentication, resource allowlisting, and optional response caching. Includes router integration, configuration management, OpenAPI documentation, and comprehensive test coverage for token resolution, endpoint forwarding, cache behavior, and security controls.

Changes

CourtListener Proxy Implementation

Layer / File(s) Summary
Token resolution and fail-closed behavior
src/api/routes/courtlistener.js
Defines an allowlist of permitted CourtListener v4 resources and implements token resolution from env.COURTLISTENER_API_TOKEN (with { get() } accessor normalization) and credential broker fallback. Returns 503 with a fixed error code when no token is available.
GET proxy implementation with Cache API
src/api/routes/courtlistener.js
Core GET request forwarding with optional Cloudflare Cache API wrapping (15-minute TTL), X-Cache hit/miss tagging, X-Bypass-Cache support, error body truncation to 200 characters, and JSON validation (502 on parse failure).
Endpoint implementations and URL building
src/api/routes/courtlistener.js
Wires GET /search (forwards query string), POST /citation-lookup (forwards raw body without caching), and generic GET /:resource[/:id] (allowlist-gated with URL encoding for id and 405 blocking of POST-only routes).
API router wiring
src/api/router.js
Imports courtlistenerRoutes and mounts it at /api/proxy/courtlistener in the main API router.
Cloudflare Secrets Store configuration
wrangler.jsonc
Adds COURTLISTENER_API_TOKEN binding to shared, staging, and production environments in secrets_store_secrets arrays.
OpenAPI endpoint documentation
public/openapi.json
Documents four new CourtListener proxy paths (/search, /citation-lookup, /{resource}, /{resource}/{id}) with query parameters, request bodies, response shapes, and 503 error conditions.
Comprehensive test coverage
tests/api/courtlistener-routes.test.js
Adds 349 lines of Vitest tests validating token resolution (string and accessor forms), fail-closed 503 behavior, GET /search forwarding with query preservation, allowlist enforcement with 404/405 responses, POST /citation-lookup body forwarding, cache hit/miss/bypass behavior, and upstream error truncation and JSON validation.

Sequence Diagram

sequenceDiagram
  participant Client
  participant ProxyRouter
  participant TokenResolver
  participant UpstreamAPI
  participant Cache as Cache API
  Client->>ProxyRouter: GET /search?q=term
  ProxyRouter->>TokenResolver: getCourtListenerToken()
  TokenResolver-->>ProxyRouter: token
  ProxyRouter->>Cache: Check cache
  Cache-->>ProxyRouter: miss
  ProxyRouter->>UpstreamAPI: GET /search/?q=term + Authorization: Token
  UpstreamAPI-->>ProxyRouter: 200 JSON
  ProxyRouter->>Cache: Store with TTL
  ProxyRouter-->>Client: 200 JSON + X-Cache: MISS
  Client->>ProxyRouter: POST /citation-lookup
  ProxyRouter->>TokenResolver: getCourtListenerToken()
  TokenResolver-->>ProxyRouter: token
  ProxyRouter->>UpstreamAPI: POST /citation-lookup/ + body + Authorization: Token
  UpstreamAPI-->>ProxyRouter: 200 citations
  ProxyRouter-->>Client: 200 citations
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

Poem

🐰 A proxy for courts, now hopping through,
With tokens secured and cached TTL too,
Allowlists guard the upstream request,
Fail-closed when secrets aren't blessed!
Forward the search, the citations bright—
CourtListener's bridge, built secure and right. 📜✨

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Description check ⚠️ Warning The description is comprehensive but does not follow the required template structure with explicit sections for Summary, Security & Access, Docs, and Validation. Restructure the description to match the template with explicit Summary, Security & Access (with checkboxes), Docs, and Validation sections including CI status and approval requirements.
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately and concisely summarizes the main change: adding a CourtListener proxy endpoint at /api/proxy/courtlistener.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/courtlistener-proxy

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@chitcommit

Copy link
Copy Markdown
Contributor Author

Status — credential lane wired, awaiting non-paste secret population

ChittyConnect concierge re-dispatch checkpoint:

Exhaustive credential search results (no CourtListener token exists on this infrastructure):

  • synthetic-shared (1P, service-account scope): 112 items, none CL-related
  • synthetic-prod (1P, service-account scope): 2 items, none CL-related
  • ChittyOS-Core (1P, Connect scope): no CL items
  • ChittyOS (1P, Connect scope): only the empty shell chittypro-legallink-prod (id stedicju7zt6w3aqme3fw3lpuy, credential field has no value, created 2026-03-16, never populated)
  • CF Secrets Store e914522471964c3c8cf1e601770edcc3 (all pages): no entry matching COURTLISTENER* / court* / listener* / cl_*
  • chittyconnect-production Worker secrets: none CL-related
  • All CREDENTIAL_CACHE KV namespaces: no CL keys
  • ChittyConnect broker /api/credentials/courtlistener: returns 404 (no broker entry)
  • chittypro-legallink repo (CHICAGOAPPS): no CourtListener / Free Law refs

What's ready to merge (this PR): route + tests + wrangler binding all green on Test/Lint/CodeQL/Security/compliance. Code is fail-closed — when env.COURTLISTENER_API_TOKEN is unset, the proxy returns 503 POLICY_BLOCKED_CHITTYCONNECT_UNAVAILABLE per spec, never asks for a credential paste.

Remaining blockers (both require human signoff, neither requires a chat paste):

  1. Governance labelssecurity-approved, docs-approved, access-reviewed. wrangler.jsonc touches a new secret binding, so the label gate fires as designed.
  2. CF Secrets Store population — operator populates secret COURTLISTENER_API_TOKEN in store e914522471964c3c8cf1e601770edcc3 via the Cloudflare dashboard (or wrangler secrets-store secret create). Token never enters chat, repo, or 1P UI.

Pairs with: chittycorp/CHITTYCOUNSEL PR #4 (counsel-side /api/research/* proxy). That PR has no infra changes and can merge independently once this PR + secret are landed.

Reference: chittycanon://core/services/chittyconnect#courtlistener-proxy

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new CourtListener REST v4 proxy under /api/proxy/courtlistener/* to support ChittyCounsel, following the established proxy patterns (token injection, allowlisting, cache wrapping, OpenAPI docs, and unit tests).

Changes:

  • Adds src/api/routes/courtlistener.js implementing GET /search, GET /:resource[/:id] (allowlisted), and POST /citation-lookup with token resolution and Cache API wrapping for GETs.
  • Mounts the new proxy router at /api/proxy/courtlistener and documents the endpoints in public/openapi.json.
  • Adds CourtListener secrets-store bindings to wrangler.jsonc and introduces a new unit test suite for the route.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
src/api/routes/courtlistener.js New CourtListener proxy implementation (token resolution, allowlist, cache wrap, error handling).
src/api/router.js Mounts the CourtListener proxy router under /api/proxy/courtlistener.
wrangler.jsonc Adds COURTLISTENER_API_TOKEN secrets-store binding at top-level + staging + production.
public/openapi.json Documents CourtListener proxy endpoints and parameters.
tests/api/courtlistener-routes.test.js Adds route-level unit tests mirroring the existing proxy testing approach.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +179 to +183
function buildUpstream(resource, id, search) {
const path = id ? `${resource}/${encodeURIComponent(id)}` : `${resource}`;
const qs = search ? (search.startsWith("?") ? search : `?${search}`) : "";
return `${CL_API}/${path}/${qs}`;
}
Comment on lines +110 to +114
if (hit) {
// Clone — Cache API responses are immutable
const cloned = new Response(hit.body, hit);
cloned.headers.set("X-Cache", "HIT");
return cloned;
Comment thread src/api/routes/courtlistener.js Outdated
Comment on lines +137 to +141
const body = await upstream.text().catch(() => "");
return c.json(
{ error: `CourtListener API ${upstream.status}: ${body.slice(0, 200)}` },
upstream.status,
);
Comment thread src/api/routes/courtlistener.js Outdated
Comment on lines +233 to +237
const text = await upstream.text().catch(() => "");
return c.json(
{ error: `CourtListener API ${upstream.status}: ${text.slice(0, 200)}` },
upstream.status,
);
const body = await res.json();
// Truncation may still surface the token from upstream — but the proxy itself
// does not add it. Assert no extra header echo and the token isn't in error key path.
expect(body.error).not.toContain("Authorization");
Comment thread wrangler.jsonc
{ "binding": "CHITTYAUTH_ISSUED_MINT_TOKEN", "store_id": "e914522471964c3c8cf1e601770edcc3", "secret_name": "chittyauth_issued_mint_api_key" },
{ "binding": "MINT_API_KEY", "store_id": "e914522471964c3c8cf1e601770edcc3", "secret_name": "chittyauth_issued_mint_api_key" }
{ "binding": "MINT_API_KEY", "store_id": "e914522471964c3c8cf1e601770edcc3", "secret_name": "chittyauth_issued_mint_api_key" },
{ "binding": "COURTLISTENER_API_TOKEN", "store_id": "e914522471964c3c8cf1e601770edcc3", "secret_name": "COURTLISTENER_API_TOKEN" }
Comment thread wrangler.jsonc
{ "binding": "CHITTYAUTH_ISSUED_MINT_TOKEN", "store_id": "e914522471964c3c8cf1e601770edcc3", "secret_name": "chittyauth_issued_mint_api_key" },
{ "binding": "MINT_API_KEY", "store_id": "e914522471964c3c8cf1e601770edcc3", "secret_name": "chittyauth_issued_mint_api_key" }
{ "binding": "MINT_API_KEY", "store_id": "e914522471964c3c8cf1e601770edcc3", "secret_name": "chittyauth_issued_mint_api_key" },
{ "binding": "COURTLISTENER_API_TOKEN", "store_id": "e914522471964c3c8cf1e601770edcc3", "secret_name": "COURTLISTENER_API_TOKEN" }

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b39b08ac1a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +109 to +114
const hit = await cache.match(cacheReq);
if (hit) {
// Clone — Cache API responses are immutable
const cloned = new Response(hit.body, hit);
cloned.headers.set("X-Cache", "HIT");
return cloned;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Check the token before serving cache hits

When the Cache API has a hit, this returns the cached CourtListener response before getCourtListenerToken() runs. If the token is removed, missing in a deployed environment, or the Secrets Store binding is temporarily unavailable after a previous successful request populated the cache, the documented fail-closed 503 POLICY_BLOCKED_CHITTYCONNECT_UNAVAILABLE path is bypassed for that URL and callers keep receiving cached data. Resolve the token before cache.match() or skip cache hits when token resolution fails.

Useful? React with 👍 / 👎.

* The cache key is the absolute upstream URL; bypass via X-Bypass-Cache: 1.
*/
async function clGet(c, upstreamUrl) {
const bypass = c.req.header("X-Bypass-Cache") === "1";

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Add X-Bypass-Cache to CORS allowHeaders

This new bypass path depends on browser clients sending X-Bypass-Cache: 1, but the router's CORS middleware only allows Content-Type, auth, Chitty, and MCP headers, so requests from the allowed ChittyCounsel origin that include this header will fail preflight before reaching this route. Add X-Bypass-Cache to the CORS allowHeaders list or the documented bypass mechanism only works for non-browser callers.

Useful? React with 👍 / 👎.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@public/openapi.json`:
- Around line 1336-1339: OpenAPI currently omits the proxy's 502 error for
CourtListener endpoints; update the CourtListener operation responses in
public/openapi.json to include a 502 response (e.g., "description": "Bad Gateway
- upstream fetch failed or invalid JSON") for the same GET and POST proxy
operations described in src/api/routes/courtlistener.js (the proxy handlers that
perform upstream fetch/JSON parsing), ensuring each operation that lists
200/404/503 also lists 502 so the spec matches runtime behavior.

In `@src/api/routes/courtlistener.js`:
- Around line 77-82: getCredential is being passed an undeclared env fallback
("COURTLISTENER_API_TOKEN_FALLBACK") which reintroduces a third credential path;
remove that fallback argument so the call uses the broker/secrets-only
resolution. Locate the call to getCredential in courtlistener.js and change the
invocation to pass the env, the secret path
("integrations/courtlistener/api_token") and the service name ("CourtListener")
only (i.e., drop the fallbackEnvVar parameter), keeping the getCredential
signature usage consistent with credential-helper.js.
- Around line 107-123: The cached-response path returns a HIT before verifying
the CourtListener token, violating fail-closed; fix by obtaining and validating
the token (call getCourtListenerToken(c.env) and check !token with
failClosedNoToken) before returning any cached Response in the cache.match
branch (while preserving the existing try/catch and X-Cache header logic in the
cache handling code paths such as the cache hit branch and the functions
getCourtListenerToken and failClosedNoToken).

In `@tests/api/courtlistener-routes.test.js`:
- Around line 147-155: The test uses a loose length check
(expect(body.error.length).toBeLessThan(400)) which doesn't enforce the 200-char
truncation contract; update the assertions in the "propagates upstream non-2xx
status and truncates body to 200 chars" test to (1) assert the response contains
the first 200 characters of longBody
(expect(body.error).toContain(longBody.slice(0,200))), (2) find where that
200-char prefix appears (const prefixIndex =
body.error.indexOf(longBody.slice(0,200));
expect(prefixIndex).toBeGreaterThan(-1)), and (3) assert the total length is at
most that prefix position + 200
(expect(body.error.length).toBeLessThanOrEqual(prefixIndex + 200)), optionally
also assert the 201st char is not present
(expect(body.error).not.toContain(longBody.slice(200,201))). This replaces the
loose expect(body.error.length).toBeLessThan(400) check and uses longBody,
get("/search"...), and body.error to locate and verify truncation.
- Around line 163-170: Update the "never echoes the API token in error bodies"
test to also assert the actual token value is absent from the error string:
after getting body.error, add an assertion that it does not contain the token
from the test env (use the env token variable used when calling get, e.g.
env.COURT_LISTENER_TOKEN or fallback to the literal "test-cl-token" used in the
mock). Keep the existing Authorization check and ensure you reference the test's
get("/search", env, ...) call and body.error in the assertion.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: cf58de92-34cb-4823-a95e-84ceb5cb73e3

📥 Commits

Reviewing files that changed from the base of the PR and between ea5cc5e and b39b08a.

📒 Files selected for processing (5)
  • public/openapi.json
  • src/api/router.js
  • src/api/routes/courtlistener.js
  • tests/api/courtlistener-routes.test.js
  • wrangler.jsonc

Comment thread public/openapi.json
Comment on lines +1336 to +1339
"responses": {
"200": { "description": "Search results", "content": { "application/json": { "schema": { "type": "object" } } } },
"503": { "description": "CourtListener token not provisioned" }
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Document the proxy’s 502 error path in the CourtListener operations.

src/api/routes/courtlistener.js returns 502 for upstream fetch failures and invalid JSON on both the GET and POST proxy paths, but these operations only advertise 200/404/503. That leaves /openapi.json out of sync with the real runtime contract.

Also applies to: 1351-1354, 1365-1369, 1380-1384

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@public/openapi.json` around lines 1336 - 1339, OpenAPI currently omits the
proxy's 502 error for CourtListener endpoints; update the CourtListener
operation responses in public/openapi.json to include a 502 response (e.g.,
"description": "Bad Gateway - upstream fetch failed or invalid JSON") for the
same GET and POST proxy operations described in src/api/routes/courtlistener.js
(the proxy handlers that perform upstream fetch/JSON parsing), ensuring each
operation that lists 200/404/503 also lists 502 so the spec matches runtime
behavior.

Comment on lines +77 to +82
return getCredential(
env,
"integrations/courtlistener/api_token",
"COURTLISTENER_API_TOKEN_FALLBACK",
"CourtListener",
);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Remove the undeclared env fallback from token resolution.

src/lib/credential-helper.js falls back to env[fallbackEnvVar] when the broker misses, so passing "COURTLISTENER_API_TOKEN_FALLBACK" silently reopens a third credential path outside the Secrets Store / broker-only policy described for this proxy.

Suggested fix
   return getCredential(
     env,
     "integrations/courtlistener/api_token",
-    "COURTLISTENER_API_TOKEN_FALLBACK",
+    undefined,
     "CourtListener",
   );
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
return getCredential(
env,
"integrations/courtlistener/api_token",
"COURTLISTENER_API_TOKEN_FALLBACK",
"CourtListener",
);
return getCredential(
env,
"integrations/courtlistener/api_token",
undefined,
"CourtListener",
);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/api/routes/courtlistener.js` around lines 77 - 82, getCredential is being
passed an undeclared env fallback ("COURTLISTENER_API_TOKEN_FALLBACK") which
reintroduces a third credential path; remove that fallback argument so the call
uses the broker/secrets-only resolution. Locate the call to getCredential in
courtlistener.js and change the invocation to pass the env, the secret path
("integrations/courtlistener/api_token") and the service name ("CourtListener")
only (i.e., drop the fallbackEnvVar parameter), keeping the getCredential
signature usage consistent with credential-helper.js.

Comment thread src/api/routes/courtlistener.js
Comment on lines +147 to +155
it("propagates upstream non-2xx status and truncates body to 200 chars", async () => {
const longBody = "x".repeat(500);
globalThis.fetch = vi.fn().mockResolvedValue(errorResponse(403, longBody));
const res = await get("/search", env, { query: "q=foo" });
expect(res.status).toBe(403);
const body = await res.json();
expect(body.error).toContain("403");
expect(body.error.length).toBeLessThan(400);
});

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Truncation assertions are too loose to enforce the 200-char contract.

Using length < 400 can still pass if truncation regresses well beyond 200 characters. Assert against the contract directly (e.g., max prefix + 200 body chars) so failures are deterministic.

Suggested tightening
 it("propagates upstream non-2xx status and truncates body to 200 chars", async () => {
   const longBody = "x".repeat(500);
   globalThis.fetch = vi.fn().mockResolvedValue(errorResponse(403, longBody));
   const res = await get("/search", env, { query: "q=foo" });
   expect(res.status).toBe(403);
   const body = await res.json();
   expect(body.error).toContain("403");
-  expect(body.error.length).toBeLessThan(400);
+  expect(body.error).toBe(`CourtListener API 403: ${"x".repeat(200)}`);
 });

 it("returns upstream error with truncation", async () => {
   globalThis.fetch = vi.fn().mockResolvedValue(errorResponse(400, "y".repeat(500)));
   const res = await post("/citation-lookup", env, { body: { text: "garbage" } });
   expect(res.status).toBe(400);
   const body = await res.json();
-  expect(body.error.length).toBeLessThan(400);
+  expect(body.error).toBe(`CourtListener API 400: ${"y".repeat(200)}`);
 });

Also applies to: 272-278

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/api/courtlistener-routes.test.js` around lines 147 - 155, The test uses
a loose length check (expect(body.error.length).toBeLessThan(400)) which doesn't
enforce the 200-char truncation contract; update the assertions in the
"propagates upstream non-2xx status and truncates body to 200 chars" test to (1)
assert the response contains the first 200 characters of longBody
(expect(body.error).toContain(longBody.slice(0,200))), (2) find where that
200-char prefix appears (const prefixIndex =
body.error.indexOf(longBody.slice(0,200));
expect(prefixIndex).toBeGreaterThan(-1)), and (3) assert the total length is at
most that prefix position + 200
(expect(body.error.length).toBeLessThanOrEqual(prefixIndex + 200)), optionally
also assert the 201st char is not present
(expect(body.error).not.toContain(longBody.slice(200,201))). This replaces the
loose expect(body.error.length).toBeLessThan(400) check and uses longBody,
get("/search"...), and body.error to locate and verify truncation.

Comment thread tests/api/courtlistener-routes.test.js Outdated
C1 — Drop `Cache-Control: public, max-age=900` from proxy responses.
The Worker Cache API stores responses internally and does not require
this header for HIT semantics. Emitting `public, max-age=…` would also
instruct downstream browsers/CDNs to cache responses from an
authenticated proxy. Matches google.js.

I1 — Reject unknown resources with `400 { error: "path_not_allowed",
resource }` per spec (was 404). Updated openapi.json accordingly:
generic-resource path now declares 400 with a typed error schema;
{resource}/{id} retains a separate 404 for upstream not-found.

I3 — Defense-in-depth: scrub the resolved API token from upstream
error bodies on both GET and POST paths before truncation. If
CourtListener ever reflected the Authorization header value back in
its error response, the token would no longer leak through the proxy.

I5 — Align getCredential(...) call signature with google.js:
(env, brokerPath, fallbackEnvVar, logPrefix). The fallback env var
name now matches the Secrets Store binding name (was a bogus
`_FALLBACK` suffix), so a single configured env value works whether
delivered as a Secrets Store binding or a plain Worker env var.

I6 — Pin upstream Content-Type to application/json on POST
/citation-lookup. CourtListener's citation-lookup endpoint only
accepts JSON; forwarding an arbitrary client-supplied Content-Type
would let a caller confuse the upstream parser.

Tests
-----
- Update existing 404-allowlist test to assert 400 path_not_allowed +
  resource field.
- Strengthen "never echoes the API token" assertion to verify scrub
  marker `[REDACTED]` and absence of the literal token string.
- Cache-miss test asserts Cache-Control header is now absent (was
  asserting max-age=900).
- New: success-path body does not contain the token; CL 429 status
  propagates verbatim; POST scrubs token in error body; POST pins
  Content-Type to application/json regardless of client header;
  POST never consults Cache API.

Deferred (not blocking):
- I2 (dead citation-lookup in GET allowlist): intentional — the
  allowlist entry causes the generic GET route to match and emit
  405 instead of 400 path_not_allowed, which is the correct method
  signal for a wrong-verb call.
- I4 (cache key not tenant-scoped): intentional and documented inline
  — CourtListener data is public Free Law Project content and
  identical across tenants.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@chatgpt-codex-connector

Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@chitcommit

Copy link
Copy Markdown
Contributor Author

Review-fix pass on feat/courtlistener-proxy — commit 812b076.

Addressed

C1 — Dropped Cache-Control: public, max-age=900 from proxy responses. Worker Cache API stores internally; emitting that header would also instruct downstream browsers/CDNs to cache authenticated-proxy responses. Matches google.js. Cache-miss test updated to assert the header is absent.

I1 — Unknown resources now return 400 { error: "path_not_allowed", resource } (was 404). openapi.json updated: generic-resource path declares a typed 400 schema; {resource}/{id} retains 404 for upstream not-found. Allowlist test updated.

I3 — Added scrubToken(text, token) and applied it on both GET and POST error paths before truncation. If CourtListener ever reflects the Authorization header value in an error body, the token no longer leaks. Existing "never echoes the API token" assertion strengthened from not.toContain("Authorization") to: token literal absent + [REDACTED] marker present.

I5 — Aligned getCredential(env, brokerPath, fallbackEnvVar, logPrefix) signature with google.js. The fallback env var name is now COURTLISTENER_API_TOKEN (was a bogus _FALLBACK suffix that pointed nowhere), so the same configured value works whether delivered as a Secrets Store binding or a plain Worker env var.

I6 — Pinned upstream Content-Type: application/json on POST /citation-lookup regardless of client header. CourtListener's citation-lookup only accepts JSON; forwarding a client-supplied Content-Type would let a caller confuse the upstream parser.

New tests (I7)

  • 400 path_not_allowed includes the offending resource string
  • scrubs the resolved API token from error body (defense-in-depth) (GET)
  • does not leak the token in success-path response bodies
  • propagates upstream 429 rate-limit status to the client
  • scrubs the API token from POST error bodies
  • pins upstream Content-Type to application/json regardless of client header
  • never consults Cache API for POST (POST is not cached)

Deferred (not blocking — intentional)

  • I2citation-lookup in the GET allowlist is intentional. Keeping it means GET /citation-lookup matches the generic route and returns 405 Method Not Allowed (correct verb signal), instead of falling through to 400 path_not_allowed (wrong signal: the path is allowed, the verb isn't).
  • I4 — Cache key not tenant-scoped: intentional and documented inline. CourtListener data is public Free Law Project content and identical across tenants; tenant-scoping would balloon cache cardinality with no isolation benefit.

Validation

Test Files  40 passed | 1 skipped (41)
     Tests  469 passed | 1 skipped (470)

Baseline was 463 passed (+6 new tests, all green). No deploy performed.

@chitcommit

Copy link
Copy Markdown
Contributor Author

Deploy deferred pending credential routing. Blocker tracked in #237COURTLISTENER_API_TOKEN is not reachable from the prod VM via the canonical 1Password → Secrets Store path (no ChittyOS-Integrations vault access, empty Secrets Store binding, stale CHITTY_CONNECT_TOKEN in synthetic-shared).

Code/tests on this PR are merge-ready independently. Will redeploy and re-run /health once #237 resolves. No credential paste path — operator is OPERATOR ONLY per the ChittyConnect credential layer policy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants