feat: multi-replica readiness (P1 hardening) #538

Merged: Zheaoli merged 1 commit into main from feat/multi-replica-p1-hardening on Jun 22, 2026

Zheaoli (Collaborator) commented Jun 22, 2026

Summary

First phase of multi-replica readiness: lets PicImpact run as multiple replicas behind a load balancer. Most state is already shared and safe (DB-backed sessions; the preprocess task queue is claimed via a PostgreSQL advisory lock + lease), so this PR only addresses the per-instance concerns that matter when scaling past one replica. No behavior change for single-instance deployments.

Changes

  • auth — shorten signed cookie-cache TTL 30m → 60s (server/auth/index.ts). The better-auth cookie cache is per-instance in-memory state, so a long TTL let a revoked session keep validating on a replica that still held the cached cookie. 60s bounds cross-replica revocation lag while still absorbing repeat reads.
  • preprocess tick — optional shared-secret gate (hono/preprocess-tasks.ts). POST /api/v1/preprocess-tasks/tick is the external-cron driver used in multi-replica setups, so it can be reached publicly. When PREPROCESS_TICK_SECRET is set, callers must send it in the x-preprocess-tick-secret header (constant-time compared, never logged); otherwise 401. When the env var is unset the endpoint stays open, so single-instance / internal-ticker deployments are unaffected.
  • docs — add docs/multi-replica.md (single-driver ticker via PREPROCESS_TICKER_ENABLED=false + one external cron, DB connection budgeting, session-revocation latency, interim per-replica data-cache behavior) and document the new env vars in .env.example.
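
The shared-secret gate described above could be sketched as follows. This is a minimal illustration that assumes Node's `node:crypto` module; `isTickAuthorized` is a hypothetical helper, not the code in hono/preprocess-tasks.ts, and treating an empty env value as "unset" is an assumption as well.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hypothetical helper: decides whether a tick request may proceed.
// configuredSecret stands for process.env.PREPROCESS_TICK_SECRET,
// headerValue for the x-preprocess-tick-secret request header.
function isTickAuthorized(
  configuredSecret: string | undefined,
  headerValue: string | null | undefined,
): boolean {
  // Env var unset (assumed: or empty): the gate is off and the endpoint stays open.
  if (!configuredSecret) return true;
  // Gate is on but the caller sent no header: reject (the route would answer 401).
  if (!headerValue) return false;
  // timingSafeEqual requires equal-length inputs, so compare SHA-256 digests
  // of both values instead of the raw strings.
  const expected = createHash("sha256").update(configuredSecret).digest();
  const provided = createHash("sha256").update(headerValue).digest();
  return timingSafeEqual(expected, provided);
}
```

Comparing digests rather than raw strings keeps the comparison constant-time even when the header has a different length than the secret. Neither value should be written to logs.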

Notes

  • The shared cross-replica data cache (a Postgres-backed Next cache handler) is a separate follow-up; until it lands, each replica keeps its own in-memory Data Cache and admin changes propagate to other replicas within the existing safety-net TTLs (gallery ~60s, albums/config ~1h). This is documented in docs/multi-replica.md.
  • Out of scope (tracked separately): /api/v1/* and /admin currently rely on client-side auth only; server-side enforcement is a pre-existing gap unrelated to multi-replica and left as-is per maintainer direction. requireAuth already exists in hono/_lib/context.ts if/when we wire it up.
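
To illustrate why an admin change can take up to one TTL to reach other replicas, here is a small sketch of a per-replica in-memory cache with a time-to-live. `TtlCache` and the values used are hypothetical stand-ins, not the Next.js Data Cache itself; the 60-second TTL mirrors the gallery figure quoted above.

```typescript
// Hypothetical per-replica cache: each replica holds its own instance in memory.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Return the cached value while it is fresh; otherwise reload and re-cache it.
  get(key: string, load: () => V): V {
    const hit = this.entries.get(key);
    if (hit && hit.expiresAt > this.now()) return hit.value;
    const value = load();
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    return value;
  }
}

// Simulated clock and shared database value.
let clockMs = 0;
let dbTitle = "Old title";

// Replica B caches the gallery title with a 60s TTL.
const replicaB = new TtlCache<string>(60_000, () => clockMs);
replicaB.get("gallery", () => dbTitle);

// An admin change lands through another replica; replica B is not notified.
dbTitle = "New title";

clockMs = 30_000;
const staleRead = replicaB.get("gallery", () => dbTitle); // still inside the TTL

clockMs = 60_001;
const freshRead = replicaB.get("gallery", () => dbTitle); // TTL expired, reloads
```

Here `staleRead` still holds the old value because replica B's entry has not expired, while `freshRead` picks up the change once the TTL has elapsed.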

Test plan

  • next build passes; tsc --noEmit clean for the changed files.
  • Tick secret: with PREPROCESS_TICK_SECRET unset, POST /tick works as before; with it set, a request with a missing or wrong header returns 401, and a request with the matching header succeeds.
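
The tick-secret check above could be scripted roughly like this. It is a sketch that assumes a runtime with the standard `fetch`, `Request`, and `Headers` globals (for example Node 18+); `buildTickRequest` and `checkTickGate` are hypothetical names, and the base URL and secret passed in are placeholders.

```typescript
// Path and header name come from the PR description.
const TICK_PATH = "/api/v1/preprocess-tasks/tick";
const SECRET_HEADER = "x-preprocess-tick-secret";

// Build a POST request to the tick endpoint, optionally carrying the secret header.
function buildTickRequest(baseUrl: string, secret?: string): Request {
  const headers = new Headers();
  if (secret !== undefined) headers.set(SECRET_HEADER, secret);
  return new Request(new URL(TICK_PATH, baseUrl).toString(), { method: "POST", headers });
}

// Send the three variants against a server started with PREPROCESS_TICK_SECRET=<secret>
// and return a description of every case that did not behave as expected.
async function checkTickGate(baseUrl: string, secret: string): Promise<string[]> {
  const cases = [
    { name: "no header", request: buildTickRequest(baseUrl), expectRejected: true },
    { name: "wrong header", request: buildTickRequest(baseUrl, secret + "-wrong"), expectRejected: true },
    { name: "matching header", request: buildTickRequest(baseUrl, secret), expectRejected: false },
  ];
  const failures: string[] = [];
  for (const c of cases) {
    const res = await fetch(c.request);
    // Rejected cases must answer 401; the matching header must yield a 2xx response.
    const ok = c.expectRejected ? res.status === 401 : res.ok;
    if (!ok) failures.push(`${c.name}: got HTTP ${res.status}`);
  }
  return failures;
}
```

An external cron job would send only the "matching header" variant; an empty array from `checkTickGate` means all three cases behaved as the test plan describes.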

Prepare PicImpact to run as multiple replicas behind a load balancer.

- auth: shorten the better-auth signed cookie-cache TTL from 30m to 60s.
  The cookie cache is per-instance in-memory state, so a long TTL let a
  revoked session keep validating on a replica that still held the cached
  cookie. 60s bounds cross-replica revocation lag while still absorbing
  repeat reads.
- preprocess tick: add an optional shared-secret gate on
  POST /api/v1/preprocess-tasks/tick via PREPROCESS_TICK_SECRET +
  x-preprocess-tick-secret header (constant-time compared, never logged).
  This is the public-cron driver path for multi-replica; when the env var
  is unset the endpoint stays open so single-instance / internal-ticker
  deployments are unaffected.
- docs: add docs/multi-replica.md covering single-driver ticker
  (PREPROCESS_TICKER_ENABLED=false + one external cron), DB connection
  budgeting (N x pool), session revocation latency, and the interim
  per-replica data-cache behavior. Document the new env vars in .env.example.
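
The "N x pool" budgeting mentioned above comes down to simple arithmetic, sketched here. The function names, the reserved-headroom parameter, and the numbers in the usage note are illustrative assumptions and are not taken from docs/multi-replica.md.

```typescript
// Total connections the app can open: one pool per replica.
function totalConnections(replicas: number, poolSizePerReplica: number): number {
  return replicas * poolSizePerReplica;
}

// True when N x pool fits under the database limit, leaving some headroom
// (assumed here for migrations, cron jobs, and admin sessions).
function fitsBudget(
  replicas: number,
  poolSizePerReplica: number,
  maxConnections: number,
  reserved: number,
): boolean {
  return totalConnections(replicas, poolSizePerReplica) <= maxConnections - reserved;
}

// Largest per-replica pool size that still fits the budget.
function maxPoolSizePerReplica(
  replicas: number,
  maxConnections: number,
  reserved: number,
): number {
  return Math.floor((maxConnections - reserved) / replicas);
}
```

For example, with a database limit of 100 connections and 10 reserved, 3 replicas with a pool of 10 fit (30 of 90), 10 replicas with the same pool do not (100 of 90), and 4 replicas could each use a pool of at most 22.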

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

vercel Bot commented Jun 22, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| picimpact | Ready | Preview, Comment | Jun 22, 2026 11:25am |

@Zheaoli Zheaoli merged commit afe2dad into main Jun 22, 2026
6 checks passed