feat: PostgreSQL-backed cross-replica Data Cache handler (P2) #539
Merged
Make admin changes propagate across replicas instantly. PicImpact's public read path is memoised with unstable_cache + revalidateTag (server/lib/cache.ts), but Next's default cache handler keeps that data per-instance, so a revalidateTag on the replica handling an admin write doesn't reach the others.

- Add a custom Next cacheHandler (server/lib/pg-cache-handler.cjs) backed by Postgres: cached values + per-tag invalidation timestamps live in shared tables (next_cache_entries / next_cache_tags, auto-created), with a bounded module-level in-memory L1 for hot reads. Wire it in next.config.mjs and set cacheMaxMemorySize: 0 so Next's per-instance memory cache doesn't mask it.
- Cross-instance invalidation works by get() returning a miss when any of an entry's tags was invalidated after the entry was written. The tag manifest is Postgres-authoritative: refreshed on a short TTL (so a missed NOTIFY or a transaction-pooled connection that can't LISTEN still converges within ~1s) and busted instantly via LISTEN/NOTIFY when a direct connection is available (uses DIRECT_URL for the listener). Ordering uses the Postgres server clock for both writes and invalidations, so replica clock skew can't miss an invalidation. Consistency is instant (a revalidated tag forces recompute on the next read on every replica); per-entry safety-net TTLs are left to Next.
- Add a long-unused-entry sweep (server/lib/cache-cleanup.ts) driven by the same single external cron as the preprocess tick, so it never runs as a per-replica timer.
- Add pg (and @types/pg); pg was already declared in serverExternalPackages.
- Document the shared cache + DIRECT_URL/CACHE_HANDLER_DEBUG in docs/multi-replica.md and .env.example.

Verified with a 2-instance build against Postgres: shared reads, cross-replica invalidation, cold-start reload after a missed NOTIFY, and a stable steady state.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
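The miss rule described above can be sketched in a few lines. This is an illustration only, not the code in server/lib/pg-cache-handler.cjs: the function name `readEntry`, the entry shape (`value`, `tags`, `writtenAt`), the manifest as a `Map`, and the tag name `albums` are all assumptions made for the example. Both timestamps are assumed to come from the Postgres server clock, as the commit message states.

```javascript
// Hypothetical sketch: names and shapes are illustrative, not the PR's code.

/**
 * Decide whether a cached entry may still be served.
 * @param {{ value: unknown, tags: string[], writtenAt: number } | undefined} entry
 *        writtenAt: time the entry was written (ms, assumed Postgres clock)
 * @param {Map<string, number>} tagManifest
 *        tag -> time of its last invalidation (ms, assumed Postgres clock)
 * @returns {unknown | null} the cached value, or null to signal a miss
 */
function readEntry(entry, tagManifest) {
  if (!entry) return null; // nothing stored: plain miss

  for (const tag of entry.tags) {
    const invalidatedAt = tagManifest.get(tag);
    // Any tag invalidated after the write makes the whole entry stale.
    if (invalidatedAt !== undefined && invalidatedAt > entry.writtenAt) {
      return null;
    }
  }
  return entry.value;
}

// Example: the "albums" tag was invalidated at t=200.
const manifest = new Map([["albums", 200]]);
const stale = readEntry({ value: "old", tags: ["albums"], writtenAt: 100 }, manifest);
const fresh = readEntry({ value: "new", tags: ["albums"], writtenAt: 300 }, manifest);
```

Returning a miss makes the caller recompute and write a new entry with a later timestamp, which is how the other replicas pick up the change without being told directly.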
- docs/multi-replica.md §2: account for the handler's own pool (max 4) + the persistent LISTEN client (~+5 connections per replica) in the max_connections budgeting, and cross-reference the DIRECT_URL requirement for LISTEN.
- cache-cleanup.ts: note that only next_cache_entries is swept; next_cache_tags is a fixed small enum today, but flag that unbounded tags (e.g. per-image) would need their own sweep.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
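The budgeting arithmetic from the first bullet can be written out as a small helper. Only the handler pool size (max 4) and the single LISTEN client come from the commit message; the function name, the application pool size, the replica count, and the reserved headroom below are made-up inputs for illustration.

```javascript
// Hypothetical sketch of the max_connections budgeting; inputs are examples.

const HANDLER_POOL_MAX = 4; // the cache handler's own pool (from the commit message)
const LISTEN_CLIENTS = 1;   // one persistent LISTEN client per replica

/**
 * Estimate how many Postgres connections a deployment may hold at peak.
 * @param {{ replicas: number, appPoolPerReplica: number, reserved?: number }} opts
 *        appPoolPerReplica: the application's own pool size (assumed value)
 *        reserved: headroom for migrations, admin sessions, etc. (assumed value)
 */
function requiredConnections({ replicas, appPoolPerReplica, reserved = 0 }) {
  const perReplica = appPoolPerReplica + HANDLER_POOL_MAX + LISTEN_CLIENTS;
  return replicas * perReplica + reserved;
}

// 3 replicas x (10 app + 4 handler + 1 listener) + 3 reserved
const budget = requiredConnections({ replicas: 3, appPoolPerReplica: 10, reserved: 3 });
console.log(budget);
```

The result should stay below the server's max_connections; if it does not, lower the pool sizes or the replica count.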
Cross-replica cancellation is already implemented, not a follow-up: the cancel endpoint sets status='cancelling' (atomic, cross-replica), and the running replica re-reads it at its per-image checkpoint and stops within ~one item, not a lease window. Correct §5, which previously described it pessimistically.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
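A per-image checkpoint of this kind might look like the loop below. It is a sketch under assumptions: `runTask`, `getStatus`, and `processImage` are hypothetical names, and the status lookup is injected as a function so the example runs without a database. Only the `'cancelling'` status value comes from the commit message.

```javascript
// Hypothetical sketch of a per-image cancellation checkpoint.

/**
 * Process images one by one, re-reading the shared task status before each.
 * @param {string[]} imageIds
 * @param {{ getStatus: () => Promise<string>, processImage: (id: string) => Promise<void> }} deps
 *        getStatus stands in for a read of the task row in Postgres (assumed).
 */
async function runTask(imageIds, { getStatus, processImage }) {
  const processed = [];
  for (const id of imageIds) {
    // Checkpoint: another replica may have set status='cancelling' meanwhile.
    if ((await getStatus()) === "cancelling") {
      return { outcome: "cancelled", processed };
    }
    await processImage(id);
    processed.push(id);
  }
  return { outcome: "completed", processed };
}
```

Because the status is re-read before every item, a cancel request is honoured after at most the item already in flight, which matches the "within ~one item" bound stated above.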
Summary
Delivers instant cross-replica consistency for the public Data Cache, the core of multi-replica readiness. PicImpact memoises its public read path with unstable_cache + revalidateTag (server/lib/cache.ts), but Next's default cache handler keeps that data per-instance, so a revalidateTag on the replica handling an admin write doesn't reach the others; they serve stale data until the safety-net TTL expires. This routes the Data Cache through Postgres so an invalidation on one replica is seen by all.

No application-layer changes: server/lib/cache.ts is untouched; wiring a custom cacheHandler is enough for the existing unstable_cache calls to flow through it.

How it works
- server/lib/pg-cache-handler.cjs: a custom Next cacheHandler.
  - Cached values and per-tag invalidation timestamps live in Postgres tables (next_cache_entries + next_cache_tags, auto-created), shared by all replicas; L1 = a bounded, module-level in-memory cache for hot reads (the handler is instantiated per request, so shared state lives at module scope).
  - get() returns a miss when any of an entry's tags was invalidated after the entry was written. Next's own in-process tag manifest can't be fed from another replica, but it only runs when get() returns data, so get() is made the decision point, consulting the shared manifest.
  - The tag manifest is Postgres-authoritative: refreshed on a short TTL (so a missed NOTIFY, or a transaction-pooled connection that can't LISTEN, still converges within ~1s) and busted instantly via LISTEN/NOTIFY when a direct connection is available (uses DIRECT_URL for the listener, since LISTEN needs a persistent session).
  - Consistency is instant: a revalidateTag forces a recompute on the next read on every replica. Cross-instance stale-while-revalidate is intentionally not implemented (it would let a replica serve one stale read after an admin change); per-entry safety-net TTLs (revalidate seconds) are still honored by Next.
- next.config.mjs: wires the handler and sets cacheMaxMemorySize: 0 so Next's per-instance memory cache doesn't mask it.
- server/lib/cache-cleanup.ts: a long-unused-entry sweep, driven by the same single external cron as the preprocess tick (/api/v1/preprocess-tasks/tick), so it never runs as a per-replica timer.
- Adds pg + @types/pg (pg was already declared in serverExternalPackages).
- Documents the shared cache, DIRECT_URL usage, and CACHE_HANDLER_DEBUG in docs/multi-replica.md and .env.example.

Notes
unstable_cache (FETCH-kind) entries currently flow through the handler; PicImpact's routes are all dynamic (no ISR) and image optimization keeps its own cache. The handler stores any kind generically and never throws into the request path.

Test plan
- tsc --noEmit clean for the changed files; next build passes with the handler wired.
- Two instances (next start x2, one database):
  - revalidateTag on A → next read on B recomputes (cross-replica invalidation);
  - revalidateTag on A (B misses the NOTIFY), restart B → B recomputes (cold-start reload makes Postgres authoritative);
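The missed-NOTIFY and cold-start cases in the test plan can be modelled in memory to show why the manifest still converges. This is a simplified sketch, not the PR's handler: `SharedTags` stands in for the next_cache_tags table, `ReplicaManifest` for one replica's view of it, time is passed in explicitly instead of read from the Postgres clock, and the 1000 ms TTL is an assumed value chosen to match the "~1s" figure.

```javascript
// Hypothetical in-memory model of the TTL-refreshed tag manifest.

// Stand-in for the shared next_cache_tags table: tag -> invalidation time (ms).
class SharedTags {
  constructor() { this.rows = new Map(); }
  invalidate(tag, now) { this.rows.set(tag, now); }
  snapshot() { return new Map(this.rows); }
}

// One replica's cached copy of the manifest.
class ReplicaManifest {
  constructor(shared, ttlMs = 1000) {
    this.shared = shared;
    this.ttlMs = ttlMs;
    this.local = new Map();
    this.loadedAt = -Infinity; // forces a load on first use (cold start)
  }

  // Would be called from a NOTIFY handler when a listener connection exists.
  bust() { this.loadedAt = -Infinity; }

  // Returns the tag's last invalidation time, reloading if the copy is too old.
  invalidatedAt(tag, now) {
    if (now - this.loadedAt >= this.ttlMs) {
      this.local = this.shared.snapshot();
      this.loadedAt = now;
    }
    return this.local.get(tag);
  }
}

const shared = new SharedTags();
const replicaB = new ReplicaManifest(shared);

replicaB.invalidatedAt("albums", 0);              // B loads an empty manifest
shared.invalidate("albums", 100);                 // replica A revalidates; B misses the NOTIFY
const beforeTtl = replicaB.invalidatedAt("albums", 500);  // still B's old copy
const afterTtl = replicaB.invalidatedAt("albums", 1000);  // TTL elapsed, B reloads
```

A restarted replica behaves like a fresh `ReplicaManifest`: its first read loads from the shared table, so it sees the invalidation regardless of which notifications it missed.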