Decouple Guardian canonicalization from API instances #190

@MCarlomagno

Summary

Guardian’s current canonicalization flow runs as an in-process background worker inside the server. That works for a single instance, but it becomes a bottleneck and a correctness risk when scaling Guardian horizontally across multiple machines.

We should separate canonicalization from the main request-serving path so multiple Guardian API instances can accept traffic, while a dedicated canonicalization worker is responsible for finalizing candidate deltas.

Problem

Today, canonicalization is driven by per-instance polling and account metadata flags. This creates a few scaling and coordination problems:

  • API instances and canonicalization logic are tightly coupled.
  • Multi-instance deployments risk duplicate processing and race conditions.
  • Candidate creation and canonicalization scheduling are not modeled as a durable shared job flow.
  • Finalization touches multiple pieces of state, so its updates need explicit coordination rather than independent writes.

Goal

Introduce a high-level architecture where:

  • Guardian API instances only validate and persist candidate deltas.
  • Candidate creation also creates a durable canonicalization task in shared storage.
  • A separate canonicalization worker/service consumes those tasks and performs finalization.
  • Canonicalization remains sequential per account and safe across restarts.
  • The system can support multiple Guardian API instances without relying on in-process background jobs.
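
The second goal (candidate creation also creates a durable task) could be sketched roughly as below. This is only an illustration under assumptions: SQLite stands in for the shared Postgres store, and the table names, column names, and the `submit_candidate` helper are hypothetical, not Guardian's actual schema or API. The point it shows is that the candidate row and its task row are written in one transaction, so neither can exist without the other.

```python
import sqlite3

# SQLite stands in for shared Postgres storage. All table and column
# names here are hypothetical, chosen only for illustration.
SCHEMA = """
CREATE TABLE candidate_deltas (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    account_id TEXT NOT NULL,
    payload TEXT NOT NULL
);
CREATE TABLE canonicalization_tasks (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    delta_id INTEGER NOT NULL UNIQUE REFERENCES candidate_deltas(id),
    account_id TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'pending'
);
"""


def open_store(path=":memory:"):
    """Open a connection and create the two hypothetical tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def submit_candidate(conn, account_id, payload):
    """Persist an already-validated candidate delta plus its task atomically."""
    with conn:  # commits if the block succeeds, rolls back on any exception
        cur = conn.execute(
            "INSERT INTO candidate_deltas (account_id, payload) VALUES (?, ?)",
            (account_id, payload),
        )
        delta_id = cur.lastrowid
        conn.execute(
            "INSERT INTO canonicalization_tasks (delta_id, account_id) VALUES (?, ?)",
            (delta_id, account_id),
        )
    return delta_id
```

With this shape, an API instance never has to signal a worker directly; the task row in shared storage is the hand-off.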

Expected outcome

After this change, we should be able to run:

  • many Guardian API nodes for request handling
  • one dedicated canonicalization worker initially
  • a path to safely support multiple workers later through proper claiming/locking semantics
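
One possible shape for the claiming semantics mentioned in the last item is sketched below. It is a minimal illustration, not a proposed implementation: SQLite stands in for Postgres, and the `tasks` table, its status values, and the helper names are all assumptions. A task is claimable only when no earlier task for the same account is still pending or claimed, and the claim itself is a guarded `UPDATE` so that two workers cannot both take the same row. On Postgres, `SELECT ... FOR UPDATE SKIP LOCKED` is a commonly used alternative for the same purpose.

```python
import sqlite3


def open_tasks(path=":memory:"):
    """Create a hypothetical task table (names are illustrative only)."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE tasks (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               account_id TEXT NOT NULL,
               status TEXT NOT NULL DEFAULT 'pending',
               claimed_by TEXT)"""
    )
    return conn


def enqueue(conn, account_id):
    with conn:
        cur = conn.execute("INSERT INTO tasks (account_id) VALUES (?)", (account_id,))
    return cur.lastrowid


def claim_next(conn, worker_id):
    """Claim the oldest task whose account has no earlier unfinished task."""
    with conn:
        row = conn.execute(
            """SELECT t.id, t.account_id FROM tasks t
               WHERE t.status = 'pending'
                 AND NOT EXISTS (
                     SELECT 1 FROM tasks o
                     WHERE o.account_id = t.account_id
                       AND o.id < t.id
                       AND o.status IN ('pending', 'claimed'))
               ORDER BY t.id LIMIT 1"""
        ).fetchone()
        if row is None:
            return None
        # Guarded update: succeeds only if nobody claimed the row in between.
        cur = conn.execute(
            "UPDATE tasks SET status = 'claimed', claimed_by = ? "
            "WHERE id = ? AND status = 'pending'",
            (worker_id, row[0]),
        )
        if cur.rowcount != 1:
            return None
    return row


def complete(conn, task_id):
    with conn:
        conn.execute("UPDATE tasks SET status = 'done' WHERE id = ?", (task_id,))
```

A usage example: with two tasks queued for account `A` and one for account `B`, a first worker gets the first `A` task, a second worker gets the `B` task, and the second `A` task stays unclaimable until the first is completed.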

Acceptance criteria

  • Canonicalization execution is no longer tied to every Guardian API process.
  • Candidate persistence and canonicalization task creation are coordinated durably.
  • A worker can recover unfinished canonicalization tasks after restart.
  • Finalization updates are applied consistently and do not leave partially completed state.
  • Per-account ordering is preserved during canonicalization.
  • Failed canonicalizations end in an explicit terminal or retryable state.
  • The design is validated for multi-instance deployment with Postgres/shared storage.
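
Several of these criteria (restart recovery, no partially applied finalization, explicit terminal or retryable states) can be illustrated together. The sketch below assumes a single worker, uses SQLite in place of Postgres, and invents the `tasks` and `canonical_state` tables, the status names, and `MAX_ATTEMPTS`; none of these come from the Guardian codebase. Finalization writes and the `done` marker share one transaction, so a failure rolls both back.

```python
import sqlite3

MAX_ATTEMPTS = 3  # hypothetical retry budget before a task becomes terminal


def open_tasks(path=":memory:"):
    """Create hypothetical task and state tables (illustrative names)."""
    conn = sqlite3.connect(path)
    conn.executescript(
        """CREATE TABLE tasks (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               account_id TEXT NOT NULL,
               status TEXT NOT NULL DEFAULT 'pending',
               attempts INTEGER NOT NULL DEFAULT 0,
               last_error TEXT);
           CREATE TABLE canonical_state (
               account_id TEXT PRIMARY KEY,
               value TEXT NOT NULL);"""
    )
    return conn


def enqueue(conn, account_id):
    with conn:
        cur = conn.execute("INSERT INTO tasks (account_id) VALUES (?)", (account_id,))
    return cur.lastrowid


def recover_interrupted(conn):
    """On worker startup, return tasks left 'running' by a crash to 'pending'.

    Assumes a single worker; with several workers a lease would be needed.
    """
    with conn:
        cur = conn.execute("UPDATE tasks SET status = 'pending' WHERE status = 'running'")
    return cur.rowcount


def run_task(conn, task_id, finalize):
    """Run one task; returns 'done', 'pending' (retryable), 'failed', or 'skipped'."""
    with conn:
        cur = conn.execute(
            "UPDATE tasks SET status = 'running', attempts = attempts + 1 "
            "WHERE id = ? AND status = 'pending'",
            (task_id,),
        )
    if cur.rowcount != 1:
        return "skipped"
    try:
        with conn:  # finalization writes and the 'done' marker commit together
            finalize(conn, task_id)
            conn.execute("UPDATE tasks SET status = 'done' WHERE id = ?", (task_id,))
        return "done"
    except Exception as exc:
        with conn:
            (attempts,) = conn.execute(
                "SELECT attempts FROM tasks WHERE id = ?", (task_id,)
            ).fetchone()
            status = "failed" if attempts >= MAX_ATTEMPTS else "pending"
            conn.execute(
                "UPDATE tasks SET status = ?, last_error = ? WHERE id = ?",
                (status, str(exc), task_id),
            )
        return status
```

The `finalize` callback receives the same connection so that whatever it writes is part of the transaction that marks the task `done`; if it raises, its writes are rolled back and the task is either retried or marked `failed`.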
