[§4 step 2 post-pivot] Consume LML POST /api/v1/identity/bulk-resolve-libraries #802

@jakebromberg

Goal

Replace the retired source-leg backfill job (jobs/library-identity-backfill/) with a thin consumer of LML's new POST /api/v1/identity/bulk-resolve-libraries endpoint. Per the architecture pivot (#800), Backend stops reading LML's discogs-cache PG directly; it caches LML's verdict via the HTTP contract.

Scope

Build a one-shot ETL job (in the same jobs/ shape as the existing artist-identity-etl) that:

  1. Selects libraries needing identity refresh: WHERE library.canonical_entity_id IS NULL OR library.id IN (SELECT library_id FROM library_identity WHERE last_refreshed_at < NOW() - interval '7 days'). The selection is incremental and bounded: only unresolved rows and rows not refreshed in the last 7 days are picked up.
  2. POSTs them to LML in batches of 500 with (artist_name, album_title) denormalized.
  3. UPSERTs the response into library_identity (one row per library_id) and library_identity_source (one row per provenance entry).
  4. Emits Sentry-traced metrics: rows_resolved / rows_unresolved / rows_skipped / lml_latency.
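
Step 2 above could be sketched roughly as follows. This is only an illustration in Python, not the job's actual code: the request body shape ({"libraries": [...]}) and the row keys are assumptions, since the v0.7 contract is not reproduced in this issue. Only the batch size of 500 and the (artist_name, album_title) fields come from the scope above.

```python
from itertools import islice
from typing import Any, Dict, Iterable, Iterator, List

BATCH_SIZE = 500  # batch size stated in the Scope section

Row = Dict[str, Any]


def batched(rows: Iterable[Row], size: int = BATCH_SIZE) -> Iterator[List[Row]]:
    """Yield consecutive chunks of at most `size` rows."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def build_payload(batch: List[Row]) -> Dict[str, Any]:
    """Build one request body for a batch.

    Hypothetical shape: the top-level "libraries" key and the
    "library_id" field name are assumptions, not the v0.7 contract.
    """
    return {
        "libraries": [
            {
                "library_id": row["library_id"],
                "artist_name": row["artist_name"],
                "album_title": row["album_title"],
            }
            for row in batch
        ]
    }
```

Each payload would then be sent as one POST to /api/v1/identity/bulk-resolve-libraries, so a run over 1,201 selected libraries would issue three requests (500, 500, and 201 rows).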

What we keep from the retired job

  • The library_identity + library_identity_source schemas (Backend's cache of LML's verdict).
  • The artists table mirror shape (for in-process reads from the request hot path).
  • The DRY_RUN env var pattern (locked JSON output for stage/prod parity).
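
One way to read the DRY_RUN pattern, sketched in Python under stated assumptions: the job checks the env var, skips writes when it is set, and prints a summary with a fixed key set so stage and prod output can be compared directly. The accepted truthy values and the summary keys (borrowed from the metrics named in Scope) are hypothetical; the retired job's actual schema is not shown in this issue.

```python
import json
import os
from typing import Mapping, Optional

# Hypothetical fixed key set for the dry-run summary.
COUNT_KEYS = ("rows_resolved", "rows_unresolved", "rows_skipped")


def is_dry_run(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when DRY_RUN is set to a truthy value (assumed values)."""
    env = os.environ if env is None else env
    return env.get("DRY_RUN", "").strip().lower() in ("1", "true", "yes")


def render_summary(counts: Mapping[str, int], dry_run: bool) -> str:
    """Serialize a summary whose keys never vary between runs."""
    summary = {"dry_run": dry_run}
    for key in COUNT_KEYS:
        # Missing counters are reported as 0 so the schema stays stable.
        summary[key] = int(counts.get(key, 0))
    return json.dumps(summary, sort_keys=True)
```

Because every key is always present and sorted, two runs can be diffed as plain text.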

What we remove

  • jobs/library-identity-backfill/ and its source-leg readers.
  • The BACKFILL_LEG dispatcher.
  • All cross-DB connection setup (DATABASE_URL_DISCOGS).

Acceptance criteria

  • New job in jobs/library-identity-consumer/ (name TBD) wired into Manual Build & Deploy.
  • Consumes POST /api/v1/identity/bulk-resolve-libraries per the v0.7 contract.
  • Idempotent: rerunning is safe; UPSERT prevents dup rows.
  • DRY_RUN locked JSON schema; integration test covers happy path + retry path + unresolved row.
  • Old jobs/library-identity-backfill/ deleted in the same PR (no tombstone scripts).
  • CLAUDE.md + docs/env-vars.md updated.
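
The idempotency criterion can be illustrated with a small Python sketch. It uses an in-memory SQLite database as a stand-in for the Backend's Postgres, and the column list for library_identity is an assumption (only library_id and last_refreshed_at appear in this issue). The ON CONFLICT ... DO UPDATE form shown here is also valid Postgres syntax.

```python
import sqlite3
from typing import Iterable, Optional, Tuple

IdentityRow = Tuple[int, Optional[str], str]

# Hypothetical columns; one row per library_id as stated in Scope.
UPSERT_SQL = """
INSERT INTO library_identity (library_id, canonical_entity_id, last_refreshed_at)
VALUES (?, ?, ?)
ON CONFLICT (library_id) DO UPDATE SET
    canonical_entity_id = excluded.canonical_entity_id,
    last_refreshed_at = excluded.last_refreshed_at
"""


def upsert_identities(conn: sqlite3.Connection, rows: Iterable[IdentityRow]) -> None:
    """Insert or update one row per library_id inside a single transaction."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(UPSERT_SQL, list(rows))


def demo() -> int:
    """Run the same batch twice and return the resulting row count."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE library_identity ("
        "library_id INTEGER PRIMARY KEY, "
        "canonical_entity_id TEXT, "
        "last_refreshed_at TEXT NOT NULL)"
    )
    rows = [
        (1, "entity-a", "2024-01-01T00:00:00Z"),
        (2, None, "2024-01-01T00:00:00Z"),  # an unresolved row
    ]
    upsert_identities(conn, rows)
    upsert_identities(conn, rows)  # rerun: updates in place, no duplicates
    count = conn.execute("SELECT COUNT(*) FROM library_identity").fetchone()[0]
    conn.close()
    return count
```

An integration test for the rerun case could assert that the row count is unchanged after a second pass over the same LML response.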

Sequencing

Blocked by:

Cannot ship until both are in place.

Related
