Skip to content

Fix keyword-merge 500 on legacy sort_value column (#1352)#1362

Merged
jonfroehlich merged 1 commit into
masterfrom
1352-keyword-merge-sortvalue-fix
Jun 21, 2026
Merged

Fix keyword-merge 500 on legacy sort_value column (#1352)#1362
jonfroehlich merged 1 commit into
masterfrom
1352-keyword-merge-sortvalue-fix

Conversation

@jonfroehlich

Copy link
Copy Markdown
Member

Follow-up bugfix to #1361. The "Merge selected keywords" action 500'd on -test with:

IntegrityError: null value in column "sort_value" of relation
"website_publication_keywords" violates not-null constraint

Root cause

The deployed website_*_keywords through tables carry a leftover NOT-NULL sort_value column — keywords used to be a SortedManyToManyField (like authors/project_umbrellas). So obj.keywords.add(target), which INSERTs a fresh through row, fails the constraint. settings_test builds the schema from the current plain-ManyToManyField models, so the column isn't present locally — which is exactly why the original tests passed but prod/-test failed.

Fix

Merge at the through-table level: repoint each existing (object, source) row to (object, target) (an UPDATE that preserves sort_value) instead of inserting a new row. If the object already carries the target — including when it's tagged with two source keywords at once — the source row is deleted so no duplicate (object, keyword) pair is created.

Tests

  • New regression test reproduces the deployed drift by adding a NOT-NULL sort_value column to the through table before merging (SET CONSTRAINTS ALL IMMEDIATE to clear the deferred FK triggers first), and asserts the merge succeeds.
  • New two-sources-on-one-object dedup test.
  • Full suite green: test_keyword_merge + test_data_health + test_admin_changelist (50 tests).

🤖 Generated with Claude Code

The keyword-merge action 500'd on -test/prod with "null value in column
sort_value violates not-null constraint". The deployed website_*_keywords
through tables carry a leftover NOT-NULL sort_value column (keywords used to be
a SortedManyToManyField), so obj.keywords.add(target) — which INSERTs a fresh
through row — fails the constraint. settings_test builds the schema from the
current plain-M2M models, so the column isn't present locally and the original
tests couldn't catch it.

Fix: merge at the through-table level by repointing each existing
(object, source) row to (object, target) — an UPDATE that preserves the row's
sort_value — instead of inserting a new row. If the object already has the
target (including when it's tagged with two sources at once), the source row is
deleted so no duplicate (object, keyword) pair is created.

Adds a regression test that reproduces the deployed drift by adding a NOT-NULL
sort_value column to the through table before merging, plus a two-sources-on-
one-object dedup test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@jonfroehlich

Copy link
Copy Markdown
Member Author

Root cause verified against the prod dump (2026-06-14)

Confirmed this is a legacy SortedManyToManyField artifact on one through table, not a guess:

  • Of the six website_*_keywords tables, only website_publication_keywords has a sort_value column (nullable on prod). The other five have none — proving a plain M2M never creates it.
  • In that table's data: 986 of 1,215 rows have sequential sort_values (1..N per publication) — the signature of SortedManyToManyField — and 229 are NULL (newer plain .add()s, which prod allows because its column is nullable).

So: the field was once SortedManyToManyField, was reverted to plain ManyToManyField, and the column was left behind. -test's copy of the column is NOT NULL (env migration drift), which is why the merge 500'd there but not on prod.

The repoint approach in this PR is correct for both the nullable (prod) and NOT NULL (-test) variants.

Related latent bug (separate follow-up)

On -test, any INSERT into website_publication_keywords fails the NOT NULL constraint — that includes editing a Publication's keywords in the admin, not just this merge action. The durable cleanup is to drop the vestigial sort_value column so the schema matches the model everywhere; tracking that separately since it's a guarded schema migration with prod-deploy risk.

@jonfroehlich

Copy link
Copy Markdown
Member Author

The deeper cleanup (dropping the vestigial sort_value column, which also fixes the latent publication-keyword admin-edit 500 on -test) is now tracked in #1363. This PR remains the immediate, schema-agnostic fix for the merge action.

@jonfroehlich jonfroehlich merged commit 12f923f into master Jun 21, 2026
3 checks passed
jonfroehlich added a commit that referenced this pull request Jun 21, 2026
Keyword taxonomy cleanup (#1352): "Merge selected keywords" admin action,
"Duplicate keywords" Data Health check with deep links to the merge changelist,
and whitespace normalization on Keyword.save(). Merge operates at the M2M
through-table level for compatibility with the legacy sort_value column on
website_publication_keywords (#1362).

Admin hygiene (#1346): Photo changelist no longer 500s on a missing image file
and surfaces the owning project. Plus SEO on-page fixes (duplicate <title>,
/None sponsor links, YouTube embed).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@jonfroehlich jonfroehlich deleted the 1352-keyword-merge-sortvalue-fix branch June 22, 2026 20:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant