perf(contests): swap aggregate_user.score filter for karma-reported authors #814
Merged
The previous shadow-ban filter on the contest discovery list used
`aggregate_user.score < 0` (AAO output). Two problems:
1. `aggregate_user` has no index covering `score`, so the CTE forced a
full seq scan on every cold call. /v1/events/remix-contests?status=all
was hanging ~22s cold-cache (warm: ~100ms).
2. The AAO signal is a separate moderation lane from the community
karma-reports system that already governs comment visibility.
The two can drift.
Fix: align the contest filter with the comment-visibility filter. A host
is shadow-banned from contest discovery if they authored a comment that
crossed the same `high_karma_reporters` threshold (sum of reporters'
follower_count >= karmaCommentCountThreshold) that hides the comment
itself on v1_track_comments / v1_event_comments. The new CTE
`karma_reported_authors` lifts the comment-level signal up to user_id.
`muted_by_karma` is unchanged — still filters hosts muted by high-karma
users.
`comment_reports` is a small table indexed on `comment_id`, and the new
CTE only adds a hash-join on comments (PK lookup per hkr row), so the
cost is bounded by report volume rather than user-table size — no
sequential scan over millions of aggregate_user rows.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Problem
`/v1/events/remix-contests?limit=12&offset=0&status=all` is timing out / hanging the contests page in production. Measurements were taken against `api.audius.co` for `status=all`, `status=ended`, and `status=active`.

Root cause
The shadowban filter introduced in #803 added an `aggregate_user.score < 0` CTE on the contest query. `aggregate_user` has one row per user (millions of rows) and no index covering `score`, so the CTE runs a full sequential scan of `aggregate_user` on every cold call.

Approach: align with the existing comment-visibility filter
Rather than indexing around `aggregate_user.score`, this PR switches the contest filter to the same community-karma signal that already governs comment visibility on `v1_track_comments` / `v1_event_comments`. A comment is hidden when its reporters' summed `follower_count` crosses `karmaCommentCountThreshold` (the `high_karma_reporters` CTE). This PR lifts that signal from `comment_id` up to `user_id`: if a host has authored any comment that crossed the threshold, their contests are hidden too.

New CTEs on the contest query:
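As a hedged illustration of the lift from `comment_id` to `user_id` described above (this is not the PR's SQL), the same logic can be sketched in memory in Go. The `Report` type, the `karmaReportedAuthors` function, and the map-based inputs are hypothetical stand-ins for `comment_reports`, `aggregate_user.follower_count`, and the comment-to-author lookup; the sketch also assumes each reporter appears at most once per comment.

```go
package main

import "sort"

// Report is a hypothetical stand-in for one row of comment_reports.
type Report struct {
	CommentID  int
	ReporterID int
}

// karmaReportedAuthors returns the sorted IDs of users who authored at least
// one comment whose reporters' summed follower counts reach the threshold.
// followerCount maps reporter ID -> follower_count (assumed input).
// commentAuthor maps comment ID -> author user ID (assumed input).
func karmaReportedAuthors(
	reports []Report,
	followerCount map[int]int,
	commentAuthor map[int]int,
	threshold int,
) []int {
	// Step 1: sum reporter karma per comment (the comment-level signal).
	karma := make(map[int]int)
	for _, r := range reports {
		karma[r.CommentID] += followerCount[r.ReporterID]
	}

	// Step 2: lift comments at or above the threshold up to their authors.
	flagged := make(map[int]bool)
	for commentID, sum := range karma {
		if sum < threshold {
			continue
		}
		if author, ok := commentAuthor[commentID]; ok {
			flagged[author] = true
		}
	}

	// Step 3: return a deterministic, sorted list of author IDs.
	authors := make([]int, 0, len(flagged))
	for id := range flagged {
		authors = append(authors, id)
	}
	sort.Ints(authors)
	return authors
}
```

The loop only ever iterates over reports and the comments they reference, which mirrors the point that the work scales with report volume rather than with the number of users.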
Filter:

`muted_by_karma` is unchanged; it is still in the filter list. The AAO score remains the moderation lane for the comment endpoints themselves (`v1_track_comments`, `v1_event_comments`, `v1_fan_club_feed`, `v1_track_comment_count` are all untouched).

Why this is faster
The new CTE walks `comment_reports` (a small table, indexed on `comment_id`) joined with `aggregate_user` on the reporter id, plus a hash-join on `comments` for the author lookup. The cost is bounded by report volume, with no sequential scan over millions of `aggregate_user` rows.

Supersedes #813
#813 proposed a partial index on `aggregate_user (user_id) WHERE score < 0`. That works but keeps the AAO score as the moderation lane for contests; this PR routes contests through the same karma-reports signal as comments, which is the preferred semantic. Closing #813.

Test plan
- `go build ./api/...` clean
- `go vet ./api/...` clean
- `TestRemixContestsExcludesShadowbannedHosts` updated to seed `comment_reports` + `aggregate_user.follower_count` bump and assert the karma-reported host is excluded
- `/v1/events/remix-contests?status=all` cold on staging: expect sub-second
- `low_abuse_score`, `high_karma_reporters`, `muted_by_karma`, `deactivated_users` all still in place

🤖 Generated with Claude Code