Skip to content

Profile page query takes hundreds of seconds on cold DB cache, causing thundering herd #489

@dahlia

Description

@dahlia

On a fresh container start (cold PostgreSQL buffer cache), the profile page (/@username) query takes 391 seconds in production (4M rows / 13 GB posts table). Observed in logs:

duration: 391012.489 ms  execute <unnamed>: select "posts"."id", ... from "posts" left join lateral ... (posts_replies, posts_reactions, posts_shares, posts_bookmarks, ...)

Since the query doesn't return, concurrent requests pile up — each starting its own identical query — until the connection pool is exhausted and the entire site returns no response.

Two likely contributors:

  1. No index on (actor_id, published DESC) — timeline queries ORDER BY published DESC without a covering index, forcing a sort of all matching rows.
  2. posts_replies lateral join has no LIMIT — for each post in the result, all replies are fetched (WHERE reply_target_id = posts.id). On a popular post this can be a large scan.

Suggested fixes:

  • Add migration:
    CREATE INDEX posts_actor_id_published_index ON posts (actor_id, published DESC);
    -- or, if visibility is also filtered:
    CREATE INDEX posts_visibility_actor_id_published_index ON posts (visibility, actor_id, published DESC);
  • Add a LIMIT (e.g. 5) on the posts_replies lateral join, or omit it from list queries entirely
  • Set statement_timeout in the app to fail fast instead of blocking connections indefinitely

Metadata

Metadata

Assignees

Labels

bugSomething isn't working

Type

Projects

No projects

Relationships

None yet

Development

No branches or pull requests

Issue actions