fix(worker): eliminate N+1 query in ProcessFlakesTask#950

Draft
sentry[bot] wants to merge 1 commit into main from seer/fix/worker-process-flakes-n-plus-1-lEE6cI

sentry[bot] commented May 30, 2026

This PR addresses an N+1 query issue identified in the ProcessFlakesTask, specifically within the process_flakes_for_commit function in apps/worker/services/test_analytics/ta_process_flakes.py.

Problem:
Previously, for each commit, process_flakes_for_commit iterated over the commit's N uploads and called process_single_upload for each one. Each call ran its own SELECT query via get_testruns(upload) and its own Testrun.objects.bulk_update operation. A commit with N uploads therefore issued N SELECT queries and N UPDATE operations against the ta_timeseries_testrun table, which is the N+1 query pattern this PR removes.

Solution:
To resolve this, the following changes were implemented:

  1. Introduced get_all_testruns: A new function get_all_testruns(upload_ids) was added. This function now fetches all relevant Testrun objects for all given upload_ids in a single batched SELECT query, significantly reducing database round-trips.
  2. Refactored process_single_upload: The signature of process_single_upload was updated to accept a pre-fetched list of Testrun objects directly, removing its internal get_testruns call and the bulk_update operation.
  3. Optimized process_flakes_for_commit: This function now first collects the IDs of all uploads, then calls get_all_testruns once to retrieve every Testrun object it needs. Each upload's Testrun objects are processed in memory via process_single_upload, and a single Testrun.objects.bulk_update is performed after the loop for all modified Testrun instances.
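The three steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's diff: it uses sqlite3 and plain dicts instead of Django's ORM, the "failure" to "flaky_fail" rule is a placeholder for the real flake logic, and executemany only stands in for Testrun.objects.bulk_update. The function names mirror the ones in the description, but their bodies and return values here are hypothetical.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE testrun (id INTEGER PRIMARY KEY, upload_id INTEGER, outcome TEXT)"
)
conn.executemany(
    "INSERT INTO testrun (upload_id, outcome) VALUES (?, ?)",
    [(upload_id, "failure") for upload_id in (1, 2, 3) for _ in range(2)],
)


def get_all_testruns(upload_ids):
    """Fetch rows for every upload in one SELECT, grouped by upload_id."""
    placeholders = ",".join("?" * len(upload_ids))
    rows = conn.execute(
        f"SELECT id, upload_id, outcome FROM testrun "
        f"WHERE upload_id IN ({placeholders})",
        list(upload_ids),
    ).fetchall()
    grouped = defaultdict(list)
    for run_id, upload_id, outcome in rows:
        grouped[upload_id].append({"id": run_id, "outcome": outcome})
    return grouped


def process_single_upload(testruns):
    """Work on pre-fetched rows in memory; return the rows that changed."""
    changed = []
    for run in testruns:
        if run["outcome"] == "failure":  # placeholder rule, not the real logic
            run["outcome"] = "flaky_fail"
            changed.append(run)
    return changed


def process_flakes_for_commit(upload_ids):
    by_upload = get_all_testruns(upload_ids)  # single batched read
    modified = []
    for upload_id in upload_ids:
        modified.extend(process_single_upload(by_upload.get(upload_id, [])))
    # Single batched write after the loop (stand-in for bulk_update).
    conn.executemany(
        "UPDATE testrun SET outcome = ? WHERE id = ?",
        [(run["outcome"], run["id"]) for run in modified],
    )
    return len(modified)


updated = process_flakes_for_commit([1, 2, 3])
print(updated)  # 6
```

Grouping the fetched rows by upload_id keeps the per-upload processing unchanged while the read and the write each happen once per commit.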

Impact:
This change reduces the number of database operations for fetching and updating Testrun objects from O(N), where N is the number of uploads, to O(1): one batched SELECT and one bulk_update call per commit. This cuts database round-trips and load during flake processing.

Legal Boilerplate

Look, I get it. The entity doing business as "Sentry" was incorporated in the State of Delaware in 2015 as Functional Software, Inc. In 2022 this entity acquired Codecov and as result Sentry is going to need some rights from me in order to utilize my contributions in this PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Sentry can use, modify, copy, and redistribute my contributions, under Sentry's choice of terms.

Fixes WORKER-YNB
