Skip to content

fix(signup): Commit db state to unlock db and prevent empty standby_site#6470

Merged
Bowrna merged 6 commits into
frappe:developfrom
Bowrna:pick_standby_site
May 25, 2026
Merged

fix(signup): Commit db state to unlock db and prevent empty standby_site#6470
Bowrna merged 6 commits into
frappe:developfrom
Bowrna:pick_standby_site

Conversation

@Bowrna
Copy link
Copy Markdown
Contributor

@Bowrna Bowrna commented May 20, 2026

Commit db state to unlock db that is causing the empty standby_site to return and creating new site during signup. This causes a longer time.

@Bowrna Bowrna requested a review from ssiyad as a code owner May 20, 2026 10:41
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 20, 2026

Greptile Summary

This PR attempts to fix a race condition in signup where multiple concurrent requests find no standby site (due to FOR UPDATE SKIP LOCKED) and each fall through to create a fresh site, causing slower signups. It wraps the standby-site claim block in a try-except that rolls back and restores is_standby=1 on failure, and reduces the FOR UPDATE lock count in get_preferred_site from 3 to 2.

  • The rollback-on-error guard is a genuine improvement: previously, an exception during setup could leave a standby site permanently consumed without completing the signup.
  • The FOR UPDATE lock count reduction (3\u21922) lowers the number of rows unnecessarily held during the transaction, slightly reducing contention.
  • The stated goal of "committing db state to unlock db" is not achieved \u2014 no frappe.db.commit() is added after the initial frappe.db.set_value(\"is_standby\", 0), so the row lock is still held and is_standby=0 remains invisible to concurrent transactions for the entire duration of the try block.

Confidence Score: 3/5

The error-recovery path is an improvement, but the primary goal of stopping concurrent signups from bypassing the standby pool is not achieved as written, and a partial-failure window around site.reload() and set_site_domain() can leave a site permanently consumed without a working domain.

The except block correctly rolls back and restores is_standby on most failure paths. However, the missing frappe.db.commit() means the FOR UPDATE lock is held throughout the entire try block, so concurrent requests still skip the locked row and fall through to create a new site. Additionally, site.reload() and set_site_domain() sit outside the recovery guard, so a failure there commits is_standby=0 and the partial site state while marking the request as an error.

press/saas/doctype/product_trial/product_trial.py — specifically the standby-site claim block around lines 115–132

Important Files Changed

Filename Overview
press/saas/doctype/product_trial/product_trial.py Adds rollback-on-error recovery for standby-site claim and reduces FOR UPDATE row-lock count (3→2), but is missing a frappe.db.commit() needed to release the lock early and make is_standby=0 visible to concurrent transactions; also leaves site.reload() and set_site_domain() outside the recovery guard.

Sequence Diagram

sequenceDiagram
    participant A as Request A (signup)
    participant B as Request B (concurrent signup)
    participant DB as Database

    A->>DB: SELECT ... FOR UPDATE SKIP LOCKED (get_standby_site)
    DB-->>A: site_S (row locked)
    A->>DB: "set_value(is_standby=0) in T_A NOT committed"
    Note over DB: Row still locked by T_A, is_standby=0 not yet visible

    B->>DB: "SELECT ... WHERE is_standby=1 FOR UPDATE SKIP LOCKED"
    Note over DB: site_S is locked, skipped by SKIP LOCKED
    DB-->>B: none, no other standby sites

    B->>DB: INSERT new Site (fallback path)

    A->>DB: site.save(), set_site_domain, etc.
    A->>DB: COMMIT T_A (end of request)

    Note over A,DB: With frappe.db.commit() after set_value T_A commits early lock released B sees is_standby=0 via WHERE clause
Loading
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
press/saas/doctype/product_trial/product_trial.py:115-116
**Missing commit defeats the concurrency unlock**

The PR description says the goal is to "commit db state to unlock db", but no `frappe.db.commit()` is called after this `set_value`. Without a commit:
1. The `FOR UPDATE` row-lock acquired in `get_standby_site()` remains held for the entire duration of the try block.
2. The `is_standby=0` change is invisible to other transactions.

A concurrent signup request's `SELECT … WHERE is_standby=1 … FOR UPDATE SKIP LOCKED` will still skip this row (because it is locked, not because it's already taken), get `None`, and fall through to create a fresh site — the exact race condition this PR aims to prevent. Adding `frappe.db.commit()` immediately after the `set_value` call would release the lock and make `is_standby=0` visible, so other transactions filter the row out via the WHERE clause instead of colliding on the lock.

### Issue 2 of 2
press/saas/doctype/product_trial/product_trial.py:131-132
**Partial failure after `site.save()` leaves `is_standby=0` unrecovered**

`site.reload()` and `self.set_site_domain(site, site_domain)` execute outside the try-except guard. If either raises, the exception propagates to the caller's except block in `create_site`, which calls `self.save()` and commits the current transaction. At that point the transaction includes `set_value("is_standby", 0)` and `site.save()`, so the site ends up persisted with `is_standby=0`, `team` assigned, and trial fields set — but without its domain — while the `ProductTrialRequest` is marked `"Error"`. The standby site is permanently consumed without a working signup. Extending the try-except to cover these two calls (or moving them inside it) would make recovery consistent.

Reviews (1): Last reviewed commit: "fix(signup): Commit db state to unlock d..." | Re-trigger Greptile

Comment thread press/saas/doctype/product_trial/product_trial.py
Comment thread press/saas/doctype/product_trial/product_trial.py Outdated
@codecov
Copy link
Copy Markdown

codecov Bot commented May 20, 2026

Codecov Report

❌ Patch coverage is 28.30189% with 38 lines in your changes missing coverage. Please review.
✅ Project coverage is 50.05%. Comparing base (c1c0ed6) to head (b16bb87).
⚠️ Report is 1 commits behind head on develop.

Files with missing lines Patch % Lines
press/saas/doctype/product_trial/product_trial.py 0.00% 20 Missing ⚠️
press/api/site.py 40.00% 18 Missing ⚠️

❌ Your patch check has failed because the patch coverage (28.30%) is below the target coverage (75.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #6470      +/-   ##
===========================================
+ Coverage    49.95%   50.05%   +0.10%     
===========================================
  Files          954      954              
  Lines        78941    78960      +19     
  Branches       366      366              
===========================================
+ Hits         39437    39527      +90     
+ Misses       39479    39408      -71     
  Partials        25       25              
Flag Coverage Δ
dashboard 60.73% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@Bowrna Bowrna force-pushed the pick_standby_site branch from 046975c to 20def66 Compare May 20, 2026 11:12
@Bowrna Bowrna force-pushed the pick_standby_site branch from 520654b to c580df1 Compare May 22, 2026 08:16
@Bowrna Bowrna force-pushed the pick_standby_site branch from 1074834 to 8dab901 Compare May 22, 2026 09:44
@Bowrna Bowrna force-pushed the pick_standby_site branch from 0317d8d to 00aadc1 Compare May 25, 2026 08:47
@Bowrna Bowrna merged commit 311f813 into frappe:develop May 25, 2026
14 of 15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant