Fix flaky test_read_timerange: neutralize the row-cleaner during the test#97
Merged
Merged
Conversation
…test Root cause (confirmed): clean_database() is the ONLY code path that deletes rows from a service-db table, and it re-reads DAEPLOY_SERVICE_DB_TABLE_LIMIT on every call. When a clean runs with a short *time* limit (the "1seconds" that the db_limit_second fixture sets) it trims the OLDEST rows as they age past the limit. If such a clean runs mid-test, test_read_timerange's reads disagree and a wider time window can return FEWER rows than a narrower one — observed as "assert 200 == 195/191/183/175". Demonstrated directly: clean_database() with a 1-second limit wipes a freshly-written 200-row series. Earlier attempts (unique timestamps 841d194; invariant assertions; ENGINE pool dispose) all missed this because the mutation is a delete, not a write drop or a stale snapshot. Fix: pin a huge DAEPLOY_SERVICE_DB_TABLE_LIMIT ("36500days") for the duration of the test via monkeypatch (auto-restored), so any concurrent clean_database is a no-op for this test's data. With deletion ruled out, all 200 rows persist and the original "== 200" filtering assertions hold deterministically. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
def233d to
72843aa
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Root cause (confirmed)
clean_database()(daeploy/_service/db.py:208,213) is the only code path in the package that deletes rows, and it re-readsDAEPLOY_SERVICE_DB_TABLE_LIMITon every call. When a clean runs with a short time limit — the"1seconds"that thedb_limit_secondfixture sets — it deletes the oldest rows as they age past the limit.If such a clean runs while
test_read_timerangeis reading, the table shrinks between reads, so a wider time window can return fewer rows than a narrower one — exactly the observedassert 200 == 195 / 191 / 183 / 175. Demonstrated directly:clean_database()with a 1-second limit wipes a freshly-written 200-row series (200 → 0).This is why the earlier attempts missed it — the mutation is a delete, not a write-drop (841d194 unique timestamps; the invariant rewrite) or a stale snapshot (the
ENGINE.dispose()attempt).Fix
Pin a huge
DAEPLOY_SERVICE_DB_TABLE_LIMIT("36500days") for the duration of the test viamonkeypatch(auto-restored), so any concurrentclean_databaseis a no-op for this test's data. With deletion ruled out, all 200 rows persist and the meaningful== 200filtering assertions hold deterministically.Note
This makes
test_read_timerangehermetic against the cleaner. The underlying why does a clean run mid-test (a leaked periodic clean / event loop or test-ordering interaction across the SDK suite) is a broader test-isolation question worth a separate look, but neutralizing the sole deleter fixes this flake at its mechanism.Verification
Couldn't reproduce locally (load/timing dependent), so CI is the verifier. Test + the db-limit tests pass locally across repeated runs; black + flake8 clean.
🤖 Generated with Claude Code