Methodology improvements for RLCR environment-limitation detection and deferral handling #178

@mockiemochi

Summary

A stagnation circuit breaker fired after 4 rounds on a loop where 5 of 7 acceptance criteria (ACs) were resolved efficiently, but 2 remained blocked: one by environment-specific test flakiness and one by an external credential dependency. The core RLCR methodology worked well for code-implementation ACs, but it had no mechanism for handling environmental limitations or formal deferrals.

Proposed Improvements

1. Scope threshold gate

After each round, measure the ratio of files changed to unique issues closed, and count the distinct issues addressed. If any round after Round 2 addresses fewer than 3 distinct issues, trigger automatic escalation: either switch to a focused sub-loop with a single-issue contract, or introduce a "spike round" requiring 3 alternative approaches before choosing one.
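A minimal sketch of how such a gate might be computed, assuming per-round counts are already available. `RoundStats`, `files_per_issue`, and `should_escalate` are hypothetical names, and "issues addressed" is treated here as "issues closed", which is an assumption rather than something the proposal specifies.

```python
from dataclasses import dataclass

MIN_ISSUES_PER_ROUND = 3      # threshold named in the proposal
GATE_STARTS_AFTER_ROUND = 2   # the gate only applies after Round 2


@dataclass
class RoundStats:
    round_number: int
    files_changed: int
    issues_closed: int  # distinct issues closed in this round (assumed metric)


def files_per_issue(stats):
    """Ratio of files changed to issues closed; None when nothing was closed."""
    if stats.issues_closed == 0:
        return None
    return stats.files_changed / stats.issues_closed


def should_escalate(stats):
    """True when a round after Round 2 closes fewer issues than the threshold."""
    if stats.round_number <= GATE_STARTS_AFTER_ROUND:
        return False
    return stats.issues_closed < MIN_ISSUES_PER_ROUND
```

The ratio is reported separately from the escalation decision so that a reviewer can see churn (many files, few issues) even when the gate does not fire.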

2. Environment capability probe

When a fix fails repeatedly across rounds, require the implementer to run a minimal capability probe (e.g., a 5-line script testing the specific API or behavior in isolation) and report results before proposing a solution. This surfaces environment incompatibilities early, before rounds are wasted on approaches that cannot work in the review environment.
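As an illustration of what a capability probe could look like, here is a small sketch. The helper `run_probe` and the example check `can_write_temp_file` are hypothetical; a real probe would test whichever API or behavior the failing fix depends on.

```python
import tempfile


def run_probe(name, check):
    """Run one isolated capability check and return a reportable result."""
    try:
        detail = check()
        return {"probe": name, "ok": True, "detail": str(detail)}
    except Exception as exc:  # report the failure instead of crashing the probe
        return {"probe": name, "ok": False,
                "detail": f"{type(exc).__name__}: {exc}"}


def can_write_temp_file():
    """Example capability (assumed): the environment allows temp-file writes."""
    with tempfile.TemporaryFile("w+") as fh:
        fh.write("probe")
        fh.seek(0)
        return fh.read()
```

Calling `run_probe("temp-file-write", can_write_temp_file)` yields a small dictionary that can be pasted into the round summary, whether the capability is present or not.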

3. Fix verification escrow

Before a round can claim an issue resolved, the fix must pass verification escrow: demonstrate it working in an environment matching the review environment, or document why it cannot. If the same issue is claimed resolved in 3 consecutive rounds but fails review each time, automatically escalate to "pair-debug" mode where implementer and reviewer collaboratively investigate rather than alternating claim-and-refute.
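The escalation rule could be tracked with something like the following sketch. It assumes each round records, per issue, whether the implementer claimed it resolved and whether the claim survived review; the function names and the tuple layout are illustrative only.

```python
PAIR_DEBUG_THRESHOLD = 3  # consecutive refuted claims, per the proposal


def failed_claim_streak(history):
    """Count the most recent consecutive rounds with a claim that failed review.

    history: list of (claimed_resolved, passed_review) tuples, oldest first.
    """
    streak = 0
    for claimed, passed in reversed(history):
        if claimed and not passed:
            streak += 1
        else:
            break
    return streak


def needs_pair_debug(history):
    """True when the latest rounds form a streak at or above the threshold."""
    return failed_claim_streak(history) >= PAIR_DEBUG_THRESHOLD
```

Only the trailing streak is counted, so a round with no claim, or one that passed review, resets the counter.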

4. Review finding taxonomy (DEFECT / ENV_LIMIT / FLAKY)

Add a classification tag to each review finding:

  • DEFECT: fixable by code change
  • ENV_LIMIT: requires environment change or workaround documentation
  • FLAKY: non-deterministic, needs hardening

If a finding is classified ENV_LIMIT for 2 consecutive rounds, require the implementer to document the limitation and provide an alternative verification path rather than demanding an ever-more-clever workaround.
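One way the taxonomy and the two-round rule might be encoded is sketched below. The tag names come from the proposal; `FindingType` and `requires_limitation_doc` are hypothetical names chosen for illustration.

```python
from enum import Enum

ENV_LIMIT_ROUNDS = 2  # consecutive ENV_LIMIT classifications, per the proposal


class FindingType(Enum):
    DEFECT = "DEFECT"        # fixable by code change
    ENV_LIMIT = "ENV_LIMIT"  # requires environment change or documentation
    FLAKY = "FLAKY"          # non-deterministic, needs hardening


def requires_limitation_doc(classifications):
    """True when the latest ENV_LIMIT_ROUNDS tags for a finding are all ENV_LIMIT.

    classifications: one FindingType per round for a single finding, oldest first.
    """
    recent = classifications[-ENV_LIMIT_ROUNDS:]
    return (len(recent) == ENV_LIMIT_ROUNDS
            and all(tag is FindingType.ENV_LIMIT for tag in recent))
```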

5. Plan deviation approval

Any item from the original plan that is deferred or implemented differently requires an explicit deviation record with: (1) original requirement, (2) actual implementation, (3) justification, and (4) reviewer approval flag. This prevents gradual erosion of plan intent.
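The four fields of a deviation record map naturally onto a small data structure. The sketch below is one possible shape; the class and function names are assumptions, and the proposal does not prescribe a storage format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviationRecord:
    original_requirement: str
    actual_implementation: str
    justification: str
    reviewer_approved: bool = False  # set only by the reviewer


def is_complete(record):
    """All three text fields must be non-empty."""
    fields = (record.original_requirement,
              record.actual_implementation,
              record.justification)
    return all(text.strip() for text in fields)


def is_accepted(record):
    """A deviation counts only when it is complete and reviewer-approved."""
    return is_complete(record) and record.reviewer_approved
```

Making the record immutable means approval produces a new record rather than silently mutating an existing one, which keeps the history of deviations auditable.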

6. Micro-rounds for isolated issues

When a round's remaining work is confined to a single test file or environmental configuration, allow up to 3 micro-rounds within one review cycle. In a micro-round, the implementer changes the test, runs it locally, and reports pass/fail, with no full review, contract, or summary. The issue escalates back to a full round only if all 3 micro-rounds fail.
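The micro-round loop could be driven by a helper like the one sketched here. `run_micro_rounds` and the `attempt` callback are hypothetical; the callback stands in for "change the test, run it locally, report pass/fail".

```python
MAX_MICRO_ROUNDS = 3  # limit named in the proposal


def run_micro_rounds(attempt, max_rounds=MAX_MICRO_ROUNDS):
    """Run up to max_rounds attempts, stopping at the first pass.

    attempt(n) receives the 1-based micro-round number and returns True on pass.
    """
    results = []
    for n in range(1, max_rounds + 1):
        passed = bool(attempt(n))
        results.append(passed)
        if passed:
            return {"resolved": True, "escalate": False, "results": results}
    # Every micro-round failed: hand the issue back to a full round.
    return {"resolved": False, "escalate": True, "results": results}
```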

7. Environment attestation in summaries

Every validation claim in a round summary must include an environment attestation block: runtime versions, OS, relevant environment variables, and a note if the review environment is known to differ. If a test passes locally but is environment-sensitive, flag it as "passes locally, may differ in review" rather than claiming unconditional pass.
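An attestation block might be collected with the standard library, as in this sketch. The function names and the dictionary layout are assumptions; which environment variables count as relevant is left to the caller.

```python
import os
import platform


def environment_attestation(env_vars=(), review_env_differs=None):
    """Collect runtime, OS, and selected environment variables for a summary.

    review_env_differs: True/False if known, None if unknown.
    Callers should avoid listing variables that hold secrets.
    """
    return {
        "python": platform.python_version(),
        "os": platform.platform(),
        "env": {name: os.environ.get(name) for name in env_vars},
        "review_env_differs": review_env_differs,
    }


def claim_status(passed_locally, environment_sensitive):
    """Word the validation claim so it does not overstate a local pass."""
    if not passed_locally:
        return "fails locally"
    if environment_sensitive:
        return "passes locally, may differ in review"
    return "pass"
```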

Impact

These are incremental refinements, not structural changes. The stagnation circuit breaker correctly fired in the observed session, but the loop could have terminated ~2 rounds earlier with better environment-limitation detection and formal deferral handling.
