Fix coding_env API compatibility and safety reward false positives #635
abhinavgautam01 wants to merge 8 commits into
Greptile Summary
This PR fixes three
Confidence Score: 5/5
Safe to merge: all previously-identified bugs are resolved and the core transform/env logic is correct.
The two previous out-of-diff findings (`CodeQualityTransform` crash on deeply-nested code and the dead `timeout_s` parameter in `step()`) are both fixed in the current code. The remaining comments are style-level and do not affect runtime correctness.
No files require special attention for correctness; `tests/envs/test_coding_safety_transform.py` is worth a second look for CI portability.
Important Files Changed
Sequence Diagram
sequenceDiagram
participant Orch as Orchestrator
participant Env as PythonCodeActEnv
participant Exec as PyExecutor
participant ST as CodeSafetyTransform
participant QT as CodeQualityTransform
Orch->>Env: "reset(seed?, episode_id?, **kwargs)"
Env->>Exec: PyExecutor() (fresh instance)
Env->>ST: create_safe_coding_transform()
Env-->>Orch: "CodeObservation(reward=0.0)"
Orch->>Env: "step(CodeAction(code=...))"
Env->>Exec: run(code)
Exec-->>Env: ExecutionResult(stdout, stderr, exit_code)
Env->>ST: __call__(obs + last_code metadata)
Note over ST: ast.parse to _detect_violation()
ST-->>Env: "obs(reward=-1.0 or 0.0)"
Env->>QT: __call__(obs)
QT-->>Env: obs(reward adjusted)
Env-->>Orch: CodeObservation
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 2
tests/envs/test_coding_safety_transform.py:9-10
**Test missing sys.path bootstrap — breaks without explicit PYTHONPATH**
The sibling test `test_python_codeact_reset.py` guards against missing path setup with `sys.path.insert(...)` for the repo root plus `src/`, then uses the fully-qualified prefix `envs.coding_env.*`. This file uses the short form `coding_env.*`, which works only when the runner sets `PYTHONPATH=src:envs`. Running bare (`pytest tests/`) or in a CI job that doesn't set that variable produces an `ImportError`, while the sibling test still passes. Following the same bootstrap pattern would make the suite self-contained.
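The bootstrap pattern described in Issue 1 could look roughly like the sketch below. It assumes the test file sits two directories below the repo root (`tests/envs/`) and that `src/` lives at the root; the helper name `bootstrap_paths` is hypothetical and is not taken from the sibling test.

```python
import sys
from pathlib import Path


def bootstrap_paths(test_file: str) -> list[str]:
    """Put the repo root and src/ at the front of sys.path (idempotent)."""
    # Assumed layout: <repo>/tests/envs/<test_file>, so parents[2] is <repo>.
    repo_root = Path(test_file).resolve().parents[2]
    added = []
    for candidate in (repo_root, repo_root / "src"):
        entry = str(candidate)
        if entry not in sys.path:
            sys.path.insert(0, entry)
            added.append(entry)
    return added


# In a real test module this would be called as bootstrap_paths(__file__)
# before any `from envs.coding_env... import ...` line.
```

Returning the list of inserted entries keeps the helper easy to check and makes a second call a no-op.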
### Issue 2 of 2
envs/coding_env/server/__init__.py:18-23
**Lazy loader doesn't cache the resolved class — `__getattr__` fires on every access**
The current implementation returns `PythonCodeActEnv` on each call to `__getattr__` but never writes it back onto the module. Any code that repeatedly accesses `coding_env.server.PythonCodeActEnv` as an attribute (rather than a one-time `from … import`) will re-execute the `__getattr__` body on each access. The standard fix is to set the resolved name on the module after the first load, turning subsequent lookups into a direct `__dict__` hit.
```suggestion
def __getattr__(name: str) -> Any:
    if name == "PythonCodeActEnv":
        from .python_codeact_env import PythonCodeActEnv
        import sys

        # Cache on the module so later lookups bypass __getattr__.
        setattr(sys.modules[__name__], name, PythonCodeActEnv)
        return PythonCodeActEnv
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
```
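To see the effect of the caching in Issue 2 in isolation, the sketch below builds a throwaway module with `types.ModuleType` and counts how often the lazy branch runs. The module name `demo_server` and the placeholder class are assumptions for illustration only; they stand in for `coding_env.server` and the real `PythonCodeActEnv`.

```python
import types

load_calls: list[str] = []  # one entry per execution of the lazy branch

# Stand-in module; in the PR this role is played by coding_env.server.
demo = types.ModuleType("demo_server")


def _lazy_getattr(name: str):
    if name == "PythonCodeActEnv":
        load_calls.append(name)
        resolved = type("PythonCodeActEnv", (), {})  # placeholder class
        # Cache on the module so later lookups hit __dict__ directly.
        setattr(demo, name, resolved)
        return resolved
    raise AttributeError(f"module {demo.__name__!r} has no attribute {name!r}")


# PEP 562: a module-level __getattr__ is consulted only when normal lookup fails.
demo.__getattr__ = _lazy_getattr

first = demo.PythonCodeActEnv
second = demo.PythonCodeActEnv
```

After the first access the name lives in the module's `__dict__`, so the second access never reaches `_lazy_getattr` and both lookups return the same object.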
Reviews (7): Last reviewed commit: "Harden coding env quality transform"
Darktex left a comment
Note: This is an automated review by Claude Code, not a human review.
Alignment Review Report — Tier 1/Tier 2
Automated Checks
- Lint: `envs/coding_env/server/app.py` has a `usort` import sort violation (pre-existing, but CI blocker)
- Debug code: CLEAN, no debug artifacts introduced
Tier 1: Fixes Required
1. step() signature incomplete vs base class
python_codeact_env.py — The base class Environment.step() declares step(self, action: ActT, timeout_s: Optional[float] = None, **kwargs: Any). The concrete implementation only has step(self, action: Action). This is a Liskov substitution violation. Since the PR reformats this method signature, this is the right moment to bring it into conformance with the base class.
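One way to check the conformance point in item 1 mechanically is to compare parameter names with `inspect.signature`. The classes below are hypothetical stand-ins that only mirror the signatures quoted above; they are not the project's real `Environment` or `PythonCodeActEnv`.

```python
import inspect
from typing import Any, Optional


class Environment:
    """Hypothetical stand-in for the base class quoted in the review."""

    def step(self, action: Any, timeout_s: Optional[float] = None, **kwargs: Any) -> Any:
        raise NotImplementedError


class NarrowEnv(Environment):
    def step(self, action: Any) -> Any:  # drops timeout_s and **kwargs
        return action


class ConformingEnv(Environment):
    def step(self, action: Any, timeout_s: Optional[float] = None, **kwargs: Any) -> Any:
        return action


def missing_params(sub: type, base: type, method: str) -> set[str]:
    """Names the base method accepts that the override does not."""
    base_params = inspect.signature(getattr(base, method)).parameters
    sub_params = inspect.signature(getattr(sub, method)).parameters
    return set(base_params) - set(sub_params)
```

A caller that passes `timeout_s=...` through the base-class interface would hit a `TypeError` on `NarrowEnv`, which is the substitution problem the review describes.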
2. Import sort violation in app.py
app.py — usort reports the file needs import sorting. Pre-existing issue but will block CI on merge.
3. sys.path manipulation in new test file
test_coding_safety_transform.py — Uses sys.path.insert() instead of relying on the project's standard PYTHONPATH=src:envs test setup. Should use conftest.py or the standard env var approach for consistency.
4. SyntaxError handling is a behavioral regression (minor)
transforms.py — When ast.parse() fails on syntactically invalid code containing dangerous patterns (e.g., import os\n\x00), _detect_violation() returns None and reward is set to 0.0. The old regex approach would have caught import os even in broken code. This is likely acceptable (CodeQualityTransform handles syntax errors separately), but should be documented as an intentional trade-off.
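The trade-off in item 4 can be reproduced with a small comparison. This is a sketch under assumptions: the regex is a hypothetical stand-in for the old check, `ast_flags` is a simplified stand-in for `_detect_violation()`, and the malformed input is an unclosed parenthesis rather than the null-byte example quoted above, chosen as an unambiguous parse failure.

```python
import ast
import re
from typing import Optional

# Hypothetical stand-in for the old regex-based check.
IMPORT_OS_RE = re.compile(r"\bimport\s+os\b")


def regex_flags(code: str) -> bool:
    return IMPORT_OS_RE.search(code) is not None


def ast_flags(code: str) -> Optional[bool]:
    """True/False when the code parses, None when it does not."""
    try:
        tree = ast.parse(code)
    except SyntaxError:
        return None  # unparsable: no violation is reported
    for node in ast.walk(tree):
        if isinstance(node, ast.Import) and any(a.name == "os" for a in node.names):
            return True
    return False


broken = "import os\nif (:\n"          # dangerous import followed by invalid syntax
literal = 'print("import os")'        # harmless string literal
```

The regex flags both `broken` and `literal`; the AST check reports nothing for `broken` (it cannot parse it) and correctly ignores `literal`. That asymmetry is the behavior the review asks to have documented.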
Tier 2: Alignment Discussion
ALIGNMENT FLAG: reset() signature extended with seed, episode_id, **kwargs
- Principle at stake: API Invariant #1 from INVARIANTS.md
- Assessment: The base class `Environment.reset()` already declares these parameters. Adding them to the concrete class brings `coding_env` into compliance with the documented invariant, not violating it. This is correct.
- Concern: The `episode_id` override allows the training orchestrator to force a specific episode ID. Confirm that no MCP-exposed tool forwards `episode_id` to `reset()`, which would let an agent influence its own episode identity.
- Suggested reviewer: @Darktex
ALIGNMENT FLAG: AST safety check coverage
- Principle at stake: Rewards inside environment (RFC 002)
- Assessment: The switch from regex to AST is a clear improvement: it eliminates false positives from string literals (`print("import os")`) and similarly-named functions (`myopen()`). The AST check correctly only inspects `ast.Name` nodes, so `db.exec("sql")` is properly ignored (contradicting Greptile's P1 claim).
- Concern: The blocked set is the same scope as before, but since this is a rewrite of the safety logic, worth asking: is the set complete? Notably absent: `importlib.import_module("os")` bypasses the check since it's an `ast.Attribute` call, not `ast.Name`. Similarly `pathlib`, `socket`, `ctypes`. May be acceptable if the environment has OS-level isolation.
- Suggested reviewer: @Darktex
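The behavior described in this flag can be sketched as follows. This is not the PR's `_detect_violation()`; the blocked sets and the function name `detect_violation` are assumptions, since the real lists are not shown in this thread.

```python
import ast
from typing import Optional

# Assumed blocked sets for illustration only.
BLOCKED_CALLS = {"eval", "exec", "open"}
BLOCKED_MODULES = {"os", "subprocess"}


def detect_violation(code: str) -> Optional[str]:
    """Return a short description of the first violation, or None."""
    try:
        tree = ast.parse(code)
    except SyntaxError:
        return None
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in BLOCKED_MODULES:
                    return f"import {alias.name}"
        elif isinstance(node, ast.ImportFrom):
            if node.module and node.module.split(".")[0] in BLOCKED_MODULES:
                return f"from {node.module} import"
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            # Only bare names count; db.exec(...) has an ast.Attribute callee.
            if node.func.id in BLOCKED_CALLS:
                return f"{node.func.id}()"
    return None
```

With this shape, `print("import os")`, `myopen("f")`, and `db.exec("sql")` all pass, while `exec("x = 1")` and `import os` are flagged. `importlib.import_module("os")` also passes, which is the coverage gap raised in the concern.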
Clarification on Greptile's Prior P1 Claims
Greptile's review (from May 2) flagged two P1 issues that do not apply to the actual diff:
- "`timeout_s` accepted but never applied": The actual diff does NOT add `timeout_s` to `step()`. It only reformats the existing signature. Greptile appears to have reviewed a different version of the code.
- "Attribute-based call detection introduces new false positives": The actual code checks only `ast.Name` (bare function calls like `exec(...)`), NOT `ast.Attribute` (method calls like `db.exec(...)`). The test `test_does_not_flag_attribute_method_named_exec` explicitly verifies this. Greptile's claim is incorrect.
Summary
- 2 mechanical fixes needed (`step()` signature conformance, import sort)
- 1 minor style fix (sys.path manipulation in tests)
- 1 behavioral note to document (SyntaxError handling trade-off)
- 2 alignment questions for human review (episode_id/MCP boundary, safety check coverage)
- Greptile's P1 flags are not applicable to the actual diff
Verdict: REQUEST_CHANGES
Automated review by Claude Code
@greptile review
Accept optional reset/step parameters in PythonCodeActEnv and add tests for episode_id and timeout_s handling.
ebf663c to 4e9c397
@greptile review
@greptile review
Darktex left a comment
Note: This is an automated review by Claude Code, not a human review.
Alignment Review: PR #635
Tier 1: Bugs & Issues
1. reset() signature missing **kwargs from base class
The base class Environment.reset() accepts **kwargs: Any for forward-compatibility. This PR adds seed and episode_id but omits **kwargs. If the framework ever passes an additional keyword argument to reset(), PythonCodeActEnv will raise a TypeError while other environments will not. Please add **kwargs to match the full base signature.
2. seed parameter accepted but unused — intentional?
seed is accepted in the new signature but never referenced in the method body. This is likely fine (no random state to seed), but a brief inline comment clarifying this would prevent future contributors from assuming it's a bug.
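Items 1 and 2 can be illustrated together with two toy environments. Everything here is a sketch: the class names, the `options` keyword, and the `str` return type are hypothetical (the real `reset()` returns a `CodeObservation`), and the inline comment shows one possible wording for the `seed` note.

```python
import uuid
from typing import Any, Optional


class NarrowResetEnv:
    """Override without **kwargs: rejects any keyword it does not know."""

    def reset(self, seed: Optional[int] = None, episode_id: Optional[str] = None) -> str:
        return episode_id or str(uuid.uuid4())


class ForwardCompatibleEnv:
    """Override that mirrors the full base signature."""

    def reset(
        self,
        seed: Optional[int] = None,
        episode_id: Optional[str] = None,
        **kwargs: Any,
    ) -> str:
        # seed is accepted for API compatibility only; there is no random state to seed.
        # Unknown keyword arguments are ignored rather than rejected.
        return episode_id or str(uuid.uuid4())


def accepts(env: Any, **extra: Any) -> bool:
    """True if env.reset() tolerates the extra keyword arguments."""
    try:
        env.reset(seed=0, **extra)
        return True
    except TypeError:
        return False
```

If the framework later starts passing a new keyword such as the hypothetical `options`, only the narrow override breaks.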
3. AST-based safety detection has known bypass vectors
The switch from regex to AST analysis is a clear improvement that eliminates false positives. However, the following bypass patterns will NOT be caught:
- `getattr(__builtins__, 'eval')(code)`: attribute-based access
- `importlib.import_module('os')`: dynamic import via importlib
- `__builtins__['eval'](code)`: subscript-based access
These are acceptable since the transform is a reward shaping heuristic, not a security boundary (container isolation provides actual sandboxing per INVARIANTS.md). However, the class docstring currently says it "evaluates code safety" which may mislead contributors into treating it as a security control. Consider adding a note: "This is a reward heuristic, not a security sandbox."
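A quick way to see why a check limited to bare-name callees misses these three patterns is to look at the AST node type of the callee in each. The helper name `outer_callee_type` is hypothetical.

```python
import ast

BYPASSES = [
    "getattr(__builtins__, 'eval')(code)",
    "importlib.import_module('os')",
    "__builtins__['eval'](code)",
]


def outer_callee_type(snippet: str) -> str:
    """Return the AST node type of the thing being called in a one-line call."""
    call = ast.parse(snippet, mode="eval").body
    assert isinstance(call, ast.Call)
    return type(call.func).__name__


callee_types = {snippet: outer_callee_type(snippet) for snippet in BYPASSES}
```

The outermost callees are a `Call`, an `Attribute`, and a `Subscript` respectively, so none is an `ast.Name`. In the first snippet the inner call does have a `Name` callee, but that name is `getattr`, which would only be caught if `getattr` itself were on the blocked list.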
Tier 2: Alignment
ALIGNMENT FLAG: reset() should mirror full base signature
- Principle: INVARIANTS.md §API Invariant 1 — Gymnasium API signatures must not change without a major version bump
- Concern: Omitting `**kwargs` from the override breaks forward-compatibility. All `Environment` subclasses should mirror the full base class signature.
What looks good
- AST-based detection is a significant improvement — eliminates false positives on string literals and user-defined functions
- New test file covers both true-positive and false-positive cases thoroughly
- Lazy import pattern in `__init__.py` is clean and well-motivated
- `episode_id` override support aligns with the Gym-like API contract
Automated review by Claude Code
@greptile review
@greptile review
@greptile review
ping @Darktex
Summary
This PR fixes multiple `coding_env` reliability issues with a focus on API compatibility and reward correctness.
It aligns `PythonCodeActEnv` with expected optional `reset`/`step` parameters, removes duplicate server entrypoint execution blocks, and replaces regex-based safety checks with AST-based detection to prevent reward false positives.
It also adds focused tests for the new behavior and safety-detection edge cases.
Type of Change
Alignment Checklist
Before submitting, verify:
- `.claude/docs/PRINCIPLES.md` and this PR aligns with our principles
- `.claude/docs/INVARIANTS.md` and no invariants are violated
- `/pre-submit-pr` (or `bash .claude/hooks/lint.sh` and tests) and addressed all issues
RFC Status
Test Plan
Reviewers can verify with: