Fix coding_env API compatibility and safety reward false positives #635

Open
abhinavgautam01 wants to merge 8 commits into huggingface:main from abhinavgautam01:fix/coding-env-api-signature
Conversation

@abhinavgautam01 abhinavgautam01 commented May 2, 2026

Summary

This PR fixes multiple coding_env reliability issues with a focus on API compatibility and reward correctness.
It aligns PythonCodeActEnv with expected optional reset/step parameters, removes duplicate server entrypoint execution blocks, and replaces regex-based safety checks
with AST-based detection to prevent reward false positives.
It also adds focused tests for the new behavior and safety-detection edge cases.

Type of Change

  • Bug fix
  • New feature
  • Breaking change
  • Documentation
  • New environment
  • Refactoring

Alignment Checklist

Before submitting, verify:

  • I have read .claude/docs/PRINCIPLES.md and this PR aligns with our principles
  • I have checked .claude/docs/INVARIANTS.md and no invariants are violated
  • I have run /pre-submit-pr (or bash .claude/hooks/lint.sh and tests) and addressed all issues

RFC Status

  • Not required (bug fix, docs, minor refactoring)
  • RFC exists: #___
  • RFC needed (will create before merge)

Test Plan

Reviewers can verify with:

# Lint / syntax checks for touched files
uv run ruff check envs/coding_env/server/python_codeact_env.py \
 envs/coding_env/server/app.py \
 envs/coding_env/server/transforms.py \
 envs/coding_env/server/__init__.py \
 tests/envs/test_python_codeact_reset.py \
 tests/envs/test_coding_safety_transform.py

uv run python -m py_compile \
 envs/coding_env/server/python_codeact_env.py \
 envs/coding_env/server/app.py \
 envs/coding_env/server/transforms.py \
 envs/coding_env/server/__init__.py \
 tests/envs/test_python_codeact_reset.py \
 tests/envs/test_coding_safety_transform.py

# Focused behavior tests
PYTHONPATH=src:envs uv run pytest tests/envs/test_python_codeact_reset.py -q
PYTHONPATH=src:envs uv run pytest tests/envs/test_coding_safety_transform.py -q
PYTHONPATH=src:envs uv run pytest tests/envs/test_coding_env_integration.py -q

@meta-cla meta-cla Bot added the CLA Signed label May 2, 2026
Contributor

greptile-apps Bot commented May 2, 2026

Greptile Summary

This PR fixes three coding_env reliability issues: AST-based safety detection replaces regex to eliminate false positives from string literals and same-named user functions, PythonCodeActEnv.reset() gains the seed/episode_id/**kwargs signature required by the Gymnasium API contract, and the __init__.py module adopts a lazy import to decouple transforms from the optional smolagents runtime dependency.

  • transforms.py: CodeSafetyTransform now uses ast.walk to detect only direct builtin calls and top-level os/subprocess imports; attribute-method calls like db.exec() are no longer penalised. CodeQualityTransform broadened its ast.parse guard to (SyntaxError, RecursionError, ValueError).
  • python_codeact_env.py: reset() accepts seed, episode_id, and **kwargs for API compatibility; step() is restored to the canonical step(action) -> Observation signature from INVARIANTS.md.
  • __init__.py / app.py: Module-level lazy loading via __getattr__ and an import reorder in app.py.
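
To make the first bullet concrete, here is a minimal sketch of AST-based detection along the lines described. The function name `detect_violation`, the two blocklists, and the returned labels are assumptions for illustration; the PR's actual `_detect_violation` may differ in scope and return type. This sketch flags blocked imports anywhere in the tree, not only at the top level, and it does not account for user code that redefines a blocked name.

```python
import ast
from typing import Optional

# Hypothetical blocklists for illustration; the PR's real sets may differ.
BLOCKED_BUILTINS = {"eval", "exec", "open", "__import__"}
BLOCKED_MODULES = {"os", "subprocess"}


def detect_violation(code: str) -> Optional[str]:
    """Return a short label for the first violation found, or None."""
    try:
        tree = ast.parse(code)
    except (SyntaxError, RecursionError, ValueError):
        # Unparseable code is left to other checks in this sketch.
        return None

    for node in ast.walk(tree):
        # Only bare-name calls such as exec(...) count; db.exec(...) has an
        # ast.Attribute as its func and is therefore skipped.
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in BLOCKED_BUILTINS:
                return f"call:{node.func.id}"
        elif isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in BLOCKED_MODULES:
                    return f"import:{alias.name}"
        elif isinstance(node, ast.ImportFrom):
            if node.module and node.module.split(".")[0] in BLOCKED_MODULES:
                return f"import:{node.module}"
    return None
```

String literals never produce `ast.Call` or `ast.Import` nodes, which is why `print("import os")` passes this check while a real `import os` does not.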

Confidence Score: 5/5

Safe to merge — all previously-identified bugs are resolved and the core transform/env logic is correct.

The two previous out-of-diff findings (CodeQualityTransform crash on deeply-nested code and the dead timeout_s parameter in step()) are both fixed in the current code. The remaining comments are style-level and do not affect runtime correctness.

No files require special attention for correctness; tests/envs/test_coding_safety_transform.py is worth a second look for CI portability.

Important Files Changed

| Filename | Overview |
|---|---|
| envs/coding_env/server/__init__.py | Switched to lazy __getattr__-based import to avoid pulling in optional smolagents dependency; the resolved class is not cached on the module, so __getattr__ fires on every attribute access rather than only the first. |
| envs/coding_env/server/app.py | Import reordering only — openenv.core.env_server moved before local coding_env imports; no functional change. |
| envs/coding_env/server/python_codeact_env.py | Added seed, episode_id, and **kwargs to reset() for Gymnasium API compatibility; step() signature restored to the canonical step(action) -> Observation; both changes align with INVARIANTS.md. |
| envs/coding_env/server/transforms.py | Replaced regex-based safety detection with AST-based _detect_violation; broadened CodeQualityTransform exception guard to include RecursionError and ValueError; both previous P1 findings are resolved. |
| tests/envs/test_coding_safety_transform.py | New test file covering safety-transform false positives and recursion-error handling; missing the sys.path bootstrap used by the sibling test, so it silently breaks without PYTHONPATH=src:envs. |
| tests/envs/test_python_codeact_reset.py | Added two new reset tests for episode_id override and empty-string preservation; both correctly exercise the new guard in python_codeact_env.py. |

Sequence Diagram

sequenceDiagram
    participant Orch as Orchestrator
    participant Env as PythonCodeActEnv
    participant Exec as PyExecutor
    participant ST as CodeSafetyTransform
    participant QT as CodeQualityTransform

    Orch->>Env: "reset(seed?, episode_id?, **kwargs)"
    Env->>Exec: PyExecutor() (fresh instance)
    Env->>ST: create_safe_coding_transform()
    Env-->>Orch: "CodeObservation(reward=0.0)"

    Orch->>Env: "step(CodeAction(code=...))"
    Env->>Exec: run(code)
    Exec-->>Env: ExecutionResult(stdout, stderr, exit_code)
    Env->>ST: __call__(obs + last_code metadata)
    Note over ST: ast.parse to _detect_violation()
    ST-->>Env: "obs(reward=-1.0 or 0.0)"
    Env->>QT: __call__(obs)
    QT-->>Env: obs(reward adjusted)
    Env-->>Orch: CodeObservation
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
tests/envs/test_coding_safety_transform.py:9-10
**Test missing sys.path bootstrap — breaks without explicit PYTHONPATH**

The sibling test `test_python_codeact_reset.py` guards against missing path setup with `sys.path.insert(...)` for the repo root plus `src/`, then uses the fully-qualified prefix `envs.coding_env.*`. This file uses the short form `coding_env.*`, which works only when the runner sets `PYTHONPATH=src:envs`. Running bare (`pytest tests/`) or in a CI job that doesn't set that variable produces an `ImportError`, while the sibling test still passes. Following the same bootstrap pattern would make the suite self-contained.

### Issue 2 of 2
envs/coding_env/server/__init__.py:18-23
**Lazy loader doesn't cache the resolved class — `__getattr__` fires on every access**

The current implementation returns `PythonCodeActEnv` on each call to `__getattr__` but never writes it back onto the module. Any code that repeatedly accesses `coding_env.server.PythonCodeActEnv` as an attribute (rather than a one-time `from … import`) will re-execute the `__getattr__` body on each access. The standard fix is to set the resolved name on the module after the first load, turning subsequent lookups into a direct `__dict__` hit.

```suggestion
def __getattr__(name: str) -> Any:
    if name == "PythonCodeActEnv":
        from .python_codeact_env import PythonCodeActEnv
        import sys

        setattr(sys.modules[__name__], name, PythonCodeActEnv)
        return PythonCodeActEnv
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
```
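
The caching effect of the suggestion above can be checked in isolation. The sketch below builds a throwaway module with `types.ModuleType` and attaches a module-level `__getattr__` hook (PEP 562); the module name `lazy_demo`, the attribute `Heavy`, and the `load_log` list are hypothetical stand-ins, not names from the PR.

```python
import types
from typing import Any, List


def make_lazy_module(load_log: List[str]) -> types.ModuleType:
    """Build a throwaway module whose `Heavy` attribute is resolved lazily."""
    mod = types.ModuleType("lazy_demo")  # hypothetical module name

    def _module_getattr(name: str) -> Any:
        if name == "Heavy":
            load_log.append(name)          # record each run of the loader
            value = type("Heavy", (), {})  # stand-in for the deferred import
            setattr(mod, name, value)      # cache on the module itself
            return value
        raise AttributeError(f"module {mod.__name__!r} has no attribute {name!r}")

    # Module-level __getattr__ is only consulted when normal lookup fails.
    mod.__getattr__ = _module_getattr
    return mod


log: List[str] = []
demo = make_lazy_module(log)
first = demo.Heavy   # normal lookup fails, so the hook runs and caches
second = demo.Heavy  # served from the module __dict__; the hook is skipped
```

Without the `setattr` line, `log` would grow by one entry on every access, which is the behavior the issue describes.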

Reviews (7): Last reviewed commit: "Harden coding env quality transform"

Comment thread envs/coding_env/server/transforms.py Outdated
Comment thread envs/coding_env/server/python_codeact_env.py
Contributor

@Darktex Darktex left a comment

Note: This is an automated review by Claude Code, not a human review.


Alignment Review Report — Tier 1/Tier 2

Automated Checks

  • Lint: envs/coding_env/server/app.py has a usort import sort violation (pre-existing, but CI blocker)
  • Debug code: CLEAN — no debug artifacts introduced

Tier 1: Fixes Required

1. step() signature incomplete vs base class

python_codeact_env.py — The base class Environment.step() declares step(self, action: ActT, timeout_s: Optional[float] = None, **kwargs: Any). The concrete implementation only has step(self, action: Action). This is a Liskov substitution violation. Since the PR reformats this method signature, this is the right moment to bring it into conformance with the base class.
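
As an illustration of the substitution concern, the sketch below models only the `step()` signature quoted above. `Environment` here is a local stand-in, and `ConformingEnv`, `NarrowEnv`, and `accepts_timeout` are hypothetical names; nothing else about the real base class is assumed.

```python
import inspect
from abc import ABC, abstractmethod
from typing import Any, Dict, Generic, Optional, TypeVar

ActT = TypeVar("ActT")


class Environment(ABC, Generic[ActT]):
    """Local stand-in: only the step() signature from the review is modeled."""

    @abstractmethod
    def step(
        self, action: ActT, timeout_s: Optional[float] = None, **kwargs: Any
    ) -> Dict[str, Any]:
        ...


class ConformingEnv(Environment[str]):
    def step(
        self, action: str, timeout_s: Optional[float] = None, **kwargs: Any
    ) -> Dict[str, Any]:
        # timeout_s is accepted for compatibility; this sketch does not enforce it.
        return {"action": action, "timeout_s": timeout_s}


class NarrowEnv(Environment[str]):
    def step(self, action: str) -> Dict[str, Any]:  # drops timeout_s and **kwargs
        return {"action": action}


def accepts_timeout(env: Environment) -> bool:
    """True if env.step can be called with a timeout_s keyword."""
    return "timeout_s" in inspect.signature(env.step).parameters
```

A caller written against the base class can pass `timeout_s=1.0` to `ConformingEnv.step`, while the same call on `NarrowEnv` raises `TypeError`.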

2. Import sort violation in app.py

app.py: usort reports the file needs import sorting. Pre-existing issue but will block CI on merge.

3. sys.path manipulation in new test file

test_coding_safety_transform.py — Uses sys.path.insert() instead of relying on the project's standard PYTHONPATH=src:envs test setup. Should use conftest.py or the standard env var approach for consistency.
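
One possible shape for the conftest.py route is sketched below. The helper name `ensure_on_sys_path` and the assumption that `src/` and `envs/` sit directly under the repository root are illustrative; the project's actual layout and conftest location may differ.

```python
# conftest.py (sketch): centralize import-path setup instead of per-test edits.
import sys
from pathlib import Path
from typing import List, Sequence


def ensure_on_sys_path(
    root: Path, subdirs: Sequence[str] = ("src", "envs")
) -> List[str]:
    """Prepend root/<subdir> to sys.path once; return the entries that were added."""
    added: List[str] = []
    for sub in subdirs:
        entry = str((root / sub).resolve())
        if entry not in sys.path:  # idempotent across repeated calls
            sys.path.insert(0, entry)
            added.append(entry)
    return added


# In a real conftest.py at the repository root (assumed location), one would call:
# ensure_on_sys_path(Path(__file__).resolve().parent)
```

pytest imports conftest.py before collecting the tests beside and below it, so individual test modules would not need their own path setup.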

4. SyntaxError handling is a behavioral regression (minor)

transforms.py — When ast.parse() fails on syntactically invalid code containing dangerous patterns (e.g., import os\n\x00), _detect_violation() returns None and reward is set to 0.0. The old regex approach would have caught import os even in broken code. This is likely acceptable (CodeQualityTransform handles syntax errors separately), but should be documented as an intentional trade-off.
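
The trade-off can be seen side by side in a small sketch. The regex `IMPORT_OS_RE` is a hypothetical stand-in for the old check (the original pattern is not shown in this thread), and `ast_flags` is a reduced illustration rather than the PR's `_detect_violation`.

```python
import ast
import re

# Hypothetical stand-in for the previous regex-based check.
IMPORT_OS_RE = re.compile(r"\bimport\s+os\b")


def regex_flags(code: str) -> bool:
    """Text-level match: sees 'import os' anywhere, even inside strings."""
    return bool(IMPORT_OS_RE.search(code))


def ast_flags(code: str) -> bool:
    """Syntax-level match: only a real `import os` statement in parseable code."""
    try:
        tree = ast.parse(code)
    except (SyntaxError, ValueError):
        return False  # unparseable code yields no violation in this sketch
    return any(
        isinstance(node, ast.Import) and any(a.name == "os" for a in node.names)
        for node in ast.walk(tree)
    )


broken = "import os\ndef ("          # dangerous pattern inside invalid code
literal = 'print("import os")'       # harmless string literal
```

Here `regex_flags` reports both `broken` and `literal`, while `ast_flags` reports neither: the AST approach drops the false positive on the literal at the cost of missing the pattern in code that does not parse.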

Tier 2: Alignment Discussion

ALIGNMENT FLAG: reset() signature extended with seed, episode_id, **kwargs

  • Principle at stake: API Invariant #1 from INVARIANTS.md
  • Assessment: The base class Environment.reset() already declares these parameters. Adding them to the concrete class brings coding_env into compliance with the documented invariant, not violating it. This is correct.
  • Concern: The episode_id override allows the training orchestrator to force a specific episode ID. Confirm that no MCP-exposed tool forwards episode_id to reset(), which would let an agent influence its own episode identity.
  • Suggested reviewer: @Darktex

ALIGNMENT FLAG: AST safety check coverage

  • Principle at stake: Rewards inside environment (RFC 002)
  • Assessment: The switch from regex to AST is a clear improvement — eliminates false positives from string literals (print("import os")) and similarly-named functions (myopen()). The AST check correctly only inspects ast.Name nodes, so db.exec("sql") is properly ignored (contradicting Greptile's P1 claim).
  • Concern: The blocked set is the same scope as before, but since this is a rewrite of the safety logic, worth asking: is the set complete? Notably absent: importlib.import_module("os") bypasses the check since it's an ast.Attribute call, not ast.Name. Similarly pathlib, socket, ctypes. May be acceptable if the environment has OS-level isolation.
  • Suggested reviewer: @Darktex

Clarification on Greptile's Prior P1 Claims

Greptile's review (from May 2) flagged two P1 issues that do not apply to the actual diff:

  1. "timeout_s accepted but never applied" — The actual diff does NOT add timeout_s to step(). It only reformats the existing signature. Greptile appears to have reviewed a different version of the code.
  2. "Attribute-based call detection introduces new false positives" — The actual code checks only ast.Name (bare function calls like exec(...)), NOT ast.Attribute (method calls like db.exec(...)). The test test_does_not_flag_attribute_method_named_exec explicitly verifies this. Greptile's claim is incorrect.

Summary

  • 2 mechanical fixes needed (step() signature conformance, import sort)
  • 1 minor style fix (sys.path manipulation in tests)
  • 1 behavioral note to document (SyntaxError handling trade-off)
  • 2 alignment questions for human review (episode_id/MCP boundary, safety check coverage)
  • Greptile's P1 flags are not applicable to the actual diff

Verdict: REQUEST_CHANGES


Automated review by Claude Code | Learn more

@abhinavgautam01
Author

@greptile review

@abhinavgautam01 abhinavgautam01 force-pushed the fix/coding-env-api-signature branch from ebf663c to 4e9c397 Compare May 13, 2026 19:56
@abhinavgautam01
Author

@greptile review

@abhinavgautam01
Author

@greptile review

@abhinavgautam01 abhinavgautam01 requested a review from Darktex May 13, 2026 20:14
Contributor

@Darktex Darktex left a comment

Note: This is an automated review by Claude Code, not a human review.


Alignment Review: PR #635

Tier 1: Bugs & Issues

1. reset() signature missing **kwargs from base class

The base class Environment.reset() accepts **kwargs: Any for forward-compatibility. This PR adds seed and episode_id but omits **kwargs. If the framework ever passes an additional keyword argument to reset(), PythonCodeActEnv will raise a TypeError while other environments will not. Please add **kwargs to match the full base signature.

2. seed parameter accepted but unused — intentional?

seed is accepted in the new signature but never referenced in the method body. This is likely fine (no random state to seed), but a brief inline comment clarifying this would prevent future contributors from assuming it's a bug.
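
Points 1 and 2 together suggest a `reset()` shaped roughly like the sketch below. `SketchEnv` and its dictionary return value are hypothetical stand-ins for the real environment and observation types; the empty-string handling follows the "empty-string preservation" behavior mentioned in the file overview.

```python
import uuid
from typing import Any, Dict, Optional


class SketchEnv:
    """Hypothetical stand-in for an Environment subclass."""

    def __init__(self) -> None:
        self.episode_id: Optional[str] = None

    def reset(
        self,
        seed: Optional[int] = None,
        episode_id: Optional[str] = None,
        **kwargs: Any,
    ) -> Dict[str, Any]:
        # seed is intentionally unused: this sketch has no random state to seed.
        # **kwargs is accepted and ignored so future framework arguments
        # do not raise TypeError.
        # `is not None` (rather than truthiness) keeps an explicit "" as given.
        if episode_id is not None:
            self.episode_id = episode_id
        else:
            self.episode_id = str(uuid.uuid4())
        return {"episode_id": self.episode_id, "reward": 0.0}
```

Calling `SketchEnv().reset(seed=1, some_future_flag=True)` succeeds and generates a fresh ID, whereas a signature without `**kwargs` would reject the unknown keyword.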

3. AST-based safety detection has known bypass vectors

The switch from regex to AST analysis is a clear improvement that eliminates false positives. However, the following bypass patterns will NOT be caught:

  • getattr(__builtins__, 'eval')(code) — attribute-based access
  • importlib.import_module('os') — dynamic import via importlib
  • __builtins__['eval'](code) — subscript-based access

These are acceptable since the transform is a reward shaping heuristic, not a security boundary (container isolation provides actual sandboxing per INVARIANTS.md). However, the class docstring currently says it "evaluates code safety" which may mislead contributors into treating it as a security control. Consider adding a note: "This is a reward heuristic, not a security sandbox."
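
The three bypass patterns can be checked against a name-only call check without executing any of them. In this sketch `BLOCKED` and `name_only_check` are hypothetical; the snippets are only parsed, never run.

```python
import ast

BLOCKED = {"eval", "exec"}  # hypothetical blocklist for illustration


def name_only_check(code: str) -> bool:
    """Flag only calls whose callee is a bare blocked name, e.g. eval(...)."""
    tree = ast.parse(code)
    return any(
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id in BLOCKED
        for node in ast.walk(tree)
    )


# The bypass patterns listed above, as source text (parsed only, not executed).
BYPASSES = [
    "getattr(__builtins__, 'eval')(code)",   # callee is itself a Call node
    "importlib.import_module('os')",         # callee is an ast.Attribute
    "__builtins__['eval'](code)",            # callee is an ast.Subscript
]

missed = [src for src in BYPASSES if not name_only_check(src)]
```

All three end up in `missed`, while a direct `eval(code)` is flagged, which matches the framing of the transform as a reward heuristic rather than a sandbox.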

Tier 2: Alignment

ALIGNMENT FLAG: reset() should mirror full base signature

  • Principle: INVARIANTS.md §API Invariant 1 — Gymnasium API signatures must not change without a major version bump
  • Concern: Omitting **kwargs from the override breaks forward-compatibility. All Environment subclasses should mirror the full base class signature.

What looks good

  • AST-based detection is a significant improvement — eliminates false positives on string literals and user-defined functions
  • New test file covers both true-positive and false-positive cases thoroughly
  • Lazy import pattern in __init__.py is clean and well-motivated
  • episode_id override support aligns with the Gym-like API contract

Automated review by Claude Code | Learn more

@abhinavgautam01
Author

@greptile review

Comment thread envs/coding_env/server/transforms.py
@abhinavgautam01
Author

@greptile review

@abhinavgautam01
Author

@greptile review

@abhinavgautam01 abhinavgautam01 requested a review from Darktex May 14, 2026 02:53
@abhinavgautam01
Author

ping @Darktex

Labels

CLA Signed This label is managed by the Meta Open Source bot.

2 participants