[codex] Fix mini-swe-agent 2 quickstart #1782

Draft
hemildesai wants to merge 1 commit into NVIDIA-NeMo:main from hemildesai:codex/mini-swe-agent2-quickstart

@hemildesai

Contributor

Summary

Fixes the mini-swe-agent 2 README quickstart so it matches the current Gym CLI workflow and the checked-in sandbox defaults.

This PR addresses the reported quickstart issues:

  • adds a real Quick Start flow: prerequisites, environment variables, start servers, run one small eval, and expected outputs
  • replaces deprecated ng_run, ng_collect_rollouts, and ng_reward_profile examples with gym env start, gym eval run, and gym eval profile
  • adds a committed one-row smoke input at responses_api_agents/mini_swe_agent_2/data/example.jsonl
  • standardizes quickstart generation settings on max_output_tokens: 16384
  • documents OpenSandbox, model endpoint, and SWE-bench image prerequisites
  • keeps the server example aligned with the config defaults: cpu: 2, memory_mib: 8192, disk_gib: 20, and step_limit: 250
  • adds an expected outputs section describing rollout, materialized input, aggregate metrics, and per-instance artifacts
  • adds a unit-test guard that the committed smoke file stays one valid SWE-bench-style row
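
The unit-test guard in the last bullet could look roughly like the sketch below. This is not the test added in this PR. The helper name `load_single_row` and the `required_keys` parameter are hypothetical, and the keys that count as "SWE-bench-style" are left to the caller, because the row's actual schema is not shown here.

```python
import json
from pathlib import Path


def load_single_row(path: Path, required_keys: tuple = ()) -> dict:
    """Return the only JSON object in a JSONL file, or raise ValueError.

    Hypothetical helper: `required_keys` is an assumption standing in for
    whatever fields the real smoke row must carry.
    """
    # Ignore blank lines so a trailing newline does not count as a row.
    lines = [ln for ln in path.read_text(encoding="utf-8").splitlines() if ln.strip()]
    if len(lines) != 1:
        raise ValueError(f"expected exactly 1 row, found {len(lines)}")

    # json.JSONDecodeError subclasses ValueError, so malformed JSON also
    # surfaces as ValueError to the caller.
    row = json.loads(lines[0])
    if not isinstance(row, dict):
        raise ValueError("row must be a JSON object")

    missing = [key for key in required_keys if key not in row]
    if missing:
        raise ValueError(f"row is missing keys: {missing}")
    return row
```

A pytest case would call this on responses_api_agents/mini_swe_agent_2/data/example.jsonl and fail if the file grows past one row or stops parsing.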

Validation

  • python -m json.tool responses_api_agents/mini_swe_agent_2/data/example.jsonl
  • wc -l responses_api_agents/mini_swe_agent_2/data/example.jsonl
  • stale-command scan for deprecated commands and the old missing data filename
  • uv run pytest responses_api_agents/mini_swe_agent_2/tests/test_app.py -q
  • uv run pre-commit run --files responses_api_agents/mini_swe_agent_2/README.md responses_api_agents/mini_swe_agent_2/data/example.jsonl responses_api_agents/mini_swe_agent_2/tests/test_app.py
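
For reference, the stale-command scan can be approximated with a few lines of Python. This is a sketch under assumptions, not the exact command that was run: the function name `find_stale_commands` is hypothetical, and only the three deprecated command names listed in the Summary are checked (the old data filename is left out because it is not named here).

```python
import re

# Deprecated command names taken from the Summary above.
DEPRECATED = ("ng_run", "ng_collect_rollouts", "ng_reward_profile")


def find_stale_commands(text: str, names: tuple = DEPRECATED) -> list:
    """Return (line_number, command) pairs for each deprecated command found."""
    # \b keeps "ng_run" from matching inside a longer identifier such as
    # "ng_runner", since "_" and letters are both word characters.
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in pattern.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits
```

Running it over the contents of responses_api_agents/mini_swe_agent_2/README.md should return an empty list once the quickstart uses only the gym commands.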

Signed-off-by: Hemil Desai <hemild@nvidia.com>
@copy-pr-bot

copy-pr-bot Bot commented Jun 26, 2026


This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.
