[misc] chore: sync latest code base#61

Merged: yyDing1 merged 3 commits into main from yy/sync on Jun 15, 2026

@yyDing1 (Collaborator) commented Jun 15, 2026

Summary

Minor data-pipeline / eval fixes and housekeeping for the SWE-agent examples.

Data preprocessing

  • swe_rebench.py: add a SKIP_INSTANCES hook and drop instances without an image; remove the git checkout <base_commit> / git clean steps from the per-instance reset; print the final instance count.
  • swe_rebench_v2.py: filter out instances whose deployment image is None.
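
The filtering described in the two bullets above can be sketched roughly as follows. This is a minimal illustration, not the code from swe_rebench.py: only the `SKIP_INSTANCES` name comes from the PR description, while the function name `filter_instances` and the field names `instance_id` and `image` are assumptions made for the example.

```python
# Hypothetical sketch: the field names "instance_id" and "image" are assumed,
# not taken from the actual dataset schema.
SKIP_INSTANCES: set[str] = set()  # instance IDs to exclude by hand


def filter_instances(instances, skip=SKIP_INSTANCES):
    """Drop instances that are explicitly skipped or have no image."""
    kept = [
        inst
        for inst in instances
        if inst.get("instance_id") not in skip and inst.get("image") is not None
    ]
    # Report the final instance count, as the PR description mentions.
    print(f"kept {len(kept)} of {len(instances)} instances")
    return kept
```

Keeping the skip list as a module-level set means a problematic instance can be excluded by editing one line, without touching the filtering logic.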

Eval / reward

  • reward/swe_rebench.py: skip the test-file reset when a task has no modified test files (avoids running git checkout <commit> with no paths).

Agent loop

  • agent_loop.py: move the per-run log handler and config logging inside the concurrency semaphore.
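
A rough sketch of what "inside the concurrency semaphore" means in practice: the per-run log file is opened only once a run holds a slot, so runs that are still queued do not open their log files. The names `run_one` and `main`, the logger naming scheme, and the use of `logging.FileHandler` in place of the project's own `add_file_handler` helper are all assumptions for illustration.

```python
import asyncio
import logging
import tempfile
from pathlib import Path


async def run_one(run_id: str, output_dir: Path, semaphore: asyncio.Semaphore) -> None:
    """Attach the per-run log file only while holding a concurrency slot."""
    logger = logging.getLogger(f"run.{run_id}")
    logger.setLevel(logging.INFO)
    async with semaphore:
        # Everything below runs only after a slot has been acquired.
        output_dir.mkdir(parents=True, exist_ok=True)
        handler = logging.FileHandler(output_dir / "run.log")
        logger.addHandler(handler)
        try:
            logger.info("run %s started", run_id)
            await asyncio.sleep(0)  # placeholder for the actual agent work
        finally:
            # Detach and close so the file handle is released with the slot.
            logger.removeHandler(handler)
            handler.close()


async def main(base: Path) -> None:
    semaphore = asyncio.Semaphore(2)  # at most two runs are active at once
    await asyncio.gather(
        *(run_one(f"r{i}", base / f"r{i}", semaphore) for i in range(4))
    )
```

Calling `asyncio.run(main(Path(tempfile.mkdtemp())))` creates one `run.log` per run, with at most two handlers open at any time.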

Misc

  • run_infer.sh: point the example at the modal parquet/config and bump --nnodes + sampling params.
  • README.md: add supervisor note to the citation.
  • Bump the verl submodule.

yyDing1 and others added 3 commits June 15, 2026 11:57
…to 7aed6b2

Revert the swe_rebench data source back to dyyyyyyyy/swe-rebench-filtered
(split=train) and restore the verl submodule to 7aed6b2 (drop the 5a38699 bump).

Co-authored-by: Cursor <cursoragent@cursor.com>

@gemini-code-assist (Bot, Contributor) left a comment

Code Review

This pull request updates the data preprocessing scripts for SWE-rebench, adjusts agent interaction parameters in run_infer.sh, moves logging initialization inside the semaphore block in agent_loop.py, and adds a safety check for an empty test-file list in reward/swe_rebench.py. The review feedback suggests two improvements: create the output directory defensively in agent_loop.py before adding the file handler, to avoid a potential FileNotFoundError; and use shlex.quote together with -- in the git checkout command in reward/swe_rebench.py, to prevent shell injection and option-parsing issues.

Comment on lines +32 to +35
if test_files:
    reset_tests_command = f"git checkout {base_commit} {' '.join(test_files)}"
else:
    reset_tests_command = "echo 'skip reset'"

Severity: medium (security)

To prevent potential shell injection vulnerabilities or command execution failures, filenames should be properly escaped using shlex.quote when constructing shell commands. Additionally, using -- before the file list in git checkout is a best practice to prevent filenames starting with a dash (-) from being interpreted as command-line options.

Suggested change
- if test_files:
-     reset_tests_command = f"git checkout {base_commit} {' '.join(test_files)}"
- else:
-     reset_tests_command = "echo 'skip reset'"
+ if test_files:
+     import shlex
+     reset_tests_command = f"git checkout {base_commit} -- {' '.join(shlex.quote(f) for f in test_files)}"
+ else:
+     reset_tests_command = "echo 'skip reset'"
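
As a self-contained variant of the suggested change above, the command construction could be pulled into a small helper. This is only a sketch: the function name `build_reset_tests_command` is hypothetical, and it uses `shlex.join` (Python 3.8+) in place of the explicit `' '.join(shlex.quote(...))` form, which quotes each path in the same way.

```python
import shlex


def build_reset_tests_command(base_commit: str, test_files: list[str]) -> str:
    """Return a shell command that restores test files from base_commit."""
    if not test_files:
        # With no paths, `git checkout <commit>` would detach HEAD at that
        # commit instead of restoring individual files, so emit a no-op.
        return "echo 'skip reset'"
    # `--` ends option parsing; shlex.join quotes each path for a POSIX shell.
    return f"git checkout {shlex.quote(base_commit)} -- {shlex.join(test_files)}"
```

For example, `build_reset_tests_command("abc123", ["tests/my test.py"])` returns `git checkout abc123 -- 'tests/my test.py'`, so a path containing a space reaches git as a single argument.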

Comment thread uni_agent/agent_loop.py
self.logger.info(f"output_dir: {self.output_dir}")

async with self._semaphore:
    add_file_handler(self.output_dir / "run.log", self.run_id)

Severity: medium

The directory self.output_dir is not guaranteed to exist when add_file_handler is called. If the logging handler tries to open the log file before the directory is created, it can raise a FileNotFoundError. Creating the directory defensively before adding the file handler prevents this potential failure.

Suggested change
- add_file_handler(self.output_dir / "run.log", self.run_id)
+ self.output_dir.mkdir(parents=True, exist_ok=True)
+ add_file_handler(self.output_dir / "run.log", self.run_id)
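
The failure mode the reviewer describes can be reproduced with the standard library alone. The sketch below assumes that `add_file_handler` ultimately opens the file through something like `logging.FileHandler`; its actual implementation is not shown in this PR, and the helper name `open_run_log` is hypothetical.

```python
import logging
import tempfile
from pathlib import Path


def open_run_log(output_dir: Path) -> logging.FileHandler:
    """Create output_dir if needed, then open run.log inside it."""
    output_dir.mkdir(parents=True, exist_ok=True)
    return logging.FileHandler(output_dir / "run.log")


# A nested directory that does not exist yet.
missing = Path(tempfile.mkdtemp()) / "nested" / "run-0"

try:
    # FileHandler opens the file eagerly, so a missing parent directory fails here.
    logging.FileHandler(missing / "run.log")
    failed = False
except FileNotFoundError:
    failed = True

handler = open_run_log(missing)  # succeeds once the directory is created
handler.close()
```

Because `mkdir(parents=True, exist_ok=True)` is idempotent, calling it before every handler creation is safe even when the directory already exists.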

@yyDing1 merged commit 1981878 into main on Jun 15, 2026
4 of 5 checks passed
@yyDing1 deleted the yy/sync branch on June 15, 2026 04:14