swe-grep-oss

Overview

Environment ID: swe-grep-oss
Short description: Environment for evaluating and developing models like SWE-grep

Datasets

Primary dataset(s): SWE-Bench Lite

Task

Type: <single-turn | multi-turn | tool use>
Parser: <e.g., ThinkParser, XMLParser, custom>
Rubric overview:

Quickstart

Run an evaluation with your model of choice (repos are cloned automatically and deleted after each rollout):

Default rollout clone root: the swe-grep-oss-repos directory inside the system temp directory
Rollout directories are unique per rollout and are named <repo>_<instance_id>_<random_suffix>
Repositories are cloned directly at the target commit with git clone --revision <sha> --depth 1 when the installed Git supports it, with a git init + fetch fallback for older Git versions
Set SWE_GREP_ENV_BACKEND=sandbox to switch from the default local env to a sandbox-backed env
The sandbox variant uses a minimal public image (python:3.11-slim) with 1 CPU core, 2 GB RAM, and 5 GB disk. During setup it installs git, jq, and ripgrep, then checks out the repo into /workspace/repo

uv run vf-eval swe-grep-oss \
  --api-base-url https://api.openai.com/v1 \
  --api-key-var OPENAI_API_KEY \
  --model "gpt-4o-mini" \
  --num-examples 2 \
  --rollouts-per-example 1
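
The clone notes above can be sketched in Python. This is an illustration under stated assumptions, not the environment's actual code: the helper names (rollout_dir, clone_commands, selected_backend), the "__" flattening of owner/name repos, the 8-character hex suffix, the exact fallback command sequence, and the "local" default value are all hypothetical.

```python
import os
import secrets
import tempfile
from pathlib import Path
from typing import List, Optional


def rollout_dir(repo: str, instance_id: str, root: Optional[Path] = None) -> Path:
    """Return a unique path shaped like <root>/<repo>_<instance_id>_<random_suffix>."""
    if root is None:
        # Reading of the note above: swe-grep-oss-repos inside the system temp dir.
        root = Path(tempfile.gettempdir()) / "swe-grep-oss-repos"
    safe_repo = repo.replace("/", "__")  # assumption: flatten "owner/name"
    suffix = secrets.token_hex(4)        # assumption: 8 hex characters
    return root / f"{safe_repo}_{instance_id}_{suffix}"


def clone_commands(url: str, sha: str, dest: Path, supports_revision: bool) -> List[List[str]]:
    """Build the git commands that check out exactly one commit into dest."""
    if supports_revision:
        # Newer Git: a single shallow clone at the target commit.
        return [["git", "clone", "--revision", sha, "--depth", "1", url, str(dest)]]
    # Older Git (assumed fallback): init an empty repo, fetch only that commit.
    return [
        ["git", "init", str(dest)],
        ["git", "-C", str(dest), "remote", "add", "origin", url],
        ["git", "-C", str(dest), "fetch", "--depth", "1", "origin", sha],
        ["git", "-C", str(dest), "checkout", "FETCH_HEAD"],
    ]


def selected_backend() -> str:
    """Read SWE_GREP_ENV_BACKEND; "local" as the default value is an assumption."""
    return os.environ.get("SWE_GREP_ENV_BACKEND", "local")


if __name__ == "__main__":
    # Placeholder repo, instance ID, URL, and commit; nothing is cloned here.
    dest = rollout_dir("example/repo", "example__repo-1")
    for cmd in clone_commands("https://example.com/repo.git", "abc123", dest, False):
        print(" ".join(cmd))
    print("backend:", selected_backend())
```

Returning the commands as argument lists rather than running them keeps the sketch free of network access; each list could be passed to subprocess.run(cmd, check=True) to perform the clone.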

