
Update Harbor converter runtime isolation #411

Open
nancyjlau wants to merge 1 commit into main from nancyjlau/harbor-convert-runtime-isolation


@nancyjlau
Contributor

@nancyjlau nancyjlau commented May 8, 2026

this PR updates the Harbor converter based on issues found while converting Harbor-format tasks.

changes:

  • preserve Harbor task workdir behavior, including [environment].workdir overrides, and make sure the original challenge workdir is writable for demoted coding tools
  • keep multiline HEALTHCHECK continuations intact when commenting out the original CMD/ENTRYPOINT
  • emit v5-compatible task fields: slug, agent_config, and validation
  • skip scoring/source-context files when baking task data (SCORING.md, advisory/reference markdown files, .dockerignore)
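The HEALTHCHECK point can be illustrated with a minimal sketch. It assumes the converter treats a Dockerfile as logical instructions joined by trailing backslashes, so a `CMD` that appears on a continuation line of `HEALTHCHECK` is left alone. The function name `comment_out_entrypoint` is hypothetical and this is not the converter's actual code.

```python
def comment_out_entrypoint(dockerfile_text: str) -> str:
    """Comment out top-level CMD/ENTRYPOINT instructions only.

    Hypothetical helper: a line is part of the previous instruction when
    that previous line ended with a backslash, so `CMD` inside a
    multi-line HEALTHCHECK is not treated as a new instruction.
    """
    out = []
    in_continuation = False  # previous line ended with a backslash
    commenting = False       # current logical instruction is being commented out
    for line in dockerfile_text.splitlines():
        stripped = line.strip()
        if not in_continuation:
            # Only the first line of a logical instruction decides its fate.
            keyword = stripped.split(maxsplit=1)[0].upper() if stripped else ""
            commenting = keyword in ("CMD", "ENTRYPOINT")
        out.append(f"# {line}" if commenting else line)
        in_continuation = stripped.endswith("\\")
    return "\n".join(out)


example = (
    "FROM python:3.12\n"
    "HEALTHCHECK --interval=5s \\\n"
    "  CMD curl -f http://localhost/ || exit 1\n"
    'CMD ["python", \\\n'
    '  "app.py"]\n'
)
converted = comment_out_entrypoint(example)
```

The sketch ignores comment lines embedded inside a continuation, which a real Dockerfile parser would also need to handle.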

runtime/deploy fixes (@ryantzr1):

  • run the generated HUD server via hud on the generated venv PATH instead of wrapping startup with uv run
  • pin the generated controller venv to Python 3.12 for hosted analysis compatibility
  • clear stale verifier reward files before each scenario run
  • use a longer bash timeout for Harbor coding tasks
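Clearing stale reward files could look roughly like the sketch below. The file names `reward.txt` and `reward.json` and the helper name `clear_stale_rewards` are assumptions for illustration; the PR does not state which files the verifier writes.

```python
from pathlib import Path

# Assumed file names; the real verifier output names may differ.
REWARD_FILENAMES = ("reward.txt", "reward.json")


def clear_stale_rewards(verifier_dir: Path) -> list[Path]:
    """Delete leftover reward files so a new run cannot read an old score.

    Returns the paths that were actually removed.
    """
    removed = []
    for name in REWARD_FILENAMES:
        candidate = verifier_dir / name
        if candidate.is_file():
            candidate.unlink()
            removed.append(candidate)
    return removed
```

Calling this before each scenario run means a verifier that crashes without writing a reward is not mistaken for one that reported the previous run's score.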

isolation fix:

  • baked task data now lives under /root/.hud_harbor/tasks
  • task data is root-only (chown root:root + chmod go-rwx)
  • EditTool, ReadTool, GrepTool, GlobTool, and ListTool are scoped to AGENT_WORKDIR
  • keeps the working deploy path: WORKDIR /hud, ENV PATH="/hud/.venv/bin:$PATH", CMD ["hud", "dev", "env:env", "--stdio"]
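Scoping a file tool to AGENT_WORKDIR amounts to a path containment check. The sketch below is one way to do that under stated assumptions: `resolve_in_workdir` is a hypothetical helper, and the actual tools may enforce the boundary differently.

```python
from pathlib import Path


def resolve_in_workdir(workdir: str, requested: str) -> Path:
    """Resolve `requested` relative to `workdir` and reject anything outside it.

    Hypothetical helper. Resolving first collapses `..` segments and
    symlinks, so the comparison is made on the real target path.
    """
    root = Path(workdir).resolve()
    # An absolute `requested` replaces `root` here, which the check below catches.
    target = (root / requested).resolve()
    if target != root and root not in target.parents:
        raise PermissionError(f"{requested!r} is outside the agent workdir")
    return target
```

With a check like this, a request for baked task data under /root/.hud_harbor/tasks is rejected by the tool itself, independently of the root-only file permissions.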

validation:

  • uv run pytest hud/cli/convert/tests/test_harbor.py
  • uv run ruff check hud/cli/convert/harbor.py hud/cli/convert/__init__.py hud/cli/convert/tests/test_harbor.py

this was also tested end to end by converting one Harbor task that had issues, deploying the generated environment, and syncing the generated taskset successfully


Note

Medium Risk
Changes affect how converted Harbor environments run (workdir handling, tool scoping, Dockerfile/entrypoint, and reward file lifecycle), which could break existing converted tasks or alter agent filesystem access. However, the scope is limited to the Harbor converter and associated writer logic, with added tests.

Overview
Improves Harbor conversions to better isolate source task data from agents while preserving expected runtime behavior. Converted images now store task bundles under root-only /root/.hud_harbor/tasks, scope filesystem/edit tools to the extracted/configured Harbor workdir, and ensure that workdir is writable for the demoted agent user.
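The `chmod go-rwx` half of the root-only lockdown can be sketched in Python as below. This is an illustration only: `strip_group_other` is a hypothetical name, the `chown root:root` step is omitted because it requires root, and the generated image may do both steps in a Dockerfile `RUN` instead.

```python
import os
import stat
from pathlib import Path


def strip_group_other(root: Path) -> None:
    """Recursively remove all group and other permission bits under `root`.

    Equivalent in spirit to `chmod -R go-rwx`; owner bits are left unchanged.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        # Cover the directory itself plus every file directly inside it.
        entries = [dirpath] + [os.path.join(dirpath, name) for name in filenames]
        for entry in entries:
            mode = stat.S_IMODE(os.stat(entry).st_mode)
            os.chmod(entry, mode & ~0o077)
```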

Updates generated runtime artifacts: Dockerfile generation preserves multi-line HEALTHCHECK continuations, pins the HUD venv sync to Python 3.12, switches startup to hud (via venv PATH), and clears stale reward files before each run; the scenario also runs test.sh from the Harbor workdir and increases BashTool timeout.

Updates output compatibility and hygiene: taskset entries now include stable slugs plus v5 fields agent_config and validation, and write_result skips copying scoring/source-context files (e.g., SCORING.md, .dockerignore, and reference-style markdown). Tests were expanded to cover slug generation, workdir extraction/override, healthcheck preservation, task-data hiding, and the new skip rules.
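A skip rule of this kind might be expressed as a small predicate. The sketch below is an assumption about its shape: only SCORING.md and .dockerignore are named in the PR, while the markdown stems in `SKIP_MD_STEMS` and the function name `should_skip` are hypothetical placeholders for the "advisory/reference" files.

```python
from pathlib import PurePath

# Names taken from the PR description.
SKIP_NAMES = {"SCORING.md", ".dockerignore"}
# Hypothetical stems standing in for "advisory/reference markdown files".
SKIP_MD_STEMS = {"advisory", "reference"}


def should_skip(path: str) -> bool:
    """Return True for files that should not be baked into task data."""
    p = PurePath(path)
    if p.name in SKIP_NAMES:
        return True
    return p.suffix.lower() == ".md" and p.stem.lower() in SKIP_MD_STEMS
```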

Reviewed by Cursor Bugbot for commit 11d8032. Bugbot is set up for automated code reviews on this repo. Configure here.

Copy link
Copy Markdown

@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


Reviewed by Cursor Bugbot for commit 6568d5f. Configure here.

Comment thread hud/cli/convert/harbor.py Outdated
Co-authored-by: Ryan Tan <63581031+ryantzr1@users.noreply.github.com>
@nancyjlau nancyjlau force-pushed the nancyjlau/harbor-convert-runtime-isolation branch from 6568d5f to 11d8032 on May 8, 2026 17:12
@nancyjlau
Contributor Author

smoke test for checking whether an agent can see content inside the Harbor task files. it requires Docker running, the hud-python venv, and the current branch checked out:
harbor-agent-leak-smoke.sh

with the current fixes, the agent can no longer see that content


3 participants