Update Harbor converter runtime isolation #411
Open
nancyjlau wants to merge 1 commit into
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 6568d5f.
Co-authored-by: Ryan Tan <63581031+ryantzr1@users.noreply.github.com>
6568d5f to 11d8032
for testing whether an agent can see content inside the Harbor files. requires Docker running, the hud-python venv, and the current branch checked out. with the current fixes, the agent can no longer see the content.
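A rough way to sanity-check the permission side of this from inside a container is sketched below. It assumes the task bundle directory is meant to be owned by root with all group/other permission bits cleared, as the PR description states; the helper names are hypothetical and are not part of hud-python.

```python
import os
import stat


def group_other_bits_cleared(path: str) -> bool:
    """Return True if neither group nor others have any access to path.

    Hypothetical helper for illustration; mirrors the effect of `chmod go-rwx`.
    """
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & 0o077 == 0


def owned_by_root(path: str) -> bool:
    """Return True if path is owned by uid 0 and gid 0 (`chown root:root`)."""
    info = os.stat(path)
    return info.st_uid == 0 and info.st_gid == 0
```

Running both checks against the task directory as root, and then confirming that a read attempt as the demoted agent user fails, would cover both halves of the claim in the comment above.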

this PR updates the Harbor converter based on issues found while converting Harbor-format tasks.
changes:
- [environment].workdir overrides, and make sure the original challenge workdir is writable for demoted coding tools
- HEALTHCHECK continuations intact when commenting out the original CMD/ENTRYPOINT
- slug, agent_config, and validation
- SCORING.md, advisory/reference markdown files, .dockerignore)

runtime/deploy fixes (@ryantzr1):
- hud on the generated venv PATH instead of wrapping startup with uv run

isolation fix:
- /root/.hud_harbor/tasks
- chown root:root + chmod go-rwx)
- EditTool, ReadTool, GrepTool, GlobTool, and ListTool are scoped to AGENT_WORKDIR
- WORKDIR /hud, ENV PATH="/hud/.venv/bin:$PATH", CMD ["hud", "dev", "env:env", "--stdio"]

validation:
- uv run pytest hud/cli/convert/tests/test_harbor.py
- uv run ruff check hud/cli/convert/harbor.py hud/cli/convert/__init__.py hud/cli/convert/tests/test_harbor.py

this was also fully tested by converting one Harbor task that had issues, deploying the generated environment, and syncing the generated taskset successfully
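For readers unfamiliar with what scoping the file tools to AGENT_WORKDIR involves, here is a minimal sketch of a path-containment check. It is an illustration under assumptions, not the converter's actual implementation: the function name `is_within_workdir` is hypothetical, and the real tools may enforce the boundary differently.

```python
from pathlib import Path


def is_within_workdir(candidate: str, workdir: str) -> bool:
    """Return True if candidate resolves to workdir or a path beneath it.

    Hypothetical helper for illustration only.
    """
    root = Path(workdir).resolve()
    # Relative paths are joined onto the workdir; an absolute candidate
    # replaces it. Resolving afterwards collapses ".." segments and follows
    # symlinks, so neither can be used to step outside the workdir.
    target = (root / candidate).resolve()
    return target == root or root in target.parents
```

With a check like this, a request for a path under /root/.hud_harbor/tasks would be rejected whenever the workdir is some other directory, independently of the filesystem permissions on the task bundle.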
Note
Medium Risk
Changes affect how converted Harbor environments run (workdir handling, tool scoping, Dockerfile/entrypoint, and reward file lifecycle), which could break existing converted tasks or alter agent filesystem access. However, the scope is limited to the Harbor converter and associated writer logic, with added tests.
Overview
Improves Harbor conversions to better isolate source task data from agents while preserving expected runtime behavior. Converted images now store task bundles under root-only /root/.hud_harbor/tasks, scope filesystem/edit tools to the extracted/configured Harbor workdir, and ensure that workdir is writable for the demoted agent user.

Updates generated runtime artifacts: Dockerfile generation preserves multi-line HEALTHCHECK continuations, pins the HUD venv sync to Python 3.12, switches startup to hud (via venv PATH), and clears stale reward files before each run; the scenario also runs test.sh from the Harbor workdir and increases BashTool timeout.

Updates output compatibility and hygiene: taskset entries now include stable slugs plus v5 fields agent_config and validation, and write_result skips copying scoring/source-context files (e.g., SCORING.md, .dockerignore, and reference-style markdown). Tests were expanded to cover slug generation, workdir extraction/override, healthcheck preservation, task-data hiding, and the new skip rules.

Reviewed by Cursor Bugbot for commit 11d8032.
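The HEALTHCHECK point is easy to get wrong, because a HEALTHCHECK instruction usually carries its own CMD on a continuation line. The sketch below shows one way to comment out top-level CMD/ENTRYPOINT instructions while leaving such continuations alone. It is an assumption-laden illustration: `comment_out_entrypoints` is a hypothetical name, and it does not handle parser directives or comment lines embedded inside a continued instruction.

```python
def comment_out_entrypoints(dockerfile: str) -> str:
    """Comment out CMD/ENTRYPOINT instructions, including continuation lines.

    Hypothetical helper for illustration; not the converter's actual code.
    """
    out = []
    in_continuation = False  # previous line ended with a backslash
    commenting = False       # the current instruction is being commented out
    for line in dockerfile.splitlines():
        stripped = line.strip()
        is_comment = False
        if not in_continuation:
            # A new instruction starts here, so decide based on its keyword.
            # A "CMD" that appears on a continuation line (for example under
            # HEALTHCHECK) never reaches this branch and is left untouched.
            keyword = stripped.split(None, 1)[0].upper() if stripped else ""
            commenting = keyword in {"CMD", "ENTRYPOINT"}
            is_comment = stripped.startswith("#")
        out.append(f"# {line}" if commenting else line)
        # Standalone comment lines do not continue onto the next line.
        in_continuation = stripped.endswith("\\") and not is_comment
    return "\n".join(out)
```

Tracking the instruction boundary rather than matching each line by keyword is the design choice that matters here: the decision to comment is made once per instruction and then carried across its continuation lines.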