Updates for supporting CVDP Agentic subset by arti4nvj · Pull Request #1744 · NVIDIA-NeMo/Gym

arti4nvj · 2026-06-25T23:20:56Z

Refactor the CVDP resources server to verify RTL using the Apptainer Provider sandbox instead of the previous Docker harness, and add an agentic CVDP agent.

Split harness from verification logic: extracted the harness execution into a separate harness.py so app.py holds only the verifier logic.
Apptainer-based verification: the resources server now runs the CVDP test harness inside an Apptainer Provider sandbox.
New agentic agent (cvdp_agent/agentic_app.py): wraps the Claude Code agent, installs it into the Apptainer sandbox, and lets the model edit files and self-test with the in-container EDA tools.
Configs, tests, and README updated for both the non-agentic and agentic flows.
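The harness/verifier split described above can be pictured with a minimal sketch. It is not the code in this PR: `HarnessResult`, `Harness`, and `verify` are hypothetical names, and the binary pass/fail reward is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Mapping


@dataclass(frozen=True)
class HarnessResult:
    """Outcome of one harness run (hypothetical shape)."""
    exit_code: int
    log: str


# A harness takes RTL files (path -> contents) and returns a result.
# In this PR the execution side lives in harness.py; this alias is illustrative only.
Harness = Callable[[Mapping[str, str]], HarnessResult]


def verify(rtl_files: Mapping[str, str], run_harness: Harness) -> float:
    """Verifier side: turn a harness result into a reward.

    Assumption: reward is 1.0 when the harness exits with code 0, else 0.0.
    """
    if not rtl_files:
        # Nothing to grade, so skip the harness entirely.
        return 0.0
    result = run_harness(rtl_files)
    return 1.0 if result.exit_code == 0 else 0.0
```

Because the verifier only sees a callable, a test can pass in a stub harness instead of starting a sandbox.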

For n=1, the agentic non-commercial subset shows a 35.87% pass rate (compared to 40% with the original CVDP infra). The non-agentic non-commercial subset shows 41.72% (in line with the original CVDP infra).

Add an ApptainerProvider implementing the SandboxProvider protocol via the local apptainer CLI: persistent instance lifecycle, exec with user/fakeroot mapping, bind-mount file transfer, status, readiness probe, and teardown. Register it under the name "apptainer" and add unit tests plus a README. Signed-off-by: Arti Jain <artij@nvidia.com>
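As a rough illustration of the lifecycle listed above (start a persistent instance, exec into it, tear it down), the sketch below only builds `apptainer` command lines and does not run them. The `SandboxProviderSketch` interface and the helper names are hypothetical; the real SandboxProvider protocol and ApptainerProvider in this PR may look quite different. The CLI subcommands and flags used (`instance start`, `instance stop`, `exec`, `--bind`, `--fakeroot`, `instance://`) are standard Apptainer ones.

```python
from __future__ import annotations

from typing import Protocol, Sequence


class SandboxProviderSketch(Protocol):
    """Hypothetical interface; the actual SandboxProvider protocol may differ."""

    def start(self) -> None: ...
    def exec(self, command: Sequence[str]) -> int: ...
    def stop(self) -> None: ...


def instance_start_argv(
    image: str, name: str, binds: Sequence[str] = (), fakeroot: bool = False
) -> list[str]:
    """Build the command that starts a persistent Apptainer instance."""
    argv = ["apptainer", "instance", "start"]
    if fakeroot:
        argv.append("--fakeroot")
    for bind in binds:
        # Each bind is "host_path:container_path"; used here for file transfer.
        argv += ["--bind", bind]
    return argv + [image, name]


def exec_argv(name: str, command: Sequence[str]) -> list[str]:
    """Build the command that runs `command` inside a running instance."""
    return ["apptainer", "exec", f"instance://{name}", *command]


def instance_stop_argv(name: str) -> list[str]:
    """Build the command that tears the instance down."""
    return ["apptainer", "instance", "stop", name]
```

Keeping the argv construction in pure functions makes it easy to unit-test without an Apptainer install; a provider would hand these lists to `subprocess.run`.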

Signed-off-by: Arti Jain <artij@nvidia.com>

Parse Claude Code's authoritative num_turns from the stream-json result event and include it in the returned metadata. Signed-off-by: Arti Jain <artij@nvidia.com>
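A minimal sketch of that parsing step, assuming the stream-json output is newline-delimited JSON and that the final event has `"type": "result"` with an integer `num_turns` field. The function name `extract_num_turns` is hypothetical and is not taken from the PR.

```python
import json
from typing import Iterable, Optional


def extract_num_turns(lines: Iterable[str]) -> Optional[int]:
    """Return num_turns from the last result event, or None if absent."""
    num_turns: Optional[int] = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            # Tolerate non-JSON noise interleaved in the stream.
            continue
        if isinstance(event, dict) and event.get("type") == "result":
            value = event.get("num_turns")
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, int) and not isinstance(value, bool):
                num_turns = value
    return num_turns
```

Returning `None` rather than 0 when no result event is found keeps "no data" distinguishable from "zero turns" in the metadata.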

Add the CVDP code-generation environment built on the Apptainer sandbox provider: resources server with harness execution, non-agentic and agentic cvdp_agent harnesses, configs, tests, and example dataset. Signed-off-by: Arti Jain <artij@nvidia.com>

Signed-off-by: Christian Munley <cmunley@nvidia.com>


copy-pr-bot · 2026-06-25T23:20:59Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

cwing-nvidia · 2026-06-25T23:31:59Z

+
+There are two ways to drive this resources server:
+
+- **Non-agentic** (`cvdp_agent`, `responses_api_agents/cvdp_agent/app.py`, config `configs/cvdp_agent.yaml`): the model emits the RTL directly in its text response; the server parses it out and runs the harness.


it's a bit confusing to describe this path as non-agentic, considering cvdp_agent is itself an agent that we are using in the first scenario

cwing-nvidia · 2026-06-25T23:32:14Z

+There are two ways to drive this resources server:
+
+- **Non-agentic** (`cvdp_agent`, `responses_api_agents/cvdp_agent/app.py`, config `configs/cvdp_agent.yaml`): the model emits the RTL directly in its text response; the server parses it out and runs the harness.
+- **Agentic** (`cvdp_agent_agentic`, `responses_api_agents/cvdp_agent/agentic_app.py`, config `configs/cvdp_agent_agentic.yaml`): runs Claude Code **inside** the EDA sim container so it can edit files on disk and self-test with the in-container EDA tools, then reports the files it wrote back to the server as `rtl_files` for grading. See `[responses_api_agents/cvdp_agent/](../../responses_api_agents/cvdp_agent/)`.


I'd recommend we treat the harness as a first-class composable unit and describe this as an illustration that uses Claude Code, where other harnesses could be swapped in as well.

cwing-nvidia · 2026-06-25T23:35:18Z

What's the rationale for splitting the verifier into two files (app.py and harness.py), and what's behind the harness.py name?

cmunley1 · 2026-06-26T02:08:57Z

would suggest considering an approach like this to reuse all agent harnesses with 0 code rewriting https://github.com/NVIDIA-NeMo/Gym/blob/main/responses_api_agents/anyterminal_agent/app.py#L190

arti4nvj and others added 10 commits June 24, 2026 00:04

Merge branch 'main' into artij/apptainer_provider_cvdp

cc0fe46

update to allow piping input via stdin for longer inputs

ee2153a

Signed-off-by: Arti Jain <artij@nvidia.com>

added a function that accepts one of many binds via provider_options.

be7cf87

Signed-off-by: Arti Jain <artij@nvidia.com>

added support for daemon running

d29eee7

Signed-off-by: Arti Jain <artij@nvidia.com>

feat(claude_code_agent): surface num_turns in parsed metadata

41c86d0

Parse Claude Code's authoritative num_turns from the stream-json result event and include it in the returned metadata. Signed-off-by: Arti Jain <artij@nvidia.com>

trim

e7dd066

Signed-off-by: Christian Munley <cmunley@nvidia.com>

Merge apptainer provider updates (PR #1742 fixes) into cvdp_resources_server

49531fe

Merge branch 'main' of https://github.com/NVIDIA-NeMo/Gym into artij/cvdp_resources_server

26373a3

# Conflicts: # resources_servers/cvdp/README.md

arti4nvj requested a review from hemildesai June 25, 2026 23:21

cwing-nvidia reviewed Jun 25, 2026

arti4nvj wants to merge 10 commits into main from artij/cvdp_resources_server


3 participants

