From 7f1fb5913dacae4fbcbbaddce8ef302bab553bd5 Mon Sep 17 00:00:00 2001 From: Lawrence Lane Date: Tue, 2 Jun 2026 11:40:22 -0400 Subject: [PATCH 1/6] docs: fill CLI reference gaps for data prep and rollout collection Add missing CLI reference coverage in cli-commands.mdx: - ng_materialize_prompts: full parameter table, example, and a note comparing it to ng_prepare_data (closes #1347). - resume_from_cache: params-table entries for ng_collect_rollouts and ng_e2e_collect_rollouts plus a "Resume interrupted runs" section covering materialized inputs, incremental flush, matching, and the stale-cache footgun (closes #1239). - Generation parameters: dedicated section documenting how temperature, top_p, and max_output_tokens are passed via ++responses_create_params. rather than as standalone flags (closes #637). Co-Authored-By: Claude Opus 4.8 Signed-off-by: Lawrence Lane --- .../latest/pages/reference/cli-commands.mdx | 69 ++++++++++++++++++- 1 file changed, 67 insertions(+), 2 deletions(-) diff --git a/fern/versions/latest/pages/reference/cli-commands.mdx b/fern/versions/latest/pages/reference/cli-commands.mdx index d24b69612a..3efb6bde89 100644 --- a/fern/versions/latest/pages/reference/cli-commands.mdx +++ b/fern/versions/latest/pages/reference/cli-commands.mdx @@ -130,7 +130,8 @@ Perform a batch of rollout collection. | `num_repeats` | Optional[int] | The number of times to repeat each example to run. Useful if you want to calculate mean@k, such as mean@4 or mean@16. | | `num_repeats_add_seed` | bool | When num_repeats >1, add a "seed" parameter on the Responses create params. | | `num_samples_in_parallel` | Optional[int] | Limit the number of concurrent samples running at once. | -| `responses_create_params` | Dict | Overrides for the `responses_create_params`, such as `temperature` and `max_output_tokens`. | +| `responses_create_params` | Dict | Overrides for the `responses_create_params`, such as `temperature` and `max_output_tokens`. 
See [Generation parameters](#generation-parameters). | +| `resume_from_cache` | bool | Resume an interrupted run by skipping rows already completed. Default: `False`. See [Resume interrupted runs](#resume-interrupted-runs). | **Example** @@ -144,6 +145,37 @@ ng_collect_rollouts \ +num_samples_in_parallel=10 ``` +#### Generation parameters + +Sampling parameters such as `temperature`, `max_output_tokens`, and `top_p` are not standalone CLI flags — they are passed as overrides inside `responses_create_params` using Hydra's nested dot syntax. Each override is merged into every input row's existing `responses_create_params`: + +```bash +ng_collect_rollouts \ + +agent_name=example_single_tool_call_simple_agent \ + +input_jsonl_fpath=weather_query.jsonl \ + +output_jsonl_fpath=weather_rollouts.jsonl \ + ++responses_create_params.temperature=1.0 \ + ++responses_create_params.top_p=1.0 \ + ++responses_create_params.max_output_tokens=4096 +``` + +The same syntax works for `ng_e2e_collect_rollouts`. Any field accepted by the Responses API create params can be set this way (for example, `++responses_create_params.reasoning.effort=low`). + +#### Resume interrupted runs + +Setting `+resume_from_cache=true` lets you restart the **same command** after a crash or interruption and pick up only the rows that have not finished yet. It works for both `ng_collect_rollouts` and `ng_e2e_collect_rollouts`, across any environment. + +How it works: + +- **Materialized inputs.** On the first run, the fully expanded input rows (after `num_repeats`, `limit`, `prompt_config`, and any overrides) are written to a sidecar file next to your output. The path is derived from `output_jsonl_fpath` by appending `_materialized_inputs` to the stem — so `rollouts.jsonl` produces `rollouts_materialized_inputs.jsonl`. +- **Incremental output.** Results are flushed to `output_jsonl_fpath` after each completed rollout, so partial output survives a crash. 
+- **Matching.** On resume, completed work is matched by `(task_index, rollout_index)` against the materialized inputs, and already-completed rows are skipped. The run prints a summary such as the number of original input rows, rows already done, and rows that still need to be run. +- **Fallback.** If either the materialized inputs or the output file is missing, resume is skipped gracefully and the run starts fresh. With the default `resume_from_cache=False`, existing output is cleared before the run. + + +If you change the config, schema, or data between runs, the materialized inputs become stale and resume will diff against the old expansion. Delete the `*_materialized_inputs.jsonl` file (and the output file) to start fresh. + + ### `ng_e2e_collect_rollouts` / `nemo_gym_e2e_collect_rollouts` Spin up all necessary servers and perform a batch of rollout collection using each dataset inside the provided configs. @@ -154,7 +186,8 @@ Spin up all necessary servers and perform a batch of rollout collection using ea | --- | --- | --- | | `output_jsonl_fpath` | str | The output data JSONL file path. | | `num_samples_in_parallel` | Optional[int] | Limit the number of concurrent samples running at once. | -| `responses_create_params` | Dict | Overrides for the `responses_create_params`, such as `temperature` and `max_output_tokens`. | +| `responses_create_params` | Dict | Overrides for the `responses_create_params`, such as `temperature` and `max_output_tokens`. See [Generation parameters](#generation-parameters). | +| `resume_from_cache` | bool | Resume an interrupted run by skipping rows already completed. Default: `False`. See [Resume interrupted runs](#resume-interrupted-runs). | **Examples** @@ -263,6 +296,38 @@ ng_prepare_data "+config_paths=[${config_paths}]" \ --- +### `ng_materialize_prompts` / `nemo_gym_materialize_prompts` + +Apply a prompt template to raw JSONL data, producing materialized JSONL with populated `responses_create_params.input` for RL training. 
+ +Each input row must **not** already have a populated `responses_create_params.input`; the command applies the prompt template from `prompt_config` to each row, fills in the input, and preserves the row's other fields. + +**Parameters** + +| Parameter | Type | Description | +| --- | --- | --- | +| `input_jsonl_fpath` | str | Raw JSONL data (rows without `responses_create_params.input`). | +| `prompt_config` | str | Path to the prompt YAML file to apply. | +| `output_jsonl_fpath` | str | Output path for the materialized JSONL with populated prompts. | + +**Example** + +```bash +ng_materialize_prompts \ + +input_jsonl_fpath=data/my_dataset.jsonl \ + +prompt_config=/path/to/my_prompt.yaml \ + +output_jsonl_fpath=my_dataset_materialized.jsonl +``` + + +**Which data-preparation command should I use?** + +- **`ng_materialize_prompts`** — a focused, standalone step that applies a prompt template to raw rows to populate `responses_create_params.input`. No servers are started. Use it when you have raw data and just need to turn it into prompt-ready rows. +- **`ng_prepare_data`** — the full preparation pipeline for training: it can download missing datasets, validate data, and compute dataset metrics, writing train/validation splits and metrics artifacts. Use it to prepare and validate datasets for training or PR submission. + + +--- + ## Dataset Registry - GitLab Commands for uploading, downloading, and managing datasets in GitLab Model Registry. From 52b52f73761b84a1c3377320d6caf8c8df3b0e31 Mon Sep 17 00:00:00 2001 From: Lawrence Lane Date: Tue, 2 Jun 2026 12:00:21 -0400 Subject: [PATCH 2/6] docs: apply NVIDIA style guide to CLI reference additions - "See" -> "Refer to" in the new cross-references (accessibility wording). - Title-case the two new subheadings (Generation Parameters, Resume Interrupted Runs); anchors are unchanged so links still resolve. - Drop the academic adverb "gracefully". 
Co-Authored-By: Claude Opus 4.8 Signed-off-by: Lawrence Lane --- .../latest/pages/reference/cli-commands.mdx | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/fern/versions/latest/pages/reference/cli-commands.mdx b/fern/versions/latest/pages/reference/cli-commands.mdx index 3efb6bde89..139f05a978 100644 --- a/fern/versions/latest/pages/reference/cli-commands.mdx +++ b/fern/versions/latest/pages/reference/cli-commands.mdx @@ -130,8 +130,8 @@ Perform a batch of rollout collection. | `num_repeats` | Optional[int] | The number of times to repeat each example to run. Useful if you want to calculate mean@k, such as mean@4 or mean@16. | | `num_repeats_add_seed` | bool | When num_repeats >1, add a "seed" parameter on the Responses create params. | | `num_samples_in_parallel` | Optional[int] | Limit the number of concurrent samples running at once. | -| `responses_create_params` | Dict | Overrides for the `responses_create_params`, such as `temperature` and `max_output_tokens`. See [Generation parameters](#generation-parameters). | -| `resume_from_cache` | bool | Resume an interrupted run by skipping rows already completed. Default: `False`. See [Resume interrupted runs](#resume-interrupted-runs). | +| `responses_create_params` | Dict | Overrides for the `responses_create_params`, such as `temperature` and `max_output_tokens`. Refer to [Generation Parameters](#generation-parameters). | +| `resume_from_cache` | bool | Resume an interrupted run by skipping rows already completed. Default: `False`. Refer to [Resume Interrupted Runs](#resume-interrupted-runs). | **Example** @@ -145,7 +145,7 @@ ng_collect_rollouts \ +num_samples_in_parallel=10 ``` -#### Generation parameters +#### Generation Parameters Sampling parameters such as `temperature`, `max_output_tokens`, and `top_p` are not standalone CLI flags — they are passed as overrides inside `responses_create_params` using Hydra's nested dot syntax. 
Each override is merged into every input row's existing `responses_create_params`: @@ -161,7 +161,7 @@ ng_collect_rollouts \ The same syntax works for `ng_e2e_collect_rollouts`. Any field accepted by the Responses API create params can be set this way (for example, `++responses_create_params.reasoning.effort=low`). -#### Resume interrupted runs +#### Resume Interrupted Runs Setting `+resume_from_cache=true` lets you restart the **same command** after a crash or interruption and pick up only the rows that have not finished yet. It works for both `ng_collect_rollouts` and `ng_e2e_collect_rollouts`, across any environment. @@ -170,7 +170,7 @@ How it works: - **Materialized inputs.** On the first run, the fully expanded input rows (after `num_repeats`, `limit`, `prompt_config`, and any overrides) are written to a sidecar file next to your output. The path is derived from `output_jsonl_fpath` by appending `_materialized_inputs` to the stem — so `rollouts.jsonl` produces `rollouts_materialized_inputs.jsonl`. - **Incremental output.** Results are flushed to `output_jsonl_fpath` after each completed rollout, so partial output survives a crash. - **Matching.** On resume, completed work is matched by `(task_index, rollout_index)` against the materialized inputs, and already-completed rows are skipped. The run prints a summary such as the number of original input rows, rows already done, and rows that still need to be run. -- **Fallback.** If either the materialized inputs or the output file is missing, resume is skipped gracefully and the run starts fresh. With the default `resume_from_cache=False`, existing output is cleared before the run. +- **Fallback.** If either the materialized inputs or the output file is missing, resume is skipped and the run starts fresh. With the default `resume_from_cache=False`, existing output is cleared before the run. 
If you change the config, schema, or data between runs, the materialized inputs become stale and resume will diff against the old expansion. Delete the `*_materialized_inputs.jsonl` file (and the output file) to start fresh. @@ -186,8 +186,8 @@ Spin up all necessary servers and perform a batch of rollout collection using ea | --- | --- | --- | | `output_jsonl_fpath` | str | The output data JSONL file path. | | `num_samples_in_parallel` | Optional[int] | Limit the number of concurrent samples running at once. | -| `responses_create_params` | Dict | Overrides for the `responses_create_params`, such as `temperature` and `max_output_tokens`. See [Generation parameters](#generation-parameters). | -| `resume_from_cache` | bool | Resume an interrupted run by skipping rows already completed. Default: `False`. See [Resume interrupted runs](#resume-interrupted-runs). | +| `responses_create_params` | Dict | Overrides for the `responses_create_params`, such as `temperature` and `max_output_tokens`. Refer to [Generation Parameters](#generation-parameters). | +| `resume_from_cache` | bool | Resume an interrupted run by skipping rows already completed. Default: `False`. Refer to [Resume Interrupted Runs](#resume-interrupted-runs). | **Examples** From 91526fcedfed6bf4742d8ccf1642900cc40aac53 Mon Sep 17 00:00:00 2001 From: Lawrence Lane Date: Tue, 23 Jun 2026 11:15:23 -0400 Subject: [PATCH 3/6] docs: extend CLI reference cleanup with pip install and v0.3.0 mirror Add PyPI-first installation tabs (#1191), mirror CLI reference additions into v0.3.0 stable docs, and clarify that responses_create_params overrides use a shallow merge per review feedback on #1498. 
--- .../latest/pages/get-started/installation.mdx | 18 ++++- .../latest/pages/reference/cli-commands.mdx | 4 +- .../v0.3.0/pages/get-started/installation.mdx | 18 ++++- .../v0.3.0/pages/reference/cli-commands.mdx | 69 ++++++++++++++++++- 4 files changed, 103 insertions(+), 6 deletions(-) diff --git a/fern/versions/latest/pages/get-started/installation.mdx b/fern/versions/latest/pages/get-started/installation.mdx index bd20b4f646..ca0865ae14 100644 --- a/fern/versions/latest/pages/get-started/installation.mdx +++ b/fern/versions/latest/pages/get-started/installation.mdx @@ -8,9 +8,25 @@ position: 2 Python 3.12 is required. Refer to [Prerequisites](/get-started/prerequisites) for full system requirements. -Clone from git to get the latest features and environments. If you intend to use NeMo Gym with NeMo RL, use the latest NGC container. +Install from PyPI for the quickest setup. Clone from git if you need the latest features and environments, or use the NGC container if you intend to use NeMo Gym with NeMo RL. + + +```bash +pip install nemo-gym +``` + +Or with [uv](https://docs.astral.sh/uv/): + +```bash +uv venv --python 3.12 && source .venv/bin/activate +uv pip install nemo-gym +``` + +The package includes built-in environments and CLI commands. Config and data paths resolve against the package install location automatically. To pin a release: `pip install nemo-gym==0.3.1`. + + ```bash diff --git a/fern/versions/latest/pages/reference/cli-commands.mdx b/fern/versions/latest/pages/reference/cli-commands.mdx index 139f05a978..52f8dc2acc 100644 --- a/fern/versions/latest/pages/reference/cli-commands.mdx +++ b/fern/versions/latest/pages/reference/cli-commands.mdx @@ -147,7 +147,7 @@ ng_collect_rollouts \ #### Generation Parameters -Sampling parameters such as `temperature`, `max_output_tokens`, and `top_p` are not standalone CLI flags — they are passed as overrides inside `responses_create_params` using Hydra's nested dot syntax. 
Each override is merged into every input row's existing `responses_create_params`: +Sampling parameters such as `temperature`, `max_output_tokens`, and `top_p` are not standalone CLI flags — they are passed as overrides inside `responses_create_params` using Hydra's nested dot syntax. Overrides are merged into each input row's existing `responses_create_params` with a **shallow** merge (top-level keys only): ```bash ng_collect_rollouts \ @@ -159,7 +159,7 @@ ng_collect_rollouts \ ++responses_create_params.max_output_tokens=4096 ``` -The same syntax works for `ng_e2e_collect_rollouts`. Any field accepted by the Responses API create params can be set this way (for example, `++responses_create_params.reasoning.effort=low`). +The same syntax works for `ng_e2e_collect_rollouts`. Top-level fields such as `temperature` and `max_output_tokens` are straightforward. For nested objects (for example, `++responses_create_params.reasoning.effort=low`), the entire nested dict replaces the row's existing value at that key — other fields under the same nested object are not preserved. #### Resume Interrupted Runs diff --git a/fern/versions/v0.3.0/pages/get-started/installation.mdx b/fern/versions/v0.3.0/pages/get-started/installation.mdx index bd20b4f646..ca0865ae14 100644 --- a/fern/versions/v0.3.0/pages/get-started/installation.mdx +++ b/fern/versions/v0.3.0/pages/get-started/installation.mdx @@ -8,9 +8,25 @@ position: 2 Python 3.12 is required. Refer to [Prerequisites](/get-started/prerequisites) for full system requirements. -Clone from git to get the latest features and environments. If you intend to use NeMo Gym with NeMo RL, use the latest NGC container. +Install from PyPI for the quickest setup. Clone from git if you need the latest features and environments, or use the NGC container if you intend to use NeMo Gym with NeMo RL. 
+ + +```bash +pip install nemo-gym +``` + +Or with [uv](https://docs.astral.sh/uv/): + +```bash +uv venv --python 3.12 && source .venv/bin/activate +uv pip install nemo-gym +``` + +The package includes built-in environments and CLI commands. Config and data paths resolve against the package install location automatically. To pin a release: `pip install nemo-gym==0.3.1`. + + ```bash diff --git a/fern/versions/v0.3.0/pages/reference/cli-commands.mdx b/fern/versions/v0.3.0/pages/reference/cli-commands.mdx index d24b69612a..52f8dc2acc 100644 --- a/fern/versions/v0.3.0/pages/reference/cli-commands.mdx +++ b/fern/versions/v0.3.0/pages/reference/cli-commands.mdx @@ -130,7 +130,8 @@ Perform a batch of rollout collection. | `num_repeats` | Optional[int] | The number of times to repeat each example to run. Useful if you want to calculate mean@k, such as mean@4 or mean@16. | | `num_repeats_add_seed` | bool | When num_repeats >1, add a "seed" parameter on the Responses create params. | | `num_samples_in_parallel` | Optional[int] | Limit the number of concurrent samples running at once. | -| `responses_create_params` | Dict | Overrides for the `responses_create_params`, such as `temperature` and `max_output_tokens`. | +| `responses_create_params` | Dict | Overrides for the `responses_create_params`, such as `temperature` and `max_output_tokens`. Refer to [Generation Parameters](#generation-parameters). | +| `resume_from_cache` | bool | Resume an interrupted run by skipping rows already completed. Default: `False`. Refer to [Resume Interrupted Runs](#resume-interrupted-runs). | **Example** @@ -144,6 +145,37 @@ ng_collect_rollouts \ +num_samples_in_parallel=10 ``` +#### Generation Parameters + +Sampling parameters such as `temperature`, `max_output_tokens`, and `top_p` are not standalone CLI flags — they are passed as overrides inside `responses_create_params` using Hydra's nested dot syntax. 
Overrides are merged into each input row's existing `responses_create_params` with a **shallow** merge (top-level keys only): + +```bash +ng_collect_rollouts \ + +agent_name=example_single_tool_call_simple_agent \ + +input_jsonl_fpath=weather_query.jsonl \ + +output_jsonl_fpath=weather_rollouts.jsonl \ + ++responses_create_params.temperature=1.0 \ + ++responses_create_params.top_p=1.0 \ + ++responses_create_params.max_output_tokens=4096 +``` + +The same syntax works for `ng_e2e_collect_rollouts`. Top-level fields such as `temperature` and `max_output_tokens` are straightforward. For nested objects (for example, `++responses_create_params.reasoning.effort=low`), the entire nested dict replaces the row's existing value at that key — other fields under the same nested object are not preserved. + +#### Resume Interrupted Runs + +Setting `+resume_from_cache=true` lets you restart the **same command** after a crash or interruption and pick up only the rows that have not finished yet. It works for both `ng_collect_rollouts` and `ng_e2e_collect_rollouts`, across any environment. + +How it works: + +- **Materialized inputs.** On the first run, the fully expanded input rows (after `num_repeats`, `limit`, `prompt_config`, and any overrides) are written to a sidecar file next to your output. The path is derived from `output_jsonl_fpath` by appending `_materialized_inputs` to the stem — so `rollouts.jsonl` produces `rollouts_materialized_inputs.jsonl`. +- **Incremental output.** Results are flushed to `output_jsonl_fpath` after each completed rollout, so partial output survives a crash. +- **Matching.** On resume, completed work is matched by `(task_index, rollout_index)` against the materialized inputs, and already-completed rows are skipped. The run prints a summary such as the number of original input rows, rows already done, and rows that still need to be run. 
+- **Fallback.** If either the materialized inputs or the output file is missing, resume is skipped and the run starts fresh. With the default `resume_from_cache=False`, existing output is cleared before the run. + + +If you change the config, schema, or data between runs, the materialized inputs become stale and resume will diff against the old expansion. Delete the `*_materialized_inputs.jsonl` file (and the output file) to start fresh. + + ### `ng_e2e_collect_rollouts` / `nemo_gym_e2e_collect_rollouts` Spin up all necessary servers and perform a batch of rollout collection using each dataset inside the provided configs. @@ -154,7 +186,8 @@ Spin up all necessary servers and perform a batch of rollout collection using ea | --- | --- | --- | | `output_jsonl_fpath` | str | The output data JSONL file path. | | `num_samples_in_parallel` | Optional[int] | Limit the number of concurrent samples running at once. | -| `responses_create_params` | Dict | Overrides for the `responses_create_params`, such as `temperature` and `max_output_tokens`. | +| `responses_create_params` | Dict | Overrides for the `responses_create_params`, such as `temperature` and `max_output_tokens`. Refer to [Generation Parameters](#generation-parameters). | +| `resume_from_cache` | bool | Resume an interrupted run by skipping rows already completed. Default: `False`. Refer to [Resume Interrupted Runs](#resume-interrupted-runs). | **Examples** @@ -263,6 +296,38 @@ ng_prepare_data "+config_paths=[${config_paths}]" \ --- +### `ng_materialize_prompts` / `nemo_gym_materialize_prompts` + +Apply a prompt template to raw JSONL data, producing materialized JSONL with populated `responses_create_params.input` for RL training. + +Each input row must **not** already have a populated `responses_create_params.input`; the command applies the prompt template from `prompt_config` to each row, fills in the input, and preserves the row's other fields. 
+ +**Parameters** + +| Parameter | Type | Description | +| --- | --- | --- | +| `input_jsonl_fpath` | str | Raw JSONL data (rows without `responses_create_params.input`). | +| `prompt_config` | str | Path to the prompt YAML file to apply. | +| `output_jsonl_fpath` | str | Output path for the materialized JSONL with populated prompts. | + +**Example** + +```bash +ng_materialize_prompts \ + +input_jsonl_fpath=data/my_dataset.jsonl \ + +prompt_config=/path/to/my_prompt.yaml \ + +output_jsonl_fpath=my_dataset_materialized.jsonl +``` + + +**Which data-preparation command should I use?** + +- **`ng_materialize_prompts`** — a focused, standalone step that applies a prompt template to raw rows to populate `responses_create_params.input`. No servers are started. Use it when you have raw data and just need to turn it into prompt-ready rows. +- **`ng_prepare_data`** — the full preparation pipeline for training: it can download missing datasets, validate data, and compute dataset metrics, writing train/validation splits and metrics artifacts. Use it to prepare and validate datasets for training or PR submission. + + +--- + ## Dataset Registry - GitLab Commands for uploading, downloading, and managing datasets in GitLab Model Registry. From be1b7fc2b79c6bbfd6970bb04dc821124cc1d8dc Mon Sep 17 00:00:00 2001 From: Lawrence Lane Date: Tue, 23 Jun 2026 11:15:45 -0400 Subject: [PATCH 4/6] docs: add DCO sign-off for CLI reference cleanup follow-up No-op commit to satisfy DCO on the pip install and shallow-merge doc updates pushed in 91526fce. Signed-off-by: Lawrence Lane From d8d92f3479890b0cbfe7a706c91b0043984b5a76 Mon Sep 17 00:00:00 2001 From: Lawrence Lane Date: Tue, 23 Jun 2026 11:16:05 -0400 Subject: [PATCH 5/6] Revert "docs: extend CLI reference cleanup with pip install and v0.3.0 mirror" This reverts commit 91526fcedfed6bf4742d8ccf1642900cc40aac53. 
Signed-off-by: Lawrence Lane --- .../latest/pages/get-started/installation.mdx | 18 +---- .../latest/pages/reference/cli-commands.mdx | 4 +- .../v0.3.0/pages/get-started/installation.mdx | 18 +---- .../v0.3.0/pages/reference/cli-commands.mdx | 69 +------------------ 4 files changed, 6 insertions(+), 103 deletions(-) diff --git a/fern/versions/latest/pages/get-started/installation.mdx b/fern/versions/latest/pages/get-started/installation.mdx index ca0865ae14..bd20b4f646 100644 --- a/fern/versions/latest/pages/get-started/installation.mdx +++ b/fern/versions/latest/pages/get-started/installation.mdx @@ -8,25 +8,9 @@ position: 2 Python 3.12 is required. Refer to [Prerequisites](/get-started/prerequisites) for full system requirements. -Install from PyPI for the quickest setup. Clone from git if you need the latest features and environments, or use the NGC container if you intend to use NeMo Gym with NeMo RL. +Clone from git to get the latest features and environments. If you intend to use NeMo Gym with NeMo RL, use the latest NGC container. - - -```bash -pip install nemo-gym -``` - -Or with [uv](https://docs.astral.sh/uv/): - -```bash -uv venv --python 3.12 && source .venv/bin/activate -uv pip install nemo-gym -``` - -The package includes built-in environments and CLI commands. Config and data paths resolve against the package install location automatically. To pin a release: `pip install nemo-gym==0.3.1`. - - ```bash diff --git a/fern/versions/latest/pages/reference/cli-commands.mdx b/fern/versions/latest/pages/reference/cli-commands.mdx index 52f8dc2acc..139f05a978 100644 --- a/fern/versions/latest/pages/reference/cli-commands.mdx +++ b/fern/versions/latest/pages/reference/cli-commands.mdx @@ -147,7 +147,7 @@ ng_collect_rollouts \ #### Generation Parameters -Sampling parameters such as `temperature`, `max_output_tokens`, and `top_p` are not standalone CLI flags — they are passed as overrides inside `responses_create_params` using Hydra's nested dot syntax. 
Overrides are merged into each input row's existing `responses_create_params` with a **shallow** merge (top-level keys only): +Sampling parameters such as `temperature`, `max_output_tokens`, and `top_p` are not standalone CLI flags — they are passed as overrides inside `responses_create_params` using Hydra's nested dot syntax. Each override is merged into every input row's existing `responses_create_params`: ```bash ng_collect_rollouts \ @@ -159,7 +159,7 @@ ng_collect_rollouts \ ++responses_create_params.max_output_tokens=4096 ``` -The same syntax works for `ng_e2e_collect_rollouts`. Top-level fields such as `temperature` and `max_output_tokens` are straightforward. For nested objects (for example, `++responses_create_params.reasoning.effort=low`), the entire nested dict replaces the row's existing value at that key — other fields under the same nested object are not preserved. +The same syntax works for `ng_e2e_collect_rollouts`. Any field accepted by the Responses API create params can be set this way (for example, `++responses_create_params.reasoning.effort=low`). #### Resume Interrupted Runs diff --git a/fern/versions/v0.3.0/pages/get-started/installation.mdx b/fern/versions/v0.3.0/pages/get-started/installation.mdx index ca0865ae14..bd20b4f646 100644 --- a/fern/versions/v0.3.0/pages/get-started/installation.mdx +++ b/fern/versions/v0.3.0/pages/get-started/installation.mdx @@ -8,25 +8,9 @@ position: 2 Python 3.12 is required. Refer to [Prerequisites](/get-started/prerequisites) for full system requirements. -Install from PyPI for the quickest setup. Clone from git if you need the latest features and environments, or use the NGC container if you intend to use NeMo Gym with NeMo RL. +Clone from git to get the latest features and environments. If you intend to use NeMo Gym with NeMo RL, use the latest NGC container. 
- - -```bash -pip install nemo-gym -``` - -Or with [uv](https://docs.astral.sh/uv/): - -```bash -uv venv --python 3.12 && source .venv/bin/activate -uv pip install nemo-gym -``` - -The package includes built-in environments and CLI commands. Config and data paths resolve against the package install location automatically. To pin a release: `pip install nemo-gym==0.3.1`. - - ```bash diff --git a/fern/versions/v0.3.0/pages/reference/cli-commands.mdx b/fern/versions/v0.3.0/pages/reference/cli-commands.mdx index 52f8dc2acc..d24b69612a 100644 --- a/fern/versions/v0.3.0/pages/reference/cli-commands.mdx +++ b/fern/versions/v0.3.0/pages/reference/cli-commands.mdx @@ -130,8 +130,7 @@ Perform a batch of rollout collection. | `num_repeats` | Optional[int] | The number of times to repeat each example to run. Useful if you want to calculate mean@k, such as mean@4 or mean@16. | | `num_repeats_add_seed` | bool | When num_repeats >1, add a "seed" parameter on the Responses create params. | | `num_samples_in_parallel` | Optional[int] | Limit the number of concurrent samples running at once. | -| `responses_create_params` | Dict | Overrides for the `responses_create_params`, such as `temperature` and `max_output_tokens`. Refer to [Generation Parameters](#generation-parameters). | -| `resume_from_cache` | bool | Resume an interrupted run by skipping rows already completed. Default: `False`. Refer to [Resume Interrupted Runs](#resume-interrupted-runs). | +| `responses_create_params` | Dict | Overrides for the `responses_create_params`, such as `temperature` and `max_output_tokens`. | **Example** @@ -145,37 +144,6 @@ ng_collect_rollouts \ +num_samples_in_parallel=10 ``` -#### Generation Parameters - -Sampling parameters such as `temperature`, `max_output_tokens`, and `top_p` are not standalone CLI flags — they are passed as overrides inside `responses_create_params` using Hydra's nested dot syntax. 
Overrides are merged into each input row's existing `responses_create_params` with a **shallow** merge (top-level keys only): - -```bash -ng_collect_rollouts \ - +agent_name=example_single_tool_call_simple_agent \ - +input_jsonl_fpath=weather_query.jsonl \ - +output_jsonl_fpath=weather_rollouts.jsonl \ - ++responses_create_params.temperature=1.0 \ - ++responses_create_params.top_p=1.0 \ - ++responses_create_params.max_output_tokens=4096 -``` - -The same syntax works for `ng_e2e_collect_rollouts`. Top-level fields such as `temperature` and `max_output_tokens` are straightforward. For nested objects (for example, `++responses_create_params.reasoning.effort=low`), the entire nested dict replaces the row's existing value at that key — other fields under the same nested object are not preserved. - -#### Resume Interrupted Runs - -Setting `+resume_from_cache=true` lets you restart the **same command** after a crash or interruption and pick up only the rows that have not finished yet. It works for both `ng_collect_rollouts` and `ng_e2e_collect_rollouts`, across any environment. - -How it works: - -- **Materialized inputs.** On the first run, the fully expanded input rows (after `num_repeats`, `limit`, `prompt_config`, and any overrides) are written to a sidecar file next to your output. The path is derived from `output_jsonl_fpath` by appending `_materialized_inputs` to the stem — so `rollouts.jsonl` produces `rollouts_materialized_inputs.jsonl`. -- **Incremental output.** Results are flushed to `output_jsonl_fpath` after each completed rollout, so partial output survives a crash. -- **Matching.** On resume, completed work is matched by `(task_index, rollout_index)` against the materialized inputs, and already-completed rows are skipped. The run prints a summary such as the number of original input rows, rows already done, and rows that still need to be run. 
-- **Fallback.** If either the materialized inputs or the output file is missing, resume is skipped and the run starts fresh. With the default `resume_from_cache=False`, existing output is cleared before the run. - - -If you change the config, schema, or data between runs, the materialized inputs become stale and resume will diff against the old expansion. Delete the `*_materialized_inputs.jsonl` file (and the output file) to start fresh. - - ### `ng_e2e_collect_rollouts` / `nemo_gym_e2e_collect_rollouts` Spin up all necessary servers and perform a batch of rollout collection using each dataset inside the provided configs. @@ -186,8 +154,7 @@ Spin up all necessary servers and perform a batch of rollout collection using ea | --- | --- | --- | | `output_jsonl_fpath` | str | The output data JSONL file path. | | `num_samples_in_parallel` | Optional[int] | Limit the number of concurrent samples running at once. | -| `responses_create_params` | Dict | Overrides for the `responses_create_params`, such as `temperature` and `max_output_tokens`. Refer to [Generation Parameters](#generation-parameters). | -| `resume_from_cache` | bool | Resume an interrupted run by skipping rows already completed. Default: `False`. Refer to [Resume Interrupted Runs](#resume-interrupted-runs). | +| `responses_create_params` | Dict | Overrides for the `responses_create_params`, such as `temperature` and `max_output_tokens`. | **Examples** @@ -296,38 +263,6 @@ ng_prepare_data "+config_paths=[${config_paths}]" \ --- -### `ng_materialize_prompts` / `nemo_gym_materialize_prompts` - -Apply a prompt template to raw JSONL data, producing materialized JSONL with populated `responses_create_params.input` for RL training. - -Each input row must **not** already have a populated `responses_create_params.input`; the command applies the prompt template from `prompt_config` to each row, fills in the input, and preserves the row's other fields. 
- -**Parameters** - -| Parameter | Type | Description | -| --- | --- | --- | -| `input_jsonl_fpath` | str | Raw JSONL data (rows without `responses_create_params.input`). | -| `prompt_config` | str | Path to the prompt YAML file to apply. | -| `output_jsonl_fpath` | str | Output path for the materialized JSONL with populated prompts. | - -**Example** - -```bash -ng_materialize_prompts \ - +input_jsonl_fpath=data/my_dataset.jsonl \ - +prompt_config=/path/to/my_prompt.yaml \ - +output_jsonl_fpath=my_dataset_materialized.jsonl -``` - - -**Which data-preparation command should I use?** - -- **`ng_materialize_prompts`** — a focused, standalone step that applies a prompt template to raw rows to populate `responses_create_params.input`. No servers are started. Use it when you have raw data and just need to turn it into prompt-ready rows. -- **`ng_prepare_data`** — the full preparation pipeline for training: it can download missing datasets, validate data, and compute dataset metrics, writing train/validation splits and metrics artifacts. Use it to prepare and validate datasets for training or PR submission. - - ---- - ## Dataset Registry - GitLab Commands for uploading, downloading, and managing datasets in GitLab Model Registry. From a46a96c44d443f449a8ba97132e9ecdc791b3034 Mon Sep 17 00:00:00 2001 From: Lawrence Lane Date: Tue, 23 Jun 2026 11:16:06 -0400 Subject: [PATCH 6/6] docs: extend CLI reference cleanup with pip install and v0.3.0 mirror Add PyPI-first installation tabs (#1191), mirror CLI reference additions into v0.3.0 stable docs, and clarify that responses_create_params overrides use a shallow merge per review feedback on #1498. 
Signed-off-by: Lawrence Lane --- .../latest/pages/get-started/installation.mdx | 18 ++++- .../latest/pages/reference/cli-commands.mdx | 4 +- .../v0.3.0/pages/get-started/installation.mdx | 18 ++++- .../v0.3.0/pages/reference/cli-commands.mdx | 69 ++++++++++++++++++- 4 files changed, 103 insertions(+), 6 deletions(-) diff --git a/fern/versions/latest/pages/get-started/installation.mdx b/fern/versions/latest/pages/get-started/installation.mdx index bd20b4f646..ca0865ae14 100644 --- a/fern/versions/latest/pages/get-started/installation.mdx +++ b/fern/versions/latest/pages/get-started/installation.mdx @@ -8,9 +8,25 @@ position: 2 Python 3.12 is required. Refer to [Prerequisites](/get-started/prerequisites) for full system requirements. -Clone from git to get the latest features and environments. If you intend to use NeMo Gym with NeMo RL, use the latest NGC container. +Install from PyPI for the quickest setup. Clone from git if you need the latest features and environments, or use the NGC container if you intend to use NeMo Gym with NeMo RL. + + +```bash +pip install nemo-gym +``` + +Or with [uv](https://docs.astral.sh/uv/): + +```bash +uv venv --python 3.12 && source .venv/bin/activate +uv pip install nemo-gym +``` + +The package includes built-in environments and CLI commands. Config and data paths resolve against the package install location automatically. To pin a release: `pip install nemo-gym==0.3.1`. + + ```bash diff --git a/fern/versions/latest/pages/reference/cli-commands.mdx b/fern/versions/latest/pages/reference/cli-commands.mdx index 139f05a978..52f8dc2acc 100644 --- a/fern/versions/latest/pages/reference/cli-commands.mdx +++ b/fern/versions/latest/pages/reference/cli-commands.mdx @@ -147,7 +147,7 @@ ng_collect_rollouts \ #### Generation Parameters -Sampling parameters such as `temperature`, `max_output_tokens`, and `top_p` are not standalone CLI flags — they are passed as overrides inside `responses_create_params` using Hydra's nested dot syntax. 
Each override is merged into every input row's existing `responses_create_params`: +Sampling parameters such as `temperature`, `max_output_tokens`, and `top_p` are not standalone CLI flags — they are passed as overrides inside `responses_create_params` using Hydra's nested dot syntax. Overrides are merged into each input row's existing `responses_create_params` with a **shallow** merge (top-level keys only): ```bash ng_collect_rollouts \ @@ -159,7 +159,7 @@ ng_collect_rollouts \ ++responses_create_params.max_output_tokens=4096 ``` -The same syntax works for `ng_e2e_collect_rollouts`. Any field accepted by the Responses API create params can be set this way (for example, `++responses_create_params.reasoning.effort=low`). +The same syntax works for `ng_e2e_collect_rollouts`. Top-level fields such as `temperature` and `max_output_tokens` are straightforward. For nested objects (for example, `++responses_create_params.reasoning.effort=low`), the entire nested dict replaces the row's existing value at that key — other fields under the same nested object are not preserved. #### Resume Interrupted Runs diff --git a/fern/versions/v0.3.0/pages/get-started/installation.mdx b/fern/versions/v0.3.0/pages/get-started/installation.mdx index bd20b4f646..ca0865ae14 100644 --- a/fern/versions/v0.3.0/pages/get-started/installation.mdx +++ b/fern/versions/v0.3.0/pages/get-started/installation.mdx @@ -8,9 +8,25 @@ position: 2 Python 3.12 is required. Refer to [Prerequisites](/get-started/prerequisites) for full system requirements. -Clone from git to get the latest features and environments. If you intend to use NeMo Gym with NeMo RL, use the latest NGC container. +Install from PyPI for the quickest setup. Clone from git if you need the latest features and environments, or use the NGC container if you intend to use NeMo Gym with NeMo RL. 
+ + +```bash +pip install nemo-gym +``` + +Or with [uv](https://docs.astral.sh/uv/): + +```bash +uv venv --python 3.12 && source .venv/bin/activate +uv pip install nemo-gym +``` + +The package includes built-in environments and CLI commands. Config and data paths resolve against the package install location automatically. To pin a release: `pip install nemo-gym==0.3.1`. + + ```bash diff --git a/fern/versions/v0.3.0/pages/reference/cli-commands.mdx b/fern/versions/v0.3.0/pages/reference/cli-commands.mdx index d24b69612a..52f8dc2acc 100644 --- a/fern/versions/v0.3.0/pages/reference/cli-commands.mdx +++ b/fern/versions/v0.3.0/pages/reference/cli-commands.mdx @@ -130,7 +130,8 @@ Perform a batch of rollout collection. | `num_repeats` | Optional[int] | The number of times to repeat each example to run. Useful if you want to calculate mean@k, such as mean@4 or mean@16. | | `num_repeats_add_seed` | bool | When num_repeats >1, add a "seed" parameter on the Responses create params. | | `num_samples_in_parallel` | Optional[int] | Limit the number of concurrent samples running at once. | -| `responses_create_params` | Dict | Overrides for the `responses_create_params`, such as `temperature` and `max_output_tokens`. | +| `responses_create_params` | Dict | Overrides for the `responses_create_params`, such as `temperature` and `max_output_tokens`. Refer to [Generation Parameters](#generation-parameters). | +| `resume_from_cache` | bool | Resume an interrupted run by skipping rows already completed. Default: `False`. Refer to [Resume Interrupted Runs](#resume-interrupted-runs). | **Example** @@ -144,6 +145,37 @@ ng_collect_rollouts \ +num_samples_in_parallel=10 ``` +#### Generation Parameters + +Sampling parameters such as `temperature`, `max_output_tokens`, and `top_p` are not standalone CLI flags — they are passed as overrides inside `responses_create_params` using Hydra's nested dot syntax. 
Overrides are merged into each input row's existing `responses_create_params` with a **shallow** merge (top-level keys only): + +```bash +ng_collect_rollouts \ + +agent_name=example_single_tool_call_simple_agent \ + +input_jsonl_fpath=weather_query.jsonl \ + +output_jsonl_fpath=weather_rollouts.jsonl \ + ++responses_create_params.temperature=1.0 \ + ++responses_create_params.top_p=1.0 \ + ++responses_create_params.max_output_tokens=4096 +``` + +The same syntax works for `ng_e2e_collect_rollouts`. Top-level fields such as `temperature` and `max_output_tokens` are straightforward. For nested objects (for example, `++responses_create_params.reasoning.effort=low`), the entire nested dict replaces the row's existing value at that key — other fields under the same nested object are not preserved. + +#### Resume Interrupted Runs + +Setting `+resume_from_cache=true` lets you restart the **same command** after a crash or interruption and pick up only the rows that have not finished yet. It works for both `ng_collect_rollouts` and `ng_e2e_collect_rollouts`, across any environment. + +How it works: + +- **Materialized inputs.** On the first run, the fully expanded input rows (after `num_repeats`, `limit`, `prompt_config`, and any overrides) are written to a sidecar file next to your output. The path is derived from `output_jsonl_fpath` by appending `_materialized_inputs` to the stem — so `rollouts.jsonl` produces `rollouts_materialized_inputs.jsonl`. +- **Incremental output.** Results are flushed to `output_jsonl_fpath` after each completed rollout, so partial output survives a crash. +- **Matching.** On resume, completed work is matched by `(task_index, rollout_index)` against the materialized inputs, and already-completed rows are skipped. The run prints a summary such as the number of original input rows, rows already done, and rows that still need to be run. 
+- **Fallback.** If either the materialized inputs or the output file is missing, resume is skipped and the run starts fresh. With the default `resume_from_cache=False`, existing output is cleared before the run. + + +If you change the config, schema, or data between runs, the materialized inputs become stale and resume will diff against the old expansion. Delete the `*_materialized_inputs.jsonl` file (and the output file) to start fresh. + + ### `ng_e2e_collect_rollouts` / `nemo_gym_e2e_collect_rollouts` Spin up all necessary servers and perform a batch of rollout collection using each dataset inside the provided configs. @@ -154,7 +186,8 @@ Spin up all necessary servers and perform a batch of rollout collection using ea | --- | --- | --- | | `output_jsonl_fpath` | str | The output data JSONL file path. | | `num_samples_in_parallel` | Optional[int] | Limit the number of concurrent samples running at once. | -| `responses_create_params` | Dict | Overrides for the `responses_create_params`, such as `temperature` and `max_output_tokens`. | +| `responses_create_params` | Dict | Overrides for the `responses_create_params`, such as `temperature` and `max_output_tokens`. Refer to [Generation Parameters](#generation-parameters). | +| `resume_from_cache` | bool | Resume an interrupted run by skipping rows already completed. Default: `False`. Refer to [Resume Interrupted Runs](#resume-interrupted-runs). | **Examples** @@ -263,6 +296,38 @@ ng_prepare_data "+config_paths=[${config_paths}]" \ --- +### `ng_materialize_prompts` / `nemo_gym_materialize_prompts` + +Apply a prompt template to raw JSONL data, producing materialized JSONL with populated `responses_create_params.input` for RL training. + +Each input row must **not** already have a populated `responses_create_params.input`; the command applies the prompt template from `prompt_config` to each row, fills in the input, and preserves the row's other fields. 
+ +**Parameters** + +| Parameter | Type | Description | +| --- | --- | --- | +| `input_jsonl_fpath` | str | Raw JSONL data (rows without `responses_create_params.input`). | +| `prompt_config` | str | Path to the prompt YAML file to apply. | +| `output_jsonl_fpath` | str | Output path for the materialized JSONL with populated prompts. | + +**Example** + +```bash +ng_materialize_prompts \ + +input_jsonl_fpath=data/my_dataset.jsonl \ + +prompt_config=/path/to/my_prompt.yaml \ + +output_jsonl_fpath=my_dataset_materialized.jsonl +``` + + +**Which data-preparation command should I use?** + +- **`ng_materialize_prompts`** — a focused, standalone step that applies a prompt template to raw rows to populate `responses_create_params.input`. No servers are started. Use it when you have raw data and just need to turn it into prompt-ready rows. +- **`ng_prepare_data`** — the full preparation pipeline for training: it can download missing datasets, validate data, and compute dataset metrics, writing train/validation splits and metrics artifacts. Use it to prepare and validate datasets for training or PR submission. + + +--- + ## Dataset Registry - GitLab Commands for uploading, downloading, and managing datasets in GitLab Model Registry.
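
The shallow-merge and sidecar-path behavior documented in this patch can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not NeMo Gym's actual implementation: the helper names `apply_overrides` and `materialized_inputs_path` are hypothetical, and the sample row and its `reasoning.summary` field are made up for the example.

```python
from pathlib import Path
from typing import Any


def apply_overrides(row_params: dict[str, Any], overrides: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: shallow merge, where top-level override keys win."""
    # A nested dict in `overrides` replaces the row's value at that key wholesale;
    # sibling fields under the same nested object are not preserved.
    return {**row_params, **overrides}


def materialized_inputs_path(output_jsonl_fpath: str) -> Path:
    """Hypothetical helper: append `_materialized_inputs` to the output file's stem."""
    out = Path(output_jsonl_fpath)
    return out.with_name(f"{out.stem}_materialized_inputs{out.suffix}")


# Illustrative row; the field values are assumptions for this sketch.
row = {"temperature": 0.2, "reasoning": {"effort": "high", "summary": "auto"}}
merged = apply_overrides(row, {"temperature": 1.0, "reasoning": {"effort": "low"}})

print(merged)  # {'temperature': 1.0, 'reasoning': {'effort': 'low'}}
print(materialized_inputs_path("rollouts.jsonl"))  # rollouts_materialized_inputs.jsonl
```

In the sketch, `summary` disappears from the merged row because the whole `reasoning` dict is replaced. If the real merge behaves as the docs describe, a row that needs to keep sibling fields would have to supply every field of the nested object in the override.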