diff --git a/fern/versions/latest/pages/get-started/installation.mdx b/fern/versions/latest/pages/get-started/installation.mdx
index ff3b5281be..9d0b00a59a 100644
--- a/fern/versions/latest/pages/get-started/installation.mdx
+++ b/fern/versions/latest/pages/get-started/installation.mdx
@@ -8,9 +8,25 @@ position: 2
 Python 3.12 is required. Refer to [Prerequisites](/get-started/prerequisites) for full system requirements.
-Clone from git to get the latest features and environments. If you intend to use NeMo Gym with NeMo RL, use the latest NGC container.
+Install from PyPI for the quickest setup. Clone from git if you need the latest features and environments, or use the NGC container if you intend to use NeMo Gym with NeMo RL.
+
+
+```bash
+pip install nemo-gym
+```
+
+Or with [uv](https://docs.astral.sh/uv/):
+
+```bash
+uv venv --python 3.12 && source .venv/bin/activate
+uv pip install nemo-gym
+```
+
+The package includes built-in environments and CLI commands. Config and data paths resolve against the package install location automatically. To pin a release: `pip install nemo-gym==0.3.1`.
+
+
 ```bash
diff --git a/fern/versions/latest/pages/reference/cli-commands.mdx b/fern/versions/latest/pages/reference/cli-commands.mdx
index 0cd1e5a5bb..5f641d1807 100644
--- a/fern/versions/latest/pages/reference/cli-commands.mdx
+++ b/fern/versions/latest/pages/reference/cli-commands.mdx
@@ -95,6 +95,10 @@ gym eval run \
     +wandb_project=gym-dev
 ```
+
+Sampling parameters such as `temperature`, `top_p`, and `max_output_tokens` have dedicated flags on `gym eval run`. When passing them via Hydra instead (`++responses_create_params.*`), overrides are merged into each input row with a **shallow** merge (top-level keys only). Nested overrides such as `++responses_create_params.reasoning.effort=low` replace the entire nested dict — other fields under that key are not preserved.
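The shallow-merge behavior described in the added note above can be illustrated with a minimal Python sketch. This is not NeMo Gym's implementation: the function name `shallow_merge` and the sample field values (including the `summary` key under `reasoning`) are assumptions chosen only to show why a nested override drops sibling fields.

```python
def shallow_merge(row_params: dict, overrides: dict) -> dict:
    """Merge overrides into a row's params at the top level only.

    Hypothetical helper for illustration; not the library's actual code.
    """
    merged = dict(row_params)   # copy so the input row is left untouched
    merged.update(overrides)    # top-level keys are replaced wholesale
    return merged


# Illustrative row: `reasoning` carries two nested fields.
row = {"temperature": 0.2, "reasoning": {"effort": "high", "summary": "auto"}}

# Roughly what `++responses_create_params.reasoning.effort=low` plus
# `++responses_create_params.top_p=1.0` would amount to under a shallow merge.
overrides = {"top_p": 1.0, "reasoning": {"effort": "low"}}

merged = shallow_merge(row, overrides)
print(merged)
```

In this sketch the merged `reasoning` value contains only `effort`; the row's original `summary` field is gone, which is the caveat the note warns about. Untouched top-level keys such as `temperature` survive.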
+
+
 ---
 ## General
@@ -269,6 +273,13 @@ gym dataset render \
     --output preview.jsonl
 ```
+
+**Which data-preparation command should I use?**
+
+- **`gym dataset render`** — a focused, standalone step that applies a prompt template to raw rows to populate `responses_create_params.input`. No servers are started. Use it when you have raw data and just need prompt-ready rows. Maps to legacy `ng_materialize_prompts`.
+- **`gym dataset collate`** — the full preparation pipeline for training: it can download missing datasets, validate data, and compute dataset metrics, writing train/validation splits and metrics artifacts. Use it to prepare and validate datasets for training or PR submission. Maps to legacy `ng_prepare_data`.
+
+
 ### `gym dataset collate`
 Validate and collate a dataset, generating metrics and statistics.
@@ -444,7 +455,7 @@ Collate data, start the servers, and collect rollouts. This is the main evaluati
 | `--model-type NAME` | Load the named model server type config. |
 | `--search-dir DIR` | Extra root directory to search for named components. Repeatable. |
 | `--no-serve` | Collect against already-running servers instead of starting them. |
-| `--resume` | Resume from cached rollouts instead of recollecting. |
+| `--resume` | Resume from cached rollouts instead of recollecting. Maps to legacy `+resume_from_cache=true`. Refer to [Resume interrupted runs](#resume-interrupted-runs). |
 | `--agent`, `-a` | Agent to collect rollouts with. |
 | `--input`, `-i` | Input tasks JSONL file. |
 | `--output`, `-o` | Output rollouts JSONL file. |
@@ -488,6 +499,21 @@ gym eval run --no-serve \
     --num-repeats '{agent_alpha: 4, agent_beta: 1, _default: 1}'
 ```
+#### Resume interrupted runs
+
+Pass `--resume` to restart the **same command** after a crash or interruption and pick up only the rows that have not finished yet.
+
+How it works:
+
+- **Materialized inputs.** On the first run, the fully expanded input rows (after `--num-repeats`, `--limit`, `--prompt-config`, and any overrides) are written to a sidecar file next to your output. The path is derived from `--output` by appending `_materialized_inputs` to the stem — so `rollouts.jsonl` produces `rollouts_materialized_inputs.jsonl`.
+- **Incremental output.** Successful rollouts are flushed to the main output JSONL after each completion; retriable failures go to a `_failures.jsonl` sidecar, so partial progress survives a crash.
+- **Matching.** On resume, completed work is matched by `(task_index, rollout_index)` against the materialized inputs, and already-completed rows are skipped. The run prints a summary such as the number of original input rows, rows already done, and rows that still need to be run.
+- **Fallback.** If either the materialized inputs or the output file is missing, resume is skipped and the run starts fresh. Without `--resume`, existing output is cleared before the run.
+
+
+If you change the config, schema, or data between runs, the materialized inputs become stale and resume will diff against the old expansion. Delete the `*_materialized_inputs.jsonl` file (and the output file) to start fresh.
+
+
 ### `gym eval aggregate`
 Merge sharded rollout results into a single rollouts file with aggregate metrics.
diff --git a/fern/versions/v0.3.0/pages/get-started/installation.mdx b/fern/versions/v0.3.0/pages/get-started/installation.mdx
index bd20b4f646..ca0865ae14 100644
--- a/fern/versions/v0.3.0/pages/get-started/installation.mdx
+++ b/fern/versions/v0.3.0/pages/get-started/installation.mdx
@@ -8,9 +8,25 @@ position: 2
 Python 3.12 is required. Refer to [Prerequisites](/get-started/prerequisites) for full system requirements.
-Clone from git to get the latest features and environments. If you intend to use NeMo Gym with NeMo RL, use the latest NGC container.
+Install from PyPI for the quickest setup. Clone from git if you need the latest features and environments, or use the NGC container if you intend to use NeMo Gym with NeMo RL.
+
+
+```bash
+pip install nemo-gym
+```
+
+Or with [uv](https://docs.astral.sh/uv/):
+
+```bash
+uv venv --python 3.12 && source .venv/bin/activate
+uv pip install nemo-gym
+```
+
+The package includes built-in environments and CLI commands. Config and data paths resolve against the package install location automatically. To pin a release: `pip install nemo-gym==0.3.1`.
+
+
 ```bash
diff --git a/fern/versions/v0.3.0/pages/reference/cli-commands.mdx b/fern/versions/v0.3.0/pages/reference/cli-commands.mdx
index d24b69612a..52f8dc2acc 100644
--- a/fern/versions/v0.3.0/pages/reference/cli-commands.mdx
+++ b/fern/versions/v0.3.0/pages/reference/cli-commands.mdx
@@ -130,7 +130,8 @@ Perform a batch of rollout collection.
 | `num_repeats` | Optional[int] | The number of times to repeat each example to run. Useful if you want to calculate mean@k, such as mean@4 or mean@16. |
 | `num_repeats_add_seed` | bool | When num_repeats >1, add a "seed" parameter on the Responses create params. |
 | `num_samples_in_parallel` | Optional[int] | Limit the number of concurrent samples running at once. |
-| `responses_create_params` | Dict | Overrides for the `responses_create_params`, such as `temperature` and `max_output_tokens`. |
+| `responses_create_params` | Dict | Overrides for the `responses_create_params`, such as `temperature` and `max_output_tokens`. Refer to [Generation Parameters](#generation-parameters). |
+| `resume_from_cache` | bool | Resume an interrupted run by skipping rows already completed. Default: `False`. Refer to [Resume Interrupted Runs](#resume-interrupted-runs). |
 **Example**
@@ -144,6 +145,37 @@ ng_collect_rollouts \
     +num_samples_in_parallel=10
 ```
+#### Generation Parameters
+
+Sampling parameters such as `temperature`, `max_output_tokens`, and `top_p` are not standalone CLI flags — they are passed as overrides inside `responses_create_params` using Hydra's nested dot syntax. Overrides are merged into each input row's existing `responses_create_params` with a **shallow** merge (top-level keys only):
+
+```bash
+ng_collect_rollouts \
+    +agent_name=example_single_tool_call_simple_agent \
+    +input_jsonl_fpath=weather_query.jsonl \
+    +output_jsonl_fpath=weather_rollouts.jsonl \
+    ++responses_create_params.temperature=1.0 \
+    ++responses_create_params.top_p=1.0 \
+    ++responses_create_params.max_output_tokens=4096
+```
+
+The same syntax works for `ng_e2e_collect_rollouts`. Top-level fields such as `temperature` and `max_output_tokens` are straightforward. For nested objects (for example, `++responses_create_params.reasoning.effort=low`), the entire nested dict replaces the row's existing value at that key — other fields under the same nested object are not preserved.
+
+#### Resume Interrupted Runs
+
+Setting `+resume_from_cache=true` lets you restart the **same command** after a crash or interruption and pick up only the rows that have not finished yet. It works for both `ng_collect_rollouts` and `ng_e2e_collect_rollouts`, across any environment.
+
+How it works:
+
+- **Materialized inputs.** On the first run, the fully expanded input rows (after `num_repeats`, `limit`, `prompt_config`, and any overrides) are written to a sidecar file next to your output. The path is derived from `output_jsonl_fpath` by appending `_materialized_inputs` to the stem — so `rollouts.jsonl` produces `rollouts_materialized_inputs.jsonl`.
+- **Incremental output.** Results are flushed to `output_jsonl_fpath` after each completed rollout, so partial output survives a crash.
+- **Matching.** On resume, completed work is matched by `(task_index, rollout_index)` against the materialized inputs, and already-completed rows are skipped. The run prints a summary such as the number of original input rows, rows already done, and rows that still need to be run.
+- **Fallback.** If either the materialized inputs or the output file is missing, resume is skipped and the run starts fresh. With the default `resume_from_cache=False`, existing output is cleared before the run.
+
+
+If you change the config, schema, or data between runs, the materialized inputs become stale and resume will diff against the old expansion. Delete the `*_materialized_inputs.jsonl` file (and the output file) to start fresh.
+
+
 ### `ng_e2e_collect_rollouts` / `nemo_gym_e2e_collect_rollouts`
 Spin up all necessary servers and perform a batch of rollout collection using each dataset inside the provided configs.
@@ -154,7 +186,8 @@ Spin up all necessary servers and perform a batch of rollout collection using ea
 | --- | --- | --- |
 | `output_jsonl_fpath` | str | The output data JSONL file path. |
 | `num_samples_in_parallel` | Optional[int] | Limit the number of concurrent samples running at once. |
-| `responses_create_params` | Dict | Overrides for the `responses_create_params`, such as `temperature` and `max_output_tokens`. |
+| `responses_create_params` | Dict | Overrides for the `responses_create_params`, such as `temperature` and `max_output_tokens`. Refer to [Generation Parameters](#generation-parameters). |
+| `resume_from_cache` | bool | Resume an interrupted run by skipping rows already completed. Default: `False`. Refer to [Resume Interrupted Runs](#resume-interrupted-runs). |
 **Examples**
@@ -263,6 +296,38 @@ ng_prepare_data "+config_paths=[${config_paths}]" \
 ---
+### `ng_materialize_prompts` / `nemo_gym_materialize_prompts`
+
+Apply a prompt template to raw JSONL data, producing materialized JSONL with populated `responses_create_params.input` for RL training.
+
+Each input row must **not** already have a populated `responses_create_params.input`; the command applies the prompt template from `prompt_config` to each row, fills in the input, and preserves the row's other fields.
+
+**Parameters**
+
+| Parameter | Type | Description |
+| --- | --- | --- |
+| `input_jsonl_fpath` | str | Raw JSONL data (rows without `responses_create_params.input`). |
+| `prompt_config` | str | Path to the prompt YAML file to apply. |
+| `output_jsonl_fpath` | str | Output path for the materialized JSONL with populated prompts. |
+
+**Example**
+
+```bash
+ng_materialize_prompts \
+    +input_jsonl_fpath=data/my_dataset.jsonl \
+    +prompt_config=/path/to/my_prompt.yaml \
+    +output_jsonl_fpath=my_dataset_materialized.jsonl
+```
+
+
+**Which data-preparation command should I use?**
+
+- **`ng_materialize_prompts`** — a focused, standalone step that applies a prompt template to raw rows to populate `responses_create_params.input`. No servers are started. Use it when you have raw data and just need to turn it into prompt-ready rows.
+- **`ng_prepare_data`** — the full preparation pipeline for training: it can download missing datasets, validate data, and compute dataset metrics, writing train/validation splits and metrics artifacts. Use it to prepare and validate datasets for training or PR submission.
+
+
+---
+
 ## Dataset Registry - GitLab
 Commands for uploading, downloading, and managing datasets in GitLab Model Registry.
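The resume bookkeeping described in the "Resume interrupted runs" hunks of this diff can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the code that ships with NeMo Gym: the helper names `materialized_inputs_path` and `rows_still_to_run` are hypothetical, and the sketch assumes each row is a dict carrying `task_index` and `rollout_index` keys.

```python
from pathlib import Path


def materialized_inputs_path(output_path: str) -> Path:
    """Derive the sidecar path by appending `_materialized_inputs` to the stem.

    Hypothetical helper; mirrors the naming rule stated in the docs.
    """
    p = Path(output_path)
    return p.with_name(f"{p.stem}_materialized_inputs{p.suffix}")


def rows_still_to_run(materialized_rows: list[dict], completed_rows: list[dict]) -> list[dict]:
    """Return the materialized rows whose (task_index, rollout_index) is not done yet.

    Assumes both row lists expose `task_index` and `rollout_index` keys.
    """
    done = {(r["task_index"], r["rollout_index"]) for r in completed_rows}
    return [
        r for r in materialized_rows
        if (r["task_index"], r["rollout_index"]) not in done
    ]


# Two tasks, each repeated twice, gives four materialized rows.
materialized = [
    {"task_index": t, "rollout_index": k} for t in range(2) for k in range(2)
]
# Suppose the interrupted run finished two of them.
completed = [
    {"task_index": 0, "rollout_index": 0},
    {"task_index": 1, "rollout_index": 1},
]

sidecar = materialized_inputs_path("rollouts.jsonl")
remaining = rows_still_to_run(materialized, completed)
print(sidecar, len(materialized), len(completed), len(remaining))
```

Keying on the `(task_index, rollout_index)` pair rather than on row contents is what lets repeated copies of the same task be told apart. It is also why a stale materialized-inputs file is a problem: the indices would then refer to an expansion that no longer matches the current config.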