diff --git a/models/sweeps/add-w-and-b-to-your-code.mdx b/models/sweeps/add-w-and-b-to-your-code.mdx
index 206045728c..1d7e0bf4b6 100644
--- a/models/sweeps/add-w-and-b-to-your-code.mdx
+++ b/models/sweeps/add-w-and-b-to-your-code.mdx
@@ -19,7 +19,7 @@ Next you define a function called `main` that mimics a typical training loop. Fo
This code is a mock training script. It does not train a model, but simulates the training process by generating random accuracy and loss values. The purpose of this code is to demonstrate how to integrate W&B into your training script.
-```python
+```python lines title="train.py"
import random
import numpy as np
@@ -64,13 +64,17 @@ To use the W&B Python SDK to start, stop, and manage sweeps, follow the instruct
-Create a YAML configuration file with your sweep configuration. The
-configuration file contains the hyperparameters you want the sweep to explore. In
-the following example, the batch size (`batch_size`), epochs (`epochs`), and
-the learning rate (`lr`) hyperparameters are varied during each sweep.
+Create a YAML file that defines the hyperparameters to vary during the sweep and the metric to optimize.
+
+Add the name of your Python script to the `program` key on line 1 of the YAML file.
+
+The sweep agent selects a value from the `values` list and passes it to `wandb.config` in the training script. For example, if you define the `batch_size` parameter with the values `[16, 32, 64]`, the sweep agent selects one of those values and passes it to the training script as `wandb.config.batch_size`.
-```yaml
-# config.yaml
+The following YAML file corresponds to the original training script shown earlier. The training script varies the `batch_size`, `lr`, and `epochs` hyperparameters. The YAML file defines the same hyperparameters and specifies the values to try for each one on lines 8 to 14.
+
+The training script also computes the validation accuracy metric, `val_acc`. The YAML file specifies that the sweep should maximize `val_acc` on line 5.
+
+```yaml lines title="config.yaml"
program: train.py
method: random
name: sweep
@@ -89,38 +93,45 @@ parameters:
For more information on how to create a W&B Sweep configuration, see [Define sweep configuration](/models/sweeps/define-sweep-configuration/).
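
To make the selection step concrete, the following sketch mimics in plain Python what a `random` sweep does with `values` lists: it picks one value per hyperparameter. This is an illustration only, not W&B's implementation. The `parameters` dictionary, its value lists, and the `sample_random_config` helper are hypothetical names chosen for this example.

```python
import random

# Hypothetical search space, mirroring the `parameters` block of a sweep config.
parameters = {
    "batch_size": {"values": [16, 32, 64]},
    "epochs": {"values": [5, 10, 15]},
    "lr": {"values": [0.001, 0.01, 0.1]},
}


def sample_random_config(parameters, rng=random):
    """Pick one value per hyperparameter, as a random search would."""
    return {name: rng.choice(spec["values"]) for name, spec in parameters.items()}


sampled = sample_random_config(parameters)
print(sampled)
```

In a real sweep, the training script receives the selected values through `wandb.config` rather than from a local helper like this one.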
-You must provide the name of your Python script for the `program` key
-in your YAML file.
+After you define your sweep configuration in a YAML file, you need to add W&B to your training script to read in the YAML file and log the metric you want to optimize for.
-Next, add the following to the code example:
+Within your training script, add the following code snippets to integrate W&B:
-1. Import the W&B Python SDK (`wandb`) and PyYAML (`yaml`). PyYAML is used to read in our YAML configuration file.
-2. Read in the configuration file.
-3. Use [`wandb.init()`](/models/ref/python/functions/init) to start a background process to sync and log data as a [W&B Run](/models/ref/python/experiments/run). Pass the config object to the config parameter.
-4. Define hyperparameter values from `wandb.Run.config` instead of using hard coded values.
-5. Log the metric you want to optimize with [`wandb.Run.log()`](/models/ref/python/experiments/run.md/#method-runlog). You must log the metric defined in your configuration. Within the configuration dictionary (`sweep_configuration` in this example) you define the sweep to maximize the `val_acc` value.
+1. Import the W&B Python SDK (`wandb`) and a YAML parser such as PyYAML (`yaml`).
+2. Read the YAML configuration file with `yaml`.
+3. Initialize a [run](/models/runs) with `wandb.init()`.
+4. Pass the configuration object to the `config` parameter of `wandb.init()`.
+5. Retrieve the hyperparameter values from `wandb.Run.config` so that your script uses the values defined in the YAML file instead of hard-coded values. W&B flattens configuration values, so you can access nested values with dot notation or bracket notation as though they were top-level keys.
+6. Log the metric that you want to optimize with `wandb.Run.log()`.
-```python
+
+You must log the metric you defined in your configuration.
+
+
+The following code snippet shows how to integrate W&B into your training script. Lines 4 to 7 show how to read in the YAML configuration file and pass the configuration to `wandb.init()`.
+
+Lines 9 and 10 show how to fetch the hyperparameter values from the `wandb.Run.config` object. Line 17 shows how to log the metric you are optimizing for (`val_acc`) to W&B.
+
+```python lines title="train.py"
import wandb
import yaml
import random
import numpy as np
-
def train_one_epoch(epoch, lr, batch_size):
+ """Simulates training for one epoch and returns the training accuracy and loss."""
acc = 0.25 + ((epoch / 30) + (random.random() / 10))
loss = 0.2 + (1 - ((epoch - 1) / 10 + random.random() / 5))
return acc, loss
-
def evaluate_one_epoch(epoch):
+ """Simulates evaluation for one epoch and returns the validation accuracy and loss."""
acc = 0.1 + ((epoch / 20) + (random.random() / 10))
loss = 0.25 + (1 - ((epoch - 1) / 10 + random.random() / 6))
return acc, loss
-
def main():
- # Set up your default hyperparameters
+ # Read in the configuration file
with open("./config.yaml") as file:
config = yaml.load(file, Loader=yaml.FullLoader)
@@ -142,9 +153,62 @@ def main():
main()
```
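
The flow described in the steps above (read hyperparameters from a configuration, run the loop, log the metric to optimize) can be sketched without W&B installed. In this sketch, a plain dictionary stands in for `wandb.Run.config` and a list's `append` method stands in for `wandb.Run.log()`. Both stand-ins, the `run_trial` helper, and the hyperparameter values are assumptions made for illustration.

```python
import random


def evaluate_one_epoch(epoch):
    """Simulates evaluation for one epoch and returns the validation accuracy and loss."""
    acc = 0.1 + ((epoch / 20) + (random.random() / 10))
    loss = 0.25 + (1 - ((epoch - 1) / 10 + random.random() / 6))
    return acc, loss


def run_trial(config, log):
    """Run the mock loop with values from `config` and record metrics through `log`."""
    for epoch in range(1, config["epochs"] + 1):
        val_acc, val_loss = evaluate_one_epoch(epoch)
        # The logged key must match the metric name in the sweep configuration.
        log({"epoch": epoch, "val_acc": val_acc, "val_loss": val_loss})


history = []  # stand-in for the W&B run history
run_trial({"epochs": 3, "lr": 0.01, "batch_size": 32}, history.append)
print(max(entry["val_acc"] for entry in history))
```

Keeping the loop independent of the logging backend makes it easy to check locally that the metric name you log matches the one the sweep optimizes.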
-In your CLI, set a maximum number of runs for the sweep
-agent to try. This is optional. This example we set the
-maximum number to 5.
+
+**W&B flattens configuration values passed to `wandb.init(config=)`**
+
+Normally, you access nested values in a configuration object with dot notation or bracket notation. For example, consider the following nested configuration:
+
+```yaml title="sample.yaml"
+key1: value1
+key2:
+ nested_key1: nested_value1
+ nested_key2: nested_value2
+```
+
+You then read in the file with `yaml`:
+
+```python
+import yaml
+
+with open("sample.yaml") as file:
+ yaml_sample = yaml.load(file, Loader=yaml.FullLoader)
+```
+
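+You can then access `nested_value1` with bracket notation: `yaml_sample["key2"]["nested_key1"]`. Because `yaml.load` returns a plain Python dictionary, dot notation such as `yaml_sample.key2.nested_key1` does not work on `yaml_sample`.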
+
+When you pass a configuration to `wandb.init(config=)`, W&B flattens the values. This means that you access nested values as though they were top-level keys.
+
+For example, consider the following YAML file:
+
+```yaml title="config.yaml"
+program: train.py
+method: random
+name: sweep
+metric:
+  goal: maximize
+  name: val_acc
+parameters:
+  epochs:
+    values: [10, 20, 30]
+  learning_rate:
+    min: 0.001
+    max: 0.1
+```
+
+After you read in the file and pass the configuration to `wandb.init(config=)`, access the `goal` value with `run.config["goal"]` instead of `run.config["metric"]["goal"]` or `run.config.metric.goal`.
+
+```python
+import wandb
+import yaml
+with open("config.yaml") as file:
+    config = yaml.load(file, Loader=yaml.FullLoader)
+with wandb.init(config=config) as run:
+    metric_goal = run.config["goal"]  # Access the metric goal: "maximize"
+```
+
+
+
+In your shell, set a maximum number of runs for the sweep agent to try. This is optional. In this example, we set the maximum number to 5.
```bash
NUM=5
@@ -153,7 +217,7 @@ NUM=5
Next, initialize the sweep with the [`wandb sweep`](/models/ref/cli/wandb-sweep) command. Provide the name of the YAML file. Optionally provide the name of the project for the project flag (`--project`):
```bash
-wandb sweep --project sweep-demo-cli config.yaml
+wandb sweep --project project_name config.yaml
```
This returns a sweep ID. For more information on how to initialize sweeps, see
@@ -164,7 +228,7 @@ the sweep job with the [`wandb agent`](/models/ref/cli/wandb-agent)
command:
```bash
-wandb agent --count $NUM your-entity/sweep-demo-cli/sweepID
+wandb agent --count $NUM your-entity/project_name/sweepID
```
For more information, see [Start sweep jobs](./start-sweep-agents).
diff --git a/models/sweeps/define-sweep-configuration.mdx b/models/sweeps/define-sweep-configuration.mdx
index 25c0d7cf74..c5cb2e05fe 100644
--- a/models/sweeps/define-sweep-configuration.mdx
+++ b/models/sweeps/define-sweep-configuration.mdx
@@ -3,32 +3,33 @@ description: Learn how to create configuration files for sweeps.
title: Overview
---
+Use a sweep configuration to define the hyperparameters to optimize during training, the search strategy to use, and other sweep settings.
-A W&B Sweep combines a strategy for exploring hyperparameter values with the code that evaluates them. The strategy can be as simple as trying every option or as complex as Bayesian Optimization and Hyperband ([BOHB](https://arxiv.org/abs/1807.01774)).
+The following sections describe the top-level structure of a sweep configuration. For a comprehensive list of top-level keys, see [Sweep configuration options](./sweep-config-keys).
-Define a sweep configuration either in a [Python dictionary](https://docs.python.org/3/tutorial/datastructures.html#dictionaries) or a [YAML](https://yaml.org/) file. How you define your sweep configuration depends on how you want to manage your sweep.
-
-
-Define your sweep configuration in a YAML file if you want to initialize a sweep and start a sweep agent from the command line. Define your sweep in a Python dictionary if you initialize a sweep and start a sweep entirely within a Python script or notebook.
-
+## Basic structure
-The following guide describes how to format your sweep configuration. See [Sweep configuration options](./sweep-config-keys) for a comprehensive list of top-level sweep configuration keys.
+Sweep configurations use key-value pairs and nested structures. You can define your sweep configuration in a YAML file or in a Python dictionary. The structure of the sweep configuration is the same regardless of where you define it.
-## Basic structure
+
+**Where should you define your sweep configuration?**
-Both sweep configuration format options (YAML and Python dictionary) utilize key-value pairs and nested structures.
+Define your sweep configuration in a YAML file if you want to manage sweeps from the command line or keep the sweep configuration separate from your training code.
-Use top-level keys within your sweep configuration to define qualities of your sweep search such as the name of the sweep ([`name`](./sweep-config-keys) key), the parameters to search through ([`parameters`](./sweep-config-keys#parameters) key), the methodology to search the parameter space ([`method`](./sweep-config-keys#method) key), and more.
+Define your sweep configuration in a Python dictionary if your training algorithm is defined in a Python script or notebook, or if you want to keep the sweep configuration close to your training code.
+
+Top-level keys define qualities of your sweep search such as the name of the sweep ([`name`](./sweep-config-keys) key), the parameters to search through ([`parameters`](./sweep-config-keys#parameters) key), the methodology to search the parameter space ([`method`](./sweep-config-keys#method) key), and more.
-For example, the following code snippets show the same sweep configuration defined within a YAML file and within a Python dictionary. Within the sweep configuration there are five top level keys specified: `program`, `name`, `method`, `metric` and `parameters`.
+The values associated with each key can be a string, a number, a list, or another nested key-value pair. The value type depends on the key.
+For example, the following code snippet shows a sweep configuration with the `method`, `metric`, and `parameters` keys. The `method` key specifies the search strategy (`bayes`). The `metric` key specifies the metric to optimize and whether to minimize or maximize it. The `parameters` key specifies the hyperparameters to optimize and their values or distributions.
-Define a sweep configuration in a YAML file if you want to manage sweeps interactively from the command line (CLI)
+The following code snippet shows how to define a sweep configuration in a YAML file named `config.yaml`:
-```yaml title="config.yaml"
+```yaml lines title="config.yaml"
program: train.py
name: sweepdemo
method: bayes
@@ -46,13 +47,14 @@ parameters:
optimizer:
values: ["adam", "sgd"]
```
+
+Within the top-level `parameters` key (line 7), the following keys are nested: `learning_rate` (line 8), `batch_size` (line 11), `epochs` (line 14), and `optimizer` (line 17). For each of the nested keys you specify, you can provide one or more values, a distribution, a probability, and more.
+
-Define a sweep in a Python dictionary data structure if you define training algorithm in a Python script or notebook.
-
The following code snippet stores a sweep configuration in a variable named `sweep_configuration`:
-```python title="train.py"
+```python lines title="train.py"
sweep_configuration = {
"name": "sweepdemo",
"method": "bayes",
@@ -65,16 +67,21 @@ sweep_configuration = {
},
}
```
-
-
+Within the top-level `parameters` key (line 5), the following keys are nested: `learning_rate` (line 6), `batch_size` (line 7), `epochs` (line 8), and `optimizer` (line 10). For each of the nested keys you specify, you can provide one or more values, a distribution, a probability, and more.
-Within the top level `parameters` key, the following keys are nested: `learning_rate`, `batch_size`, `epoch`, and `optimizer`. For each of the nested keys you specify, you can provide one or more values, a distribution, a probability, and more. For more information, see the [parameters](./sweep-config-keys#parameters) section in [Sweep configuration options](./sweep-config-keys).
+
+
+
+See [Sweep configuration options](./sweep-config-keys) for a comprehensive list of top-level sweep configuration keys and their associated values.
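
As a hedged illustration of the structure described above, the following sketch walks a sweep configuration dictionary and reports, for each entry under `parameters`, whether it lists discrete `values` or a `min`/`max` range. The dictionary contents and the `describe_parameters` helper are hypothetical and are not part of the W&B SDK.

```python
# Hypothetical sweep configuration; keys follow the structure described above.
sweep_configuration = {
    "name": "sweepdemo",
    "method": "bayes",
    "metric": {"goal": "minimize", "name": "validation_loss"},
    "parameters": {
        "learning_rate": {"min": 0.0001, "max": 0.1},
        "batch_size": {"values": [16, 32, 64]},
        "epochs": {"values": [5, 10, 15]},
        "optimizer": {"values": ["adam", "sgd"]},
    },
}


def describe_parameters(config):
    """Return a short description of each hyperparameter's search space."""
    descriptions = {}
    for name, spec in config["parameters"].items():
        if "values" in spec:
            descriptions[name] = f"one of {spec['values']}"
        elif "min" in spec and "max" in spec:
            descriptions[name] = f"range [{spec['min']}, {spec['max']}]"
        else:
            descriptions[name] = "other specification"
    return descriptions


for name, description in describe_parameters(sweep_configuration).items():
    print(f"{name}: {description}")
```

A check like this can catch a misplaced key, such as a hyperparameter defined outside `parameters`, before you initialize the sweep.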
+
## Double nested parameters
-Sweep configurations support nested parameters. To define a nested parameter, include an additional `parameters` key under the top-level parameter name.
+Sweep configurations support nested parameters. Double nested parameters are useful for organizing your hyperparameters into categories. For example, you can group hyperparameters related to the optimizer under an `optimizer` category and group hyperparameters related to the model architecture under a `model` category.
+
+To define a nested parameter, include an additional `parameters` key under the top-level parameter name.
The following example shows a sweep configuration with three nested parameters: `nested_category_1`, `nested_category_2`, and `nested_category_3`. Each nested parameter includes two additional parameters: `momentum` and `weight_decay`.
@@ -87,7 +94,7 @@ The following code snippets show how to define nested parameters in both a YAML
-```yaml
+```yaml title="config.yaml"
program: sweep_nest.py
name: nested_sweep
method: random