The S Mazes and the M1 Mazes with Their Tests#19
Conversation
There was a problem hiding this comment.
Pull request overview
This PR brings the S (S1–S6) and M1 (single key-door) maze JSON specifications into the root repository (out of the ogbench submodule), along with dedicated unit tests and supporting test helpers to validate maze structure and solvability.
Changes:
- Adds new maze-spec test suites for S mazes and M1 mazes, plus shared maze test utilities.
- Introduces a test-helper BFS solver wrapper that delegates planning to the existing
gridworldbaseline planner. - Adds the S/M1 experimental maze JSONs under
mazes/exp_maze_jsons/, plus editor and submodule metadata files.
Reviewed changes
Copilot reviewed 30 out of 52 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/test_S_maze_types.py | Adds unit tests validating navigation, metadata, and solvability for S1–S6 maze specs. |
| tests/test_M1_maze_types.py | Adds unit tests validating M1 naming, structure parity with S mazes, and key-door chain constraints. |
| tests/maze_test_utils.py | Adds shared loaders and assertions for maze spec validation and BFS-based solvability checks. |
| tests/BFS_solver.py | Provides a legacy-shaped BFS test helper that delegates to gridworld.baselines planning. |
| mazes/exp_maze_jsons/S1/8x8_empty_room_0.json | Adds S1 variant 0 maze JSON spec. |
| mazes/exp_maze_jsons/S1/8x8_empty_room_1.json | Adds S1 variant 1 maze JSON spec. |
| mazes/exp_maze_jsons/S2/8x8_corridor_0.json | Adds S2 variant 0 maze JSON spec. |
| mazes/exp_maze_jsons/S2/8x8_corridor_1.json | Adds S2 variant 1 maze JSON spec. |
| mazes/exp_maze_jsons/S3/10x10_corridor_0.json | Adds S3 variant 0 maze JSON spec. |
| mazes/exp_maze_jsons/S3/10x10_corridor_1.json | Adds S3 variant 1 maze JSON spec. |
| mazes/exp_maze_jsons/S4/10x10_dense_0.json | Adds S4 variant 0 maze JSON spec. |
| mazes/exp_maze_jsons/S4/10x10_dense_1.json | Adds S4 variant 1 maze JSON spec. |
| mazes/exp_maze_jsons/S5/14x14_corridor_0.json | Adds S5 variant 0 maze JSON spec. |
| mazes/exp_maze_jsons/S5/14x14_corridor_1.json | Adds S5 variant 1 maze JSON spec. |
| mazes/exp_maze_jsons/S6/14x14_dense_0.json | Adds S6 variant 0 maze JSON spec. |
| mazes/exp_maze_jsons/S6/14x14_dense_1.json | Adds S6 variant 1 maze JSON spec. |
| mazes/exp_maze_jsons/M1/8x8_corridor_kr_0.json | Adds M1 8x8 corridor variant 0 key-door maze spec. |
| mazes/exp_maze_jsons/M1/8x8_corridor_kr_1.json | Adds M1 8x8 corridor variant 1 key-door maze spec. |
| mazes/exp_maze_jsons/M1/10x10_corridor_kr_0.json | Adds M1 10x10 corridor variant 0 key-door maze spec. |
| mazes/exp_maze_jsons/M1/10x10_corridor_kr_1.json | Adds M1 10x10 corridor variant 1 key-door maze spec. |
| mazes/exp_maze_jsons/M1/10x10_dense_kr_0.json | Adds M1 10x10 dense variant 0 key-door maze spec. |
| mazes/exp_maze_jsons/M1/10x10_dense_kr_1.json | Adds M1 10x10 dense variant 1 key-door maze spec. |
| mazes/exp_maze_jsons/M1/14x14_corridor_kr_0.json | Adds M1 14x14 corridor variant 0 key-door maze spec. |
| mazes/exp_maze_jsons/M1/14x14_corridor_kr_1.json | Adds M1 14x14 corridor variant 1 key-door maze spec. |
| mazes/exp_maze_jsons/M1/14x14_dense_kr_0.json | Adds M1 14x14 dense variant 0 key-door maze spec. |
| mazes/exp_maze_jsons/M1/14x14_dense_kr_1.json | Adds M1 14x14 dense variant 1 key-door maze spec. |
| .vscode/settings.json | Adds workspace-level Python environment manager/package manager settings. |
| .vscode/launch.json | Adds a Python debug configuration with a pre-set PYTHONPATH. |
| .gitmodules | Adds an ogbench submodule entry. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
| task_spec = _to_task_spec(spec) | ||
| ctx = TaskPlanningContext(task_spec) | ||
| start_state = ctx.initial_state() | ||
| actions, final_state = _shortest_plan( |
There was a problem hiding this comment.
This is a semantic merge conflict with PR #18 (CC @seanrivera). Git won't catch this because the two PRs touch different files (scorer modifies gridworld/baselines.py; this PR adds a new
BFS_solver.py). But after they both merge this will lead to an error.
Currently in main today, _shortest_plan returns (actions, state) which is a 2-tuple. The scorer PR changes it to (actions, state, states_explored) - 3-tuple. Whichever PR lands second will silently auto-merge but every test that calls solve() here crashes with: ValueError: too many values to unpack (expected 2)
Two ways to handle it:
Option A: unpack by index so the same line works
against either return shape across the merge window. result = _shortest_plan(ctx, start_state, ...)
actions, final_state = result[0], result[1]
Option B (probably cleaner): drop the private-API import entirely and use the public plan_bfs_path(spec) from 'gridworld.baselines. It returns a PlannedPathwithaction_labels, positions, success` already populated -
so most of the loop at lines 54-70 becomes unnecessary.
| test_case.assertLess(x, width, label) | ||
| test_case.assertGreaterEqual(y, 0, label) | ||
| test_case.assertLess(y, height, label) | ||
| test_case.assertNotIn(tuple(point), walls, label) |
There was a problem hiding this comment.
Lines 42-47 are an exact copy of lines 36-41 inside the same
for-loop body. Six assertions run twice
Summary
Moves the S maze set and the M1 single key-door maze set from inside the
ogbenchsubmodule into the root repo along with maze-specific tests and rendered maze images.Maze Sets
S1: 8x8 empty room, no mechanisms - Baseline, can model navigate at all? Also validates environment
S2: 8x8 corridor with turns, no mechanisms - Does turning matter?
S3: 10x10 corridor with turns, no mechanisms - Moderate size navigation
S4: 10x10 dense maze (dead ends), no mechanisms - Does topology matter at moderate size?
S5: 14x14 corridor with turns, no mechanisms - Large grid navigation
S6: 14x14 dense maze (dead ends), no mechanisms - Large grid + topology combined
M1: Single key-door - Basic mechanism comprehension example
Visualizing Mazes
Maze-Specific Tests
From the repo root:
Tests should all pass: