New min angle graph #850
Open
klei22 wants to merge 6 commits into
klei22 (Collaborator) commented Jun 17, 2026
Pull request overview
Adds an optional “LM-head minimum-angle graph” export pipeline to the training loop, producing per-eval CSV/JSON snapshots and a local HTML viewer to inspect how nearest-neighbor angular structure evolves over training.
Changes:
- Add a blockwise LM-head nearest-neighbor (min-angle) exporter that writes CSV + JSON metadata.
- Wire the exporter into `train.py` with new CLI args and README usage documentation.
- Add an exploration config, a demo script, and a standalone Plotly-based HTML viewer for browsing snapshots.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| `utils/min_angle_graph_export.py` | New blockwise cosine/angle nearest-neighbor computation and CSV/JSON export writer. |
| `train.py` | Preloads stdlib modules to avoid shadowing; adds per-validation min-angle graph export hook. |
| `train_args.py` | Adds CLI flags controlling export directory, cadence, block size, device, and label. |
| `README.md` | Documents how to enable exports and view them in the Plotly HTML page. |
| `explorations/min_angle_graph_export.yaml` | Smoke-test experiment config enabling exports on small runs. |
| `demos/min_angle_graph_export_demo.sh` | Scripted demo running the exploration config and pointing to the viewer. |
| `analysis/min_angle_graph_plotly_viewer.html` | Local, file-based Plotly viewer for stepping through exported CSV snapshots. |

Comment on lines +24 to +25

    def compute_min_angle_graph(weight, block_size=2048, compute_device="auto"):
        """Compute each row vector's closest non-self row by signed angular distance."""

Comment on lines +157 to +159

        "has_token_text_escaped": escaped_token_texts is not None,
        "angle_definition": "signed 0-180 degrees, closest non-self token by maximum cosine",
    }
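To make the CSV plus JSON snapshot layout concrete, here is a minimal sketch of a writer that emits one CSV of nearest-neighbor rows and a JSON metadata sidecar. Only the two metadata keys shown in the hunk above come from the PR. The function name `write_snapshot`, the file naming scheme, the CSV columns, and the `step` and `num_tokens` keys are hypothetical and chosen only for illustration.

```python
import csv
import json
import os


def write_snapshot(out_dir, step, neighbors, escaped_token_texts=None):
    """Write one CSV of (token, neighbor, angle) rows plus a JSON sidecar.

    `neighbors` is a list of (token_id, neighbor_id, angle_degrees) tuples.
    File names and column names are hypothetical, not taken from the PR.
    """
    os.makedirs(out_dir, exist_ok=True)
    csv_path = os.path.join(out_dir, f"step_{step:08d}.csv")
    json_path = os.path.join(out_dir, f"step_{step:08d}.json")

    header = ["token_id", "neighbor_id", "angle_degrees"]
    if escaped_token_texts is not None:
        header.append("token_text")

    # newline="" lets the csv module control line endings itself.
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        for token_id, neighbor_id, angle in neighbors:
            row = [token_id, neighbor_id, f"{angle:.6f}"]
            if escaped_token_texts is not None:
                row.append(escaped_token_texts[token_id])
            writer.writerow(row)

    metadata = {
        "step": step,                    # hypothetical key
        "num_tokens": len(neighbors),    # hypothetical key
        # The two keys below mirror the hunk under review.
        "has_token_text_escaped": escaped_token_texts is not None,
        "angle_definition": "signed 0-180 degrees, closest non-self token by maximum cosine",
    }
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(metadata, f, indent=2)
    return csv_path, json_path
```

Keeping the metadata in a sidecar file rather than in CSV comments means a viewer can load the CSV with a plain parser and read the angle definition separately.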

Comment on lines +19 to +29

    _repo_dir = os.path.dirname(os.path.abspath(__file__))
    _removed_sys_path_entries = []
    for _entry in ("", _repo_dir):
        while _entry in sys.path:
            sys.path.remove(_entry)
            _removed_sys_path_entries.append(_entry)
    import copy as _stdlib_copy
    import dataclasses as _stdlib_dataclasses
    for _entry in reversed(_removed_sys_path_entries):
        sys.path.insert(0, _entry)
    del _entry, _removed_sys_path_entries, _repo_dir, _stdlib_copy, _stdlib_dataclasses
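The hunk above removes the script directory from `sys.path`, imports the stdlib modules so they are cached in `sys.modules`, and then re-inserts the removed entries at the front of the path, which may place them at different positions than before. One possible alternative, sketched here under the assumption that an exact restore is wanted, is a small context manager that snapshots `sys.path` and puts it back unchanged. The helper name `without_sys_path_entries` is hypothetical and is not part of the PR.

```python
import contextlib
import os
import sys


@contextlib.contextmanager
def without_sys_path_entries(*entries):
    """Temporarily drop the given entries from sys.path, then restore it exactly."""
    saved = list(sys.path)
    try:
        # Mutate in place so every reference to sys.path sees the change.
        sys.path[:] = [p for p in sys.path if p not in entries]
        yield
    finally:
        sys.path[:] = saved


# Example: import stdlib modules while the current directory is not
# importable, so a local copy.py or dataclasses.py cannot shadow them.
with without_sys_path_entries("", os.getcwd()):
    import copy
    import dataclasses
```

As in the original hunk, this only helps if the modules have not already been imported: once a name is in `sys.modules`, later `import` statements reuse the cached module regardless of `sys.path`.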

Comment on lines +1933 to +1938

    if self.args.export_min_angle_graph_each_eval:
        if live:
            live.stop()
        self.export_min_angle_graph(losses)
        if live:
            live.start()
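In the hunk above, the live display is only restarted if the export returns normally. A minimal sketch of one way to guarantee the restart even when the export raises is shown below. It assumes only that `live` is either falsy or an object exposing `start()` and `stop()`, as the hunk suggests, and the helper name `paused` is hypothetical.

```python
import contextlib


@contextlib.contextmanager
def paused(live):
    """Stop a live display for the duration of the block, then restart it."""
    if not live:
        # No display is active, so there is nothing to pause.
        yield
        return
    live.stop()
    try:
        yield
    finally:
        # Runs whether the body succeeded or raised.
        live.start()
```

With such a helper the call site would reduce to `with paused(live): self.export_min_angle_graph(losses)`.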

Comment on lines +265 to +269

    loss. The export treats each LM-head row as a token vector, streams
    row/column blocks through the selected compute device, excludes each token's
    self-distance, and records the closest non-self token by signed `0°–180°`
    angular distance. The full `vocab_size × vocab_size` angle matrix is never
    materialized.
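The blockwise scheme described in this README hunk can be illustrated with a dependency-free sketch. This is not the PR's implementation, which presumably runs tensor operations on the selected compute device. The function name `min_angle_neighbors` is hypothetical, rows are assumed to be non-zero, and at least two rows are assumed. The point is only to show tiling over row and column blocks, skipping the diagonal, and keeping a running best per row instead of the full matrix.

```python
import math


def min_angle_neighbors(rows, block_size=2):
    """Return, per row, (neighbor_index, angle_degrees) of the closest non-self row."""
    # Normalize once so that dot products are cosines (rows assumed non-zero).
    unit = []
    for row in rows:
        norm = math.sqrt(sum(x * x for x in row))
        unit.append([x / norm for x in row])

    n = len(unit)
    best_cos = [-math.inf] * n   # running maximum cosine per row
    best_idx = [-1] * n          # index of the row achieving it

    # Walk the n x n cosine matrix one (row block, column block) tile at a
    # time; only the running best per row is kept, never the whole matrix.
    for r0 in range(0, n, block_size):
        for c0 in range(0, n, block_size):
            for i in range(r0, min(r0 + block_size, n)):
                for j in range(c0, min(c0 + block_size, n)):
                    if i == j:
                        continue  # exclude the self-distance
                    cos = sum(a * b for a, b in zip(unit[i], unit[j]))
                    if cos > best_cos[i]:
                        best_cos[i] = cos
                        best_idx[i] = j

    # Clamp against rounding error, then convert to degrees in [0, 180].
    return [
        (best_idx[i], math.degrees(math.acos(max(-1.0, min(1.0, best_cos[i])))))
        for i in range(n)
    ]


rows = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
for token_id, (neighbor_id, angle) in enumerate(min_angle_neighbors(rows)):
    print(token_id, neighbor_id, round(angle, 3))
```

Because `math.acos` returns a value in `[0, π]`, the reported angle is always between 0° and 180°. The maximum is taken over the cosine itself rather than its absolute value, so a row pointing in the opposite direction counts as far away, not close.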