Add `aimx trace distribution` with tensor (histogram) output modes by blizhan · Pull Request #15 · blizhan/aimx

blizhan · 2026-04-26T08:55:39Z

Motivation

Provide a first-class trace distribution path so users can inspect Aim-tracked distribution sequences (histogram/tensor payloads) from the CLI.
Enable both human-friendly terminal inspection and script-friendly exports (CSV/JSON) of per-step histogram data.

Description

Add DistributionPoint and DistributionSeries types and implement collect_distribution_series(...) in aimx.aim_bridge.metric_stats that uses Repo.query_distributions(...) to collect per-step step, epoch, weights, and bin_edges data.
Add step-range filtering and sampling helpers filter_distribution_by_step_range and subsample_distribution, and keep metric helpers unchanged.
Extend the trace invocation parser to accept an explicit distribution target and route to a distribution-specific execution path in run_trace_command, preserving existing metric trace behavior.
Implement distribution renderers in src/aimx/rendering/trace_views.py: render_distribution_table, render_distribution_csv, and render_distribution_json, plus a small _format_tensor helper and labeling helper; update metric render helpers to coexist.
Update CLI help and README.md with trace distribution usage examples and add unit tests for parsing and renderers (tests/unit/test_trace_distribution_views.py) and adjust existing trace parsing tests.

Testing

Ran unit tests: uv run pytest tests/unit/test_trace_helpers.py tests/unit/test_trace_distribution_views.py tests/unit/test_router.py tests/unit/test_owned_commands.py, which completed successfully with 32 passed.
Ran a focused integration/contract subset: uv run pytest tests/integration/test_trace_command.py tests/contract/test_trace_contract.py, which completed with 1 passed, 12 skipped due to the sample-repo fixture behavior.
All automated tests executed for these changes passed or were skipped as expected.

Codex Task

gemini-code-assist

Code Review

This pull request introduces support for Aim distribution sequences, allowing users to fetch and render histogram weights via the aimx trace distribution command. Key additions include new data models for distributions, collection logic from the Aim repository, and rendering capabilities for terminal tables, CSV, and JSON outputs. The review feedback highlights opportunities to optimize step range filtering into a single pass and suggests refactoring the command execution logic to reduce duplication between metric and distribution processing paths.

gemini-code-assist · 2026-04-26T08:56:47Z

+    if start is not None:
+        points = [point for point in points if point.step >= start]
+    if end is not None:
+        points = [point for point in points if point.step <= end]


The step range filtering can be optimized to a single pass. This avoids creating an intermediate list when both start and end are provided, which improves performance for series with many points.

if start is not None or end is not None: points = [ point for point in points if (start is None or point.step >= start) and (end is None or point.step <= end) ]

gemini-code-assist · 2026-04-26T08:56:47Z

+        if invocation.target == "distribution":
+            from aimx.aim_bridge.metric_stats import (
+                collect_distribution_series,
+                filter_distribution_by_step_range,
+                subsample_distribution,
+            )

-        if not series_list:
-            return QueryCommandResult(exit_status=0, output="No matching metrics found.")
+            series_list = collect_distribution_series(invocation.expression, normalized_repo_path)
+            if not series_list:
+                return QueryCommandResult(exit_status=0, output="No matching distributions found.")

-        # Step range filter is a hard constraint applied before density subsampling
-        if invocation.step_slice is not None:
-            step_start, step_end = parse_step_slice(invocation.step_slice)
-            series_list = [filter_by_step_range(s, step_start, step_end) for s in series_list]
-            # Drop empty series so they don't clutter plots
-            series_list = [s for s in series_list if s.count > 0]
+            if invocation.step_slice is not None:
+                step_start, step_end = parse_step_slice(invocation.step_slice)
+                series_list = [
+                    filter_distribution_by_step_range(s, step_start, step_end) for s in series_list
+                ]
+                series_list = [s for s in series_list if s.count > 0]

-        if not series_list:
-            return QueryCommandResult(exit_status=0, output="No data in the requested step range.")
+            if not series_list:
+                return QueryCommandResult(exit_status=0, output="No data in the requested step range.")

-        # Density subsampling for visualisation
-        needs_sample = any(
-            x is not None for x in (invocation.head, invocation.tail, invocation.every)
-        )
-        if needs_sample:
-            series_list = [
-                subsample(s, head=invocation.head, tail=invocation.tail, every=invocation.every)
-                for s in series_list
-            ]
-
-        if invocation.mode == "json":
-            output = render_trace_json(series_list)
-        elif invocation.mode == "csv":
-            output = render_csv(series_list)
-        elif invocation.mode == "table":
-            output = render_trace_table(series_list, no_color=effective_no_color)
+            needs_sample = any(
+                x is not None for x in (invocation.head, invocation.tail, invocation.every)
+            )
+            if needs_sample:
+                series_list = [
+                    subsample_distribution(
+                        s, head=invocation.head, tail=invocation.tail, every=invocation.every
+                    )
+                    for s in series_list
+                ]
+
+            if invocation.mode == "json":
+                output = render_distribution_json(series_list)
+            elif invocation.mode == "csv":
+                output = render_distribution_csv(series_list)
+            else:
+                output = render_distribution_table(series_list, no_color=effective_no_color)
        else:
-            output = render_plot(
-                series_list,
-                width=invocation.width,
-                height=invocation.height,
+            from aimx.aim_bridge.metric_stats import (
+                collect_metric_series,
+                filter_by_step_range,
+                subsample,
            )

+            series_list = collect_metric_series(invocation.expression, normalized_repo_path)
+
+            if not series_list:
+                return QueryCommandResult(exit_status=0, output="No matching metrics found.")
+
+            # Step range filter is a hard constraint applied before density subsampling
+            if invocation.step_slice is not None:
+                step_start, step_end = parse_step_slice(invocation.step_slice)
+                series_list = [filter_by_step_range(s, step_start, step_end) for s in series_list]
+                # Drop empty series so they don't clutter plots
+                series_list = [s for s in series_list if s.count > 0]
+
+            if not series_list:
+                return QueryCommandResult(exit_status=0, output="No data in the requested step range.")
+
+            # Density subsampling for visualisation
+            needs_sample = any(
+                x is not None for x in (invocation.head, invocation.tail, invocation.every)
+            )
+            if needs_sample:
+                series_list = [
+                    subsample(s, head=invocation.head, tail=invocation.tail, every=invocation.every)
+                    for s in series_list
+                ]
+
+            if invocation.mode == "json":
+                output = render_trace_json(series_list)
+            elif invocation.mode == "csv":
+                output = render_csv(series_list)
+            elif invocation.mode == "table":
+                output = render_trace_table(series_list, no_color=effective_no_color)
+            else:
+                output = render_plot(
+                    series_list,
+                    width=invocation.width,
+                    height=invocation.height,
+                )


There is significant code duplication between the distribution and metrics execution paths in run_trace_command. Both paths follow the same sequence: collect series, filter by step range, subsample, and render based on the mode.

Consider refactoring this into a unified pipeline. You could define a set of functions (collect, filter, subsample, render) based on the invocation.target and then execute the pipeline once. This would improve maintainability and ensure that improvements to the tracing logic (like new filters or sampling methods) are automatically applied to both metrics and distributions.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 589f405094

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-26T09:05:57Z

            "             Repo defaults to the current directory.",
            "             Short run hashes in the expression are transparently expanded.",
            "             Example: aimx trace \"metric.name=='loss'\" --repo data --steps 100:500",
+            "             Example: aimx trace distribution \"distribution.name=='weights'\" --repo data --json",


Use distributions in trace-distribution expressions

This example advertises distribution.name, but Aim’s Repo.query_distributions(...) binds the sequence variable as distributions (plural). When users copy this command, the query variable is undefined and the new trace path reports no matches, making the feature look broken even when data exists. Update the expression to distributions.name == 'weights' (and align related docs/tests) or add alias rewriting before executing the query.

Useful? React with 👍 / 👎.

blizhan · 2026-04-26T09:43:53Z

@codex 审核代码

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 589f405094

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-26T09:48:28Z

+        query_result = repo.query_distributions(
+            expression, report_mode=QueryReportMode.DISABLED


Normalize distribution query variable before execution

collect_distribution_series forwards the user expression directly to Repo.query_distributions, but Aim’s distribution sequence variable is distributions (see aim.sdk.sequences.distribution_sequence.Distributions.sequence_name()), while this change’s own examples/tests use distribution.name == .... In that common case, Aim treats distribution as undefined and the query silently yields no matches, so aimx trace distribution reports “No matching distributions found” even when data exists. Please rewrite/alias the singular form before calling query_distributions (or reject it with a clear error) so documented expressions actually return results.

Useful? React with 👍 / 👎.

Add trace distribution support with tensor rendering

589f405

blizhan added the codex label Apr 26, 2026 — with ChatGPT Codex Connector

gemini-code-assist Bot reviewed Apr 26, 2026

View reviewed changes

chatgpt-codex-connector Bot reviewed Apr 26, 2026

View reviewed changes

blizhan merged commit 589f405 into main Apr 30, 2026
1 check passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add `aimx trace distribution` with tensor (histogram) output modes#15

Add `aimx trace distribution` with tensor (histogram) output modes#15
blizhan merged 1 commit into
mainfrom
codex/github-mention-feature-request-trace-distribution-support

blizhan commented Apr 26, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Apr 26, 2026

Uh oh!

gemini-code-assist Bot Apr 26, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot Apr 26, 2026

Uh oh!

blizhan commented Apr 26, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot Apr 26, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		query_result = repo.query_distributions(
		expression, report_mode=QueryReportMode.DISABLED

Conversation

blizhan commented Apr 26, 2026

Motivation

Description

Testing

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Apr 26, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 26, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 26, 2026

Choose a reason for hiding this comment

Uh oh!

blizhan commented Apr 26, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 26, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant