[visualizer] feat: add memory timeline HTML visualizer #70

Open
tifning wants to merge 3 commits into verl-project:main from tifning:main
@tifning tifning commented Jun 7, 2026

Contributor

What does this PR do?

Adds a Plotly-based interactive memory timeline visualizer (memory_html) that renders
NPU memory allocation events as an interactive Gantt chart. Key features:

  • Dual-chart layout: Chart1 shows global memory usage trend (line + area fill); Chart2
    shows per-operator Gantt bars, color-coded by operator name with bidirectional zoom sync.
  • Time-window segmentation: Large datasets (100k+ events) are split into at most 20
    time-window segments, each generating a standalone HTML+JS file pair with Prev/Next
    navigation. Chart1 data (global timeline) is shared across all segments.
  • Segment smart hint: When the user zooms Chart1 into a different time range, a "Best:
    Seg N →" hint appears with a clickable link to the matching segment file.
  • Overlap detection: When multiple bars of the same operator stack at the same time
    point, the hover tooltip and detail panel both list all overlapping events with size and
    timing info. Click any row in the detail panel to navigate between overlapping events.
  • Call stack display: call_stack field is stored per event and shown in the detail
    panel on bar click (empty call stacks display as (empty) gracefully).
  • Absolute time display: Chart1 x-axis tick labels and hover tooltips show absolute
    timestamps, while internal data uses relative times for compact JSON output.
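
The segmentation and "Best: Seg N" hint described above can be illustrated with a minimal sketch. It is not the PR's implementation: the function names, the `events_per_segment` sizing rule, and the "largest overlap wins" choice are assumptions made for illustration. Only the cap of 20 segments comes from the description.

```python
from typing import List, Tuple

MAX_SEGMENTS = 20  # cap stated in the PR description


def split_time_windows(t_min: float, t_max: float, n_events: int,
                       events_per_segment: int = 5000) -> List[Tuple[float, float]]:
    """Split [t_min, t_max] into equal-width windows, never more than MAX_SEGMENTS."""
    # Hypothetical sizing rule: one window per `events_per_segment` events.
    wanted = -(-n_events // events_per_segment)  # ceiling division
    n = max(1, min(MAX_SEGMENTS, wanted))
    width = (t_max - t_min) / n
    return [(t_min + i * width, t_min + (i + 1) * width) for i in range(n)]


def best_segment(windows: List[Tuple[float, float]],
                 zoom_start: float, zoom_end: float) -> int:
    """Return the index of the window that overlaps the zoom range the most."""
    def overlap(w: Tuple[float, float]) -> float:
        return max(0.0, min(w[1], zoom_end) - max(w[0], zoom_start))

    # On ties, max() keeps the earliest window.
    return max(range(len(windows)), key=lambda i: overlap(windows[i]))


windows = split_time_windows(0.0, 100.0, n_events=100_000)
hint = best_segment(windows, zoom_start=12.0, zoom_end=14.0)
```

With these inputs the timeline is cut into 20 windows of width 5, and a zoom to [12, 14] maps to the window covering [10, 15].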

Checklist Before Starting

  • Search for similar PRs. Paste at least one query link here: ...
  • Format the PR title as [{modules}] {type}: {description} (This will be checked by the CI)
    • {modules} include pipeline, parser, visualizer, data, deployment, perf, algo, env, doc, cfg, ci, misc
    • If this PR involves multiple modules, separate them with , like [mstx, ci]
    • {type} is in feat, fix, refactor, chore, test
    • If this PR breaks any API (CLI arguments, config, function signature, etc.), add [BREAKING] to the beginning of the title.
    • Example: [BREAKING][mstx, torch_profile] feat: support timeline parsing

Test

python -m rl_insight.main input.path=xx timeline.parser.type=memory timeline.visualizer.type=memory_html output.path=xx

For changes that can not be tested by CI (e.g., algorithm implementation, new model support), validate by experiment(s) and show results like training curve plots, evaluation results, etc.

API and Usage Example

Demonstrate how the API changes if any, and provide usage example(s) if possible.

# Add code snippet or script demonstrating how to use this

Design & Code Changes

New files:

  • rl_insight/visualizer/memory_visualizer.py: MemoryVisualizer class registered as
    memory_html cluster visualizer. Handles data preprocessing, segment splitting,
    JSON generation, and HTML template injection.
  • rl_insight/visualizer/memory_template.html — Plotly-based HTML template (~3 KB)
    with all rendering logic in vanilla JS. Data lives in a separate detail_data.js file.

Modified files:

  • rl_insight/visualizer/__init__.py — Added from .memory_visualizer import MemoryVisualizer
    to register the memory_html visualizer.

Key design decisions:

  • Data arrays are stored in external .js files (not inlined in HTML), keeping the HTML
    template lightweight and cacheable.
  • call_stack strings use a pool+index pattern (CS_POOL / CS_IDX) for deduplication,
    minimizing JSON output size.
  • Operator bars are grouped into a single Plotly trace per operator using x:[] and
    base:[] arrays, reducing trace count from O(events) to O(unique_operators).
  • Segment overlap uses start + duration > seg_start AND start < seg_end so events
    spanning segment boundaries appear in both files.
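
Two of the decisions above, the call-stack pool+index pattern and the segment overlap test, can be sketched as follows. The helper names are hypothetical. Only the predicate `start + duration > seg_start AND start < seg_end` and the idea of a string pool with per-event indices (CS_POOL / CS_IDX) come from the description.

```python
from typing import Dict, List, Tuple


def build_call_stack_pool(call_stacks: List[str]) -> Tuple[List[str], List[int]]:
    """Deduplicate call-stack strings into a pool plus one pool index per event."""
    pool: List[str] = []            # unique strings, in first-seen order
    index_of: Dict[str, int] = {}   # string -> position in pool
    idx: List[int] = []             # one entry per event
    for cs in call_stacks:
        if cs not in index_of:
            index_of[cs] = len(pool)
            pool.append(cs)
        idx.append(index_of[cs])
    return pool, idx


def in_segment(start: float, duration: float,
               seg_start: float, seg_end: float) -> bool:
    """True if the event's interval intersects the segment's time window."""
    return start + duration > seg_start and start < seg_end


pool, idx = build_call_stack_pool(["a.py:1", "b.py:2", "a.py:1", ""])
# An event covering [8, 12) crosses the boundary at t=10, so it is kept in both segments.
in_left = in_segment(8, 4, seg_start=0, seg_end=10)
in_right = in_segment(8, 4, seg_start=10, seg_end=20)
```

Each repeated call stack is stored once in `pool`, and every event carries only a small integer, which is what keeps the emitted data compact when many events share a stack.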

Checklist Before Submitting

Important

Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.

- Plotly-based interactive Gantt chart with operator grouping
- Time-window segmentation for large datasets (max 20 segments)
- Chart1 memory trend line + Chart2 operator Gantt synchronized zoom
- Segment navigation with smart hint (suggest best segment for range)
- Call stack display in detail panel on bar click
- Overlap detection: hover and click show all stacked bars per operator
- Absolute time display on x-axis tick labels and hover tooltip
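
As an illustration of the overlap detection listed above, the following sketch collects every bar of one operator that is live at a given time point. It is written in Python for consistency with the rest of the package. The `Event` fields and the half-open interval convention are assumptions, and the actual lookup in the PR lives in the template's JavaScript.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Event:
    op: str          # operator name
    start: float     # relative start time
    duration: float
    size: int        # allocated bytes


def overlapping_events(events: List[Event], op: str, t: float) -> List[Event]:
    """Return all events of operator `op` whose interval [start, start + duration) contains t."""
    return [e for e in events
            if e.op == op and e.start <= t < e.start + e.duration]


events = [
    Event("matmul", 0.0, 10.0, 1024),
    Event("matmul", 5.0, 10.0, 2048),
    Event("add", 6.0, 1.0, 512),
]
stacked = overlapping_events(events, "matmul", t=7.0)
```

Here both `matmul` events are returned for `t=7.0`, which is the list a tooltip or detail panel would show for the stacked bars.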

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request introduces a new MemoryVisualizer and an accompanying HTML template to generate interactive memory allocation timelines and operator Gantt charts. The feedback focuses on critical performance optimizations and a potential bug fix: replacing an $O(N \times M)$ nested loop in _build_chart1_data with a sweep-line (two-pointer) approach, vectorizing or optimizing DataFrame iterations to avoid slow iterrows() calls, and handling relative paths correctly in directory creation to prevent a FileNotFoundError.
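
A minimal sketch of the two suggestions, under stated assumptions: events are taken to be `(start, duration, size)` tuples, and the sweep shown here is the sort-then-accumulate variant, costing O(N log N) instead of a nested loop. The function names are hypothetical and the real `_build_chart1_data` may differ. The directory helper relies on the fact that `os.path.dirname` returns an empty string for a bare filename, and `os.makedirs("")` raises `FileNotFoundError`.

```python
import os
from typing import List, Tuple


def memory_timeline(events: List[Tuple[float, float, int]]) -> List[Tuple[float, int]]:
    """Return (time, live_bytes) at every change point using a single sweep."""
    deltas = []
    for start, duration, size in events:
        deltas.append((start, size))               # allocation
        deltas.append((start + duration, -size))   # release
    deltas.sort()  # by time; at equal times releases (negative) come first

    timeline: List[Tuple[float, int]] = []
    live = 0
    for t, delta in deltas:
        live += delta
        if timeline and timeline[-1][0] == t:
            timeline[-1] = (t, live)   # merge simultaneous changes into one point
        else:
            timeline.append((t, live))
    return timeline


def ensure_parent_dir(path: str) -> None:
    """Create the parent directory of `path`, tolerating bare relative filenames."""
    parent = os.path.dirname(path)
    if parent:  # empty for a bare filename such as "out.html"
        os.makedirs(parent, exist_ok=True)


trend = memory_timeline([(0, 10, 100), (5, 10, 50)])
```

For the two sample events, `trend` rises to 150 while both allocations are live and returns to 0 after the second release.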

Comment thread recipe/visualizer/memory_visualizer.py
Comment thread rl_insight/visualizer/memory_visualizer.py Outdated
Comment thread rl_insight/visualizer/memory_visualizer.py Outdated
Comment thread rl_insight/visualizer/memory_visualizer.py Outdated
tardis-key commented Jun 8, 2026

Collaborator

As a complete feature, we also need the data (/data), the data checker (/rl_insight/data/data_checker.py), the data documentation (/docs/data), the examples (/examples), and an e2e test (tests/special_e2e).

Comment thread rl_insight/data/data_checker.py Outdated
Comment thread rl_insight/visualizer/memory_visualizer.py Outdated
Comment thread rl_insight/data/data_checker.py Outdated

3 participants