USHIFT-7312: Add interactive per-VM PCP performance dashboard skill#210
Conversation
|
Skipping CI for Draft Pull Request. |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: suleymanakbas91 The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds a MicroShift CI PCP dashboard workflow that extracts per-scenario metrics and metadata, generates a self-contained HTML dashboard, and documents the skill entrypoint. It also updates the skill registration symlink and plugin version metadata. ChangesMicroShift CI PCP dashboard pipeline
Estimated code review effort: 4 (Complex) | ~50 minutes Possibly related PRs
Suggested labels: Suggested reviewers: 🚥 Pre-merge checks | ✅ 9 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (9 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 9
🧹 Nitpick comments (2)
plugins/microshift-ci/skills/pcp-dashboard/SKILL.md (2)
103-107: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winMove failure handling next to the step it affects.
Per the skill-writing guidance, keep fallback and edge-case rules inline with the relevant step so the model reads them in sequence. The
pcp2jsonfallback belongs with Step 2, while the “no tarballs” and “skip scenario” behaviors belong with the input/discovery and extraction steps.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugins/microshift-ci/skills/pcp-dashboard/SKILL.md` around lines 103 - 107, The error-handling guidance in SKILL.md is grouped too broadly, so move each fallback rule next to the step it applies to: place the `pcp2json` container fallback in the Step 2 section, and move the “no PCP tarballs found” and “skip failed scenario” behaviors into the input/discovery and scenario extraction sections respectively. Update the relevant step headings or bullet lists in SKILL.md so the rules appear inline with the workflow, using the existing `pcp2json`, tarball discovery, and scenario extraction sections as anchors.Source: Learnings
6-6: 🔒 Security & Privacy | 🔵 Trivial | ⚡ Quick winDrop
WebFetchunless the workflow באמת needs it.The documented steps only use local shell actions plus
Read/Write;WebFetchadds network access to a user-invocable skill without a stated use case.Suggested frontmatter trim
-allowed-tools: Bash, Read, Write, WebFetch +allowed-tools: Bash, Read, Write🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugins/microshift-ci/skills/pcp-dashboard/SKILL.md` at line 6, The skill frontmatter currently includes WebFetch even though the documented workflow only uses local shell actions plus Read and Write. Update the allowed-tools entry in SKILL.md to remove WebFetch, keeping only the tools actually needed, and ensure the skill metadata stays aligned with the workflow described in the rest of the file.Source: Path instructions
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@plugins/microshift-ci/scripts/pcp-graphs/create-pcp-dashboard.py`:
- Around line 141-148: The dashboard rendering in create-pcp-dashboard.py is
building artifact-derived content with innerHTML, which can turn vm_hosts, mount
points, and device labels into executable markup. Update the info card and
related rendering paths (the code that uses infoCard in this block, plus the
similar sections referenced later in the file) to construct DOM nodes with
createElement() and set textContent for all dynamic values instead of
concatenating strings into innerHTML.
- Around line 376-380: The HTML assembly in build_html currently injects
data_json, scenarios_json, and timezone directly into inline HTML/JavaScript, so
update build_html and the related rendering paths that use JS_APP and the
template placeholders to use context-aware escaping instead of raw string
replacement. Embed JSON payloads in a safe script container or otherwise ensure
they are not interpreted as executable script, and HTML-escape the timezone
label before inserting it into the document. Also review the downstream template
sections referenced by build_html and the scenario/chart rendering helpers to
apply the same escaping consistently anywhere CI-derived values are rendered.
In `@plugins/microshift-ci/scripts/pcp-graphs/extract_scenarios.py`:
- Around line 83-91: The output directory creation in extract_scenarios.py
should handle bare filenames passed via the --output argument, because
os.path.dirname(output) can be empty for values like scenarios.json. Update the
logic around args.output and the os.makedirs call so it only creates a directory
when a parent path exists, while still preserving the default scenarios.json
path under args.workdir in the main flow.
- Around line 36-40: The JUnit parsing in parse_junit currently uses ET.parse on
artifact-supplied XML, which is unsafe for untrusted input and can exhaust
resources before the exception handling applies. Update parse_junit to use
defusedxml.ElementTree instead of xml.etree.ElementTree, keeping the same
parsing flow and exception handling around the tree creation. Make sure the
change is localized to the parse_junit function and any related ElementTree
import so all JUnit artifact parsing goes through the safe parser.
In `@plugins/microshift-ci/scripts/pcp-graphs/generate-dashboard.sh`:
- Line 1: The script header in generate-dashboard.sh uses the wrong Bash
shebang; update the file’s shebang to the repo-required interpreter path so it
matches the shell contract. Make the change at the top of the script and keep it
consistent with the project guidance from CONTRIBUTING.md, using the same script
entry point identifier for generate-dashboard.sh.
- Around line 37-39: Validate the --parallel option in generate-dashboard.sh
before assigning PARALLEL: the current argument handling in the --parallel case
accepts 0 or non-numeric values, which later breaks the native-mode throttle
loop and can fail under set -e. Update the option parsing near the --parallel
branch (and any reused validation at the later matching block) to allow only
positive integer values via an allow-list check, and reject anything else with a
clear error before continuing.
- Around line 84-90: The container image probe in generate-dashboard.sh is
Podman-specific and fails to detect existing images when CONTAINER_RT is Docker,
causing unnecessary rebuilds; update the check in the image-availability logic
near the ${CONTAINER_RT} image exists call to use a Docker-compatible inspection
path such as docker image inspect (or a runtime-appropriate equivalent) before
falling back to the build step. Keep the existing control flow in the container
setup section so the script still reuses CONTAINER_IMAGE when present and only
invokes ${CONTAINER_RT} build when the image is missing.
- Around line 166-168: The cleanup trap in process_tarball() is using RETURN,
which causes tmp_dir to be removed when nested helpers like extract_metric()
return instead of waiting for the whole function to finish. Update the trap in
process_tarball() to use a scope that only cleans up after process_tarball()
completes, and keep the temporary archive tree available for all metric
extractions; this should also satisfy shellcheck expectations for the shell
script.
In `@plugins/microshift-ci/skills/pcp-dashboard/SKILL.md`:
- Line 2: The skill frontmatter identifier is inconsistent with its parent
directory, so update the `name` in `SKILL.md` to match the directory name used
for `plugins/microshift-ci/skills/pcp-dashboard`, or rename the
directory/frontmatter pair so they align. Check the `name` field in the skill
metadata and ensure it follows the `SKILL-GUIDELINES.md` requirement that the
skill name matches the containing directory.
---
Nitpick comments:
In `@plugins/microshift-ci/skills/pcp-dashboard/SKILL.md`:
- Around line 103-107: The error-handling guidance in SKILL.md is grouped too
broadly, so move each fallback rule next to the step it applies to: place the
`pcp2json` container fallback in the Step 2 section, and move the “no PCP
tarballs found” and “skip failed scenario” behaviors into the input/discovery
and scenario extraction sections respectively. Update the relevant step headings
or bullet lists in SKILL.md so the rules appear inline with the workflow, using
the existing `pcp2json`, tarball discovery, and scenario extraction sections as
anchors.
- Line 6: The skill frontmatter currently includes WebFetch even though the
documented workflow only uses local shell actions plus Read and Write. Update
the allowed-tools entry in SKILL.md to remove WebFetch, keeping only the tools
actually needed, and ensure the skill metadata stays aligned with the workflow
described in the rest of the file.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 6310164c-d5af-4f7d-be21-d422a0226679
⛔ Files ignored due to path filters (1)
plugins/microshift-ci/scripts/pcp-graphs/vendor/chart.umd.min.jsis excluded by!**/*.min.js,!**/vendor/**
📒 Files selected for processing (5)
.claude/skills/microshift-ci-pcp-dashboardplugins/microshift-ci/scripts/pcp-graphs/create-pcp-dashboard.pyplugins/microshift-ci/scripts/pcp-graphs/extract_scenarios.pyplugins/microshift-ci/scripts/pcp-graphs/generate-dashboard.shplugins/microshift-ci/skills/pcp-dashboard/SKILL.md
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
plugins/microshift-ci/scripts/pcp-graphs/extract_scenarios.py (1)
55-59: 🩺 Stability & Availability | 🟠 Major | ⚡ Quick winHandle malformed numeric suite attributes without aborting the whole extraction.
A JUnit file with non-numeric
tests/failures/errors/skipped/timeattributes will raise here and stop the entire dashboard build instead of degrading just that scenario to"unknown". Per CONTRIBUTING.md: “Validate at trust boundaries with allow-lists, not deny-lists.” Keep these coercions inside aValueError/TypeErrorguard and returnNoneon bad artifact data. As per path instructions, “Review all changes against the guidelines in CONTRIBUTING.md” and “Validate at trust boundaries with allow-lists, not deny-lists.”Proposed fix
- tests = int(suite.get("tests", "0")) - failures = int(suite.get("failures", "0")) - errors = int(suite.get("errors", "0")) - skipped = int(suite.get("skipped", "0")) - time_sec = float(suite.get("time", "0")) - timestamp = suite.get("timestamp", "") + try: + tests = int(suite.get("tests", "0")) + failures = int(suite.get("failures", "0")) + errors = int(suite.get("errors", "0")) + skipped = int(suite.get("skipped", "0")) + time_sec = float(suite.get("time", "0")) + except (TypeError, ValueError): + return None + timestamp = suite.get("timestamp", "")🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugins/microshift-ci/scripts/pcp-graphs/extract_scenarios.py` around lines 55 - 59, The suite attribute coercions in the scenario parser can abort extraction when `tests`, `failures`, `errors`, `skipped`, or `time` contain malformed values. Update the parsing logic in `extract_scenarios.py` so the `tests`/`failures`/`errors`/`skipped`/`time_sec` conversions are wrapped in a `ValueError`/`TypeError` guard, and have the scenario helper return `None` when bad artifact data is encountered. Keep the existing `suite` parsing flow intact and ensure only the affected scenario is dropped while the rest of the dashboard build continues.Source: Path instructions
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@plugins/microshift-ci/scripts/pcp-graphs/extract_scenarios.py`:
- Around line 55-59: The suite attribute coercions in the scenario parser can
abort extraction when `tests`, `failures`, `errors`, `skipped`, or `time`
contain malformed values. Update the parsing logic in `extract_scenarios.py` so
the `tests`/`failures`/`errors`/`skipped`/`time_sec` conversions are wrapped in
a `ValueError`/`TypeError` guard, and have the scenario helper return `None`
when bad artifact data is encountered. Keep the existing `suite` parsing flow
intact and ensure only the affected scenario is dropped while the rest of the
dashboard build continues.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: a0334d17-1bd9-4899-a2f0-7edae0823d0c
📒 Files selected for processing (4)
plugins/microshift-ci/scripts/pcp-graphs/create-pcp-dashboard.pyplugins/microshift-ci/scripts/pcp-graphs/extract_scenarios.pyplugins/microshift-ci/scripts/pcp-graphs/generate-dashboard.shplugins/microshift-ci/skills/pcp-dashboard/SKILL.md
✅ Files skipped from review due to trivial changes (1)
- plugins/microshift-ci/skills/pcp-dashboard/SKILL.md
🚧 Files skipped from review as they are similar to previous changes (2)
- plugins/microshift-ci/scripts/pcp-graphs/create-pcp-dashboard.py
- plugins/microshift-ci/scripts/pcp-graphs/generate-dashboard.sh
New standalone skill (microshift-ci:pcp-dashboard) that generates a self-contained HTML dashboard from per-VM PCP archives in MicroShift CI artifacts. Accepts either a local workdir path or a Prow job URL. Reuses existing extract/parse scripts from CI doctor without modifying the doctor workflow. Includes container fallback for pcp2json on macOS via podman/docker. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix innerHTML injection: use createElement/textContent for all artifact-derived values (info cards, stats rows, empty state) - Escape JSON payloads via <script type="application/json"> to prevent script breakout; HTML-escape timezone label - Use /usr/bin/bash shebang per repo convention - Validate --parallel as positive integer - Use `image inspect` instead of Podman-specific `image exists` - Replace trap RETURN with explicit cleanup calls - Redact VM hostnames and full paths from log messages - Guard os.makedirs for bare --output filenames - Add size-limited XML parsing for JUnit files - Drop unused WebFetch from skill allowed-tools - Move error handling inline with relevant SKILL.md steps - Add missing docstrings Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New pcp-dashboard skill added. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
b701c34 to
2670cef
Compare
There was a problem hiding this comment.
Actionable comments posted: 4
♻️ Duplicate comments (1)
plugins/microshift-ci/scripts/pcp-graphs/extract_scenarios.py (1)
15-15: 🔒 Security & Privacy | 🟠 Major | ⚡ Quick winSwitch JUnit parsing to
defusedxml.The
MAX_XML_SIZEguard limits file size but does not stop internal entity-expansion (billion-laughs) attacks, which use tiny inputs.ET.fromstring()on artifact-supplied XML remains unsafe, and ruff (S314) still fails on Line 47 — violating "Python code must pass ruff." Per CONTRIBUTING.md: "Validate at trust boundaries with allow-lists, not deny-lists." Parse viadefusedxml.ElementTreeto fully neutralize the class.🔒 Proposed fix
-import xml.etree.ElementTree as ET +import defusedxml.ElementTree as ET +import xml.etree.ElementTree as StdET- except (ET.ParseError, FileNotFoundError, OSError): + except (StdET.ParseError, FileNotFoundError, OSError):Please confirm
defusedxmlis an available/declared dependency for this plugin's runtime.defusedxml ElementTree fromstring usage and ParseError exceptionAlso applies to: 47-47
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugins/microshift-ci/scripts/pcp-graphs/extract_scenarios.py` at line 15, The JUnit XML parsing in extract_scenarios.py still uses xml.etree.ElementTree and ET.fromstring, which leaves artifact-supplied XML vulnerable to entity-expansion attacks and triggers ruff S314. Switch the parsing/imports in the JUnit handling path to defusedxml.ElementTree, keeping the existing MAX_XML_SIZE check as a secondary guard, and update the ParseError handling to use the defusedxml exception types so the functions around ET.fromstring and the JUnit parsing logic remain safe.Sources: Path instructions, Linters/SAST tools
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@plugins/microshift-ci/scripts/pcp-graphs/create-pcp-dashboard.py`:
- Around line 397-399: Add test coverage for escape_json_for_script in the PCP
dashboard script to validate both breakout neutralization and normal behavior.
Create positive and negative cases that assert dangerous payloads containing
</script> and <!-- are escaped by escape_json_for_script, and add a benign JSON
case that still parses correctly with JSON.parse after escaping. Use the
escape_json_for_script helper as the main symbol to target in the new tests.
In `@plugins/microshift-ci/scripts/pcp-graphs/extract_scenarios.py`:
- Around line 39-69: Add tests for parse_junit to cover both success and failure
paths, since it drives status parsing from XML. In the extract_scenarios.py test
coverage, verify a passing testsuite, suites with failures/errors, oversized
input blocked by MAX_XML_SIZE, malformed or missing XML returning None, and XML
with no testsuite element returning None; use parse_junit and MAX_XML_SIZE as
the main symbols to target.
- Line 26: The os.walk loop in extract_scenarios.py has an unused third loop
variable that Ruff flags with B007. Update the loop around os.walk(build_dir) to
rename the unused variable to a conventional placeholder (for example, _) while
keeping the existing root and dirs handling unchanged, so the extract_scenarios
logic still works and passes lint.
- Around line 55-60: The numeric attribute parsing in extract_scenarios.py is
not protected by the existing error handling, so malformed junit.xml values can
crash the whole extraction. Update the scenario parsing logic around the suite
attribute reads in the extraction function to safely handle non-numeric
tests/failures/errors/skipped/time values, using the existing try/except flow or
a small conversion helper so bad input is skipped or defaulted instead of
raising ValueError. Keep the fix localized to the code that reads suite
attributes and builds the scenario data.
---
Duplicate comments:
In `@plugins/microshift-ci/scripts/pcp-graphs/extract_scenarios.py`:
- Line 15: The JUnit XML parsing in extract_scenarios.py still uses
xml.etree.ElementTree and ET.fromstring, which leaves artifact-supplied XML
vulnerable to entity-expansion attacks and triggers ruff S314. Switch the
parsing/imports in the JUnit handling path to defusedxml.ElementTree, keeping
the existing MAX_XML_SIZE check as a secondary guard, and update the ParseError
handling to use the defusedxml exception types so the functions around
ET.fromstring and the JUnit parsing logic remain safe.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: d20a788a-d0d6-40c9-8ce4-4138dfaafe5e
⛔ Files ignored due to path filters (1)
plugins/microshift-ci/scripts/pcp-graphs/vendor/chart.umd.min.jsis excluded by!**/*.min.js,!**/vendor/**
📒 Files selected for processing (7)
.claude-plugin/marketplace.json.claude/skills/microshift-ci-pcp-dashboardplugins/microshift-ci/.claude-plugin/plugin.jsonplugins/microshift-ci/scripts/pcp-graphs/create-pcp-dashboard.pyplugins/microshift-ci/scripts/pcp-graphs/extract_scenarios.pyplugins/microshift-ci/scripts/pcp-graphs/generate-dashboard.shplugins/microshift-ci/skills/pcp-dashboard/SKILL.md
✅ Files skipped from review due to trivial changes (4)
- plugins/microshift-ci/.claude-plugin/plugin.json
- .claude-plugin/marketplace.json
- .claude/skills/microshift-ci-pcp-dashboard
- plugins/microshift-ci/skills/pcp-dashboard/SKILL.md
🚧 Files skipped from review as they are similar to previous changes (1)
- plugins/microshift-ci/scripts/pcp-graphs/generate-dashboard.sh
- Wrap int()/float() conversions in safe helpers to handle malformed JUnit XML attributes without crashing the whole extraction - Rename unused `files` loop variable to `_files` (ruff B007) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Skill symlinks in .claude/skills/ are no longer created manually. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Rename pcp-dashboard → job-pcp-dashboard to clarify per-job scope - Replace --workdir with --url in generate-dashboard.sh: parses Prow URL, downloads artifacts via gsutil, auto-computes workdir - Simplify SKILL.md from 130 to 33 lines by pushing all logic into the script Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Remove docker from container runtime detection — podman only - Note platform-specific open command (open/xdg-open) in skill Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Discover and process hypervisor PCP archives from infra-pmlogs alongside per-VM scenario archives. In container mode, copy archives to $TMPDIR so the podman VM can mount them; in native mode, read directly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add --url single-job mode to download-jobs.sh. Refactor generate-dashboard.sh to delegate URL parsing and artifact download to the shared script instead of reimplementing both. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
/lgtm |
Summary
microshift-ci:job-pcp-dashboard) that generates a self-contained interactive HTML dashboard from per-VM PCP archives in MicroShift CI artifactspcp2jsonon macOS via podman auto-built imagePreview
Test plan
2070356586173304832— 12/14 scenarios processed, 463KB HTML generated2067819910142103552— 12/14 scenarios, 459KB HTML, confirming generalitypcp2jsonis unavailable🤖 Generated with Claude Code