update the doc, script and workflow#47
Conversation
|
Ready to act? Review this PR in Change Stack to turn feedback into patch suggestions you can inspect and refine. No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughThe PR switches the pipeline runner to module invocation ( ChangesPipeline Module Execution Migration
Possibly Related PRs
Suggested Labels
Estimated Code Review Effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
.github/workflows/theseus-engine.yml (1)
152-156:⚠️ Potential issue | 🟠 Major | ⚡ Quick winCopy logic drops
.statusmarkers in the orphan-fix path.
cp -r data/*skips hidden directories, sodata/.statuscan be lost during branch ancestry repair. That can make the later status-marker check report no data and skip PR updates.Copy hidden and non-hidden data safely
- cp -r data/* "$SAVE_DIR"/ 2>/dev/null || true + cp -a data/. "$SAVE_DIR"/ 2>/dev/null || true @@ - cp -r "$SAVE_DIR"/* data/ 2>/dev/null || true + cp -a "$SAVE_DIR"/. data/ 2>/dev/null || trueAlso applies to: 167-175
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/theseus-engine.yml around lines 152 - 156, The copy commands using "cp -r data/*" and "cp -r \"$SAVE_DIR\"/*" drop hidden files like data/.status; change those copies to preserve hidden entries (e.g., use a form that copies the directory contents including dotfiles such as "cp -a data/. \"$SAVE_DIR\"/" and the reverse for "$SAVE_DIR" to data/) so data/.status is never lost; update the two occurrences around the SAVE_DIR operations and keep the mkdir -p and rm -rf logic unchanged.scripts/run_pipeline.py (1)
45-169:⚠️ Potential issue | 🟠 Major | ⚡ Quick winResolve the blocking pylint errors in this file.
The current CI output shows
too-many-locals,too-many-branches,too-many-statements, andimport-outside-toplevelerrors here, which blocks merge. Please either split orchestration into stage helpers + top-level imports, or add explicit targeted pylint disables where laziness is intentional.Suggested minimal unblock (targeted disables)
-def run_pipeline( +def run_pipeline( # pylint: disable=too-many-locals,too-many-branches,too-many-statements @@ - from scripts.analyse_repository import ( + from scripts.analyse_repository import ( # pylint: disable=import-outside-toplevel process_repository, ) @@ - from scripts.add_fossils import backfill_fossils, update_survivor_fossils + from scripts.add_fossils import ( # pylint: disable=import-outside-toplevel + backfill_fossils, + update_survivor_fossils, + ) @@ - import argparse + import argparse # pylint: disable=import-outside-toplevel🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/run_pipeline.py` around lines 45 - 169, The run_pipeline function currently triggers pylint errors (too-many-locals, too-many-branches, too-many-statements, import-outside-toplevel); fix by either extracting stage helpers or adding targeted disables: add a module- or function-level pylint disable for the orchestration checks (e.g. on the run_pipeline def add "# pylint: disable=too-many-locals,too-many-branches,too-many-statements,import-outside-toplevel") and/or move the dynamic imports (process_repository, backfill_fossils, update_survivor_fossils) to top-level imports so import-outside-toplevel is resolved; ensure references remain to run_pipeline, process_repository, backfill_fossils, update_survivor_fossils and run_cleanup so callers are unaffected.Source: Pipeline failures
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/theseus-engine.yml:
- Line 55: The workflow is vulnerable because it interpolates matrix.repo
directly into the run line and the create-pr step uses cp -r data/* which omits
dotfiles like data/.status; update the pipeline invocation to pass matrix.repo
via environment (e.g., REPO_NAME) and reference $REPO_NAME in the run command
(so the value is not subject to shell expansion) and change the create-pr copy
logic to preserve dotfiles and metadata (replace cp -r data/* "$SAVE_DIR"/ with
a dotfile-aware copy such as cp -a data/. "$SAVE_DIR"/ or use rsync -a) so
data/.status is retained for the “Check for status markers” step.
---
Outside diff comments:
In @.github/workflows/theseus-engine.yml:
- Around line 152-156: The copy commands using "cp -r data/*" and "cp -r
\"$SAVE_DIR\"/*" drop hidden files like data/.status; change those copies to
preserve hidden entries (e.g., use a form that copies the directory contents
including dotfiles such as "cp -a data/. \"$SAVE_DIR\"/" and the reverse for
"$SAVE_DIR" to data/) so data/.status is never lost; update the two occurrences
around the SAVE_DIR operations and keep the mkdir -p and rm -rf logic unchanged.
In `@scripts/run_pipeline.py`:
- Around line 45-169: The run_pipeline function currently triggers pylint errors
(too-many-locals, too-many-branches, too-many-statements,
import-outside-toplevel); fix by either extracting stage helpers or adding
targeted disables: add a module- or function-level pylint disable for the
orchestration checks (e.g. on the run_pipeline def add "# pylint:
disable=too-many-locals,too-many-branches,too-many-statements,import-outside-toplevel")
and/or move the dynamic imports (process_repository, backfill_fossils,
update_survivor_fossils) to top-level imports so import-outside-toplevel is
resolved; ensure references remain to run_pipeline, process_repository,
backfill_fossils, update_survivor_fossils and run_cleanup so callers are
unaffected.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 40f8a39c-8c5c-4bec-b086-551fddecf03a
📒 Files selected for processing (3)
.github/workflows/theseus-engine.ymldocs/CONFIGURATION.mdscripts/run_pipeline.py
- Pass matrix.repo via REPO_NAME env var instead of inline template interpolation for defense-in-depth (no shell expansion risk) - Replace cp -r data/* with cp -a data/. to preserve dotfiles like data/.status during orphaned-branch ancestry fix
|
@coderabbitai review |
✅ Action performedReview finished.
|
Summary by CodeRabbit