
update the doc, script and workflow #47

Merged
Asifdotexe merged 2 commits into main from 42-fix-action-failure-issue on Jun 6, 2026

Conversation

@Asifdotexe Asifdotexe commented Jun 6, 2026

Owner

Summary by CodeRabbit

  • Chores
    • Switched pipeline execution to run via Poetry/module invocation and updated docs to match.
  • Bug Fixes
    • Improved automated branch-repair flow to preserve and restore workspace data, reducing risk of accidental data removal during PR creation.

@Asifdotexe Asifdotexe self-assigned this Jun 6, 2026

coderabbitai Bot commented Jun 6, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: aefd455b-c003-40a5-ab1d-d8d855152bdd

📥 Commits

Reviewing files that changed from the base of the PR and between 71cc56b and 044f1b4.

📒 Files selected for processing (1)
  • .github/workflows/theseus-engine.yml
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/theseus-engine.yml

📝 Walkthrough

Walkthrough

The PR switches the pipeline runner to module invocation (python -m scripts.run_pipeline), rewires imports to scripts.* with a path guard, updates the workflow and docs to use Poetry module execution, and replaces a full git cleanup with a save/restore of data/ during shared-branch fixes.

Changes

Pipeline Module Execution Migration

| Layer / File(s) | Summary |
| --- | --- |
| Module execution imports and path handling (scripts/run_pipeline.py) | Docstring updated to document python -m scripts.run_pipeline. Top-level imports changed to scripts._path_guard, scripts.load_config, and package-qualified stage handlers; Stage 1 and Stage 2 now import from scripts.analyse_repository and scripts.add_fossils. |
| Workflow and documentation updates (.github/workflows/theseus-engine.yml, docs/CONFIGURATION.md) | analyze job now runs the pipeline with poetry run python -m scripts.run_pipeline, exposing REPO_NAME via env while preserving CLI flags. Documentation example command updated to python -m scripts.run_pipeline --repo REPO-NAME. |
| Shared branch cleanup simplification (.github/workflows/theseus-engine.yml) | Removed the git rm -rf --cached . / git clean -fdx full cleanup in the "Fix shared branch ancestry" path; instead the workflow copies data/. to a temp save, checks out origin/main, recreates data/ subdirectories, and restores saved contents. |
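The difference between script and module invocation that motivates the first row can be sketched as follows. This is a minimal illustration, not the repository's code: the package name demo_scripts and its helper module are hypothetical stand-ins for scripts, and the real scripts._path_guard is not reproduced here because its contents are not shown in this PR.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def build_demo_project(root: Path) -> None:
    """Create a throwaway package: demo_scripts/{__init__,helper,run_pipeline}.py."""
    pkg = root / "demo_scripts"  # hypothetical stand-in for the scripts package
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "helper.py").write_text("VALUE = 'ok'\n")
    # Package-qualified import, mirroring the scripts.* import style.
    (pkg / "run_pipeline.py").write_text(
        "from demo_scripts.helper import VALUE\nprint(VALUE)\n"
    )


def run(args: list[str], cwd: Path) -> subprocess.CompletedProcess:
    # Drop variables that alter sys.path so both invocations are compared fairly.
    env = {k: v for k, v in os.environ.items()
           if k not in ("PYTHONPATH", "PYTHONSAFEPATH")}
    return subprocess.run([sys.executable, *args], cwd=cwd, env=env,
                          capture_output=True, text=True, check=False)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    build_demo_project(root)
    # -m puts the working directory on sys.path, so the package resolves.
    as_module = run(["-m", "demo_scripts.run_pipeline"], root)
    # A file path puts the script's own directory on sys.path instead,
    # so the package-qualified import cannot be found.
    as_script = run(["demo_scripts/run_pipeline.py"], root)

print(as_module.returncode, as_script.returncode)
```

Under these assumptions the module form exits cleanly while the direct script form fails with ModuleNotFoundError, which is consistent with pairing package-qualified imports with python -m in both the workflow and the docs.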

Possibly Related PRs

  • Asifdotexe/Theseus#45: Overlaps on workflow create-pr shared-branch preparation and data-preservation/reset behavior.
  • Asifdotexe/Theseus#41: Related workflow changes for how the pipeline is executed via Poetry/Python in CI.
  • Asifdotexe/Theseus#46: Similar updates to workflow shared-branch logic and the scripts module import/path handling.

Suggested Labels

bug, documentation

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰
Scripts hop in as modules, neat and spry,
Poetry hums as they leap and fly,
Imports aligned, path guard at the gate,
Data tucked safe while branches await,
A rabbit's cheer for a cleaner pipeline sky.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ❓ Inconclusive | The title 'update the doc, script and workflow' is vague and generic, using non-descriptive terms that don't convey the meaningful intent of the changes. | Revise the title to be more specific about the primary change, such as 'Fix action failures by using environment variables for matrix repo and preserving dotfiles' or 'Use environment variables in workflow and update import paths for module execution'. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
.github/workflows/theseus-engine.yml (1)

152-156: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Copy logic drops .status markers in the orphan-fix path.

The shell glob data/* does not match names that begin with a dot, so cp -r data/* skips hidden entries and data/.status can be lost during branch ancestry repair. The later status-marker check can then report no data and skip PR updates.

Copy hidden and non-hidden data safely
-          cp -r data/* "$SAVE_DIR"/ 2>/dev/null || true
+          cp -a data/. "$SAVE_DIR"/ 2>/dev/null || true
@@
-          cp -r "$SAVE_DIR"/* data/ 2>/dev/null || true
+          cp -a "$SAVE_DIR"/. data/ 2>/dev/null || true

Also applies to: 167-175
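The globbing behaviour behind this finding can be reproduced outside the workflow. The sketch below uses Python's glob module, which follows the same dotfile rule as the shell, and shutil.copytree as a rough analogue of cp -a data/. "$SAVE_DIR"/. The file names repo.done and out.json are hypothetical and only serve to populate the directories.

```python
import glob
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "data"
    (data / ".status").mkdir(parents=True)
    (data / ".status" / "repo.done").write_text("marker")  # hypothetical marker file
    (data / "results").mkdir()
    (data / "results" / "out.json").write_text("{}")       # hypothetical output file

    # Like the shell glob data/*: names starting with "." are not matched.
    globbed = sorted(Path(p).name for p in glob.glob(str(data / "*")))

    # Like cp -a data/. "$SAVE_DIR"/ in spirit: copy the directory's full contents.
    save_dir = Path(tmp) / "save"
    shutil.copytree(data, save_dir)
    saved = sorted(p.name for p in save_dir.iterdir())

print(globbed)
print(saved)
```

The glob result omits .status while the whole-directory copy keeps it, which is the gap the suggested cp -a data/. form closes.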

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/theseus-engine.yml around lines 152 - 156, The copy
commands using "cp -r data/*" and "cp -r \"$SAVE_DIR\"/*" drop hidden files like
data/.status; change those copies to preserve hidden entries (e.g., use a form
that copies the directory contents including dotfiles such as "cp -a data/.
\"$SAVE_DIR\"/" and the reverse for "$SAVE_DIR" to data/) so data/.status is
never lost; update the two occurrences around the SAVE_DIR operations and keep
the mkdir -p and rm -rf logic unchanged.
scripts/run_pipeline.py (1)

45-169: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Resolve the blocking pylint errors in this file.

The current CI output shows too-many-locals, too-many-branches, too-many-statements, and import-outside-toplevel errors here, which blocks merge. Please either split orchestration into stage helpers + top-level imports, or add explicit targeted pylint disables where laziness is intentional.

Suggested minimal unblock (targeted disables)
-def run_pipeline(
+def run_pipeline(  # pylint: disable=too-many-locals,too-many-branches,too-many-statements
@@
-    from scripts.analyse_repository import (
+    from scripts.analyse_repository import (  # pylint: disable=import-outside-toplevel
         process_repository,
     )
@@
-    from scripts.add_fossils import backfill_fossils, update_survivor_fossils
+    from scripts.add_fossils import (  # pylint: disable=import-outside-toplevel
+        backfill_fossils,
+        update_survivor_fossils,
+    )
@@
-    import argparse
+    import argparse  # pylint: disable=import-outside-toplevel
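The other option mentioned above, splitting orchestration into stage helpers with top-level imports, might look roughly like the sketch below. It is illustrative only: the three stage functions are local stand-ins with invented signatures and return values, since the real ones in scripts.analyse_repository and scripts.add_fossils are not shown in this review, and run_stage_one, run_stage_two, and STAGES are hypothetical names.

```python
from typing import Callable


# Hypothetical stand-ins for the real stage functions; in the repository these
# would be imported at module top level rather than defined here.
def process_repository(repo: str) -> str:
    return f"analysed {repo}"


def backfill_fossils(repo: str) -> str:
    return f"backfilled {repo}"


def update_survivor_fossils(repo: str) -> str:
    return f"updated survivors for {repo}"


def run_stage_one(repo: str) -> list[str]:
    """Stage 1: repository analysis."""
    return [process_repository(repo)]


def run_stage_two(repo: str) -> list[str]:
    """Stage 2: fossil backfill followed by the survivor update."""
    return [backfill_fossils(repo), update_survivor_fossils(repo)]


# Dispatch table keeps run_pipeline free of per-stage branches and locals.
STAGES: dict[int, Callable[[str], list[str]]] = {1: run_stage_one, 2: run_stage_two}


def run_pipeline(repo: str, stages: tuple[int, ...] = (1, 2)) -> list[str]:
    """Run the requested stages in order and collect their log lines."""
    log: list[str] = []
    for number in stages:
        log.extend(STAGES[number](repo))
    return log


print(run_pipeline("demo"))
```

With this shape the orchestration function stays short enough that the size-related pylint checks are less likely to fire. The trade-off is that top-level imports give up any intentional laziness, in which case the targeted disables in the diff above are the smaller change.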
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/run_pipeline.py` around lines 45 - 169, The run_pipeline function
currently triggers pylint errors (too-many-locals, too-many-branches,
too-many-statements, import-outside-toplevel); fix by either extracting stage
helpers or adding targeted disables: add a module- or function-level pylint
disable for the orchestration checks (e.g. on the run_pipeline def add "#
pylint:
disable=too-many-locals,too-many-branches,too-many-statements,import-outside-toplevel")
and/or move the dynamic imports (process_repository, backfill_fossils,
update_survivor_fossils) to top-level imports so import-outside-toplevel is
resolved; ensure references remain to run_pipeline, process_repository,
backfill_fossils, update_survivor_fossils and run_cleanup so callers are
unaffected.

Source: Pipeline failures

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/theseus-engine.yml:
- Line 55: The workflow is vulnerable because it interpolates matrix.repo
directly into the run line and the create-pr step uses cp -r data/* which omits
dotfiles like data/.status; update the pipeline invocation to pass matrix.repo
via environment (e.g., REPO_NAME) and reference $REPO_NAME in the run command
(so the value is not subject to shell expansion) and change the create-pr copy
logic to preserve dotfiles and metadata (replace cp -r data/* "$SAVE_DIR"/ with
a dotfile-aware copy such as cp -a data/. "$SAVE_DIR"/ or use rsync -a) so
data/.status is retained for the “Check for status markers” step.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 40f8a39c-8c5c-4bec-b086-551fddecf03a

📥 Commits

Reviewing files that changed from the base of the PR and between 98803a2 and 71cc56b.

📒 Files selected for processing (3)
  • .github/workflows/theseus-engine.yml
  • docs/CONFIGURATION.md
  • scripts/run_pipeline.py

Comment thread .github/workflows/theseus-engine.yml Outdated
- Pass matrix.repo via REPO_NAME env var instead of inline template
  interpolation for defense-in-depth (no shell expansion risk)
- Replace cp -r data/* with cp -a data/. to preserve dotfiles like
  data/.status during orphaned-branch ancestry fix
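The first bullet can be illustrated with a small experiment. It is a sketch that assumes a POSIX shell is available, and the hostile value below is hypothetical. In a workflow, an inline template expression is substituted into the script text before the shell runs, which the f-string imitates here.

```python
import os
import subprocess

# Hypothetical hostile value; in the workflow this would come from matrix.repo.
repo_name = "demo; echo INJECTED"

# Inline interpolation: the value becomes part of the script text,
# so the shell parses the ";" and runs a second command.
interpolated = subprocess.run(
    f"echo repo={repo_name}",
    shell=True, capture_output=True, text=True, check=False,
)

# Environment variable: the script text is fixed, and "$REPO_NAME"
# expands to a single literal string.
via_env = subprocess.run(
    'echo "repo=$REPO_NAME"',
    shell=True, capture_output=True, text=True, check=False,
    env={**os.environ, "REPO_NAME": repo_name},
)

print(interpolated.stdout.splitlines())
print(via_env.stdout.splitlines())
```

In the interpolated case the injected echo runs as its own command, while the environment-variable form prints the value unchanged. The matrix values in this workflow are presumably trusted, which is why the commit describes the change as defense-in-depth.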
@Asifdotexe

Owner Author

@coderabbitai review

coderabbitai Bot commented Jun 6, 2026

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@Asifdotexe Asifdotexe merged commit 6d12adf into main Jun 6, 2026
2 checks passed
@Asifdotexe Asifdotexe deleted the 42-fix-action-failure-issue branch June 6, 2026 13:16