cli: add option to not get the all-altloc selection string from find_altloc_selections.py #221
📝 Walkthrough
Adds an
Changes
- include_all_altlocs flag and integration tests
- docker-entrypoint.sh whitespace fix
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Pull request overview
Adds a CLI-controlled option to suppress the “all altloc residues per chain” selection emitted by find_altloc_selections(), enabling workflows that only want span-based selections.
Changes:
- Extend `find_altloc_selections()` with `include_all_altlocs` to optionally omit the final per-chain “all altlocs” selection.
- Add `--no-all-altlocs` to `scripts/eval/find_altloc_selections.py` to expose the behavior via CLI.
- Minor formatting/maintenance updates (docs whitespace, `ty` rule ordering, lockfile hash, trailing whitespace).
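The gating described in the list above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's implementation: `iter_selections`, its `spans` argument, and the default `min_span` value are hypothetical stand-ins (the real function reads a CIF file), and the format of the final per-chain string, including joining clauses with `or`, is assumed. Only the span selection format, the `res_id` clause formats, and the `include_all_altlocs` name are taken from the diff in this PR.

```python
from collections.abc import Iterator


def iter_selections(
    spans: list[tuple[str, int, int]],
    min_span: int = 5,  # hypothetical default, for illustration only
    include_all_altlocs: bool = True,
) -> Iterator[str]:
    """Yield span selections, then (optionally) one aggregate per chain.

    Hypothetical stand-in: `spans` holds (chain, start, end) tuples that the
    real function would derive from a CIF file.
    """
    all_altloc_selections: dict[str, list[str]] = {}
    for chain, start, end in spans:
        # Only spans of at least `min_span` residues get their own selection.
        if end - start + 1 >= min_span:
            yield f"chain {chain} and resi {start}-{end}"
        # Aggregate clauses are collected only when the flag is set.
        if include_all_altlocs:
            clauses = all_altloc_selections.setdefault(chain, [])
            if start == end:
                clauses.append(f"(res_id == {start})")
            else:
                clauses.append(f"(res_id >= {start} and res_id <= {end})")
    # Assumed format for the final per-chain selection string.
    for chain, clauses in all_altloc_selections.items():
        yield f"chain_id == '{chain}' and ({' or '.join(clauses)})"


spans = [("A", 10, 20), ("A", 42, 42)]
with_all = list(iter_selections(spans, min_span=5))
spans_only = list(iter_selections(spans, min_span=5, include_all_altlocs=False))
```

With the flag off, the single-residue span at 42 contributes nothing to the output, which is the span-only workflow the overview describes.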
Reviewed changes
Copilot reviewed 3 out of 6 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/sampleworks/utils/cif_utils.py | Adds include_all_altlocs flag and gates emission of the final per-chain selection. |
| scripts/eval/find_altloc_selections.py | Wires CLI flag through to find_altloc_selections() and updates row processing. |
| scripts/eval/EVALUATION.md | Whitespace/formatting cleanup only. |
| pyproject.toml | Reorders tool.ty.rules entries (no functional behavior change expected). |
| pixi.lock | Updates local package hash due to changes. |
| docker-entrypoint.sh | Removes trailing whitespace in help text. |
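One way a negative flag such as `--no-all-altlocs` could be wired to a positive `include_all_altlocs` parameter is sketched below. The script's actual argument definitions are not shown in this conversation, so the use of `dest` and `action="store_false"`, the help text, and the `build_parser` helper are all assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with only the flag under discussion (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Find altloc selections (sketch).")
    # store_false: the destination defaults to True and becomes False
    # when the flag is passed on the command line.
    parser.add_argument(
        "--no-all-altlocs",
        dest="include_all_altlocs",
        action="store_false",
        help="Omit the final per-chain selection covering all altloc residues.",
    )
    return parser


default_args = build_parser().parse_args([])
spans_only_args = build_parser().parse_args(["--no-all-altlocs"])
```

Storing the inverted value under `include_all_altlocs` keeps the library parameter positive while the CLI default behavior stays unchanged for existing callers.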
        Spans of altlocs shorter than this are not yielded as selection strings, but ARE
        included in the final selections which includes all residues with altlocs in each chain.
    include_all_altlocs : bool
        If True (default), yield a final per-chain selection string containing all residues
        with altlocs regardless of span length.
Should update this too.
@@ -38,6 +41,9 @@ def find_altloc_selections(
         Minimum number of consecutive residues to consider an altloc selection.
         Spans of altlocs shorter than this are not yielded as selection strings, but ARE
         included in the final selections which includes all residues with altlocs in each chain.
+    include_all_altlocs : bool
+        If True (default), yield a final per-chain selection string containing all residues
+        with altlocs regardless of span length.

     Yields
     ------
@@ -72,12 +78,13 @@ def find_altloc_selections(
                 # FIXME use new style selection https://github.com/diff-use/sampleworks/issues/56
                 yield f"chain {chain} and resi {start}-{end}"  # old style, more compact, selection

-            if chain not in all_altloc_selections:
-                all_altloc_selections[chain] = []
-            if start == end:
-                all_altloc_selections[chain].append(f"(res_id == {start})")
-            else:
-                all_altloc_selections[chain].append(f"(res_id >= {start} and res_id <= {end})")
+            if include_all_altlocs:
+                if chain not in all_altloc_selections:
+                    all_altloc_selections[chain] = []
+                if start == end:
+                    all_altloc_selections[chain].append(f"(res_id == {start})")
+                else:
+                    all_altloc_selections[chain].append(f"(res_id >= {start} and res_id <= {end})")
        find_altloc_selections(cif_file, altloc_label, min_span, include_all_altlocs)
    )
    if not selections:
        logger.warning(f"No altlocs found for {cif_file}")
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/sampleworks/utils/cif_utils.py (1)
40-46: ⚠️ Potential issue | 🟡 Minor
Docstring now overstates short-span inclusion behavior.
The `min_span` description still reads as unconditional inclusion in final selections, but this is now conditional on `include_all_altlocs=True`. Please align this text to prevent API confusion.
📝 Proposed docstring fix
 min_span : int
     Minimum number of consecutive residues to consider an altloc selection.
-    Spans of altlocs shorter than this are not yielded as selection strings, but ARE
-    included in the final selections which includes all residues with altlocs in each chain.
+    Spans shorter than this are not yielded as individual span selections.
+    When ``include_all_altlocs`` is True, they are still included in the final
+    per-chain aggregate selections.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/sampleworks/utils/cif_utils.py` around lines 40 - 46, Update the docstring for the parameters min_span and include_all_altlocs in src/sampleworks/utils/cif_utils.py to clarify behavior: state that spans of altlocs shorter than min_span are not yielded as selection strings, and that those short spans will only be included in the final per-chain selection string if include_all_altlocs is True; mention both parameter names (min_span, include_all_altlocs) so the maintainer can locate the docstring to adjust the wording accordingly.
🧹 Nitpick comments (1)
scripts/eval/find_altloc_selections.py (1)
9-11: Add a NumPy-style docstring to `_process_row()`.
This function is modified in this PR but still lacks a NumPy-style docstring, and it has an observable side effect (warning log when selections are empty).
📚 Proposed docstring addition
 def _process_row(
     row: pd.Series, altloc_label: str, min_span: int, include_all_altlocs: bool
 ) -> pd.Series:
+    """Convert one input row into the output selection schema.
+
+    Parameters
+    ----------
+    row : pd.Series
+        Input row with structure and map metadata.
+    altloc_label : str
+        CIF altloc field name.
+    min_span : int
+        Minimum span length for yielded altloc segments.
+    include_all_altlocs : bool
+        Whether to include per-chain aggregate altloc selections.
+
+    Returns
+    -------
+    pd.Series
+        Output row used by downstream evaluation scripts.
+
+    Notes
+    -----
+    Logs a warning when no altloc selection is found.
+    """
As per coding guidelines, "Always include NumPy-style docstrings for every function and class."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/eval/find_altloc_selections.py` around lines 9 - 11, Add a NumPy-style docstring to the function _process_row describing its purpose, parameters (row: pd.Series, altloc_label: str, min_span: int, include_all_altlocs: bool), return type (pd.Series) and behavior; explicitly document the observable side effect that it may emit a warning log when selections are empty and any exceptions or edge cases (e.g., empty inputs or filtered results). Keep the docstring in NumPy style with short summary, Parameters, Returns, and Notes/Warnings sections and reference the function's behavior on empty selections so callers know about the logging side effect.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 8f537b4e-5d11-48a4-8fdc-9852fde03f03
⛔ Files ignored due to path filters (1)
- `pixi.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (5)
- docker-entrypoint.sh
- pyproject.toml
- scripts/eval/EVALUATION.md
- scripts/eval/find_altloc_selections.py
- src/sampleworks/utils/cif_utils.py
I will add some tests, converting to draft
17e624f to 146d29c
146d29c to 81f1609
marcuscollins left a comment
It's probably worth addressing the Copilot suggestions, but otherwise looks good to me.
81f1609 to cc91e38
🧹 Nitpick comments (1)
tests/eval/test_find_altloc_selections_script.py (1)
20-27: ⚡ Quick win
Add NumPy-style docstrings to helper functions and fixture.
As per coding guidelines, every function and class should have a NumPy-style docstring. The following are missing docstrings:
- `_load_script` (lines 20-27): Should document that it dynamically imports the script module by file path for testing without requiring installation.
- `find_altloc_script` (lines 30-32): Should document that it provides the loaded script module.
- `_make_args` (lines 58-72): Should document that it constructs an `argparse.Namespace` for test invocation.
This is inconsistent with `_altloc_input_csv` (lines 35-55), which does include a docstring.
Also applies to: 30-32, 58-72
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/eval/test_find_altloc_selections_script.py` around lines 20 - 27, Add NumPy-style docstrings to three functions that currently lack documentation. In the _load_script function, document that it dynamically imports the script module from a file path to enable testing without requiring installation. In the find_altloc_script function, document that it provides the loaded script module as a fixture. In the _make_args function, document that it constructs an argparse.Namespace object for test invocation. Follow the same docstring format and style used in the existing _altloc_input_csv function to maintain consistency across the test file.
Source: Coding guidelines
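For context, helpers of the kind this comment describes might look like the sketch below, with the requested NumPy-style docstrings in place. The bodies are assumptions: the test file's real code is not shown in this conversation, so the `importlib` approach, the `module_name` parameter, the `**overrides` signature, and the default attribute values are illustrative only. The demo loads a throwaway script rather than the real one.

```python
import argparse
import importlib.util
import tempfile
from pathlib import Path
from types import ModuleType


def _load_script(path: Path, module_name: str = "script_under_test") -> ModuleType:
    """Import a script as a module from its file path.

    Parameters
    ----------
    path : Path
        Location of the Python script to load.
    module_name : str
        Name given to the loaded module (hypothetical default).

    Returns
    -------
    ModuleType
        The executed module, usable without installing the script as a package.
    """
    spec = importlib.util.spec_from_file_location(module_name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot load module from {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def _make_args(**overrides: object) -> argparse.Namespace:
    """Build an ``argparse.Namespace`` for invoking a script in tests.

    Parameters
    ----------
    **overrides : object
        Attribute values that replace the illustrative defaults.

    Returns
    -------
    argparse.Namespace
        Namespace carrying the attributes the script is assumed to read.
    """
    defaults = {"min_span": 5, "include_all_altlocs": True}  # illustrative only
    return argparse.Namespace(**{**defaults, **overrides})


# Demonstrate on a throwaway script rather than the real one.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "demo_script.py"
    script.write_text("ANSWER = 42\n")
    loaded = _load_script(script)

args = _make_args(include_all_altlocs=False)
```

Loading by file path lets a test exercise a script that lives outside the installed package, which matches the purpose the comment asks the docstring to state.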
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b203d5f7-78a6-40ae-8374-049fe821e8ab
📒 Files selected for processing (4)
- docker-entrypoint.sh
- scripts/eval/find_altloc_selections.py
- src/sampleworks/utils/cif_utils.py
- tests/eval/test_find_altloc_selections_script.py
✅ Files skipped from review due to trivial changes (1)
- docker-entrypoint.sh
🚧 Files skipped from review as they are similar to previous changes (2)
- scripts/eval/find_altloc_selections.py
- src/sampleworks/utils/cif_utils.py
cc91e38 to 7051383
Summary by CodeRabbit
New Features
- `--no-all-altlocs` CLI option to exclude per-chain aggregate selections from evaluation script output, retaining only span-based selections instead.
Bug Fixes
Tests