Reorganize repo; add project setup #2

Open
ptuan5 wants to merge 1 commit into main from pr/1-restructure

@ptuan5 ptuan5 commented Jun 11, 2026

Collaborator

Huge structural changes!

Please review the README for an overall view of the repo.
I intend to make it a mixture of R and Python, building on Jake's pipeline and adding several other utilities. Important changes:

  • Jake's R scripts are moved to r/ and numbered by workflow step. There should be few or no changes within the scripts themselves.
  • Added a Python QC script (equivalent to 2_qc_check) and a QC config.
  • Plans for new utilities (these will be reviewed in later PRs; you can check the other branches of this repo).
  • Added dependency management for R and Python.

- R scripts moved to r/ and numbered by workflow step (2_qc_check, 3_combine_batches, 4_outliers, 5_heatmap)
- Python QC script and pose corner correction moved to python/
- QC config moved to config/QC_params.yaml
- Exploratory script moved to notebooks/explore_features.py
- pose_corner_correction.py: hardcoded paths replaced with --input_dir/--output_dir argparse args
- Add pyproject.toml, uv.lock, .python-version for Python dependency management
- Add renv.lock for R dependency management
- README: full rewrite with module overview table and per-script documentation

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
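
For reviewers who want a picture of the argparse change without opening the diff, here is a minimal sketch of a script that takes `--input_dir` and `--output_dir` instead of hardcoded paths. Only the two flag names come from the commit message. The function names, the choice to make both flags required, and the directory checks are assumptions for illustration, not the actual contents of pose_corner_correction.py.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the two directory flags named in the commit message."""
    parser = argparse.ArgumentParser(
        description="Illustrative CLI that reads from one directory and writes to another."
    )
    # Assumption: both flags are required; the real script may define defaults.
    parser.add_argument("--input_dir", type=Path, required=True,
                        help="Directory containing the input files.")
    parser.add_argument("--output_dir", type=Path, required=True,
                        help="Directory where results are written.")
    return parser


def main(argv=None) -> None:
    """Parse arguments, validate the directories, and report what would run."""
    args = build_parser().parse_args(argv)
    if not args.input_dir.is_dir():
        raise SystemExit(f"Input directory not found: {args.input_dir}")
    # Create the output directory if it does not exist yet.
    args.output_dir.mkdir(parents=True, exist_ok=True)
    # Placeholder: the actual processing logic would go here.
    print(f"Reading from {args.input_dir}, writing to {args.output_dir}")
```

In a real script, `main()` would typically be called under an `if __name__ == "__main__":` guard, so the invocation would look like `python python/pose_corner_correction.py --input_dir <in> --output_dir <out>`. Accepting `argv` as a parameter keeps the parser easy to exercise from tests.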
@ptuan5 ptuan5 force-pushed the pr/1-restructure branch from aee51c9 to 1044d89 Compare June 11, 2026 19:32
Comment thread README.md
│ ├── 2_qc_check_cli.R same as above, CLI version for automation
│ ├── 3_combine_batches.R merge feature files + metadata into unified dataset
│ ├── 4_outliers.R outlier detection and QC figures (project-specific template)
│ ├── 5_heatmap.R phenotype correlation heatmaps

I think that the heatmap feature can be removed for now. The current version does not work and I'm not sure anyone is actually working on it. Jake may disagree though.

Comment thread README.md

### 5 — Heatmap · `r/5_heatmap.R`

Generates phenotype correlation heatmaps. Under active development.

See my comment above. Is this actually something we want to include in the standard pipeline?

@michberger michberger left a comment

I think this structure works. It is good to have all of the Nextflow/JABS-related utilities we've developed in the same place. Is there much difference between the Python and R versions of step 2 (QC check)? If this is the only step with a Python version, how would someone move on to the next step? Would they have to switch to R?

Also, as I indicated in my comments, do we want to keep the heatmap feature in the repository, or assume that it falls under individual post-processing analysis? I'm not sure how useful a big correlation image really is.
