Refactor the classification pipeline: improve extensibility and readability #1

Open
stap1e wants to merge 2 commits into main from cursor/refactor-extensibility-8cbe

@stap1e stap1e commented May 28, 2026

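Background

In the original repository, the classify_*.py scripts contained a lot of duplicated code, paths were hard-coded inside the scripts, and the K-fold flow stripped the NSE column before splitting, which made "K-fold + NSE" experiments impossible.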

Main changes

Modular pipeline (previous round)

  • config/ExperimentConfig, models/factory.py, pipeline/experiment.py
  • Entry scripts slimmed down to configuration + run_experiment()
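
The config-plus-factory split listed above can be illustrated with a minimal sketch. This is not the code in models/factory.py: the registry, the `register` decorator, `build_classifier`, the `ExperimentConfig` field names, and the toy `MajorityClassifier` are all assumptions made for illustration. The idea it shows is that adding a model means registering one constructor, without touching the entry scripts.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict

# Hypothetical registry: maps a model name to its constructor.
_REGISTRY: Dict[str, Callable[..., Any]] = {}


def register(name: str):
    """Decorator that adds a classifier constructor to the registry."""
    def decorator(ctor):
        _REGISTRY[name] = ctor
        return ctor
    return decorator


@dataclass
class ExperimentConfig:
    # Field names are illustrative, not the PR's actual schema.
    name: str
    model: str
    model_params: Dict[str, Any] = field(default_factory=dict)


def build_classifier(cfg: ExperimentConfig):
    """Look up the constructor named in the config and instantiate it."""
    try:
        ctor = _REGISTRY[cfg.model]
    except KeyError:
        known = sorted(_REGISTRY)
        raise ValueError(f"unknown model {cfg.model!r}; known: {known}") from None
    return ctor(**cfg.model_params)


@register("majority")
class MajorityClassifier:
    """Toy stand-in model: always predicts the most common training label."""

    def __init__(self, default=0):
        self.label = default

    def fit(self, X, y):
        y = list(y)
        self.label = max(set(y), key=y.count)
        return self

    def predict(self, X):
        return [self.label for _ in X]
```

With this shape, an entry script would reduce to building an `ExperimentConfig` and passing it to the shared run function, which is the pattern the PR describes for run_experiment().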

Externalized path configuration (this round)

  • config/experiments.yaml: defines profiles such as lab1 / single / kfold / kfold_nse
  • Environment variables: CLS_DATA_ROOT, CLS_RESULTS_ROOT, CLS_EXPERIMENT
  • config/load_config.py: resolves ${VAR:-default} and {data_root} placeholders
  • Unified CLI: run_experiment.py with --profile / --list-profiles / -c for a custom YAML
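
The two placeholder styles handled by config/load_config.py could be resolved roughly as follows. This is a sketch under assumptions, not the PR's implementation: the function names `expand_env` and `resolve_path` are hypothetical, and the fallback rule (use the default when the variable is unset or empty) is borrowed from shell `${VAR:-default}` semantics rather than confirmed by the PR.

```python
import os
import re

# Matches ${VAR} and ${VAR:-default}.
_ENV_PATTERN = re.compile(r"\$\{(\w+)(?::-([^}]*))?\}")


def expand_env(text, env=None):
    """Replace ${VAR} / ${VAR:-default} with the variable's value or the default."""
    env = os.environ if env is None else env

    def _sub(match):
        name, default = match.group(1), match.group(2)
        value = env.get(name)
        # Assumed shell-like rule for ':-': fall back when unset OR empty.
        if value:
            return value
        return default if default is not None else ""

    return _ENV_PATTERN.sub(_sub, text)


def resolve_path(template, env=None, **roots):
    """Expand env placeholders first, then {data_root}-style fields."""
    return expand_env(template, env).format(**roots)


if __name__ == "__main__":
    # The default "./data" and the file name below are made up for the demo.
    data_root = expand_env("${CLS_DATA_ROOT:-./data}")
    print(resolve_path("{data_root}/example.xlsx", data_root=data_root))
```

Environment placeholders are expanded before `str.format` runs so that the braces in `${VAR:-default}` are never mistaken for a format field.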

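K-fold + NSE (this round)

  • K-fold now does a stratified split on the raw rows that contain CPC (and NSE); each fold then runs the full prepare_labeled_frames → LASSO → NSE concatenation sequence
  • Added classify_kfold_nse.py and the kfold_nse profile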
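
The ordering change described above (split raw rows first, preprocess inside each fold) can be sketched with the standard library only. Everything here is an assumption for illustration: `stratified_folds`, `run_kfold`, and the `fit_and_score` callback are hypothetical names, rows are plain dicts instead of data frames, and the callback stands in for the per-fold prepare_labeled_frames → LASSO → NSE concatenation steps. The CPC binarization follows the commit notes (CPC 1-2 = 0, 3-5 = 1). A real pipeline would more likely use a library splitter such as scikit-learn's StratifiedKFold.

```python
import random
from collections import defaultdict


def cpc_to_label(cpc):
    """Binarize CPC as stated in the commit notes: 1-2 -> 0, 3-5 -> 1."""
    return 0 if cpc <= 2 else 1


def stratified_folds(labels, k, seed=0):
    """Return k (train_idx, test_idx) pairs with each class spread across folds."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    fold_of = {}
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            fold_of[idx] = pos % k  # deal each class round-robin into folds

    folds = []
    for f in range(k):
        test = sorted(i for i, g in fold_of.items() if g == f)
        train = sorted(i for i, g in fold_of.items() if g != f)
        folds.append((train, test))
    return folds


def run_kfold(rows, k, fit_and_score):
    """Split raw rows first; all preprocessing happens inside each fold."""
    labels = [cpc_to_label(r["CPC"]) for r in rows]
    scores = []
    for train_idx, test_idx in stratified_folds(labels, k):
        train = [rows[i] for i in train_idx]  # raw rows, NSE column intact
        test = [rows[i] for i in test_idx]
        scores.append(fit_and_score(train, test))
    return scores
```

Because the callback receives untouched rows, the NSE column is still available when features are assembled per fold, and feature selection is fitted on the training part of each fold only.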

Usage example

cd code/cls
pip install -r requirements.txt

export CLS_DATA_ROOT=/path/to/excel/data
export CLS_RESULTS_ROOT=/path/to/results

python run_experiment.py --list-profiles
python classify_lab1.py
python classify_kfold_nse.py
python run_experiment.py -p kfold_nse

Windows:

set CLS_DATA_ROOT=D:\thrid_beijing_hospital_data
python classify_kfold_nse.py

Configuration file

Edit code/cls/config/experiments.yaml to add or remove profiles; no Python changes are needed.

cursoragent and others added 2 commits May 28, 2026 15:43
- Extract shared training flow into pipeline/experiment.py
- Add ExperimentConfig and classifier factory for extensibility
- Slim classify_lab1, classify_single, classify_kfold entry scripts
- Centralize radiomics drop columns and add README/requirements
- Fix k-fold CPC labels to match lab1 (CPC 1-2=0, 3-5=1)

Co-authored-by: glik <stap1e@users.noreply.github.com>
- Add config/experiments.yaml with lab1, single, kfold, kfold_nse profiles
- Load paths via CLS_DATA_ROOT, CLS_RESULTS_ROOT and {data_root} placeholders
- Unified CLI (--profile, --list-profiles) via config/cli.py and run_experiment.py
- Fix k-fold to split raw rows before preprocessing so NSE is preserved per fold
- Add classify_kfold_nse.py entry script and PyYAML dependency

Co-authored-by: glik <stap1e@users.noreply.github.com>
@stap1e stap1e marked this pull request as ready for review May 28, 2026 15:59