Add turnover recording, turnover-control analysis, and real-market check by weich97 · Pull Request #143 · weich97/TreLLM

weich97 · 2026-06-15T05:01:20Z

Reviewer-driven additions for the execution-sensitivity paper (01): per-run turnover recording; a turnover-controlled ranking-stability analysis (within-tercile Kendall tau averages 0.63 versus a full-leaderboard tau of 0.24 in high volatility, so turnover is the primary driver but does not fully explain the reordering); and a real-market hybrid check that runs the deterministic agents on real Yahoo OHLCV data across the execution ladder.

Test plan

ruff clean; turnover-control + real-market scripts run end-to-end

Reviewer-driven additions for the execution-sensitivity paper:

- Execution sweep now records per-run turnover (turnover_events).
- analyze_turnover_control.py: bins agents into turnover terciles and recomputes E0-vs-E1 Kendall tau within each bin. Within-tercile tau averages 0.63, which is higher than the full-leaderboard tau (0.24 in high volatility) but well below 1, so turnover is the primary driver of the directional effect yet does not fully explain the reordering.
- run_execution_sensitivity_real.py: runs the deterministic agents on real Yahoo OHLCV across the same execution ladder to test whether the leaderboard reordering persists on empirical price/volume paths (no API access; deterministic and reproducible).
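The turnover-control idea described above can be sketched roughly as follows. This is a minimal illustration, not the code in analyze_turnover_control.py: the function names (`kendall_tau`, `tercile_bins`, `within_tercile_tau`), the dict-based inputs, the choice of tau-a, and the toy numbers are all assumptions made for the example, and the toy numbers are not results from the paper.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / total pairs; ties count as 0."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    score = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        score += (s > 0) - (s < 0)
    n_pairs = len(x) * (len(x) - 1) // 2
    return score / n_pairs


def tercile_bins(turnover):
    """Split agent names into three near-equal bins by ascending turnover."""
    ordered = sorted(turnover, key=turnover.get)
    n = len(ordered)
    cuts = [0, n // 3, 2 * n // 3, n]
    return [ordered[cuts[k]:cuts[k + 1]] for k in range(3)]


def within_tercile_tau(turnover, score_e0, score_e1):
    """Recompute the E0-vs-E1 tau separately inside each turnover tercile."""
    return [
        kendall_tau([score_e0[a] for a in agents], [score_e1[a] for a in agents])
        for agents in tercile_bins(turnover)
    ]


# Hypothetical toy data: nine agents, E1 score = E0 score minus a
# turnover-proportional cost. Purely illustrative values.
agents = [f"a{i}" for i in range(9)]
turnover = {a: i + 1 for i, a in enumerate(agents)}
score_e0 = dict(zip(agents, [5, 3, 4, 9, 7, 8, 2, 6, 1]))
score_e1 = {a: score_e0[a] - 0.5 * turnover[a] for a in agents}

full_tau = kendall_tau([score_e0[a] for a in agents],
                       [score_e1[a] for a in agents])
bin_taus = within_tercile_tau(turnover, score_e0, score_e1)
print("full-leaderboard tau:", full_tau)
print("within-tercile taus:", bin_taus)
```

Comparing the mean of the per-bin values against the full-leaderboard value is what separates the turnover effect from any residual reordering: if turnover explained everything, the within-bin taus would sit near 1.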

weich97 merged commit 96b9ec4 into main Jun 15, 2026
10 checks passed

weich97 deleted the exec-sensitivity-turnover-realmarket branch June 15, 2026 05:05

weich97 merged 1 commit into main from exec-sensitivity-turnover-realmarket