Centralize AI config and reorganize skills #837
These directories contain exploratory/demo scripts that are not part of the main package and should not count toward docstring coverage. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…TimeSeries
Sensitivity checks like PlaceboInTime and OutcomeFalsification need to create fresh, unfitted copies of models. These _clone methods preserve all configuration (components, sample_kwargs, mode) while resetting fitted state. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
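A minimal sketch of the clone idea described in this commit, for readers who want to see the pattern in isolation. `SketchModel`, its `fit` method, and the `"fast"` mode value are hypothetical stand-ins, not CausalPy's actual classes; only the attribute names `components`, `sample_kwargs`, and `mode` come from the commit message.

```python
import copy


class SketchModel:
    """Hypothetical stand-in for a model class; not CausalPy's actual API."""

    def __init__(self, components=None, sample_kwargs=None, mode="default"):
        self.components = list(components or [])
        self.sample_kwargs = dict(sample_kwargs or {})
        self.mode = mode
        self.fitted_state = None  # populated by fit()

    def fit(self, y):
        # Toy "fit": store the mean so fitted and unfitted objects differ.
        self.fitted_state = sum(y) / len(y)
        return self

    def _clone(self):
        # Rebuild from deep-copied configuration; fitted state is not carried over.
        return type(self)(
            components=copy.deepcopy(self.components),
            sample_kwargs=copy.deepcopy(self.sample_kwargs),
            mode=self.mode,
        )


original = SketchModel(["trend", "seasonality"], {"draws": 500}, mode="fast")
original.fit([1.0, 2.0, 3.0])
fresh = original._clone()  # same configuration, no fitted state
```

Deep-copying the configuration means a sensitivity check can mutate the clone's settings without touching the model the user originally fitted.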
…Time
Adds a "random" selection_method that randomly samples eligible placebo windows from the pre-intervention period, with constraints on minimum training fraction, minimum gap between folds, and optional period exclusion. Also fixes the assurance simulation to correctly model the alternative hypothesis as null baseline noise + expected treatment effect (theta_new + expected_effect), matching the paper's formulation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
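The three constraints named above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the PR's implementation: the function name `sample_placebo_starts`, every parameter name, and the greedy accept-or-skip strategy are hypothetical, and time is modelled as integer indices `0 .. n_pre - 1`.

```python
import math
import random


def sample_placebo_starts(n_pre, window_len, n_folds, min_train_frac=0.5,
                          min_gap=0, excluded=(), seed=None):
    """Randomly pick start indices of placebo windows inside the pre-period."""
    rng = random.Random(seed)
    excluded = set(excluded)
    # Minimum training fraction: no window may start before this index.
    first_start = math.ceil(min_train_frac * n_pre)
    eligible = [
        s for s in range(first_start, n_pre - window_len + 1)
        # Period exclusion: drop windows that touch any excluded index.
        if not excluded.intersection(range(s, s + window_len))
    ]
    rng.shuffle(eligible)
    chosen = []
    for s in eligible:
        # Minimum gap: windows [s, s + window_len) must sit min_gap steps apart.
        if all(abs(s - c) >= window_len + min_gap for c in chosen):
            chosen.append(s)
        if len(chosen) == n_folds:
            return sorted(chosen)
    # Greedy selection can fail even when a valid set exists; a caller may retry.
    raise ValueError("not enough eligible placebo windows for the constraints")


starts = sample_placebo_starts(n_pre=200, window_len=10, n_folds=3,
                               min_train_frac=0.4, min_gap=5,
                               excluded=range(120, 130), seed=0)
```

Passing a `seed` makes the sampled folds reproducible, which matters when the placebo results are reported alongside the main estimate.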
Introduces outcome falsification, which re-fits the experiment with alternative outcome formulas and reports their estimated effect sizes with HDI intervals. This is an informational check (no pass/fail) that lets researchers assess whether the pattern of effects across outcomes is consistent with their causal story. Inspired by the "causal detective" approach in Gallea (2026). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
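As a rough illustration of the reporting step, the sketch below loops over alternative outcomes, summarizes each set of effect draws with a mean and a highest-density interval, and attaches no verdict. Everything here is assumed for illustration: `falsification_table`, the `fit_effect` callable, the outcome names, the 0.94 interval mass, and the toy draws are hypothetical and do not reflect the check's real interface or any real result.

```python
import math


def hdi(samples, prob=0.94):
    """Narrowest interval containing at least `prob` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    k = max(1, math.ceil(prob * n))  # number of samples the interval must cover
    best = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[best], xs[best + k - 1]


def falsification_table(outcomes, fit_effect, prob=0.94):
    """Re-fit once per alternative outcome and summarize each effect."""
    rows = []
    for outcome in outcomes:
        draws = fit_effect(outcome)  # hypothetical: returns posterior effect draws
        low, high = hdi(draws, prob)
        rows.append({"outcome": outcome, "mean": sum(draws) / len(draws),
                     "hdi_low": low, "hdi_high": high})
    return rows  # informational only: no pass/fail flag is attached


def toy_fit_effect(outcome):
    # Stub standing in for a real re-fit; the numbers are made up.
    center = {"sales": 2.0, "foot_traffic": 0.0}[outcome]
    return [center + (i - 50) / 50 for i in range(101)]


rows = falsification_table(["sales", "foot_traffic"], toy_fit_effect)
```

Returning plain rows rather than a boolean leaves the judgement about whether the pattern of effects fits the causal story to the researcher, which is the stated intent of the check.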
Adds a developer notebook documenting the placebo-in-time and outcome falsification sensitivity check methodology with worked examples. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move agents, commands, and rules from .cursor/ to .github/ as the single source of truth. Both .cursor/ and .claude/ now consume these via symlinks. Add 4 scoped rules (core-code, testing, documentation, marimo) extracted from AGENTS.md and basic.mdc. Create CLAUDE.md as symlink to AGENTS.md. Remove 6 redundant causalpy_* command stubs replaced by the unified skill. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
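The symlink layout this commit describes could be reproduced along the following lines. This is a sketch, not the script used in the PR: `link_tool_dirs` is a hypothetical helper, the rule filename `testing.md` is invented for the demo, and the code runs against a throwaway temporary directory rather than a real checkout.

```python
import tempfile
from pathlib import Path

KINDS = ("agents", "rules", "commands")
TOOLS = (".cursor", ".claude")


def link_tool_dirs(repo_root):
    """Point each tool directory at the shared copies under .github/."""
    root = Path(repo_root)
    for tool in TOOLS:
        (root / tool).mkdir(exist_ok=True)
        for kind in KINDS:
            link = root / tool / kind
            if link.is_symlink():
                link.unlink()  # replace a stale link; a real directory would raise below
            # Relative target so the link still resolves after cloning elsewhere.
            link.symlink_to(Path("..") / ".github" / kind, target_is_directory=True)
    claude_md = root / "CLAUDE.md"
    if claude_md.is_symlink():
        claude_md.unlink()
    claude_md.symlink_to("AGENTS.md")


# Demo in a temporary directory with a minimal .github/ tree.
demo_root = Path(tempfile.mkdtemp())
for kind in KINDS:
    (demo_root / ".github" / kind).mkdir(parents=True)
(demo_root / ".github" / "rules" / "testing.md").write_text("shared rule\n")
(demo_root / "AGENTS.md").write_text("# agents\n")
link_tool_dirs(demo_root)
```

Because both tool directories resolve to the same files, editing `.github/rules/testing.md` in the demo is immediately visible through `.cursor/rules/` and `.claude/rules/`.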
Consolidate designing-experiments, performing-causal-analysis, and running-placebo-analysis into one causalpy-analysis skill that covers the full analysis workflow: method selection, model choice, fitting, results, sensitivity checks, and pipelines. All 9 experiment classes, 7 models, and 11 checks are documented with real API signatures. Reference files cover each method family in depth. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extract demo datasets from causalpy-analysis into standalone example-datasets skill — datasets are a learning/testing tool, not part of a real analysis workflow. Add feature-exploration skill for systematically exploring APIs through minimal reproducible examples. Remove old loading-datasets skill (replaced by example-datasets). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New skill for structured falsification of causal claims, inspired by the Causal Mindset Framework (Gallea, 2026). Guides users through a 5-phase investigation: frame the claim, hunt alternative explanations, design falsification tests, evaluate evidence, assess generalizability.
Three specialized agents support the workflow:
- threat-assessor: identifies confounders, reverse causation, bias
- falsification-runner: maps threats to CausalPy checks and executes
- evidence-synthesizer: weighs all evidence into a final verdict
Reference files cover counterfactual analysis, a threat catalog, and mapping of all 10 CausalPy check classes to alternative explanations. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This should be merged only after:
Codecov Report
❌ Patch coverage is
@@            Coverage Diff             @@
##             main     #837      +/-   ##
==========================================
+ Coverage   93.77%   94.31%   +0.53%
==========================================
  Files          77       80       +3
  Lines       11881    12265     +384
  Branches      696      721      +25
==========================================
+ Hits        11142    11568     +426
+ Misses        546      501      -45
- Partials      193      196       +3
Closing this PR: the agent infrastructure work (centralising config in
The remaining non-agentic changes (OutcomeFalsification check, random placebo fold selection,
Thanks for the work here @cetagostini. The skill consolidation in particular is a big improvement.
Closes #640 [EDIT BY BEN]
Summary
- .github/: agents, rules, commands, and skills now live in .github/ as the single source of truth. Both .cursor/ and .claude/ consume them via symlinks. One edit propagates to both tools.
- causalpy-analysis skill covering all 9 experiments, 7 models, and 11 checks.

Changes
Infrastructure (Commit 1)
- .cursor/agents/ to .github/agents/
- .cursor/commands/ to .github/commands/
- .github/rules/ (core-code, testing, documentation, marimo)
- .cursor/{agents,rules,commands} directories with symlinks to .github/
- .claude/{agents,rules,commands} symlinks to .github/
- CLAUDE.md as symlink to AGENTS.md
- causalpy_* command stubs and basic.mdc

Skills (Commits 2-4)
- causalpy-analysis/: unified analysis workflow with 9 reference files
- example-datasets/: standalone skill for demo datasets (separated from analysis)
- feature-exploration/: skill for exploring APIs through minimal reproducible examples
- causal-detective/: falsification investigation skill with threat catalog, counterfactual analysis, and falsification test references

Agents (Commits 1, 4)
- ci-failure-investigator and merge-conflict-analyst (moved to .github/agents/)
- threat-assessor, falsification-runner, evidence-synthesizer (new)

Test plan
- ls -la .claude/agents/ .cursor/agents/
- CLAUDE.md loads in Claude Code sessions
- causal-detective skill triggers on falsification questions

🤖 Generated with Claude Code
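The `ls -la` item in the test plan presumably confirms that the tool directories are symlinks that resolve. A scripted version of that check might look like the sketch below; `check_links` is a hypothetical helper, and the demo builds its own temporary tree with one good link and one deliberately dangling link.

```python
import tempfile
from pathlib import Path


def check_links(repo_root, tools=(".claude", ".cursor"), kinds=("agents",)):
    """Return (path, problem) pairs for links that are missing or dangling."""
    root = Path(repo_root)
    problems = []
    for tool in tools:
        for kind in kinds:
            link = root / tool / kind
            name = str(link.relative_to(root))
            if not link.is_symlink():
                problems.append((name, "not a symlink"))
            elif not link.exists():  # exists() follows the link to its target
                problems.append((name, "dangling"))
    return problems


# Demo tree: .claude/agents resolves, .cursor/agents points at a missing target.
root = Path(tempfile.mkdtemp())
(root / ".github" / "agents").mkdir(parents=True)
(root / ".claude").mkdir()
(root / ".cursor").mkdir()
(root / ".claude" / "agents").symlink_to(Path("..") / ".github" / "agents")
(root / ".cursor" / "agents").symlink_to(Path("..") / ".github" / "missing")

problems = check_links(root)
```

An empty list means every checked path is a symlink with a live target, which is easier to assert in CI than reading `ls -la` output by eye.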