FeatureBased: train/test leakage in per-user leave-one-out cross-validation #17

Description

@jonfroehlich

Found in the #16 correctness/prose/comments audit (verified by reading the cells + tracing pandas semantics).

In Projects/GestureRecognizer/GestureRecognizer-FeatureBased.ipynb, the per-user "leave-one-trial-out" cross-validation (the cell that selects a test gesturer, and the loop-over-all-gesturers cell right after it) leaks the held-out test rows into the training set, so the reported cross-user accuracies are inflated.

just_test_gesturer = df.loc[df['gesturer'] == "JonGestures"]   # keeps df's original index labels (e.g. 550..599)
...
for train_index, test_index in skf.split(just_test_gesturer, just_test_gesturer_y_true):
    df_training = df.drop(test_index)              # <-- BUG
    ...
    X_test = just_test_gesturer.iloc[test_index]   # positional (correct)

test_index from StratifiedKFold.split is positional into just_test_gesturer (0..N-1). But df.drop(test_index) drops by index label, and df has a default RangeIndex, so it removes the rows labeled 0, 1, 5, ..., i.e. rows belonging to whichever gesturer sits at those positions in df, not the held-out test rows. Jon's actual test rows (labels ~550-599) therefore stay in df_training, while some other gesturer's rows are dropped. Result: the model trains on the very rows it is scored against.
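
A minimal reproduction of the label-vs-position mismatch, as a sketch rather than the notebook's actual data. The toy frame, the gesturer names "A", "B" and "Jon", and the hand-written test_index are all assumptions made for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame: 3 gesturers x 4 trials, default RangeIndex 0..11.
df = pd.DataFrame({
    "gesturer": ["A"] * 4 + ["B"] * 4 + ["Jon"] * 4,
    "trial": list(range(4)) * 3,
})

# Boolean selection keeps df's original index labels (8..11 here).
just_test_gesturer = df.loc[df["gesturer"] == "Jon"]

# Positional indices, in the form StratifiedKFold.split would yield them.
test_index = np.array([0, 1])

# Buggy: drops the rows LABELED 0 and 1, which belong to gesturer "A".
df_training_buggy = df.drop(test_index)

# The rows actually held out for testing carry labels 8 and 9.
held_out = just_test_gesturer.iloc[test_index]

# True for every held-out row that is still present in the training frame.
leaked = held_out.index.isin(df_training_buggy.index)
print(leaked.all())
```

Here both held-out rows remain in df_training_buggy, which is the leak described above.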

Fix

Drop by the test rows' real labels, e.g.:

test_labels = just_test_gesturer.index[test_index]
df_training = df.drop(test_labels)

(and update the comment "everything but the test indices for this fold", which is currently false).
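
A sketch of how the corrected loop might look end to end, with a guard that fails loudly if a held-out row ever reaches the training frame. The toy data, the "gesture" column name, the fold count, and the seed are assumptions; only the two-line fix itself comes from this issue:

```python
import pandas as pd
from sklearn.model_selection import StratifiedKFold

# Hypothetical toy data: 2 gesturers x 2 gestures x 3 trials, RangeIndex 0..11.
rows = [(g, c, t)
        for g in ("Other", "JonGestures")
        for c in ("wave", "shake")
        for t in range(3)]
df = pd.DataFrame(rows, columns=["gesturer", "gesture", "trial"])

just_test_gesturer = df.loc[df["gesturer"] == "JonGestures"]  # labels 6..11
just_test_gesturer_y_true = just_test_gesturer["gesture"]

skf = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
fold_sizes = []
for train_index, test_index in skf.split(just_test_gesturer, just_test_gesturer_y_true):
    # Translate fold positions into df's index labels before dropping.
    test_labels = just_test_gesturer.index[test_index]
    df_training = df.drop(test_labels)            # everything but this fold's test rows
    df_test = just_test_gesturer.iloc[test_index]

    # Guard: no held-out label may remain in the training frame.
    assert not df_training.index.isin(test_labels).any()
    fold_sizes.append((len(df_training), len(df_test)))
```

A cheap check like the assert above could stay in the notebook cell so a future refactor cannot silently reintroduce the leak.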

While here: these CV cells use StratifiedKFold(..., shuffle=True, random_state=None), so results change every run — consider seeding for a reproducible narrative.
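
One way to make the folds repeatable is to pass a fixed integer as random_state. The snippet below is a sketch on made-up labels; the helper name fold_test_indices and the seed value 42 are arbitrary choices for illustration:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Hypothetical labels: 5 trials each of two gestures.
y = np.array(["wave"] * 5 + ["shake"] * 5)
X = np.zeros((len(y), 1))  # placeholder features; split only needs the row count

def fold_test_indices(seed):
    """Return the test indices of every fold for the given random_state."""
    skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    return [test.tolist() for _, test in skf.split(X, y)]

# With a fixed integer seed, two independent runs produce identical folds.
same_folds = fold_test_indices(42) == fold_test_indices(42)
print(same_folds)
```

With random_state=None the shuffle draws from NumPy's global random state, so the folds, and therefore the accuracies quoted in the surrounding prose, can differ between runs.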

Labels: bug
