GCAFS: Graph Cost-Aware Feature Selection for Inference-Optimized Gait-Based Human Activity Recognition
Feature selection for high-dimensional sensor-based Human Activity Recognition (HAR) datasets presents a two-fold challenge: selecting features that maximise classification performance while minimising the computational cost of deriving them at inference time. Conventional filter methods such as Mutual Information (MI) ranking treat every feature as having an equal or unmodelled extraction cost. The UCI-HAR dataset, however, contains 561 features derived from a hierarchical signal-processing pipeline in which many features share intermediate transformations, so their extraction costs overlap substantially. This paper presents GCAFS, a Graph Cost-Aware Feature Selection algorithm that explicitly models the UCI-HAR feature-derivation pipeline as a Directed Acyclic Graph (DAG) and incorporates dynamic, context-sensitive costs and redundancy into a greedy feature-ranking procedure.
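The cost-sharing idea can be sketched as marginal-cost accounting on the derivation DAG: once a transformation node has been paid for by one selected feature, any later feature reusing it costs only its own unpaid ancestors. The node names and the toy graph below are illustrative; only the quoted operation costs (FFT 10,304; Butterworth 3,328; Mean 321) come from the text.

```python
# Illustrative feature-derivation DAG; structure is hypothetical,
# operation costs are the heuristic values quoted in the text.
DAG = {
    "raw_acc":     {"cost": 0,     "parents": []},
    "butterworth": {"cost": 3328,  "parents": ["raw_acc"]},
    "fft":         {"cost": 10304, "parents": ["butterworth"]},
    "mean_time":   {"cost": 321,   "parents": ["butterworth"]},
    "mean_freq":   {"cost": 321,   "parents": ["fft"]},
}

def marginal_cost(node, paid):
    """Cost of computing `node` given the set of already-paid DAG nodes.

    Walks unpaid ancestors only, so shared transformations are
    charged exactly once across the selected subset.
    """
    total, stack = 0, [node]
    while stack:
        n = stack.pop()
        if n in paid:
            continue
        paid.add(n)
        total += DAG[n]["cost"]
        stack.extend(DAG[n]["parents"])
    return total

paid = set()
c1 = marginal_cost("mean_time", paid)  # pays butterworth + mean -> 3649
c2 = marginal_cost("mean_freq", paid)  # butterworth already paid -> 10625
```

Without sharing, the second feature would re-pay the Butterworth filter (13,953 instead of 10,625), which is the pooling effect GCAFS exploits.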
The DAG encodes eighteen signal nodes, including raw acceleration, gyroscope, jerk, and their frequency-domain counterparts, and assigns heuristic operation costs to twenty-three primitive transformations, from FFT (10,304) and Butterworth filtering (3,328) down to simple statistics such as Mean (321). At each selection step, a candidate feature's dynamic importance is computed by discounting its normalised MI score by its Spearman-correlation-based redundancy with already-selected features and by its marginal DAG cost, i.e. the cost of only those transformation nodes not already paid for by the selected subset. The formulation is fully deterministic and runs in linear time per step.
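One greedy step of this scoring can be sketched as follows. The discounting form `mi * (1 - lam * redundancy) - beta * cost` and the trade-off weights `lam` and `beta` are assumptions for illustration, not the paper's exact formula; Spearman correlation is computed here with NumPy rank vectors rather than SciPy.

```python
import numpy as np

def _spearman(a, b):
    # Spearman rho = Pearson correlation of the rank vectors
    # (ties ignored; continuous sensor features rarely tie exactly).
    ra = np.argsort(np.argsort(a))
    rb = np.argsort(np.argsort(b))
    return np.corrcoef(ra, rb)[0, 1]

def gcafs_step(X, mi_norm, selected, marginal_costs, lam=0.5, beta=1e-5):
    """Pick the next feature: normalised-MI utility discounted by
    redundancy with the selected set and by marginal DAG cost.
    `lam` and `beta` are hypothetical trade-off weights."""
    best_j, best_score = None, -np.inf
    for j in range(X.shape[1]):
        if j in selected:
            continue
        # Redundancy = max |Spearman rho| against any selected feature.
        red = max((abs(_spearman(X[:, j], X[:, s])) for s in selected),
                  default=0.0)
        score = mi_norm[j] * (1.0 - lam * red) - beta * marginal_costs[j]
        if score > best_score:
            best_j, best_score = j, score
    return best_j
```

A duplicated feature is suppressed by the redundancy term even when its MI is high, and an expensive feature is suppressed by the cost term; a moderately informative but cheap, non-redundant feature wins the step.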
Applying GCAFS under a fixed hyperparameter configuration to the UCI-HAR training split (7,352 windows, 554 retained features after cleaning) yields 34 selected features with a total DAG computation cost of 28,615, a 12.1% reduction relative to a naive top-34 MI baseline (32,554) and a 36.1% reduction relative to mRMR (44,791). On the held-out test set (2,947 windows, 30 subjects), the GCAFS feature subset achieves 89.1% accuracy with a Linear SVM, outperforming the MI baseline by 6.9% and the mRMR baseline by 5.0% under subject-aware 5-fold cross-validation. These results demonstrate that modelling the shared-transformation structure of feature derivation yields feature subsets that are simultaneously cheaper to compute and more discriminative.