reinforce-cursor

An offline reinforcement learning agent that synthesizes human-like mouse cursor trajectories. It is trained on real user telemetry inside a MuJoCo physics simulation and evaluated against adversarial classifiers that use 18 kinematic features.

Method

Data: 45,775 mouse trajectories from the Balabit Mouse Dynamics Challenge (10 users)
Environment: MuJoCo 3.x rigid-body simulation — 2D force-controlled mouse on a desk surface
Training: Implicit Q-Learning (IQL) via d3rlpy on offline dataset with biomechanical reward shaping
Evaluation: 4 detectors: DELBOT RNN; GradientBoosting (human vs. synthetic bots); One-Class SVM (18 kinematic features); adversarial GradientBoosting (human vs. agent)
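
The README does not list the 18 kinematic features, so the following is only a minimal sketch of how features of this kind are commonly derived from a trajectory by finite differences. The function name `kinematic_features`, the `(t, x, y)` sample layout, and the five features shown are assumptions for illustration, not the project's actual feature set.

```python
import math

def kinematic_features(points):
    """points: list of (t, x, y) samples with strictly increasing t."""
    # Per-segment speed from finite differences of position.
    speeds, stamps = [], []
    path_length = 0.0
    for (t0, x0, y0), (t1, x1, y1) in zip(points, points[1:]):
        step = math.hypot(x1 - x0, y1 - y0)
        path_length += step
        speeds.append(step / (t1 - t0))
        stamps.append(0.5 * (t0 + t1))  # midpoint timestamp of the segment

    def derivative(values, times):
        # Finite-difference derivative of a sampled signal, with midpoint timestamps.
        pairs = list(zip(values, values[1:], times, times[1:]))
        d = [(v1 - v0) / (s1 - s0) for v0, v1, s0, s1 in pairs]
        mid = [0.5 * (s0 + s1) for _, _, s0, s1 in pairs]
        return d, mid

    accel, accel_stamps = derivative(speeds, stamps)
    jerk, _ = derivative(accel, accel_stamps)

    # Straight-line distance between first and last sample.
    straight = math.hypot(points[-1][1] - points[0][1],
                          points[-1][2] - points[0][2])
    return {
        "mean_speed": sum(speeds) / len(speeds),
        "peak_speed": max(speeds),
        "mean_abs_accel": sum(abs(a) for a in accel) / len(accel) if accel else 0.0,
        "mean_abs_jerk": sum(abs(j) for j in jerk) / len(jerk) if jerk else 0.0,
        "path_efficiency": straight / path_length if path_length > 0 else 0.0,
    }
```

A perfectly straight, constant-speed movement gives a path efficiency of 1 and zero jerk, which is the sort of signature a detector could use to flag a linear bot.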

Reward Design

Reach shaping: distance penalty + exponential approach bonus (3·exp(-d/0.015))
Velocity damping: penalizes motion near target to prevent oscillation
Fitts' Law speed shaping: per-step penalty for deviating from Fitts-predicted progress curve
Kinematic constraints: acceleration/jerk penalties, effort cost, velocity band matching
Episode rewards: Fitts compliance, velocity profile correlation, path efficiency, submovement count
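
Two of the terms above can be sketched in a few lines. Only the approach bonus 3·exp(-d/0.015) comes from this README. The distance-penalty weight, the Fitts coefficients `a` and `b`, the use of the Shannon form of Fitts' Law, and the linear progress curve are placeholders chosen for illustration; the actual values and curve shape live in the bmds reward code and may differ.

```python
import math

APPROACH_SCALE = 3.0     # from the README: 3*exp(-d/0.015)
APPROACH_LENGTH = 0.015  # from the README; same units as the distance d
DISTANCE_WEIGHT = 1.0    # hypothetical weight on the linear distance penalty

def reach_shaping(d):
    """Distance penalty plus exponential approach bonus, for distance d >= 0."""
    return -DISTANCE_WEIGHT * d + APPROACH_SCALE * math.exp(-d / APPROACH_LENGTH)

def fitts_movement_time(distance, width, a=0.1, b=0.15):
    """Shannon form of Fitts' Law: MT = a + b * log2(distance / width + 1).

    a and b are placeholder coefficients; in practice they are fit to data.
    """
    return a + b * math.log2(distance / width + 1.0)

def fitts_progress_penalty(t, d_now, d_start, width, weight=1.0):
    """Per-step penalty for deviating from an assumed progress curve.

    Assumption: expected progress rises linearly from 0 to 1 over the
    Fitts-predicted movement time, then stays at 1.
    """
    mt = fitts_movement_time(d_start, width)
    expected = min(t / mt, 1.0)
    actual = 1.0 - d_now / d_start  # fraction of the initial distance covered
    return -weight * abs(actual - expected)
```

With these placeholders, `reach_shaping(0.0)` evaluates to 3.0, and the bonus decays by a factor of e for every 0.015 units of distance from the target, so it only matters very close to the target.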

Evaluation Protocol

Detection rates are reported per source (Human, Agent, Linear Bot, Bezier Bot) across 4 detectors with multi-seed validation and k-fold cross-validation.
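
As a rough illustration of the per-source reporting, the sketch below tabulates the fraction of movements each detector flags as bot-generated, grouped by source. The record layout `(source, detector, flagged)` and the function name `detection_rates` are assumptions; the evaluation scripts may organize their results differently.

```python
from collections import defaultdict

def detection_rates(records):
    """records: iterable of (source, detector, flagged_as_bot) tuples.

    Returns {detector: {source: fraction of movements flagged as bot}}.
    """
    # counts[detector][source] holds [number flagged, total seen].
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for source, detector, flagged in records:
        cell = counts[detector][source]
        cell[0] += int(flagged)
        cell[1] += 1
    return {
        detector: {src: flagged / total
                   for src, (flagged, total) in by_source.items()}
        for detector, by_source in counts.items()
    }
```

Under this layout, the rate for the Human source is a false-positive rate, while the rates for Agent, Linear Bot, and Bezier Bot measure how often synthetic movement is caught.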

Repository Layout

bmds/ — core package (data, environment, reward, training, utilities)
scripts/ — evaluation and visualization (05–12)
run_training.py — end-to-end pipeline (download → dataset → train → evaluate)

Setup

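python -m venv .venv
.venv\Scripts\activate
pip install -r requirements.txt
npm install

The activation command above is for Windows; on macOS or Linux, run source .venv/bin/activate instead. npm install installs the Node.js packages from package.json, which the DELBOT RNN detector needs.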

Quick Start

Full pipeline:

python run_training.py

Train IQL on existing dataset:

python run_training.py --skip-download --skip-dataset-build --algorithm iql --steps 100000

Run the 4-detector gauntlet:

python scripts/11_multi_detector_gauntlet.py --n-movements 100 --seed 42
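
For multi-seed validation, one option is to repeat the gauntlet command with different seeds. The helper below is a hypothetical convenience wrapper, not part of the repository. It only builds the argument lists, using the two flags shown above, and assumes the script is invoked from the repository root.

```python
import sys

def gauntlet_commands(seeds, n_movements=100):
    """Build one argv list per seed for the multi-detector gauntlet script."""
    return [
        [sys.executable, "scripts/11_multi_detector_gauntlet.py",
         "--n-movements", str(n_movements), "--seed", str(seed)]
        for seed in seeds
    ]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` to execute the runs one after another.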

Dependencies

MuJoCo 3.x, d3rlpy 1.1.x, PyTorch, Gymnasium, scikit-learn, NumPy, SciPy, Node.js (DELBOT RNN)

References

Fitts, P. M. (1954). The information capacity of the human motor system in controlling the amplitude of movement. Journal of Experimental Psychology, 47(6).
Flash, T. & Hogan, N. (1985). The coordination of arm movements: an experimentally confirmed mathematical model. Journal of Neuroscience, 5(7).
Balabit Mouse Dynamics Challenge (2016). github.com/balabit/Mouse-Dynamics-Challenge
Seno, T. & Imai, M. (2022). d3rlpy: An offline deep reinforcement learning library. Journal of Machine Learning Research, 23(315).
