Behavioral keystroke authentication system that combines standard password hashing (argon2id) with typing rhythm analysis to detect whether the person typing a password is the legitimate user.
npm install
npm start
# Open http://localhost:3456 in Chrome
npm test # Run engine unit tests

Requires Node.js 18+. All data stays local (SQLite stored in ./data/).
When a user types their password, the system captures how they type — not just what they type. It builds a behavioral profile from timing data: how long each key is held, how fast they transition between specific character pairs, their acceleration/deceleration rhythm, pause patterns, and physical key distance on a QWERTY keyboard. On subsequent logins, it compares the new typing sample against the stored profile and produces a similarity score (0-100%).
If the score is at or above the tolerance band threshold, the user is authenticated by behavioral match and no second factor is needed. If it is below, 2FA would be triggered in production; the prototype flags the attempt but still logs you in for testing purposes.
keystroke-auth-prototype/
  server.js          Express server (port 3456), API routes, rate limiting, sessions
  engine.js          Behavioral analysis engine (pure math, no dependencies)
  db.js              SQLite database layer (sql.js, persists to ./data/)
  public/
    index.html       Single-page frontend (3 tabs: Login, Train, Diagnostics)
    capture.js       Client-side keystroke capture SDK v2
  test/
    engine.test.js   Unit tests for engine.js (63 tests)
- Capture (`capture.js` v2): Attaches to input fields, records keydown/keyup timestamps with `event.key` (character) and `event.code` (physical key). Computes character-pair digraphs (t->e), per-character dwell times, trigraphs, speed curves.
- Transport (`server.js`): SDK data goes directly to the engine, with no adapter or translation layer. Rate-limited endpoints with session expiry.
- Analysis (`engine.js` v2): Extracts features including QWERTY physical key distance normalization. Compares against stored profile using Gaussian similarity on z-scores, produces a weighted composite score.
- Storage (`db.js`): Profiles (raw samples), auth events, and training samples stored in SQLite via sql.js (pure JS, no native bindings).
- Character-pair analysis: Digraphs keyed by actual characters (t->e) instead of positions (0->1). Enables cross-password pattern recognition.
- QWERTY key distance: Full US keyboard physical coordinate map. Flight times normalized by physical distance between keys.
- Raw sample storage: Last 25 samples stored raw (replaces lossy EMA). Stats computed dynamically. No information loss.
- No adapter: SDK data goes directly to engine. Eliminated the lossy `sdkToEngine()` translation layer.
- Rate limiting: Per-endpoint limits (auth: 30/15min, training: 20/min, general: 60/min).
- Session expiry: 1-hour TTL with `crypto.randomBytes` tokens. Automatic cleanup.
- Hard anomaly detection: Runs as a gate on both training and login (not just informational).
- 63 unit tests: Full coverage of engine functions.
The similarity score is a weighted average of 6 features:
| Feature | Weight | What It Measures |
|---|---|---|
| Character digraphs | 25% | Time between specific character pairs (e.g., t->e) |
| Distance-normalized digraphs | 15% | Flight time / physical key distance on QWERTY |
| Key hold duration | 25% | Per-character dwell time |
| Speed curve/rhythm | 15% | Acceleration/deceleration pattern across the string |
| Overall speed | 10% | Characters per second |
| Pause pattern | 10% | Total duration + backspace rate |
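The weighting in the table can be sketched as a plain weighted sum. This is a minimal illustration, not the engine's code: the field names are taken from the `scoreDetails` breakdown returned by /api/verify, and treating a missing feature as 0 is an assumption (the engine may renormalize instead).

```javascript
// Weights from the feature table (they sum to 1.0).
const FEATURE_WEIGHTS = {
  digraphScore: 0.25,  // character digraphs
  distNormScore: 0.15, // distance-normalized digraphs
  dwellScore: 0.25,    // key hold duration
  rhythmScore: 0.15,   // speed curve / rhythm
  speedScore: 0.10,    // overall speed
  pauseScore: 0.10,    // pause pattern
};

// Combine per-feature similarities (each in 0..1) into one composite score.
// Assumption: a missing feature contributes 0.
function compositeScore(scoreDetails) {
  let total = 0;
  for (const [name, weight] of Object.entries(FEATURE_WEIGHTS)) {
    total += weight * (scoreDetails[name] ?? 0);
  }
  return total;
}

const example = compositeScore({
  digraphScore: 0.9, distNormScore: 0.8, dwellScore: 0.85,
  rhythmScore: 0.7, speedScore: 0.95, pauseScore: 0.6,
});
console.log(`composite: ${(example * 100).toFixed(1)}%`);
```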
For each feature:
1. Compute the difference between the sample value and the profile's stored mean
2. Convert to a z-score: z = (value - mean) / std
3. Convert z to similarity via Gaussian: similarity = e^(-0.5 * z^2)
   - z=0 (exact match): 100%
   - z=0.5: 88%
   - z=1.0: 61%
   - z=2.0: 14%
   - z=3.0: 1%
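The z-score and Gaussian steps fit in a few lines. The function name and the zero-std guard below are assumptions made for illustration; only the formula comes from the steps listed here.

```javascript
// Gaussian similarity on a z-score: 1.0 at z = 0, falling toward 0 as |z| grows.
function gaussianSimilarity(value, mean, std) {
  // Assumption: with no spread, only an exact match counts as similar.
  if (std <= 0) return value === mean ? 1 : 0;
  const z = (value - mean) / std;
  return Math.exp(-0.5 * z * z);
}

for (const z of [0, 0.5, 1, 2, 3]) {
  // With mean = 0 and std = 1 the value passed in is the z-score itself.
  console.log(`z=${z}: ${Math.round(gaussianSimilarity(z, 0, 1) * 100)}%`);
}
```

Running it prints 100%, 88%, 61%, 14%, and 1% for the five z values.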
The std used in step 2 has an adaptive floor that decreases as the profile matures:
- 1-2 samples: 42-40ms floor (generous -- avoids false rejections)
- 5 samples: 32ms
- 10 samples: 20ms
- 13+ samples: 12ms (trusts the real learned variance)
This is how more data makes the model more discriminating.
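The engine's exact floor curve is not given here, so the sketch below simply interpolates linearly between the documented points. Both the interpolation and the function names are assumptions; the point is only to show how a floor widens a small learned std before the z-score is computed.

```javascript
// Documented anchor points: [sampleCount, floorMs].
const FLOOR_POINTS = [[1, 42], [2, 40], [5, 32], [10, 20], [13, 12]];

// Assumption: linear interpolation between the documented points,
// clamped at both ends. The real engine curve may differ.
function stdFloorMs(sampleCount) {
  const first = FLOOR_POINTS[0];
  const last = FLOOR_POINTS[FLOOR_POINTS.length - 1];
  if (sampleCount <= first[0]) return first[1];
  if (sampleCount >= last[0]) return last[1];
  for (let i = 1; i < FLOOR_POINTS.length; i++) {
    const [n0, f0] = FLOOR_POINTS[i - 1];
    const [n1, f1] = FLOOR_POINTS[i];
    if (sampleCount <= n1) {
      return f0 + ((sampleCount - n0) / (n1 - n0)) * (f1 - f0);
    }
  }
  return last[1];
}

// The floor only matters when the learned std is smaller than it.
function effectiveStd(learnedStd, sampleCount) {
  return Math.max(learnedStd, stdFloorMs(sampleCount));
}

console.log(effectiveStd(8, 2));  // young profile: the floor dominates
console.log(effectiveStd(8, 20)); // mature profile: floor is 12ms
```

With a learned std of 8ms, a 40ms deviation is z=1 on a 2-sample profile but z≈3.3 on a 20-sample profile, which is the tightening effect described above.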
The engine contains a full US QWERTY keyboard physical coordinate map with proper row stagger offsets:
Row 0 (number): ` 1 2 3 4 5 6 7 8 9 0 - = (y=0, x starts at 0)
Row 1 (QWERTY): q w e r t y u i o p [ ] \ (y=1, x offset 1.5)
Row 2 (home): a s d f g h j k l ; ' (y=2, x offset 1.75)
Row 3 (bottom): z x c v b n m , . / (y=3, x offset 2.25)
Row 4 (space): [space] (y=4, centered)
Distance = Euclidean between key center coordinates in key-width units. This normalizes digraph flight times: t->e (two keys apart on the same row, distance ~2.0) vs t->p (far apart, distance ~5.0) naturally have different timing expectations.
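A coordinate map like the one above can be built directly from the row strings and offsets. This is a sketch under stated assumptions: `ROWS`, `KEY_POSITIONS`, and `keyDistance` are illustrative names, each key is treated as one unit wide, and the space bar is left out because its exact x coordinate is not specified.

```javascript
// Rows and x offsets as listed above; row index doubles as the y coordinate.
const ROWS = [
  { keys: "`1234567890-=", xOffset: 0 },
  { keys: "qwertyuiop[]\\", xOffset: 1.5 },
  { keys: "asdfghjkl;'", xOffset: 1.75 },
  { keys: "zxcvbnm,./", xOffset: 2.25 },
];

const KEY_POSITIONS = new Map();
ROWS.forEach((row, y) => {
  [...row.keys].forEach((key, i) => {
    KEY_POSITIONS.set(key, { x: row.xOffset + i, y });
  });
});

// Euclidean distance in key-width units, or null for unmapped keys.
function keyDistance(a, b) {
  const p = KEY_POSITIONS.get(a.toLowerCase());
  const q = KEY_POSITIONS.get(b.toLowerCase());
  if (!p || !q) return null;
  return Math.hypot(p.x - q.x, p.y - q.y);
}

console.log(keyDistance("e", "r")); // same row, neighbouring keys
console.log(keyDistance("t", "p")); // same row, far apart
console.log(keyDistance("q", "a")); // adjacent rows, offset by the stagger
```

A distance-normalized flight time would then be `flightTime / keyDistance(a, b)`, with same-key repeats (distance 0) needing separate handling.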
Profiles store the last 25 raw feature extractions. Stats (mean, std) are computed dynamically when needed:
profile = {
samples: [feature1, feature2, ...], // Last 25 raw extractions
totalSampleCount: 28, // Lifetime count (includes evicted)
lastUpdated: "2026-02-12T..."
}
Benefits over EMA:
- No information loss from exponential decay
- Can recompute stats with different parameters without retraining
- Outliers don't permanently corrupt the profile
- Full variance information preserved
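A minimal sketch of the capped raw-sample store with on-demand stats follows. The profile fields match the shape shown above; the function names and the use of population (rather than sample) standard deviation are assumptions.

```javascript
const MAX_SAMPLES = 25; // cap on stored raw extractions

function createProfile() {
  return { samples: [], totalSampleCount: 0, lastUpdated: null };
}

// Append a raw feature extraction, evicting the oldest once the cap is exceeded.
function addSample(profile, features) {
  profile.samples.push(features);
  if (profile.samples.length > MAX_SAMPLES) profile.samples.shift();
  profile.totalSampleCount += 1; // lifetime count, includes evicted samples
  profile.lastUpdated = new Date().toISOString();
  return profile;
}

// Mean and std computed on demand from whatever values are currently stored.
// Assumption: population std (divide by n); the engine may divide by n - 1.
function computeStats(values) {
  const n = values.length;
  if (n === 0) return { mean: 0, std: 0 };
  const mean = values.reduce((a, b) => a + b, 0) / n;
  const variance = values.reduce((a, v) => a + (v - mean) ** 2, 0) / n;
  return { mean, std: Math.sqrt(variance) };
}

const profile = createProfile();
for (let i = 1; i <= 28; i++) addSample(profile, { totalDuration: 1000 + i });
const stats = computeStats(profile.samples.map((s) => s.totalDuration));
console.log(profile.samples.length, profile.totalSampleCount, stats.mean);
```

Because the raw values stay available, a single outlier drops out of the stats once it is evicted, instead of lingering in a decayed average.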
Not every sample updates the profile. The adaptive merge threshold prevents bad samples from corrupting it:
mergeThreshold = min(0.40, 0.20 + sampleCount * 0.025)
- Young profiles (< 5 samples): threshold rises from 20% toward 30% (22.5% at 1 sample, 30% at 4)
- Mature profiles (8+): require >= 40%
- Anomaly-flagged samples: never merged regardless of score
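The merge rule translates directly into code. The formula is the one given above; `shouldMerge` and its `anomalyFlagged` parameter are hypothetical names for illustration.

```javascript
// Minimum similarity score a sample needs before it is merged into the profile.
function mergeThreshold(sampleCount) {
  return Math.min(0.40, 0.20 + sampleCount * 0.025);
}

// Assumption: `anomalyFlagged` is a boolean produced by hard anomaly detection.
function shouldMerge(score, sampleCount, anomalyFlagged) {
  if (anomalyFlagged) return false; // never merged, regardless of score
  return score >= mergeThreshold(sampleCount);
}

for (const n of [0, 2, 4, 8, 20]) {
  console.log(`samples=${n}: threshold ${(mergeThreshold(n) * 100).toFixed(1)}%`);
}
```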
The tolerance band is the minimum score needed to pass behavioral authentication:
band = 0.55 - 0.40 * e^(-0.15 * n)
| Samples | Band | Meaning |
|---|---|---|
| 1 | 21% | Very lenient -- profile barely exists |
| 5 | 36% | Building -- still accepting wide variance |
| 10 | 46% | Moderate -- catches clearly abnormal typing |
| 15 | 51% | Near plateau |
| 20+ | 53-55% | Effectively at maximum |
Maximum is 55%. Normal typing scores 80-85%, so this leaves a margin of 25-30 percentage points. Abnormal/deliberate typing typically scores 30-50%.
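The band formula can be evaluated for any sample count with a one-line function. `passesBehavioral` is a hypothetical helper, and using `>=` (rather than a strict `>`) for the pass check is an assumption based on the band being described as a minimum score.

```javascript
// Tolerance band: starts low and approaches 0.55 as samples accumulate.
function toleranceBand(sampleCount) {
  return 0.55 - 0.40 * Math.exp(-0.15 * sampleCount);
}

// Assumption: a score exactly on the band passes.
function passesBehavioral(score, sampleCount) {
  return score >= toleranceBand(sampleCount);
}

for (const n of [1, 5, 10, 15, 20, 30]) {
  console.log(`samples=${n}: band ${(toleranceBand(n) * 100).toFixed(1)}%`);
}
console.log(passesBehavioral(0.82, 20)); // typical normal typing
console.log(passesBehavioral(0.40, 20)); // deliberate/abnormal typing
```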
These are firm limits that reject samples outright (both training and login):
| Check | Threshold | What It Catches |
|---|---|---|
| Superhuman speed | Any inter-key interval < 25ms | Automated injection, scripts |
| Perfect periodicity | All intervals within 3ms of each other | Bots, replay attacks |
| Exact replay | All intervals identical | Literal recording playback |
| Bot CV | Coefficient of variation < 0.05 | Near-zero timing variation |
Anomaly detection runs before scoring. Flagged samples are scored for diagnostics but never merged into the profile and never pass behavioral auth.
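The four checks in the table can be sketched as a single gate over the inter-key intervals. Several details are assumptions: the function and reason strings are illustrative, "identical" is read as all intervals equal to each other, "within 3ms" is read as max minus min of at most 3ms, and the coefficient of variation uses population std.

```javascript
// intervals: inter-key intervals in milliseconds, in typing order.
function detectAnomaly(intervals) {
  // Assumption: fewer than two intervals is too little data to judge.
  if (intervals.length < 2) return { anomaly: false, reason: null };

  const min = Math.min(...intervals);
  const max = Math.max(...intervals);
  const mean = intervals.reduce((a, b) => a + b, 0) / intervals.length;
  const variance =
    intervals.reduce((a, v) => a + (v - mean) ** 2, 0) / intervals.length;
  const cv = mean > 0 ? Math.sqrt(variance) / mean : 0;

  if (min < 25) return { anomaly: true, reason: "superhuman speed" };
  if (max === min) return { anomaly: true, reason: "exact replay" };
  if (max - min <= 3) return { anomaly: true, reason: "perfect periodicity" };
  if (cv < 0.05) return { anomaly: true, reason: "bot-like coefficient of variation" };
  return { anomaly: false, reason: null };
}

console.log(detectAnomaly([130, 125, 135, 180, 95]));  // human-like variation
console.log(detectAnomaly([100, 100, 100, 100]));      // identical intervals
console.log(detectAnomaly([200, 205, 195, 210, 190])); // too regular
```

The replay check runs before the periodicity check because identical intervals would otherwise always be reported as merely periodic.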
All API endpoints are rate-limited via express-rate-limit:
| Endpoint | Limit | Window |
|---|---|---|
| /api/register, /api/verify | 30 requests | 15 minutes |
| /api/train | 20 requests | 1 minute |
| All other endpoints | 60 requests | 1 minute |
- Tokens generated via `crypto.randomBytes(24)` (base64url encoded)
- 1-hour TTL per session
- Automatic cleanup of expired sessions every 10 minutes
- In-memory Map (prototype limitation -- lost on restart)
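The session behavior listed above can be sketched with Node's built-in `crypto` module and a `Map`. The function names and the stored record shape (`userId`, `expiresAt`) are assumptions; the token size, TTL, and cleanup interval are the documented values.

```javascript
const crypto = require("node:crypto");

const SESSION_TTL_MS = 60 * 60 * 1000;      // 1-hour TTL
const CLEANUP_INTERVAL_MS = 10 * 60 * 1000; // sweep every 10 minutes
const sessions = new Map();                 // token -> { userId, expiresAt }

function createSession(userId, now = Date.now()) {
  // 24 random bytes encode to a 32-character base64url string.
  const token = crypto.randomBytes(24).toString("base64url");
  sessions.set(token, { userId, expiresAt: now + SESSION_TTL_MS });
  return token;
}

// Returns the session, or null if the token is unknown or expired.
function getSession(token, now = Date.now()) {
  const session = sessions.get(token);
  if (!session) return null;
  if (session.expiresAt <= now) {
    sessions.delete(token);
    return null;
  }
  return session;
}

function cleanupExpired(now = Date.now()) {
  for (const [token, session] of sessions) {
    if (session.expiresAt <= now) sessions.delete(token);
  }
}

// unref() lets the process exit even though the timer is still scheduled.
setInterval(() => cleanupExpired(), CLEANUP_INTERVAL_MS).unref();
```

Passing `now` explicitly keeps expiry testable without waiting an hour. Everything still lives in process memory, so all sessions vanish on restart.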
All endpoints accept/return JSON. Auth token passed via Authorization: Bearer <token> header.
| Method | Path | Auth | Rate Limit | Purpose |
|---|---|---|---|---|
| POST | /api/register | No | 30/15min | Create account. Body: {email, password, typingData} |
| POST | /api/verify | No | 30/15min | Login attempt. Body: {email, password, typingData} |
| POST | /api/train | Yes | 20/min | Submit training sample. Body: {password, typingData} |
| POST | /api/profile | Yes | 60/min | Get current typing profile |
| POST | /api/auth-events | Yes | 60/min | Get auth event history |
| POST | /api/training-history | Yes | 60/min | Get training sample history |
/api/verify returns:
- `authenticated`: boolean -- whether behavioral check passed
- `method`: `behavioral_match` | `2fa_required` | `2fa_required_anomaly` | `password_only` | `profile_building`
- `behavioralScore`: 0.00-1.00 similarity score
- `scoreDetails`: per-feature breakdown (digraphScore, distNormScore, dwellScore, rhythmScore, speedScore, pauseScore)
- `toleranceBand`: current threshold for this profile
- `botCheck`: anomaly detection result
/api/train returns:
- `similarityScore`: how this sample compared to the existing profile
- `profileUpdated`: whether the sample was merged into the profile
- `anomalyRejected`: whether hard anomaly detection blocked the sample
- `anomalyReason`: human-readable explanation if rejected
Client-side IIFE attached to window.KeystrokeCapture.
API:
- `attach(element)` -- Start capturing on an input element
- `detach(element)` -- Stop capturing
- `getData(element)` -- Get captured typing data
- `reset(element)` -- Clear captured data
- `getRealtimeDiagnostics(element)` -- Live metrics for UI display
Captured Data Shape (v2):
{
"keystrokes": [{"position": 0, "key": "t", "code": "KeyT", "dwellTime": 85, "timestamp": 100}],
"digraphs": [{"keys": "t->e", "codes": "KeyT->KeyE", "flightTime": 45, "keyDownToKeyDown": 130}],
"trigraphs": [{"keys": "t->e->s", "totalTime": 260, "rhythm": [130, 130]}],
"dwellTimes": [{"position": 0, "key": "t", "duration": 85}],
"pauses": [{"afterPosition": 3, "duration": 250}],
"backspaceCount": 0,
"shiftHoldPatterns": [{"duration": 120}],
"totalDuration": 1100,
"overallWPM": 44,
"typingSpeedCurve": [130, 125, 135],
"deviceType": "desktop"
}

Characters are lowercased for consistent keying. Uses event.key for character identity, event.code for physical key identity, filters event.repeat, uses WeakMap for per-element state.
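To show how the timing fields in that shape relate to each other, here is a small sketch that derives `keystrokes` and `digraphs` from per-key press and release times. The input record fields `down` and `up` and the function name are hypothetical. Defining `flightTime` as release-to-next-press is an assumption, though it is consistent with the sample values (85ms dwell + 45ms flight = 130ms key-down to key-down).

```javascript
// events: one record per key press, in typing order.
// `down` / `up` are keydown / keyup timestamps in ms (hypothetical field names).
function buildTimings(events) {
  const keystrokes = events.map((e, position) => ({
    position,
    key: e.key.toLowerCase(), // lowercased for consistent keying
    code: e.code,
    dwellTime: e.up - e.down, // how long the key was held
    timestamp: e.down,
  }));

  const digraphs = [];
  for (let i = 1; i < events.length; i++) {
    const prev = events[i - 1];
    const next = events[i];
    digraphs.push({
      keys: `${prev.key.toLowerCase()}->${next.key.toLowerCase()}`,
      codes: `${prev.code}->${next.code}`,
      flightTime: next.down - prev.up,         // release of previous to press of next
      keyDownToKeyDown: next.down - prev.down, // press to press
    });
  }
  return { keystrokes, digraphs };
}

const { keystrokes, digraphs } = buildTimings([
  { key: "t", code: "KeyT", down: 100, up: 185 },
  { key: "e", code: "KeyE", down: 230, up: 310 },
]);
console.log(keystrokes[0], digraphs[0]);
```

Under this definition `flightTime` goes negative when the next key is pressed before the previous one is released, which is common in fast typing.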
SQLite via sql.js (pure JavaScript, no native bindings). Persisted to ./data/keystroke-auth.db.
- users: id (UUID), email, password_hash (argon2id), created_at
- profiles: user_id (FK), profile_data (JSON), sample_count, confidence, updated_at
- auth_events: user_id, event_type, similarity_score, passed_behavioral, device_type, details (JSON), timestamp
- training_samples: user_id, typing_data (JSON), similarity_score, score_details (JSON), timestamp
Argon2id with parameters that match RFC 9106's second recommended option and exceed the OWASP minimum configurations:
- Memory: 64MB (`memoryCost: 65536`)
- Iterations: 3 (`timeCost: 3`)
- Parallelism: 4
63 unit tests covering:
- Utility functions (mean, std, Gaussian similarity)
- QWERTY key positions and distances
- Feature extraction from SDK data
- Profile creation, sample addition, storage caps
- Stats computation (keyed, scalar, speed curve)
- Comparison functions (keyed features, speed curve, scalar)
- Full compareSample integration
- Tolerance band calculation
- Anomaly detection (normal, superhuman, periodic, replay, low CV)
npm test

- In-memory sessions: Token store is a JS Map, lost on server restart
- No real 2FA: Behavioral failures are flagged but don't trigger actual SMS/email verification
- Token always issued on correct password: In production, behavioral failure would block token issuance until 2FA completes
- Single-user SQLite: Not suitable for concurrent production use
- No HTTPS: Runs over HTTP on localhost only
- Hand-tuned weights: Feature weights are manually set, not learned from data
- Gaussian assumption: Real typing distributions may be skewed or multimodal
- express ^4.21.2 -- HTTP server
- express-rate-limit ^8.2.1 -- API rate limiting
- argon2 ^0.41.1 -- Password hashing (argon2id)
- sql.js ^1.12.0 -- Pure JS SQLite (no native build required)
- uuid ^11.0.5 -- User ID generation