🤖 J.A.R.V.I.S.

A fully-offline, movie-style voice assistant for Windows — English + Hindi, 24 skills, biometric security, hand & eye control, and Doctor-Strange-style hand magic.

Speech recognition runs locally (Vosk). It talks back, shows a live arc-reactor HUD, recognizes your face, obeys only your voice, and — with an optional AI brain — answers anything.

⚡ Highlights


🗣️ Fully offline speech	Vosk-powered recognition — no cloud needed for the core experience
🌐 Bilingual	Understands & replies in English and Hindi (native Hinglish, not translation)
🧠 Optional AI brain	Plug in a Claude API key for conversational answers with memory
🎬 Holographic HUD	Fullscreen sci-fi dashboard — 3D reactor, globe, live waveform, system gauges
👤 Face recognition	Greets you by name, self-learning, fully offline (OpenCV LBPH)
🔒 Voice lock	Obeys only your enrolled voiceprint — ignores everyone else
✋ Hand & eye control	Your hand becomes the mouse; or drive the cursor with your head/eyes
🪄 Hand magic	Doctor-Strange / Iron-Man gesture spells — portals, lightning, energy balls
🛠️ 24 voice skills	Apps, web, media, timers, maths, notes, system reports & more
📦 Standalone build	One click → a portable `.exe` you can share (no Python needed)

🚀 Quick Start

# 1. Install dependencies
py -3.13 -m venv .venv
.venv\Scripts\python.exe -m pip install -r requirements.txt

# 2. Download speech models (see table below)

# 3. Run
run.bat

When the reactor pulses blue, it's listening:

🎙️ "Jarvis, open chrome" · "Jarvis, system status" · "Jarvis, magic mode" · "Jarvis, speak Hindi"

Speech models (kept out of the repo — large binaries)

Language	Download	Extract into
English	vosk-model-small-en-in-0.4 (~36 MB)	`model\`
Hindi	vosk-model-small-hi-0.22 (~42 MB)	`model-hi\`

🎙️ Voice Skills

Fuzzy phrase matching — natural sentences work, you don't need exact words.

Say "Jarvis, ..."	Does
`open chrome / vs code / <any installed app>`	Opens it (auto-discovers Start Menu apps)
`search for <x>` / `youtube <x>` / `wikipedia <x>`	Web search
`volume up / mute`, `play / pause / next song`	Volume & media
`minimize / close window`, `switch app`, `show desktop`	Window control
`set a timer for 5 minutes` / `remind me in 10 minutes`	Timers & reminders (survive restart)
`what is 15 percent of 2000`	Maths & percentages
`take a note <x>` / `read my notes`	Notes
`what's the capital of japan` / any question	Looks it up, reads the answer aloud
`system status` / `battery` / `diagnostics`	Spoken CPU / RAM / battery report
`magic mode` / `iron man` / `doctor strange`	Fullscreen cinematic hand-magic ✨
`hand tracking` / `virtual mouse`	Hand becomes the mouse
`eye control`	Hands-free head/eye cursor
`speak hindi` / `speak english`	Switch language
`take a screenshot` / `type <words>`	Screenshot / dictation
`lock the computer` / `shutdown` / `restart`	System (confirms first)

…and more — see the full list in the sections below.

🪄 The gesture experiences

All launched by voice:

"magic mode" → a fullscreen, cinematic 3D magic window. Starts LOCKED — perform your secret key spell (default: fist → open palm → horns 🤘) to unlock. Then cast spells with real 3D depth: portals, tilted shields, lightning between your hands, a 3D energy ball you grow and throw, a force-field dome, and time-freeze. Only your enrolled voice can launch it.
"hand tracking" → your hand becomes the mouse (in-air and on-surface).
"eye control" → hands-free cursor; move your head, blink to click.

🎬 Holographic HUD

Say "Jarvis, show yourself" — the whole screen becomes a sci-fi dashboard: a 3D arc reactor with live voice spectrum, a live clock, real CPU/RAM/battery gauges, a rotating 3D wireframe globe, your live voice waveform, and a scrolling comms log. Esc/F to return.

🔐 Security

Voice lock — run enroll.bat once; JARVIS builds a 256-number voiceprint and from then on obeys only your voice (in testing: owner matched 0.99, others rejected at 0.45).
Face recognition — run enroll_face.bat; greets you by name, self-learns over time, fully offline (OpenCV LBPH). (Your enrolled face/voice data stays local and is git-ignored.)
Hidden settings — the API-key panel stays invisible until you speak a secret word ("override protocol").

🌐 English + Hindi

Say "speak Hindi" → JARVIS switches to natural Hinglish ("Mera naam JARVIS hai, sir", "Abhi 3 baj rahe hain") — replies, greetings, and hourly chimes all switch. It also understands Hindi commands ("awaaz badhao", "chrome chalu karo", "agla gaana") via a Hindi speech model running alongside the English one.

🧠 Optional AI brain

Say "override protocol" to unlock Settings, paste a Claude API key, and JARVIS answers any question conversationally — with short-term memory for follow-ups ("capital of Japan?" → "and its population?"). Without a key, offline DuckDuckGo/Wikipedia answers work fine. (The key is stored locally in config.json, which is git-ignored — copy config.json.example to start.)

🧩 Architecture

jarvis.py      → main app + window/UI loop
core.py        → speech recognition, TTS, command routing
skills.py      → all 24 voice skills (extend here)
brain.py       → optional Claude AI integration + conversation memory
answers.py     → offline web answers (DuckDuckGo / Wikipedia)
hud.py         → holographic fullscreen dashboard
boot.py        → cinematic boot sequence
faceauth.py    → offline face recognition (OpenCV LBPH)
voiceauth.py   → voiceprint speaker verification
proactive.py   → self-initiated greetings / chimes
reminders.py   → persistent timers & reminders
appindex.py    → Start-Menu app discovery

Add your own command: open skills.py, copy a Skill class, set its triggers, implement run(), register it in build_skills(). App aliases live in appindex.py.

📦 Standalone build

Double-click build.bat → dist\JARVIS\JARVIS.exe, a portable folder you can zip and share (no Python needed). See DISTRIBUTION.md.

_{Built by Gauransh Ahuja · Python · Vosk · OpenCV · 100% offline}

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🤖 J.A.R.V.I.S.

A fully-offline, movie-style voice assistant for Windows — English + Hindi, 24 skills, biometric security, hand & eye control, and Doctor-Strange-style hand magic.

⚡ Highlights

🚀 Quick Start

Speech models (kept out of the repo — large binaries)

🎙️ Voice Skills

🪄 The gesture experiences

🎬 Holographic HUD

🔐 Security

🌐 English + Hindi

🧠 Optional AI brain

🧩 Architecture

📦 Standalone build

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
.gitignore		.gitignore
DISTRIBUTION.md		DISTRIBUTION.md
README.md		README.md
answers.py		answers.py
appindex.py		appindex.py
boot.py		boot.py
brain.py		brain.py
build.bat		build.bat
config.json.example		config.json.example
core.py		core.py
enroll.bat		enroll.bat
enroll_face.bat		enroll_face.bat
enroll_face.py		enroll_face.py
enroll_voice.py		enroll_voice.py
faceauth.py		faceauth.py
hud.py		hud.py
jarvis.py		jarvis.py
jarvis.spec		jarvis.spec
paths.py		paths.py
proactive.py		proactive.py
reminders.py		reminders.py
requirements.txt		requirements.txt
run.bat		run.bat
skills.py		skills.py
voiceauth.py		voiceauth.py

Folders and files

Latest commit

History

Repository files navigation

🤖 J.A.R.V.I.S.

A fully-offline, movie-style voice assistant for Windows — English + Hindi, 24 skills, biometric security, hand & eye control, and Doctor-Strange-style hand magic.

⚡ Highlights

🚀 Quick Start

Speech models (kept out of the repo — large binaries)

🎙️ Voice Skills

🪄 The gesture experiences

🎬 Holographic HUD

🔐 Security

🌐 English + Hindi

🧠 Optional AI brain

🧩 Architecture

📦 Standalone build

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages