An accessible visual aid for people with visual impairments
- Install Node.js (v14 or higher)
- Install dependencies: npm install
- Create a .env file with your API keys:
  GEMINI_API_KEY=your_key
  ELEVENLABS_API_KEY=your_key
  ELEVENLABS_VOICE_ID=your_voice_id
- Start the server: npm start
- Open your browser to the URL shown in the terminal (e.g. http://localhost:3000 or http://localhost:3001)
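A missing or blank key in the .env file is an easy setup mistake to make. The sketch below shows one way a server could check for the three keys at startup. The helper name `missingEnvKeys` is hypothetical, and the check assumes the .env file has already been loaded into `process.env` (for example by a loader such as dotenv); the project's actual startup code may differ.

```javascript
// Key names taken from the setup steps above.
const REQUIRED_KEYS = ["GEMINI_API_KEY", "ELEVENLABS_API_KEY", "ELEVENLABS_VOICE_ID"];

// Hypothetical helper: returns the required keys that are unset or blank.
function missingEnvKeys(env) {
  return REQUIRED_KEYS.filter((key) => !env[key] || String(env[key]).trim() === "");
}

// Example startup check (assumes .env was already loaded into process.env).
const missing = missingEnvKeys(process.env);
if (missing.length > 0) {
  console.warn(`Missing keys in .env: ${missing.join(", ")}`);
}
```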
- Allow camera access when prompted
- Press SPACEBAR to activate listening (or click the "Scan" button)
- Say "Scan" to caption what the camera sees
- Caption will be displayed and spoken aloud
- Press SPACEBAR again to stop listening
- View previous scans by clicking the 📋 history button (bottom right)
- Click the ? button (bottom left) or press H or ? for help and keyboard shortcuts
- Press L to view the scan log 📋 of previous entries
- SPACEBAR: Toggle voice recognition on/off
- H or ?: Open/close help panel with complete instructions
- L: Open/close scan log (history)
- ESC: Close any open panel (help or history)
- TAB: Navigate between controls
- ENTER: Activate focused button
- All controls have visible focus indicators (yellow outline)
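The shortcuts above can be sketched as a single key-to-action lookup. The action names (`toggleListening`, `toggleHelp`, and so on) are hypothetical placeholders, not the app's real handler names; only the key bindings come from this README.

```javascript
// Maps a KeyboardEvent.key value to a hypothetical action name.
function actionForKey(key) {
  switch (key) {
    case " ": return "toggleListening";            // SPACEBAR
    case "h": case "H": case "?": return "toggleHelp";
    case "l": case "L": return "toggleLog";
    case "Escape": return "closePanels";
    default: return null; // TAB and ENTER keep their native browser behaviour
  }
}

// Browser wiring, skipped when no DOM is available (e.g. under Node).
if (typeof document !== "undefined") {
  document.addEventListener("keydown", (event) => {
    const action = actionForKey(event.key);
    if (action === null) return;
    if (action === "toggleListening") event.preventDefault(); // stop page scroll
    console.log("action:", action); // replace with the real handler
  });
}
```

Keeping the lookup separate from the event listener makes the bindings easy to check without a browser.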
- Press SPACEBAR once: Activates microphone (button turns green, starts listening)
  - Audio feedback: High beep sound
  - Screen reader announces: "Voice recognition activated"
- Say "Scan": Triggers image capture and captioning
  - Audio feedback: Mid beep sound
  - Screen reader announces: "Scan command detected"
- Press SPACEBAR again: Deactivates microphone (button turns blue, stops listening)
  - Audio feedback: Low beep sound
  - Screen reader announces: "Voice recognition deactivated"
- This toggle design is more accessible and easier to use
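The toggle behaviour above amounts to a small state machine. Here is a minimal sketch of it, assuming a state object with a single `listening` flag and hypothetical event names (`SPACEBAR`, `SCAN_HEARD`); the beep names and announcement strings are the ones listed above.

```javascript
// Returns the next state plus the feedback to emit for one event.
function handleEvent(state, event) {
  if (event === "SPACEBAR") {
    return state.listening
      ? { state: { listening: false }, beep: "low", announce: "Voice recognition deactivated" }
      : { state: { listening: true }, beep: "high", announce: "Voice recognition activated" };
  }
  if (event === "SCAN_HEARD" && state.listening) {
    return { state, beep: "mid", announce: "Scan command detected" };
  }
  return { state, beep: null, announce: null }; // ignore everything else
}
```

In this sketch a "Scan" heard while the microphone is off produces no feedback at all, which matches the idea that listening must be activated first.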
All major actions provide audio cues:
- 🔊 High beep: Listening activated
- 🔉 Mid beep: Scan command detected
- 🔈 Low beep: Listening deactivated
- ✓ Two ascending beeps: Caption successfully generated
- ✗ Descending beeps: Error occurred
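One way to produce cues like these in the browser is the Web Audio API. The sketch below is illustrative only: the frequencies, tone length, volume, and cue names are assumptions (this README names the tones but not their pitch), and the error cue is assumed to be two tones.

```javascript
// Assumed frequencies in Hz, one entry per tone in the cue.
const CUES = {
  listeningOn: [880],   // high beep
  scanDetected: [660],  // mid beep
  listeningOff: [440],  // low beep
  success: [660, 880],  // two ascending beeps
  error: [660, 440],    // descending beeps
};

// Schedules each tone of a cue back to back on the given AudioContext.
function playCue(audioContext, name, toneSeconds = 0.15, volume = 0.2) {
  const tones = CUES[name];
  if (!tones) throw new Error(`Unknown cue: ${name}`);
  tones.forEach((frequency, index) => {
    const oscillator = audioContext.createOscillator();
    const gain = audioContext.createGain();
    oscillator.frequency.value = frequency;
    gain.gain.value = volume; // keep cues quiet and non-intrusive
    oscillator.connect(gain);
    gain.connect(audioContext.destination);
    const start = audioContext.currentTime + index * toneSeconds;
    oscillator.start(start);
    oscillator.stop(start + toneSeconds);
  });
}
```

In a browser this would be called as `playCue(new AudioContext(), "success")`, typically after a user gesture, since browsers may block audio until the user has interacted with the page.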
- Opens automatically when you first load the site
- Click the ? button (bottom left) or press H or ? key to toggle help
- Includes:
  - Quick start guide
  - Complete keyboard shortcuts reference
  - Button color explanations with visual indicators
  - Audio feedback descriptions
  - Usage tips
- Accessible with screen readers
- Full keyboard navigation
- Press ESC to close
This application is designed with accessibility as a core priority, following WCAG 2.1 Level AA guidelines:
- ✅ Full ARIA labels on all interactive elements
- ✅ Semantic HTML structure with proper roles
- ✅ Live region announcements for state changes
- ✅ Alternative text where appropriate
- ✅ Skip-to-content link for keyboard users
- ✅ Complete keyboard control (no mouse required)
- ✅ Visible focus indicators on all interactive elements
- ✅ Logical tab order
- ✅ Focus management (trapped in modals, restored on close)
- ✅ Keyboard shortcuts (Spacebar, H, ?, L, ESC)
- ✅ Built-in help panel with complete keyboard reference
- ✅ Beep sounds for mode changes (helps blind users know the current state)
- ✅ Text-to-speech for all captions
- ✅ Different tones for different actions
- ✅ Non-intrusive volume levels
- ✅ High contrast colors
- ✅ Large, easy-to-read text
- ✅ Clear visual state indicators (color + shape)
- ✅ Consistent UI patterns
- ✅ Yellow focus outlines for visibility
- ✅ NVDA screen reader (Windows)
- ✅ JAWS screen reader (Windows)
- ✅ Keyboard-only navigation
- ✅ High contrast mode
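The focus management item above (focus trapped in modals) can be illustrated with a small wrap-around helper. This is a generic sketch of the technique, not the app's actual code; `nextFocusIndex` and `trapTab` are hypothetical names, and `focusables` is assumed to be an array of the focusable elements inside the open panel.

```javascript
// Returns the index to focus next, wrapping at both ends of the panel.
function nextFocusIndex(currentIndex, count, shiftKey) {
  if (count === 0) return -1;                           // nothing to focus
  if (currentIndex < 0) return shiftKey ? count - 1 : 0; // focus was outside the panel
  const step = shiftKey ? -1 : 1;
  return (currentIndex + step + count) % count;
}

// Browser-only: call from a keydown listener while a panel is open.
function trapTab(event, focusables) {
  if (event.key !== "Tab" || focusables.length === 0) return;
  event.preventDefault();
  const current = focusables.indexOf(document.activeElement);
  focusables[nextFocusIndex(current, focusables.length, event.shiftKey)].focus();
}
```

Restoring focus on close would additionally require remembering `document.activeElement` before the panel opens and calling `.focus()` on it afterwards.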
- All scans are automatically saved with their images and captions
- Click the 📋 button in the bottom-right corner to view history
- History is stored in your browser's local storage
- Delete individual items by clicking the 🗑️ button on each entry
- History persists across sessions (up to 50 items)
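The history behaviour above can be sketched as follows. The storage key `scanHistory`, the shape of a scan entry (an `id` plus whatever else is stored), and the choice to drop the oldest entry once the limit is reached are all assumptions; only the 50-item cap and the use of local storage come from this README. Passing the storage object in as a parameter lets the same functions work with `window.localStorage` or a stand-in.

```javascript
const MAX_ITEMS = 50;              // limit stated above
const STORAGE_KEY = "scanHistory"; // hypothetical key name

// Reads the saved history, falling back to an empty list on bad data.
function loadHistory(storage) {
  try {
    const parsed = JSON.parse(storage.getItem(STORAGE_KEY) || "[]");
    return Array.isArray(parsed) ? parsed : [];
  } catch {
    return []; // corrupted data: start fresh
  }
}

// Adds a scan at the front and trims the list to MAX_ITEMS (oldest dropped).
function saveScan(storage, scan) {
  const history = [scan, ...loadHistory(storage)].slice(0, MAX_ITEMS);
  storage.setItem(STORAGE_KEY, JSON.stringify(history));
  return history;
}

// Removes the entry with the given id.
function deleteScan(storage, id) {
  const history = loadHistory(storage).filter((item) => item.id !== id);
  storage.setItem(STORAGE_KEY, JSON.stringify(history));
  return history;
}
```

In the browser, a call might look like `saveScan(window.localStorage, { id: Date.now(), caption, image })`.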
If you see a "network" error and the microphone button turns red:
- Check Internet Connection: Voice recognition requires an active internet connection as it uses cloud-based speech recognition services
- Retry: Press SPACEBAR to activate listening again once your internet connection is restored
- Button Colors:
  - 🟢 Green: Microphone active (listening)
  - 🟠 Orange: Speech detected
  - 🔴 Red: Error occurred (network issue or microphone problem)
  - 🔵 Blue: Ready (press spacebar to start)
  - ⚪ Gray: Paused during scan/caption
- Press SPACEBAR once to start listening (button turns green)
- Speak clearly and say "scan" or "scanning"
- The word "scan" can be anywhere in your phrase (e.g., "please scan this")
- Press SPACEBAR again to stop listening when done
- If nothing happens, check the browser console for errors
- Microphone not working: Check browser permissions and system microphone settings
- Camera not showing: Ensure camera permissions are granted in browser settings
- No audio playback: Check that your speakers/headphones are working and volume is up
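When debugging why a phrase did or did not trigger a scan, it can help to see how the "scan anywhere in your phrase" rule above might be checked. The sketch below is one possible matcher, not necessarily the one the app uses: `containsScanCommand` is a hypothetical name, and matching on whole words (so "scanner" does not count) is an assumption.

```javascript
// True when the transcript contains "scan" or "scanning" as a whole word,
// in any letter case and at any position in the phrase.
function containsScanCommand(transcript) {
  return /\bscan(ning)?\b/i.test(transcript);
}

console.log(containsScanCommand("please scan this")); // true
console.log(containsScanCommand("hello there"));      // false
```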
Made by Team itsJohnSight for Stevens QuackHacks 2026 #WeAreJohnSnow